SCHOOL OF COMPUTER SCIENCES, UNIVERSITI SAINS MALAYSIA

CST131 - COMPUTER ORGANISATION SEM I 2009/2010 ASSIGNMENT 2

NAME                        MATRIC NO.
DARWAESH SHANMUGAM          106705
SHARVESH S. SIVASHAREN      106458
KAVINDRAN GUNASELIN         102771

6th OCTOBER 2009

CST 131 - Assignment 2

Assignment 2 (CST131 – Computer Organisation) Sharvesh S. Sivasharen (106458), Darwaesh Shanmugam (106705), Kavindran Gunaselin (102771). School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang.

Abstract

The purpose of this assignment is to discuss how the designs of memory and CPU structure have varied in response to the rapid increase in processor speed. We introduce memory and CPU structure and explain our motive for choosing these two aspects. Our main discussion concerns the design changes that have occurred in memory and CPU structure as processor speed has increased. We also discuss the pros and cons of previous efforts, as well as the possibilities of future efforts. By the end of the assignment we are able to present the current effects of increasing processor speed on the design of memory and CPU structure, together with future efforts that processor manufacturers could apply to address it.


1. Introduction and the rationale for choosing the two aspects

The increase in processor speed places significant demands on the memory system and on central processing unit (CPU) structure, and manufacturers have made dramatic changes in the production of memories and CPUs. As a general hardware design principle, larger pieces of hardware tend to be slower than smaller ones: in a smaller piece of hardware, logic and memory are placed closer together on more densely packed chips, shortening the electrical path length and thereby increasing the operating speed. There are two reasons why this simple principle applies to memories built from similar technologies. First, larger memories have more signal delay and require more levels of decoding to fetch the required datum from an address. Second, in most technologies smaller memories can be made much faster than larger ones, primarily because the designer can use more power per memory cell in a smaller design. The fastest memories are normally available with fewer bits per chip at any point in time, and they cost substantially more per byte [1]. This situation keeps changing over time, and memory design varies with it; that is what encouraged us to choose memory as the first aspect of this assignment.

There are also limits to making a CPU smaller, so CPU structure is often modified to cope with the demand for speed and reduced size. New technologies and designs such as pipelining and hyper-threading have been implemented in CPUs. The CPU is like the heart of a person: without it, a computer is just a useless metal box with electronic chips in it. The CPU does all the processing, and this is why we have chosen CPU structure as the second aspect of this assignment.

As we know, processor speed keeps increasing, and processor manufacturers are taking plenty of action to address the demands this creates. The main rationale behind choosing memory and CPU structure for this assignment is that both contribute to the "bottleneck problem": if the memory is large and fast but the CPU is slow, the CPU can execute only a few of the many held instructions at a time, and vice versa, so the slower component limits overall speed. When processor speed increases, the designs of memory and CPU structure are the ones affected the most.


2. Memories and the effects of the increase in processor speed on their design

2.1 Cache Memory

The cache is the memory closest to the CPU; it holds the most recently accessed code or data, and it is small and fast. Caches are made of SRAM (static random access memory). When the CPU finds a requested data item in the cache, it is called a cache hit; when it does not, the result is a cache miss, and the data item must be fetched from main memory. Caches can vary widely in size and organisation, and there may be more than one level of cache in a hierarchy [2].
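The hit/miss behaviour described above can be sketched with a toy direct-mapped cache model; the cache size and the address trace below are made-up illustrative values, not figures from the text:

```python
def simulate_cache(addresses, num_lines=4, block_size=1):
    """Toy direct-mapped cache: block (addr // block_size) maps to line
    (block % num_lines); a hit occurs when the stored tag matches."""
    lines = [None] * num_lines  # one tag per cache line, None = empty
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_lines
        tag = block // num_lines
        if lines[index] == tag:
            hits += 1       # cache hit: data already present
        else:
            misses += 1     # cache miss: fetch from main memory
            lines[index] = tag
    return hits, misses

# A small working set hits after its first (compulsory) misses:
print(simulate_cache([0, 1, 0, 1, 0, 1]))  # → (4, 2)
# Two addresses mapping to the same line keep evicting each other:
print(simulate_cache([0, 4, 0, 4]))        # → (0, 4)
```

The second trace shows conflict misses: addresses 0 and 4 share line 0 in this 4-line cache, so every access misses despite the tiny working set.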

The increase in processor speed has boosted the drive to create multi-level caches. Nowadays there are caches with a two-level structure, with the level closer to the processor designated L1 and the next level designated L2. Larger caches improve effective bandwidth by sending fewer requests (misses) across the interconnection. On-chip caches are now growing into the megabyte range, yet some programs have working sets too large to fit in them, and large on-chip caches also increase system cost [3]. Multi-level caches have since grown to L3 and even L4 because the rate of increase in processor speed keeps climbing.
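The benefit of adding a cache level can be quantified with the standard average memory access time (AMAT) formula; the latencies and miss rates below are assumed example numbers, not figures from the text:

```python
def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, mem_penalty):
    """AMAT for a two-level cache hierarchy (times in cycles):
    AMAT = L1 hit time + L1 miss rate * (L2 hit time
                                         + L2 miss rate * memory penalty)."""
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * mem_penalty)

# Example: 1-cycle L1, 5% L1 misses, 10-cycle L2, 20% L2 misses, 100-cycle DRAM
print(amat_two_level(1, 0.05, 10, 0.20, 100))  # → 2.5
# Without the L2, every L1 miss pays the full DRAM penalty:
print(1 + 0.05 * 100)                          # → 6.0
```

With these assumed numbers, the L2 cuts the average access time from 6 cycles to 2.5, which is exactly the effect multi-level caches are meant to have as the processor–memory speed gap widens.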

Figure 1.1 The process that occurs in cache memory


2.2 Internal / Main Memory

Internal (main) memory serves the needs of the caches and acts as the interface to I/O. Main memory is usually made of DRAM (dynamic random access memory) and has a relatively large storage capacity compared to the caches (SRAM), but DRAM also has larger access times than SRAM.

The rate of increase in processor speed has exceeded the rate of improvement in DRAM speed. Separate chips are used so that microprocessors can use expensive packages that dissipate high power and provide more pins for wider connections to external memory, while DRAMs use inexpensive packages that dissipate low power and need only a few dozen pins. The increase in processor speed has also led computer designers to scale the number of memory chips independently of the number of processors [2].

Figure 1.2 Processes that occur in the internal memory


3. CPU structures and the effects of the increase in processor speed on their design

3.1 Pipelining

Pipelining is a technique in which the execution of several instructions is overlapped [1]. In Intel's 8086/8088 processors, each instruction is fetched only after the previous one completes, i.e. execution proceeds step by step. For example, if there are 9 instructions and each takes 6 units of time, the last one finishes at the 54th unit. With a pipelined structure, by contrast, several instructions can be in flight at the same time: a single instruction cycle is divided into six stages, and a six-stage pipeline allows up to six instructions to be in execution during a single clock cycle, with each part of the cycle corresponding to one stage of instruction processing.

Fetch Instruction (FI): read the next instruction
Decode Instruction (DI): determine the opcode and operand specifiers
Calculate Operands (CO): compute effective addresses (addressing mode)
Fetch Operands (FO): fetch operands from memory (not needed for register operands)
Execute Instruction (EI): execute and store the result in the destination
Write Operand (WO): write the result to memory

The process is faster when there are more stages in the pipeline, as this decreases total execution time: while one instruction occupies the external buses in its first or last stage, the internal decoding of the next instruction can begin at the same time.


Figure 2.1 Six stage pipeline.

As shown in the figure above, the six-stage pipeline reduces the execution time from T1 = nk = 9 instructions × 6 stages = 54 units to Tk = k + (n − 1) = 6 + (9 − 1) = 14 units. The speedup is therefore S = T1/Tk = 54/14 ≈ 3.86. This demonstrates how pipelining affects processor speed.
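The timing arithmetic above can be checked with a short sketch of the standard pipeline speedup formulas:

```python
def pipeline_times(n, k):
    """n instructions through a k-stage pipeline, one time unit per stage.
    Unpipelined: each instruction takes k units in turn, so T1 = n*k.
    Pipelined: k units to fill the pipe, then one instruction completes
    per unit, so Tk = k + (n - 1)."""
    t1 = n * k
    tk = k + (n - 1)
    return t1, tk, t1 / tk

t1, tk, speedup = pipeline_times(9, 6)
print(t1, tk, round(speedup, 2))  # → 54 14 3.86
```

Note that as n grows large, the speedup t1/tk approaches k, the number of stages: this is the ideal speedup that the graphs of speedup versus stages illustrate.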


Figure 2.2 Graph of speedup factor versus number of instructions.

Figure 2.3 Graph of speedup factor versus number of stages.

The two graphs above show that processing speed increases when the number of stages per instruction cycle is increased, since more instructions can be in execution at once. Note that pipelining does not decrease the time for a single datum to be processed; it only increases the throughput of the system when processing a stream of instructions.


3.2 Hyper-Threading

Hyper-threading enables one physical processor to function as two logical processors [5]. A CPU has many parts, and hyper-threading lets those parts work on different threads concurrently. Hyper-threading is the next step beyond super-threading, which is itself an improvement on the single-threaded CPU. The difference between hyper-threading and super-threading is that hyper-threading removes the restriction that all instructions issued by the front end on each clock must come from the same thread. The figure below illustrates a single-threaded CPU, a super-threaded CPU and a hyper-threaded CPU.

Figure 3.1 Illustration of single-threaded, super-threaded and hyper-threaded processors (pipeline bubbles shown as unused slots).

Figure 3.1 shows that a single-threaded CPU runs instructions from only one program while instructions from other programs wait, which leaves many pipeline bubbles (unused stages in the pipeline). A super-threaded processor, by contrast, can execute more than one thread at a time, so it runs instructions from several programs; notice, however, that pipeline bubbles remain. Super-threading carries the restriction that all instructions issued by the front end on a given clock must come from the same thread, and the arrows in the super-threaded CPU of figure 3.1 show this limit on mixing instructions. In the hyper-threaded illustration of figure 3.1, the CPU's front end is fully utilised and there are no pipeline bubbles, so the processor can execute more instructions and its effective speed improves. With hyper-threading the processor runs much as if two CPUs were present. [6]
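The idea that one thread's work can hide another thread's stalls has a rough software analogy, sketched below: two Python threads overlap their waits (a stand-in for stalls), so the total elapsed time is close to one wait rather than two. This only illustrates the principle; hyper-threading itself interleaves instructions in hardware, not via OS threads.

```python
import threading
import time

def worker(name, results):
    # time.sleep stands in for a stall (e.g., a long memory access)
    # during which a hyper-threaded core could run another thread.
    time.sleep(0.2)
    results[name] = "done"

results = {}
threads = [threading.Thread(target=worker, args=(f"t{i}", results))
           for i in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# The two 0.2 s waits overlap, so elapsed is ~0.2 s rather than 0.4 s.
print(len(results), elapsed < 0.35)
```

Run serially, the two waits would cost about 0.4 s; overlapped, the "bubble" left by one thread's wait is filled by the other thread, which is the effect figure 3.1 depicts for the hyper-threaded front end.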


4. Discussion

4.1 Cache Memory

4.1.1 Pros of Increment in Cache Memory

Increasing the cache memory lets data be transferred much faster because the system bus is not used. Effective bandwidth also improves because fewer requests are sent across the interconnection, and a larger cache can hold the working sets of more programs.

4.1.2 Cons of increment in Cache Memory

Some programs have working sets that cannot fit even in enlarged caches, and increasing cache size raises the cost of the system.

4.2 Internal / Main Memory

4.2.1 Pros of Internal / Main Memory

Increasing the size of main memory goes together with providing more pins for wider connections, which makes the memory respond faster; with wider connections, less data needs to be staged into the cache at a time, which speeds up processing.

4.2.2 Cons of Internal / Main Memory

There is a limit to how much main memory can be added; beyond that limit the system resorts to more virtual memory, and the cost of the system rises as processor speed increases.


4.3 Pipelining

4.3.1 Pros of Pipelining

Pipelining reduces the processor's cycle time, which increases the instruction issue rate. It can also save circuitry, avoiding the need to implement a more complex combinational circuit.

4.3.2 Cons of Pipelining

In a simple pipeline only one instruction is issued at a time, which limits branch-prediction delays but is not very efficient. The latency of an individual instruction is slightly lower in a non-pipelined processor than in a pipelined one. A pipelined processor's bandwidth is also less stable than a non-pipelined processor's, because it varies from program to program and is hard to predict.

4.4 Hyper-Threading

4.4.1 Pros of Hyper-Threading

Hyper-threading improves support for multi-threaded code and enables multiple threads to execute simultaneously. Reactions and responses are much faster than with a single thread, and there is no performance loss when only one thread is active. Hyper-threading increases performance with multiple threads and gives better resource utilisation.


4.4.2 Cons of Hyper-Threading

Hyper-threading can only behave like a dual processor; its performance is not as fast as that of a true dual-core processor. To take advantage of hyper-threading, serial execution cannot be used: threads are non-deterministic, involve extra design effort, and add overhead. Another disadvantage of hyper-threading is that shared resources may conflict.


5. Future Trends

5.1 Future of Memory

Figure 5.1 Illustration of Quantum dots

Quantum dots could well be the future of memory. Quantum dots, currently in development, are atom-scale pieces of semiconductor that act as memory, and they are capable of providing storage that is both faster and longer lasting than current memories on the market such as DRAM. Research has recently shown that quantum dot memory can store 1 terabyte (1000 gigabytes) of data per square inch, a vast improvement over the technology we have now, and that information can be written to this memory in just a fraction of a second. Although this technology is far from reaching the consumer market, quantum dots may well be the future of memory technology: they look very promising and could replace all the kinds of memory existing today if they can be mass-manufactured. [11]

5.2 MRAM

The second technology currently being developed is MRAM (magnetic random access memory), which combines information storage in magnetic materials with the semiconductor technology used in today's memory chips such as DRAM. MRAM works by using the magnetisation of electrons, called spin: when spin-polarised electrons enter a semiconductor from a magnet, the information stored in the magnet can be read out electronically at ultra-high speed. The information density of such an MRAM would be much higher than that of today's RAM, and its access times may be even smaller. The resulting memory technology offers many advantages, including a less complex structure than today's DRAM and better capability for downscaling. There is every reason to expect that MRAM will take over as the standard chip in computing technology when CMOS technology reaches its limits. [12]

Figure 5.2 Illustration of MRAM

5.3 Pipelining and Instructions

The future of pipelining involves using more pipelines where possible, and exploiting parallelism as far as the technology permits, with each pipeline being superscalar. Multiple instructions move simultaneously through the fetch, analyse, execute and store stages: instructions run through the stages in parallel, so every clock cycle is used to process instructions. As one stage of the CPU's pipeline finishes manipulating an instruction, it passes that instruction to the next stage and takes another from the stage before it, moving several instructions along the pipeline at once. This is more efficient than making each instruction start at the first stage only after the previous instruction has finished the final stage. The more pipelines a CPU has, the faster it can execute instructions. Problems with this technology, such as control hazards, could be addressed by a branch prediction unit that guesses the correct instruction sequence.

Figure 5.3 Comparison of RISC, Superscalar, VLIW pipelining


6. Conclusion

Through this assignment we were able to identify two main aspects affected by processor speed, and to see how the designs of memory and CPU structure vary with its increase. We also discussed the advantages and disadvantages of these effects and the possibilities of future efforts. The future of memory and CPU structure design is determined by how fast processors advance, which in turn requires memory and CPU structure to develop to match the capability of processors. With each passing year we can only predict the future; only the continuous development of technology will show how far it leads us in terms of processor speed.


References

[1] John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, CA, 1996.

[2] David Patterson, Thomas Anderson et al. A Case for Intelligent RAM: IRAM. IEEE Micro, April 1997.

[3] Nihar R. Mahapatra and Balakrishna Venkatrao. The Processor-Memory Bottleneck: Problems and Solutions. www.acm.org/crossroads/xrds5-3/pmgap.html. 1996.

[4] Web Media Brands Inc. Pipelining. http://www.webopedia.com/TERM/I/pipelining.html. Last modified: March 2003.

[5] Intel Corporation. Intel Pentium 4 Processor, Supporting Hyper-Threading Technology. Document Number: 303128-004. http://download.intel.com/design/Pentium4/datashts/303128.pdf.

[6] Jon Hannibal Stokes. Introduction to Multi-threading, Super-threading and Hyperthreading. http://arstechnica.com/old/content/2002/10/hyperthreading.ars. 2002.

[7] Susan Eggers, Hank Levy and Steve Gribble. Simultaneous Multithreading Project. University of Washington. 2007.

[8] D. Burger, J. R. Goodman and A. Kagi. Memory Bandwidth Limitations of Future Microprocessors. Proc. 23rd Ann. Int'l Symp. Computer Architecture, ACM, 1996.

[9] Wikimedia Foundation, Inc. Hyper-Threading. en.wikipedia.org/wiki/Hyper-threading. 2009.

[10] Answers Corporation. Computer Desktop Encyclopedia: Hyper-threading. www.answers.com/topic/hyper-threading. 2009.

[11] Chris Lee. The Perfect Computer Memory. http://arstechnica.com/hardware/news/2007/12/the-perfect-computer-memory.ars. December 26, 2007.

[12] William Stallings. Computer Organization and Architecture, Sixth Edition. Pearson Education International, 2003.

[13] V. C. Hamacher. Computer Organization, Fifth Edition. McGraw Hill, 2002.

[14] David Patterson, Thomas Anderson et al. A Case for Intelligent RAM: IRAM. IEEE Micro, April 1997.

[15] D. Burger, J. R. Goodman and A. Kagi. Memory Bandwidth Limitations of Future Microprocessors. Proc. 23rd Ann. Int'l Symp. Computer Architecture, ACM, pp. 79-90, Aug. 1996.

[16] D. Hammerstrom and E. Davidson. Information Content of CPU Memory Referencing Behavior. Proc. 4th Ann. Int'l Symp. Computer Architecture, ACM, pp. 184-192, March 1977.

[17] L. Rudolph and D. Criton. Creating a Wider Bus Using Caching Techniques. Proc. 1st Int'l Symp. High Performance Computer Architecture, IEEE Computer Society Press, pp. 90-99, 1995.

[18] T. Mudge and P. Bird. An Instruction Stream Compression Technique. Proc. of Micro-30, Dec. 1997.

[19] Wm. A. Wulf and Sally A. McKee. Hitting the Memory Wall: Implications of the Obvious. Computer Architecture News, 23(1), pp. 20-24, March 1995.
