Parallel Processing
• An efficient form of information processing that emphasizes the exploitation of concurrent events in the computing process
• Purpose of parallel processing
  – To speed up the computer's processing capability
  – To increase its throughput
• Advantages over sequential processing
  – Cost-effective
  – Improved performance, i.e., faster execution time
• Concurrency implies parallelism, simultaneity, and pipelining
  – Parallel events occur in multiple resources during the same time interval
  – Simultaneous events may occur at the same time instant
  – Pipelined events may occur in overlapped time spans
• Note that parallel processing differs from multitasking, in which a single CPU interleaves several programs so that they appear to execute at once
• Parallel processing is also called parallel computing
4 levels of parallel processing
• Job or program level
• Task or procedure level
• Interinstruction level
• Intrainstruction level
Distributed Processing is a form of parallel processing in a special environment.
Parallelism in a Uniprocessor System
Basic Uniprocessor System
• A typical uniprocessor (e.g., a super minicomputer) consists of 3 major components:
  – Main memory
  – CPU
  – I/O subsystem
• Uniprocessor example: IBM mainframe
  – Main memory: 4 logical storage units (LSUs) and a storage controller
  – Peripherals are connected to the I/O channels
  – Channel operation is asynchronous with the CPU
Parallel Processing Mechanisms
• Multiplicity of functional units
• Parallelism and pipelining within the CPU
• Overlapped CPU and I/O operations
• Use of a hierarchical memory system
• Balancing of subsystem bandwidth
• Multiprogramming and time-sharing
1. Multiplicity of functional units
• Use of multiple processing elements under one controller
• Many of the ALU functions can be distributed to multiple specialized units
• These multiple functional units are independent of each other
Example:
• IBM 360/91 – 2 parallel execution units
  – Fixed-point arithmetic
  – Floating-point arithmetic (2 functional units)
    • Floating-point add/subtract
    • Floating-point multiply/divide
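The effect of independent functional units can be sketched with a small timing model. The unit names and cycle latencies below are assumptions for illustration, not the actual IBM 360/91 figures; the point is that an instruction only waits for its own unit, so fixed-point and floating-point work overlap.

```python
# Hypothetical latencies (cycles) for three independent functional units.
LATENCY = {"fx": 1, "fadd": 2, "fmul": 3}

def schedule(instrs):
    """Dispatch one instruction per cycle; each issues as soon as
    its own unit is free. Returns the finish time of each instruction."""
    busy_until = {u: 0 for u in LATENCY}   # per-unit availability
    finish = []
    for t, unit in enumerate(instrs):      # instruction t issues at cycle t
        start = max(t, busy_until[unit])
        busy_until[unit] = start + LATENCY[unit]
        finish.append(busy_until[unit])
    return finish

# Fixed-point and floating-point instructions proceed in parallel units:
print(schedule(["fx", "fadd", "fmul", "fx"]))  # [1, 3, 5, 4]
```

Note that the second `fx` instruction finishes at cycle 4, before the earlier `fmul` (cycle 5): independent units let later instructions complete out of order.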
2. Parallelism and pipelining within the CPU
• Parallelism is provided by building parallel adders, found in almost all ALUs
• Pipelining
  – Each task is divided into subtasks (stages) that are executed in an overlapped fashion
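The benefit of overlapping subtasks follows from simple cycle counting: with k stages and n tasks, sequential execution needs n·k cycles, while a pipeline needs k cycles to fill and then delivers one result per cycle.

```python
def sequential_time(n_tasks, k_stages):
    """Each task runs all k stages before the next task starts."""
    return n_tasks * k_stages

def pipelined_time(n_tasks, k_stages):
    """First task fills the pipe (k cycles); then one result per cycle."""
    return k_stages + (n_tasks - 1)

print(sequential_time(100, 4))  # 400
print(pipelined_time(100, 4))   # 103
```

For large n the speedup approaches k (here 400/103 ≈ 3.9, close to the 4-stage ideal).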
3. Overlapped CPU and I/O operations
• I/O operations can be performed simultaneously with CPU computations by using
  – separate I/O controllers
  – I/O channels
  – I/O processors
DMA cycle stealing
• Performs I/O transfers between a device and main memory without CPU intervention
  – The I/O controller generates the sequence of memory addresses
  – The CPU is responsible only for initiating the block transfer
  – The CPU is notified when the transfer is complete
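The division of labor in a DMA block transfer can be sketched as follows. The function name and the flat list standing in for main memory are illustrative; each loop iteration corresponds to one memory cycle "stolen" from the CPU.

```python
def dma_block_transfer(mem, device_data, base):
    """Sketch of a DMA block transfer: the CPU only initiates the call;
    the controller generates the address sequence (base, base+1, ...),
    and the CPU is notified only when the whole block has moved."""
    for i, word in enumerate(device_data):  # controller-generated addresses
        mem[base + i] = word                # one stolen memory cycle each
    return "transfer-complete interrupt"    # completion notification to CPU

mem = [0] * 8
print(dma_block_transfer(mem, [7, 8, 9], base=2))
print(mem)  # [0, 0, 7, 8, 9, 0, 0, 0]
```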
Use of an I/O processor (IOP): achieves maximum concurrency
• The I/O subsystem is managed by an IOP
• The CPU is free to proceed with its primary task
• The IOP consists of:
  – An I/O processor
  – I/O channels
• The I/O processor is attached directly to the system bus
• The I/O processor is capable of executing I/O requests on its own
4. Use of a hierarchical memory system
• The CPU is roughly 1000 times faster than main-memory access
• A hierarchical memory structure is used to close the speed gap
  – Cache memory
  – Virtual memory
  – Parallel memories for array processors
5. Balancing of subsystem bandwidth
• Balance the bandwidth between main memory and the CPU
• Balance the bandwidth between main memory and I/O
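A simplified reading of the balance condition (an assumption of this sketch, not a formula from the slides) is that memory bandwidth must cover the combined CPU and I/O demand, otherwise the memory becomes the bottleneck.

```python
def balanced(b_mem, b_cpu, b_io):
    """Simplified balance check: main-memory bandwidth must cover
    the combined CPU and I/O demand (all in the same units,
    e.g. words per second)."""
    return b_mem >= b_cpu + b_io

print(balanced(100, 80, 15))  # True  - memory keeps up
print(balanced(100, 90, 20))  # False - memory is the bottleneck
```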
6. Multiprogramming and time-sharing
• Multiprogramming
  – Mixes the execution of various types of programs (I/O-bound, CPU-bound)
  – Interleaves CPU and I/O operations across several programs
• Time-sharing
  – A time-sharing OS prevents high-priority programs from occupying the CPU for too long
  – Fixed or variable time slices are used
  – Creates the concept of virtual processors
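Fixed time slices can be sketched with a round-robin scheduler. The program names and CPU-burst lengths are invented for the example; each program repeatedly gets one quantum until its work is done, which is what gives every user the illusion of a dedicated (virtual) processor.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Round-robin time-slicing: each program receives a fixed
    quantum in turn; unfinished programs rejoin the ready queue.
    Returns the order in which programs occupy the CPU."""
    queue = deque(bursts.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)              # program runs for one quantum
        remaining -= quantum
        if remaining > 0:               # not done: back of the queue
            queue.append((name, remaining))
    return order

# Three programs with CPU bursts of 3, 5, and 2 time units, quantum 2:
print(round_robin({"A": 3, "B": 5, "C": 2}, quantum=2))
# ['A', 'B', 'C', 'A', 'B', 'B']
```

No program can monopolize the CPU: long-running B is interleaved with A and C rather than running to completion first.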
Parallel Computer Models • Pipeline Computers • Array Computers • Multiprocessor Computers
Performance of parallel computers
• A parallel computer with n identical processors is not n times faster than a single-processor computer; its speedup is less than n
• Estimated speedups range from a lower bound of log2 n to an upper bound of n/ln n
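The two bounds above can be evaluated directly to see how far real speedup falls short of the ideal n:

```python
import math

def speedup_bounds(n):
    """Lower and upper speedup estimates for n processors:
    log2(n) (lower bound) and n / ln(n) (upper bound)."""
    return math.log2(n), n / math.log(n)

lo, hi = speedup_bounds(16)
print(round(lo, 2), round(hi, 2))  # 4.0 5.77
```

For n = 16 the estimated speedup lies between about 4 and 5.8, well below the ideal value of 16, and the gap widens as n grows.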