Concurrent Programming
Session 3: Fundamentals
Computer Engineering Department, Iran University of Science and Technology, Tehran, Iran

Lecturer: Nima Ghaemian
Distributed Systems Lab., Computer Engineering Department, Iran University of Science and Technology
[email protected]
What is Concurrency?

What is a sequential program?
◦ A single thread of control that executes one instruction and, when it is finished, executes the next logical instruction

What is a concurrent program?
◦ A collection of autonomous sequential threads, executing (logically) in parallel
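The distinction above can be sketched in code. Below is a minimal, illustrative example using Python's `threading` module (the `worker` function and thread names are made up for the sketch): two autonomous sequential threads, each executing its own instructions in order, run logically in parallel inside one program.

```python
import threading

results = []
lock = threading.Lock()

def worker(name, steps):
    # Each thread is itself a sequential program: it executes one
    # instruction and, when finished, the next logical instruction.
    total = 0
    for i in range(steps):
        total += i
    with lock:                      # protect the shared results list
        results.append((name, total))

t1 = threading.Thread(target=worker, args=("A", 10))
t2 = threading.Thread(target=worker, args=("B", 100))
t1.start(); t2.start()              # both threads now run concurrently
t1.join(); t2.join()                # wait for both to finish

print(sorted(results))              # [('A', 45), ('B', 4950)]
```

Note that the program's result is deterministic even though the two threads' instructions may interleave in any order, because the threads share no intermediate state.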
Concurrency vs. Parallelism

Does Concurrency equal Parallelism?
◦ Concurrency is not (only) Parallelism

Interleaved Concurrency
◦ Logically simultaneous processing
◦ Interleaved execution on a single processor

Simultaneous Concurrency
◦ Physically simultaneous processing
◦ Requires a multiprocessor or a multicore system
So… Concurrency ⊇ Parallelism
Note that, from another point of view, concurrency can be divided into full and pseudo-parallelism.
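Interleaved (pseudo-parallel) concurrency needs no hardware parallelism at all. The sketch below, assuming plain CPython, uses generators and a tiny hypothetical round-robin scheduler: the two tasks are logically simultaneous, but their steps are interleaved one at a time on a single processor.

```python
from collections import deque

def task(name, steps):
    for i in range(steps):
        yield f"{name}{i}"        # do one unit of work, then yield control

def round_robin(tasks):
    trace, queue = [], deque(tasks)
    while queue:
        t = queue.popleft()
        try:
            trace.append(next(t)) # run the task for one step
            queue.append(t)       # not finished: reschedule at the back
        except StopIteration:
            pass                  # finished: drop it
    return trace

print(round_robin([task("A", 2), task("B", 2)]))  # ['A0', 'B0', 'A1', 'B1']
```

The trace shows the interleaving: A and B alternate, so they make progress "simultaneously" in logical time only.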
Why Concurrency?

No More Clock Cycles!

Natural Application Structure
x The world is not sequential! It is easier to program multiple independent and concurrent activities

Increased application throughput and responsiveness
x Not blocking the entire application due to blocking I/O

Performance from multiprocessor/multicore hardware
x Parallel Execution

Distributed Systems
x Single application on multiple machines
x Client/Server type or peer-to-peer systems
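The responsiveness point above can be sketched as follows. This is a toy example assuming Python's `threading`, with `time.sleep` standing in for a blocking I/O call: the blocking operation runs on its own thread, so the rest of the application keeps making progress instead of freezing.

```python
import threading
import time

result = {}

def blocking_io():
    time.sleep(0.2)            # stands in for a blocking read/recv
    result["data"] = "payload"

t = threading.Thread(target=blocking_io)
t.start()

ticks = 0
while t.is_alive():            # the application stays responsive...
    ticks += 1                 # ...e.g., it could repaint a UI here
    time.sleep(0.01)
t.join()

print(result["data"], ticks > 0)
```

Had `blocking_io()` been called directly on the main thread, the whole application would have stalled for the duration of the call.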
Flynn's Taxonomy

Classification of Computer Architectures
◦ SISD
x Single Instruction, Single Data stream
◦ SIMD
x Single Instruction, Multiple Data streams
◦ MISD
x Multiple Instructions, Single Data stream
◦ MIMD
x Multiple Instructions, Multiple Data streams
SISD: Single Instruction, Single Data stream

Sequential Computer
Traditional Uniprocessor Systems
◦ PC
◦ Old Mainframes

SIMD: Single Instruction, Multiple Data streams

Intel's Streaming SIMD Extensions (SSE)
Array Processors
Stream Processors inside a GPU
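To make the SIMD idea concrete, here is a purely conceptual sketch in Python: ONE logical instruction applied to MULTIPLE data elements. Real SIMD hardware such as SSE performs this on packed registers in a single machine instruction; the list comprehension below only mimics the programming model, not the hardware.

```python
def simd_mul(vector, scalar):
    # One logical operation ("multiply by scalar") over many data lanes.
    # In hardware, all lanes would be processed in the same instruction.
    return [x * scalar for x in vector]

print(simd_mul([1, 2, 3, 4], 2))  # [2, 4, 6, 8]
```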
MISD: Multiple Instructions, Single Data stream

Is not common
◦ It was even thought to be N/A!

Where is it applicable?
◦ Fault Tolerance!
◦ Exploratory Decomposition
x Decomposition?

An Implementation:
◦ Space Shuttle Flight Control
MIMD: Multiple Instructions, Multiple Data streams
◦ Multiprocessors
◦ Distributed Systems

The most popular architecture among the top 10 of the TOP500 (from '06)
◦ TOP500?

Further Divisions…
◦ SPMD
x A Grid in a GPU
◦ MPMD
x Manager/Worker Strategy
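The SPMD style above can be sketched as follows (illustrative names, with Python threads standing in for GPU threads or MPI ranks): every worker executes the SAME program text, but branches on its rank to select its own slice of the data.

```python
import threading

def spmd_program(rank, nprocs, data, out):
    # Same code runs on every worker; only `rank` differs.
    chunk = len(data) // nprocs
    lo, hi = rank * chunk, (rank + 1) * chunk
    out[rank] = sum(data[lo:hi])      # each rank reduces its own slice

data = list(range(8))
nprocs = 4
out = [0] * nprocs
threads = [threading.Thread(target=spmd_program, args=(r, nprocs, data, out))
           for r in range(nprocs)]
for t in threads: t.start()
for t in threads: t.join()

print(out, sum(out))   # [1, 5, 9, 13] 28
```

An MPMD manager/worker design would instead run *different* program texts: one manager distributing work, several workers executing it.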
Classification of Parallel Computers

Multicore Computing
Symmetric Multiprocessing
Distributed Computing
◦ Cluster Computing
◦ Massively Parallel Processing
◦ Grid Computing

Specialized Parallel Computers
◦ Reconfigurable computing with FPGAs
◦ GPGPUs
◦ Application-Specific Integrated Circuits
◦ Vector Processors
Any Abstract Model?

PRAM
◦ Parallel Random Access Machine

Eliminates the focus on miscellaneous issues
◦ Synchronization
x An Important Factor! We'll talk about it later!
◦ Communication
x Von Neumann Bottleneck

Lets the designer think explicitly about the exploitation of concurrency
Any Abstract Model? (Cont'd)

PRAM in Flynn's Taxonomy?
◦ MIMD

Programming Model
◦ A New Structure
x for i pardo Pi
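One way to simulate the `for i pardo Pi` construct is sketched below, with Python threads standing in for the PRAM's processors over a shared memory; `pardo` here is a hypothetical helper written for this illustration, not a real API.

```python
import threading

def pardo(n, body):
    # "for i pardo P_i": launch one logical processor per index i,
    # all executing in parallel, then wait for all of them.
    threads = [threading.Thread(target=body, args=(i,)) for i in range(n)]
    for t in threads: t.start()
    for t in threads: t.join()

shared = [0] * 4                 # the PRAM's shared random-access memory

def body(i):
    shared[i] = i * i            # P_i writes only its own cell

pardo(4, body)
print(shared)   # [0, 1, 4, 9]
```

Because each P_i touches a distinct memory cell, no synchronization is needed; the PRAM model lets the designer focus on exactly this kind of concurrency structure.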
Any Abstract Model? (Cont'd)

How to Model More Efficiently?
◦ "To write a parallel program in a machine-independent fashion requires the development of abstract models for parallel computations without sacrificing the implementability of these programs"

Communicating Sequential Processes (CSP)
Calculus of Communicating Systems (CCS)
Petri Nets
Levels of Concurrency

Levels of Parallelism
◦ Task-Level Parallelism
x Focuses on distributing execution processes (threads) across different parallel computing nodes
◦ Data-Level Parallelism
x Focuses on distributing the data across different parallel computing nodes
◦ Instruction-Level Parallelism
x fetch, decode, execute
◦ Bit-Level Parallelism
x How hardware treats words
Levels of Concurrency (Cont'd)

Granularity
x Ratio of the compute time to the communication overhead
x Ratio of the computation to the synchronization overhead

Levels of Concurrency in terms of Granularity
◦ Very Coarse-Grained
◦ Coarse-Grained
◦ Medium-Grained
◦ Fine-Grained
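The granularity ratio can be made concrete with a toy calculation; the numbers below are illustrative, not standard thresholds.

```python
def granularity(compute_time, comm_time):
    # Ratio of compute time to communication overhead:
    # a larger ratio means a coarser grain.
    return compute_time / comm_time

# A task computing for 100 ms per 1 ms of communication is coarse-grained;
# 1 ms of compute per 1 ms of communication is fine-grained.
print(granularity(100.0, 1.0))   # 100.0 -> coarse-grained
print(granularity(1.0, 1.0))     # 1.0   -> fine-grained
```

Fine-grained decompositions expose more parallelism but pay proportionally more communication and synchronization overhead.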
Models of Concurrent Programming

Attributes of Concurrency Models
◦ Level of granularity (level of atomicity)
x Discussed before…
◦ Sharing the clock
◦ Sharing the memory
◦ Pattern of interaction
x Synchronization
x Communication
◦ Pattern of synchronization
x Mutual Exclusion
x Mutual Admission
Models of Concurrent Programming (Cont'd)

Attributes of Concurrency Models (Cont'd)
◦ Pattern of Communication
x Synchronous
x Asynchronous
◦ Specifying Concurrency
x Application Concurrency
x Implementation Concurrency
◦ Implementation of Concurrency
x Effective Concurrency
x Simulated Concurrency
◦ Interaction Protocol
x Cooperative
x Competitive
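Of the synchronization patterns above, mutual exclusion is the easiest to show in code. A sketch using Python's `threading.Lock`: competitive threads increment a shared counter, and the lock makes each read-modify-write atomic, so their updates cannot interleave inside the critical section.

```python
import threading

counter = 0
lock = threading.Lock()

def add(times):
    global counter
    for _ in range(times):
        with lock:           # critical section: at most one thread inside
            counter += 1     # read-modify-write is now atomic

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()

print(counter)   # 40000, deterministic thanks to mutual exclusion
```

Without the lock, two threads could read the same old value of `counter` and one increment would be lost; the result would then depend on the interleaving.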
Development of Parallel Languages

Sequential Language Extensions
◦ UPC (Unified Parallel C)
◦ CUDA C
◦ Brook (Stanford's Merrimac)
◦ Cilk / Cilk++
◦ HPF (High Performance Fortran)

Libraries
◦ MPI

Parallelizing Compilers
◦ Automatic
◦ Directed (using OpenMP)

New Parallel Languages
◦ Oz