HUMBOLDT-UNIVERSITÄT ZU BERLIN INSTITUT FÜR INFORMATIK
Lecture 5
DATA AND INSTRUCTION FORMATS, ADDRESSING METHODS AND MACHINE PROGRAMMING CONCEPTS (2) Sommersemester 2002 Leitung: Prof. Dr. Miroslaw Malek www.informatik.hu-berlin.de/rok/ca CA - V - D&IF(2) - 1
COMPUTER PROGRAMMING CONCEPTS •
Choice of a language: – machine language (uses 0‘s and 1‘s or hex, inconvenient and error prone) – assembly language (machine dependent but efficient) – high-level language (convenient and universal)
•
Compiler/assembler and operating system in combination with computer architecture (instruction set plus I/O capabilities) are decisive factors with respect to computer performance
•
The most important advantages of knowing an assembly langage: – understanding how a computer really works – learning how to debug programs – being able to optimize the performance CA - V - D&IF(2) - 2
EXAMPLE OF AN INSTRUCTION EXECUTION 4-bit operation field
6-bit source 6-bit destination address field address field
Op-code 15
src
ADD #R7, R3
dstn
12 11
6 5
0
General format 32
R3 PC = R7
R3
49
PC = R7
i
i+4
Autoincrement mode 010 (210): EA=[PC]=[R7] and increment PC=R7
add 2
i i+2
R7
0
R3
17
add 2
i i+2
i+4
R7
0
R3
17
i+4
Before instruction fetch
After instruction execution
CA - V - D&IF(2) - 3
EXAMPLE OF A SHORT PROGRAM EXECUTION Main memory 16-bit word contents address A = 1150 B = 1152
1200 MOV 1202 1204 ADD 1206 1208 MOV 1210 1212
1) R0 = 317
317 - 193
2) R0 = 124
6
R7 - 54
0
R0
EA = X + PC = -54 + 1204 = 1150
6
R7 - 56
0
R0
0
R6 988
6
R7
MOV A, R0 ADD B, R0 MOV R0, C
HALT
HALT
Assembly language version of program Mode “6“ - Relative index addressing EA=X + [PC]=X + [R7 ]
C = 2200
124 CA - V - D&IF(2) - 4
Example of an stw (storeword) instruction execution (Power PC) 6 bit Opcode
5 bit Source register
5 bit Sourceaddress A
Offset
(a) G en eral format
Autoincrement mode stw stw rS, d(rA) , where rS (Source) and rA (Destination) are registers and d is an offset The contents of rS are stored into the word in memory addressed by EA, where EA=(rA)+d. Binary representation of mnemonic stw: 100100 Decimal representation of mnemonic stw: 36 CA - V - D&IF(2) - 5
I/O PROGRAMMING (I) I/O BUS TERMINAL
CPU
TTYIN
TTYOUT
8
10 Op./s
CIN Keyboard
COUT Monitor
Big Problem: Matching Speeds
e.g., 10 - 100 Characters/sec.
TTYIN
8 bit buffer register associated with a keyboard
TTYOUT
8 bit buffer register associated with a monitor
CIN, COUT
control flags
CIN = 1
informs CPU that a valid character is in TTYIN
COUT= 1
informs CPU that it can transfer a valid character to TTYOUT
MOVB TTYIN, R1
input operation
MOVB R1, TTYOUT output operation CA - V - D&IF(2) - 6
I/O PROGRAMMING (II) MOV
#LOC,R0
READ: TSTB KBSTATUS BPL READ MOVB TTYIN,@R0 ECHOBACK: TSTB BPL
PRSTATUS ECHOBACK
Initialize pointer register R0 to contain the address of the first location of the area in main memory where the characters are to be loaded. Wait for a character to be entered into the keyboard buffer register TTYIN. Transfer the character from into the main memory (this clears CIN to 0). Wait for the monitor to become ready.
MOVB @R0,TTYOUT Move the character just read the monitor buffer register TTYOUT for printing (this clears COUT to 0). CMPB (R0)+,#CR BNE READ HALT
Check to see if the character just read is "carriage return” (CR). If it is not CR, branch back and read another character, otherwise stop. The pointer register R0 is incremented , anticipating that another character will be read. CA - V - D&IF(2) - 7
TYPE OF ARCHITECTURES • • • • • •
STACK ARCHITECTURES REGISTER-REGISTER ARCHITECTURES REGISTER-STORAGE ARCHITECTURES STORAGE-STORAGE ARCHITECTURES TAGGED ARCHITECTURES TRADEOFFS AMONG STACK, REGISTER AND STORAGE ARCHITECTURES – – – – – –
PERFORMANCE EFFICIENCY DESIGN COMPLEXITY EASE OF PROGRAMMING EASE OF PARAMETER PASSING AND SUBROUTINE CALLS RECURSION
CA - V - D&IF(2) - 8
A STACK IN THE MAIN MEMORY 0 Stack Pointer Register
. . . Ri
-28 17 739
Current Top Element
.
The Stack
. 43 . . M–1
CA - V - D&IF(2) - 9
Bottom Element
EXAMPLES OF STACK OPERATIONS MOV MOV
R1
R1
-28
NEWITEM,- (R1) (R1)+,TOPITEM
R1
19 -28
17
(push) (pop)
17
R1
739
-11 739
.
739 .
17 739
.
.
.
.
.
.
.
.
.
.
.
.
.
43 Newitem
.
Stack
19
Topitem (a) Initial contents of stack
43 Newitem
43
19
Topitem
43
Newitem
19
Newitem
Topitem
-28
Topitem
(b) After push from NEWITEM
c) After pop into TOPITEM
Notice (a) => (b), (a) => (c), (a) => (d)
CA - V - D&IF(2) - 10
19
d) After executing ADD (R1)+, @R1
STACK MACHINES (1) • • • • • • •
Most instructions manipulate the top few data items (mostly top 2) of a pushdown stack Additional instructions are provided to move data between memory and top of stack Top few items of the stack (2 to 8 items or more) are kept in the CPU Instructions manipulate the top of the stack implicitly Ideal for evaluating expressions (stack holds intermediate results) Were thought to be a good match for high level languages Awkward: – stack becomes very slow if it grows beyond CPU local storage – no simple way to get data from ‘middle of stack‘
CA - V - D&IF(2) - 11
STACK MACHINES (2) •
Binary arithmetic and logic operations: – operands: top 2 items on stack – operands are removed from stack – result is placed on top of stack
•
Unary arithmetic and logic operations: – operand: top item on the stack – operand is replaced by result of operation
•
Data move operations: – push: place memory data on top of stack – pop: move top of stack to memory CA - V - D&IF(2) - 12
STACK MACHINES - SAMPLE PROGRAM •
We evaluate our favorite expression(y ← ax2+ bx+c) ; we use a hypothetical assembly language (as usual) push push dup mult mult push push mult push add add pop
a x
b x c
y
; tos: a ; tos: a x ; tos: a x x ; tos: a x2 ; tos: a x2 ; tos: a x2 b ; tos: a x2 b x ; tos: a x2 bx ; tos: a x2 bx c ; tos: a x2 bx+c ; tos: a x2+bx+c ; y <- a x2+bx+c
CA - V - D&IF(2) - 13
GENERAL PURPOSE REGISTER MACHINES •
With stack machines, only the top two elements of the stack are directly available to instructions. In general purpose register machines, the CPU storage is organized as a set of registers which are equally available to the instructions.
•
Frequently used operands are placed in registers (under program control) – this reduces instruction size – also reduces memory traffic
CA - V - D&IF(2) - 14
GPR MACHINES - SAMPLE PROGRAM •
Again, we evaluate (y←ax2+bx+c) on a hypothetical machine with 16 registers, R0 to R15, and 2 operand register-register ALU operations load load load load mult mult mult add add store
x, R1 a, R2 b, R3 c, R4 R1, R2 R1, R2 R1, R3 R2, R3 R3, R4 R4, y
; ; ; ; ; ; ; ; ; ;
R1 <- x R2 <- a R3 <- b R4 <- c R2 <- ax R2 <- ax2 R3 <- bx R3 <- ax2 + bx R4 <- ax2 + bx + c y <- ax2 + bx + c
CA - V - D&IF(2) - 15
GPR MACHINES - SAMPLE MACHINES •
VAX
•
IBM 360, IBM 370, PC-RT, RISC 6000
•
PowerPC, MOTOROLA 68000
•
CDC AND CRAY
•
SPARC, MIPS, PENTIUM
CA - V - D&IF(2) - 16
EXAMPLE OF PROGRAM AND DATA SEGMENT ORGANIZATION MC68000 Program base
Program segment Data limit (DL)
Data segment
(PB)
Data area Program counter (PC)
Data base (DB)
Program limit (PL)
Stack marker
Increasing memory addresses
Stack filled by previous procedures
Stack pointer (SP)
Stack filled by current procedure Empty stack locations
Stack limit (SL) CA - V - D&IF(2) - 17
CALLING STACK (I) Memory
Calling Program
Subroutine
location
200
Call_subroutine SUB
201
“next instruction”
SUB
“first instruction”
Branch (LINK)
The Call_subroutine instruction places the address 201 into memory location LINK a) Through a dedicated location LINK CA - V - D&IF(2) - 18
CALLING STACK (II) Memory
Calling Program
Subroutine
location
200
Call_subroutine SUB
201
“next instruction”
SUB
“first instruction”
Return
The Call_subroutine instruction
The return instruction
pushes the address 201 onto the stack.
pops the top of the stack into the PC.
b) Using a stack CA - V - D&IF(2) - 19
NESTED SUBROUTINE CALLS ON A STACK (I) R5 [R6] = 1050
old value of R5 topvalue .
Processorstack
. .
a) Initially
R5 is a linkage register (stores the value of the PC). R6 is a pointer to the processor stack.
CA - V - D&IF(2) - 20
NESTED SUBROUTINE CALLS ON A STACK (II) R5
[R6] = 1048
R5
2004
[R6] = 1042
old value of R5
2006
2254
topvalue
[R1] main
.
[R0] main
.
old value of R5 topvalue
.
. . . b) After execution of JSR R5, SUB1
c) After execution of JSR R7, SUB2
CA - V - D&IF(2) - 21
NESTED SUBROUTINE CALLS ON A STACK (III) R5
[R6] = 1040
2006
R5
[R0] sub1
[R6] = 1044
2006
[R1] main
2254
[R0] main
[R1] main
old value of R5
[R0] main
topvalue
old value of R5
.
topvalue
.
.
.
. . d) After execution of MOV R0, -(R6) in SUB 2 CA - V - D&IF(2) - 22
e) After execution of RTS R7
NESTED SUBROUTINE CALLS ON A STACK (IV)
R5
old value of R5
[R6] = 1050
topvalue . . .
f) After execution of RTS R5
2006
CA - V - D&IF(2) - 23
NESTED SUBROUTINE CALLS ON A STACK ii Main program 2000 2004 2006 2008 Subroutine SUB1: 2200
2250 2254
Subroutine 3000 SUB2:
… JSR R5,SUB1 .WORD PARAM .WORD ANSWER next instruction …….
Call SUB1, passing addresses PARAM and ANSWER through memory locations.
MOV R0,-(R6) Save [R0]main and [R1]main on the processor stack. MOV R1,-(R6) Load parameter into R0. MOV @(R5)+,R0 ……. MOV LOC,R1 Place a parameter into R1 and call SUB2. JSR R7,SUB2 next instruction ……. MOV RESULT,@(R5)+ Send [RESULT] as a “result“ into location ANSWER in the main program. MOV (R6)+,R1 Restore original contents of MOV (R6)+,R0 registers R0 and R1 RTS R5 MOV …….. ADD MOV RTS
R0,-(R6)
Save [R0]sub1 on the processor stack.
R0,R1 (R6)+,R0 R7
Send a result to SUB1 through R1. Restore R0. CA - V - D&IF(2) - 24
EXAMPLE PROGRAM FOR THE CALCULATION OF n FACTORIAL ON PowerPC lbz r1, 15(r0)
Load the number n from MEM[15+ r0] into register r1
clz r2 add 1 fac: cmpwi crf0, r1, 2
Load 1 into register r2, which holds the result Compare r1 to 2; crf0 is condition register field 0, which is the same as less than, in other words: if r1<2 the bit in crf0 is set to 1
blt end
Conditional jump to a label end, if bit in crf0 is 1
mullw r2, r1, r2
Multiply r2, by r1, the result is stored into r2
subi r1, r1, 1
Decrement r1 by 1
b fac
Jump to a label fac
end: blr
Branch to linkregister
The result is stored in register r2. CA - V - D&IF(2) - 25