ARM7TDMI-S CPU
MultiMarket Semiconductors BL Standard ICs - Microcontrollers February 2004
ARM Architecture
• Thumb state
• Instruction set
• Processor Modes
• Register usage
• Interrupt Handling
• 3-stage Pipeline

ARM7TDMI-S
The ARM7TDMI-S is based on the ARM7 core
– 3-stage pipeline
– Von Neumann architecture
– CPI ~1.9
– T: Thumb instruction set
– D: includes debug extensions
– M: enhanced multiplier (32x8) with instructions for 64-bit results
– I: core has EmbeddedICE logic extensions
– S: fully synthesisable (soft IP)

Thumb State
Thumb state
• ARM is a 32-bit architecture; Thumb is a 16-bit encoding of a subset of its instructions that still uses 32-bit data and registers
• Set of instructions re-coded into 16 bits
– Improves code density by ~30%
– Saves program memory space
• In Thumb state only the program code is 16 bits wide
– after the 16-bit instructions are fetched from memory, they are decompressed to 32-bit instructions before they are decoded and executed
– all operations are still 32-bit operations

ARM and Thumb Interworking
• Switch between ARM state and Thumb state using the BX instruction
– In ARM state: BX Rn
– In Thumb state: BX Rn
• Rn (n: 0-15) holds the branch target
– bits [31:1]: destination address
– bit [0]: ARM / Thumb selection (0: ARM state, 1: Thumb state)

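The way BX interprets its operand can be sketched in a few lines of Python. This is an illustrative model only, not ARM code: the function name `bx_target` is an assumption made for the example, and the model only separates bit 0 from the rest of the register value.

```python
def bx_target(rn):
    """Split a 32-bit BX operand into (state, destination address).

    Hypothetical helper for illustration: bit 0 selects the state,
    the remaining bits give the destination address.
    """
    rn &= 0xFFFFFFFF
    state = "Thumb" if rn & 1 else "ARM"
    address = rn & 0xFFFFFFFE  # bit 0 is a state flag, not part of the address
    return state, address


print(bx_target(0x00008001))  # ('Thumb', 32768)
print(bx_target(0x00008000))  # ('ARM', 32768)
```

Both calls branch to address 0x8000; only the state selected by bit 0 differs.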
Instruction Set
ARM Instruction Set
• All instructions are 32 bits long
• Many instructions execute in a single cycle
• Most ARM instructions can be conditionally executed
• Can be divided into six broad classes of instruction
– Branch instructions
– Data Processing instructions
– Status register transfer instructions
– Load and Store instructions
– Coprocessor instructions
– Exception-generating instructions

Thumb Instruction Set
• All instructions are 16 bits long
• Most Thumb instructions cannot be conditionally executed
• The Thumb instruction set is a subset of the ARM instruction set
• The same job usually takes more instructions in Thumb than in ARM, resulting in a performance penalty

Processor Modes
Processor Modes(1)
ARM has seven operating modes
– User: unprivileged mode under which most applications run
– FIQ: entered when a high-priority (fast) interrupt is raised
– IRQ: general-purpose interrupt handling
– Supervisor: protected mode for the operating system, entered on reset or on a software interrupt instruction
– System: privileged mode using the same registers as User mode
– Abort: used to handle memory access violations
– Undefined: used to handle undefined instructions

Processor Modes(2)
• User
• Privileged Modes: System, FIQ, IRQ, Supervisor, Abort, Undefined
• Exception Modes: FIQ, IRQ, Supervisor, Abort, Undefined

Privileged and Exception Modes
FIQ, IRQ, Supervisor, Abort, Undefined
• Entered when a specific exception occurs
• Each mode has additional registers to prevent corruption
• On reset the ARM core is in Supervisor mode
• Have access to system resources
• Can change modes freely using ARM instructions

User Mode & System Mode
• User Mode:
– User mode has access to limited system resources
– A program cannot change modes freely from within User mode
– A user program can make a supervisor call using the SWI instruction (SWI stands for Software Interrupt, but it is usually called a supervisor call)
• System Mode:
– System mode is similar to User mode, but is privileged and is used by an OS that needs access to system resources
– System mode is also used during nested interrupt handling

ARM Registers
Registers (1)
An ARM core has 37 registers (32 bits wide)
• General purpose registers
– 1 program counter
– 30 general purpose registers
• Status registers
– 1 current program status register (CPSR)
– 5 saved program status registers (SPSR)
These registers are not all accessible at the same time. The processor state and operating mode determine which registers are available to the programmer.

Registers (II)
• Depending on the processor mode, one of several banks is accessible. Each mode can access
– the program counter r15 (PC)
– a particular r13 (stack pointer, SP) and r14 (subroutine link register, LR)
– a particular set of r0-r12 registers
– the current program status register (CPSR)
• Privileged modes (except System mode) can also access
– a particular SPSR (saved program status register)

Register Banking
Banked registers per mode; blank entries use the User and System register.

User and System  FIQ           IRQ           Supervisor    Abort         Undefined
r0 - r7
r8               r8_fiq
r9               r9_fiq
r10              r10_fiq
r11              r11_fiq
r12              r12_fiq
r13 (SP)         r13_fiq (SP)  r13_irq (SP)  r13_svc (SP)  r13_abt (SP)  r13_und (SP)
r14 (LR)         r14_fiq (LR)  r14_irq (LR)  r14_svc (LR)  r14_abt (LR)  r14_und (LR)
r15 (PC)
CPSR
                 SPSR_fiq      SPSR_irq      SPSR_svc      SPSR_abt      SPSR_und

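The figure of 37 registers can be checked by enumerating the registers each mode sees and counting the distinct physical names. The sketch below is a bookkeeping model only; the function name `visible_registers` and the short mode labels (`usr` standing for both User and System) are assumptions made for the example.

```python
EXCEPTION_MODES = ["fiq", "irq", "svc", "abt", "und"]


def visible_registers(mode):
    """Physical register names visible in a mode ('usr' = User and System)."""
    regs = []
    for n in range(16):
        # FIQ banks r8-r12; every exception mode banks r13 (SP) and r14 (LR).
        banked = (mode == "fiq" and 8 <= n <= 12) or (mode != "usr" and n in (13, 14))
        regs.append(f"r{n}_{mode}" if banked else f"r{n}")
    regs.append("CPSR")
    if mode != "usr":
        regs.append(f"SPSR_{mode}")  # User and System have no SPSR
    return regs


physical = set()
for mode in ["usr"] + EXCEPTION_MODES:
    physical.update(visible_registers(mode))

print(len(physical))  # 37
```

The total splits into 31 general registers (30 general purpose plus the program counter) and 6 status registers (CPSR plus five SPSRs).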
Registers in Thumb State
• The Thumb state register set is a subset of the ARM state set. The programmer has direct access to:
– eight general registers: r0 - r7
– the program counter: PC
– a stack pointer: SP
– a link register: LR
– the current program status register: CPSR
• In Thumb state, the high registers (r8 - r12) are not part of the standard register set. The assembly language programmer has limited access to them, but can use them for fast temporary storage

Thumb vs. ARM
Mapping of Thumb state registers onto ARM state registers:

Thumb state   ARM state
r0 - r7       r0 - r7     (Thumb state low registers)
              r8 - r12    (Thumb state high registers)
r13 (SP)      r13 (SP)
r14 (LR)      r14 (LR)
r15 (PC)      r15 (PC)
CPSR          CPSR
SPSR          SPSR

Register Overview
r0 - r7 are the Thumb state low registers; r8 - r12 are the Thumb state high registers.

User and System  FIQ           IRQ           Supervisor    Abort         Undefined
r0               r0            r0            r0            r0            r0
r1               r1            r1            r1            r1            r1
r2               r2            r2            r2            r2            r2
r3               r3            r3            r3            r3            r3
r4               r4            r4            r4            r4            r4
r5               r5            r5            r5            r5            r5
r6               r6            r6            r6            r6            r6
r7               r7            r7            r7            r7            r7
r8               r8_fiq        r8            r8            r8            r8
r9               r9_fiq        r9            r9            r9            r9
r10              r10_fiq       r10           r10           r10           r10
r11              r11_fiq       r11           r11           r11           r11
r12              r12_fiq       r12           r12           r12           r12
r13 (SP)         r13_fiq (SP)  r13_irq (SP)  r13_svc (SP)  r13_abt (SP)  r13_und (SP)
r14 (LR)         r14_fiq (LR)  r14_irq (LR)  r14_svc (LR)  r14_abt (LR)  r14_und (LR)
r15 (PC)         r15 (PC)      r15 (PC)      r15 (PC)      r15 (PC)      r15 (PC)
CPSR             CPSR          CPSR          CPSR          CPSR          CPSR
                 SPSR_fiq      SPSR_irq      SPSR_svc      SPSR_abt      SPSR_und

Program Status Register (1)
Bit layout:
– bits 31-27: N Z C V Q (condition code flags)
– bit 24: J
– bits 26-25 and 23-8: Reserved
– bits 7-5: I F T (control bits)
– bits 4-0: mode (control bits)
• Condition Code Flags
– N: Negative or less than
– Z: Zero
– C: Carry or borrow or extend
– V: Overflow
To avoid corrupting reserved bits, use a read-modify-write strategy to change PSR bits.

Program Status Register (2)
• Interrupt Disable Bits
– I: IRQ interrupts disable
– F: FIQ interrupts disable
• T Bit
– Thumb state (when set)
– ARM state (when cleared)
• Mode Bits
– 10000 User
– 10001 FIQ
– 10010 IRQ
– 10011 Supervisor
– 10111 Abort
– 11011 Undefined
– 11111 System

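The PSR fields listed above can be decoded with simple shifts and masks. The following Python sketch is a host-side illustration, not target code; the function names `decode_psr` and `with_mode` are assumptions made for the example. `with_mode` shows the read-modify-write idea: only the mode field changes and every other bit is passed through untouched.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_psr(psr):
    """Return the flag, control and mode fields of a 32-bit PSR value."""
    return {
        "N": bool(psr >> 31 & 1),
        "Z": bool(psr >> 30 & 1),
        "C": bool(psr >> 29 & 1),
        "V": bool(psr >> 28 & 1),
        "I": bool(psr >> 7 & 1),   # IRQ disabled when set
        "F": bool(psr >> 6 & 1),   # FIQ disabled when set
        "T": bool(psr >> 5 & 1),   # Thumb state when set
        "mode": MODES.get(psr & 0x1F, "invalid"),
    }


def with_mode(psr, mode_bits):
    """Read-modify-write: replace only bits [4:0], keep all other bits."""
    return (psr & ~0x1F & 0xFFFFFFFF) | (mode_bits & 0x1F)


fields = decode_psr(0x600000D3)
print(fields["mode"], fields["Z"], fields["I"])  # Supervisor True True
print(hex(with_mode(0x600000D3, 0b11111)))       # 0x600000df
```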
Program Counter (r15)
• When the processor is executing in ARM state
– all instructions are 32 bits wide
– all instructions must be word aligned
– bits [31:2] contain the PC, bits [1:0] are zero (instructions cannot be halfword or byte aligned)
• When the processor is executing in Thumb state
– all instructions are 16 bits wide
– all instructions must be halfword aligned
– bits [31:1] contain the PC, bit [0] is zero (instructions cannot be byte aligned)

Interrupt Handling
ARM Exception Vectors and processor mode

Exception                  Mode entered
Reset                      Supervisor
Undefined Instruction      Undefined
Software Interrupt (SWI)   Supervisor
Prefetch Abort             Abort
Data Abort                 Abort
Interrupt (IRQ)            IRQ
Fast Interrupt (FIQ)       FIQ

Exception Vectors table

0x1C  FIQ
0x18  IRQ
0x14  (Reserved)
0x10  Data Abort
0x0C  Prefetch Abort
0x08  Software Interrupt
0x04  Undefined Instruction
0x00  Reset

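For quick reference, the vector addresses and the modes entered can be combined into one lookup table. The Python sketch below only restates the two tables above as data; the names `VECTORS` and `lookup` are assumptions made for the example.

```python
# (vector address, exception, mode entered); the reserved slot has no mode.
VECTORS = [
    (0x00, "Reset", "Supervisor"),
    (0x04, "Undefined Instruction", "Undefined"),
    (0x08, "Software Interrupt", "Supervisor"),
    (0x0C, "Prefetch Abort", "Abort"),
    (0x10, "Data Abort", "Abort"),
    (0x14, "(Reserved)", None),
    (0x18, "IRQ", "IRQ"),
    (0x1C, "FIQ", "FIQ"),
]


def lookup(exception):
    """Return (vector address, mode entered) for a named exception."""
    for address, name, mode in VECTORS:
        if name == exception:
            return address, mode
    raise KeyError(exception)


print(lookup("IRQ"))         # (24, 'IRQ')
print(lookup("Data Abort"))  # (16, 'Abort')
```

Each vector is 4 bytes wide, which is room for exactly one ARM instruction, and FIQ occupies the last slot.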
Exception Handling
• On entering an exception, the ARM core
– saves the address of the next instruction in the appropriate LR: r15 (PC) -> r14_<mode> (LR)
– copies the CPSR into the appropriate SPSR: CPSR -> SPSR_<mode>
– sets the appropriate CPSR bits (control bits: [7] I, [6] F, [5] T, [4:0] mode)
  • interrupt disable bits
  • mode field bits
  • if running in Thumb state, enter ARM state*
– forces the PC to fetch the next instruction from the relevant exception vector
*: all exceptions switch to ARM state!

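The entry sequence can be modelled as a pure function on a small state record. The Python sketch below covers IRQ entry only and is an illustration under stated assumptions: the function name `enter_irq`, the dictionary keys, and the `lr_value` parameter (the value the core places in r14_irq, supplied by the caller rather than computed) are all hypothetical.

```python
I_BIT = 1 << 7
T_BIT = 1 << 5
MODE_MASK = 0x1F
IRQ_MODE = 0b10010
IRQ_VECTOR = 0x18


def enter_irq(cpsr, lr_value):
    """Model the state changes the core makes when it takes an IRQ."""
    new_cpsr = (cpsr & ~MODE_MASK) | IRQ_MODE  # mode field bits
    new_cpsr |= I_BIT                          # interrupt disable bit
    new_cpsr &= ~T_BIT                         # all exceptions switch to ARM state
    return {
        "r14_irq": lr_value,   # return information saved in the banked LR
        "spsr_irq": cpsr,      # old CPSR preserved unchanged
        "cpsr": new_cpsr,
        "pc": IRQ_VECTOR,      # next fetch comes from the exception vector
    }


# User mode, Thumb state, IRQs enabled (CPSR = 0x30) when the IRQ arrives.
state = enter_irq(0x00000030, 0x00008004)
print(hex(state["cpsr"]), hex(state["pc"]))  # 0x92 0x18
```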
Leaving Exception(1)
• To leave an exception, the exception handler must
– copy the SPSR back into the CPSR (automatically restoring I, F and T as well): SPSR_<mode> -> CPSR
– move the contents of the current LR minus an offset* to the PC: r14_<mode> (LR) - offset -> r15 (PC)
*: the offset varies according to the type of exception: 0 (SWI, Undefined Instruction), 4 (FIQ, IRQ, Prefetch Abort) or 8 (Data Abort)

Leaving Exception-Example(2)
• After servicing an IRQ, execute the following instruction
SUBS PC,R14_irq,#4
• This restores both PC and CPSR
– SPSR_irq -> CPSR
– r14_irq (LR) - 4 -> r15 (PC)

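The effect of `SUBS PC,R14_irq,#4` can be written out as a two-field update. This Python sketch is a model for illustration, not a replacement for the instruction; the function name `return_from_irq` and the dictionary keys are assumptions made for the example.

```python
def return_from_irq(r14_irq, spsr_irq):
    """Model SUBS PC, R14_irq, #4: restore PC and CPSR in one step."""
    return {
        "pc": (r14_irq - 4) & 0xFFFFFFFF,  # LR minus the IRQ offset of 4
        "cpsr": spsr_irq,                  # restores mode, I, F and T together
    }


# Saved state from an interrupted Thumb-state User program (CPSR = 0x30).
restored = return_from_irq(0x00008008, 0x00000030)
print(hex(restored["pc"]), hex(restored["cpsr"]))  # 0x8004 0x30
```

Because the T bit comes back with the rest of the CPSR, the interrupted program resumes in the state (ARM or Thumb) it was running in.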
Multiple Exceptions
• Exception priorities
– When multiple exceptions arise at the same time, a fixed priority system determines the order in which they are handled
1. Reset (highest priority)
2. Data Abort (data memory access cannot be completed)
3. FIQ
4. IRQ
5. Prefetch Abort (instruction memory access cannot be completed)
6. Undefined Instruction
7. SWI - Software Interrupt (to enter supervisor mode) (lowest priority)

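The fixed priority order amounts to picking the pending exception that appears earliest in the list. A minimal Python sketch follows, with the list copied from the slide; the names `PRIORITY` and `highest_priority` are assumptions made for the example.

```python
# Index 0 is the highest priority.
PRIORITY = [
    "Reset",
    "Data Abort",
    "FIQ",
    "IRQ",
    "Prefetch Abort",
    "Undefined Instruction",
    "SWI",
]


def highest_priority(pending):
    """Return the pending exception that is handled first."""
    return min(pending, key=PRIORITY.index)


print(highest_priority(["IRQ", "FIQ"]))             # FIQ
print(highest_priority(["SWI", "Prefetch Abort"]))  # Prefetch Abort
```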
FIQ - Why is it called so?
• This mode has its own set of banked registers, r8-r12, so few or no stack operations are required
• FIQ is the last interrupt vector in the vector table, so the ISR can start at the vector address and no jump is needed to reach it
• ARM recommends that only one interrupt source be classified as FIQ

Interrupt Latency
• Latency can be between 5 and 27 processor clocks
• Refer customers to the ARM7TDMI-S Technical Reference Manual for details

Instruction Pipeline
Instruction Pipeline
• The ARM7TDMI-S core uses a pipeline to increase the speed of the flow of instructions to the processor. This enables several operations to take place simultaneously
• The Program Counter (PC) points to the instruction being fetched rather than to the instruction being executed
• During normal operation, while one instruction is being executed, its successor is being decoded, and a third instruction is being fetched from memory

3-Stage Instruction Pipeline

Stage    ARM     Thumb   Operation
Fetch    PC      PC      Instruction fetched from memory
Decode   PC - 4  PC - 2  Thumb only: Thumb instruction decompressed to ARM instruction; instruction decoded
Execute  PC - 8  PC - 4  Registers read from register bank, shift and ALU operations performed, registers written back to register bank

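The PC offsets in the pipeline table follow from the instruction width: the instruction in Execute was fetched two instructions before the one the PC currently points at. A small Python sketch of that arithmetic is shown here; the function name `pipeline_addresses` is an assumption made for the example.

```python
def pipeline_addresses(pc, thumb=False):
    """Addresses of the instructions in each stage for a given PC value."""
    width = 2 if thumb else 4  # bytes per instruction
    return {
        "Fetch": pc,
        "Decode": pc - width,
        "Execute": pc - 2 * width,
    }


print(pipeline_addresses(0x8008))              # Execute is at 0x8000 in ARM state
print(pipeline_addresses(0x8004, thumb=True))  # Execute is at 0x8000 in Thumb state
```

Read the other way round, an executing instruction at address A sees the PC as A + 8 in ARM state and A + 4 in Thumb state.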
Optimal Pipelining
– In this example it takes 6 clock cycles to execute 6 instructions
– All operations are on registers (single cycle instructions)
– Clock cycles per instruction (CPI) = 1

Cycle  1      2       3        4        5        6        7        8
ADD    Fetch  Decode  Execute
SUB           Fetch   Decode   Execute
MOV                   Fetch    Decode   Execute
AND                            Fetch    Decode   Execute
ORR                                     Fetch    Decode   Execute
EOR                                              Fetch    Decode   Execute
CMP                                                       Fetch    Decode
RSB                                                                Fetch

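The ideal schedule can be generated mechanically: each instruction enters Fetch one cycle after its predecessor and spends one cycle per stage. The Python sketch below assumes exactly that (single-cycle instructions, no stalls); the function name `schedule` is an assumption made for the example.

```python
STAGES = ["Fetch", "Decode", "Execute"]


def schedule(instructions):
    """Return [(name, {cycle: stage})] for an ideal 3-stage pipeline.

    Instruction i (counting from 0) is fetched in cycle i + 1, and every
    instruction is assumed to spend exactly one cycle in each stage.
    """
    plan = []
    for index, name in enumerate(instructions):
        cycles = {index + 1 + offset: stage for offset, stage in enumerate(STAGES)}
        plan.append((name, cycles))
    return plan


program = ["ADD", "SUB", "MOV", "AND", "ORR", "EOR"]
plan = schedule(program)
execute_cycles = [cycle for _, cycles in plan
                  for cycle, stage in cycles.items() if stage == "Execute"]
print(execute_cycles)  # [3, 4, 5, 6, 7, 8]
```

Once the pipeline is full, one instruction completes per cycle: the six Execute stages occupy six consecutive cycles, which is where the CPI of 1 comes from.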
Branch Pipeline Example
– Branches break the pipeline
– Example in ARM state

Cycle          1      2       3        4        5        6        7
0x8000  BL     Fetch  Decode  Execute  Linkret  Adjust
0x8004  X             Fetch   Decode
0x8008  X                     Fetch
0x8FEC  ADD                            Fetch    Decode   Execute
0x8FF0  SUB                                     Fetch    Decode   Execute
0x8FF4  MOV                                              Fetch    Decode
0x8FF8  AND                                                       Fetch

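The cost of the branch in this example can be approximated with a simple cycle count. The Python sketch below is a rough model, assuming a taken branch occupies the Execute stage for 3 cycles (Execute, Linkret, Adjust in the diagram), every other instruction takes 1 cycle, and 2 cycles are needed to fill the pipeline at the start; the function name `total_cycles` is an assumption made for the example.

```python
PIPELINE_FILL = 2  # cycles before the first instruction reaches Execute


def total_cycles(instructions, branch_cycles=3):
    """Cycle in which the last listed instruction finishes executing.

    `instructions` is a list of (name, is_taken_branch) pairs.
    """
    execute = sum(branch_cycles if taken else 1 for _, taken in instructions)
    return PIPELINE_FILL + execute


print(total_cycles([("BL", True), ("ADD", False), ("SUB", False)]))  # 7
```

The result matches the diagram: SUB executes in cycle 7, whereas three single-cycle instructions without a branch would finish in cycle 5.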
Reference
• ARM Architecture Reference Manual
– Available with ARM tools
– Also available as PDF
• ARM System-on-Chip Architecture
– By Steve Furber