UNIX Kernel Architecture User programs libraries
trap
User level Kernel level
Sytem call interface
File subsytem
Buffer cache Chracter
Process control sybsystem
Inter-process communication scheduler Memory management
block
Device drivers
Kernel level
Hardware level
Hardware control
hardware
1
System Calls • System call are similar to ordinary functions • Libraries map these function call to primitives needed to enter OS • Assembly programs can directly invoke system call • System calls can be categorized into : – File subsystem related calls – Process control system related calls
2
• File subsystem related calls – Allocating file space – Free space management – File access control – open, close read, write, stat, chown, chmode etc • Process control system related calls – Process synchronization – Interprocess communication – Memory management – Process scheduling – Fork, exec, wait, exit etc. 3
Buffering • File subsystem accesses file data using buffering mechanism • The mechanism regulates the flow between buffer and secondary storage • The mechanism interacts with block i/o device driver to initiate the data transfer • block I/O devices are random access storage devices • Device drivers are kernel module that controls the device operation 5
Buffer Cache • When a process wants to access data from a file, the kernel brings the data into main memory, alters it and then request to save in the file system • Buffering improves the response time , throughput • Buffering minimizes the frequency of disk access by keeping a pool of internal data buffer called buffer cache. 6
Buffer Cache • Buffer cache contains the data in recently used disk blocks • When reading data from disk, the kernel attempts to read from buffer cache. • If data is already in the buffer cache, the kernel does not need to read from disk • If data is not in the buffer cache, the kernel reads the data from disk and cache it 7
Buffer Headers • A buffer consists of two parts – a memory array – buffer header
• disk block : buffer = 1 : 1 device num block num
ptr to data area
status ptr to previous buf on hash queue ptr to next buf on hash queue ptr to previous buf on free list ptr to next buf on free list Buffer Header
8
Buffer Headers • device num – logical file system number
• block num – block number of the data on disk
• status – – – – –
The buffer is currently locked/busy. The buffer contains valid data. delayed-write The kernel is currently reading or writing the contents of buffer to disk. A process is currently waiting for the buffer to become free.
• kernel identifies the buffer content by examining device num and block num.
9
Structures of the buffer pool • Buffer pool according to LRU • The kernel maintains a free list of buffer – doubly linked list – take a buffer from the head of the free list. – When returning a buffer, attaches the buffer to the tail. Forward ptrs
free list head
buf 1
buf 2
buf n Back ptrs
Free list of Buffers
10
Structure of the buffer pool • When the kernel accesses a disk block – separate queue (doublely linked circular list) – hashed as a function of the device and block num – Every disk block exists on one and only on hash queue and only once on the queue Hash queue headers
28
4
64
blkno1 mod 4
17
5
97
blkno2 mod 4
98
50
10
blkno3 mod 4
3
35
99
blkno0 mod 4
Buffers on the Hash Queues
11
Scenarios for retrieval of a buffer • Determine the logical device num and block num • The algorithms for reading and writing disk blocks use the algorithm getblk – The kernel finds the block on its hash queue • The buffer is free. • The buffer is currently busy.
– The kernel cannot find the block on the hash queue • The kernel allocates a buffer from the free list. • In attempting to allocate a buffer from the free list, finds a buffer on the free list that has been marked “delayed write”. • The free list of buffers is empty.
12
Retrieval of a Buffer:1st Scenario (a) • The kernel finds the block on the hash queue and its buffer is free Hash queue headers
28
4
64
17
5
97
98
50
10
3
35
99
blkno0 mod 4 blkno1 mod 4
blkno2 mod 4 blkno3 mod 4
freelist header
Search for block 4
13
Retrieval of a Buffer:1st Scenario (b) 28
4
64
17
5
97
98
50
10
3
35
99
blkno0 mod 4
blkno1 mod 4
blkno2 mod 4
blkno3 mod 4
freelist header
Remove block 4 from free list
14
Buffer allocation • getblk algorithm is used for buffer allocation • Once buffer is allocated kernel can – Read data from disk into buffer – Manipulate or write data to buffer or disk – Mark buffer busy. ( no other process can access it and change its content )
• When kernel finishes using buffer , it releases the buffer (brelse) and places it at end of freelist
15
Retrieval of a Buffer: 2nd Scenario (a) • The kernel cannot find the block on the hash queue, so it allocates a buffer from free list Hash queue headers
28
4
64
17
5
97
98
50
10
3
35
99
blkno0 mod 4 blkno1 mod 4
blkno2 mod 4 blkno3 mod 4
freelist header
Search for block 18: Not in cache
16
Retrieval of a Buffer: 2nd Scenario (b) Hash queue headers
28
4
64
17
5
97
98
50
10
35
99
blkno0 mod 4 blkno1 mod 4
blkno2 mod 4 blkno3 mod 4
18
freelist header
Remove 1st block from free list: Assign to 18 17
Retrieval of a Buffer: 3rd Scenario (a) • The kernel cannot find the block on the hash queue, and finds delayed write buffers on hash queue Hash queue headers
28
4
64
17
5
97
blkno0 mod 4 blkno1 mod 4
blkno2 mod 4 blkno3 mod 4
delay 98
50
10
3
35
99
delay freelist header
Search for block 18, Delayed write blocks on free list 18
Retrieval of a Buffer: 3rd Scenario (b) Hash queue headers
28
64
blkno0 mod 4 blkno1 mod 4
17
5
97
writing blkno2 mod 4
blkno3 mod 4
98
50
10
3
35
99
18
writing freelist header (b) Writing Blocks 3, 5, Reassign 4 to 18
Complete writing and put buffers at the head of free list
19
Retrieval of a Buffer: 4th Scenario • The kernel cannot find the buffer on the hash queue, and the free list is empty Hash queue headers
28
4
64
17
5
97
blkno2 mod 4
98
50
10
blkno3 mod 4
3
35
99
blkno0 mod 4
blkno1 mod 4
freelist header
Search for block 18, free list empty
20
• Process goes to sleep till some other process executes brelse • When the process is scheduled it must search the hash queue again because some other process might have taken the block already
21
Retrieval of a Buffer: 5th Scenario • Kernel finds the buffer on hash queue, but it is currently busy Hash queue headers
28
4
64
17
5
97
98
50
10
3
35
99
blkno0 mod 4 blkno1 mod 4
blkno2 mod 4 blkno3 mod 4
busy freelist header
Search for block 99, block busy
22
Process A acquires a buffer, marks it busy and goes to sleep. Process B needs the same block but find it ‘busy’. So it marks the block as ‘in demand’. Process A will eventually release the buffer and will find that it was in demand. Process A will issue a wakeup to all processes waiting for the buffer. B must now check that the buffer is indeed free and no other process has locked it or brought in some other block into it.
23
25
Internal representation of file • Every file on a Unix system has a unique Inode • inode contains the information necessary for a process to access a file • It contains the description of the disk layout of the file data, information about the file owner, access permissions, access time and data block addresses etc
27
File System – Data Structure • Kernel contains three Data Structure – User file descriptor table • Kernel allocates this table for each running process • When a file is opened/created with in a process, its entry is made in this table. • Index of user file descriptor table where this entry is made is return to the user program as File Descriptor. – File Table • It’s a global kernel structure which maintains the file offset. • It also maintains access rights allowed to the opening process. – Inode Table • It’s a table of inodes which contain information of each file. – Each of the above tables makes a entry for a file.
File System Structure
• Unix divided physical disks into logical disks called partitions. • Each partition is a standalone file system. • a boot block is located in the first few sectors of a file system. The boot block contains the initial bootstrap program used to load the operating system. • a super block describes the state of the file system: the total size of the partition, the block size, pointers to a list of free blocks, the inode number of the root directory etc. • a linear array of inodes (short for “index nodes”). There is a one to one mapping of files to inodes and vice versa. • data blocks blocks containing the actual contents of files
Structure of a Regular file Direct 0 Direct 1 Direct 2 Direct 3 Direct 4 Direct 5 Direct 6
Direct 7 Direct 8 Direct 9 Single Indirect
Double Indirect Triple Indirect
Example • Assume – Logical block = 1 K Bytes – Block number is addressable by a 32 bit (4 bytes) – A block can hold 256 block numbers – Maximum number of bytes possible in a file = (10K + 256K + 64M +16G)bytes
4096 228 45423 0 0 11111 0 101 367 0 428 9156 824
367 0 331
9156
753333 331
3333
• E.g. 1 – If a process wants to access byte offset 9000 • Kernel calculate that the byte is in 8th block (starting from 0) • It accesses the physical block number stored in 8th direct map location. (If 367 is the value stored in 8th direct map location then access 367th block from the disk.) • Then access 808th byte (9000 – 8192) in 367th block. • This will be the resultant 9000th byte in file.
• E. g. 2 – If process want to access byte offset 350,000 in the file – 9th direct pointer can store till (10K-1) – Single indirect pointer can store from 10K to (266K-1) – Double indirect pointer can store from 266K to (64266K-1) – The request is between these value. So the request is in double indirect pointer.
• 350000 in a file is 350000 – 266K in double indirection • = 77616 of a double indirection. • Since each single indirection can access 256K, 77616 is in the 0th single indirection block of double indirection block. • So we will fetch block number 331. • Since the direct block in a single indirect block hold 1KB, 77616 is in 75th direct block in the single indirect block. • So we will fetch block number 3333. • The offset location from where we need to access the byte is calculated by 77616 – 75K = 816 • If inode block entry is 0, then the logical block entries contain no data.