CE 466 System Programming
File System and Processing Computer Engineering Department Yarmouk University 10/12/2008
1
Files • A named collection of related info • Consists of a sequence of bits, bytes, lines, or records • For each file, the OS keeps the following info: • • • • •
Name Type Location Size Important dates 2
1
File Access Methods • Sequential Access
• Direct Access
3
File system •
A file system provides a logical view of the file (data/information)
4
2
File system • • • •
Who maps the user (logical) view of the files to the blocks where the file’s data is stored? If not the OS, the user should!!! The OS provides a mapping between the logical and physical units of storage The OS provides a set of basic operations for file manipulation such as: create, open, read write, …. 5
Layered File System • The mapping is implemented using a layered approach • Lower levels: device specific I/O control • Intermediate levels: • Basic file system which issues commands to the appropriate device driver to read and write physical blocks (drive 1, track 2, ..) • File organization module translates logical block address to a physical one 6
3
Layered File System • Each file’s logical blocks are numbered from 1 to N • Corresponding physical blocks could have any addresses (depends on allocation policy)
• Upper level (logical file system): • Manages file system structures including: • Directory structures to provide file organization module • File control block (FCB) that contains info about the file including location of file contents, ownership, permissions 7
Directories • A file system may be divided into partitions • Each partition will have a directory structure • Directories store information such as name, size, and location of a file • Operations performed on directories include: search for a file, create a file, list a directory, traverse a file system 8
4
Directory Structure •
Collection of nodes that contains information about all files Directory
Files
•
F1
F2
F3
F4
Fn
Both directory structures and files reside on disk 9
Directory Design Constraints • Organize directory for: • Efficiency in locating files quickly • Convenient naming for users • 2 users may have the same file names • A file may have multiple names
• Logical grouping of files (all games, all c++ code, …)
10
5
Single-Level Directory • A single directory for all users
• Naming problem • Grouping problem 11
Two-Level Directory • Separate directory for each user
• Different users may have the same file name • Efficient searching • Sharing might be hard 12
6
Multi-Level Directory • Efficient searching
• Grouping Capability 13
General Graph Directory
14
7
Directory Implementation • Linear list of file names with pointer to the data blocks • simple to program • time-consuming to execute
• Hash Table – linear list with hash data structure • decreases directory search time • collisions – situations where two file names hash to the same location • fixed size
15
File System Structures • On-disk structure: (nonvolatile) • Boot control block, which contains information needed to boot the OS (boot block in UNIX file system ‘UFS’) • Partition control block, which contains partition details such as number of blocks, size of blocks, free block count, .. (superblock in UFS) • A directory structure used to organize the files • An FCB which contains file details (inode in UFS) 16
8
File System Structures •
In memory: (volatile) • • • • •
Info is used for file-system management and to improve performance Partition table Directory structures of recently accessed directories System-wide open-file table, which includes a copy of the FCB of each open file Per-process open-file table, which contains a pointer to the appropriate FCB in the systemwide open-file table
17
A Typical File Control Block (FCB)
File Permissions File Dates (create, access, write) File Owner, Group File Size File Data Blocks
18
9
Open-File Table • Saves time since no need to search for the file every time an I/O operation is performed • When a file is opened, its info is added to the open-file table • The operations are performed using the index in the table • Info in the open-file table include: file pointer, open count,… 19
File System Operations (create a file) • Create new file • Reads the directory into memory • Knowing the directory structure, the logical file system creates a FCB • Updates the directory structure with the new FCB • Writes it back to disk
20
10
File System Operations (open a file) • Before a file can be used for I/O, it must be opened: • Open system call searches the system-wide openfile table to see if the file is already open • If the file is not already open • Directory structure is searched • When file is found, its FCB is copied into the systemwide open-file table
21
File System Operations (open a file) • Entry in the per-process open-file table with a pointer to the entry in the system-wide open-file table is added (also a pointer to the current location in the file is added) • The open file command returns a pointer (file descriptor in UFS) to the appropriate entry in the per-process file table • All file operations are performed by this pointer 22
11
Allocation Methods
• Refers to how disk blocks are allocated for files: • Contiguous allocation • Linked allocation • Indexed allocation
23
Contiguous Allocation
24
12
Contiguous Allocation • Each file occupies a set of contiguous blocks on the disk • Simple, only starting location and length (number of blocks) are required • Random access? • External fragmentation (dynamic storageallocation problem) • Files hard to grow 25
Modified Contiguous Allocation • Known as extent-based systems • Initially, an extent (contiguous chunk of space is allocated • Another extent is allocated if need arises • File’s blocks are recorded as a location and block count plus a link to the first block in the next extent 26
13
Linked Allocation
27
Linked Allocation • Each file is a linked list of disk blocks that may be scattered anywhere on the disk • Simple – need only starting address and no sizedeclaration is necessary • No fragmentation or waste of space • Sequential access only (no random access) • Each block consumes 4 bytes to point to the next block (4B in 512B/B.78% waste) • If one pointer is lost, the file state is not guaranteed 28
14
File-Allocation Table
29
File-Allocation Table (FAT) • Used by MS-DOS and OS2 • A section of space at the beginning of each partition is set a side to contain the table • The table has entry for each disk block and is indexed by block number • Directory entry contains the block number of the first block of the file • The table entry contains the block number of the next block in the file • The last block of the file has a special end of file entry 30
15
Indexed Allocation
31
Indexed Allocation • Solves the problem of direct access • Brings the pointers into one location: the index block • Index block is an array of disk-block addresses • The ith entry in the index block points to the ith block in the file
• Directory contains the address of the index block 32
16
Indexed Allocation • Each file has an index block • If file is small few non null pointers (i.e. space is wasted more than with a linked list) • If file is big Index block may not be sufficient to hold enough pointers to file blocks
33
Index blocks and big files • Linked scheme • Have as many index block as needed to accommodate a file • Last pointer in the current index block points to the next index block
Index Block 0
Index Block 1
34
17
Index blocks and big files • Multilevel index • Use the first-level index block to point to a set of second-level index blocks • Second-level index block point to the file blocks • This approach could continue to a third of fourth level • 4 KB blocks can store 1 K of 4B pointers in an index block Two level of indexes allow 1M 4B pointers 4GB file 35
Multilevel index
M
outer-index
index table
file 36
18
Index blocks and big files • Combined scheme • Use a combination of direct blocks and indirect blocks • The indirect blocks may use multilevel index • UNIX use direct blocks, indirect blocks, double and triple indirect blocks
37
Combined Scheme: UNIX (4K bytes per block)
38
19
Free-Space Management • Bit vector (n blocks)
bit[i] =
678
… 1 ⇒ block[i] free 0 ⇒ block[i] occupied
• Block number calculation (number of bits per word) * (number of 0-value words) + offset of first 1 bit 39
Linked Free Space List on Disk
40
20
Standard ‘C’ File Read and Write • Uses standard C library's input and output functions • Portable across all operating systems • Buffers read and write operations, making file operations faster and more efficient • Based on FILE data type (also called stream): • Location in file (where to read or write next) • Read and write buffers 41
Opening Files • To work with a file, we must open it first, using fopen() • Returns a (FILE *) on success or NULL on failure FILE* f_read; FILE* f_write; FILE* f_readwrite; FILE* f_append; f_read = fopen("/home/choo/data.txt", "r"); f_write = fopen("logfile", "w"); f_readwrite = fopen("/usr/local/lib/db/users", "r+"); f_append = fopen("/var/adm/messages", "a");
42
21
Closing Files • When done working with the file, we need to close it using fclose() • returns 0 on success and -1 on failure
• Closing a file does the following: • Flushes unsaved changes to disk (OS disk cache) • Frees file descriptor and other file resources if (!fclose(f_readwrite)) { perror("Failed closing file'/usr/local/lib/db/users':"); exit(1); } 43
Reading from a File int c; char buf[201]; /*read a single character from the file*/ c = fgetc(f_read); /* read one line from the file */ fgets(buf, 201, stdin); /* place the given character back into file stream */ ungetc(c, stdin); /* check if the read/write head has reached EOF */ if (feof(f_read)) { printf("End of file reached\n"); } /* read one block of 120 characters */ if (fread(buf, 120, 1, f_read) != 1) { perror("fread"); } 44
22
Writing to a File int c; char buf[201]; /* write the character 'a' to the given file. */ c = 'a'; fputc(c, f_readwrite); /* write the string "hello world" to the given file. */ strcpy(buf, "hello world"); fputs(buf, f_readwrite); /* write the string "hi there, mate" to the (screen) */ fprintf(stdout, "hi there, mate\n"); /* write out any buffered writes to the given file stream. */ fflush(stdout); /* write twice the string "hello, great world!\n" */ /* third parameter to fwrite():the number of blocks to write). */ strcpy(buf, "hello, great world. we feel fine!\n"); if (fwrite(buf, strlen(buf), 2, f_readwrite) != 2) { perror("fwrite"); }
45
Moving Read/Write Location /* move the read/write pointer to position '30' */ /* first position in the file is '0', not '1'. */ fseek(f_read, 29L, SEEK_SET); /* move the read/write pointer of the file stream 25 characters*/ /* forward from its given location. */ fseek(f_read, 25L, SEEK_CUR); /* remember the current read/write pointer's position, /* move it to location '520' in the file, write the string /* "hello world", and move the pointer back to the /* previous location. */ long old_position = ftell(f_readwrite); if (old_position < 0) { perror("ftell"); exit(0); } if (fseek(f_readwrite, 520L, SEEK_SET) < 0) { perror("fseek(f_readwrite, 520L, SEEK_SET)"); exit(0); } fputs("hello world", f_readwrite); if (fseek(f_readwrite, old_position, SEEK_SET) < 0) { perror("fseek(f_readwrite, old_position, SEEK_SET)"); exit(0);} 46
23
Moving Read/Write Location (other functions) #include <stdio.h> int fseek(FILE *stream, long offset, int whence); long ftell(FILE *stream); void rewind(FILE *stream); int fgetpos(FILE *stream, fpos_t *pos); int fsetpos(FILE *stream, fpos_t *pos);
47
Accessing Files Using System Calls • • • • •
Accessing files is done best using the standard C library functions Sometimes, more low-level operations are needed such as checking file permissions Unix treats different devices similar to accessing files Sometimes called “low level I/O” This form of I/O is UNBUFFERED i.e. each read/write request results in accessing disk (or device) directly to fetch/put a specific number of bytes. 48
24
File Descriptors • A File descriptor is the basic system object to manipulate files • A File descriptor is an +ve integer used to access a memory area containing data about the open file • It is different for different files to uniquely identify a file
49
Creating Files • To create a file use the creat() system call int creat(char *file_name, int mode)
50
25
Creating Files /usr/include/sys/stat.h: #define S_IRWXU 0000700 /* -rwx------ */ #define S_IREAD 0000400 /* read permission, owner */ #define S_IRUSR S_IREAD #define S_IWRITE 0000200 /* write permission, owner */ #define S_IWUSR S_IWRITE #define S_IEXEC 0000100 /* execute/search permission, owner */ #define S_IXUSR S_IEXEC #define S_IRWXG 0000070 /* ----rwx--- */ #define S_IRGRP 0000040 /* read permission, group */ #define S_IWGRP 0000020 /* write " " */ #define S_IXGRP 0000010 /* execute/search " " */ #define S_IRWXO 0000007 /* -------rwx */ #define S_IROTH 0000004 /* read permission, other */ #define S_IWOTH 0000002 /* write " " */ #define S_IXOTH 0000001 /* execute/search " " */
51
Opening Files • int open(char *filename, int flag, int perms) • •
Returns a file descriptor or -1 for when it fails O_APPEND, O_CREAT, O_EXCL, O_RDONLY, O_RDWR, O_WRONLY + others see online man pages or reference manuals
int fd_read; int fd_write; int fd_readwrite; int fd_append; /* Open the file /etc/passwd in read-only mode. */ fd_read = open("/etc/passwd", O_RDONLY); if (fd_read < 0) { perror("open"); exit(1); } 52
26
File Operations • int close(int handle) • int read(int handle, char *buffer, unsigned length) • int write(int handle, char *buffer,unsigned length) • Length: is used to specify the number of bytes to be read or written into a file • The sizeof() is commonly used to specify the length
53
Example /* program to read a list of floats from a binary file first byte of file is an integer saying how many floats in file. Floats follow after it */ /* command line */ #include<stdio.h> #include float bigbuff[1000]; 54
27
Example main(int argc, char **argv) { int fd; int bytes_read; int file_length; if ( (fd = open(argv[1],O_RDONLY)) = -1) { /* error file not open */.... perror("Datafile"); exit(1); } if ( (bytes_read = read(fd,&file_length, sizeof(int))) == -1) { /* error reading file */... exit(1); } if ( file_length > 999 ) {/* file too big */ ....} if ( (bytes_read = read(fd,bigbuff,file_length*sizeof(float))) == -1) { /* error reading open */... exit(1); } } 55
fsync() • •
int fsync(int fildes); To ensure that the file on the physical disk gets updated immediately
#include /*declaration of fsync()*/ . . . . if (fsync(fd) == -1) { perror("fsync"); }
56
28
access() • int access(const char *path, int amode);
• The return value is zero if the requested access is possible and -1 otherwise. • The parameter amode can take one of the following four symbolic values. • • • •
R_OK Test for read permission W_OK Test for write permission X_OK Test for execute permission F_OK Test for existence
• Check: stat() and chmod() 57
Other System Call • off_t lseek(int fildes, off_t offset, int whence);
• lseek - reposition read/write file offset • int rename(const char *old, const char *new); • int unlink(const char *path);
58
29
Reading the Contents of a Directory • DIR structure for directory reading is what the FILE structure is for files • contains information used by other calls to read the contents of the directory • When reading the contents of a directory, we need to open a directory (returns a DIR structure). • The data regarding a directory entry is returned in a dirent structure. • The only relevant field in this structure is d_name, • A null-terminated character array, containing the name of the entry (a file or a directory). 59
Opening And Closing A Directory #include /* struct DIR, struct dirent, opendir().. */ /* open the directory "/home/users" for reading. */ DIR* dir = opendir("/home/users"); if (!dir) { perror("opendir"); exit(1); } When we are done reading from a directory, we can close it using the closedir() function: if (closedir(dir) == -1) { perror("closedir"); exit(1); } 60
30
Reading The Contents Of A Directory /* this structure is used for storing the name of each entry in turn. */ struct dirent* entry; /* read the directory's contents, print out the name of each entry. */ printf("Directory contents:\n"); while ( (entry = readdir(dir)) != NULL) { printf("%s\n", entry->d_name); }
61
31