'OS' Concepts

1. Introduction to operating system
   • Components of an Operating System
   • Operating System Executive
   • Types of operating system
   • Batch Systems
   • Interactive systems
   • Objectives of an operating system

2. About Operating system and Kernels
   • O.S vs Kernels
   • Batch processing
   • Multi-user, Multi-processing O.S
   • Layered Processing
   • Distributed system
   • Real Time O.S

3. Operating System components
   • Intro to Process management sub system
   • Intro to Memory management sub system
   • Intro to File sub system
   • Intro to Device drivers
   • Signals and System calls

4. Process management sub system
   • Creating process
   • Process Id, process descriptor
   • Process contexts
   • Process state
   • Process address space
   • Process scheduling policies
   • Threads vs process
   • Multi threading and Hyper threading

5. Memory management sub system
   • Segmentation
   • Paging
   • Page allocation strategies
   • Memory mapping and unmapping

6. File sub system
   • Importance of File system
   • File representation in the system
   • File creation and maintenance
   • Block devices and file system

7. Resource sharing in OS
   • File locking
   • Record locking
   • Dead locks
   • Advisory locking and mandatory locking
   • Concept of Semaphores and mutex

Chapter 1
Introduction To OS

Components of an Operating System
Operating System Executive
Types of operating system
Batch Systems
Interactive systems
Objectives of an operating system

An operating system is a collection of programs used to control a computer system.

Components of an Operating System

Booting Loader: The function of the bootstrap loader program, which is usually contained in ROM, is to read the main portion of the operating system from secondary memory.

Diagnostic Tests: The operating system contains a number of diagnostic programs that test the operation of the system components, e.g. checking the operation of the disk drive and checking the operation of the RAM.

Operating System Executive: The operating system executive is a program that controls the activities of the system (also called the monitor program), e.g. executing programs and assigning tasks to hardware devices. The operating system executive is read into RAM when the computer is started. Some sophisticated operating system executives allow several programs to run simultaneously.

BIOS (Basic Input Output System): The BIOS programs are low-level programs. Some functions of the BIOS are:
• Reading a character from the keyboard
• Writing a character on the video display
• Writing a character to the printer
• Determining whether the printer is busy or not
• Reading a sector from the disk in a particular disk drive
The BIOS programs are usually recorded in ROM.

Utility Programs: Every computer system needs utility programs to perform housekeeping functions, such as:
• Formatting a disk
• Displaying the contents of a disk
• Copying the contents of one disk to another
• Making a backup copy of the contents of a hard disk

File Maintenance: The operating system provides service routines that can be used to maintain files on the various secondary memories. These file-maintenance routines allow programs to create, read, and write files.

TYPES OF OPERATING SYSTEMS

Batch Systems: A batch operating system accepts jobs and places them in a queue to await execution. This process is often called spooling. The operating system selects jobs from the job queue based on priorities. Batch jobs may be executed on a serial basis.

Interactive systems: An interactive operating system allows users to interact directly with a computer from a terminal. In effect, a user can interrupt a low-priority batch job and cause the computer to perform the user's high-priority work. Interactive systems must be multiprogramming systems.

Multi Tasking: Multitasking is the capability of a CPU to execute two or more programs concurrently. Essentially, two or more programs are stored concurrently in primary storage.

Operating system as resource allocator:

A computer system has many resources that may be required to solve a problem: CPU time, memory space, file storage space, I/O devices, and so on. The operating system acts as a manager of these resources and allocates them to specific programs and users as necessary for their tasks.

WHAT IS AN OPERATING SYSTEM? An operating system is software that performs the task of overall resource management for a computer system. In this course we shall look into the basic concepts and methodologies used for a few important parts of an operating system.

Objectives of an operating system

Co-operation: The operating system should enable the users to share the resources in such a manner that all the jobs can get the resources they need.

Operating system components: The important components of an operating system are:
• Process management
• Main memory management
• Secondary storage management
• I/O system management
• File management
• Protection system

Process management: The programs executed by the computer system include user-defined programs as well as system-defined programs. A program is executed in a CPU. Besides the CPU, a process requires memory and other resources. The operating system must take proper decisions regarding the allocation of the CPU; the process management unit of the operating system is responsible for taking such decisions. Sometimes two processes may want to exchange data with one another, or one process must wait till another process reaches a particular stage of execution. These communications and synchronizations are carried out by the process management components.

Policies:

The operating system must ensure the application of a proper policy through which the different processes are allocated resources. The operating system must decide who should get a particular resource, when, and for how long. For a resource like memory, the operating system must specify how much main memory area should be allocated to a program at a time.

Error handling: The operating system must be able to react to error conditions. It must also include system diagnostic programs to check the health of the system at power-on.

Accounting: The operating system must keep track of the utilization of the different resources, such as how much disk space or how much CPU time a particular program uses.

The following activities are associated with process management:
• The creation and deletion of both user and system processes
• Suspension and resumption of processes
• Process communication and synchronization
• Deadlock handling

Main memory management: When a program is saved it is usually kept on disk, but for execution it must be brought into memory. Primary memory does not offer sufficient space to hold all the programs, hence it must be decided which set of programs should be brought into primary memory for execution. It must also be decided which part of memory should be allocated to a particular program. If memory is to be allocated for a new process, some programs residing in memory may need to be swapped out to secondary memory. These decisions are taken and executed by the main memory management unit.

UNIX is a layered operating system. The innermost layer is the hardware, which provides the services for the O.S. The core of the O.S. in UNIX, called the kernel, interacts directly with the hardware and provides services to the user programs. User programs interact with the kernel through a set of standard system calls. These system calls request services to be provided by the kernel. Such services include accessing a file (open, close, read, write, link), changing ownership of a file or a directory, changing to a new directory, creating, suspending, or killing a process, and setting limits on system resources. UNIX is a multi-user, multi-tasking, multi-terminal O.S.
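To make those system calls concrete, here is a minimal C sketch that copies one file to another using only the open, read, write and close calls mentioned above. The file names are hypothetical examples, and error handling is kept to the minimum:

    /* Sketch: copying a file with raw UNIX system calls. */
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        char buf[4096];
        ssize_t n;
        int in  = open("input.txt", O_RDONLY);
        int out = open("copy.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (in < 0 || out < 0) {
            perror("open");
            return 1;
        }
        while ((n = read(in, buf, sizeof buf)) > 0)   /* each call enters the kernel */
            if (write(out, buf, n) != n) {
                perror("write");
                return 1;
            }
        close(in);
        close(out);
        return 0;
    }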

UNIX caught on among programmers because it was designed with these features:
• a programmer's environment
• a simple user interface
• simple utilities that can be combined to perform powerful functions
• a hierarchical file system
• a simple interface to devices, consistent with the file format
• a multi-user, multi-process system
• architecture independent and transparent to the user

Chapter 2
About Operating system and Kernels

O.S vs Kernels
Batch processing
Multi-user, Multi-processing O.S
Layered Processing
Distributed system
Real Time O.S

OVERVIEW

What is an operating system and what does it do? The operating system has two basic functions. First, it communicates with the PC's hardware; it gives applications access to things like the hard drive, it receives input from the keyboard, and it outputs data to the monitor. The second function is to provide a user interface and interpret commands so a user can communicate with the hardware. An operating system is also referred to as a command interpreter.

Several types of operating systems perform these basic functions. For the purpose of this book, we cover only three: MS-DOS, Windows 3.x, and Windows 95/98. We cover some basics for each operating system and talk about the architecture of each.

OPERATING SYSTEM ROLE
• Must communicate with the PC's hardware
• Works with the BIOS to provide access to devices such as hard drives
• Communicates with device drivers
• Provides a user interface

To communicate with the PC's hardware, the operating system starts running immediately after the PC has finished its Power-On Self Test (POST), when it takes control of the PC. The operating system works with the BIOS to provide access to devices such as hard drives, floppy drives, keyboards, mice, and basic video. If a user wants to access a file on a diskette in a floppy drive, the operating system passes the request to the appropriate set of BIOS instructions that tells the floppy controller to send the requested information to RAM.

If the BIOS is unable to talk with the hardware, the operating system talks to the hardware directly. For this capability, it needs some additional programming. Device drivers provide the code necessary for the operating system to communicate with specific hardware. Device drivers are written specifically for a particular operating system, usually by the hardware manufacturer.

In addition to communicating with hardware, the operating system provides for some type of error handling and error notification. If a piece of hardware is not functioning properly, the operating system tries to fix the problem or tries to communicate with the device a few more times. If it is still unable to communicate with the device, it provides an error screen notifying the user of the problem.

Providing a user interface is the second function of an operating system. Basically, the operating system organizes applications so users can easily access them, use them, and store application data. When an application is opened, the operating system lets the application provide the majority of the user interface. That is, the operating system disappears or moves away from the center of the screen. The operating system still has the responsibility of providing access to the hardware for whatever the application needs. If the program cannot function properly, the operating system again takes control, stops the application, and provides some sort of error message.

Kernel Definition

The kernel is a program that constitutes the central core of a computer operating system. It has complete control over everything that occurs in the system. A kernel can be contrasted with a shell (such as bash, csh or ksh in Unix-like operating systems), which is the outermost part of an operating system and a program that interacts with user commands. The kernel itself does not interact directly with the user, but rather interacts with the shell and other programs as well as with the hardware devices on the system, including the processor (also called the central processing unit or CPU), memory and disk drives.

The kernel is the first part of the operating system to load into memory during booting (i.e., system startup), and it remains there for the entire duration of the computer session because its services are required continuously. Thus it is important for it to be as small as possible while still providing all the essential services needed by the other parts of the operating system and by the various application programs.

Because of its critical nature, the kernel code is usually loaded into a protected area of memory, which prevents it from being overwritten by other, less frequently used parts of the operating system or by application programs. The kernel performs its tasks, such as executing processes and handling interrupts, in kernel space, whereas everything a user normally does, such as writing text in a text editor or running programs in a GUI (graphical user interface), is done in user space. This separation is made in order to prevent user data and kernel data from interfering with each other and thereby diminishing performance or causing the system to become unstable (and possibly crashing).

When a computer crashes, it actually means the kernel has crashed. If only a single program has crashed but the rest of the system remains in operation, then the kernel itself has not crashed. A crash is the situation in which a program, either a user application or a part of the operating system, stops performing its expected function(s) and responding to other parts of the system. The program might appear to the user to freeze. If such a program is critical to the operation of the kernel, the entire computer could stall or shut down.

The kernel provides basic services for all other parts of the operating system, typically including memory management, process management, file management and I/O (input/output) management (i.e., accessing the peripheral devices). These services are requested by other parts of the operating system or by application programs through a specified set of program interfaces referred to as system calls.

Process management, possibly the most obvious aspect of a kernel to the user, is the part of the kernel that ensures that each process obtains its turn to run on the processor and that the individual processes do not interfere with each other by writing to their areas of memory. A process, also referred to as a task, can be defined as an executing (i.e., running) instance of a program.

The contents of a kernel vary considerably according to the operating system, but they typically include:
(1) a scheduler, which determines how the various processes share the kernel's processing time (including in what order),
(2) a supervisor, which grants use of the computer to each process when it is scheduled,
(3) an interrupt handler, which handles all requests from the various hardware devices (such as disk drives and the keyboard) that compete for the kernel's services, and
(4) a memory manager, which allocates the system's address spaces (i.e., locations in memory) among all users of the kernel's services.

The kernel should not be confused with the BIOS (Basic Input/Output System). The BIOS is an independent program stored in a chip on the motherboard (the main circuit board of a computer) that is used during the booting process for such tasks as initializing the hardware and loading the kernel into memory. Whereas the BIOS always remains in the computer and is specific to its particular hardware, the kernel can be easily replaced or upgraded by changing or upgrading the operating system or, in the case of Linux, by adding a newer kernel or modifying an existing kernel.

Most kernels have been developed for a specific operating system, and there is usually only one version available for each operating system. For example, the Microsoft Windows 2000 kernel is the only kernel for Microsoft Windows 2000 and the Microsoft Windows 98 kernel is the only kernel for Microsoft Windows 98. Linux is far more flexible in that there are numerous versions of the Linux kernel, and each of these can be modified in innumerable ways by an informed user. A few kernels have been designed with the goal of being suitable for use with any operating system. The best known of these is the Mach kernel, which was developed at Carnegie-Mellon University and is used in the Macintosh OS X operating system.

It is not necessary for a computer to have a kernel in order for it to be usable, the reason being that it is not necessary for it to have an operating system. That is, it is possible to load and run programs directly on bare metal machines (i.e., computers without any operating system installed), although this is usually not very practical. In fact, the first generations of computers used bare metal operation. However, it was eventually realized that convenience and efficiency could be increased by retaining small utility programs, such as program loaders and debuggers, in memory between applications. These programs gradually evolved into operating system kernels.

The term kernel is frequently used in books and discussions about Linux, whereas it is used less often when discussing some other operating systems, such as the Microsoft Windows systems. The reasons are that the kernel is highly configurable in the case of Linux and users are encouraged to learn about and modify it and to download and install updated versions. With the Microsoft Windows operating systems, in contrast, there is relatively little point in discussing kernels because they cannot be modified or replaced.

Categories of Kernels

Kernels can be classified into four broad categories: monolithic kernels, microkernels, hybrid kernels and exokernels. Each has its own advocates and detractors.

Monolithic kernels, which have traditionally been used by Unix-like operating systems, contain all the operating system core functions and the device drivers (small programs that allow the operating system to interact with hardware devices, such as disk drives, video cards and printers). Modern monolithic kernels, such as those of Linux and FreeBSD, both of which fall into the category of Unix-like operating systems, feature the ability to load modules at runtime, thereby allowing easy extension of the kernel's capabilities as required, while helping to minimize the amount of code running in kernel space.

A microkernel usually provides only minimal services, such as defining memory address spaces, interprocess communication (IPC) and process management. All other functions, such as hardware management, are implemented as processes running independently of the kernel. Examples of microkernel operating systems are AIX, BeOS, Hurd, Mach, Mac OS X, MINIX and QNX.

Hybrid kernels are similar to microkernels, except that they include additional code in kernel space so that such code can run more swiftly than it would were it in user space. These kernels represent a compromise that was implemented by some developers before it was demonstrated that pure microkernels can provide high performance. Hybrid kernels should not be confused with monolithic kernels that can load modules after booting (such as Linux). Most modern operating systems use hybrid kernels, including Microsoft Windows NT, 2000 and XP. DragonFly BSD, a recent fork (i.e., variant) of FreeBSD, is the first non-Mach-based BSD operating system to employ a hybrid kernel architecture.

Exokernels are a still-experimental approach to operating system design. They differ from the other types of kernels in that their functionality is limited to the protection and multiplexing of the raw hardware, and they provide no hardware abstractions on top of which applications can be constructed. This separation of hardware protection from hardware management enables application developers to determine how to make the most efficient use of the available hardware for each specific program. Exokernels in themselves are extremely small. However, they are accompanied by library operating systems, which provide application developers with the conventional functionalities of a complete operating system. A major advantage of exokernel-based systems is that they can incorporate multiple library operating systems, each exporting a different API (application programming interface), such as one for Linux and one for Microsoft Windows, thus making it possible to simultaneously run both Linux and Windows applications.

The Monolithic Versus Micro Controversy

In the early 1990s, many computer scientists considered monolithic kernels to be obsolete, and they predicted that microkernels would revolutionize operating system design. In fact, the development of Linux as a monolithic kernel rather than a microkernel led to a famous flame war (i.e., a war of words on the Internet) between Andrew Tanenbaum, the developer of the MINIX operating system, and Linus Torvalds, who originally developed Linux based largely on MINIX.

Proponents of microkernels point out that monolithic kernels have the disadvantage that an error in the kernel can cause the entire system to crash. However, with a microkernel, if a kernel process crashes, it is still possible to prevent a crash of the system as a whole by merely restarting the service that caused the error. Although this sounds sensible, it is questionable how important it is in reality, because operating systems with monolithic kernels such as Linux have become extremely stable and can run for years without crashing.

Another disadvantage cited for monolithic kernels is that they are not portable; that is, they must be rewritten for each new architecture (i.e., processor type) that the operating system is to be used on. However, in practice, this has not appeared to be a major disadvantage, and it has not prevented Linux from being ported to numerous processors.

Monolithic kernels also appear to have the disadvantage that their source code can become extremely large. Source code is the version of software as it is originally written (i.e., typed into a computer) by a human in plain text (i.e., human-readable alphanumeric characters) and before it is converted by a compiler into object code that a computer's processor can directly read and execute. For example, the source code for the Linux kernel version 2.4.0 is approximately 100MB and contains nearly 3.38 million lines, and that for version 2.6.0 is 212MB and contains 5.93 million lines. This adds to the complexity of maintaining the kernel, and it also makes it difficult for new generations of computer science students to study and comprehend the kernel. However, the advocates of monolithic kernels claim that in spite of their size such kernels are easier to design correctly, and thus they can be improved more quickly than can microkernel-based systems.

Moreover, the size of the compiled kernel is only a tiny fraction of that of the source code, for example roughly 1.1MB in the case of Linux version 2.4 on a typical Red Hat Linux 9 desktop installation. Contributing to the small size of the compiled Linux kernel is its ability to dynamically load modules at runtime, so that the basic kernel contains only those components that are necessary for the system to start itself and to load modules.

The monolithic Linux kernel can be made extremely small not only because of its ability to dynamically load modules but also because of its ease of customization. In fact, there are some versions that are small enough to fit together with a large number of utilities and other programs on a single floppy disk and still provide a fully functional operating system (one of the most popular of which is muLinux). This ability to miniaturize its kernel has also led to a rapid growth in the use of Linux in embedded systems (i.e., computer circuitry built into other products).

Although microkernels are very small by themselves, in combination with all their required auxiliary code they are, in fact, often larger than monolithic kernels. Advocates of monolithic kernels also point out that the two-tiered structure of microkernel systems, in which most of the operating system does not interact directly with the hardware, creates a not-insignificant cost in terms of system efficiency.

Kernel basic responsibilities

The kernel's primary purpose is to manage the computer's resources and allow other programs to run and use these resources. Typically, the resources consist of:

• The CPU (frequently called the processor). This is the most central part of a computer system, responsible for running or executing programs on it. The kernel takes responsibility for deciding at any time which of the many running programs should be allocated to the processor or processors (each of which can usually run only one program at once).

• The computer's memory. Memory is used to store both program instructions and data. Typically, both need to be present in memory in order for a program to execute. Often multiple programs will want access to memory, frequently demanding more memory than the computer has available. The kernel is responsible for deciding which memory each process can use, and determining what to do when not enough is available.

• Any Input/Output (I/O) devices present in the computer, such as disk drives, printers, displays, etc. The kernel allocates requests from applications to perform I/O to an appropriate device (or subsection of a device, in the case of files on a disk or windows on a display) and provides convenient methods for using the device (typically abstracted to the point where the application does not need to know implementation details of the device).

Kernels also usually provide methods for synchronization and communication between processes (called inter-process communication or IPC). A kernel may implement these features itself, or rely on some of the processes it runs to provide the facilities to other processes, although in this case it must provide some means of IPC to allow processes to access the facilities provided by each other. Finally, a kernel must provide running programs with a method to make requests to access these facilities.

Process Management

The main task of a kernel is to allow the execution of applications and support them with features such as hardware abstractions.

Memory management

The kernel has full access to the system's memory and must allow processes to access this memory safely as they require it.

Device Management

To perform useful functions, processes need access to the peripherals connected to the computer, which are controlled by the kernel through device drivers.

Kernel design decisions

Fault tolerance

An important consideration in the design of a kernel is fault tolerance; specifically, in cases where multiple programs are running on a single computer, it is usually important to prevent a fault in one of the programs from negatively affecting the others. Extended to malicious design rather than a fault, this also applies to security, and is necessary to prevent processes from accessing information without being granted permission.

The two main approaches to the protection of sensitive information are assigning privileges to hierarchical protection domains, for example by using a processor's supervisor mode, or distributing privileges differently for each process and resource, for example by using capabilities or access control lists.

Hierarchical protection domains are much less flexible, as it is not possible to assign different privileges to processes that are at the same privileged level, and they therefore cannot satisfy Denning's four principles for fault tolerance (particularly the Principle of least privilege). Hierarchical protection domains also have a major performance drawback, since interaction between different levels of protection, when a process has to manipulate a data structure both in 'user mode' and 'supervisor mode', always requires message copying (transmission by value).

A kernel based on capabilities, however, is more flexible in assigning privileges, can satisfy Denning's fault tolerance principles, and typically doesn't suffer from the performance issues of copy by value. Both approaches typically require some hardware or firmware support to be operable and efficient. The hardware support for hierarchical protection domains is typically that of "CPU modes." An efficient and simple way to provide hardware support for capabilities is to delegate to the MMU the responsibility of checking access rights for every memory access, a mechanism called capability-based addressing. Most commercial computer architectures lack MMU support for capabilities.

An alternative approach is to simulate capabilities using commonly supported hierarchical domains; in this approach, each protected object must reside in an address space that the application does not have access to; the kernel also maintains a list of capabilities in such memory. When an application needs to access an object protected by a capability, it performs a system call and the kernel performs the access for it. The performance cost of address space switching limits the practicality of this approach in systems with complex interactions between objects, but it is used in current operating systems for objects that are not accessed frequently or which are not expected to perform quickly.

Approaches where the protection mechanism is not firmware supported but is instead simulated at higher levels (e.g. simulating capabilities by manipulating page tables on hardware that does not have direct support) are possible, but there are performance implications. Lack of hardware support may not be an issue, however, for systems that choose to use language-based protection.

Security

An important kernel design decision is the choice of the abstraction levels where the security mechanisms and policies should be implemented. One approach is to use firmware and kernel support for fault tolerance (see above), and build the security policy for malicious behavior on top of that (adding features such as cryptography mechanisms where necessary), delegating some responsibility to the compiler. Approaches that delegate enforcement of security policy to the compiler and/or the application level are often called language-based security.

Hardware-based protection or language-based protection

Typical computer systems today use hardware-enforced rules about what programs are allowed to access what data. The processor monitors the execution and stops a program that violates a rule (e.g., a user process that is about to read or write to kernel memory). In systems that lack support for capabilities, processes are isolated from each other by using separate address spaces. Calls from user processes into the kernel are regulated by requiring them to use one of the above-described system call methods.

An alternative approach is to use language-based protection. In a language-based protection system, the kernel will only allow code to execute that has been produced by a trusted language compiler. The language may then be designed such that it is impossible for the programmer to instruct it to do something that will violate a security requirement.

Advantages of this approach include:

• Lack of need for separate address spaces. Switching between address spaces is a slow operation that causes a great deal of overhead, and a lot of optimisation work is currently performed in order to prevent unnecessary switches in current operating systems. Switching is completely unnecessary in a language-based protection system, as all code can safely operate in the same address space.

• Flexibility. Any protection scheme that can be designed to be expressed via a programming language can be implemented using this method. Changes to the protection scheme (e.g. from a hierarchical system to a capability-based one) do not require new hardware.

Disadvantages include:

• Longer application start-up time. Applications must be verified when they are started to ensure they have been compiled by the correct compiler, or may need recompiling either from source code or from byte code.

• Inflexible type systems. On traditional systems, applications frequently perform operations that are not type safe. Such operations cannot be permitted in a language-based protection system, which means that applications may need to be rewritten and may, in some cases, lose performance.

Examples of systems with language-based protection include JX and Microsoft's Singularity.

Process cooperation

Edsger Dijkstra proved that, from a logical point of view, atomic lock and unlock operations operating on binary semaphores are sufficient primitives to express any functionality of process cooperation. However, this approach is generally held to be lacking in terms of safety and efficiency, whereas a message passing approach is more flexible.
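The pairing Dijkstra described can be sketched with POSIX semaphores. In this minimal C example (Linux, compiled with -pthread), a binary semaphore serves as the atomic lock/unlock primitive protecting a shared counter; it is illustrative only, not taken from any particular system:

    /* Sketch: a binary semaphore as Dijkstra's lock/unlock primitive. */
    #include <semaphore.h>
    #include <pthread.h>
    #include <stdio.h>

    static sem_t lock;          /* initialized to 1 => binary semaphore */
    static long counter = 0;    /* shared data needing mutual exclusion */

    static void *worker(void *arg)
    {
        for (int i = 0; i < 100000; i++) {
            sem_wait(&lock);    /* P / lock: enter the critical section */
            counter++;
            sem_post(&lock);    /* V / unlock: leave it again */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        sem_init(&lock, 0, 1);
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);  /* always 200000 with the lock */
        return 0;
    }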

I/O devices management

The idea of a kernel where I/O devices are handled uniformly with other processes, as parallel co-operating processes, was first proposed and implemented by Brinch Hansen (although similar ideas were suggested in 1967). In Hansen's description of this, the "common" processes are called internal processes, while the I/O devices are called external processes.

Kernel-wide design approaches

Naturally, the above listed tasks and features can be provided in many ways that differ from each other in design and implementation.

While monolithic kernels execute all of their code in the same address space (kernel space) to increase the performance of the system, microkernels try to run most of their services in user space, aiming to improve maintainability and modularity of the code base. Most kernels do not fit exactly into one of these categories, but are rather found in between these two designs. These are called hybrid kernels. More exotic designs such as nanokernels and exokernels are available, but are seldom used for production systems. The Xen hypervisor, for example, is an exokernel.

The principle of separation of mechanism and policy is the substantial difference between the philosophy of micro and monolithic kernels. Here a mechanism is the support that allows the implementation of many different policies, while a policy is a particular "mode of operation". In a minimal microkernel just some very basic policies are included, and its mechanisms allow what is running on top of the kernel (the remaining part of the operating system and the other applications) to decide which policies to adopt (such as memory management, high-level process scheduling, file system management, etc.). A monolithic kernel instead tends to include many policies, therefore restricting the rest of the system to rely on them.

Monolithic kernels

(Figure: diagram of a monolithic kernel)

In a monolithic kernel, all OS services run along with the main kernel thread, thus also residing in the same memory area. This approach provides rich and powerful hardware access. Some developers maintain that monolithic systems are easier to design and implement than other solutions, and are extremely efficient if well-written. The main disadvantages of monolithic kernels are the dependencies between system components (a bug in a device driver might crash the entire system) and the fact that large kernels can become very difficult to maintain.

Microkernels

In the microkernel approach, the kernel itself only provides basic functionality that allows the execution of servers: separate programs that assume former kernel functions, such as device drivers, GUI servers, etc.

The microkernel approach consists of defining a simple abstraction over the hardware, with a set of primitives or system calls to implement minimal OS services such as memory management, multitasking, and inter-process communication. Other services, including those normally provided by the kernel such as networking, are implemented in user-space programs, referred to as servers. Microkernels are easier to maintain than monolithic kernels, but the large number of system calls and context switches might slow down the system because they typically generate more overhead than plain function calls.

Microkernels generally underperform traditional designs, sometimes dramatically. This is due in large part to the overhead of moving in and out of the kernel (a context switch) to move data between the various applications and servers. By the mid-1990s, most researchers had abandoned the belief that careful tuning could reduce this overhead dramatically, but recently, newer microkernels optimized for performance, such as L4 and K42, have addressed these problems.

A microkernel allows the implementation of the remaining part of the operating system as a normal application program written in a high-level language, and the use of different operating systems on top of the same unchanged kernel. It is also possible to dynamically switch among operating systems and to have more than one active simultaneously.

Monolithic kernels vs microkernels

As the computer kernel grows, a number of problems become evident. One of the most obvious is that the memory footprint increases. This is mitigated to some degree by perfecting the virtual memory system, but not all computer architectures have virtual memory support. To reduce the kernel's footprint, extensive editing has to be performed to carefully remove unneeded code, which can be very difficult with non-obvious interdependencies between parts of a kernel with millions of lines of code.

Due to the problems that monolithic kernels pose, they were considered obsolete by the early 1990s. As a result, the design of Linux using a monolithic kernel rather than a microkernel was the topic of a famous flame war between Linus Torvalds and Andrew Tanenbaum. There is merit on both sides of the argument presented in the Tanenbaum/Torvalds debate. Some, including early UNIX developer Ken Thompson, argued that while microkernel designs were more aesthetically appealing, monolithic kernels were easier to implement.

However, a bug in a monolithic system usually crashes the entire system, while this doesn't happen in a microkernel with servers running apart from the main thread. Monolithic kernel proponents reason that incorrect code doesn't belong in a kernel, and that microkernels offer little advantage over correct code. Microkernels are often used in embedded robotic or medical computers where crash tolerance is important and most of the OS components reside in their own private, protected memory space. This is impossible with monolithic kernels, even with modern module-loading ones. However, the monolithic model tends to be more efficient through the use of shared kernel memory, rather than the slower IPC system of microkernel designs, which is typically based on message passing.

Hybrid kernels

The hybrid kernel approach tries to combine the speed and simpler design of a monolithic kernel with the modularity and execution safety of a microkernel. Hybrid kernels are essentially a compromise between the monolithic kernel approach and the microkernel system. This implies running some services (such as the network stack or the filesystem) in kernel space to reduce the performance overhead of a traditional microkernel, but still running kernel code (such as device drivers) as servers in user space.

Nanokernels

A nanokernel delegates virtually all services, including even the most basic ones like interrupt controllers or the timer, to device drivers to make the kernel memory requirement even smaller than that of a traditional microkernel.

Exokernels

An exokernel is a type of kernel that does not abstract hardware into theoretical models. Instead it allocates physical hardware resources, such as processor time, memory pages, and disk blocks, to different programs. A program running on an exokernel can link to a library operating system that uses the exokernel to simulate the abstractions of a well-known OS, or it can develop application-specific abstractions for better performance.

Batch processing

Batch processing is the execution of a series of programs ("jobs") on a computer without human interaction, when possible. Batch jobs are set up so they can be run to completion without human interaction, so all input data is preselected through scripts or command-line parameters. This is in contrast to interactive programs which prompt the user for such input. Batch processing has these benefits:

• It allows sharing of computer resources among many users
• It shifts the time of job processing to when the computing resources are less busy
• It avoids idling the computing resources, without requiring minute-by-minute human interaction and supervision
• It is used on expensive classes of computers to help amortize the cost by keeping high rates of utilization of those expensive resources

Batch processing has historically been synonymous with mainframe computers. Since this class of computer was so expensive, batch processing was used for the reasons listed above. Also, in the early days of electronic computing, interactive sessions (with computer terminal interfaces and, later, graphical user interfaces) were not yet widespread.

Batch processing has grown beyond its mainframe origins, and is now frequently used in UNIX environments, where the cron and at facilities allow for scheduling of complex job scripts. Similarly, Microsoft DOS and Windows systems refer to their command-scripting language as batch files, and Windows has a job scheduler.

A popular computerized batch processing procedure is printing. This normally involves the operator selecting the documents they need printed and indicating to the batch printing software when and where they should be output. Batch processing is also used for efficient bulk database updates and automated transaction processing, as contrasted with interactive online transaction processing (OLTP) applications.
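As a small illustration, a single crontab entry is enough to schedule such an unattended batch job on a UNIX system; the backup script named here is a hypothetical example:

    # minute hour day-of-month month day-of-week  command
    # Run a nightly batch job at 02:00, with no human interaction.
    0 2 * * * /usr/local/bin/nightly-backup.sh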

Multiprocessing

Multiprocessing is a generic term for the use of two or more central processing units (CPUs) within a single computer system. It also refers to the ability of a system to support more than one processor and/or the ability to allocate tasks between them. There are many variations on this basic theme, and the definition of multiprocessing can vary with context, mostly as a function of how CPUs are defined (multiple cores on one die, multiple chips in one package, multiple packages in one system unit, etc.).

Multiprocessing sometimes refers to the execution of multiple concurrent software processes in a system as opposed to a single process at any one instant. However, the term multiprogramming is more appropriate to describe this concept, which is implemented mostly in software, whereas multiprocessing is more appropriate to describe the use of multiple hardware CPUs. A system can be both multiprocessing and multiprogramming, only one of the two, or neither of the two.

Multi-user

Multi-user is a term that defines an operating system or application software that allows concurrent access by multiple users of a computer. Time-sharing systems are multi-user systems. Most batch processing systems for mainframe computers may also be considered "multi-user", to avoid leaving the CPU idle while it waits for I/O operations to complete. However, the term "multitasking" is more common in this context.

An example is a UNIX server where multiple remote users have access (via Telnet or SSH) to the UNIX shell prompt at the same time. Another example uses multiple X sessions spread across multiple monitors powered by a single machine. Management systems are implicitly designed to be used by multiple users, typically one or more system administrators and an end-user community.

The opposite term, single-user, is most commonly used when talking about an operating system being usable only by one person at a time, or in reference to a single-user software license agreement.

Distributed computing

Distributed computing is a method of computer processing in which different parts of a program run simultaneously on two or more computers that are communicating with each other over a network. Distributed computing is a type of parallel computing, but the latter term is most commonly used to refer to processing in which different parts of a program run simultaneously on two or more processors that are part of the same computer. While both types of processing require that a program be parallelized (divided into sections that can run simultaneously), distributed computing also requires that the division of the program take into account the different environments on which the different sections of the program will be running. For example, two computers are likely to have different file systems and different hardware components.

An example of distributed computing is BOINC, a framework in which large problems can be divided into many small problems which are distributed to many computers. Later, the small results are reassembled into a larger solution.

Distributed computing is a natural result of the use of networks to allow computers to efficiently communicate. But distributed computing is distinct from networking. The latter refers to two or more computers interacting with each other, but not, typically, sharing the processing of a single program. The World Wide Web is an example of a network, but not an example of distributed computing. There are numerous technologies and standards used to construct distributed computations, including some which are specially designed and optimized for that purpose, such as Remote Procedure Calls.

Real-time operating system

A real-time operating system (RTOS) is a multitasking operating system intended for real-time applications. Such applications include embedded systems (programmable thermostats, household appliance controllers, and mobile telephones), industrial robots, spacecraft, industrial control and scientific research equipment.

An RTOS facilitates the creation of a real-time system, but does not guarantee that the final result will be real-time; this requires correct development of the software. An RTOS does not necessarily have high throughput; rather, an RTOS provides facilities which, if used properly, guarantee that deadlines can be met generally (soft real-time) or deterministically (hard real-time). An RTOS will typically use specialized scheduling algorithms in order to provide the real-time developer with the tools necessary to produce deterministic behavior in the final system. An RTOS is valued more for how quickly and/or predictably it can respond to a particular event than for the given amount of work it can perform over time. Key factors in an RTOS are therefore a minimal interrupt latency and a minimal thread switching latency.

An early example of a large-scale real-time operating system was the so-called "control program" developed by American Airlines and IBM for the Sabre Airline Reservations System.

Chapter 3
Operating System components

Intro to Process management sub system
Intro to Memory management sub system
Intro to File sub system
Intro to Device drivers
Signals and System calls

Process management

The main task of a kernel is to allow the execution of applications and support them with features such as hardware abstractions. To run an application, a kernel typically sets up an address space for the application, loads the file containing the application's code into memory (perhaps via demand paging), sets up a stack for the program and branches to a given location inside the program, thus starting its execution.

Multi-tasking kernels are able to give the user the illusion that the number of processes being run simultaneously on the computer is higher than the maximum number of processes the computer is physically able to run simultaneously. Typically, the number of processes a system may run simultaneously is equal to the number of CPUs installed (however this may not be the case if the processors support simultaneous multithreading).

In a pre-emptive multitasking system, the kernel will give every program a slice of time and switch from process to process so quickly that it will appear to the user as if these processes were being executed simultaneously. The kernel uses scheduling algorithms to determine which process is running next and how much time it will be given. The algorithm chosen may allow for some processes to have higher priority than others. The kernel generally also provides these processes a way to communicate; this is known as inter-process communication (IPC) and the main approaches are shared memory, message passing and remote procedure calls (see concurrent computing).

Other systems (particularly on smaller, less powerful computers) may provide co-operative multitasking, where each process is allowed to run uninterrupted until it makes a special request that tells the kernel it may switch to another process. Such requests are known as "yielding", and typically occur in response to requests for inter-process communication, or for waiting for an event to occur. Older versions of Windows and Mac OS both used co-operative multitasking but switched to pre-emptive schemes as the power of the computers to which they were targeted grew.
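On UNIX, this setup sequence is triggered from user space with the fork and exec system calls. The following minimal C sketch shows the pattern; /bin/ls is used purely as an example program:

    /* Sketch: asking the kernel to create a process and load a new
       program into it (fork + exec), then waiting for it to finish. */
    #include <sys/wait.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        pid_t pid = fork();                /* kernel duplicates this process */
        if (pid == 0) {
            /* Child: replace its address space with a new program. */
            execl("/bin/ls", "ls", "-l", (char *)NULL);
            perror("execl");               /* reached only if exec failed */
            return 1;
        }
        waitpid(pid, NULL, 0);             /* parent waits for the child */
        return 0;
    }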

The operating system might also support multiprocessing (SMP or Non-Uniform Memory Access); in that case, different programs and threads may run on different processors. A kernel for such a system must be designed to be re-entrant, meaning that it may safely run two different parts of its code simultaneously. This typically means providing synchronization mechanisms (such as spin locks) to ensure that no two processors attempt to modify the same data at the same time.
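A rough user-space analogue of such a spin lock is available through POSIX threads. This C sketch (Linux, compiled with -pthread) is illustrative only and is not kernel code:

    /* Sketch: two threads update shared data under a spin lock,
       so no two processors modify it at the same time. */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_spinlock_t lock;
    static long shared = 0;

    static void *cpu_work(void *arg)
    {
        for (int i = 0; i < 1000000; i++) {
            pthread_spin_lock(&lock);    /* busy-wait until the lock is free */
            shared++;
            pthread_spin_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
        pthread_create(&a, NULL, cpu_work, NULL);
        pthread_create(&b, NULL, cpu_work, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("shared = %ld\n", shared);  /* 2000000 with the lock held */
        return 0;
    }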

Memory management

The kernel has full access to the system's memory and must allow processes to access this memory safely as they require it. Often the first step in doing this is virtual addressing, usually achieved by paging and/or segmentation. Virtual addressing allows the kernel to make a given physical address appear to be another address, the virtual address. Virtual address spaces may be different for different processes; the memory that one process accesses at a particular (virtual) address may be different memory from what another process accesses at the same address. This allows every program to behave as if it is the only one (apart from the kernel) running and thus prevents applications from crashing each other.

On many systems, a program's virtual address may refer to data which is not currently in memory. The layer of indirection provided by virtual addressing allows the operating system to use other data stores, like a hard drive, to store what would otherwise have to remain in main memory (RAM). As a result, operating systems can allow programs to use more memory than the system has physically available. When a program needs data which is not currently in RAM, the CPU signals to the kernel that this has happened, and the kernel responds by writing the contents of an inactive memory block to disk (if necessary) and replacing it with the data requested by the program. The program can then be resumed from the point where it was stopped. This scheme is generally known as demand paging.

Virtual addressing also allows creation of virtual partitions of memory in two disjointed areas, one being reserved for the kernel (kernel space) and the other for the applications (user space). The applications are not permitted by the processor to address kernel memory, thus preventing an application from damaging the running kernel. This fundamental partition of memory space has contributed much to current designs of actual general-purpose kernels and is almost universal in such systems, although some research kernels (e.g. Singularity) take other approaches.
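The effect of demand paging can be seen from user space with the mmap system call: the kernel maps a file into the process's virtual address space and brings pages in only as they are first touched. A minimal C sketch; the file name data.bin is hypothetical:

    /* Sketch: map a file and read it through memory; the kernel
       faults each page in on first access (demand paging). */
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        struct stat st;
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0 || fstat(fd, &st) < 0) {
            perror("data.bin");
            return 1;
        }
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        long sum = 0;
        for (off_t i = 0; i < st.st_size; i++)
            sum += p[i];                 /* first touch of a page faults it in */
        printf("checksum: %ld\n", sum);
        munmap(p, st.st_size);           /* unmap the region again */
        close(fd);
        return 0;
    }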

Device management

To perform useful functions, processes need access to the peripherals connected to the computer, which are controlled by the kernel through device drivers. For example, to show the user something on the screen, an application would make a request to the kernel, which would forward the request to its display driver, which is then responsible for actually plotting the character/pixel.

A kernel must maintain a list of available devices. This list may be known in advance (e.g. on an embedded system where the kernel will be rewritten if the available hardware changes), configured by the user (typical on older PCs and on systems that are not designed for personal use) or detected by the operating system at run time (normally called plug and play). In a plug and play system, a device manager first performs a scan on different hardware buses, such as Peripheral Component Interconnect (PCI) or Universal Serial Bus (USB), to detect installed devices, then searches for the appropriate drivers.

As device management is a very OS-specific topic, these drivers are handled differently by each kind of kernel design, but in every case, the kernel has to provide the I/O to allow drivers to physically access their devices through some port or memory location. Very important decisions have to be made when designing the device management system, as in some designs accesses may involve context switches, making the operation very CPU-intensive and easily causing a significant performance overhead.

File sub system

Computers can store information on several different storage media, such as magnetic disks, magnetic tapes and optical disks. So that the computer system will be convenient to use, the operating system provides a uniform logical view of information storage. Files are mapped by the operating system onto physical devices. These devices are usually non-volatile, so their contents are persistent.
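One consequence of this uniform logical view is that, on UNIX-like systems, even devices appear in the file system and are read with the same calls as ordinary files. A minimal C sketch; the device name /dev/urandom is Linux-style:

    /* Sketch: reading a character device through the ordinary file API. */
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned char bytes[8];
        int fd = open("/dev/urandom", O_RDONLY);   /* a device, not a disk file */
        if (fd < 0 || read(fd, bytes, sizeof bytes) != (ssize_t)sizeof bytes) {
            perror("/dev/urandom");
            return 1;
        }
        for (unsigned i = 0; i < sizeof bytes; i++)
            printf("%02x", bytes[i]);
        putchar('\n');
        close(fd);
        return 0;
    }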

Signals

A signal is a limited form of inter-process communication used in UNIX, Unix-like, and other POSIX-compliant operating systems. Essentially it is an asynchronous notification sent to a process in order to notify it of an event that occurred. When a signal is sent to a process, the operating system interrupts the process's normal flow of execution. Execution can be interrupted during any non-atomic instruction. If the process has previously registered a signal handler, that routine is executed. Otherwise the default signal handler is executed.

Sending signals

Typing certain key combinations at the controlling terminal of a running process causes the system to send it certain signals:
• Ctrl-C (in older Unixes, DEL) sends an INT signal (SIGINT); by default, this causes the process to terminate.
• Ctrl-Z sends a TSTP signal (SIGTSTP); by default, this causes the process to suspend execution.
• Ctrl-\ sends a QUIT signal (SIGQUIT); by default, this causes the process to terminate and dump core.

Handling signals

Signal handlers can be installed with the signal() system call. If a signal handler is not installed for a particular signal, the default handler is used. Otherwise the signal is intercepted and the signal handler is invoked. The process can also specify two default behaviors, without creating a handler: ignore the signal (SIG_IGN) and use the default signal handler (SIG_DFL). There are two signals which cannot be intercepted and handled: SIGKILL and SIGSTOP.

Signal handling is vulnerable to race conditions. Because signals are asynchronous, another signal (even of the same type) can be delivered to the process during execution of the signal handling routine. The sigprocmask() call can be used to block and unblock delivery of signals. Signals can cause the interruption of a system call in progress, leaving it to the application to manage a non-transparent restart.
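A minimal C sketch of installing a handler with signal(), as described above. The handler only sets a flag, which is the conventional safe pattern; everything else here is illustrative:

    /* Sketch: catch SIGINT (Ctrl-C) instead of dying on it. */
    #include <signal.h>
    #include <unistd.h>
    #include <stdio.h>

    static volatile sig_atomic_t got_sigint = 0;

    static void on_sigint(int signo)
    {
        got_sigint = 1;            /* handlers should do as little as possible */
    }

    int main(void)
    {
        signal(SIGINT, on_sigint); /* replace the default terminate action */
        while (!got_sigint)
            pause();               /* sleep until any signal arrives */
        printf("caught SIGINT, exiting cleanly\n");
        return 0;
    }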

System calls

To actually perform useful work, a process must be able to access the services provided by the kernel. This is implemented differently by each kernel, but most provide a C library or an API, which in turn invokes the related kernel functions.

The method of invoking the kernel function varies from kernel to kernel. If memory isolation is in use, it is impossible for a user process to call the kernel directly, because that would be a violation of the processor's access control rules. A few possibilities are:

• Using a software-simulated interrupt. This method is available on most hardware, and is therefore very common.

• Using a call gate. A call gate is a special address which the kernel has added to a list stored in kernel memory and which the processor knows the location of. When the processor detects a call to that location, it instead redirects to the target location without causing an access violation. This requires hardware support, but the hardware for it is quite common.

• Using a special system call instruction. This technique requires special hardware support, which common architectures (notably, x86) may lack. System call instructions have been added to recent models of x86 processors, however, and some (but not all) operating systems for PCs make use of them when available.

• Using a memory-based queue. An application that makes large numbers of requests but does not need to wait for the result of each may add details of requests to an area of memory that the kernel periodically scans to find requests.
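On Linux, the relationship between the C library and these kernel entry methods can be made visible with the generic syscall() function, which issues the trap directly. A minimal, Linux-specific sketch:

    /* Sketch: the same kernel service reached via the libc wrapper
       and via an explicit trap into the kernel. */
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        pid_t a = getpid();              /* C library wrapper */
        long  b = syscall(SYS_getpid);   /* direct system call */
        printf("getpid() = %d, syscall(SYS_getpid) = %ld\n", (int)a, b);
        return 0;
    }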

Chapter 4
Process management sub system
Creating process
Process Id, process descriptor
Process contexts
Process state
Process address space
Process scheduling policies
Threads vs process
Multi threading and Hyper threading

The concept of a process is fundamental to any multiprogramming operating system. A process is usually defined as an instance of a program in execution; thus, if 16 users are running vi at once, there are 16 separate processes (although they can share the same executable code). Processes are often called "tasks". In this chapter, we will first discuss static properties of processes and then describe how process switching is performed by the kernel. The last two sections investigate dynamic properties of processes, namely, how processes can be created and destroyed. This chapter also describes how multithreaded applications are executed.

Process Descriptor
In order to manage processes, the kernel must have a clear picture of what each process is doing. It must know, for instance, the process's priority, whether it is running on the CPU or blocked on some event, what address space has been assigned to it, which files it is allowed to address, and so on. This is the role of the process descriptor, that is, of a task_struct structure whose fields contain all the information related to a single process. As the repository of so much information, the process descriptor is rather complex. Not only does it contain many fields itself, but some contain pointers to other data structures that, in turn, contain pointers to other structures.

The Process/Kernel Model
As already mentioned, a CPU can run either in User Mode or in Kernel Mode. Actually, some CPUs can have more than two execution states. For instance, the Intel 80x86 microprocessors have four different execution states. But all standard Unix kernels make use of only Kernel Mode and User Mode. When a program is executed in User Mode, it cannot directly access the kernel data structures or the kernel programs. When an application executes in Kernel Mode, however, these restrictions no longer apply. Each CPU model provides special instructions to switch from User Mode to Kernel Mode and vice versa. A program executes most of the time in User Mode and switches to Kernel Mode only when requesting a service provided by the kernel. When the kernel has satisfied the program's request, it puts the program back in User Mode. Processes are dynamic entities that usually have a limited life span within the system. The task of creating, eliminating, and synchronizing the existing processes is delegated to a group of routines in the kernel. The kernel itself is not a process but a process manager. The process/kernel model assumes that processes that require a kernel service make use of specific programming constructs called system calls. Each system call sets up the group of parameters that identifies the process request and then executes the hardware-dependent CPU instruction to switch from User Mode to Kernel Mode. Besides user processes, Unix systems include a few privileged processes called kernel threads with the following characteristics:
• They run in Kernel Mode in the kernel address space.
• They do not interact with users, and thus do not require terminal devices.
• They are usually created during system startup and remain alive until the system is shut down.

Notice how the process/kernel model is somewhat orthogonal to the CPU state: on a uniprocessor system, only one process is running at any time and it may run either in User or in Kernel Mode. If it runs in Kernel Mode, the processor is executing some kernel routine. Figure 1-3 illustrates examples of transitions between User and Kernel Mode. Process 1 in User Mode issues a system call, after which the process switches to Kernel Mode and the system call is serviced. Process 1 then resumes execution in User Mode until a timer interrupt occurs and the scheduler is activated in Kernel Mode. A process switch takes place, and Process 2 starts its execution in User Mode until a hardware device raises an interrupt. As a consequence of the interrupt, Process 2 switches to Kernel Mode and services the interrupt.

Unix kernels do much more than handle system calls; in fact, kernel routines can be activated in several ways:
• A process invokes a system call.
• The CPU executing the process signals an exception, which is some unusual condition such as an invalid instruction. The kernel handles the exception on behalf of the process that caused it.
• A peripheral device issues an interrupt signal to the CPU to notify it of an event such as a request for attention, a status change, or the completion of an I/O operation. Each interrupt signal is dealt with by a kernel program called an interrupt handler. Since peripheral devices operate asynchronously with respect to the CPU, interrupts occur at unpredictable times.
• A kernel thread is executed; since it runs in Kernel Mode, the corresponding program must be considered part of the kernel, albeit encapsulated in a process.

Process Implementation
To let the kernel manage processes, each process is represented by a process descriptor that includes information about the current state of the process. When the kernel stops the execution of a process, it saves the current contents of several processor registers in the process descriptor. These include:
• The program counter (PC) and stack pointer (SP) registers
• The general-purpose registers
• The floating point registers
• The processor control registers (Processor Status Word) containing information about the CPU state
• The memory management registers used to keep track of the RAM accessed by the process
When the kernel decides to resume executing a process, it uses the proper process descriptor fields to load the CPU registers. Since the stored value of the program counter points to the instruction following the last instruction executed, the process resumes execution from where it was stopped. When a process is not executing on the CPU, it is waiting for some event. Unix kernels distinguish many wait states, which are usually implemented by queues of process descriptors; each (possibly empty) queue corresponds to the set of processes waiting for a specific event.

Process Address Space
Each process runs in its private address space. A process running in User Mode refers to private stack, data, and code areas. When running in Kernel Mode, the process addresses the kernel data and code areas and makes use of another stack. Since the kernel is reentrant, several kernel control paths—each related to a different process—may be executed in turn. In this case, each kernel control path refers to its own private kernel stack. While it appears to each process that it has access to a private address space, there are times when part of the address space is shared among processes. In some cases this sharing is explicitly requested by processes; in others it is done automatically by the kernel to reduce memory usage. If the same program, say an editor, is needed simultaneously by several users, the program will be loaded into memory only once, and its instructions can be shared by all of the users who need it. Its data, of course, must not be shared, because each user will have separate data. This kind of shared address space is set up automatically by the kernel to save memory. Processes can also share parts of their address space as a kind of interprocess communication, using the "shared memory" technique introduced in System V.

Creating a new task using fork and exec
The DOS and Windows APIs contain the spawn family of functions. These functions take as an argument the name of a program to run and create a new process instance of that program. Linux does not contain a single function that does all this in one step. Instead, Linux provides one function, fork, that makes a child process that is an exact copy of its parent process; an exec function then replaces the child's program image. In general, we cannot know whether the child or the parent starts executing first; this depends upon the scheduling algorithm used by the kernel.
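A minimal sketch of the two-step fork-then-exec pattern follows; /bin/ls is just an example program to run, and the exit codes are conventional choices:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();             /* duplicate the calling process */
        if (pid == -1) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pid == 0) {
            /* Child: replace this copy with a new program image. */
            execl("/bin/ls", "ls", "-l", (char *)NULL);
            perror("execl");            /* reached only if exec failed */
            _exit(127);
        }
        /* Parent: reap the child so it does not linger as a zombie. */
        int status;
        waitpid(pid, &status, 0);
        printf("child %d exited with status %d\n",
               (int)pid, WEXITSTATUS(status));
        return 0;
    }

Together, fork plus exec achieve what spawn does in one call, while also allowing the child to adjust its environment (file descriptors, working directory, and so on) between the two steps.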

The process state
As its name implies, the state field of the process descriptor describes what is currently happening to the process. It consists of an array of flags, each of which describes a possible process state. In the current Linux version these states are mutually exclusive, and hence exactly one flag of state is set; the remaining flags are cleared. The following are the possible process states:
TASK_RUNNING
The process is either executing on the CPU or waiting to be executed.
TASK_INTERRUPTIBLE
The process is suspended (sleeping) until some condition becomes true. Raising a hardware interrupt, releasing a system resource the process is waiting for, or delivering a signal are examples of conditions that might wake up the process, that is, put its state back to TASK_RUNNING.
TASK_UNINTERRUPTIBLE
Like the previous state, except that delivering a signal to the sleeping process leaves its state unchanged. This process state is seldom used. It is valuable, however, under certain specific conditions in which a process must wait until a given event occurs without being interrupted. For instance, this state may be used when a process opens a device file and the corresponding device driver starts probing for a corresponding hardware device. The device driver must not be interrupted until the probing is complete, or the hardware device could be left in an unpredictable state.
TASK_STOPPED
Process execution has been stopped: the process enters this state after receiving a SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal. When a process is being monitored by another (such as when a debugger executes a ptrace() system call to monitor a test program), any signal may put the process in the TASK_STOPPED state.
TASK_ZOMBIE
Process execution is terminated, but the parent process has not yet issued a wait()-like system call (wait(), wait3(), wait4(), or waitpid()) to return information about the dead process. Before the wait()-like call is issued, the kernel cannot discard the data contained in the dead process descriptor because the parent could need it.

Figure: UNIX process state transition diagram. States: Initial (idle), Ready to Run, Kernel Running, User Running, Asleep, Stopped, Stopped + Asleep, and Zombie. Transitions: fork() (initial to ready to run), switch (ready to run to kernel running), return from system call/interrupt (kernel running to user running), system call/interrupt (user running to kernel running), sleep (kernel running to asleep), wakeup (asleep to ready to run), exit (kernel running to zombie), and stop/continue (between the running and asleep states and their stopped counterparts).

The fork(2) system call creates a new process, which begins life in the initial state. When the process is fully created, fork(2) moves it to the ready-to-run state, where it must wait to be scheduled. Eventually the kernel selects it for execution and initiates a context switch. This invokes a kernel routine that loads the hardware context of the process into the system registers and transfers control to the process. From this point the new process behaves like any other process. A process running in user mode enters kernel mode as a result of a system call or an interrupt, and returns to user mode when that completes. While executing a system call, the process may need to wait for an event or for a resource that is currently unavailable; it does so by calling the kernel's sleep() routine, which puts the process into a queue of sleeping processes and changes its state to asleep. When the event occurs or the resource becomes available, the kernel wakes up the process, which now becomes ready to run and waits to be scheduled. When a process is scheduled to run, it initially runs in kernel mode (the kernel running state), where it completes the context switch. Finally, a process exits by calling the exit(2) system call or because of a signal. In either case, the kernel releases all the resources of the process, except for the exit status and the resource usage information, and leaves the process in the zombie state. The process remains in this state until its parent calls the wait(2) system call.

Process Context
Each process has a well-defined context, which has several components:
• User address space: this is usually divided into several components — the program text (executable code), data, user stack, shared memory regions, and so on.
• Control information: the kernel uses two main data structures to maintain control information about the process.
• Credentials: the credentials of the process include the user and group IDs associated with it.
• Environment variables: these are inherited from the parent.
• Hardware context: this includes the contents of the general-purpose registers and of a set of special system registers. The system registers include:
  o The program counter (PC), which holds the address of the next instruction to execute.
  o The stack pointer (SP), which contains the address of the uppermost element of the stack.
  o The processor status word (PSW), which has several status bits containing information about system state, such as current and previous execution modes, current and interrupt priority levels, and overflow and carry bits.
  o Memory management registers, which map the address translation tables of the process.
  o Floating point unit (FPU) registers.

Machine registers contain the hardware context of the currently running process. When a context switch occurs, these registers are saved in the task_struct structure.
Identifying a Process
Although Linux processes can share a large portion of their kernel data structures—an efficiency measure known as lightweight processes—each process has its own process descriptor. Each execution context that can be independently scheduled must have its own process descriptor. Lightweight processes should not be confused with user-mode threads, which are different execution flows handled by a user-level library. For instance, older Linux systems implemented POSIX threads entirely in user space by means of the pthread library; therefore, a multithreaded program was executed as a single Linux process. Currently, the pthread library, which has been merged into the standard C library, takes advantage of lightweight processes.

The very strict one-to-one correspondence between the process and process descriptor makes the 32-bit process descriptor address a convenient tool to identify processes. These addresses are referred to as process descriptor pointers. Most of the references to processes that the kernel makes are through process descriptor pointers. Any Unix-like operating system, on the other hand, allows users to identify processes by means of a number called the Process ID (or PID). The PID is a 32-bit unsigned integer stored in the pid field of the process descriptor. PIDs are numbered sequentially: the PID of a newly created process is normally the PID of the previously created process incremented by one. However, for compatibility with traditional Unix systems developed for 16-bit hardware platforms, the maximum PID number allowed on Linux is 32767. When the kernel creates the 32768th process in the system, it must start recycling the lower unused PIDs. It is possible to derive a process descriptor pointer efficiently from its respective PID. Efficiency is important because many system calls like kill() use the PID to denote the affected process.

Process Switching
In order to control the execution of processes, the kernel must be able to suspend the execution of the process running on the CPU and resume the execution of some other process previously suspended. This activity is called process switching, task switching, or context switching. The following sections describe the elements of process switching in Linux:
• Hardware context
• Hardware support
• Linux code
• Saving the floating point registers

Hardware Context
While each process can have its own address space, all processes have to share the CPU registers. So before resuming the execution of a process, the kernel must ensure that each such register is loaded with the value it had when the process was suspended. The set of data that must be loaded into the registers before the process resumes its execution on the CPU is called the hardware context. The hardware context is a subset of the process execution context, which includes all information needed for the process execution. In Linux, part of the hardware context of a process is stored in the TSS segment, while the remaining part is saved in the Kernel Mode stack. The TSS segment coincides with the tss field of the process descriptor.

Swap processes
If the most deserving process to run is not the current process, then the current process must be suspended and the new one made to run. When a process is running it is using the registers and physical memory of the CPU and of the system. Each time it calls a routine it passes its arguments in registers and may stack saved values such as the address to return to in the calling routine. So, when the scheduler is running it is running in the context of the current process. It will be in a privileged mode, kernel mode, but it is still the current process that is running. When that process comes to be suspended, all of its machine state, including the program counter (PC) and all of the processor's registers, must be saved in the process's task_struct data structure. Then, all of the machine state for the new process must be loaded. This is a system-dependent operation: no two CPUs do this in quite the same way, but there is usually some hardware assistance for the act. This swapping of process context takes place at the end of the scheduler. The saved context for the previous process is, therefore, a snapshot of the hardware context of the system as it was for this process at the end of the scheduler. Equally, when the context of the new process is loaded, it too will be a snapshot of the way things were at the end of the scheduler, including this process's program counter and register contents. If the previous process or the new current process uses virtual memory, then the system's page table entries may need to be updated. Again, this action is architecture-specific. Processors like the Alpha AXP, which use Translation Lookaside Buffers (cached Page Table Entries), must flush those cached table entries that belonged to the previous process.

Process scheduling
Scheduling is a fundamental operating system function; all computer resources are scheduled before use. Since the CPU is one of the primary computer resources, its scheduling is central to operating system design. Scheduling refers to a set of policies and mechanisms supported by the operating system that controls the order in which work is completed. A scheduler is an operating system program that selects the next job to be admitted for execution. The main objective of scheduling is to increase CPU utilization and throughput. In a multiprocessing environment, CPU scheduling is one of the fundamental requirements; this mechanism improves the overall efficiency of the computer system by getting more work done in less time.

Types of scheduler
There are three types of schedulers: long-term, medium-term, and short-term.
Long-term scheduler: this is sometimes also called the job scheduler. It determines which jobs shall be admitted for immediate processing. There are always more processes than the CPU can execute at once; these processes are kept on large storage devices such as disk for later processing. The long-term scheduler selects processes from this pool and loads them into memory, where they join the ready queue.
Figure: long-term and short-term scheduling. Jobs flow through the long-term scheduler into the ready queue; the short-term scheduler dispatches processes from the ready queue to the CPU; processes may move to a suspended queue or leave the system at the end of the program.

Medium-term scheduler: most processes require some I/O operations, and a running process may become suspended while waiting for I/O. It is beneficial to remove such processes from main memory to the hard disk to make room for other processes. At some later time the process can be reloaded into memory and continued from where it left off. Saving a suspended process in this way is called swapping out (or rolling out); the process is swapped in and swapped out by the medium-term scheduler.
Figure: medium-term scheduling. Jobs enter the ready queue via the long-term scheduler; the CPU serves the ready queue; suspended processes move between main memory and the suspended (swapped-out) queue under the control of the medium-term scheduler, and finished processes leave the system at the end.

Short-term scheduler: it allocates processes from the ready queue to the CPU for immediate processing. Its main objective is to maximize CPU utilization. Compared to the other two schedulers it runs far more frequently: it must select a new process for execution quite often, because a CPU typically executes a process for only a few milliseconds before the process goes for an I/O operation. So the short-term scheduler must be very fast.

Scheduling Algorithms
CPU scheduling deals with the problem of deciding which of the processes in the ready queue is to be allocated the CPU. In this section several scheduling algorithms are described. Basically there are two types of scheduling:
• Pre-emptive
• Non-pre-emptive
Non-pre-emptive scheduling: once a process has been given the CPU, the CPU cannot be taken away from that process until it completes. In this type of scheduling, short jobs are made to wait by longer jobs, but the treatment of all processes is fairer.
Pre-emptive scheduling: the CPU can be taken away from the allocated process. It is more useful for high-priority work which requires immediate response.

FCFS
This is the simplest scheduling algorithm: processes are served in the order they arrive. Its implementation uses a FIFO queue. Once a process gets the CPU, it runs until it completes its job and then releases the CPU. FCFS is non-pre-emptive scheduling, which can result in poor performance: there is a low rate of CPU utilization and system throughput, and short jobs may have to wait a long time. Consider two processes:

Process   Execution time
P1        30
P2        6

If these processes arrive in the order P1 then P2, the turnaround times are 30 and 36 units of time respectively, giving an average turnaround time of (30 + 36)/2 = 33 units. The corresponding average waiting time is (0 + 30)/2 = 15 units. However, if the processes arrive in the reverse order, the turnaround times are 6 and 36 units with an average of 21 units, and the average waiting time is (0 + 6)/2 = 3 units.

Shortest job first scheduling
In shortest job first (SJF) scheduling, a job or process is selected on the basis of its having the shortest execution time. If two processes have the same execution time, FCFS is used between them. For example, consider the processes given below:

Process   CPU time
P1        4
P2        10
P3        6
P4        3

According to shortest-job-first scheduling the processes are served in the order P4-P1-P3-P2, so the average waiting time is (0+3+7+13)/4 = 23/4 = 5.75 units of time. Compare this with FCFS scheduling, where the average waiting time is (0+4+14+20)/4 = 38/4 = 9.5 units of time. So SJF scheduling reduces the average waiting time. Shortest job scheduling may be pre-emptive or non-pre-emptive, but in both cases, whenever the SJF scheduler is invoked it searches the ready queue to find the job or process with the shortest execution time. The difference between the two SJF schedulers lies in the conditions that lead to invocation of the scheduler and, consequently, the frequency of its execution.
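The arithmetic above is easy to check mechanically. The following small sketch (our own illustration, not part of any OS API) computes the average waiting time for a given service order; each process waits for the sum of the bursts served before it:

    #include <stdio.h>
    #include <stdlib.h>

    /* Average waiting time: each process waits for the sum of the
     * bursts of all processes served before it. */
    static double avg_wait(const int burst[], int n)
    {
        double total = 0;
        int elapsed = 0;
        for (int i = 0; i < n; i++) {
            total += elapsed;
            elapsed += burst[i];
        }
        return total / n;
    }

    static int ascending(const void *a, const void *b)
    {
        return *(const int *)a - *(const int *)b;
    }

    int main(void)
    {
        int fcfs[] = { 4, 10, 6, 3 };            /* P1..P4 in arrival order */
        int sjf[]  = { 4, 10, 6, 3 };
        qsort(sjf, 4, sizeof sjf[0], ascending); /* SJF order: 3, 4, 6, 10 */
        printf("FCFS average wait: %.2f\n", avg_wait(fcfs, 4)); /* 9.50 */
        printf("SJF average wait:  %.2f\n", avg_wait(sjf, 4));  /* 5.75 */
        return 0;
    }

The output reproduces the 9.5 and 5.75 figures worked out above.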

Round Robin Scheduling
Round robin scheduling is one of the oldest, simplest and most widely used algorithms. It is primarily used in time-sharing and multi-user systems where the primary requirement is good response time. In this scheduling algorithm CPU time is divided into time slices. Each process is allocated a small time slice while it is running. No process can run for more than one time slice when there are others waiting in the ready queue. If a process requires more CPU time, it goes to the end of the ready queue.

Figure: round robin scheduling. Processes X, Y, and Z circulate through the ready queue; each in turn gets the CPU for one time slice and is pre-empted back to the tail of the queue if it has not finished.

In round robin scheduling a small process may be executed in a single time slice, giving good response time, whereas long processes may require several time slices and thus be forced to pass through the ready queue a few times before completion. But in this scheduling the system resources are utilized in an equitable manner. For example, consider three processes waiting in the ready queue:

Process   Execution time
P1        20
P2        5
P3        10

If we use a time slice of 5 units of time, then P1 gets the first 5 units of time and the CPU is then given to the second process, P2. Since P2 needs just 5 units of time, it terminates as its time slice expires. The CPU is then given to process P3. Once each process has received one time slice, the CPU returns to P1 for an additional time slice, and so on. For implementing round robin scheduling we require a dedicated timer. The timer is usually set to interrupt the operating system whenever a time slice expires, thus forcing the scheduler to be invoked. Processing the interrupt to switch the CPU to another process requires saving all the registers for the old process and then loading the registers for the new process. This task is known as context switching.
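The schedule just described can be traced in a few lines. This sketch (again only an illustration) approximates the circular ready queue with a repeated array scan, which is equivalent here because no new processes arrive:

    #include <stdio.h>

    #define QUANTUM 5

    int main(void)
    {
        int burst[] = { 20, 5, 10 };   /* remaining time of P1, P2, P3 */
        int n = 3, finished = 0, clock = 0;

        while (finished < n) {
            for (int i = 0; i < n; i++) {
                if (burst[i] == 0)
                    continue;          /* this process already completed */
                int slice = burst[i] < QUANTUM ? burst[i] : QUANTUM;
                clock += slice;        /* run process i for one time slice */
                burst[i] -= slice;
                if (burst[i] == 0) {
                    printf("P%d finishes at t=%d\n", i + 1, clock);
                    finished++;
                }
            }
        }
        return 0;   /* prints: P2 at t=10, P3 at t=25, P1 at t=35 */
    }

As the trace shows, the short process P2 completes after a single slice while P1 cycles through the queue four times.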

Scheduling and performance criteria
CPU utilization: the fraction of time the CPU is kept busy doing useful work.
Throughput: the amount of work completed in a unit of time. One way to measure throughput is by the number of processes that are completed in a unit of time; the higher the number of processes, the more work is apparently being done by the system.
Turnaround time: the interval from the time of submission of a process to the time of its completion. It is the sum of the periods spent waiting to get into memory, waiting in the ready queue, executing on the CPU, and performing I/O operations.
Waiting time: in a multiprogramming operating system several jobs reside in memory at a time, but the CPU executes only one job at a time; the rest of the jobs wait for the CPU. The waiting time may be expressed as the turnaround time less the actual processing time.
Response time: most frequently considered in time-sharing and real-time environments, though its definition differs between the two. In a time-sharing system it may be defined as the interval from the time the last character of a command line is entered to the time the last result appears on the terminal. In a real-time system it may be defined as the interval from the time an internal or external event is signaled to the time the first instruction of the respective service routine is executed.

What is a Thread? Why Use Threads?
A thread is a semi-process which has its own stack and executes a given piece of code. Unlike a real process, a thread normally shares its memory with other threads (whereas processes usually have a different memory area for each of them). A thread group is a set of threads all executing inside the same process. They all share the same memory, and thus can access the same global variables, the same heap memory, the same set of file descriptors, etc. All these threads execute in parallel (i.e. using time slices, or, if the system has several processors, then truly in parallel).
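Using POSIX threads, creating a thread group that shares a global variable looks like the sketch below (compile with -pthread; the function name worker is our own choice, and the mutex serializes access to the shared counter precisely because all threads see the same memory):

    #include <pthread.h>
    #include <stdio.h>

    static int counter = 0;                  /* shared by all threads */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);   /* shared memory needs serialization */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return arg;
    }

    int main(void)
    {
        pthread_t t1, t2;                /* two threads in one process */
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %d\n", counter); /* 200000, thanks to the mutex */
        return 0;
    }

Without the mutex the two increments could interleave and lose updates, which previews the resource-sharing problems discussed in Chapter 7.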

Processes Vs. Threads
For some programs that benefit from concurrency, the decision whether to use processes or threads can be difficult. Here are some guidelines to help you decide which concurrency model best suits your program:
• All threads in a program must run the same executable. A child process, on the other hand, may run a different executable by calling an exec function.
• An errant thread can harm other threads in the same process, because threads share the same virtual memory space and other resources. For instance, a wild memory write through an uninitialized pointer in one thread can corrupt memory visible to another thread. An errant process, on the other hand, cannot do so, because each process has a copy of the program's memory space.
• Copying memory for a new process adds a performance overhead relative to creating a new thread. However, the copy is performed only when the memory is changed, so the penalty is minimal if the child process only reads memory.
• The advantage of using a thread group instead of a normal serial program is that several operations may be carried out in parallel, and thus events can be handled immediately as they arrive (for example, if we have one thread handling a user interface and another thread handling database queries, we can execute a heavy query requested by the user and still respond to user input while the query is executed).
• The advantage of using a thread group over a process group is that context switching between threads is much faster than context switching between processes (context switching means that the system switches from running one thread or process to running another). Also, communication between two threads is usually faster and easier to implement than communication between two processes.
• Threads should be used for programs that need fine-grained parallelism. For example, if a problem can be broken into multiple, nearly identical tasks, threads may be a good choice. Processes should be used for programs that need coarser parallelism.
• Sharing data among threads is trivial because threads share the same memory.
• On the other hand, because threads in a group all use the same memory space, if one of them corrupts the contents of its memory, other threads might suffer as well. With processes, the operating system normally protects processes from one another, so if one corrupts its own memory space, other processes won't suffer. Another advantage of using processes is that they can run on different machines, while all the threads have to run on the same machine (at least normally).



Multi threading and hyper threading
Hyper-threading, officially called Hyper-Threading Technology (HTT), is Intel's trademark for their implementation of simultaneous multithreading technology on the Pentium 4 microarchitecture. It is a more advanced form of super-threading that debuted on the Intel Xeon processors and was later added to Pentium 4 processors. The technology improves processor performance under certain workloads by providing useful work for execution units that would otherwise be idle, for example during a cache miss. A Pentium 4 with Hyper-Threading enabled is treated by the operating system as two processors instead of one.

Performance
The advantages of Hyper-Threading are: improved support for multi-threaded code, allowing multiple threads to run simultaneously, and improved reaction and response time. According to Intel, the first implementation used only an additional 5% of the die area over the comparable non-hyper-threaded processor, yet yielded performance improvements of 15–30%. Intel claims up to a 30% speed improvement compared with an otherwise identical, non-simultaneous-multithreading Pentium 4. The performance improvement seen is very application-dependent, however, and some programs actually slow down slightly when Hyper-Threading Technology is turned on. This is due to the replay system of the Pentium 4 tying up valuable execution resources, thereby starving the other thread. (The Pentium 4 Prescott core gained a replay queue, which reduces the execution time needed for the replay system, but this is not enough to completely overcome the performance hit.) However, any performance degradation is unique to the Pentium 4 (due to various architectural nuances) and is not characteristic of simultaneous multithreading in general.

Details
Hyper-Threading works by duplicating certain sections of the processor—those that store the architectural state—but not duplicating the main execution resources. This allows a Hyper-Threading equipped processor to pretend to be two "logical" processors to the host operating system, allowing the operating system to schedule two threads or processes simultaneously. Where execution resources in a non-Hyper-Threading capable processor are not used by the current task, and especially when the processor is stalled, a Hyper-Threading equipped processor may use those execution resources to execute another scheduled task. (The processor may stall due to a cache miss, branch misprediction, or data dependency.) Except for its performance implications, this innovation is transparent to operating systems and programs. All that is required to take advantage of Hyper-Threading is symmetric multiprocessing (SMP) support in the operating system, as the logical processors appear as standard separate processors. However, it is possible to optimize operating system behavior on Hyper-Threading capable systems, such as the Linux techniques discussed in Kernel Traffic. For example, consider an SMP system with two physical processors that are both Hyper-Threaded (for a total of four logical processors). If the operating system's process scheduler is unaware of Hyper-Threading, it would treat all four processors the same. As a result, if only two processes are eligible to run, it might choose to schedule those processes on the two logical processors that happen to belong to one of the physical processors. Thus, one CPU would be extremely busy while the other CPU would be completely idle, leading to poor overall performance. This problem can be avoided by improving the scheduler to treat logical processors differently from physical processors; in a sense, this is a limited form of the scheduler changes that are required for NUMA systems.
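On Linux, one user-visible knob related to this discussion is the CPU affinity mask, which lets a process restrict itself to particular logical processors. A minimal sketch using sched_setaffinity(2) (Linux-specific; logical CPU numbering is system-dependent, so CPU 0 here is just an example):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);                     /* allow logical CPU 0 only */
        /* pid 0 means "the calling process". */
        if (sched_setaffinity(0, sizeof set, &set) == -1) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("now restricted to logical CPU 0\n");
        return 0;
    }

A hyper-threading-aware scheduler does this kind of placement automatically, spreading runnable processes across physical rather than merely logical processors.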

Chapter 5
Memory management sub system
Segmentation
Paging
Page allocation strategies
Memory mapping and unmapping

Memory Addresses
Programmers casually refer to a memory address as the way to access the contents of a memory cell. But when dealing with Intel 80x86 microprocessors, we have to distinguish among three kinds of addresses:
Logical address
Included in the machine language instructions to specify the address of an operand or of an instruction. This type of address embodies the well-known Intel segmented architecture that forces MS-DOS and Windows programmers to divide their programs into segments. Each logical address consists of a segment and an offset (or displacement) that denotes the distance from the start of the segment to the actual address.

Linear address (virtual address)
A single 32-bit unsigned integer that can be used to address up to 4 GB, that is, up to 4,294,967,296 memory cells. Linear addresses are usually represented in hexadecimal notation; their values range from 0x00000000 to 0xffffffff.
Physical address
Used to address memory cells included in memory chips. Physical addresses correspond to the electrical signals sent along the address pins of the microprocessor to the memory bus. They are represented as 32-bit unsigned integers.
The CPU control unit transforms a logical address into a linear address by means of a hardware circuit called a segmentation unit; successively, a second hardware circuit called a paging unit transforms the linear address into a physical address.
Figure: logical address translation.

Segmentation in Hardware
Starting with the 80386 model, Intel microprocessors perform address translation in two different ways called real mode and protected mode. Real mode exists mostly to maintain processor compatibility with older models and to allow the operating system to bootstrap itself at startup.

Memory allocation
The main memory must accommodate both the operating system and the various user processes. The memory is usually divided into two partitions, one for the resident operating system and one for the user processes. It is possible to place the operating system in either low memory or high memory. The major factor affecting this decision is the location of the interrupt vector. Since the interrupt vector is in low memory, it is more common to place the operating system in low memory as well; the user processes then reside in high memory. We need to protect the operating system code and data from changes by user processes, and we also need to protect the user processes from one another. We can provide this protection by using a relocation register together with a limit register. The relocation register contains the address of the smallest legal memory location; the limit register contains the range of valid logical addresses. Each logical address must be less than the value in the limit register.

External and internal fragmentation
As processes are loaded into and removed from memory, the free memory space is broken into little pieces. External fragmentation exists when enough total memory space exists to satisfy a request, but it is not contiguous: storage is fragmented into a large number of small holes. Fragmentation can be a severe problem; in the worst case we could have a block of free memory between every two processes. The selection of first-fit or best-fit allocation will affect the amount of fragmentation. Depending on the total amount of memory and the average process size, external fragmentation may be either a minor or a major problem. Another problem that arises with the multiple-partition allocation technique is the following: consider a hole of 18,500 bytes, and suppose the next process requests 18,498 bytes. If we allocate exactly the requested block, we are left with a hole of 2 bytes. The overhead to keep track of this hole is larger than the hole itself. The general approach is to allocate very small holes as part of the larger request. Thus the allocated memory may be slightly larger than the requested memory; the difference between these two numbers is internal fragmentation (memory that is internal to a partition but is not being used).
Figure: a memory layout showing the operating system, process P1, a hole, and process P2.

One solution to the problem of external fragmentation is compaction. The goal is to shuffle the memory contents to place all the free memory together in one large block. Compaction is not always possible: if relocation is static and done at assembly or load time, compaction cannot be done. Compaction is possible only if relocation is dynamic and done at execution time. If addresses are relocated dynamically, relocation requires only moving the program and data, then changing the base register to reflect the new base address. When compaction is possible, we must determine its cost. The simplest compaction algorithm is simply to move all processes toward one end of memory; all holes move to the other end.

Swapping can also be combined with compaction. A process can be rolled out of main memory to a backing store and rolled in again later. When the process is rolled out, its memory is released and perhaps reused by another process. If static relocation is used, the process must be rolled back into the same memory locations. If dynamic relocation is used, a process can be rolled into a different location; in this case we find a free block, compacting if necessary. One approach to compaction is to roll out the processes to be moved and roll them into different memory locations.
Note: paging is another solution to the external fragmentation problem (discussed later in this chapter).
Segmentation Registers
A logical address consists of two parts: a segment identifier and an offset that specifies the relative address within the segment. The segment identifier is a 16-bit field called the Segment Selector, while the offset is a 32-bit field. To make it easy to retrieve segment selectors quickly, the processor provides segmentation registers whose only purpose is to hold Segment Selectors; these registers are called cs, ss, ds, es, fs, and gs. Although there are only six of them, a program can reuse the same segmentation register for different purposes by saving its content in memory and then restoring it later. Three of the six segmentation registers have specific purposes:
cs: the code segment register, which points to a segment containing program instructions.
ss: the stack segment register, which points to a segment containing the current program stack.
ds: the data segment register, which points to a segment containing static and external data.
The remaining three segmentation registers are general purpose and may refer to arbitrary segments. The cs register has another important function: it includes a 2-bit field that specifies the Current Privilege Level (CPL) of the CPU. The value 0 denotes the highest privilege level, while the value 3 denotes the lowest one. Linux uses only levels 0 and 3, which are respectively called Kernel Mode and User Mode.
Segment Descriptors
Each segment is represented by an 8-byte Segment Descriptor that describes the segment characteristics. Segment Descriptors are stored either in the Global Descriptor Table (GDT) or in the Local Descriptor Table (LDT).

Each Segment Descriptor consists of the following fields:
• A 32-bit Base field that contains the linear address of the first byte of the segment.
• A G granularity flag: if it is cleared, the segment size is expressed in bytes; otherwise, it is expressed in multiples of 4096 bytes.
• A 20-bit Limit field that denotes the segment length. If G is set to 0, the size of a non-null segment may vary between 1 byte and 1 MB; otherwise, it may vary between 4 KB and 4 GB.
• An S system flag: if it is cleared, the segment is a system segment that stores kernel data structures; otherwise, it is a normal code or data segment.
• A 4-bit Type field that characterizes the segment type and its access rights. The following Segment Descriptor types are widely used:

Code Segment Descriptor
Indicates that the Segment Descriptor refers to a code segment; it may be included either in the GDT or in the LDT. The descriptor has the S flag set.
Data Segment Descriptor
Indicates that the Segment Descriptor refers to a data segment; it may be included either in the GDT or in the LDT. The descriptor has the S flag set. Stack segments are implemented by means of generic data segments.
Task State Segment Descriptor (TSSD)
Indicates that the Segment Descriptor refers to a Task State Segment (TSS), that is, a segment used to save the contents of the processor registers; it can appear only in the GDT. The corresponding Type field has the value 11 or 9, depending on whether the corresponding process is currently executing on the CPU. The S flag of such descriptors is set to 0.
Local Descriptor Table Descriptor (LDTD)
Indicates that the Segment Descriptor refers to a segment containing an LDT; it can appear only in the GDT. The corresponding Type field has the value 2. The S flag of such descriptors is set to 0.
• A DPL (Descriptor Privilege Level) 2-bit field used to restrict accesses to the segment. It represents the minimal CPU privilege level requested for accessing the segment. Therefore, a segment with its DPL set to 0 is accessible only when the CPL is 0, that is, in Kernel Mode, while a segment with its DPL set to 3 is accessible with every CPL value.
• A Segment-Present flag that is set to 0 if the segment is currently not stored in main memory. Linux always sets this flag to 1, since it never swaps out whole segments to disk.
• An additional flag called D or B depending on whether the segment contains code or data. Its meaning is slightly different in the two cases, but it is basically set if the addresses used as segment offsets are 32 bits long, and cleared if they are 16 bits long (see the Intel manual for further details).
• A reserved bit (bit 53), always set to 0.
• An AVL flag that may be used by the operating system but is ignored in Linux.
Segment Selectors
To speed up the translation of logical addresses into linear addresses, the Intel processor provides an additional nonprogrammable register—that is, a register that cannot be set by a programmer—for each of the six programmable segmentation registers. Each nonprogrammable register contains the 8-byte Segment Descriptor (described in the previous section) specified by the Segment Selector contained in the corresponding segmentation register. Every time a Segment Selector is loaded in a segmentation register, the corresponding Segment Descriptor is loaded from memory into the matching nonprogrammable CPU register. From then on, translations of logical addresses referring to that segment can be performed without accessing the GDT or LDT stored in main memory; the processor can just refer directly to the CPU register containing the Segment Descriptor. Accesses to the GDT or LDT are necessary only when the contents of the segmentation register change.
Each Segment Selector includes the following fields:
• A 13-bit index that identifies the corresponding Segment Descriptor entry contained in the GDT or in the LDT.
• A TI (Table Indicator) flag that specifies whether the Segment Descriptor is included in the GDT (TI = 0) or in the LDT (TI = 1).
• An RPL (Requestor Privilege Level) 2-bit field, which is precisely the Current Privilege Level of the CPU when the corresponding Segment Selector is loaded into the cs register.
Since a Segment Descriptor is 8 bytes long, its relative address inside the GDT or the LDT is obtained by multiplying the most significant 13 bits of the Segment Selector by 8. For instance, if the GDT is at 0x00020000 (the value stored in the gdtr register) and the index specified by the Segment Selector is 2, the address of the corresponding Segment Descriptor is 0x00020000 + (2 x 8), or 0x00020010. The first entry of the GDT is always set to 0: this ensures that logical addresses with a null Segment Selector will be considered invalid, thus causing a processor exception. The maximum number of Segment Descriptors that can be stored in the GDT is thus 8191, that is, 2^13 − 1.
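The descriptor-address arithmetic can be expressed in a few lines of C. This is only an illustration of the selector's bit layout (index in the top 13 bits, TI and RPL in the low 3 bits), not code that could actually inspect the real GDT from user space:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t gdt_base = 0x00020000;     /* example value held in gdtr */
        uint16_t selector = (2u << 3) | 0;  /* index 2, TI = 0 (GDT), RPL = 0 */

        uint32_t index     = selector >> 3;        /* top 13 bits */
        uint32_t desc_addr = gdt_base + index * 8; /* 8-byte descriptors */

        printf("descriptor at 0x%08x\n", (unsigned)desc_addr); /* 0x00020010 */
        return 0;
    }

The printed address matches the 0x00020010 result computed in the text.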

Segmentation in Hardware:
Normally, the user program is compiled, and the compiler automatically constructs segments reflecting the input program. A compiler might create separate segments for:
1. the global variables;
2. the procedure call stack, to store parameters and return addresses;
3. the code portion of each procedure or function; and
4. the local variables of each procedure and function.
The loader takes all these segments and assigns them segment numbers. Although the user can now refer to objects in the program by a two-dimensional address, the actual physical memory is still one-dimensional. Thus, we must define an implementation to map two-dimensional user-defined addresses into one-dimensional physical addresses. This mapping is effected by a segment table. Each entry of the segment table has a segment base and a segment limit. The segment base contains the starting physical address where the segment resides in memory, whereas the segment limit specifies the length of the segment. A logical address consists of two parts: a segment number s, and an offset into the segment d. The segment number is used as an index into the segment table. The offset d of the logical address must be between 0 and the segment limit. The segment table is thus an array of base-limit register pairs.

Segmentation Unit
The segmentation unit translates a logical address into the corresponding linear address by performing the following operations:
• It examines the TI field of the Segment Selector, in order to determine which Descriptor Table stores the Segment Descriptor. This field indicates that the Descriptor is either in the GDT (in which case the segmentation unit gets the base linear address of the GDT from the gdtr register) or in the active LDT (in which case the segmentation unit gets the base linear address of that LDT from the ldtr register).
• It computes the address of the Segment Descriptor from the index field of the Segment Selector. The index field is multiplied by 8 (the size of a Segment Descriptor), and the result is added to the content of the gdtr or ldtr register.
• It adds the offset of the logical address to the Base field of the Segment Descriptor, thus obtaining the linear address.

Paging
External fragmentation of memory can be handled efficiently by paging, which permits the logical address space of a process to be noncontiguous, thus allowing a process to be allocated physical memory wherever it is available. Paging avoids the considerable problem of fitting memory chunks of varying sizes.

Paging in Hardware
The paging unit translates linear addresses into physical ones. It checks the requested access type against the access rights of the linear address; if the memory access is not valid, it generates a page fault exception. For the sake of efficiency, linear addresses are grouped in fixed-length intervals called pages; contiguous linear addresses within a page are mapped into contiguous physical addresses. In this way, the kernel can specify the physical address and the access rights of a page instead of those of all the linear addresses included in it. Following the usual convention, we shall use the term "page" to refer both to a set of linear addresses and to the data contained in this group of addresses. The paging unit thinks of all RAM as partitioned into fixed-length page frames (they are sometimes referred to as physical pages). Each page frame contains a page, that is, the length of a page frame coincides with that of a page. A page frame is a constituent of main memory, and hence it is a storage area. It is important to distinguish a page from a page frame: the former is just a block of data, which may be stored in any page frame or on disk. The data structures that map linear to physical addresses are called page tables; they are stored in main memory and must be properly initialized by the kernel before enabling the paging unit. In Intel processors, paging is enabled by setting the PG flag of the cr0 register. When PG = 0, linear addresses are interpreted as physical addresses.
Regular Paging
Starting with the i80386, the paging unit of Intel processors handles 4 KB pages. The 32 bits of a linear address are divided into three fields:
Directory: the most significant 10 bits
Table: the intermediate 10 bits
Offset: the least significant 12 bits
The translation of linear addresses is accomplished in two steps, each based on a type of translation table. The first translation table is called the Page Directory and the second is called the Page Table. The physical address of the Page Directory in use is stored in the cr3 processor register. The Directory field within the linear address determines the entry in the Page Directory that points to the proper Page Table. The address's Table field, in turn, determines the entry in the Page Table that contains the physical address of the page frame containing the page. The Offset field determines the relative position within the page frame. Since it is 12 bits long, each page consists of 4096 bytes of data.

Figure: paging by Intel 80x86 processors.

Both the Directory and the Table fields are 10 bits long, so Page Directories and Page Tables can include up to 1024 entries. It follows that a Page Directory can address up to 1024 x 1024 x 4096 = 2^32 memory cells, as you'd expect with 32-bit addresses. The entries of Page Directories and Page Tables have the same structure. Each entry includes the following fields:
Present flag
If it is set, the referred page (or Page Table) is contained in main memory; if the flag is 0, the page is not contained in main memory and the remaining entry bits may be used by the operating system for its own purposes.
Field containing the 20 most significant bits of a page frame physical address
Since each page frame has a 4 KB capacity, its physical address must be a multiple of 4096, so the 12 least significant bits of the physical address are always equal to 0. If the field refers to a Page Directory, the page frame contains a Page Table; if it refers to a Page Table, the page frame contains a page of data.
Accessed flag
Set each time the paging unit addresses the corresponding page frame. This flag may be used by the operating system when selecting pages to be swapped out. The paging unit never resets this flag; this must be done by the operating system.

Dirty flag
Applies only to Page Table entries. It is set each time a write operation is performed on the page frame. As in the previous case, this flag may be used by the operating system when selecting pages to be swapped out. The paging unit never resets this flag; this must be done by the operating system.
Read/Write flag
Contains the access right (Read/Write or Read) of the page or of the Page Table.
User/Supervisor flag
Contains the privilege level required to access the page or Page Table.
Two flags called PCD and PWT
Control the way the page or Page Table is handled by the hardware cache.
Page Size flag
Applies only to Page Directory entries. If it is set, the entry refers to a 4 MB long page frame (see the following section).
If the entry of a Page Table or Page Directory needed to perform an address translation has the Present flag cleared, the paging unit stores the linear address in the cr2 processor register and generates exception 14, that is, the "Page fault" exception.
Extended Paging
Starting with the Pentium model, Intel 80x86 microprocessors introduce extended paging, which allows page frames to be either 4 KB or 4 MB in size.
Figure: extended paging.

As we have seen in the previous section, extended paging is enabled by setting the Page Size flag of a Page Directory entry. In this case, the paging unit divides the 32 bits of a linear address into two fields:
Directory: the most significant 10 bits
Offset: the remaining 22 bits
Page Directory entries for extended paging are the same as for normal paging, except that:
• The Page Size flag must be set.
• Only the 10 most significant bits of the 20-bit physical address field are significant. This is because each physical address is aligned on a 4 MB boundary, so the 22 least significant bits of the address are 0.

Extended paging coexists with regular paging; it is enabled by setting the PSE flag of the cr4 processor register. Extended paging is used to translate large intervals of contiguous linear addresses into corresponding physical ones; in these cases, the kernel can do without intermediate Page Tables and thus save memory.
Hardware Protection Scheme
The paging unit uses a different protection scheme from the segmentation unit. While Intel processors allow four possible privilege levels for a segment, only two privilege levels are associated with pages and Page Tables, because privileges are controlled by the User/Supervisor flag mentioned earlier. When this flag is 0, the page can be addressed only when the CPL is less than 3 (this means, for Linux, when the processor is in Kernel Mode). When the flag is 1, the page can always be addressed. Furthermore, instead of the three types of access rights (Read, Write, Execute) associated with segments, only two types of access rights (Read, Write) are associated with pages. If the Read/Write flag of a Page Directory or Page Table entry is equal to 0, the corresponding Page Table or page can only be read; otherwise it can be read and written.
An Example of Paging
A simple example will help clarify how paging works. Let us assume that the kernel has assigned the linear address space between 0x20000000 and 0x2003ffff to a running process. This space consists of exactly 64 pages. We don't care about the physical addresses of the page frames containing the pages; in fact, some of them might not even be in main memory. We are interested only in the remaining fields of the page table entries.

Let us start with the 10 most significant bits of the linear addresses assigned to the process, which are interpreted as the Directory field by the paging unit. The addresses start with a 2 followed by zeros, so the 10 bits all have the same value, namely 0x080, or 128 decimal. Thus the Directory field in all the addresses refers to the 129th entry of the process Page Directory. The corresponding entry must contain the physical address of the Page Table assigned to the process. If no other linear addresses are assigned to the process, all the remaining 1023 entries of the Page Directory are filled with zeros.
Figure: an example of paging.

The values assumed by the intermediate 10 bits (that is, the values of the Table field) range from 0 to 0x03f, or from 0 to 63 decimal. Thus, only the first 64 entries of the Page Table are significant; the remaining 960 entries are filled with zeros. Suppose that the process needs to read the byte at linear address 0x20021406. This address is handled by the paging unit as follows:
1. The Directory field 0x80 is used to select entry 0x80 of the Page Directory, which points to the Page Table associated with the process's pages.
2. The Table field 0x21 is used to select entry 0x21 of the Page Table, which points to the page frame containing the desired page.
3. Finally, the Offset field 0x406 is used to select the byte at offset 0x406 in the desired page frame.
If the Present flag of the 0x21 entry of the Page Table is cleared, the page is not present in main memory; in this case, the paging unit issues a page fault exception while translating the linear address. The same exception is issued whenever the process attempts to access linear addresses outside the interval delimited by 0x20000000 and 0x2003ffff, since the Page Table entries not assigned to the process are filled with zeros; in particular, their Present flags are all cleared.
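The field extraction in this example is plain bit manipulation, as the following sketch shows; it reproduces the Directory, Table, and Offset values used above:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t linear = 0x20021406;

        uint32_t dir    = linear >> 22;           /* most significant 10 bits */
        uint32_t table  = (linear >> 12) & 0x3ff; /* intermediate 10 bits */
        uint32_t offset = linear & 0xfff;         /* least significant 12 bits */

        /* Prints dir=0x80 table=0x21 offset=0x406, matching the text. */
        printf("dir=0x%x table=0x%x offset=0x%x\n",
               (unsigned)dir, (unsigned)table, (unsigned)offset);
        return 0;
    }

The hardware performs exactly this decomposition, then uses the two upper fields as indices into the Page Directory and Page Table respectively.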

Three-Level Paging
Two-level paging is used by 32-bit microprocessors. But in recent years, several microprocessors (such as Compaq's Alpha and Sun's UltraSPARC) have adopted a 64-bit architecture. In this case, two-level paging is no longer suitable and it is necessary to move up to three-level paging. Let us use a thought experiment to see why.
Start by assuming about as large a page size as is reasonable (since you have to account for pages being transferred routinely to and from disk). Let's choose 16 KB for the page size. Since 1 KB covers a range of 2^10 addresses, 16 KB covers 2^14 addresses, so the Offset field would be 14 bits. This leaves 50 bits of the linear address to be distributed between the Table and the Directory fields. If we now decide to reserve 25 bits for each of these two fields, this means that both the Page Directory and the Page Tables of a process would include 2^25 entries, that is, more than 32 million entries. Even if RAM is getting cheaper and cheaper, we cannot afford to waste so much memory just for storing the page tables.
The solution chosen for Compaq's Alpha microprocessors is the following:
• Page frames are 8 KB long, so the Offset field is 13 bits long.
• Only the least significant 43 bits of an address are used. (The most significant 21 bits are always set to 0.)
• Three levels of page tables are introduced, so that the remaining 30 bits of the address can be split into three 10-bit fields. Thus the Page Tables include 2^10 = 1024 entries, as in the two-level paging scheme examined previously.
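As a rough sketch, the 13 + 10 + 10 + 10 bit layout described above can be expressed in C as follows; the field names and the sample address are ours, for illustration only:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* Only the least significant 43 bits of the address are used. */
        uint64_t linear = 0x00123456789aULL & ((1ULL << 43) - 1);

        uint64_t offset = linear & 0x1fff;         /* 13-bit Offset (8 KB pages) */
        uint64_t level3 = (linear >> 13) & 0x3ff;  /* lowest 10-bit table field  */
        uint64_t level2 = (linear >> 23) & 0x3ff;  /* middle 10-bit table field  */
        uint64_t level1 = (linear >> 33) & 0x3ff;  /* top 10-bit directory field */

        printf("L1 = 0x%03llx, L2 = 0x%03llx, L3 = 0x%03llx, offset = 0x%04llx\n",
               (unsigned long long)level1, (unsigned long long)level2,
               (unsigned long long)level3, (unsigned long long)offset);
        return 0;
    }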

Page allocation strategies

Chapter 6 File sub system Importance of File system File representation in the system File creation and maintenance Block devices and file system

File attributes
A file has certain attributes, which vary from one operating system to another, but typically consist of:
• Name: the symbolic file name is the only information kept in human-readable form.
• Type: this information is needed for those systems that support different file types.
• Location: this information is a pointer to a device and to the location of the file on that device.
• Size: the current size of the file (in bytes or blocks).
• Protection: access-control information determines who can do reading, writing, executing, and so on.
• Date, time and user information: time stamps can be kept for i) creation, ii) last modification, and iii) last usage. User information identifies the owner of the file and the group the file belongs to.
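As an illustration, these attributes might be grouped in a structure like the following minimal C sketch; the field names are ours, not those of any particular operating system:

    #include <time.h>        /* time_t */
    #include <sys/types.h>   /* off_t, mode_t, uid_t, gid_t */

    struct file_attributes {
        char   name[256];    /* symbolic, human-readable file name        */
        int    type;         /* file type, on systems that record it      */
        off_t  location;     /* where on the device the file's data lives */
        off_t  size;         /* current size in bytes or blocks           */
        mode_t protection;   /* who may read, write, execute, and so on   */
        uid_t  owner;        /* owning user                               */
        gid_t  group;        /* group the file belongs to                 */
        time_t created;      /* creation time stamp                       */
        time_t modified;     /* last modification                         */
        time_t accessed;     /* last usage                                */
    };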

The information about all files is kept in the directory structure, which also resides on secondary storage.

File operations
A file is an abstract data type. To define a file we need to consider the operations that can be performed on it. The operating system provides system calls to create, read, write, reposition, delete, and truncate files.
• Creating a file: two steps are necessary to create a file. First, space in the file system must be found for the file. Second, an entry for the new file must be made in the directory.
• Writing to a file: to write to a file, we make a system call specifying the name of the file and the information to be written. The system must keep a write pointer to the location in the file where the next write takes place.
• Reading a file: to read from a file, we use a system call that specifies the name of the file and where in memory the next block of the file should be put. Normally the system keeps only one pointer for both reading and writing.
• Repositioning within a file: the directory is searched for the appropriate entry, and the current file position is set to a given value. Repositioning within a file does not involve any actual I/O.
• Deleting a file: to delete a file, we search the directory for the named file. Having found the associated directory entry, we release all the file space and erase the directory entry.
• Truncating a file: there are occasions when the user wants the attributes of the file to remain the same, but wants to erase the contents of the file.
Most of the file operations involve searching the directory for the entry associated with the named file. To avoid this constant searching, many systems require that a file be opened before it is first used. The operating system keeps a small table containing information about all open files; when a file operation is requested, an index into this table is used. When the file is closed, the operating system removes its entry from the table.
The implementation of open and close is more complicated in a multi-user operating system such as UNIX, where several users may open multiple files at the same time. Typically the operating system uses two levels of internal tables. There is a per-process table of all the files that each process has open; it contains the information regarding that process's use of the file, for instance the current file pointer. Each entry in the per-process table in turn points to a system-wide open-file table. The system-wide table contains information that is process-independent, such as the location of the file on disk, access dates, and file size. Once the file has been opened by one process, another process executing an open call simply gets a new entry added to its per-process table, with a new current pointer and a pointer to the appropriate entry in the system-wide table. Typically the system-wide open-file table also keeps an open count for each file, indicating the number of processes that have the file open. Each close decreases this count, and when the count reaches zero the file is no longer in use and its entry is removed from the table.
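A minimal C sketch of this two-level bookkeeping is shown below; the structure and field names are ours, for illustration, not an actual UNIX implementation:

    #include <time.h>        /* time_t */
    #include <sys/types.h>   /* off_t */

    /* Shared, process-independent information: one entry per open file. */
    struct systemwide_entry {
        off_t  disk_location;   /* where the file lives on disk            */
        off_t  size;            /* current file size                       */
        time_t access_date;     /* last access                             */
        int    open_count;      /* processes with the file open; the entry
                                   is freed when this count reaches zero   */
    };

    /* Per-process information: one entry per file the process has open. */
    struct perprocess_entry {
        off_t current_offset;            /* this process's read/write pointer  */
        struct systemwide_entry *file;   /* pointer into the system-wide table */
    };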

File structure
File types may also be used to indicate the internal structure of a file. Certain files must conform to a required structure that is understood by the operating system. For example, the operating system may require that an executable file have a specific structure, so that it can determine where in memory to load the file and where the first instruction is located. This idea has a disadvantage: an operating system that supports multiple file systems may not be able to determine the type of a file a priori. To avoid this, some operating systems use a Virtual File System (VFS).
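The VFS idea can be sketched in C as a table of function pointers that each concrete file system fills in; the kernel calls through these pointers without knowing the on-disk format. The names below are illustrative, not the actual Linux VFS interface:

    struct vfs_file;   /* opaque handle for an open file */

    struct vfs_operations {
        int  (*open)(struct vfs_file *f, const char *path);
        long (*read)(struct vfs_file *f, void *buf, unsigned long count);
        long (*write)(struct vfs_file *f, const void *buf, unsigned long count);
        int  (*close)(struct vfs_file *f);
    };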
