UNIX
Table of Contents

Preface .......... 4
1. Operating Systems Basics .......... 5
   1.1 Introduction .......... 5
   1.2 What is an Operating system? .......... 6
   1.3 Functions of an Operating System .......... 7
   1.4 Process Management .......... 8
   1.5 Memory Management .......... 9
   1.6 Device (I/O) Management .......... 10
   1.7 File System Management .......... 11
   1.8 Flavors .......... 12
2. UNIX – Overview .......... 13
   2.1 History .......... 13
   2.2 Flavors .......... 15
   2.3 Logging In .......... 16
   2.4 File and Directory Structure .......... 17
   2.5 Process .......... 19
3. Shell Basics .......... 22
   3.1 Introduction .......... 22
       Program Initiation .......... 24
       Input-output Redirection .......... 24
       Pipeline Connection .......... 25
       Substitution of Filenames .......... 25
       Maintenance of Variables .......... 25
       Environment Control .......... 26
   3.2 Flavors .......... 27
   3.3 Interaction with Shell .......... 33
   3.4 Features .......... 36
   3.5 Program execution .......... 41
   3.6 Background Jobs .......... 43
   3.7 Batch jobs .......... 43
   3.8 Input, output and pipes .......... 47
4. The Command Line .......... 52
   4.1 Creating and Manipulating Files and Directories .......... 52
   4.2 Controlling Permissions to Files .......... 68
   4.3 grep and find .......... 79
   4.4 Extracting data .......... 89
   4.5 Redirection and Piping .......... 93
   4.6 Sorting and Comparing .......... 95
5. The vi editor .......... 99
   5.1 Command mode .......... 101
   5.2 Ex mode .......... 101
   5.3 Edit mode .......... 101
6. Shell Programming .......... 104
   6.1 Variables .......... 104
   6.2 Command-Line arguments .......... 106
   6.3 Decision-making constructs .......... 107
   6.4 Looping Constructs .......... 114
   6.5 Reading data .......... 118
7. Basics of UNIX Administration .......... 121
   7.1 Login Process and Run Levels .......... 121
   7.2 Processes .......... 125
   7.3 Archiving and backup .......... 127
   7.4 Security .......... 132
8. Communication .......... 133
   8.1 Basic UNIX mail .......... 133
   8.2 Communicating with other users .......... 137
   8.3 Accessing the Internet .......... 139
   8.4 e-mail on the Internet .......... 141
9. Makefile concepts .......... 143
   9.1 Introduction .......... 143
   9.2 Using Variables in Makefile .......... 146
   9.3 Writing Makefiles .......... 148
   9.4 Sample Makefile .......... 153
Preface

This courseware is meant for beginners to the UNIX Operating System. The information presented provides an overview of the UNIX operating system and enough material to gain considerable working knowledge. The content is divided into nine sections to enable easy understanding.

The first section provides the basics of operating systems, which will help beginners who come from a non-computer-science background. The structure and functions of any operating system in general are discussed here, along with its different modules, namely process management, memory management, file system management and input-output management, and the different flavors of operating systems, in brief.

The second section gives an overview of the UNIX Operating System, briefly discussing the history, flavors and structure of UNIX. The logging-in process and other basic concepts underlying the UNIX operating system are also discussed here.

The concepts and features relating to the UNIX shell are discussed in the third section. This covers the variety of shells available, interacting with the shell, the features associated with the shell, executing ordinary programs, batch jobs and background jobs, and the concept of input, output and pipes.

The various operations carried out on the command line are discussed in the fourth section. Creating and manipulating files and directories, controlling permissions, extracting data, redirection, piping, sorting, comparing and so on form the essence of this section.

The famous vi editor is discussed, along with its various modes of operation, in the fifth section.

The Shell Programming concepts and constructs are discussed in the sixth section.

The seventh section covers the basics of UNIX administration and the security aspects of the UNIX Operating System.

The eighth section covers connecting to other systems, basic concepts of mailing in UNIX, communicating with other users, and accessing and e-mailing on the Internet.

The last section, on Makefile concepts, discusses building an application on the UNIX platform.
Chapter 1

1. Operating Systems Basics

1.1 Introduction

The structure of a computer system can be depicted as in the figure below. The figure shows how the operating system fits in a computer system.
Application Programs
System Programs
Operating System
Assembly Language
Micro program
Physical devices

Figure 1 – Computer System

Physical devices consist of IC chips, wires, power supplies, CRTs and so on. The micro program is located in the ROM; it directly controls the physical devices and provides a cleaner interface to the next layer. This layer could also be implemented in hardware. The micro program interprets the machine language or assembly language instructions for the devices. Assembly language is specific to the processor and typically contains 50 to 300 instructions for simple operations such as moving data around, arithmetic and comparisons. The operating system hides all this complexity and provides a clean interface to the system and application programs that reside on it. System programs, such as compilers, editors and command interpreters, are loaded at the user's convenience on top of the operating system. Application programs are those written by users to solve their own problems.
1.2 What is an Operating system?

The term Operating System denotes those system program modules within a computer system that govern the control of equipment resources such as processors, main storage, secondary storage, I/O devices, and files, and that provide a base for all application programs such as word processors, database management systems, and so on. The Operating System can be viewed as an extended machine and also as a resource manager; the former being the top-down view and the latter being the bottom-up view.
Operating system as an extended machine (Top-down view)

• Hides the truth about the hardware from the programmer
• Provides a nice, simple view of named files that can be read and written
• Provides abstraction
Hence, the top-down view looks upon the operating system as providing an easy interface to the user hiding all the hardware complexities.
Operating system as a resource manager (Bottom-up view)

• Manages all pieces of a complex system
• Provides an orderly and controlled allocation of processors, memory, and I/O devices among the various programs competing for them
• Keeps track of who is using which resource, grants resource requests, accounts for usage, and mediates conflicting requests from different programs and users
Hence, the bottom-up view looks upon the operating system as the component that governs process management, memory management, file system management and I/O management.
1.3 Functions of an Operating System

The functions of an Operating System can be listed as follows:

• Provides an easy interface by hiding the hardware complexities
• Process Management
• Memory Management
• File System Management
• I/O Management
These functions will be discussed in brief in the subsequent sections.
1.4 Process Management

A process is a program in execution. Processor management is concerned with the management of the physical processors, specifically the assignment of processors to processes. As a process executes, it changes state. The state of a process is defined in part by the current activity of that process. The process state diagram is as follows:
Figure 2 – Process States
(A process moves between the states New, Ready, Running, Blocked and Exit via the transitions Admitted, Scheduler Dispatch, Interrupt, I/O or Event Wait, I/O Completion and Termination.)
The various functions of process management are as follows:

• Keep track of the status of processes using process tables. The Operating System maintains a process table with one entry per process. When a switch takes place, the values for the current process are saved in the process table and the values corresponding to the next process are loaded.
• Scheduling of processes, an important function, decides which process will use the processor. The job scheduler chooses from all the jobs submitted to the system and decides which one will be allowed to use the processor.
• Allocate resources to a process by setting up the necessary hardware registers; the module that does this is often called the dispatcher.
• Reclaim resources when a process relinquishes processor usage, terminates, or exceeds its allowed amount of usage.
• Inter-process communication using system calls.
• Process synchronization using monitors, event counters and semaphores.
1.5 Memory Management

Memory management is concerned with the management of primary memory. The primary memory, or core memory, is the part of memory that the processors access directly for instructions and data. The various functions of memory management are as follows:

• Keep track of the status of each location of primary memory, whether allocated to a process or free
• Determining the allocation policy for memory
• Allocation technique – once it is decided to allocate memory, the specific location must be selected and the allocation information updated
• De-allocation technique and policy – handling the de-allocation of memory
• Paging – swapping pages from secondary memory to primary memory when a page fault occurs
1.6 Device (I/O) Management

Management of I/O devices includes efficient management of actual I/O devices such as printers, card readers, tapes, disks, control units, control channels and so on. There are three basic techniques for implementing device management:

Dedicated – a technique whereby a device is assigned to a single process.
Shared – a technique whereby a device is shared by many processes.
Virtual – a technique whereby a physical device is simulated on another.

The various functions of device management are as follows:

• Keep track of the status of I/O devices using unit control blocks for devices
• I/O scheduling using scheduling algorithms
• Allocation – physically assigning a device to a process. The corresponding channels and control units must also be assigned.
• De-allocation – may be done on either a process level or a job level. On a process level, a device may be assigned for as long as the process needs it; on a job level, a device is assigned for as long as the job exists in the system.

The method of deciding how devices are allocated depends on the flexibility of the device. Some devices cannot be shared (e.g., card readers) and must therefore be dedicated to a process. Others may be shared (e.g., disks) and hence offer more flexibility. Still others may be made into virtual devices. For example, the operation of punching on a cardpunch could be transformed into a write onto a disk (a virtual card punch), and at some later time a routine would copy the information onto a real card punch. Virtual card reader, card punch, and printer devices are implemented by spooling routines. The virtual devices approach allows:

• Dedicated devices to be shared, hence more flexibility in scheduling these devices
• More flexibility in job scheduling
• A better match between the speed of the device and the speed of requests for that device
1.7 File System Management

Information management is concerned with the storage and retrieval of the information entrusted to the system. The modules of the information system are collectively referred to as the file system. The various functions of file system management are as follows:

• Keep track of all information in the system through various tables, the major one being the file directory – sometimes called the VTOC (Volume Table of Contents). These tables contain the name, location, and access rights of all information within the system.
• Decide the policy for determining where and how information is stored and who gets access to it. Factors influencing this policy are efficient utilization of secondary storage, efficient access, flexibility to users, and protection of access rights to the information requested.
• Allocate the information, e.g., open a file. Once the decision is made to let a process have access to information, the allocation modules must find the desired information, make it accessible to the process, and establish the appropriate access rights.
• De-allocate the information, e.g., close a file. Once the information is no longer needed, temporary table entries and other such resources may be released. If the user has updated the information, the original copy may be updated for possible use by other processes.
1.8 Flavors

The different flavors of operating systems are as follows:

MS-DOS
UNIX
WINDOWS
MVS/ESA
ATLAS
XDS-940
THE
RC 4000
CTSS
MULTICS
MACH

The student is advised to explore more information about the operating systems mentioned above.
Chapter 2

2. UNIX – Overview

2.1 History

An open discussion on the workings of an operating system is never complete without discussing what the operating system is and the history behind it. The purpose of this module is to explain how UNIX came to be what it is today. UNIX is a multi-user, multitasking, multithreading computer operating system that enables different people to access a computer at the same time and to run more than one program simultaneously. Since its humble beginning nearly 40 years ago, it has been redefined and refined time and time again. Networking capabilities enhance the suitability of UNIX for the workplace, and support for DOS and Windows is coming in the 32-bit workstation markets. UNIX was designed to be simple and flexible. The key components include a hierarchical directory tree that divides files into directories, and real-time processing.
Exploring History

The UNIX operating system came into life more or less by accident. In the late 1960s, an operating system called MULTICS was designed at the Massachusetts Institute of Technology to run on GE mainframe computers. Built on banks of processors, MULTICS enabled information sharing among users, although it required huge amounts of memory and ran slowly. Ken Thompson, working for Bell Labs, wrote a crude computer game to run on the mainframe. Because the performance the mainframe gave was poor and the cost of running it was high, Ken, with the help of Dennis Ritchie, rewrote the game to run on a DEC computer and, in the process, wrote an entire operating system as well. Several variations of the story have circulated, but the common account is that the system's name is a derivative of MULTICS. In 1970, Thompson and Ritchie's operating system came to be called UNIX, and Bell Labs kicked in financial support to refine the product. By 1972, around 10 computers were running UNIX, and in 1973, Thompson and Ritchie rewrote the kernel from assembly language in the C language, the brainchild of Ritchie. Since then, UNIX and C have been intertwined, and the growth of UNIX is partially due to the ease of porting the C language to other platforms. AT&T, the parent company of Bell Labs, offered UNIX in source-code form to government institutions and universities for a fraction of its worth. In 1979, it was ported to the popular VAX computers from Digital, further cementing its way into many universities. In 1975, Thompson moved to the University of California, where he recruited a graduate student named Bill Joy to help enhance the system. In 1977, Joy mailed out free copies of his system modifications. When AT&T began releasing UNIX as a commercial product, system numbers were used (System III, System V and so on). The refinements done at the university were released as the Berkeley Software Distribution, or BSD (2BSD, 3BSD, and so on). These included the vi editor and the C shell.
AT&T versions accepted 14 characters for file names; Berkeley expanded the limit to 255. Towards the end of the 1970s, ARPA (Advanced Research Projects Agency) of the DoD (Department of Defense) adopted the Berkeley version of UNIX. Bill Joy, in the meantime, left the campus setting and became one of the founding members of Sun Microsystems. Sun workstations used a derivative of BSD known as the Sun Operating System, or SunOS. As there was a lack of uniformity among UNIX versions, several steps were taken to correct the problem. In 1988, Sun Microsystems and AT&T joined forces to rewrite UNIX into System V, Release 4.0. Other companies, like IBM and Digital Equipment, countered by forming their own standards group to come up with a guideline for UNIX. Both groups incorporated BSD in their guidelines but still managed to come up with different versions of System V. In 1992, a joint product, UNIXWARE, was announced by UNIX System Laboratories, a spin-off company of AT&T; UNIXWARE, which combines UNIX with features of NetWare, was then marketed. In the early 1990s, Berkeley announced that no more editions of BSD would be forthcoming. As the enhancements in PC operating systems became interesting, Microsoft created XENIX, a form of UNIX designed for the desktop PC, and later sold it to Santa Cruz Operation (SCO). As the personal computer has matured, UNIX has come into favor as an operating system for it. SCO UNIX, as well as Sun's and Novell's entries, provide excellent operating systems for multi-user environments.
2.2 Flavors

Since it began to escape from AT&T's Bell Laboratories in the early 1970s, the success of the UNIX operating system has led to many different versions. Universities, research institutes, government bodies and computer companies all began using the powerful UNIX system to develop many of the technologies which today are part of the IT environment. Computer Aided Design, Manufacturing Control Systems, Laboratory Simulations and the Internet began life with and because of UNIX systems. Soon all the large vendors, and many smaller ones, were marketing their own versions of the UNIX system, optimized for their own computer architectures and boasting many different strengths and features. The figure below gives a chronological picture of UNIX.
Figure 3 – UNIX Chronology The different flavors of UNIX are shown in the picture. Various giants like SCO, IBM, AT&T, Siemens, Berkeley, SUN Microsystems, DEC, and HP redefined the preliminary versions adding many features and thus leading to many flavors of UNIX. The different flavors include Multics, System III, BSD, Sun OS, Solaris, Sinix, Ultrix, HP UX, AIX, XENIX, SCO-UNIX and so on.
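With so many flavors in circulation, the first practical question on an unfamiliar machine is which one you are sitting at. A small sketch, assuming a POSIX-style uname command:

```shell
# Print the operating system name (e.g., SunOS, AIX, HP-UX, Linux)
uname -s
# Print the release of the operating system
uname -r
# Print the hardware type of the machine
uname -m
# Or print everything at once
uname -a
```

Running this on each system you log into is a quick way to relate a real machine to the family tree in the figure above.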
2.3 Logging In

A UNIX session begins with the login prompt, which appears on the screen. The system administrator must assign a login name and password to you before you can gain admittance to the system. After you have obtained both, enter the login name at the prompt, as in the following:

login : user1 <Enter>

Next, a password : prompt appears. Enter your password here. After you correctly enter the login name and password, the shell prompt appears. The shell prompt can be a dollar sign ($) or a percent sign (%). UNIX contains various shells, or command interpreters. The % identifies that the C shell is in use, while the $ is used by most other shells. A # sign, on the other hand, indicates that you have logged in with administrative permissions. To end your session with the operating system, you need to leave the shell and return to the login prompt. This task is known as logging out or signing off. Several methods are available depending on your software. The preferred method is the exit command. An alternative is pressing Ctrl+D, which signifies the end of data input. Other commands that might exist on your system and perform the same function are logout, logoff, or log.
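Once logged in, a few standard commands confirm who the system thinks you are and who else is on the machine. A minimal sketch:

```shell
# Show the login name of the current user
whoami
# Show the numeric user id of the current user
id -u
# Show everyone currently logged in, with their terminals and login times
who
```

If `id -u` prints 0, you are logged in as the super user, which matches the # prompt described above.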
2.4 File and Directory Structure

The UNIX operating system uses an inverted tree structure, much the same as many other operating systems. In an inverted tree, one main directory branches into several subdirectories, and each subdirectory can further branch into more subdirectories. This structure, although novel at the time UNIX first became available, is now familiar and commonplace in most operating systems. The purpose of each of the standard subdirectories, and the method by which UNIX maintains files and directories, however, is vitally important.
Root directory and its branches The root directory is the beginning, or first layer, of the file system. Symbolized as a forward slash (/), the root directory is the point from which all other subdirectories branch. A root user also called the super user has the ability to change anything related to the file system without question. This user can bring the system up, shut it down, and do everything in between. By no small coincidence, the home directory of this user is the root directory – from here, all information filters down. The figure below illustrates this by presenting the classic diagram of the UNIX subdirectory file system.
Figure 4 – Inverted tree of UNIX – Classic Diagram
(The root directory / branches into subdirectories such as bin, dev, etc, lib, lost+found, shlib, include, tmp, tcb, usr, man, var and spool.)

Only one file – UNIX – should be within a normal system's root directory. This is the actual, bootable operating system file; in its absence, the operating system cannot come up after a restart. This file is also known as the kernel. In addition to the UNIX file, there are a number of subdirectories. Every system has unique ones created by an administrator for specific purposes, but the default ones created when a new operating system is installed are as follows.
bin
dev
etc
lib
lost+found
shlib
tcb
tmp
usr
var
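The branches of the tree can be explored directly with ls. A sketch (the exact set of top-level directories varies between flavors, but /tmp and /usr exist on virtually every system):

```shell
# List the top-level directories branching from the root
ls /
# A long listing also shows the permissions and owners of each branch
ls -l /
# Verify that a couple of the standard branches exist
test -d /tmp && echo "/tmp present"
test -d /usr && echo "/usr present"
```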
2.5 Process

UNIX is a multi-user, multi-tasking operating system, which means more than one user can use the system and each user can run more than one program at the same time. In UNIX, each task that the system is executing is a process. Normally, UNIX has to handle several processes at the same time, but the CPU can handle only one job at a time, so UNIX uses the time-sharing concept to solve the problem. Time-sharing means the kernel maintains a list of current tasks, or processes, and allocates a small time slice to each. The kernel switches to the next process in the queue once the current process has completed or its allocated time has elapsed. The kernel also assigns a priority number to each process; based on the priority, the processes are scheduled to run on the CPU. Typically, the kernel switches from process to process very rapidly, which gives the user the impression that all the processes are running simultaneously. If the system is executing many processes that do not fit simultaneously into main memory, the kernel swaps the blocked and sleeping processes to disk to make room for the running processes. Every process is assigned a unique identification number by the kernel, called the process id (pid), which the kernel uses to identify each process. Information about the processes (their states, pids, memory areas, the terminal from which each program was executed, the owner of each process and other system information) is maintained in the process table by the kernel. After the login process, the kernel invokes a shell for the user. The shell is the parent process that runs until the user logs out of the system. All the commands, jobs, or processes that run under the shell, or get invoked from it, are child processes of the parent shell. Each process is killed after it completes, and all child processes should be killed before the parent dies. Another shell can also be invoked from an existing shell; the child shell should be killed before exiting from the parent shell.
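The parent-child relationship between shells can be observed with two special shell parameters: $$ holds the pid of the current shell, and $PPID holds the pid of its parent. A minimal sketch:

```shell
# pid of the current (parent) shell
echo "parent shell pid: $$"
# Start a child shell and ask it for its parent's pid;
# it reports the pid of the shell that invoked it
sh -c 'echo "child sees parent pid: $PPID"'
```

The value the child prints matches the $$ of the shell you started it from, which is exactly the parent-child chain described above.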
Manipulation of Processes

Process Status

The ps command displays the contents of the process table. Without any option, it displays a four-column output containing the status of the processes of the current terminal only: the first column is the pid of the process, the second is the terminal number, the third is the CPU time taken by the process at the time of issuing the command, and the last column is the command or program name itself. The ps command with the -e option displays all active processes. The -l option displays, among other details, the priority associated with each process, the state of the process and the CPU resources used by the process. The -l option together with the -e option displays this information for all the processes that are active in the system.
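The options above can be combined as needed; a sketch of typical usage (the exact column layout differs slightly between System V and BSD derivatives):

```shell
# Processes belonging to this terminal session: PID, TTY, TIME, CMD
ps
# Every active process on the system
ps -e
# Long format for all processes: state, priority and resource usage
ps -el
```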
Process Scheduling

There are also several process scheduling commands, namely nice, sleep, at and nohup.
nice
It is also possible for an individual user to run a process with a reduced priority; only the super user can increase the priority of a process. This is achieved with the nice command. Example:

$ nice <command> <enter>
$ nice -15 vi first <enter>

In the second form, the priority is reduced by 15 units. If the number is not specified, the priority is reduced by 10 units.
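On many modern systems the adjustment is given with the -n option rather than the bare historical form; a sketch showing both spellings (the bare `-15` form is not accepted by all implementations):

```shell
# Historical System V form: lower the priority of vi by 15 units
# nice -15 vi first
# Portable POSIX form: the adjustment follows -n
nice -n 15 sleep 1 &
# With no adjustment given, the default reduction is 10 units
nice sleep 1 &
wait
```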
sleep

To suspend the execution of any process started by the user for a short interval, the sleep command is used. The time period for sleep is given in seconds, up to a maximum of 65535 seconds. Example:
$ sleep <seconds> <enter>
at

The at command is used for scheduling one or more commands to be executed at a specified time. It can be used to execute a command or program at a later time, even after the user has logged out from the system. Example:
$ at <enter>
…………..
Ctrl+D
$

$ at 8pm
$ at 2001 sat
$ at 1300 mon week
$ at 3pm sep 30
nohup

The kernel terminates all active processes by sending a hang-up signal at the time of logout. The nohup command is used to continue execution of a process even after the user has logged out. Any command can be specified with nohup. Example:
$ nohup <command> <enter>

Any command specified with nohup will continue running even after the user has logged out.
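A sketch of nohup in practice; the file name job.log is an arbitrary choice here. When output is not redirected explicitly, nohup appends it to a file named nohup.out instead:

```shell
# Run a long job immune to the hangup signal, in the background,
# with its output captured explicitly
nohup sleep 60 > job.log 2>&1 &
echo "started background job $!"
# The job keeps running even after this shell exits
```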
Background Processing

UNIX can schedule the execution of more than one program at the same time. In fact, the system executes only one program at any instant, but it switches between processes so quickly, usually in fractions of a second, that most of the time all the programs seem to be running at the same time. If a particular program is taking a long time to complete and the user wants to begin another task, that program can be scheduled in the background. The main advantage of running programs in the background is that the shell prompt reappears immediately. Initiating a background process is easy with a UNIX system: simply type an ampersand character at the end of the command line invoking the process that is to run in the background. The shell responds by printing the process id of the background process and immediately displays the prompt. The user can start a number of background processes, but depending on the system, the kernel may limit this to 30 or 50 per user. Generally, only processes that do not require input from the user are started in the background; otherwise it is impossible to tell which program is accepting the input, the process in the background or the one in the foreground. Like the foreground processes, the shell connects the standard input, output and error files of the background processes to the terminal and keyboard. Therefore, it is possible to get output from the background processes mixed with the output of the foreground process. This can be avoided by redirecting the standard input, output and error files of the background processes. Example:
$ backprocess& <Enter> 2777
At the time of logout, all the active processes, including the background processes, will be terminated. To ensure that a background process continues, the nohup command is placed before the program name. Example:
$ nohup backprocess& <Enter> 2779
More than one command, separated by semicolons, can also be placed in the background; enclosing the sequence in parentheses before the ampersand runs the whole sequence in the background. Without the parentheses, only the last command in the sequence is placed in the background, and the prompt returns only after the earlier commands in the sequence have executed.
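A runnable sketch of these ideas, with a short command sequence standing in for a long-running job (log.txt is an illustrative name): the parentheses group the semicolon-separated commands so the whole sequence runs in the background, and redirection keeps its output out of the foreground session.

```shell
#!/bin/sh
# Group semicolon-separated commands with parentheses and append &
# so the entire sequence runs in the background; redirect its
# output so it does not mix with foreground output.
(date; echo "sequence finished") > log.txt 2>&1 &
echo "background pid: $!"    # the shell reports the background pid
wait                         # wait for the background job to finish
cat log.txt
rm -f log.txt
```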
Terminating Processes
Occasionally, a need may arise to stop an executing process. This is done by sending a software termination signal to the process with the kill command, which takes the process id (pid) as its argument:

$ kill 2777

The pid is displayed by the shell immediately after it places a process in the background; it can also be obtained through the ps command. Some programs are designed to ignore interrupts. In this case, this form of the kill command will not terminate the process. However, we can request the kill command to send a sure kill signal instead. This signal always terminates any process owned by the user issuing the kill command. The sure kill signal is requested by including a minus nine option (-9) with the kill command:

$ kill -9 2777

Killing the shell process logs the user out of the system. Only the owner of a process or the superuser can kill it.
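The sequence above can be sketched as a runnable script, with sleep standing in for a long-running program; the shell records the pid of the most recent background job in the special parameter $!:

```shell
#!/bin/sh
# Start a long-running process in the background, then terminate it
# by sending the software termination signal with kill.
sleep 60 &
pid=$!                   # pid of the background process
kill $pid                # send the termination signal
wait $pid 2>/dev/null    # collect the terminated process
echo "process $pid terminated"
```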
Chapter 3
3. Shell Basics

3.1 Introduction
You can do many things without having an extensive knowledge of how they actually work. For example, you can drive a car without understanding the physics of the internal combustion engine. A lack of knowledge of electronics doesn't prevent you from enjoying music from a CD player. You can use a UNIX computer without knowing what the shell is and how it works. However, you will get a lot more out of UNIX if you do. Three shells are typically available on a UNIX system - the Bourne, Korn, and C shells. As the shell of a nut provides a protective covering for the kernel inside, a UNIX shell provides a protective outer covering.

When you turn on, or "boot up," a UNIX-based computer, the program unix is loaded into the computer's main memory, where it remains until you shut down the computer. This program, called the kernel, performs many low-level and system-level functions. The kernel is responsible for interpreting and sending basic instructions to the computer's processor, for running and scheduling processes, and for carrying out all input and output. The kernel is the heart of a UNIX system. There is one and only one kernel.

As you might suspect from the critical nature of the kernel's responsibilities, the instructions to the kernel are complex and highly technical. To protect the user from the complexity of the kernel, and to protect the kernel from the shortcomings of the user, a protective shell is built around the kernel. The user makes requests to a shell, which interprets them and passes them on to the kernel. The remainder of this section explains how this outer layer is built.

Once the kernel is loaded into memory, it is ready to carry out user requests. First, though, a user must log in and make a request. For a user to log in, however, the kernel must know who the user is and how to communicate with him. To do this, the kernel invokes two special programs, getty and login.
For every user port - usually referred to as a tty - the kernel invokes the getty program; this is called spawning. The getty program displays a login prompt and continuously monitors the communication port for any input, which it assumes is a user name. When getty receives input, it calls the login program. The login program establishes the identity of the user and validates his right to log in by checking the password file. If the user fails to enter a valid password, the port is returned to the control of a getty. If the user enters a valid password, login passes control by invoking the program named in the user's entry in the password file. This program might be a word processor or a spreadsheet, but it usually is a more generic program called a shell.

Suppose four users have logged in to the system. Of the four active users, two are using the Bourne shell, one is using the Korn shell, and one has logged into a spreadsheet. Each user has received a copy of the shell to service his requests, but there is only one kernel. Using a shell does not prevent a user from using a spreadsheet or another program, but those programs run under the active shell. A shell is a program dedicated to a single user, and it provides an interface between the user and the UNIX kernel.
You don't have to use a shell to access UNIX. In the previous example, one of the users has been given a spreadsheet instead of a shell. When this user logs in, the spreadsheet program starts; when he exits the spreadsheet, he is logged out. This technique is useful in situations where security is a major concern, or when it is desirable to shield the user from any interface with UNIX. The drawback is that the user cannot use mail or the other UNIX utilities.

Because any program can be executed from the login - and a shell is simply a program - it is possible for you to write your own shell. In fact, three shells, developed independently, have become a standard part of UNIX:

The Bourne shell, developed by Stephen Bourne
The Korn shell, developed by David Korn
The C shell, developed by Bill Joy

This variety of shells enables you to select the interface that best suits your needs or the one with which you are most familiar.
Functions
It doesn't matter which of the standard shells you choose, for all three have the same purpose - to provide a user interface to UNIX. To provide this interface, all three offer the same basic functions:

• Command line interpretation
• Program initiation
• Input-output redirection
• Pipeline connection
• Substitution of filenames
• Maintenance of variables
• Environment control
• Shell programming
Command Line Interpretation When you log in, starting a special version of a shell called an interactive shell, you see a shell prompt, usually in the form of a dollar sign ($), a percent sign (%), or a pound sign (#). When you type a line of input at a shell prompt, the shell tries to interpret it. Input to a shell prompt is sometimes called a command line. The basic format of a command line is
command arguments

command is an executable UNIX command, program, utility, or shell program. The arguments are passed to the executable. Most UNIX utility programs expect arguments to take the following form:

options filenames

For example, in the command line

$ ls -l file1 file2

there are three arguments to ls, the first of which is an option, while the last two are file names. One of the things the shell does for the kernel is to eliminate unnecessary information. For a computer, one type of unnecessary information is whitespace. Therefore, it is important to know what the shell does when it sees whitespace. Whitespace consists of the space character, the horizontal tab, and the newline character. Consider this example:

$ echo part A        part B        part C
part A part B part C

Here, the shell has interpreted the command line as the echo command with six arguments and has removed the whitespace between the arguments. If, for example, you were printing headings for a report and you wanted to keep the whitespace, you would have to enclose the data in quotation marks, as in

$ echo 'part A        part B        part C'
part A        part B        part C
The single quotation mark prevents the shell from looking inside the quotes. Now the shell interprets this line as the echo command with a single argument, which happens to be a string of characters including whitespace.

Program Initiation
When the shell finishes interpreting a command line, it initiates the execution of the requested program; the kernel actually executes it. To initiate program execution, the shell searches for the executable file in the directories specified in the PATH environment variable. When it finds the executable file, a subshell is started for the program to run. The subshell can establish and manipulate its own environment without affecting the environment of its parent shell. For example, a subshell can change its working directory, but the working directory of the parent shell remains unchanged when the subshell is finished.

Input-output Redirection
Redirecting the input and output of programs is the responsibility of the shell, and the shell performs the redirection before it executes the program. Consider these two examples, which use the wc word count utility on a data file with 5 lines:

$ wc -l fivelines
5 fivelines
$ wc -l < fivelines
5

In the first case, wc receives the file name as an argument, so it reports the name along with the count. In the second, the shell opens fivelines as wc's standard input; wc sees only a stream of data and prints the count alone.
$ LOOKUP=/usr/mydir

Here, the shell establishes LOOKUP as a variable and assigns it the value /usr/mydir. Later, you can use the value stored in LOOKUP in a command line by prefacing the variable name with a dollar sign ($). Consider these examples:

$ echo $LOOKUP
/usr/mydir
$ echo LOOKUP
LOOKUP

Like filename substitution, variable name substitution happens before the program call is made. The second example omits the dollar sign ($), so the shell simply passes the string to echo as an argument. In variable name substitution, the value of the variable replaces the variable name. For example, in

$ ls $LOOKUP/filename

the ls program is called with the single argument /usr/mydir/filename.

Environment Control
When the login program invokes your shell, it sets up your environment, which includes your home directory, the type of terminal you are using, and the path that will be searched for executable files. The environment is stored in variables called environment variables. To change the environment, you simply change a value stored in an environment variable. For example, to change the terminal type, you change the value in the TERM variable, as in

$ echo $TERM
vt100
$ TERM=ansi
$ echo $TERM
ansi
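The substitution rules above can be collected into one runnable sketch (the values are simply the example values from the text; no /usr/mydir directory need exist):

```shell
#!/bin/sh
# Variable substitution and environment control in one sketch:
# the value replaces $NAME before the command runs, and changing
# a variable such as TERM changes the stored environment value.
LOOKUP=/usr/mydir
echo $LOOKUP       # the value is substituted: /usr/mydir
echo LOOKUP        # no dollar sign, so the literal word is passed
TERM=ansi          # change the terminal type variable
echo $TERM         # now prints ansi
```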
3.2 Flavors Most contemporary versions of UNIX provide all three shells—the Bourne shell, C shell, and Korn shell — as standard equipment. Choosing the right shell to use is an important decision because you will spend considerable time and effort learning to use a shell, and more time actually using it. The right choice will allow you to benefit from the many powerful features of UNIX with a minimum of effort. Of course, no one shell is best for all purposes. If you have a choice of shells, then you need to learn how to choose the right shell for the job. The shell has three main uses: As a keyboard interface to the operating system As a vehicle for writing scripts for your own personal use As a programming language to develop new commands for others Each of these three uses places different demands on you and on the shell you choose. Furthermore, each of the shells provides a different level of support for each use. The first point to keep in mind when choosing a shell for interactive use is that your decision affects no one but yourself. This gives you a great deal of freedom: you can choose any of the three shells without consideration for the needs and wishes of others. Only your own needs and preferences will matter. The principal factors that will affect your choice of an interactive shell are as follows: Learning. It is a lamentable fact of life that as the power and flexibility of a tool increases, it becomes progressively more difficult to learn how to use it. The much-maligned VCR, with its proliferation of convenience features, often sits with its clock unset as silent testimony. So too it is with UNIX shells. There is a progression of complexity from the Bourne shell, to the C shell, to the Korn shell, with each adding features, shortcuts, bells and whistles to the previous. The cost of becoming a master is extra time spent learning and practicing. You'll have to judge whether you'll really use those extra features enough to justify the learning time. 
Keep in mind though that all three shells are relatively easy to learn at a basic level. Command editing. The C shell and the Korn shell offer features to assist with redisplaying and reusing previous commands; the Bourne shell does not. The extra time savings you can realize from the C shell or the Korn shell command editing features depends greatly on how much you use the shell. Generations of UNIX users lived and worked before the C and Korn shells were invented, demonstrating that the Bourne shell is eminently usable, just not as convenient for the experienced, well-practiced C shell or Korn shell user. Wildcards and shortcuts. Once again, your personal productivity (and general peace of mind) will be enhanced by a shell that provides you with fast ways to do common things. Wildcards and command aliases can save you a great deal of typing if you enter many UNIX commands in the course of a day. Portability. If you will sit in front of the same terminal every day, use the same UNIX software and applications for all your work, and rarely if ever have to deal with an unfamiliar system, then, by all means choose the best tools that your system has available. If you need to work with many different computers running different versions of UNIX, as system and network administrators often must, you may need to build a repertoire of tools (shell, editor, and so on) that are available
on most or all of the systems you'll use. Don't forget that being expert with a powerful shell won't buy you much if that shell isn't available. For some UNIX professionals, knowing a shell language that's supported on all UNIX systems is more important than any other consideration.

Prior experience. Prior experience can be either a plus or a minus when choosing a shell. For example, familiarity with the Bourne shell is an advantage when working with the Korn shell, which is very similar to the Bourne shell, but somewhat of a disadvantage when working with the C shell, which is very different. Don't let prior experience dissuade you from exploring the benefits of an unfamiliar shell.

The table rates the three shells using the preceding criteria, assigning a rating of 1 for best choice, 2 for acceptable alternative, and 3 for poor choice.

Shell     Learning    Editing    Shortcuts    Portability    Experience
Bourne    1           3          3            1              3
C         2           2          1            3              2
Korn      3           1          2            2              1
Bourne Shell
The Bourne shell is your best choice for learning because it is the simplest of the three to use, with the fewest features to distract you and the fewest syntax nuances to confuse you. If you won't be spending a lot of time using a command shell with UNIX, then by all means develop some proficiency with the Bourne shell. You'll be able to do all you need to, and the productivity benefits of the other shells aren't important for a casual user. Even if you expect to use a UNIX command shell frequently, you might need to limit your study to the Bourne shell if you need to become effective quickly.

The Bourne shell is lowest in the productivity categories because it has no command editor and only minimal shortcut facilities. If you have the time and expertise to invest in developing your own shell scripts, you can compensate for many of the Bourne shell deficiencies, as many shell power users did in the years before the C shell and the Korn shell were invented. Even so, the lack of command editing and command history facilities means you'll spend a lot of time retyping and repairing commands. For intensive keyboard use, the Bourne shell is the worst of the three. If you have any other shell, you'll prefer it over the Bourne shell. The C shell and the Korn shell were invented precisely because of the Bourne shell's low productivity rating. They were both targeted specifically at creating a keyboard environment friendlier and easier to use than the Bourne shell, and they are here today only because most people agree that they're better.

However, portability concerns might steer you toward the Bourne shell despite its poor productivity rating. Being the oldest of the three shells (it was written for the very earliest versions of UNIX), the Bourne shell is available virtually everywhere. If you can get your job done using the Bourne shell, you can do it at the terminal of virtually any machine anywhere.
This is not the case for the C and Korn shells, which are available only with particular vendors' systems or with current UNIX releases. The Bourne shell has a rating of 3 for prior experience because prior experience using the Bourne shell is no reason to continue using it. You can use the Korn shell immediately with no additional study and no surprises, and you can gradually enhance your keyboard skills as you pick up the Korn shell extensions. If you have access to the Korn shell, you have no reason not to use it.
C Shell
The C shell rates a 2 for learning difficulty, based simply on the total amount of material available to learn. The C shell falls between the Bourne shell and the Korn shell in the number and complexity of its facilities. Make no mistake - the C shell can be tricky to use, and some of its features are rather poorly documented. Becoming comfortable and proficient with the C shell takes time, practice, and a certain amount of inventive experimentation. Of course, when compared to the Bourne shell only on the basis of common features, the C shell is no more complex, just different.

The C shell rates a passing nod for command editing because it doesn't really have a command editing feature. Its history substitution mechanism is complicated to learn and clumsy to use, but it is better than nothing at all. Just having a command history and history substitution mechanism is an improvement over the Bourne shell, but the C shell is a poor second in comparison to the simple and easy command editing of the Korn shell. With the Korn shell, you can reuse a previously entered command, even modify it, just by recalling it (Esc-k if you're using the vi option) and overtyping the part you want to modify. With the C shell, you can also reuse a previous command, but you have five different forms for specifying the command name (!!, !11, !-5, !vi, or !?vi?), additional forms for selecting the command's arguments (:0, :^, :3-5, :-4, :*, to name a few), and additional modifiers for changing the selected argument (:h, :s/old/new/, and so forth). Even remembering the syntax of history substitution is difficult, not to speak of using it.

On the other hand, if you like to use wildcards, you'll find that the C shell wildcard extensions for filenames are easier to use - they require less typing and have a simpler syntax - than the Korn shell wildcard extensions. Also, its cd command is a little more flexible.
The pushd, popd, and dirs commands are not directly supported by the Korn shell (although they can be implemented in the Korn shell by the use of aliases and command functions). Altogether, the C shell rates at the top of the heap in terms of keyboard shortcuts available, perhaps in compensation for its only moderately successful command editing. Depending on your personal mental bent, you might find the C shell the most productive of all three shells to use. We have seen that those already familiar with the C shell have not been drawn away in droves by the Korn shell in the past. For portability considerations, the C shell ranks at the bottom, simply because it's a unique shell language. If you know only the C shell, and the particular system you're using doesn't have it, you're out of luck. A C shell user will almost always feel all thumbs when forced to work with the Bourne shell, unless she is bilingual and knows the vagaries and peculiarities of both. The C shell gets a 2 for prior experience. If you already know it and want to continue using it, there is no compelling reason why you shouldn't. On the other hand, you may be missing a good bet if you decide to ignore the Korn shell. Unless you feel quite comfortable with the C shell's
history substitution feature and use it extensively to repair and reuse commands, you might find the Korn shell's command editing capability well worth the time and effort to make the switch. Anyone accustomed to using the Korn shell's command editing capability feels unfairly treated when deprived of it—it's that good. If you haven't already experimented with the Korn shell and you have the chance, I would strongly recommend spending a modest amount of time gaining enough familiarity with it to make an informed choice. You might be surprised. Altogether, the C shell is a creditable interactive environment with many advantages over its predecessor, the Bourne shell, and it is not clear that the Korn shell is a compelling improvement. Personal preference has to play a role in your choice here. However, if you're new to UNIX, the C shell is probably not the best place for you to start.
Korn Shell
In terms of time and effort required to master it, the Korn shell is probably the least attractive. That's not because it's poorly designed or poorly documented, but merely because it has more complex features than either of the other two shells. Of course, you don't have to learn everything before you can begin using it. The Korn shell can be much like good music and good art, always providing something new for you to learn and appreciate.

For productivity features, the Korn shell is arguably the best of the three shells. Its command editor interface enables quick, effortless correction of typing errors, plus easy recall and reuse of command history. It's hard to imagine how the command line interface of the Korn shell could be improved without abandoning the command line altogether. On the down side, the Korn shell provides equivalents for the C shell's wildcard extensions, but with a complicated syntax that makes the extensions hard to remember and hard to use. You can have the pushd, popd directory interface, but only if you or someone you know supplies the command aliases and functions to implement them. The ability to use a variable name as an argument to cd would have been nice, but you don't get it. The Korn shell's command aliasing and job control facilities are nearly identical to those of the C shell. From the point of view of keyboard use, the Korn shell stands out over the C shell only because of its command editing feature. In other respects, its main advantage is that it provides the C shell extensions in a shell environment compatible with the Bourne shell; if Bourne shell compatibility doesn't matter to you, then the Korn shell might not either.

Speaking of Bourne shell compatibility, the Korn shell rates a close second to the Bourne shell for portability. If you know the Korn shell language, you already know the Bourne shell, because ksh is really a superset of sh syntax.
If you're familiar with the Korn shell, you can work reasonably effectively with any system having either the Bourne or Korn shells, which amounts to virtually one hundred percent of the existing UNIX computing environments. Finally, in terms of the impact of prior experience, the Korn shell gets an ambiguous rating of 2. If you know the Bourne shell, you'll probably want to beef up your knowledge by adding the extensions of the Korn shell and switching your login shell to ksh. If you already know ksh, you'll probably stick with it. If you know csh, the advantages of ksh may not be enough to compel you to switch. If you're a first-time UNIX user, the Korn shell is the best shell for you to start with. The complexities of the command editing feature will probably not slow you down much; you'll use the feature so heavily its syntax will become second nature to you before very long.
If you develop any shell scripts, you'll probably want to write them in the same shell language you use for interactive commands. As for interactive use, the language you use for personal scripts is largely a matter of personal choice. If you use either the C shell or the Korn shell at the keyboard, you might want to consider using the Bourne shell language for shell scripts, for a couple of reasons. First, personal shell scripts don't always stay personal; they have a way of evolving over time and gradually floating from one user to another until the good ones become de facto installation standards. Writing shell scripts in any language but the Bourne shell is somewhat risky because you limit the machine environments and users who can use your script. Of course, for the truly trivial scripts, containing just a few commands that you use principally as an extended command abbreviation, portability concerns are not an issue.

If you're not an experienced UNIX user and shell programmer, you probably know only one of the three shell languages. Writing short, simple shell scripts to automate common tasks is a good habit and a good UNIX skill. To get the full benefit of the UNIX shells, you almost have to develop some script writing capability. This will happen most naturally if you write personal scripts in the same language that you use at the keyboard.

For purposes of comparison, the table below describes the shell features that are available in only one or two of the three shells.

Feature                                              sh     csh    ksh
Arithmetic expressions                               -      X      X
Array variables                                      -      X      X
Assignment id=string                                 X      -      X
case statement                                       X      -      X
cdpath searches                                      SysV   X      X
clobber option                                       -      X      X
Command aliases                                      -      X      X
echo -n option                                       -      X      -
export command                                       X      -      X
foreach statement                                    -      X      -
getopts built-in command                             -      -      X
glob command                                         -      X      -
Hash table problems, rehash and unhash commands      -      X      -
Job control (bg, fg, ...)                            -      X      X
let command                                          -      -      X
limit, unlimit commands                              -      X      -
nice shell built-in                                  -      X      -
nohup shell built-in                                 -      X      -
notify shell built-in                                -      X      -
onintr command                                       -      X      -
print command                                        -      -      X
pushd, popd commands                                 -      X      -
RANDOM shell variable                                -      -      X
repeat shell built-in                                -      X      -
select statement                                     -      -      X
setenv, unsetenv commands                            -      X      -
SHELL variable specifies command to execute scripts  -      X      -
switch statement                                     -      X      -
until statement                                      X      -      X
set -x                                               X      -      X
set optionname                                       -      X      -
Set-uid scripts                                      -      -      X
Shell functions                                      SysV   -      X
Substring selectors :x                               -      X      -
trap command                                         X      -      X
typeset command                                      -      -      X
ulimit command                                       X      -      X
Undefined variable is an error                       -      X      -
! special character                                  -      X      -
@ command                                            -      X      -
*(...) wildcards                                     -      -      X
$(...) command expression                            -      -      X
{...} wildcards                                      -      X      -
|& coprocessing                                      -      -      X
>& redirection                                       -      X      -
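As a small illustration of the portability point, here is a hypothetical personal script written in Bourne-compatible syntax: it uses backquote command substitution rather than the $(...) form, which the table marks as a ksh extension. The file name colors.txt and the sample data are purely illustrative.

```shell
#!/bin/sh
# Count the lines in a file using only Bourne-compatible
# constructs: backquotes and simple variable assignment.
printf 'red\ngreen\nblue\n' > colors.txt     # sample data file
lines=`wc -l < colors.txt`
echo "colors.txt has $lines lines"
rm -f colors.txt
```

Because it avoids ksh and csh extensions, a script like this runs unchanged under any of the three shells' systems.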
3.3 Interaction with Shell
Before you can interact with the shell, you must have a session with your UNIX host, which typically involves connecting to the system and logging in. A login session is established by entering your user name and password. The system then proceeds through a login procedure and starts the shell that the system administrator configured for you. An interactive login session can be established to the UNIX server through a wide variety of methods and from an equally wide variety of clients: connections through a modem, direct connection from a terminal, and network connections over Token Ring or Ethernet. The exact method used depends upon the type of communications interface used to make the connection. Once you are connected, you see a login prompt. In response to that prompt, enter your user name; typically you are then prompted for your password. The combination of your login name and password identifies you to the system and restricts access to the server to authorized people only.
Issuing Commands
Issuing commands to the shell is as simple as typing a command and pressing <Enter>. Each command is terminated by the Enter key, which signals that the shell should process the instructions the user has typed and execute them. Although Enter is most commonly used, the semicolon can also be used, as a command separator. The following example uses the semicolon:

who; date; ls

This executes the who command, followed by the date command, and then the ls command. Each command is executed in the order it appears on the command line and is equivalent to typing the commands on separate lines. The table lists what each of the special characters is used for.
Character    Use
Space        Separates commands, options, and arguments
Tab          Separates commands, options, and arguments
New line     Terminates a command
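A runnable sketch of semicolon separation, with three echo commands standing in for who, date, and ls:

```shell
#!/bin/sh
# Semicolons separate commands on a single line; each command
# runs in turn, exactly as if typed on its own line.
echo first; echo second; echo third
```

The three lines of output appear in order, just as they would if each echo had been entered at its own prompt.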
Prompts
The shell informs the user that it is ready to accept input through a prompt. The prompt of the Bourne shell is traditionally a dollar sign ($); for the C shell, a percent sign (%); and for the Korn shell, a number followed by a dollar sign (4$). In fact, each shell has several different prompts that inform you of different shell conditions, as illustrated in the table:

Prompt Name    Bourne shell    C shell     Korn shell
PS1            $               not used    !$ or $
PS2            >               not used    >
PS3            not used        not used    #?
prompt         not used        %           not used

The names PS1, PS2, PS3, and prompt are the names of the shell variables that the respective shells use to identify each prompt; by assigning to these variables, the shell prompt can be changed. The PS1 prompt used in the Bourne and Korn shells is the standard prompt. The standard prompt for the C shell is quite different; its name is prompt, and it is a percent sign. The PS2 prompt is used in the Bourne and Korn shells to indicate that the shell is expecting more input. The PS3 prompt in the Korn shell asks the user to enter a choice based upon information provided by another Korn shell command.

Example of the PS1 and PS2 prompts:

$ date'
>

In this example, the user typed an apostrophe before pressing Enter. This caused the shell to print the second prompt (>) because it was looking for the closing apostrophe to complete the command. It is also useful to know the additional control keys that the shell uses.
Special Key    Name           Explanation
Ctrl+D         End of File    Logs out or ends an input operation
Ctrl+H         Backspace      Erases the previous key typed (backspace)
Ctrl+\         Quit           Interrupts the current command
Ctrl+C         Interrupt      Interrupts the current command (or DEL)
Ctrl+S         XOFF           Pauses output on the display
Ctrl+Q         XON            Restarts output on the display
You can find out which special keys are used on your system with the stty command, which queries and configures the settings of your connection session. The stty output looks like the following:

$ stty -a
line = NTTYDISC; speed 38400 baud
erase = ^h; kill = ^u; min = 6; time = 1;
intr = ^c; quit = ^\; eof = ^d;
The -a option instructs stty to print the information regarding your connection. The entries to look for are erase, intr, and quit. From this output, erase is Ctrl+H, intr is Ctrl+C, and quit is Ctrl+\.
Handling Mistakes
The shell also informs you when you make mistakes. If the command you enter is not available or doesn't exist, the shell informs you with a message like the following one:

$ daet
daet: not found

This message tells you that the shell can't find the command you asked it to execute. When you make a mistake typing a command, press the Backspace key to move the cursor back over your mistake so that you can correct it. It is important to remember that the shell does not know how to interpret the control codes sent by the arrow keys on your keyboard.
3.4 Features
The shell has a number of features that can be used on a regular basis without any need for the shell programming language. Depending upon which shell you are using, it might provide wild cards, background job execution, job control, and more.
Wild Cards

Wild cards can be used to perform filename substitution. The wild cards used in filename substitution are the asterisk (*), the question mark (?), and the character class ([..]). Wild cards can be used anywhere in a filename and can be combined to produce complex patterns, meaning that the match is restricted to names that meet the entire pattern. The asterisk matches any character zero or more times. For example, if you enter the command ls -l *, the shell substitutes all of the file names for the asterisk:

$ ls -l *
-rw-------   1 chare   users   390 Aug 31 20:15 aba
-rw-------   1 chare   users   390 Aug 31 20:14 abc
-rw-------   1 chare   users   108 Aug 31 20:15 bab
-rw-------   1 chare   users   418 Aug 31 20:15 debbie
$
But the asterisk can be used in many other ways. What if the command ls -l a* were used on these files? It would list the files aba and abc only. From this example, you can see that you could also look for file names using the asterisk first, then some text, or by enclosing the asterisk between some text. Following are illustrations of these two cases:

$ ls -l *b
-rw-------   1 chare   users   108 Aug 31 20:15 bab
$ ls -l d*e
-rw-------   1 chare   users   418 Aug 31 20:15 debbie
$
The following table provides some further examples and explanations for the different types of wild cards.
Asterisk Wild Card Examples

a*        Matches all files starting with a, followed by zero or more characters
*.doc     Matches any file name that ends in .doc
text.*    Matches any file name that starts with text.
t*.doc    Matches any file name that starts with t and ends with .doc

Question Mark Wild Card Examples

a?        Matches any file name with two letters, the first one being an a
?.doc     Matches any file name that has one letter followed by .doc
???       Matches any three-character file name

Combination Wild Cards

a??.*     Matches any file name that starts with a, is followed by two letters, a period, and any other characters
*.??      Matches any text followed by a period and two more letters
The question mark matches exactly one character. Just as the * matches zero or more characters, there must be exactly one character where the question mark is used. The command ls -l ? lists only those files that have one-character names; the command ls -l ?? lists only those files that have two characters in their names. The asterisk and the question mark characters can be combined to match very specific patterns, as shown under the combination wild cards. The character class matches one character from a listed group of characters. The character class is used by listing the specific characters between left and right brackets ([ ]), and is typically combined with the ? or * wild cards. The table given below illustrates the uses of the character class, along with more wild cards.
Characters      Description

[abc]??         Matches any three-letter file name that starts with the letter a, b, or c
[Abc]*          Matches any file name starting with the letter A, b, or c
[abc][xyz]*     Matches any file name that starts with a, b, or c, has a second letter of x, y, or z, and is followed by other letters
[abcd]          Matches any single-character file name that is a, b, c, or d
[abc][ghi]      Matches any two-character file name whose first letter is a, b, or c and whose second letter is g, h, or i

More Wild Card Examples

ls *[!o]        Lists any file that doesn't end with an o
ls *[\!o]       Lists any file that doesn't end with an o (the ! is escaped)
cat chap.[0-7]  Displays any file named chap.0 through chap.7
ls [aft]*       Lists any file starting with the letter a, f, or t
ls t??          Lists any three-character file name starting with t
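The patterns in these tables are easy to experiment with in a scratch directory. The sketch below uses invented file names chosen to match some of the examples above:

```shell
# Create a scratch directory and a few empty files, then let the
# shell expand each pattern; echo prints the substituted names.
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch aba abc bab chap.0 chap.7 text.doc

echo a*           # names starting with a
echo *.doc        # names ending in .doc
echo ???          # any three-character name
echo chap.[0-7]   # chap. followed by a digit from 0 to 7
```

Because the shell performs the substitution before the command runs, echo simply prints whatever names matched.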
Quotes

The shell also understands the different quotes that are available, each serving a different purpose. The quote characters are as follows:

'text'    protects the contents
"text"    expands the contents
`text`    executes the contents as a command
The single quotes instruct the shell to protect, and not interpret, the text between them. These quotes are used in shell programs, which are called shell scripts.
The double quotes are used to group words together to form an argument to a command, or to form a sentence. These quotes are generally used in shell scripts, but can also be used by the ordinary user. Consider the following example:

$ tp this is a test
Number of arguments = 4
Argument 1 = this
Argument 2 = is
Argument 3 = a
Argument 4 = test

No quotes surround the text provided to the tp command, so tp receives four arguments. Consider the next example:

$ tp "this is a test"
Number of arguments = 1
Argument 1 = this is a test

In this example, the argument, this is a test, is enclosed in double quotes. This instructs the shell to group the words together and treat them as one argument. In this case, the double quotes and single quotes are equivalent. They differ in how they allow the shell to expand a shell variable. The shell has variables that can be used in shell programs and are used to control the execution of other
programs and the user's environment. Such a variable is called an environment variable. Consider the following example, in which a variable named TERM is defined. To see the contents of a shell variable, reference the variable using its name preceded by a dollar sign.

$ tp "$TERM"
Number of arguments = 1
Argument 1 = ansi

In this example, the shell expands the variable $TERM and gives the value of the variable to the program tp. The next example illustrates how the single quote is different from the double:

$ tp '$TERM'
Number of arguments = 1
Argument 1 = $TERM

In the preceding example, the shell does not treat the $TERM inside the single quotes as a variable, so it passes the literal text $TERM instead. The back-tick quotes are used to put the output of a command into a shell variable. This is useful because information that might be needed again in a shell program can be easily accessed. Even at the command line this can be useful. For example, if you have a long directory path, you can store your current directory in a variable and change to that directory again without typing the path. To save your current working directory in a variable called PWD, type the following command:

$ PWD=`pwd`

This causes the shell to execute the command pwd and put the output of the command into the variable PWD. The contents of the shell variable PWD are then printed with the echo command:

$ echo $PWD
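The back-tick mechanism can be exercised in a short, self-contained sketch (the directory names here are incidental):

```shell
# Capture the output of pwd with back quotes, change directory,
# then use the saved value to return to where we started.
SAVED=`pwd`
cd /tmp
echo "now in /tmp"
cd "$SAVED"
echo "back where we started"
```

Because the assignment happens once, the saved path remains valid no matter how far you wander afterwards.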
Variables and the Environment

Shell variables are a storage place for information, just like variables in any programming language. Variable names are case sensitive. They can be any length, can be upper- or lowercase, and can contain numbers and some special characters. Variable names cannot start with a number, nor can they contain the characters special to the shell: the asterisk, dollar sign, question mark, and brackets. The following are all valid variable names:

TERM
PATH
path
Visual
_id
testDir

Access to the value stored in a variable is gained by prefixing a dollar sign to the name of the variable.
Examining Existing Variables

To see the environment variables configured on your system, use the env or printenv command:

$ env
PATH=:/usr/ucb:/bin:/usr/bin
LOGNAME=chare
SHELL=/usr/bin/ksh
HOME=/home/chare
TERM=ansi

The user environment on this system lists five shell variables. Environment variables are typically defined in the system setup files, which are processed when a user logs in, or in the user's own setup files. The PATH variable in the preceding example indicates that the shell looks in the current directory, then /usr/ucb, then /bin, and finally /usr/bin. If the command you entered doesn't exist in any of these directories, the shell reports that it cannot find the command. When a user logs in to a UNIX system, the system records the user's login name and saves it in the LOGNAME variable. This variable is used by many programs to determine who is running the command. The SHELL variable defines the name of the shell that the user is currently using as his login shell and is defined when the user logs in to the system. The HOME variable defines the user's home directory. This variable is often used by UNIX commands to find out where information for this user should be written. The last variable is the TERM variable. Many UNIX commands depend upon knowing the terminal type the user is using. The terminal type is determined when the user logs in, either through a system default or through customization of the user's startup files. Some systems prompt the user to enter the terminal type they are using.
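Setting one of these variables by hand in the Bourne shell is a two-step affair: the assignment itself, then an export so that child processes inherit it. A minimal sketch (vt100 is just an example value):

```shell
# Assign a value -- note there are no spaces around the = sign --
# then export it so programs started from this shell see it.
TERM=vt100
export TERM
echo "terminal type is $TERM"
sh -c 'echo "a child shell also sees $TERM"'
```

Without the export, the variable would exist only in the current shell and the child sh would print an empty value.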
C Shell Variables

The C shell was developed at the University of California at Berkeley. Compared to the Bourne shell, the C shell has a different command structure for its programming language. Variables are created by using the command set, followed by the variable name and the value, as in the following example:

set variable = value
set a = seashell

C shell variables are accessed by preceding the variable name with the dollar sign. The C shell supports a different mechanism for assigning environment variables: the command setenv, followed by the name of the variable and the value to be assigned. This places the variable in the environment.

% setenv TERM vt220
3.5 Program Execution

The shell uses the command line to collect from you the name of the command and the options and arguments you want to pass to the command. The shell goes through several steps to execute a program, as illustrated in the figure below.
(Flowchart: the user types a command, such as date; the shell checks the system for the command; if it is not found, an error message is displayed; if it is found, the command is loaded and run.)
Figure 5 - Program Execution

As illustrated in this figure, the user types the command date. The shell locates the command by looking in each directory listed in the PATH variable. If the command is not found, the shell prints an error message. If the command is located, the shell determines whether the user can execute the command and, if so, loads the command into memory and requests the kernel to execute it. The end result is the output from the command, which, in this example, is the date. The shell is itself a command, which means that you can execute a shell whenever needed. A shell executed through the sh command is called a subshell. A subshell inherits its parent's environment.
Program Grouping

Programs can also be grouped together to control how they are processed. Grouping is accomplished using the (..) and {..} constructs. The (..) construct starts a new subshell and executes the commands enclosed between the parentheses in it. This can be useful for grouping a series of commands.
$ (date; who; ls) | wc
      20      29     163
$

In this example, the date, who, and ls commands are executed in a subshell, and the output of those commands is then processed by the command wc, which counts the lines, words, and characters in the output stream. The advantage is that the output from all three commands is merged together and sent to the wc command as a single stream. The {..} construct instructs the shell to run the commands enclosed between the braces in your current shell. The following example illustrates the differences between these constructs:

$ var=1024                 set the value of var
$ echo $var                print it
1024
$ (var=2048)               in a subshell, set var to 2048
$ echo $var                print it
1024
$ (var=8192; echo $var)    in a subshell, set var to 8192 and print it
8192
$ echo $var                print the current value of var
1024
$ { var=2048; }            in this shell, change var to 2048
$ echo $var                print it
2048
$

The semicolon in the { var=2048; } example is required because the {..} construct is considered a command, and when two commands are grouped on the same command line, they must be separated by a semicolon.
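A common practical use of the (..) construct, beyond merging output, is running a command in another directory without disturbing your current shell; a small sketch:

```shell
# The cd takes effect only inside the subshell, so the parent
# shell's working directory is unchanged afterwards.
(cd /tmp; pwd)    # prints the subshell's directory
pwd               # still the original directory
```

Had braces been used instead, the cd would have changed the current shell's directory.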
3.6 Background Jobs

Typically, commands that you execute run in the foreground, meaning that the program you are executing has control of your workstation. When you run an interactive command such as vi, you interact with the command: this is foreground execution. Background execution enables you to start a non-interactive command and send it to the background. This allows the command to continue executing while freeing your login session. To execute a non-interactive command in the background, type the command with its options and arguments on the command line, followed by an ampersand (&):

$ long-running-command &
PID
$

In the above example, long-running-command executes until it completes, regardless of how long it takes. The PID that is returned is the UNIX process ID number assigned to this command. Background execution is not suitable for commands that require keyboard interaction.
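A minimal sketch of background execution, using sleep to stand in for a long-running command:

```shell
# The ampersand sends sleep to the background; the shell variable $!
# holds the process ID of the most recent background job.
sleep 2 &
echo "background PID is $!"
wait              # block until the background job completes
echo "job finished"
```

The wait command is optional; without it, the shell simply returns to the prompt while the job runs.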
3.7 Batch Jobs

With batch, you have little control if you want to automate a job or have it executed on a regular basis. For that there is another command in this series, called cron. cron is designed to execute commands at specific times, based upon a schedule. This schedule is known as a crontab file, after the command that is used to submit it. The cron command is not actually executed by a user; it is started when the system is booted and remains running until the system is shut down.
crontab File

The crontab file is used to provide the job specifications that cron uses to execute the commands. A user has one crontab file, which can contain as many jobs as required. Each line in the crontab file has the following layout:

minutes   hours   day of month   month   day of week   command

There can be no blank lines, but comments are allowed, using the shell comment character, #. The first five fields are integer fields, and the sixth is the command field, which contains the command information to be executed. Each of the five integer fields accepts information in several formats and has a range of legal values. The legal values are listed in the following table:

minutes        0-59
hours          0-23
day of month   1-31
month          1-12
day of week    0-6 (0 = Sunday)
Each of these five integer fields has a series of formats that are allowable for their values. These formats include the following:
A number in the respective range; for example, a single digit.
A range of numbers separated by a hyphen, indicating an inclusive range; for example, 1-10.
A list of numbers separated by commas, meaning all of these values; for example, 1,5,10,30.
A combination of the previous two types; for example, 1-10,20-30.
An asterisk, meaning all legal values.
Some sample commands from a crontab file:

0 * * * 0-6 echo "\007" >> /dev/console; date >> /dev/console; echo >> /dev/console
0,15,30,45 * 1-31 * 1-5 /usr/local/collect.sys > /usr/spool/status/unilabs
0,10,20,30,40,50 * * * * /usr/local/runq -v9 2>/dev/null
5,15,25,35,45,55 * * * * /usr/lib/uucp/uucico -r1 -sstealth 2>/dev/null

These four entries are from a crontab file on a real system. They illustrate the different values that each of the integer fields can contain. The first line means that the command is executed at minute 0 of every hour, because the minutes field is zero and the hours field, second from the left, is an asterisk, meaning all legal values. This is done for every day of the month, for every month of the year: both of those fields, the third and fourth, also contain asterisks. The fifth integer field contains the values 0-6, which indicates that this is done for each day of the week. In the second line, the minutes field indicates that this command is executed every 15 minutes. The asterisk in the second field means that this is done for every hour of the day. The third field, which is the day of the month, indicates that this command is to be run on every day of the month. Looking at the fifth field, this command is restricted to Monday through Friday. So, if the day of the month is in the range 1 to 31, and the day of the week is Monday through Friday, the command is executed. The following example creates a number of problems:

* * * * * any-command
This has the effect of overloading your system very quickly because the command is executed every minute of every hour, of every day, of every month. Depending upon the command, this could bring your system to its knees very quickly. Please be careful to avoid crontab entries like the preceding one.
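By contrast, a well-behaved entry constrains every field it can. The sketch below (the script path is invented for illustration) runs exactly once a week:

```
# Run a cleanup script at 2:30 a.m. every Sunday only
30 2 * * 0 /usr/local/bin/cleanup
```

Reading left to right: minute 30, hour 2, any day of the month, any month, day of week 0 (Sunday).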
Creating a crontab File

crontab files are created with your favorite text editor. Do not use word processors, because they insert special characters and control codes. The file must be plain text, such as that which is
created with the vi editor. Each field should be separated by a space or a tab. Blank lines are not allowed; they cause the crontab command to complain, as you will see subsequently. The comment symbol is the same as that used for the shells, the pound symbol (#). You should save your crontab file in your home directory.
Submitting the crontab File

The crontab file, once created, is submitted to cron through the use of the command crontab. The only argument crontab needs to submit a cron job is the name of the file that contains your specifications. An example follows:

$ cat cronlist
#
# This job is executed to remind me to go home
#
15 17 * * 1-5 mail -s "time to go home" < /dev/null
$ crontab cronlist
$

This demonstrates a successful submission to cron. But it is not uncommon to make mistakes, as shown here:

$ cat cronlist
#
# This job is executed to remind me to go home
#
15 17 * * 1-5 mail -s "time to go home" < /dev/null

$ crontab cronlist
crontab: error on previous line; unexpected character found in line.
$

In the preceding example, there is a blank line at the end of the file. When crontab reads the file to ensure that its format is correct, it sees the blank line and reports that there is an error in the file. This is the same error message that crontab prints if there aren't enough fields on the line. Integers that are outside their boundaries are also reported, with crontab echoing the offending line, as seen in the following:

$ cat cronlist
#
# This job is executed to remind me to go home
#
99 17 * * 1-5 mail -s "time to go home" < /dev/null
$ crontab cronlist
99 17 * * 1-5 mail -s "time to go home" < /dev/null
$
Making Changes

The crontab command has two options that can be used to make changes to your submitted jobs. The first is -r. This option instructs crontab to remove the existing crontab file: the contents of the file used by cron are destroyed. The second option is -l, which lists the jobs that are currently known to cron for the invoking user. These options are illustrated in the following:

$ crontab -l
#
# This job is executed to remind me to go home
#
15 17 * * 1-5 mail -s "time to go home"
$

To change your submitted jobs, follow these steps:

1. crontab -l > $HOME/cronlist

This retrieves the job specifications that cron currently has and saves them in the file cronlist, in your home directory.

2. vi cronlist

Once you have this information, you must edit it to reflect the changes you want to see. Edit the file using your favorite text editor, make the changes, and save the file.

3. crontab cronlist

Next execute the crontab command with your newly changed cronlist file. This has the effect of replacing the current information with your new specification.
3.8 Input, Output and Pipes

When you log in to UNIX, three files, or data streams, are opened: standard input, standard output, and standard error. Standard input is typically the keyboard. Standard output is the output device. Standard error is the error message stream. Standard output and standard error are separate so that programmers, users, and system administrators can all take advantage of the shell's powerful redirection facilities.
Input and Output

When a command is executed, any output that the programmer wanted the user to see is written to standard output. Error messages, such as those printed when an invalid option is used, are generally written to standard error. Standard output and standard error are both normally printed on the terminal. Consider the following shell script:

:
#
# @(#) termlist v1.0 - show a list of supported terminals
# copyright chris hare, 1989
#
# This script will occasionally generate some different looking results,
# which is dependent upon how the termcap file is set up
#
:
#
# Get the system and release name
#
SYS=`uname -s`
REL=`uname -r`
echo "Supported Terminals for $SYS $REL"
echo "___________________________________"
grep '^..|.*|.*' /etc/termcap | sed 's/:\\ //g
s/^..|//g
s/:.*://g' | sort -d | awk '{ FS="|"; printf "%-15s\t%-40s\n", $1, $NF }'

Any error messages are written to standard error, which appears the same as standard output because both print on the terminal device. The input/output redirection facilities enable you to change where the input, output, and error streams are connected. This increases the level of flexibility in the operation of the system. Redirection allows the output of a command to be written in a place other than what is typical. You perform output redirection by writing the command and following it with a > and the name of the file to which the output should be written.
Following are examples of standard output redirection. The object on the right-hand side of the > sign must be a file, as in the following syntax:

command > file

The following example runs the command who and saves the output in a file:

$ who > /tmp/save
$ cat /tmp/save
chare   console   Jun 25 16:36

Using the > file construct instructs the shell to create the file if it doesn't exist. If the file does exist, its previous contents are lost. If you want to append information to the existing file, use >> rather than >, as in the next example:

$ date >> /tmp/save
$ cat /tmp/save
chare   console   Jun 25 16:36
Wed Jun 29 09:53:47 EDT 1994

Notice that the file /tmp/save still has the information from the previously executed who command, plus the output from the date command. You can also redirect where the standard error messages are written, but it is done somewhat differently than with standard output. You must add a file descriptor number, as in the following line:

command 2> file

Three file descriptors are associated with these data streams: zero (0) is standard input, one (1) is standard output, and two (2) is standard error.
Standard input is generally associated with your keyboard. You can also redirect where standard input comes from by typing the command, followed by a < and a file that has the input needed for the command:

command < file

Input redirection is used infrequently when compared with output redirection, but it can be used to provide input to an interactive command. The following is an example of input redirection:

$ cat list
banana
apple

The sort command accepts this input and sorts the data:

$ sort < list
apple
banana
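Because the three streams have distinct descriptor numbers, output and errors can be split into separate files in a single command; a Bourne-shell sketch (the log file names are arbitrary):

```shell
# /no/such/file does not exist, so ls writes a complaint on standard
# error (descriptor 2) while the listing of /tmp goes to standard output.
ls /tmp /no/such/file > /tmp/out.log 2> /tmp/err.log
cat /tmp/err.log    # the error message was captured here
```

This separation is what lets a script log errors in one place while sending normal output somewhere else.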
Using exec

Another method of input/output redirection is to use the exec command. The following example performs the redirection only for this one command:

who > /tmp/save

But if you want to catch the output of a number of commands in a file, you can redirect the entire data stream. The following example redirects the standard output stream to the file /tmp/stdout:

$ exec > /tmp/stdout
$ date
$ who
$ ls
$ exec > /dev/tty
$ pwd
/tmp

In this example, the standard output is saved in the file /tmp/stdout. The shell still displays the prompt because the prompt is written on standard error. The execution of the date, who, and ls commands looks as though no output is generated. Then you restore the standard output to the terminal by redirecting it to /dev/tty, a special name connected to the connection port you are using. In the C shell, input and output redirection are slightly different, but the mechanisms work similarly. The C shell does not allow the use of file descriptor numbers. Rather, an ampersand (&) is appended to the redirection symbol. This has the effect of redirecting both standard output and standard error to the same place. The following example redirects standard output and standard error in the C shell:

% who >& /tmp/save
Pipes

A pipe (|) is a mechanism that connects the output of one command to the input of another. By using pipes, you can build very powerful commands without having to learn a high-level programming language. Commands must meet the following requirements to be used in a pipe:

The first command must write to standard output.
The other command in the pipe must read from standard input.

Pipes have a command on each side of the | symbol, as in the following example:

command | command

This is called a pipeline. Pipelines can be long and involved, or consist of only one or two commands. The following program is an example of a complicated pipeline that uses the facilities of a number of common UNIX commands:

:
#
# Get the system and release name
#
SYS=`uname -s`
REL=`uname -r`
echo "Supported Terminals for $SYS $REL"
echo "==================================="
grep '^..|.*|.*' /etc/termcap | sed 's/:\\ //g
s/^..|//g
s/:.*://g' | sort -d | awk '{ FS="|"; printf "%-15s\t%-40s\n", $1, $NF }'

Pipelines do not need to be this complicated. To illustrate standard input, output, and error and their interaction with pipes, look at the following small shell programs. The first program in this example is called ax1:

$ cat ax1
# ax1 shell program
echo "This is being sent to standard output."
exit 0

ax1 simply prints the message This is being sent to standard output. on the screen. If you were to run this at the command line, you would see the following:

$ ax1
This is being sent to standard output.
$

The second command is called ax2:

$ cat ax2
# ax2 shell program
while read LINE
do
    echo "this came from standard input"
    echo " -> $LINE"
done

ax2 is a little more complicated. It reads standard input and prints each line that is read on the standard output device. If you run ax2 from the command line, the following happens:

$ ax2
this is a test
this came from standard input
 -> this is a test
line 2
this came from standard input
 -> line 2
this is for a sample pipeline
this came from standard input
 -> this is for a sample pipeline
$

Because the ax2 command is reading from standard input, you must tell it when there is no more input to process. This is done using the Ctrl+D character. Pipes are used in many everyday situations. To find out how many files are in a directory, use the command ls | wc -l. If you want to view all the files in a directory a screen at a time, use ls -l | more.
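A classic demonstration of small tools joined by pipes is a word-frequency count; a sketch:

```shell
# tr splits the sentence into one word per line, sort brings duplicate
# words together, uniq -c counts each run, and sort -rn puts the most
# frequent words first.
echo "to be or not to be" | tr ' ' '\n' | sort | uniq -c | sort -rn
```

The words that appear twice (to, be) float to the top of the list; none of the four commands was designed with the others in mind, which is the point of the pipe mechanism.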
Chapter 4

4. The Command Line

4.1 Creating and Manipulating Files and Directories

A file is a sequence of characters, or bytes. When reading these bytes, the UNIX kernel places no significance on them at all; it leaves the interpretation of these bytes up to the application or program that requested them. The data files used by application programs are saved by the programs in a specific format that the application knows how to interpret.
Understanding Directories

A directory is a file, meaning that it is just a sequence of bytes. The information contained in a directory is the name of each file and an inode number. The inode is like an index card that holds the specific information about a file or other directory, including the following:

The owner of the file
The group that owns the file
The file size
The file permissions
The data block addresses on the disk where the information is stored
The directory structure is part of the heart and soul of UNIX; its exact layout depends upon the version of UNIX you are using.
Special Files

UNIX also supports several types of special files, which fall into one of the following categories:

Character device files
Block device files
Hard links
Symbolic links
Character device files read and write data one character at a time. Block device files access a block of data at a time; the block is generally either 512 or 1,024 bytes. The kernel actually does read or write the device a character at a time, but the information is handled through a buffering process that only passes the information along when there is a full block. Disks appear as both block and character devices because, depending on the command in use, they function as either type of device. The hard link is a special type of file that allows a single file to have multiple names. Hard links have the following two restrictions:
The file and the second name must be part of the same file system.
Hard links can only provide a second name for files; they do not work for directories.
Symbolic links serve the same purpose as hard links, but without these restrictions: they can span file systems, and they can point to directories.
Moving around a Directory

Moving around in the directory structure requires that you know two commands: cd and pwd. The cd command is used to change from one directory to another.
cd

The cd command is used to move from one place to another in the file system. With no argument, cd uses the value of the environment variable HOME and returns you to your home directory:

$ pwd
/tmp
$ cd
$ pwd
/home/chare
$

When using cd with an argument, you add the name of the directory that you want to access. You can use either a full pathname or a relative pathname to get to that directory. The following are some examples of using cd with both types of pathnames:

$ cd /tmp
$ pwd
/tmp
$ cd
$ pwd
/home/chare
$ cd book
$ pwd
/home/chare/book
$ cd /usr
$ pwd
/usr
$
pwd

pwd is an acronym for Print Working Directory. The pwd command accepts no arguments and prints the name of the directory you are currently working in. For example:

$ pwd
/home/chare/gopher2.012/doc
$
Understanding Absolute and Relative Pathnames

Absolute pathnames always start with a slash. Following is a list of some absolute pathnames:

/
/usr/spool/mqueue
/home/chare
/usr/mmdf

When using absolute pathnames with cd, enter the cd command followed by the name of the directory. Relative pathnames do not begin with a slash, because the term relative means relative to the current directory.
Listing Files and Directories There are two commands that can help you navigate through the files and directories in your UNIX system.
ls Command

The ls command lists the contents of a directory. It accepts arguments and has a plethora of options that affect its operation. The following shows some sample output of the ls command:

$ ls
CrossRoads   DECterm   Mail   News   book   gophermail.tar   uyhl.pc
$
The ls command has a list of options that can alter the way it lists the files. The most common options are as follows:

-a    lists all files, including hidden files
-C    lists in columns
-F    shows the file type
-l    lists in long format
-d    shows directory names only
-R    does a recursive listing
The ls -a command lists all files in the directory, including hidden files. Every directory has two dot files: the current directory is indicated with the single dot (.), and the parent directory is shown as dot-dot (..). The following code illustrates the ls -a command:

$ ls
gamma1       infra_red1   uvA          uvB          xray1        xray2
$ ls -a
.            .white       gamma1       uvA          xray1        xray2
..           .xray3       infra_red1   uvB
$

In the first listing no options were given, and not all the files were listed. The second listing added the -a option, and now you see the hidden files as well. The -C option to ls, the standard mode of operation for BSD systems, lists the files in columns. The -C option is often combined with the -F option to pick out the executable files and directories in the file list. The following commands show an example of these two options:

$ ls -C
gamma1       infra_red1   uvA          uvB          xray1        xray2
$ ls -CF
gamma1       infra_red1*  uvA/         uvB          xray1        xray2
$
In the second part of the previous code, some files are suffixed with an asterisk or a slash. The asterisk indicates that the file is an executable; the slash indicates that it is a directory. The -l option lists the files and the directories in a long format. This format provides most of the information a user needs. In the output of ls -l, the time field shows a date and time for recently modified files, and a date and year for older ones. Sometimes, you must search through many files to find the existing directories. The -d option is used for this purpose: it instructs ls to display only the directory itself and not its contents. The following code illustrates the use of ls -d in the /usr/lib directory:

$ pwd
/usr/lib
$ ls -l tmac
total 4
-rw-r--r--   1 bin    bin     55 Jun  6  1993 tmac.an
-rw-r--r--   1 bin    bin     91 Jun  6  1993 tmac.m
-rw-r--r--   1 bin    bin     65 Jun  6  1993 tmac.osd
-rw-r--r--   1 bin    bin     58 Jun  6  1993 tmac.ptx
$ ls -d tmac
tmac
$ ls -ld tmac
drwxrwxrwx   2 root   users   96 Jun  6  1993 tmac
$
In the above code, the initial command shows the contents of the directory tmac. The second time the ls command was issued, only the -d option was used, which does not list the contents of the directory tmac. Finally the ls command was issued with both the -l and -d options, which prints all the information on the tmac directory itself. The last option is the -R option, which instructs ls to perform a recursive listing of the files in the directory. The following code shows ls -lR in a small directory structure:

$ ls -lR uvA
total 12
drwxr-xr-x   2 chare   users    48 Aug 24 17:53 micro_light
-rw-r--r--   1 chare   users   438 Aug 24 17:52 test1
-rwxr-xr-x   1 chare   users    45 Aug 24 17:53 test3

uvA/micro_light:
total 1
-rw-r--r--   1 chare   users    29 Aug 24 17:53 sam

$

Three additional options, used less frequently than the six above, can come in handy for changing the order in which the list is displayed:

-r    Reverses the display.
-t    Shows the files in order of time modified, with the most recently modified files appearing first.
-u    Shows the files in order of last access time, with the most recently accessed files listed first.
These options can be used individually, or in conjunction with any of the other available options.
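As a quick sketch of the sort options, the session below creates two files with different modification times and lists them both ways. The directory /tmp/ls_demo and the file names are arbitrary choices for the demonstration.

```shell
# Set up a scratch directory with two files whose modification
# times differ by at least one second.
rm -rf /tmp/ls_demo
mkdir /tmp/ls_demo
cd /tmp/ls_demo
touch old_file
sleep 1
touch new_file

ls -t     # newest first: new_file, then old_file
ls -rt    # -r reverses the -t order: old_file, then new_file
```

Combining -r with -t in this way is a common idiom for putting the most recently changed files at the bottom of a long listing, where they are easiest to see.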
The File Command

The file command uses a database called /etc/magic, or on some systems /usr/lib/file/magic, to tell it what to look for in a file to determine what type of file it is. This is called the file signature. How well the file command works on your system is determined by the magic file and how many entries your vendor included in it. The file command should be followed by a file or list of files that you want to examine. The following example shows a sample of the file command:

$ file *
FileCabinet:    directory
MW.INI:         data
a.out:          mc68k executable
a1:             ascii text
b1:             c program text
brk:            commands text
city:           English text
hello.c:        c program text
list:           empty
$
Creating a File

Files can be created by application programs, through output redirection, or as temporary storage by commands such as language compilers.
Using the touch Command to Create a File

The touch command creates an empty file and can also be used to update a file's access time. The syntax of the command is as follows:

$ touch filename

The touch command creates a file in the current directory with the name specified on the command line. The following code illustrates the changes that occur on the access times using touch:

$ touch /etc/passwd
touch: cannot change times on /etc/passwd
$ ls -l list
-rw-r--r--   1 chare  users  29 Aug 24 17:53 list
$ date
Wed Aug 24 18:01:14 EST 1994
$ touch list
$ ls -l list
-rw-r--r--   1 chare  users  29 Aug 24 18:01 list
$
Editor

You can create files on UNIX by using a text editor. Editors that are typically part of UNIX distributions are ed, ex, vi, and emacs.
Output Redirection

Files can be created using output redirection, which enables you to change where the output of a command is sent. You can send the output to a file using the appropriate symbol for your shell. Example:

$ cal > /tmp/output
$ cat /tmp/output
    August 1994
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
$ ls -l /tmp/output
-rw-r--r--   1 chare  users  29 Aug 24 17:53 /tmp/output
$
The output of the cal command is being redirected into the file /tmp/output, thereby creating a new file. The redirection symbol is the > between the command and the file name.
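A related redirection symbol, >>, appends to an existing file instead of overwriting it. A small sketch, using an arbitrary file name:

```shell
# > truncates the file and writes; >> appends to whatever is there.
echo "first line"  >  /tmp/append_demo
echo "second line" >> /tmp/append_demo
cat /tmp/append_demo   # shows both lines, in order
```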
Making a Copy

Creating a file also can be done by making a copy of an existing file. This process involves the use of the command cp, which requires at least two arguments, as illustrated in the following examples:

$ cp old_file new_file
$ cp file1 file2 ... directory

The first line copies the old file to the new file. The second copies one or more files into the named directory.

$ ls -l
total 3
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
-rw-r--r--   1 chare  users  27 Aug 24 17:53 list
-rw-r--r--   1 chare  users  23 Aug 24 17:03 sample
$ cp list new_file
$ ls -l
total 4
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
-rw-r--r--   1 chare  users  27 Aug 24 17:53 new_file
-rw-r--r--   1 chare  users  27 Aug 24 17:53 list
-rw-r--r--   1 chare  users  23 Aug 24 17:53 sample
$
Reading a File Aside from using an editor to look at a file, a number of commands enable you to view the contents of a file without using an editor. The commands used for this purpose are cat, more, and pg.
The Cat Command

The cat command is used to view a small text file or to send a text file to another program through a pipe. The cat command has no facility to view the file in manageable chunks. The only way to do this is to use Ctrl+S to suspend the output and Ctrl+Q to restart the output flow. If you are connecting through a network, it is possible that the control characters will not be processed quickly enough to avoid the "loss" of data off the screen. The use of the cat command both to view a file and to send it to another program through a pipe is demonstrated in the following code:
$ cat fruits
apple
orange
lemon
lime
banana
kiwi
cherry
$ cat fruits | sort
apple
banana
cherry
kiwi
lemon
lime
orange
$

In the first part, the cat command lists the contents of the file to the screen. In the second part of the code, the cat command is used to give sort some input to sort. To find out whether a file has more information than you can handle on-screen, you can use the wc command. The wc command is a word counter: it scans a given file and counts the number of lines, words, and characters in the file, using whitespace to tell where a word starts and ends. This tells you whether you can use cat, or whether you should use more or pg. The wc command uses the following format:

$ wc file

In this format, wc reports all three counts: lines, words, and characters, as demonstrated in the following code. The example also shows how the -l option instructs the wc command to count only the number of lines:

$ wc a.out
      39     541   21959 a.out
$ wc -l fruits
       7 fruits
$
If the output from wc indicates that the file is more than 20 or 22 lines, then it is probably a good idea to use either more or pg to view the file.
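The decision described above can be scripted directly. This sketch uses /etc/passwd as the sample file and 22 lines as the threshold; both are arbitrary choices for the demonstration.

```shell
# Choose a viewer based on the line count reported by wc.
file=/etc/passwd
lines=$(wc -l < "$file")      # reading from stdin omits the file name
if [ "$lines" -gt 22 ]; then
    echo "$file is long: use more or pg"
else
    echo "$file is short: cat is fine"
fi
```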
The More Command One alternative to the cat command is the more command, which has its roots in BSD UNIX. Because of its popularity, many other UNIX vendors started including the more command in their own distributions. The format of the more command is as follows:
$ more file

The more command displays the file one screen at a time, making it useful for viewing large files. As each screen is displayed, more pauses the display and prints a prompt on the last line of the screen. Example:

$ more test3
total 220
drwxr-xr-x   2 chare  users  48 Aug 24  1996 micro_light
-rw-r--r--   1 chare  users  29 Aug 24  1996 sam
drwxr-xr-x   2 chare  users  43 Aug 24 17:53 abd
-rw-r--r--   1 chare  users  25 Aug 24 17:04 samply
drwxr-xr-x   2 chare  users  58 Aug 24 17:59 egjkrhgo
drwxr-xr-x   2 chare  users  68 Aug 24 07:53 ujheue
-rw-r--r--   1 chare  users  79 Aug 24 07:04 iuyeu
-rw-r--r--   1 chare  users  99 Aug 24 17:04 uke
--More--(15%)

The last line in the display shows the --More--(15%) prompt. By this, more tells you that it is waiting for a command, and that you have viewed 15 percent of the total file. The more command has flexibility built into it, from searching for text, to moving forward, to starting vi at the line that you are viewing.
The pg Command

The pg command is like the more command, but has more System V background. It enables the user to view a file one screen at a time. The command format for pg is the same as for more, as shown in the following example:

$ pg file

The following list shows the help screen from the pg command:

h                        help
q or Q                   quit
\n                       next page
l                        next line
d or ^D                  display half of a page more
. or ^L                  redisplay current page
f                        skip the next page forward
n                        next file
p                        previous file
$                        last page
w or z                   set window size and display next page
s savefile               save current file in savefile
/pattern/                search forward for pattern
?pattern? or ^pattern^   search backward for pattern
! command                execute command

Most commands can be preceded by a number, as in +1\n (next page), -1\n (previous page), or 1\n (first page).
The previous list shows some similarities between pg and more, but there are some significant differences. For example, to view the next screen in the file, press Enter, not the spacebar. To view the next line in the file, use the l key, not the Enter key. The following code shows the use of pg to view a file, and how to use the l command:

$ pg test3
total 220
drwxr-xr-x   2 chare  users     32 May 16  1993 Clipboard
-rw-r--r--   1 chare  users    126 Jun  5  1993 Environment
drwxr-xr-x   6 chare  users    272 May  3 07:47 Filecabinet
-rw-r--r--   1 chare  users     63 Jul 29  1993 MW.INI
drwx------   2 chare  users     32 Apr 30 06:37 Mail
drwxr-xr-x   2 chare  users     32 May 16  1993 Wastebasket
-rwxr-xr-x   1 chare  users  21959 May  8 07:01 a.out
:l
-rw-r--r--   1 chare  users     29 Aug 24 17:04 sam
:l
-rw-r--r--   1 chare  users     25 Aug 24 16:04 hgf
:l
-rw-r--r--   1 chare  users     28 Aug 24 19:04 ggy
:

With the pg command, you can easily move backward line by line as well as forward line by line.
Removing a file

When a file is removed, the inode number in the file's directory entry is set to zero. This means that there is no longer any way to connect the filename to the actual information. After this is done, you cannot recover the information unless you have a backup copy of the file saved somewhere.
The rm command

Removing a file is done with the rm command, which has three options: -i, -f, and -r. The format of the rm command is as follows:

$ rm file1 file2 file3 ...

Like most UNIX commands, you can specify any number of files on the rm command line. The following code shows removing a file and verifying that it has been deleted:

$ ls -l
total 13
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
-rw-r--r--   1 chare  users  24 Aug 24 16:04 output
drwxr-xr-x   2 chare  users  53 Aug 24 15:53 abd
-rw-r--r--   1 chare  users  99 Aug 24 17:04 ijm
-rw-r--r--   1 chare  users  09 Aug 24 18:04 jhm
-rw-r--r--   1 chare  users  49 Aug 24 19:04 sajh
-rwsr-xr-x   1 chare  users  79 Aug 24 15:04 samjh
$ rm output
$ ls -l
total 2
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
drwxr-xr-x   2 chare  users  53 Aug 24 15:53 abd
-rw-r--r--   1 chare  users  99 Aug 24 17:04 ijm
-rw-r--r--   1 chare  users  09 Aug 24 18:04 jhm
-rw-r--r--   1 chare  users  49 Aug 24 19:04 sajh
-rwsr-xr-x   1 chare  users  79 Aug 24 15:04 samjh
$
Use the pwd command to ensure that you know where you are in the directory structure. This helps prevent you from removing something that you don't really want removed. Next, list the files, and then type the rm command. The following code demonstrates the best way to remove files using wild cards. Use of the -i option puts rm into interactive mode. For each file on the command line, rm prompts you with the name of the file. If you type y and press Enter, the file is removed. Example:

$ ls -l
total 33
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
-rw-r--r--   1 chare  users  24 Aug 24 16:04 output
drwxr-xr-x   2 chare  users  53 Aug 24 15:53 abd
-rw-r--r--   1 chare  users  99 Aug 24 17:04 ijm
-rw-r--r--   1 chare  users  09 Aug 24 18:04 jhm
-rw-r--r--   1 chare  users  49 Aug 24 19:04 sajh
-rwsr-xr-x   1 chare  users  79 Aug 24 15:04 samjh
$ rm -i *
sam: ? y
output: ? n
rm: abd directory
ijm: ? y
jhm: ? y
sajh: ? n
samjh: ? y
$ ls -l
total 3
-rw-r--r--   1 chare  users  24 Aug 24 16:04 output
drwxr-xr-x   2 chare  users  53 Aug 24 15:53 abd
-rw-r--r--   1 chare  users  49 Aug 24 19:04 sajh
$

If you type n, or nothing, and press Enter, the file is not removed. Remember that when a file is deleted, it cannot be recovered without the use of a backup.
The -f option forces the removal of a file, regardless of its permissions. The use of this option requires that the user be the owner of the file, or root.
The -r option can perform recursive removals of a directory structure, which includes all files and directories in that structure. The rm -r command is very dangerous. Do not use it with wild cards unless you are prepared to live with the consequences. The rm -r command accepts files or directories to be removed. If the argument is a directory, then rm looks through the directory, removes everything under it, and then removes the directory itself. The following code shows an rm -r command in action:

$ ls -l
total 6
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
-rw-r--r--   1 chare  users  24 Aug 24 16:04 output
drwxr-xr-x   2 chare  users  53 Aug 24 15:53 abd
-rw-r--r--   1 chare  users  99 Aug 24 17:04 ijm
-rw-r--r--   1 chare  users  09 Aug 24 18:04 jhm
$ cd new_dir
$ ls -l
total 1
drwxr-xr-x   2 chare  users 112 Aug 24 19:03 test1
$ rm -ri test1
directory test1: ? y
test1/a.out: ? y
test1/a1: ? y
test1/a2: ? y
test1/a3: ? y
test1/a4: ? y
test1: ? y
$
Simply using the rm -r command provides no output or error messages unless you do not have permission to remove part of the directory tree. In that case, rm reports an error message indicating that you don't have the needed permission.
Creating the Directory To make a directory, use the command mkdir, which accepts multiple arguments, each one being the name of the directory you want to create. Unless indicated by providing a full pathname, the directories are created in your current directory.
The mkdir command

The syntax for mkdir is as follows:

$ mkdir directory_name

The same rules for file names apply to directory names. You can insert a space into a directory name, but it makes the name difficult to use. Avoid spaces in directory names;
use an underscore instead. The directory name also cannot already exist as either a directory or a file; both cases result in an error message. The following shows some sample directories being created:

$ ls -l
total 5
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
-rw-r--r--   1 chare  users  24 Aug 24 16:04 output
drwxr-xr-x   2 chare  users  53 Aug 24 15:53 abd
-rw-r--r--   1 chare  users  99 Aug 24 17:04 ijm
$ mkdir new_dir
$ ls -ld new_dir
drwxr-xr-x   2 chare  users 199 Aug 24 17:04 new_dir
$ mkdir sam
mkdir: file exists
$ mkdir /etc/chare
mkdir: permission denied
$ mkdir "space dir"
$ ls -ld space dir
space not found
dir not found

In this code, you see a directory listing of the files in the current directory. The user makes a directory new_dir, and then tries to make another directory sam. Of course this fails, and mkdir reports that a file called sam already exists. The user then tries to create a directory in /etc. This typically is not permitted because the /etc directory, as you may recall, is used by the system as a place to store system administration commands and system configuration files. Next is a directory name with a space in it. Using a space in the directory name can lead to all kinds of confusion, which is why you should avoid using spaces in both file and directory names.
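Most modern UNIX systems also support a -p option to mkdir, which creates any missing parent directories in one step and does not complain if the directory already exists. The path below is an arbitrary example:

```shell
# Create a nested tree in one command; without -p, mkdir would fail
# because /tmp/mkdir_demo/src does not exist yet.
rm -rf /tmp/mkdir_demo
mkdir -p /tmp/mkdir_demo/src/include
ls -ld /tmp/mkdir_demo/src/include

# Running it again produces no "file exists" error.
mkdir -p /tmp/mkdir_demo/src/include
```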
Removing a Directory Removing a directory is accomplished by using the rmdir or rm –r commands.
The rmdir command

The rmdir command also can accept multiple directory names, like the mkdir command, but it has one requirement: the directory must be empty. It cannot contain any hidden files, files, or directories at all. If your directory has subdirectories that have subdirectories, removing them can be a tedious process. The following code illustrates the removal of a directory using rmdir:

$ ls -l
total 7
-rw-r--r--   1 chare  users  29 Aug 24 17:04 sam
-rw-r--r--   1 chare  users  24 Aug 24 16:04 output
drwxr-xr-x   2 chare  users  53 Aug 24 15:53 abd
-rw-r--r--   1 chare  users  99 Aug 24 17:04 ijm
-rw-r--r--   1 chare  users  09 Aug 24 18:04 jhm
$ rmdir /etc
rmdir: /etc: permission denied
$ rmdir space dir
rmdir: space nonexistent
rmdir: dir nonexistent
$ rmdir "space dir"
$ mkdir new_dir/test1
$ rmdir new_dir
rmdir: new_dir not empty
$

In this example, the user first tried to remove the /etc directory; for the same reason that an ordinary user can't create a directory there, the user can't remove /etc. The user then tried to remove the directory that has a space in its name, but forgot about the quotation marks. This looked like two arguments to rmdir, which then complained that it couldn't find a directory named space or one named dir. Then the user remembered the quotation marks and removed the directory. Finally, the directory new_dir couldn't be removed because it isn't empty; for cases like this, the rm -r command explained earlier can be used.
The mv Command

The mv command does two things: it moves a file from one place to another, and it renames a file. The cp command copies files from one directory to another, but the disadvantage of this is that your file then takes up twice as much disk space. You can move the file from one place to another instead, which saves disk space. You can accomplish the same thing as the mv command by copying the file and removing the original. The syntax to move a file from one directory to another is as follows:

$ mv file file file ... directory

This enables the user to enter at least one file name, followed by the directory to move the files to. If more than one file is specified, and the last name is not a directory, the move fails and mv reports an error. The following code shows the use of the mv command:

$ ls -l
total 6
-rw-rw-rw-   1 chare  users  656 Aug 24 18:54 ax1
drwxr-xr-x   2 chare  users   32 Aug 24 18:52 micro_light
-rw-rw-rw-   1 chare  users  605 Aug 24 18:40 sample
-rw-rw-rw-   1 chare  users   60 Aug 24 18:41 test
$ mv s* t* /usr/tmp
$ ls s* t*
s* not found
t* not found
$ ls /usr/tmp
sample  test
$

Notice that wild cards are used in this example. In the mv command, separate substitutions are made for the s* files and the t* files. However, when it comes time to check the /usr/tmp directory, no patterns are needed; the directory is simply listed.
The second format of the mv command enables you to change the name of a file. You can choose to change the name and leave the file in the current directory, or you can change the name and move the file to another directory. The syntax for these cases is as follows:

$ mv old_name new_name
$ mv old_name /new_dir/new_name

The first form changes the file name and keeps the file in the current directory; for example, it could rename the file ax1 to junk. The second form changes the file name and puts the file into another directory. If you were to rename ax1 to junk and then list the files in the current directory, you would see that the file ax1 is missing, but a file named junk is there instead. If a problem arises, mv informs you with an appropriate error message.
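Both forms can be sketched in a short session; the file and directory names below are arbitrary:

```shell
# Scratch setup for the demonstration.
rm -rf /tmp/mv_demo
mkdir -p /tmp/mv_demo/archive
cd /tmp/mv_demo
touch ax1

mv ax1 junk              # first form: rename within the current directory
mv junk archive/junk2    # second form: move and rename in one step
ls archive               # junk2
```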
The ln command

If you need to have the same file known by different names, this is possible without having to make a copy. A hard link enables you to give a file another name. This is done by creating an entry in a directory; no additional disk space is consumed. Creating a hard link is done with the ln command. The syntax of the command is as follows:

$ ln old_name new_name

The new name can be either an absolute or a relative pathname. The following code illustrates the use of the ln command to create a hard link:

$ ls -l old_one
-rw-r--r--   1 chare  users  202 Aug 24 19:16 old_one
$ ln old_one new
$ ls -l new old_one
-rw-r--r--   2 chare  users  202 Aug 24 19:16 new
-rw-r--r--   2 chare  users  202 Aug 24 19:16 old_one
$

This code shows an ls -l listing of a file called old_one in the directory. Notice the link count, which is the number immediately following the permissions. When the example began, the link count was one. After the ln command was used to create another name for this file, the next ls shows a link count of 2, meaning that two directory entries in the system point to this file. The second kind of link, often used in Network File System (NFS) environments, is the symbolic link. Symbolic or soft links are often used for attaching a different directory name, but also can be used for files. If dir1 and dir2 are directories, $ ln -s dir1 dir2 creates a soft link.
The syntax for ln -s is the same as for ln. In the case of dir2, which is a symbolic link to dir1, the ls -l output reports a line like the following: dir2 -> dir1
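The dir1/dir2 case described above can be reproduced directly; /tmp/ln_demo is an arbitrary scratch location:

```shell
# Create a directory and a symbolic link to it.
rm -rf /tmp/ln_demo
mkdir -p /tmp/ln_demo
cd /tmp/ln_demo
mkdir dir1
ln -s dir1 dir2          # dir2 becomes a symbolic link to dir1

ls -ld dir2              # the listing ends with: dir2 -> dir1
```

Unlike a hard link, a symbolic link is a separate small file that holds the target name, so it can point across filesystems and at directories.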
The copy command

The copy command is used for copying a directory and its contents. It has a number of options, among them -r, -o, -m, and -v. These instruct copy to do a recursive copy, retain the owner, retain the last modification date, and be verbose about what it is doing.

$ copy -romv ../andrewg
examine directory ../andrewg/mail400
examine directory ../andrewg/mail400/mtaexe
copy file ../andrewg/mail400/mtaexe/mta-admin
copy file ../andrewg/mail400/mtaexe/mta
examine directory ../andrewg/mail400/uaexe
copy file ../andrewg/mail400/uaexe/mail400
examine directory ../andrewg/gopher
copy file ../andrewg/wsg-10.exe
$

The ../ notation tells the shell to go up one level and find the directory andrewg, and then copy it into the current directory. The remainder of the example shows the output of the copy command. When using the -v option, copy informs you when it copies a file, or when it looks at a directory to see what there is to copy. There are two other ways to copy a directory structure. One is to use the command cp -R on systems that support it, as shown below:

$ cp -R source destination

Here, the source directory structure is copied to a new directory called destination. The second method uses the tar command. The syntax of the command follows:

$ cd source; tar cf - . | (cd dest; tar xfBp -)

This uses the tar command to create an archive of the directory and to send it through a pipe to a subshell which, in turn, goes to the destination directory and uses tar to extract the archive. The result of this command is a copy of the original structure.
4.2 Controlling Permissions to Files Defining Permissions File and directory permissions form the combination lock around your data. However, even if you completely secure your information so that no other user can access it, on most systems the system administrator can still open and read your files. Even though you can prevent the majority of people from accessing your files, you cannot prevent all of them.
The Password File

The basic information about who you are is stored in a file called the password file, which is typically found at /etc/passwd. A password file entry consists of seven fields:

Username        chare
Password        A/49wrhyu
UID             1003
GID             104
Comment         Chris Hare
Home Directory  /home/chare
Shell           /usr/bin/ksh

These entries are inserted when your account is created by the system administrator. The password file is colon delimited, which means that a colon separates each field in the file. The first field is your login, or username. Some examples of valid usernames are chare, terri, jimh, and rfh. The second field is your encrypted password; when you log in, the password that you type is encrypted to see if there is a match. The third field is your actual user number, or UID. This number uniquely identifies you to the system. The fourth field is your group number, or GID. This is your login group, and it identifies which group of users you belong to. The fifth field is the comment field, which typically holds your full name. The sixth field is your home directory: when you log in to UNIX, it places you in this location in the directory structure. The seventh field is your login shell. A number of shells are available; the value of this field determines whether you will be using the Bourne shell, the Korn shell, or the C shell.
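Because the file is colon delimited, standard tools such as cut and grep can pull individual fields out of it. A small sketch (the account name root exists on most systems, but that is an assumption):

```shell
# Print username (field 1), UID (field 3), and shell (field 7)
# for the first few entries in the password file.
cut -d: -f1,3,7 /etc/passwd | head -n 5

# Show the login shell of one account.
grep '^root:' /etc/passwd | cut -d: -f7
```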
The Group File

The group file contains information about groups of users. The group file also uses a colon to separate each field, and there are four fields:

Groupname   tech
Password    *
GID         104
User list   chare, andrewg, patc

The first field is the name of the group. The second field, which usually contains a placeholder such as * or x, is for a group password. The third field is the actual group number, which identifies the group. The fourth field is a comma-separated list of user names.
User ID

Each user is assigned a unique user ID, or UID, that identifies him to the system. Every process you run and every file you create is stamped with your UID. This UID is associated only with your user name.
Group ID

A group is a collection of users who are grouped together so that they can access a common set of files or directories. This means that a user who does not own a file, but is a member of the group that owns it, can still be allowed access to the file.
The id Command

The id command is used to give information about the user. As illustrated in the following code, id tells you your user name, UID, and GID. If your effective UID or GID is different, these also are listed in the output of id.

$ id
uid=1003(chare) gid=104(tech)
$
Understanding the Permission Bits

Nine permission bits are associated with each file and directory: three for the owner of the file, three for the members of the group that owns the file, and three for everyone else. If you are not the owner, but you belong to the group that owns the file, then the group permissions control your access.
File Permissions

For each group of users, there is a set of permission bits. These bits correspond to being able to read, write, and execute the file. Read permission gives the user the capability to open the file and view its contents; this could be done with commands like cat, more, and vi. Write permission gives a user the capability to open the file and modify its contents. Execute permission gives the user the capability to execute the file as a command. The following example illustrates permissions in action to control access to files:

$ ls -l output
--w-------   1 chare  users  236 Aug 24 20:13 output
$ cat output
cat: cannot open output
$ ls -l output2
-r--r--r--   1 chare  users  236 Aug 24 20:13 output2
$ echo "new data" > output2
ksh: output2: cannot create
$ output2
ksh: output2: cannot execute
$

The first ls example shows a file with no read permission for anyone, so the cat command fails. The user then tried to overwrite the contents of the file output2 using shell redirection. Because that file has read permission but not write permission, the action is denied, as is the user's request to execute the file.
Directory Permissions

When a user has read permission on a directory, the user can list the contents of the directory. With write permission, the user can create new files in, or delete existing files from, the directory. The execute bit on a directory does not mean that you can execute the directory, but that you can use cd to go into the directory, or use a file in the directory. The following code illustrates the permissions on a directory in action:

$ ls -l a
a/list: Permission denied
$ ls -l
total 4
dr--r--r--   2 chare  users  48 Aug 24 21:09 a
drwxr-xr-x   2 chare  users  32 Aug 24 21:08 a2
drwxr-xr-x   2 chare  users  32 Aug 24 21:08 micro_light
-rw-r--r--   1 chare  users  38 Aug 24 21:09 output
$ cd a
ksh: a: bad directory
$ touch a/new
touch: a/new cannot create
$ cp output a2
$

On UNIX systems, the read bit enables the user to list the contents of the directory, but on DEC Ultrix systems, it takes more than the read bit to list the files. The execute permission bit restricts access to the directory by controlling whether you can use the cd command to go into it. On SCO systems, if you have the execute bit set and not the read bit, then you can cd into the directory and use a file if you know its name. The write bit enables users to create or remove files in the directory.
Interactions

Interaction between the directory and file permissions can create problems. When a user wants to create a file, the permissions on the directory are checked: if the user has write permission on the directory, then the file can be created. The following code illustrates a file that has write permission for its owner, but sits in a directory with no write permission:

$ ls -l
total 4
dr--r--r--   2 chare  users  48 Aug 24 21:09 a
drwxr-xr-x   2 chare  users  32 Aug 24 21:08 a2
drwxr-xr-x   2 chare  users  32 Aug 24 21:08 micro_light
-rw-r--r--   1 chare  users  38 Aug 24 21:09 output
$ ls -ld a2
dr-xr-xr-x   2 chare  users  48 Aug 24 21:12 a2
$ ls a2
output
$ rm a2/output
rm: a2/output not removed. Permission denied
$ ls -l a2
total 1
-rw-r--r--   1 chare  users  38 Aug 24 21:12 output
$ date > a2/output
$ ls -l a2
total 1
-rw-r--r--   1 chare  users  29 Aug 24 21:14 output
$

Here, the user cannot remove the file with the rm command, because there is no write permission on the directory. The file's contents can still be overwritten, however, because of the write permission on the file itself.
The chmod Command

Changing the permissions on a file or directory is done with the chmod command. The syntax of the command is as follows:

$ chmod mode file(s)

The mode is the set of permissions that you want to assign. You can write the mode in two ways: one is called symbolic, and the other absolute. The symbolic format uses letters to represent the different permissions, and the absolute format uses a numeric notation with octal digits representing the different permission levels. Bear in mind that only the owner of the file can change the permissions associated with it, although the superuser (root) can alter the permissions as well.
Symbolic

The symbolic mode uses letters to represent the different permissions that can be assigned, as outlined in the following table:

Symbol   Meaning
r        read
w        write
x        execute or search
There are different groups of users to which you want to grant permissions. These are the owner, members of the same group, and all other users.
Symbol   Meaning
u        owner (user) of the file
g        members of the same group
o        all other users
a        all users
Three operators are used to indicate what is to be done with the permission and the user group.
Symbol   Meaning
+        add the permission
-        remove the permission
=        set the permissions equal to this
To define a mode using the symbolic format, you first decide which users are affected, then whether you are adding or removing the permission, and then which permission you are working with. Several examples are shown in the following code:

$ ls -l
total 8
dr--r--r--   2 chare  users  48 Aug 24 21:09 a
dr-xr-xr-x   2 chare  users  48 Aug 24 21:12 a2
-rw-r--r--   1 chare  users  25 Aug 24 21:28 alpha.code
drwxr-xr-x   2 chare  users  32 Aug 24 21:08 micro_light
-rw-r--r--   1 chare  users  38 Aug 24 21:08 output
-rw-r--r--   1 chare  users  29 Aug 24 21:28 test2
-rw-r--r--   1 chare  users  12 Aug 24 21:28 test_1
drwxr-xr-x   2 chare  users  32 Aug 24 22:19 uvA
$ chmod -r test2
$ ls -l test2
--w-------   1 chare  users  29 Aug 24 21:28 test2
$ chmod g+rwx test2
$ ls -l test2
--w-rwx---   1 chare  users  29 Aug 24 21:28 test2
$ chmod =r test2
$ ls -l test2
-r--r--r--   1 chare  users  29 Aug 24 21:28 test2
$ chmod u+rwx,g+r,o+r test2
$ ls -l test2
-rwxr--r--   1 chare  users  29 Aug 24 21:28 test2
$ ls -l test_1
---x--x--x   1 chare  users  12 Aug 24 21:28 test_1
$
The first example demonstrates the removal of read permission from the file test2, which results in the permissions being write only for the owner. The second example illustrates the addition of read, write, and execute permissions for the group owners; with the g option, the permission change doesn't affect any other users. The third example shows how to use the = operator, which instructs chmod to set the permissions on the file to exactly what is specified. The next example illustrates how to make multiple changes at once; you could instead execute chmod three different times to make the desired changes. If chmod for any reason cannot access the file or make the requested change, an error message is printed to indicate the problem.
Absolute

The absolute method requires that the permissions for all users be specified, even for those that are not changing. This method uses a series of octal digits to represent each of the permissions. The octal values are added together to give the actual permission value. Octal representation of -rw-r--r--:

-    r w -    r - -    r - -
     4 2 .    4 . .    4 . .
      = 6      = 4      = 4
Notice that the read permission has an octal value of 4, write has a value of 2, and execute a value of 1. To calculate the permissions, add the octal values for each group of users. You can run the chmod command using the octal value 644 for the permissions instead of the symbolic values. The following code shows examples using the absolute method of chmod:

$ ls -l
total 5
drwxrwxrwx   2 chare  users  48 Aug 24 21:12 a2
-rw-rw-rw-   1 chare  users  25 Aug 24 21:28 alpha.code
-rw-rw-rw-   1 chare  users  29 Aug 24 21:28 test2
-rw-rw-rw-   1 chare  users  12 Aug 24 21:28 test_1
drwxrwxrwx   2 chare  users  32 Aug 24 22:19 uvA
$ chmod 200 test2
$ ls -l test2
--w-------   1 chare  users  29 Aug 24 21:28 test2
$ chmod 270 test2
$ ls -l test2
--w-rwx---   1 chare  users  29 Aug 24 21:28 test2
$ chmod 444 test2
$ ls -l test2
-r--r--r--   1 chare  users  29 Aug 24 21:28 test2
$ chmod 744 test2
$ ls -l test2
-rwxr--r--   1 chare  users  29 Aug 24 21:28 test2
$ chmod 111 test_1
$ ls -l test_1
---x--x--x   1 chare  users  12 Aug 24 21:28 test_1
$
The symbolic and absolute forms are equivalent; compare the following pairs:

chmod -r test2                chmod 200 test2
chmod g+rwx test2             chmod 270 test2
chmod =r test2                chmod 444 test2
chmod u+rwx,g+r,o+r test2     chmod 744 test2
chmod -r,-w,a+x test_1        chmod 111 test_1
In the first example, with mode 200, you assign write only for the owner. In the second example (270), you assign write for the owner, plus read, write, and execute for members of the same group, with no permissions for other users.
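The equivalence of the two forms can be checked in a scratch directory. This is a sketch only: it assumes GNU coreutils, where `stat -c %a` prints the octal mode (BSD systems use `stat -f %Lp` instead), and the file name is invented.

```shell
# Sketch: verify that symbolic and absolute chmod set the same mode.
# Assumes GNU coreutils' stat -c %a; scratch file name is invented.
cd "$(mktemp -d)"
touch test2
chmod 644 test2                 # start from a known mode
chmod u+rwx,g+r,o+r test2       # symbolic form
sym_mode=$(stat -c %a test2)
chmod 644 test2                 # reset to the starting mode
chmod 744 test2                 # absolute form
abs_mode=$(stat -c %a test2)
echo "$sym_mode $abs_mode"      # both should be 744
```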
Default File Permissions - umask
The default permissions for a file or directory are established when the file or directory is created, and are controlled by a value called the umask. You can display the current mask value as follows:

$ umask
022
$

The umask command enables you to change the default permissions applied when you create a file or directory. The umask is applied to the default permissions for both files and directories. For example, the default permissions for a file are 666, or read and write access for everyone on the system. This is rarely what you want, so the umask, here 022, is applied.
666
022
___
644

The operation here is not subtraction, although it often looks that way: the bits that are set in the umask are cleared from the default permissions (a bitwise AND with the complement of the mask). This example results in read and write for the owner, with read only for everyone else. The following example applies a umask value of 011.

666
011
___
666

In this case, a umask value of 011 has no effect because the execute bits are not turned on in the default. If you want read and write for the owner, with no access rights for anyone else, the umask value is 066.

666
066
___
600

This umask value removes the read and write bits for the group and other users, leaving the read and write bits for the owner intact. But the umask applies to directories as well, so if you customize the value, you must consider the impact on directories. The default permissions for a directory are 755, which gives the owner read, write, and search, with read and search for all other users. As with files, the actual default is 777, and the umask reduces it:

777
022
___
755

The preceding example shows the umask working as expected. How about the next case:
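The masking rule can be verified with shell arithmetic. This is purely illustrative: the result is the default mode ANDed with the complement of the mask, which is why 011 has no effect on 666.

```shell
# Sketch: umask clears bits; it does not subtract.
# result = default & ~mask  (bitwise AND with the complement)
mode1=$(printf '%03o' $(( 0666 & ~0022 )))   # 644
mode2=$(printf '%03o' $(( 0666 & ~0011 )))   # 666: no execute bits to clear
mode3=$(printf '%03o' $(( 0777 & ~0022 )))   # 755 for directories
echo "$mode1 $mode2 $mode3"
```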
777
011
___
766

This example means that the group and other users can read the directory's contents, but they cannot cd to this directory.

777
066
___
711

A umask value of 066 allows people to cd into a directory while preventing them from creating, listing, or removing files in it. If you want to prevent all access except your own, remove the read, write, and execute/search bits for all users except the owner. This is accomplished with a value of 077, which changes the default permission on a directory to 700.

777
077
___
700

In this example, directories are protected by preventing access for any user but the owner; the umask value of 077 likewise protects your files. The following example illustrates changing the umask and shows that files and directories created afterward receive the new permissions.

$ ls -l
total 5
drwxrwxrwx  2 chare  users  48 Aug 24 21:12 a2
-rw-rw-rw-  1 chare  users  25 Aug 24 21:28 alpha.code
-rw-rw-rw-  1 chare  users  29 Aug 24 21:28 test2
-rw-rw-rw-  1 chare  users  12 Aug 24 21:28 test_1
drwxrwxrwx  2 chare  users  32 Aug 24 22:19 uvA
$ umask
022
$ umask 077
$ mkdir new_dir
$ ls -ld new_dir
drwx------  2 chare  users  32 Aug 25 01:20 new_dir
$ touch new_file
$ ls -l new_file
-rw-------  1 chare  users   0 Aug 25 01:21 new_file
$
Advanced Permissions
Several advanced permissions are available in UNIX: Set User ID (SUID) and Set Group ID (SGID). The system carries two identification numbers for you: your real UID and your effective UID. The real UID matches the user name you logged in with; your effective UID is, in effect, an alias. For example, the file /etc/passwd, which contains information about your account, is protected by a set of file permissions. The following listing shows those permissions:

-rw-r--r--  1 root  1364 Apr 14 10:45 /etc/passwd
Now, if the /etc/passwd file is not writable by anyone but root, how can you change your password? Look at the passwd command, which is usually found in /bin but may be located elsewhere, as in the following example:

-rwsr-xr-x  3 root  303104 Mar 19 1991 /usr/bin/passwd
Notice that the permission bits are different on this program: an 's' appears where an 'x' would be in the owner's permissions. This is an SUID program. If you could run the id command while you were running the passwd command, you would see that your effective UID is root. So the SUID bit means that while you are running the program, you look like the owner of the program. The second special permission is SGID. An example of an SGID program is as follows:

-rwsr-sr-x  1 root  kmem  180224 Apr 5 1991 /usr/bin/mail

This command, /usr/bin/mail, is an SGID program (note the 's' in the group execute position as well as the owner's). This means that when a user runs /usr/bin/mail, he appears to be root and to belong to the group kmem.
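As a sketch, the SUID bit can be set on a scratch file of your own and observed with ls; the file name here is invented, and setting the bit on a non-executable scratch file is harmless because the kernel only honors it when the file is executed.

```shell
# Sketch: set the SUID bit on a scratch file and observe the 's'
# in the owner's execute position. Scratch file, not a system binary.
cd "$(mktemp -d)"
touch myprog
chmod 755 myprog
chmod u+s myprog                 # set SUID; 4755 in absolute form
perms=$(ls -l myprog | cut -c1-10)
echo "$perms"                    # -rwsr-xr-x
```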
The Sticky Bit
The third of the advanced permissions is called the sticky bit. It was originally a memory management tool: when set, it instructed the kernel to keep a copy of the program in RAM even if no one was using the program at the time. The sticky bit has a different meaning on directories than on files. The following example illustrates how the sticky bit is shown by ls:
$ ls -ld /tmp
drwxrwxrwt  2 root  root  512 Aug 15 14:03 /tmp
The sticky bit is shown by the letter 't' in the 'x' position of the 'other' permission bits. On a directory, it means that even though the directory has read and write for all users, you cannot remove a file from it unless you own that file.

The chown Command
Every user on the system is assigned a unique UID. Suppose you no longer need a particular file and want to give it to someone else. You can do this only if you are the owner of the file. To accomplish the ownership change, use the chown command, which has the following syntax:

$ chown user file(s)
The user name used in the chown command can be either a numerical UID, or the textual login name. The following code exemplifies chown in action: $ touch new $ /etc/chown patc new chown: can’t change ownership of new: Not owner $
As the above example shows, some versions of chown are restricted: if the user is not the super-user, the command cannot be executed successfully. The following session, from a version of System V UNIX, shows an unrestricted chown (the first attempt fails only because the user name is misspelled):

$ chown andreewg new
chown: unknown user id andreewg
$ chown andrewg new
$ ls -l new
-rw-r--r--  1 andrewg  group  25 Aug 24 21:28 new
$
The chgrp Command
Like chown, a command exists to let users change the group to which a file belongs. For example, if you want to allow a group of users to access a file, you may change the file's group to be the same as theirs. As with chown, only the owner of the file may change its group.

$ ls -l new
$ chgrp gopher new
chgrp: you are not a member of the gopher group
$ chgrp tech new
$ ls -l new
The chgrp command gives a user the opportunity to change the group that owns a file. As the previous example illustrates, changing the group on a file also requires that you be a member of the target group.
4.3 grep and find
Understanding the difference between grep and find
grep is roughly equivalent to the FIND command under DOS: both commands look for text in a file. grep is one of three commands in the grep family, namely grep, egrep, and fgrep. The find command, by contrast, looks for file names in the directory structure based upon a wide range of criteria such as file name, file size, permissions, and owner.
Using Regular Expressions
A regular expression is a method of describing a string using a series of metacharacters, much as in the shell. The metacharacters take on special meanings in the regular expression context; some of them overlap with the shell's metacharacters, although their meanings differ.
Wild Cards and Metacharacters
The table below shows the wild cards and metacharacters.

Character   Description
c           Any non-special character; c matches itself
\c          Turns off any special meaning of character c
^           Anchors the expression to the beginning of a line
^c          Any line in which c is the first character
$           Anchors the expression to the end of a line
c$          Any line in which c is the last character
.           Any single character
[...]       Any one of the characters in ...; ranges like a-z are legal
[^...]      Any single character not in ...; ranges are legal
\n          Whatever the nth \(...\) expression matched
r*          Zero or more occurrences of r
r+          One or more occurrences of r
r?          Zero or one occurrence of r
r1r2        r1 followed by r2
r1|r2       r1 or r2
\(r\)       Tagged regular expression
(r)         Regular expression r
The ^ operator anchors a pattern to the beginning of a line. For example, the pattern:

^the

matches occurrences in which the word 'the' is at the beginning of a line; the t must be the first character on the line. The $ operator anchors patterns to the end of a line, as in the following example:

the$

Although a special character that represents a newline does exist, no metacharacter is used to match only a newline; this operator does not count the newline in its search. In the preceding example, the word the is matched only when the letter e is in fact the last character on the line. The single-character wild card is a period (.). Consider the following example:

th.

This matches any string with the letters t and h followed by any one character. Any number of periods can be put together to match a string:

^the..
..th$

In these two examples, you are still looking for th plus some letters; both expect to find exactly two characters after (or before) the anchored pattern. The next metacharacter is the character class, which works the same way as in the shell: any single character in the group indicated by the class is matched. The table below shows some sample classes that can be used in either [...] or [^...].
Example        Description
[abc]          Matches one of a, b, or c
[a-z]          Matches any one lowercase letter between a and z
[A-Z]          Matches any one uppercase letter between A and Z
[0-9]          Matches any one digit between 0 and 9
[^0-9]         Matches any character other than a digit
[-0-9]         Matches any digit, or a "-"
[0-9-]         Matches any digit, or a "-"
[^-0-9]        Matches any character other than a digit or "-"
[a-zA-Z0-9]    Matches any alphabetic or numeric character
The [^...] operator indicates characters you do not want to match, as in the following example:

th[^ae]n

This does not match words like then and than, but does match thin. The next operator is the closure operator, *. It applies to the preceding pattern and matches any number of successive occurrences of that pattern. For example, the following pattern does not describe the string chare:

char*

It describes cha, char, charr, charrr, and so on. A closure applied to a character class, as in [a-z]*, matches zero or more lowercase letters; each matched character can be any letter from the class, not necessarily the same one each time.
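These anchors and classes can be tried directly with grep. The word list below is invented purely for illustration:

```shell
# Sketch: regular expression anchors and classes with grep.
# The word list is invented for illustration.
cd "$(mktemp -d)"
printf 'then\nthan\nthin\nbathe\n' > words.txt
m1=$(grep 'th[^ae]n' words.txt)   # only thin: e and a are excluded
m2=$(grep -c '^th' words.txt)     # 3 lines begin with th
m3=$(grep 'the$' words.txt)       # only bathe ends in the
echo "$m1 $m2 $m3"
```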
grep Command
grep stands for Global Regular Expression Print. The command itself is a filter: it accepts input and filters it to reduce the output. grep compares each line of its input against the pattern; if the pattern matches, the line containing the match is printed, otherwise no output is generated. The following code illustrates using grep to extract information from the password file:

$ grep chrish /etc/passwd
$ grep chare /etc/passwd
chare:A/49w7Ab:1003:104:Chris Hare:/home/chare:/usr/bin/ksh

This illustrates the format of the grep command, which is as follows:

grep pattern file(s)

In the first line of the preceding example, the pattern is chrish and the file is /etc/passwd. grep reads the contents of the file looking for chrish; because grep printed nothing, you know the file does not contain chrish. In the second line, the pattern is chare, and grep prints one matching line from /etc/passwd.
Using grep
The table below lists the options available for grep:

Option        Description
-b            Prints each line with its block number
-c            Prints only a count of matching lines
-e pattern    Used when the pattern starts with a -
-f filename   Takes the pattern(s) from the file
-h            Does not print the file name
-i            Ignores case of letters in the comparison
-l            Prints only the names of files with matching lines
-n            Prints line numbers
-s            Suppresses file error messages
-w            Searches for the expression as a word
-y            Ignores case of letters in the comparison
-v            Prints non-matching lines
-x            Prints only lines matched in their entirety
Counting occurrences (-c)
How many directors are there in emp1.lst and emp2.lst? The -c option counts the matching lines, and the following example reveals that there are two in each file:

$ grep -c 'director' emp?.lst
emp1.lst:2
emp2.lst:2
Displaying line numbers (-n)
The -n option displays the line numbers containing the pattern, along with the lines:

$ grep -n 'marketing' emp.lst
3:5347|sumit  |d.g.m    |marketing |19/04/43|8000
11:3254|dfdfg |director |marketing |12/04/35|8900

The line numbers are shown at the beginning of each line, separated from the actual line by a colon.
Deleting Lines (-v)
The -v option selects all but the lines containing the pattern. Thus you can create a file otherlist containing all but the directors:

$ grep -v 'director' emp.lst > otherlist
$ wc -l otherlist
11 otherlist
Displaying Filenames (-l) The –l option displays only the names of files where a pattern has been found: $ grep -l ‘manager’ *.lst design.lst emp.lst
Ignoring Case (-i)
When you look for a name but are not sure of its case, grep offers the -i option, which ignores case when matching patterns:

$ grep -i 'agarwal' emp.lst
3564|sudhir Agarwal |executive |personnel |06/07/47|5400

This locates the name "Agarwal", but it cannot match the names "agrawal" and "aggarwal", which are spelled in a similar manner with minor differences. With the -e option (SCO UNIX), you can match all three spellings by supplying multiple patterns to grep.
Printing the Neighbourhood
GNU grep in Linux has an option that locates not only the line matching the pattern, but also a certain number of lines above and below it. For instance, the command

grep -5 "do loop" update.sql

locates the string "do loop" and displays five lines on either side of it.
Printing a Specific Number of Lines (-N)
The -N option lets you know quickly whether a pattern occurs in a file. The following command displays the first occurrence of "Bill Gates" and exits to the shell:

grep -N1 "Bill Gates" nt_unix.txt

The numeric argument with -N controls the number of occurrences, so -N2 would list two occurrences.
egrep: Extending grep
The egrep command offers all the options of grep, and egrep's set includes some additional characters not used by either grep or sed.
Expression    Significance
ch+           Matches one or more occurrences of character ch
ch?           Matches zero or one occurrence of character ch
exp1|exp2     Matches expression exp1 or exp2
(x1|x2)x3     Matches expression x1x3 or x2x3
Searching for Multiple Patterns
How do you locate both "sengupta" and "dasgupta" in the file, something grep can do only with multiple -e options? Delimit the two expressions with a |, and the job is done:

$ egrep 'sengupta|dasgupta' emp.lst
2365|barun sengupta |director |personnel |11/05/47|7800
1265|s.n. dasgupta  |manager  |sales     |12/09/63|5600

egrep thus handles the problem easily, but offers an even better alternative: you can group patterns with a pair of parentheses, as well as the pipe:

$ egrep '(sen|das)gupta' emp.lst
2365|barun sengupta |director |personnel |11/05/47|7800
1265|s.n. dasgupta  |manager  |sales     |12/09/63|5600
The -f option: storing patterns in a file
When there are many patterns, egrep offers the -f option to take them from a file. Let's fill a file with some patterns:

$ cat pat.lst
sales
admin|accounts

This file must contain the patterns, delimited in the same way as they are specified on the command line. You then execute egrep with the -f option:

egrep -f pat.lst emp.lst

The command takes the expressions from pat.lst. egrep thus enhances the power of grep by accepting both alternative patterns and patterns stored in a file.
fgrep: Multiple String Searching
fgrep, like egrep, accepts multiple patterns, both from the command line and from a file; but unlike grep and egrep, it does not accept regular expressions. So if the pattern to search for is a simple string, or a group of them, fgrep is recommended. Alternative patterns in fgrep are separated from one another by the newline character. You may specify these patterns on the command line itself, or store them in a file in this way:

$ cat pat1.lst
sales
personnel
admin
Now you can use fgrep : fgrep -f pat1.lst emp.lst
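A self-contained sketch of the same idea follows; the employee records are invented, and on modern systems fgrep is a deprecated alias for grep -F, which is used here.

```shell
# Sketch: fixed-string searching with patterns taken from a file.
# Employee records are invented; grep -F is the modern spelling of fgrep.
cd "$(mktemp -d)"
printf '2365|barun|director|personnel\n1265|das|manager|sales\n3212|anil|clerk|admin\n' > emp.lst
printf 'sales\npersonnel\nadmin\n' > pat1.lst
matches=$(grep -cFf pat1.lst emp.lst)   # -c counts the matching lines
echo "$matches"                         # all three records match a pattern
```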
find Command
find is not a filter: it cannot be in the middle or at the end of a pipe, though it can provide input at the top of one. find searches the UNIX file system for files matching a given set of criteria. The command syntax for find is as follows:

find path predicate-list
The path is where find starts the search, and it must be specified. The predicate list consists of the search criteria and the commands you want to run on the located files. The following example illustrates find in action. The -print option prints the names of the files that are found. When find exits, it sets a return code that tells the shell whether it found any files.

$ find /usr -atime 10 -print
/usr/bin/id
/usr/bin/egrep
/usr/lib/ua/uasig
/usr/include/grp.h
/usr/include/pwd.h
/usr/include/stdio.h
/usr/include/sys/types.h
/usr/include/time.h
Apart from the -print option, the code also illustrates the -atime option, which finds files by access time. Three separate times are stored in the inode for each file; the find command above looks for files accessed in the last ten days. The following code illustrates two more options. The -name option requires an argument; the argument can use the file substitution wild cards, but if it does, it must be enclosed in quotes. This example looks for files that end in .ps, starting from the current directory. After the files are found, their names are printed and each file is written to the device specified with the -cpio option, in this case /dev/rmt0h. When all the file names have been printed, the number of blocks written to the device is printed.
$ find . -name "*.ps" -print -cpio /dev/rmt0h
./a.out.ps
./a1.ps
./a2.ps
./a3.ps
./a4.ps
50 blocks
$
find also has two Boolean-style operators, which let find match files based on more than one criterion. For example, the following code removes files named core or ending in .bak:

$ find / \( -name core -o -name "*.bak" \) -exec rm -f {} \;
The –ctime option is only different in that it looks at the date the inode information was last changed. This information generally is the permissions on the file. $ find / -ctime 3 -print /usr/adm /usr/spool/lp/request/Epson /usr/spool/uucp/SYSLOG /usr/spool/uucp/o.Log-WEEK $
Using the -exec option, you can execute any command on the found files:

$ find . -name "*.ps" -exec ls -l {} \;
-rw-------  1 chare  users  6356 Aug 25 19:25 ./a.out.ps
-rw-------  1 chare  users    66 Aug 25 19:25 ./a1.ps
-rw-------  1 chare  users   356 Aug 25 19:25 ./a2.ps
-rw-------  1 chare  users    56 Aug 25 19:25 ./a3.ps
-rw-------  1 chare  users  6356 Aug 25 19:25 ./a4.ps
$
This option instructs find to execute the named command on each file found. The tricky part is the syntax for the option. The syntax involves the command to be executed and its own options, a pair of curly braces, and a command terminator, as shown here: -exec command {} \;
The curly braces instruct find to substitute the name of each found file. For example, if the instruction

-exec ls -l {} \;

is used and find matches the file sampler.ps, then the exec instruction runs the command:

ls -l sampler.ps
The \; is used to terminate the instruction as it is passed to the shell.
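The -exec syntax can be tried safely in a scratch directory; the file names below are invented, and basename is substituted for a destructive command.

```shell
# Sketch: find with -exec in a scratch directory.
# {} is replaced by each found file; \; terminates the -exec instruction.
cd "$(mktemp -d)"
touch a1.ps a2.ps notes.txt
found=$(find . -name "*.ps" -exec basename {} \; | sort | xargs)
echo "$found"                    # only the .ps files are matched
```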
The following code examines finding files based upon the group that owns the file. This example looks for files that have a group ownership of mail.

$ find / -group mail -print
/bin/mail
/bin/rmail
/u/chare/FileCabinet/choreo/policy/usr/itools/frame/dead.letter
/usr/mail
/usr/mail/uucp
/usr/mail/chare
/usr/spool/uucp/LOGDEL
/usr/spool/uucp/SYSLOG
/usr/spool/uucp/Log-WEEK
/usr/spool/uucp/o.Log-WEEK
/usr/spool/uucp/LOGFILE
/usr/local/elm
/usr/local/filter
$
The following example looks at the -ok option, which is like the -exec option. The difference is that while -exec simply executes the command, -ok asks the user whether the command should be run for each file:

$ find . -name "*.ps" -ok ls -l {} \;
< ls ... ./a.out.ps >? n
< ls ... ./a1.ps >? y
-rw-------  1 chare  users  290 Aug 25 19:25 ./a1.ps
< ls ... ./a2.ps >? y
-rw-------  1 chare  users  390 Aug 25 19:25 ./a2.ps
< ls ... ./a3.ps >? n
< ls ... ./a4.ps >? n
$
The -type option enables you to locate files based upon the file type. The table below lists the valid arguments for the -type option.
Symbol    Description
b         Block special file
c         Character special file
d         Directory
f         Regular file
p         Named pipe (FIFO)
As illustrated in the following example, you can find only those files that are directories:
$ find . –type d -print ./Filecabinet ./wastebasket ./Clipboard $
4.4 Extracting data
Extracting data involves commands that manipulate data directly in files. Specifically, this section discusses controlling which part of a file you look at, using the commands head and tail; extracting information from a file using cut; and putting it back together using paste. The join command is like paste, but joins lines of text based upon a common field in each file.
head Command The head command is used to print the top of the file. The command syntax for head is as follows: $ head file(s) or $ head –num file(s)
The first format of the command prints the top ten lines of each of the named files. You can specify a line count and display, say, the first three lines of the file; use the - symbol followed by a numeric argument:

$ head -3 emp.lst
2365|barun sengupta |director |personnel  |11/05/47|7800
1265|s.n. dasgupta  |manager  |sales      |12/09/63|5600
6456|Damarla        |GM       |production |19/04/65|6000
You can use head to find out the record length by word counting the first line of the file: $ head -1 emp.lst | wc -c 58
tail command
The tail command displays the end of the file; when used without arguments, it provides the last ten lines. The last three lines are displayed in this way:

$ tail -3 emp.lst
8764|sudhir Agarwal |executive |personnel |06/07/67|7500
8765|sanju          |g.m      |marketing |12/05/78|8000
7657|nity           |g.m      |sales     |24/09/34|9000
tail has a -f option that enables you to monitor the growth of a file. The system administrator often uses tail -f to watch the log file written by the installation process of many software packages.

tail -f /oracle/app/oracle/product/7.3.2/orainst/install.log
The $ prompt does not return even after the file stops growing. With this option, you have to abort the process to exit to the shell; use the interrupt key applicable on your system.
cut and paste commands
The cut command cuts information from files, based upon character position or field within the file. The syntax for the cut command is as follows:

$ cut options files

The features of the cut command will be illustrated with specific reference to the file shortlist, which stores the first five lines of emp.lst:

$ head -5 emp.lst
2365|Barun Sen Gupta |Director  |Personnel  |11/05/47|7800
1265|Das Gupta       |Manager   |Sales      |12/09/63|5600
6456|Damarla         |GM        |Production |19/04/65|6000
8764|Sudhir Agarwal  |Executive |Personnel  |06/07/67|7500
8765|Sanju           |GM        |Marketing  |12/05/78|8000

cut can be used to extract specific columns from this file, say those signifying the name and designation. The name starts at column 6, while the designation follows it. Use cut with the -c option for cutting columns:

$ cut -c 6-22,24-32 shortlist
Barun Sen Gupta  Director
Das Gupta        Manager
Damarla          GM
Sudhir Agarwal   Executive
Sanju            GM

Files don't always contain fixed-length records, in which case it is better to cut fields rather than columns. Two options serve this purpose: -d for the field delimiter, and -f for specifying the field list. This is how you cut the second and third fields, saving a copy in cutlist1:

$ cut -d \| -f 2,3 shortlist | tee cutlist1
Barun Sen Gupta |Director
Das Gupta       |Manager
Damarla         |GM
Sudhir Agarwal  |Executive
Sanju           |GM
To cut out fields numbered 1, 4, 5 and 6, and save the output in cutlist2, follow a similar procedure: cut -d “|” -f 1,4- shortlist > cutlist2
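The field-cutting options can be tried on a tiny invented sample; the records below are illustrative, not the full emp.lst.

```shell
# Sketch: cutting fields with -d (delimiter) and -f (field list).
# Records are invented for illustration.
cd "$(mktemp -d)"
printf '2365|Barun Sen Gupta|Director\n1265|Das Gupta|Manager\n' > shortlist
names=$(cut -d '|' -f 2 shortlist | tr '\n' ',')   # second field only
ids=$(cut -d '|' -f 1 shortlist | tr '\n' ',')     # first field only
echo "$names"
echo "$ids"
```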
The paste Command
paste performs a special type of concatenation: it pastes files vertically rather than horizontally. cut was used above to create two files, cutlist1 and cutlist2, containing two cut-out portions of the same file. Using paste, you can join them laterally:
$ paste cutlist1 cutlist2
Barun Sen Gupta |Director   2365|Personnel  |11/05/47|7800
Das Gupta       |Manager    1265|Sales      |12/09/63|5600
Damarla         |GM         6456|Production |19/04/65|6000
Sudhir Agarwal  |Executive  8764|Personnel  |06/07/67|7500
Sanju           |GM         8765|Marketing  |12/05/78|8000

By default, paste uses the tab character for pasting files, but you can specify a delimiter of your choice with the -d option:

$ paste -d \| cutlist1 cutlist2
Barun Sen Gupta |Director  |2365|Personnel  |11/05/47|7800
Das Gupta       |Manager   |1265|Sales      |12/09/63|5600
Damarla         |GM        |6456|Production |19/04/65|6000
Sudhir Agarwal  |Executive |8764|Personnel  |06/07/67|7500
Sanju           |GM        |8765|Marketing  |12/05/78|8000

Even though paste needs at least two files to concatenate lines, the data for one of them can be supplied through standard input. If, for instance, cutlist2 did not exist, you could provide the character stream by cutting the first, fourth, fifth, and sixth fields from shortlist and piping the output to paste (the - stands for standard input):

$ cut -d \| -f 1,4- shortlist | paste -d "|" cutlist1 -
You can also reverse the order of pasting by altering the location of the - sign:

cut -d "|" -f 1,4- shortlist | paste -d "|" - cutlist1
join Command
The join command takes lines from two files and joins them together, based upon a common field or key. To use join, both files must share a common piece of data, called the key or primary field, and the files must be in the same sorted order. Let's look at the source files for an example:

$ cat list1.txt
BC   604
ALTA 403
SASK 306
MAN  204
$ cat list2.txt
BC   British Columbia
ALTA Alberta
SASK Saskatchewan
MAN  Manitoba

$ join list1.txt list2.txt
BC   604 British Columbia
ALTA 403 Alberta
SASK 306 Saskatchewan
MAN  204 Manitoba
To merge sorted copies of the two files and save the output, use "$ join list1.sort list2.sort > list" followed by "$ cat list", which gives the same output as above.
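The sorted-input requirement can be demonstrated end to end; the data is taken from the example above, and the file names list1.sort/list2.sort follow the text's convention.

```shell
# Sketch: join requires both inputs sorted on the join field,
# so sort the files first. Data from the example above.
cd "$(mktemp -d)"
printf 'BC 604\nALTA 403\nSASK 306\nMAN 204\n' > list1.txt
printf 'BC British Columbia\nALTA Alberta\nSASK Saskatchewan\nMAN Manitoba\n' > list2.txt
sort list1.txt > list1.sort
sort list2.txt > list2.sort
join list1.sort list2.sort > list   # key is the first field of each file
cat list
```

Note that the merged output comes out in sorted key order (ALTA first), not in the original file order.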
4.5 Redirection and Piping Redirection is a process of sending the output of a command to a place other than the terminal. A pipe allows the output of one command to be sent directly to the input of another command.
Standard Input, Standard Output and Standard Error
For every command, three files are opened: standard input, standard output, and standard error. Standard input is where the input for a command comes from, e.g., the keyboard. Standard output is where the output of the command is sent during or after processing, e.g., the video device connected to a terminal or workstation. Standard error is separate from standard output, even though it generally goes to the same place; it provides a channel for error messages to the user executing the command.
Redirection Redirection is used to connect the output of a command to a file or take input for the command from a file. Example:
$ command > file $ ls > dirlist
Input and output redirection can be used at the same time; they are not mutually exclusive as the following example illustrates. Example:
$ sort < infile > outfile
Here, sort reads data to be sorted from the file “infile” and puts the sorted data in the output file “outfile”. Redirection does not interact with other commands like pipes. Redirection can be used on the command line, shell scripts and in the cron command.
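The same combined redirection can be tried with a throwaway file; the file names and contents below are invented for illustration.

```shell
# Sketch: input and output redirection used together with sort.
# File names are invented for illustration.
cd "$(mktemp -d)"
printf 'pear\napple\nmango\n' > infile
sort < infile > outfile          # read from infile, write to outfile
first=$(head -1 outfile)
echo "$first"                    # apple sorts first
```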
Pipes Pipes are a method of connecting the output of one command to the input of another, i.e., it connects the output of the command on the left of the pipe with the input of the command on the right. Any number of commands can be grouped together in a pipe. Example:
$ ls | sort | pg $ grep word file | wc
Using redirection (output) will involve a temporary file for storing the output and then redirecting (input) it to another command. The pipe can avoid the temporary file. Example: Using redirection, $ ls > list $ sort < list Using pipes, the same can be accomplished by the command.
$ ls | sort

The advantage of using pipes is avoiding a large number of temporary files in large shell scripts. Example:
$ cat file | sort $ sort file $ sort < file
All these commands accomplish the same thing. The difference is that the first must start the cat command, open the file, and pipe the output to sort, whereas the second and third simply open the file and sort it. Since the end result can be achieved with one command, the first form is inefficient.
tee command The tee command is used in the pipeline to save the output in a file. This command sends a copy of its input to standard output, and another copy to the file named on the command line. Example:
$ ls | tee list | sort
The output of ls is saved in the file list and also goes on as the input to sort. If there were no pipe, the output of ls would go to standard output. tee is a useful command when the end result of a pipe is not what you expect: inserting tee at various places in the pipeline enables you to look at the output of the different commands and determine where the problem is.
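The split performed by tee can be seen with a small invented stream; printf stands in for ls so the input is deterministic.

```shell
# Sketch: tee saves a copy of the stream while passing it down the pipe.
# printf stands in for ls so the input is deterministic.
cd "$(mktemp -d)"
printf 'b\na\nc\n' | tee copy.txt | sort > sorted.txt
saved=$(head -1 copy.txt)        # copy.txt holds the unsorted stream
sorted=$(head -1 sorted.txt)     # sorted.txt holds the pipe's end result
echo "$saved $sorted"
```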
4.6 Sorting and Comparing
UNIX sort performs its usual function and has several options. When the command is invoked without options, the entire line is sorted. Consider the emp.lst file discussed in the previous section:

$ sort emp.lst
2233|anbu  |GM       |sales      |12/12/52|6000
2365|barun |director |personnel  |11/05/47|7800
9876|jai   |director |production |12/03/50|7000

Sorting starts with the first character of each line, and proceeds to the next character only when the characters in two lines are identical. Like cut and paste, sort also works on fields, and the default field separator is the space character. The -t option, followed by the delimiter, overrides the default:

$ sort -t \| +1 emp.lst
2233|anbu  |GM       |sales      |12/12/52|6000
2365|barun |director |personnel  |11/05/47|7800
9876|jai   |director |production |12/03/50|7000
The argument +1 indicates that sorting should start after skipping the first field. To sort on the third field, you should use
sort -t "|" +2 emp.lst
The sort order can be reversed with the -r option. The following sequence reverses the previous sorting order:
$ sort -t "|" -r +1 emp.lst
9876|jai   |director |production|12/03/50|7000
2365|barun |director |personnel |11/05/47|7800
2233|anbu  |GM       |sales     |12/12/52|6000
This sequence can also be written as:
sort -t "|" +1r emp.lst
Since sort is also a filter, the sorted output can be redirected to a file with the ">" operator. Alternatively, the -o option names the output file directly, and even lets you sort a file and write the result back to the same file:
sort -o sortedlist +3 emp.lst
sort -o emp.lst emp.lst
And if you want to check whether the file has actually been sorted, you must use the -c (check) option, which reports the first line found out of order:
$ sort -c emp.lst
sort: disorder: 9876|jai   |director |production|12/03/50|7000
Sorting on a Secondary key: You can sort on more than one field by providing a secondary key. If the primary key is the third field, and the secondary key is the second field, you can use
$ sort -t "|" +2 -3 +1 emp.lst
2365|barun |director |personnel |11/05/47|7800
9876|jai   |director |production|12/03/50|7000
2233|anbu  |GM       |sales     |12/12/52|6000
This sorts the file by designation and name. -3 indicates stoppage of sorting after the third field, and +1 indicates its resumption after the first field. To resume sorting from the first field, use +0.
Sorting on columns: You can also specify a character position within a field as the beginning of the sort. If you are to sort the file according to the year of birth, you need to sort on the seventh and eighth positions within the fifth field:
$ sort -t "|" +4.6 -4.8 emp.lst
+4.6 signifies the starting sort position, that is, the seventh column of the fifth field. Similarly, -4.8 implies that sorting should stop after the eighth column of the same field.
Numeric Sort: When sort acts on numerals, strange things can happen. When you sort a file containing only numbers, you get a curious result:
$ sort numfile
10
2
24
4
This is probably not what you expected. This behavior can be overridden with the -n option:
$ sort -n numfile
2
4
10
24
Removing Duplicate Lines: The -u option lets you purge duplicate lines from a file. There are also commands whose purpose is to determine the differences between two files that appear to be the same.
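A minimal sketch of the -u option, using a throwaway file (the name fruits.tmp is illustrative):

```shell
# Build a file containing a duplicate line, then purge it with -u.
printf 'apple\nbanana\napple\nkiwi\n' > fruits.tmp
sort -u fruits.tmp      # each distinct line appears exactly once, sorted
rm -f fruits.tmp
```

This is equivalent to piping the sorted output through uniq, but needs only one command.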
diff Command
The diff command reports the differences that exist between two files. The syntax is as follows:
$ diff options file1 file2
The options redefine some of the output presented by diff. If one argument is a directory, diff looks in that directory for a file with the same name as the other argument and compares the two. Otherwise, both arguments must be file names. The following code shows the output of diff:
$ cat dfile1
This is a small file which will be used to test diff
$ cat dfile2
This is a SMALL file that can be used to illustrate how diff operates
$ diff dfile1 dfile2
1c1
< This is a small file which will be used to test diff
---
> This is a SMALL file that can be used to illustrate how diff operates
$
In this example, the first bit of output is a line of three characters, 1c1. This is modeled on the ed command used to synchronize the two files: it says that line 1 of the first file would be changed to match line 1 of the second file. The lines following this command are the lines affected by it. Lines from the first file are preceded by a <, and lines from the second file are preceded by a >. The -e option instructs diff to create an ed command script that recreates file2 from file1. For example:
$ diff -e dfile1 dfile2 > cfile
$ cat cfile
1c
This is a SMALL file that can be used to illustrate how diff operates
.
$
comm Command
The comm command selects or rejects lines common to two files. Both files must be sorted. For example:
$ sort c1 > c1s
$ sort c2 > c2s
$ cat c1s
apple
banana
tomato
$ cat c2s
apples
kiwi
$ comm c1s c2s
apple
	apples
banana
	kiwi
tomato
$
The comm command prints three columns of output. The first column contains lines found only in the first file. The second column contains lines found only in the second file. The third column contains lines common to both files.
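comm also accepts the -1, -2 and -3 options to suppress the corresponding columns, which makes its output easier to process. A sketch with illustrative file names (both inputs must already be sorted):

```shell
# comm requires sorted input; both temporary files below are sorted.
printf 'apple\nbanana\ntomato\n' > c1s.tmp
printf 'apples\nbanana\nkiwi\n'  > c2s.tmp
comm -12 c1s.tmp c2s.tmp   # column 3 only: lines common to both files
comm -23 c1s.tmp c2s.tmp   # column 1 only: lines unique to the first file
rm -f c1s.tmp c2s.tmp
```

Here comm -12 prints banana, while comm -23 prints apple and tomato.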
uniq Command
uniq is meant to find the repeated lines in a file. This command reads the input and compares adjacent lines, looking for duplicate entries; because only adjacent lines are compared, the input is normally sorted first. The second and subsequent copies of a repeated line are removed, and the remainder is written to either standard output or a file. The syntax is as follows:
$ uniq options input-file output-file
The output-file is optional. For example:
$ sort c3 > c3s
$ cat c3s
apples
apples
banana
kiwi
tomato
$ uniq c3s
apples
banana
kiwi
tomato
$
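Two further uniq options are worth noting: -d prints only the duplicated lines, and -c prefixes each line with its repeat count. A sketch with an illustrative file name:

```shell
# -d shows only the lines that occur more than once;
# -c prefixes every distinct line with its repeat count.
printf 'apples\napples\nbanana\nkiwi\n' > c3s.tmp
uniq -d c3s.tmp
uniq -c c3s.tmp
rm -f c3s.tmp
```

With this input, uniq -d prints just "apples", and uniq -c reports a count of 2 for apples and 1 for the others.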
Chapter 5 5. The vi editor No matter what work you do with the UNIX system, you will eventually write some C programs or shell (or perl) scripts. You may have to edit some of the system files at times. If you are working on databases, you will also need to write SQL query scripts, procedures and triggers. For all this, you must learn to use an editor, and UNIX provides a very versatile one – vi. vi is a full-screen editor now available with all UNIX systems, and is widely acknowledged as one of the most powerful editors available in any environment. Another contribution of the University of California, Berkeley, it owes its origin to William (Bill) Joy, a graduate student who wrote this unique program. It became extremely popular, leading Joy to later remark that he wouldn’t have written it had he known that it would become famous! vi offers cryptic, and sometimes mnemonic, internal commands for editing work. It makes complete use of the keyboard, where practically every key has a function. vi has innumerable features, and it takes time to master most of them. You don’t need to do that anyway. As a beginner, you shouldn’t waste your time learning the frills and nuances of this editor. Editing is a secondary task in any environment, and a working knowledge is all that is required initially. Linux features a number of “vi” editors, of which vim (improved) is the most common. Apart from vi, there are xvi, nvi and elvis that have certain exclusive functions not found in the Berkeley version. All of them, barring nvi, let you split up the screen into multiple windows.
The Three Modes
A vi session begins by invoking the command vi with (or without) a file name:
vi visfile
You are presented a full empty screen, each line beginning with a tilde (~). This is vi's way of indicating that these are non-existent lines. For text editing, vi uses 24 of the 25 lines that are normally available in a terminal. The last line is reserved for some commands that you can enter to act on the text. This line is also used by the system to display messages. The filename appears in this line with the message "visfile" [New file]. When you open a file with vi, the cursor is positioned at the top left-hand corner of the screen. You are said to be in the command mode. This is the mode where you can pass commands to act on the text, using most of the keys of the keyboard. Pressing a key doesn't show it on screen, but may perform a function like moving the cursor to the next line, or deleting a line. You can't use the command mode to enter or replace text. There are two command mode functions that you should know right at this stage – the spacebar and the backspace key. The spacebar takes you one character ahead, while the backspace key takes you a character back. Backspacing in this mode doesn't delete the text at all. To enter text, you have to leave the command mode and enter the input mode. There are ten keys which, when pressed, take you to this mode, and whatever you enter shows up on the screen.
Backspacing in this mode, however, erases all characters that the cursor passes through. To leave this mode, you have to press the <Esc> key. Sometimes you have to save your file, switch to editing another file, or make a global substitution in the file. Neither of the other two modes can do this work for you. You then have to use the ex mode or line mode, where you can enter an instruction in the last line of the screen. Some command mode functions also have ex mode equivalents. In this mode, you can see only one line at a time, as you do when using EDLIN in DOS. With this knowledge, we can summarize the three modes in which vi works:
1. Input Mode – where any key depressed is entered as text.
2. Command Mode – where keys are used as commands to act on text.
3. ex Mode – where ex mode commands can be entered in the last line of the screen to act on text.
The relationship between these three modes is depicted in the figure.
[Figure omitted: a diagram of the three vi modes. The input keys take you from command mode to edit (input) mode, <Esc> returns you to command mode, and : with <Enter> moves between command mode and ex mode.]
Figure – 6 vi Modes
5.1 Command mode This is the mode you come to when you have finished entering or changing your text. When you press a key in the command mode, it doesn’t show up on the screen, but simply performs its function. That is why you can see changes on the screen without knowing the command that has caused them.
5.2 Ex mode
SAVING TEXT AND QUITTING – THE ex MODE
When you edit a file using vi, or, for that matter, any editor, the original file isn't disturbed as such; rather, a copy of it is placed in a buffer. From time to time, you should save your work by writing the buffer contents to disk. You may also need to quit vi after, or without, saving the changes.
Saving your Work
To enter any command in this mode, first press : (which appears at the last line of the screen), then type the corresponding ex mode command, and finally press the <Enter> key. To save a file and remain in the editing mode, use the w (write) command:
:w<Enter>
"sometext", 8 lines, 275 characters
The message shows the name of the file, along with the number of lines and characters saved.
Saving and quitting
The above command keeps you in the command mode so that you can continue editing. To save and quit the editor, use the x command instead:
:x<Enter>
"sometext", 8 lines, 303 characters
$
5.3 Edit mode
Input Mode – Adding and Replacing Text
If you are a beginner to vi, it's better to issue the following command after invoking vi, and before you start editing:
:set showmode<Enter>
Enter a : (the ex mode prompt), followed by the words set showmode, and then the <Enter> key. This is a command in the ex mode, and when you enter the :, you will see it appearing in the last line of the screen. This command sets one of the parameters of the vi environment, and displays a suitable
message whenever the input mode is invoked. The message appears at the bottom line of the screen and is quite self-explanatory. This showmode setting is not necessary when using vim in Linux, which sets it by default. Before you attempt to enter text into the file, you need to change the default command mode to input mode. There are several methods of entering this mode, depending on the type of input you wish to key in, but in every case the mode is terminated by pressing the <Esc> key.
Insertion of Text The simplest type of input is insertion of text. Whether the file contains any text or not, when vi is invoked, the cursor is always positioned at the first character of the first line. To insert text at this position, press i
# Existing text will be shifted right
The character doesn't show up on the screen, but pressing this key changes the mode from command to input. Since the showmode setting was made at the beginning (with :set showmode), you will see the words "INSERT MODE" at the bottom right corner of the screen. Further key depressions will result in text being entered and displayed on the screen. Start inserting a few lines of text, each line followed by <Enter>. The lines containing text, along with the "empty" lines (actually non-existent lines, shown with a ~ against each), approximate the screen shown here. The cursor is now positioned at the last character of the last line. This is known as the current line, and the character where the cursor is stationed is known as the current cursor position. If you notice a mistake in this line, you can use the backspace key to erase any inserted text, one character at a time. The input mode is terminated by pressing the <Esc> key, which takes you back to the command mode. You started insertion with i, which puts text at the left of the cursor position. If the i command is invoked with the cursor positioned on existing text, text on its right will be shifted further without being overwritten. The insertion of text with i is shown in the figure, along with the position of the cursor. There are other methods of inputting text. To append text to the right of the cursor position, use
a
# Existing text will also be shifted right
followed by the text you wish to key in. After you have finished editing, press <Esc>. With i and a, you can append several lines of text in this way. They also have uppercase counterparts performing similar functions: I inserts text at the beginning of a line, while A appends text at the end of a line.
Opening a New Line You can also open a new line by positioning the cursor at any point in a line and pressing o
# Opens a new line below the current line
This inserts an empty line below the current line.
O also opens a line, but above the current line. In either case, the showmode setting tells you that you are in the input mode. You are free to enter as much text as you choose, spanning multiple lines if required. Press the <Esc> key after completing text input.
Replacing Text
Text is replaced with the r, R, s and S keys. To replace one single character with another, you should use
r
# No <Esc> required
followed by the character that replaces the one under the cursor. You can replace a single character only in this way. vi momentarily switches from the command mode to the input mode when r is pressed. It returns to the command mode as soon as the replacing character is entered. There is no need to press the <Esc> key when using r, followed by the character, since vi expects a single character anyway. To replace more than a single character, use
R
# Replaces text as cursor moves right
followed by the text. Existing text will be overwritten as the cursor moves forward. This replacement is, however, restricted to the current line only. The s key replaces a single character with text irrespective of its length. S replaces the entire line irrespective of the cursor position.
Chapter 6
6. Shell Programming
6.1 Variables
Like every programming language, the shell offers the facility to define and use variables in the command line. These variables are called shell variables. Shell variables are assigned with the = operator, but evaluated by prefixing the variable name with a $. Example:
$ x=37
$ echo $x
37
All shell variables take on the generalized form variable=value. They are of the string type, which means that the value is stored in ASCII rather than in binary format. When the shell reads the command line, it interprets any word preceded by a $ as a variable, and replaces the word by the value of the variable. All shell variables are initialized to null strings by default. For example, echoing an unassigned variable produces a blank line:
$ echo $xyz

$
Null strings can also be assigned explicitly by either of the following:
x=
x=''
To assign multi-word strings to a variable, you should quote the value:
$ msg='you have mail' ; echo $msg
you have mail
A variable name can consist of letters of the alphabet, numerals and the underscore character. The first character must be a letter. The shell is sensitive to case; the variable x is different from X. The shell also accepts a pair of curly braces enclosing a variable name. Here is an alternative form of evaluating the variable fname (assumed to have been set earlier to emp.sh):
$ echo ${fname}
emp.sh
This form has certain advantages; you can tag a string to it without needing to quote it. This way you can generate a second set of file names by affixing the character x to each one:
$ echo ${fname}x
emp.shx Variables are concatenated by placing them adjacent to one another. $ x=abcd ; y=efgh $ z=$x$y $ echo $z abcdefgh
Applications of Shell Variables
Used in an intelligent manner, shell variables can speed up your interaction with the system. You can easily assign the path name /usr/kumar/progs/data to a variable, and then use its shorthand representation:
$ pn='/usr/kumar/progs/data'
$ cd $pn
$ pwd
/usr/kumar/progs/data
A shell variable can even replace the command itself. When a command is assigned to a variable, the variable can be executed by simply specifying the $-prefixed variable as the first word of the command line:
$ count="wc unit01 unit02"
$ $count
436 6463 37986 unit01
892 8273 48420 unit02
1318 14736 86406 total
You can also use the feature of command substitution to set variables. For instance, if you were to set the complete pathname of the present directory to a variable mydir, you could use
$ mydir=`pwd`
$ echo $mydir
/usr/kumar
6.2 Command-Line arguments
Shell procedures accept arguments in the command line. This non-interactive method of specifying arguments is quite useful for scripts requiring few inputs. It also forms the basis of developing tools that can be used with redirection and pipelines. When arguments are specified with a shell procedure, they are assigned to certain special "variables", or rather positional parameters. The first argument is read by the shell into the parameter $1, the second argument into $2, and so on. In addition to these positional parameters, there are a few other special parameters used by the shell. The next script illustrates these features:
$ cat emp2.sh
echo "program: $0
The number of arguments specified is $#
The arguments are $*"
grep "$1" $2
echo "\nJob Over"
The parameter $* stores the complete set of positional parameters as a single string. $# is set to the number of arguments specified. This lets you design scripts that check whether the right number of arguments have been entered. The command name itself is stored in the parameter $0. Invoke this script with the pattern "director" and the file name emp1.lst as the two arguments:
$ emp2.sh director emp1.lst
program: emp2.sh
The number of arguments specified is 2
The arguments are director emp1.lst
1006|chanchal singhvi |director |sales     |03/09/38|6700
6521|lalit chowdury   |director |marketing |26/09/45|8200
Job Over
In this way, the first word is assigned to $0, the second word to $1, and the third word to $2. You can use more positional parameters in this way up to $9 (and, using the shift statement, you can go beyond).
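The shift statement mentioned above can be sketched as follows; set -- is used here only to simulate command-line arguments, and the echoed labels are illustrative:

```shell
# Simulate four positional parameters, then shift them left by one.
set -- one two three four
echo "first argument: $1 (count: $#)"
shift                     # $1 is discarded; every argument moves left
echo "first argument: $1 (count: $#)"
```

The first echo reports "one" with a count of 4; after shift, the same echo reports "two" with a count of 3.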
6.3 Decision-making constructs In any programming language, the capability to make decisions and alter the program flow based upon those decisions is a requirement to perform any work.
The if Command
The if statement takes two-way decisions, depending on the fulfillment of a certain condition. In the shell, the statement uses the following form:
if condition is true
then
    execute commands
else
    execute commands
fi
if evaluates a condition that accompanies its "command line". If the condition is fulfilled, the sequence of commands following it is executed. Every if must have a corresponding fi. The else statement, if present, specifies the action in case the condition is not fulfilled; it is not always required. All UNIX commands return a value. In the next example, grep is first executed, and if uses its return value to control the program flow:
$ if grep "director" emp.lst
> then echo "pattern found – Job Over"
> else echo "pattern not found"
> fi
Numeric Comparison with test
When you use if to evaluate expressions, the test statement is often used as its control command. test uses certain operators to evaluate the condition on its right, and returns either a true or false exit status, which is then used by if for taking decisions. The relational operators have a different form when used by test. They always begin with a - (hyphen), followed by a two-character word, and are enclosed on either side by whitespace. The complete set of operators is shown in the given table:
Operator    Meaning
-eq         Equal to
-ne         Not equal to
-gt         Greater than
-ge         Greater than or equal to
-lt         Less than
-le         Less than or equal to
test doesn't display any output, but simply returns a value, which is assigned to the parameter $?. For example:
$ x=5; y=7; z=7.2
$ test $x -eq $y ; echo $?
1
$ test $x -lt $y ; echo $?
0
$ test $z -gt $y ; echo $?
1
$ test $z -eq $y ; echo $?
0
The last two tests expose a limitation: test handles only integers, so the decimal portion of $z is ignored. You can now use test in the command line of the if conditional. The next script uses three arguments to take a pattern, as well as the input and output filenames. First it checks whether the right number of arguments have been entered:
$ cat emp3.sh
if test $# -ne 3
then
    echo "you have not keyed in 3 arguments"
    exit 3
else
    if grep "$1" $2 > $3
    then
        echo "pattern found – Job Over"
    else
        echo "pattern not found – Job Over"
    fi
fi
Here, you have two if constructs, each terminated with its fi. One if is nested within the other. When you run this script with one, or even no, argument, the test condition evaluates to true and the error message is displayed; with exactly three arguments, grep is executed. The second if construct then tests grep's exit status, and echoes a suitable message.
Shorthand for test
test is so widely used that fortunately there exists a shorthand method of executing it. A pair of rectangular brackets enclosing the expression can replace the word test. Thus the following two forms are equivalent:
test $x -eq $y
[ $x -eq $y ]
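Because [ is itself a command name, the whitespace after [ and before ] is mandatory. A minimal sketch, assuming a POSIX shell:

```shell
x=5; y=7
if [ $x -lt $y ]            # note the spaces around the brackets
then
    echo "x is smaller"
fi
[ $x -eq $y ] ; echo $?     # the failed comparison returns exit status 1
```

Writing [$x -lt $y] without the spaces would make the shell search for a command literally named "[5", and the test would fail with an error.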
if – elif: Multi-way Branching if also permits multi-way branching; you can evaluate more conditions if the previous condition fails. The format is if-then-elif-then-else-fi, where you can have as many elif as you want, while the else remains optional. For example:
if [ $# -ne 3 ] ; then
    echo "You have not keyed in 3 arguments" ; exit 3
elif grep "$1" $2 > $3 2>/dev/null ; then
    echo "pattern found – Job Over"
else
    echo "pattern not found – Job Over" ; rm $3
fi
All these scripts have a serious shortcoming: they don't indicate why a pattern wasn't found. The "pattern not found" message appears even if the input file doesn't exist at all, because the redirection of the diagnostic stream with 2>/dev/null ensures that grep's complaints are never seen on the terminal.
Test: String Comparisons
test can also compare strings, with a separate set of operators. Equality is tested with =, while the C-type operator != checks for inequality. The table lists the string handling tests:

Test        Exit Status
-n stg      True if string stg is not a null string
-z stg      True if string stg is a null string
s1 = s2     True if string s1 is equal to s2
s1 != s2    True if string s1 is not equal to s2
stg         True if string stg is assigned and not null
You can use the string comparison features in the next script to check whether the user actually enters a string, or simply presses the <Enter> key:
$ cat emp4.sh
echo "Enter the string to be searched: \c"
read pname
if [ -z "$pname" ] ; then
    echo "You have not entered the string" ; exit 1
else
    echo "Enter the file to be used: \c"
    read flname
    if [ ! -n "$flname" ] ; then
        echo "You have not entered the filename" ; exit 2
    else
        grep "$pname" "$flname" || echo "pattern not found"
    fi
fi
The script checks whether the user actually enters something when it pauses at two points: once while accepting the pattern, and then while accepting the filename. Note that the check for a null string is made with [ -z "$pname" ], as well as with [ ! -n "$flname" ]. The script aborts if either input is a null string:
$ emp4.sh
Enter the string to be searched: director
Enter the file to be used: <Enter>
You have not entered the filename
$ emp4.sh
Enter the string to be searched: director
Enter the file to be used: emp1.lst
1006|iuyrhb  |director |sales     |03/09/38|6700
6532|hg udhf |director |marketing |26/09/45|8200
test also permits the checking of more than one condition in the same line, using the -a (AND) and -o (OR) operators. You can now simplify the above script to illustrate this feature:
if [ -n "$pname" -a -n "$flname" ] ; then
    grep "$pname" "$flname" || echo "pattern not found"
else
    echo "At least one input was a null string"
    exit 1
fi
The test output is true only if both variables are non-null strings, i.e., the user enters some non-whitespace characters each time the script pauses.
Test: File Tests
test can also check various file attributes. For example, you can test whether a file has the necessary read, write or execute permission. The table lists the file-related tests:

Test       Exit Status
-e file    True if file exists
-f file    True if file exists and is a regular file
-r file    True if file exists and is readable
-w file    True if file exists and is writable
-x file    True if file exists and is executable
-d file    True if file exists and is a directory
-s file    True if file exists and has a size greater than zero
Any of the test options can be negated by the ! operator. Thus, [ ! -f file ] negates [ -f file ]. The file testing syntax used by test is quite compact, and you can test some attributes of the file emp.lst at the prompt:
$ ls -l emp.lst
-rw-rw-rw-   1 kumar group 870 Jun 8 15:52 emp.lst
$ [ -f emp.lst ] ; echo $?
0
$ [ -x emp.lst ] ; echo $?
1
$ [ ! -w emp.lst ] || echo "False that file is not writable"
False that file is not writable
Using these features, you can design a script that accepts a filename as argument, and then performs a number of tests on it:
$ cat filetest.sh
if [ ! -e $1 ] ; then
    echo "File does not exist"
elif [ ! -r $1 ] ; then
    echo "File is not readable"
elif [ ! -w $1 ] ; then
    echo "File is not writable"
else
    echo "File is both readable and writable"
fi
Test the script with two filenames:
$ filetest.sh emp3.lst
File does not exist
$ filetest.sh emp.lst
File is both readable and writable
The case Conditional
The case statement is the second conditional offered by the shell. The statement matches an expression against more than one alternative, and uses a compact construct to permit multi-way branching. The general syntax of the case statement is as follows:
case expression in
    pattern1) execute commands ;;
    pattern2) execute commands ;;
    pattern3) execute commands ;;
    ……
esac
case matches the expression against pattern1 first, and if successful, executes the commands associated with it. If the match fails, it falls through and tries pattern2, and so on. Each command list is terminated by a pair of semicolons, and the entire construct is closed with esac. For example, you can devise a script menu.sh, which accepts values from 1 to 5, and performs some action depending on the number keyed in:
$ cat menu.sh
echo "
        MENU\n
1. List of files\n2. Processes of user\n3. Today's date
4. Users of system\n5. Quit to UNIX\nEnter your option: \c"
read choice
case "$choice" in
    1) ls -l ;;
    2) ps -f ;;
    3) date ;;
    4) who ;;
    5) exit ;;
esac
The five menu choices are displayed with a multi-line echo statement. case matches the value of the variable $choice for strings 1, 2, 3, 4 and 5. It then relates each value to a command that has to be executed. The same logic can also be implemented using the if statement, but the case certainly is more compact. However, case can’t handle relational and file tests: it can only match strings. It’s also most effective when the string is fetched by command substitution.
Matching Multiple Patterns
case can also match more than one pattern. Programmers frequently encounter a logic that has to test a user response for both y and Y (or n and N). To implement this logic with if, you need the compound condition feature:
if [ "$choice" = "y" -o "$choice" = "Y" ]
case, on the other hand, has a quite compact solution. Like egrep, it uses the | to delimit multiple patterns. Thus the expression y|Y can be used to match both upper and lower case:
echo "Do you wish to continue? (y/Y): \c"
read answer
case "$answer" in
    y|Y) ;;
    n|N) exit ;;
esac
Wild Cards: case Uses Them
Like the shell's wild-cards, case also uses the filename matching metacharacters *, ? and the character class. However, they are used by case for string matching only. You can match a four-character string with the pattern ????, and if it must contain numerals only, [0-9][0-9][0-9][0-9] will be just right. For example:
case "$answer" in
    [yY][eE]*) ;;
    [nN][oO])  exit ;;
    *)         echo "Invalid response" ;;
esac
The null Command
The : command is a no-op; the shell does nothing when it encounters this command. It is often used as the first line of a Bourne or Korn shell program. The null command can also be used to hold a place in shell scripts. For example, as you test a script and insert if statements, you can use the null command as the command to be executed while you work things out:
if [ ! "$1" ]
then
    :   # we'll add this code later
fi
The && and || Constructs
The && executes the command following it when the previous command returns true. For example:
who | grep "chare" 2>/dev/null && echo "chris is logged on"
If the first command returns true, as it would if the user chare is logged in, the echo statement is executed, informing the user running the command that chare is on the system. The || is used when the command on the left of the symbol returns a false value:
who | grep "chare" 2>/dev/null || echo "chris is not logged on"
In this example, if the user chare is not logged on, the echo statement is printed. These operators are often used with the test command, so that a command executes only when the test succeeds (with &&) or only when it fails (with ||):
[ -f $file ] && more $file
The preceding line executes the more command on the specified file if the test command says that it is a regular file.
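The behavior of && and || can be sketched with a throwaway file (the name demo.tmp is illustrative):

```shell
tmp=demo.tmp
: > "$tmp"                             # the null command creates an empty file
[ -f "$tmp" ] && echo "regular file"   # test succeeds, so echo runs
[ -s "$tmp" ] || echo "zero size"      # test fails, so echo runs
rm -f "$tmp"
```

Both messages are printed: the file exists (so && fires) but has zero size (so || fires).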
6.4 Looping Constructs
At this point, you need the ability to alter the flow of commands within a script and to execute the same commands over and over again. The commands that provide this are while, for, and until.
The while Command
The while statement should be quite familiar to most programmers. It repeatedly performs a set of instructions as long as the control command returns a true exit status. The general form of this command is as follows:
while condition is true
do
    execute commands
done
The set of instructions enclosed by do and done is performed as long as the condition remains true. For example, the emp5.sh script accepts a code and description in the same line, writes the line out to newlist, and then prompts you for more entries:
$ cat emp5.sh
# Program: emp5.sh
answer=y
while [ "$answer" = "y" ]
do
    echo "Enter the code and description: \c"
    read code description
    echo "$code|$description" >> newlist
    echo "Enter any more (y/n)? \c"
    read anymore
    case $anymore in
        y*|Y*) answer=y ;;
        n*|N*) answer=n ;;
        *)     answer=y ;;
    esac
done
There are situations when a program needs to read a file that is created by another program. The monitfile.sh script periodically checks the disk for the existence of the file, and then executes a program once the file has been located.
$ cat monitfile.sh
while [ ! -r invoice.lst ]
do
    sleep 60
done
invoice_alloc.pl
The loop executes as long as the file invoice.lst can't be read. Once the file becomes readable, the loop terminates and the program invoice_alloc.pl is executed. This shell script is an ideal candidate to be run in the background, like this:
monitfile.sh &
The sleep command is used to introduce a delay in shell scripts.
The until Command
until complements while: the loop body is executed as long as the expression evaluates false. The format of the until command is as follows:
until expr
do
    commands
done
As long as the expression evaluates false, the commands between do and done are executed. When the expression evaluates true, the loop ends and the commands are no longer executed.
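A minimal sketch of an until loop, counting down from an assumed starting value of 3:

```shell
# The loop body runs while the condition is FALSE, i.e. while i is non-zero.
i=3
until [ $i -eq 0 ]
do
    echo "i = $i"
    i=`expr $i - 1`    # expr performs the integer arithmetic
done
```

The loop prints i = 3, i = 2 and i = 1, then stops the moment the test [ $i -eq 0 ] finally succeeds.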
The for Command
The for command is used for processing a list of items. The syntax of the command is as follows:
for var in word1 word2 word3
do
    commands
done
Each word on the command line is assigned to the variable var in turn, and the commands between the do and done statements are then executed. This process continues until no more words are left to process. When the last word has been assigned and the commands processed, the for loop terminates, and execution continues at the first command following the done.
for var in one test three four five
do
    echo "var = $var"
done
When you run this program (saved here as for1), the for command assigns each word to the variable var, and then runs the echo command. It looks like the following:
$ for1
var = one
var = test
var = three
var = four
115
var = five This command has several special formats, which indicates the use of positional parameters currently assigned, or the files in the current directory. for var do commands done is equivalent to for var in $* do commands done In both cases, the variables $1 to $9 are assigned to the variable var in turn. This is useful for processing the arguments on the command line, as shown in the following example: for var in $* do echo “var = $var” done In addition to processing command-line arguments and positional parameters, you can also process the filename substitution wildcards: for file in * do echo “Found$file” done This example produces a list of the files in the current directory, one per line. Now you can add the for command to the rat program, so that you can work on multiple command line arguments. This new version of rat is as follows: : # rat version 3 # # if we have no files, then report an error # if [ ! “$1” ] then echo “Usage: `basename $0` file” exit 1 fi # # Loop around each argument on the command line #
116
for file in $* do # # If the first argument is a directory, then report that it is as such if [ -d “$file” ] then echo “$file is a directory” # # If the argument is executable, then run the command # elif [ -x “$file” ] then $file # # If the first argument is a file, then run the more command on it elif [ -f “$file” ] then echo “_________$file___________” cat $file else echo “sorry, I don’t know what to do with this file.” fi done $ The preceding example changes the rat program to include the for command. This enables you to specify more than one file on the command line and have the rat program process it. Another example of the for command in use is used to rename a number of files at the same time: for i in * .doc do mv $i `basename $i doc` abc done
6.5 Reading data

One command (read) is used to read data directly from a user or other source, and a second command (exec) can be used to change permanently where a script's input comes from or where its output goes.
The read Command

This command accepts on its command line a list of variables into which the information is to be placed. The syntax for read is as follows:

read vars

The read command works by waiting for the user to enter some text, which it accepts up to the newline.

$ read num street
2435 103rd street
$ echo ">$num< >$street<"
>2435< >103rd street<
$

Here the user enters some text, which is assigned to the variables num and street. The first word is assigned to the variable num, and the remaining words are assigned to the street variable. It is important to note that using only one variable on the read line has the effect of assigning the entire line to that variable. For example:

$ read info
CF-18A Hornet
$ echo $info
CF-18A Hornet
$

read also can be used in a loop to accept input from a pipe. For example:

ls fi* | while read file
do
    echo $file
done

In this example, the ls command provides a list of files that is fed to the while loop through a pipe. Each file name is read by the read command and saved in the variable file. When no more data exists, read returns a non-zero exit status, and the loop terminates.
The exec Command The exec command can be used to do several things depending on the arguments given to it. These things include the following:
•	Replace the shell with a specific command
•	Redirect input or output permanently
•	Open a new file for the shell using a file descriptor
exec is most commonly used to replace the current shell with another program. This is often done in the user's login file to replace the login shell with an application. This means that when the user exits the application, the session on the system is terminated. In effect, the application or command replaces the login shell.

$ exec /bin/date
Fri Aug 12 22:46:52 EST 1994

Welcome to AT&T UNIX
Please login:

The preceding code is an example of using exec to replace the current login shell with the program /bin/date. After the command exits, the user must log in again.

The second use for exec is to redirect where input or output is to be sent on a semipermanent basis. Inside a shell script, you can include a line such as the following:

exec > /tmp/trace

This instructs the shell that all text destined for standard output is instead to be written to the file /tmp/trace. This remains in effect until the script exits, or until the following command is executed to send the output back to the terminal:

exec > /dev/tty

Standard error can be redirected using the following command:

exec 2> /tmp/log

And finally, input also can be redirected using the following format:

exec < /tmp/commands

Unless these commands are typed directly at the command line, they remain in effect only until the shell exits.

The final use for exec is when you want to open other files for input or output besides the standard three that each shell gets. In the script, you need to use the following format:
exec fd mode file

where fd is the file descriptor number and mode is the redirection operator (< for input, > for output). Remember that descriptors 0, 1, and 2 are already used. The maximum value a file descriptor can have is nine.
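As a hedged sketch of this third use (the choice of /etc/passwd is only a convenient example of a readable file), a script can open an extra descriptor, read from it, and close it again:

```shell
# Open /etc/passwd for reading on file descriptor 3
exec 3< /etc/passwd

# read takes its input from descriptor 3 instead of standard input
read firstline <&3
echo "first line: $firstline"

# Close descriptor 3 when it is no longer needed
exec 3<&-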
6.6 Functions

Instead of placing multiple occurrences of the same code in a file, or slowing execution by waiting for other scripts to start, you can define a function in the shell script that executes when you need it. A shell function is defined with a specific syntax:

function_name()
{
    commands
}

The function name can be any series of letters and characters. In some respects a function is like a Korn shell or C shell alias; functions, however, are available in most Bourne shells, as well as in the Korn shell.

# @(#) search - search for a file from $HOME
# Usage: search filename
search()
{
    if test $# -lt 1
    then
        echo "Usage: search FILENAME"
        return 1
    fi
    FILE=$1
    echo "searching..."
    find $HOME -name $1 -print
    echo "search complete."
    return 0
}

The preceding example is a sample shell function that accepts as an argument the name of a file to search for. The function uses the find command to locate and print the name of the file. Note that no exit command is used in this function. Functions are executed by the current shell, which means that an exit command would have the effect of logging the user off the system. As a result, use the return command, which exits the function and provides a return code to the caller. The return and exit commands serve the same purpose, except that exit is for a script and return is for a function.
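To illustrate the difference between return and exit, here is a hedged sketch (the function name is invented) showing how a caller can test a function's return status just like any command's exit status:

```shell
# is_positive returns 0 (success) if its argument is greater than zero
is_positive()
{
    if [ "$1" -gt 0 ]
    then
        return 0
    else
        return 1
    fi
}

# The return status can be tested directly in an if statement
if is_positive 5
then
    result=yes
else
    result=no
fi
echo "5 is positive: $result"
```

Because the function uses return rather than exit, the script continues running after the status is reported.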
Chapter 7

7. Basics of UNIX Administration

7.1 Login Process and Run Levels

The Login Process

For a direct connection to the system, the login process involves the following commands: getty, login, and init.
Initialization

The initialization of a terminal session is started by the init command. init is responsible for starting the getty command, which listens on a terminal port for an incoming connection. (init does not handle network terminal ports.) init accomplishes this task by looking at the /etc/inittab file on System V UNIX systems, and the /etc/ttys file on BSD-based systems. The following sample line from an /etc/inittab file is used to start getty:

11:23:respawn:/etc/getty tty11 9600  # VT420 workstation

This sample entry from a System V inittab indicates that this getty should be respawned each time it exits; that is, when the user's login shell exits, a new getty should be started.
Login Phase 1: getty

When you press Enter on your terminal, the system responds with a login prompt. This prompt informs you that the system is ready for you to log in. The program responsible for printing the login prompt is the command getty. getty prompts for the user's login name, as shown in the following:

Welcome to UNIX
Please Login:

The login name is assigned by the system administrator. The name can be up to eight characters in length.
Login Phase 2: login

The second phase of the process involves the command login. login prompts the user to enter the password. The password isn't echoed on the screen for security reasons.

Welcome to UNIX
Please Login: chare
Password:
If the user enters the password incorrectly, the system responds with a generic message, login incorrect.
Login Phase 3: Login Shell

The third phase of the process is entered after the user has entered the correct password for the login. This phase sets up the parameters of the user's environment. For example, the login command starts the user's login shell as specified in the /etc/passwd file. This process is illustrated in the following:

Welcome to the AT&T UNIX pc
Please login: chare
Password:
Starting login procedure ...
Please type the terminal name and press RETURN: vt100
48% of the storage space is available.
$
Logging In through telnetd and rlogin

When users want to log in to a system through a TCP/IP network, there are two primary access methods. One uses the telnet protocol, and the other uses the rlogin protocol. telnet and rlogin each use a process on the UNIX system called a server. This server is started when an incoming connection for the specified protocol is received. In both cases, these commands prompt for the user's login name and password.
The Global Initialization Files

There are two global initialization files used for the Bourne, Korn, and C shells. These initialization files are executed for each user who logs in to the system.
The /etc/profile File

This shell script is executed for users of the Bourne and Korn shells only. The profile file is executed for each user when he logs in. Because the profile is a shell script, learning how to read it will help you create your own profile and customize your environment. The output shown in the following list illustrates what /etc/profile is doing:

Welcome to the AT&T UNIX pc
Please login: chare
Password:
Starting login procedure ...
Please type the terminal name and press RETURN: vt100
48% of the storage space is available.
Logged in on Thu Sep 1 18:18:15 EST 1994 on /dev/tty000
Last logged in on Thu Sep 1 18:17:44 EST 1994 on /dev/tty000
$
Customizing the User Login Files There are a number of files that are user-dependent and can be modified to create an environment more suited to the user’s tasks.
The .profile File

The .profile file is equivalent to the /etc/profile file, except that it enables the user to customize and alter the configuration established by the system administrator. A user's .profile typically adds to the configuration performed by the /etc/profile script. To show that both of these files are processed, the .profile can print several lines; one might list the keys configured for the erase, interrupt, and kill functions, which are part of the terminal driver.
The C Shell .login File

When a user whose shell is the C shell logs in to the system, a file called .login is executed. This file can contain any valid UNIX or C shell command. The .login file typically contains commands that configure the terminal driver, such as setting the erase character and the interrupt character.
The C Shell .logout File

The .logout file is executed when the login shell is terminated. When the user types the command logout or exit, the shell terminates and executes the commands in the .logout file. The following illustrates the execution of the commands in the .logout file:

% logout
logged out chare at Thu Sep 1 18:20:56 EST 1994

Welcome to UNIX
please login:
The C Shell .cshrc File

The .cshrc file is executed each time a new C shell is started. The .cshrc file typically contains commands that should be loaded for each shell, such as prompt settings and aliases. An alias is another name for a command. For example, if your system doesn't have the command lc, you could create an alias in the C shell to define lc as ls -CF.
A .logout File for the Bourne and Korn Shells

The Bourne and Korn shells do not have the equivalent of a .logout file, although this facility can be easily mimicked by using the shell's capability to trap signals. (In fact, the two shells could even share the same logout file, as long as it contained no shell-specific commands.) You can simulate the .logout file by adding the following line to the .profile file in the user's home directory. This line catches, or traps, the moment when the user logs out, and executes the commands in the .checkout file:

trap "$HOME/.checkout" 0
Run Levels

Under System V flavors, run levels have become to a machine what permissions are to a user. Operating at one run level restricts a machine from performing certain tasks; running at another enables those functions to run. There are eight accepted run levels:

•	0 (Shutdown state requiring a manual reboot). When changing to this level, files are synchronized with the disk, and the system is left in a state where it is safe to power it off.
•	Q or q (on systems where these are not equal to zero). These force the system to reread the inittab file and take into account any changes that have occurred since the last boot.
•	1 (Single-user mode). This mode is also known as S or s mode on many systems. It allows only one user, at one terminal, to log in. If the change is to 1, the one terminal allowed to log in is the one defined as the console. If the change is to S or s, the only terminal allowed to log back in is the one that changed the run level.
•	2 (Multiuser mode). The traditional state allowing more than one user to log in at a time. This level is where background processes start up, and additional file systems, if present, are mounted.
•	3 (Network mode). The same as level 2, only with networking or remote file sharing enabled.
•	4 (User defined).
•	5 (Hardware state).
•	6 (Shutdown and automatic reboot). Performs the same action as changing to run level 0 and then rebooting the machine.
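As a hedged sketch (the inittab contents below are a fabricated sample, not taken from a real system), the run level a System V system boots into is recorded in an initdefault entry in /etc/inittab, and it appears as the second colon-separated field:

```shell
# Create a small sample inittab for illustration only
cat > /tmp/sample_inittab <<'EOF'
id:3:initdefault:
11:23:respawn:/etc/getty tty11 9600
EOF

# Extract the run level: the second field of the initdefault line
runlevel=`grep initdefault /tmp/sample_inittab | cut -d: -f2`
echo "default run level: $runlevel"
```

Here the sample system would come up in run level 3, the networked multiuser state described above.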
7.2 Processes

Three facilities exist in UNIX to run jobs when you aren't around, when the load permits, or over and over again. They are at, batch, and cron. The at command is used to run a single command at a specific time. The syntax of the at command is:

at time [date] [increment]

The time is not optional on the command line, but you may optionally specify the date on which the command should be run, or an increment from the current time.
Setting the Time

The time component for at may be specified as one, two, or four digits. If only one or two digits are used, the time is assumed to be hours. In the case of four digits, a colon may be used to separate hours from minutes. An example of the at command:

$ date
Sat Aug 6 20:02:49 EDT 1994
$ at 2004
date
Ctrl+D
Job 8765465.a at Sat Aug 6 20:04:00 1994
$

In the preceding list, you first check the date and time to ensure that you set your job up appropriately. The command line to at indicates that you want to run the job at 2004, or 8:04 p.m. Once all the commands to execute have been entered, press Ctrl+D on a new line. This informs at that no more commands are going to be entered. Before terminating, at prints out a job number for the newly queued job. This job number can be used to find out information on the job in the future.
Controlling Output If you are logged on, the output could be sent to your terminal, but that might cause other problems with the application you could be using at the time. Normally, at saves all output from the commands queued unless the output is redirected somewhere else. The output is then mailed to the user who queued the job after it has completed.
Being More Precise The at command will also accept a date to be more precise on when the commands should be scheduled. The date information can be a full date with month, day, and year, or a day of the week. Two special keywords, today and tomorrow, are recognized. If no date is given and the hour is greater than the current hour, today is assumed by at. If the hour is less than the current hour, tomorrow is assumed. The following code illustrates valid date formats for use with at:
at 22:05 today
at 1 pm tomorrow
at 1201 am Jan 1, 1995
at 11:30 Jan 24
at 4:30 pm Tuesday
Listing at Jobs

The -l option instructs at to list the contents of the at queue. This lists only the jobs that have been queued by the invoking user. For example:

$ at -l
5423654.a    Sat Aug 13 20:43:00 1994
4434356.a    Sun Jan 1 00:01:00 1995
$
Removing at Jobs

The at command also provides the -r option, which is used to remove queued jobs from the at command queue. For example:

$ at -l
5423654.a    Sat Aug 13 20:43:00 1994
4434356.a    Sun Jan 1 00:01:00 1995
$ at -r 5423654.a
$ at -l
4434356.a    Sun Jan 1 00:01:00 1995
Interactive versus batch

With batch, the commands are executed at a time when the system is free enough to handle such requests. Commands are entered at the command line, with a Ctrl+D entered on a new line to terminate the command list. This is illustrated in the following:

$ batch
date
who
df
pwd
Ctrl+D
Job 78657.a at Sat Aug 6 21:23:06 1994
$

Here you have scheduled these commands to be executed by way of batch. The output of the commands is saved and returned to the user through the electronic mail system on the UNIX system. If the commands have been written to save their output somewhere else, or if the output is redirected, there will be no mail message.
7.3 Archiving and backup

Users often accidentally delete their own files and then rush to the administrator to restore them. The administrator has to plan his backups carefully so that he doesn't back up the same file over and over again when it has not been accessed for ages. The two most popular programs for backups are tar and cpio. Both combine a group of files into a single file (called an archive), with suitable headers preceding the contents of each file.
Backup Strategy

Some files are modified more often than others, while some are not accessed at all. How often the data in the system changes influences the backup strategy and determines the frequency of backups. A common scheme is a complete backup of all files once a week, and a daily incremental backup of only those files which have been changed or modified. In modern times, when one tape cartridge can back up several gigabytes of data at once, incremental backups can prove quite meaningless for some sites; for installations maintaining a few hundred megabytes of data, a daily complete backup can make a lot of sense. You should use cron to schedule your backups.
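For instance, crontab entries along the following lines could drive such a schedule (the script names and times are hypothetical; the actual backup commands depend on your site):

```shell
# Hypothetical crontab entries: minute hour day-of-month month day-of-week command
# Full backup at 2:00 a.m. every Sunday
0 2 * * 0  /usr/local/bin/fullbackup.sh
# Incremental backup at 2:00 a.m. Monday through Saturday
0 2 * * 1-6  /usr/local/bin/incbackup.sh
```

The two scripts would typically wrap the tar or cpio commands described below.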
tar or cpio

tar: The Tape Archiver

tar is one of several commands that can be used to save data to an archive device. Typically tar is used with tapes or floppy disks. The syntax of tar is as follows:

$ tar key files

The key includes any arguments that go along with the options. The key consists of a function that instructs tar what you want it to do, whereas the additional options alter the way that tar does the work. The key or option list given to tar does not have to contain a leading hyphen. These commands are equivalent:

$ tar -x
$ tar x

tar is used with a number of key options. They are listed in the following table:
Option    Significance (key options)
-c        Creates a new archive
-x        Extracts files from the archive
-t        Lists the contents of the archive
-r        Appends files at the end of the archive
-u        Like -r, but only if the files are newer than those in the archive

Option    Significance (non-key options)
-v        Verbose option; lists files in long format
-w        Confirms with the user the action to be taken
-f dvc    Uses pathname dvc as the name of the device instead of the default
-b n      Uses blocking factor n, where n is restricted to a maximum of 20
-m        Changes the modification time of the file to the time of extraction
-k num    Multivolume backup (SCO UNIX only)
-M        Multivolume backup (Linux only)
-z        Compresses while copying (Linux only)
-Z        Decompresses while extracting (Linux only)
Backing Up Files

tar accepts directory and file names directly on the command line. The -c key option is used to copy files to the backup device:

# tar -cvf /dev/rdsk/f0q18dt /home/sales/SQL/*.sql

This backs up all SQL scripts, with their absolute pathnames, to the floppy diskette. The single character a before each pathname in tar's output indicates that the file is appended. The verbose option (-v) shows the number of blocks used by each file.
When files are copied in this way with absolute pathnames, a restriction applies; they can only be restored to the same directory. However, if you want to keep open the option of installing the files in a different directory, you should first cd to /home/sales/SQL, and then use a relative pathname:

cd /home/sales/SQL
tar -cvf /dev/rdsk/f0q18dt ./*.sql

The command will also execute faster if used with a block size of 18:

tar -cvfb /dev/rdsk/f0q18dt 18 ./*.sql

Since -f and -b each have to be followed by an argument, the first word (/dev/rdsk/f0q18dt) after the option string -cvfb denotes the argument for -f, and the second (18) lines up with -b.
Restoring Files

Files are restored with the -x option. When no file or directory name is specified, it restores all files from the backup device. The following command restores the files just backed up:

# tar -xvfb /dev/rdsk/f0q18dt 18
Displaying the Archive

The -t option simply displays the contents of the device without restoring the files. When combined with the -v option, the contents are displayed in a long format:

# tar -tvf /dev/rdsk/f0q18dt
Appending to the Archive

A file can be added to the archive with the -u option. Since this is a copying operation, the -c option can't be used in combination with it. The unusual thing is that an archive can contain several versions of the same file:

# tar -uf /dev/rdsk/f0q18dt ./func.sh
# tar -tvf /dev/rdsk/f0q18dt
Interactive Copying and Restoration

tar offers the facility of interactive copying and restoration. When used with the -w option, it prints the name of each file and prompts for the action to be taken. With this facility, an earlier version of a file can be restored easily:

# tar -xvwf /dev/rdsk/f0q18dt ./func.sh

When there are several versions of a single file, it is better to include the verbose option so that the modification times can be seen.
Compress and Copy

GNU tar in Linux, while lacking some facilities, offers several features, notable among which is simultaneous compression during backup. The -z option is used for compressing, and -Z for decompressing during restoration:

tar -cvzf /dev/rct0
tar -xvZf /dev/rct0
cpio: Copy Input-Output

The cpio command can be used to copy files to and from a backup device. It takes the list of filenames from standard input, and then copies them, with their contents and headers, into a single archive that is written to standard output. This means that cpio can be used with redirection and piping. It uses two options, -o (output) and -i (input). Other options can be used with either of these two.
Backing Up Files

ls can be used to generate a list of filenames for cpio to use as input. The -o option is used to create an archive, which can be redirected to a device file:

# ls | cpio -ov > /dev/rdsk/f0q18dt
array.pl
calendar

The -v option makes cpio operate in verbose mode, so that each filename is shown on the terminal as it is being copied. All cpio needs as input is a list of the files to be backed up.

Incremental Backups: find can also produce a file list, so any files that satisfy its selection criteria can be backed up. You will frequently need to use find and cpio in combination to back up selected files, for instance, those that have been modified in the last two days:

find . -type f -mtime -2 -print | cpio -ovB > /dev/rdsk/f0q18dt

Since the path list of find is a dot, the files are backed up with their relative pathnames. The -B option sets the block size to 5120 bytes for input and output. For a higher block size, the -C option has to be used.

Multivolume Backups: When the archive created on the backup device is larger than the capacity of the device, cpio prompts for a new diskette to be inserted into the drive.
Restoring Files

A complete archive, or selected files, can be restored with the -i option. To restore all the files that were backed up with a previous cpio command, the shell's redirection operator (<) must be used to take input from the device:

# cpio -iv < /dev/rdsk/f0q18dt
array.pl
calendar
Other Options

The -r option lets you rename each file before starting the copying process. The -f option, followed by an expression, causes cpio to select all files except those matching the expression:

cpio -ivf "*.C" < /dev/rdsk/f0q18dt
Displaying the Archive

The -t option displays the contents of the device without restoring the files. This option must be combined with the -i option:

# cpio -itv < /dev/rdsk/f0q18dt
7.4 Security

We need to address two types of security: physical and logical. Physical security is concerned with where the machine is located and the access controls to the machine. Logical security addresses security in the software, such as user names and passwords. The physical security issues are as follows:

•	Physical location of the machine
•	Possibility of removal of the machine
•	Access to distribution and backup media

To address these issues, distribution and backup media should be identified and kept under lock and key to prevent unauthorized access. The logical security issues are as follows:

•	User account management
•	Password management
•	Educating the users about the importance of security
Chapter 8

8. Communication

8.1 Basic UNIX mail

Computer systems that communicate with each other to pass UNIX mail actually use a language, or protocol, called the Simple Mail Transfer Protocol (SMTP). Communication between UNIX and non-UNIX mail systems, which generally use non-SMTP protocols, requires an e-mail gateway. The gateway acts as a translator speaking both languages.
UNIX Mail Concepts

UNIX mail is built on the concept of a mailbox, which is the repository for your mail messages. Messages that you receive and want to keep for later reference are stored in mail files. The directory in which a mail file resides is called a mail folder.
Starting mail

To start mail, enter mail (or mailx) at your shell's command line. If you have mail in your mailbox, the mail headers will appear on the screen.

% mailx
mailx version 5.0 Mon Sep 27 07:25:51 PDT 1993  Type ? for help.
"/var/mail/steve": 10 messages 2 new 0 unread
O 41 sun-managers-relay  Fri Aug 26 16:58  41/1869  DLT on solaris 1.x or 2.x
O 42 ron                 Fri Aug 26 17:08  17/435   my phone number
{mail}&
mail displays the version and date of the mail program that you are running, and the name of the mailbox you are reading. mail then displays the total number of messages, and the number of new and unread messages. The first column indicates the status of the message:

N    New, unread messages
O    Old messages
R    New messages that have been read
U    Unread messages
The second column shows the message number. The third column shows the sender’s name. The fourth column shows the date and time that the message arrived in your mail box, and size of the messages, both in number of lines as well as number of bytes. The last column shows a listing of the subject line of the mail message. At the end of the header display, the mail program shows the command prompt ({mail}&) and waits for you to enter a command.
Reading Your Mail

To read a specific message shown in the mail header display, enter the message number at the command prompt. The message is displayed one page at a time for you to read. Your mail program may be set to use one of a number of paging commands, such as more or pg, so the specific key used to move to the next page may differ.
The displayed message consists of two parts, the message header and the body of the message. The message header contains information such as who sent the message, when the message was sent, who the recipient is, what the subject of the message is, and who, if anyone, is on the carbon copy list.
Composing Mail

To compose a message to be sent by mail, enter the following, where username@mailaddress is the e-mail address of the recipient:

{mail}& mail username@mailaddress
Subject: My address

You may now type your message. Once you have finished your message, enter Ctrl+D and press Enter. Your message is then delivered, and you are back at the {mail}& prompt.
Mail Headers

At any {mail}& prompt, enter h to redisplay the mail headers. To scroll forward one screen of messages, enter z; the next group of mail headers is displayed. To scroll backward one screen of messages, enter z-. If you want to display the headers for the group of messages containing a specific message number, enter h followed by the message number.
Replying to Mail

The mail program provides a number of choices on how to reply: you may reply to the sender of the message, or you may reply to all of the recipients of the message as well as the sender.

{mail}& r       Reply to the sender of the message after you have read it
{mail}& r 13    Reply to the sender of the specific message number
{mail}& R       Reply to the sender and all of the recipients
Deleting Messages

To delete the message that you just read, enter the following:

{mail}& d

This message is now marked for deletion. The message will be permanently removed from your mailbox when you quit mail. Multiple messages may be deleted with one command by specifying the message numbers, separated by spaces, on the command line. Example:

{mail}& d 2 4 5

A series of messages may also be deleted by inserting a dash between the starting and ending message numbers. Example:

{mail}& d 8-46

Deleted mail messages can be undeleted using the undelete (u) mail command. Example:

{mail}& u 11
Saving Mail Messages

To save a mail message into a specific mail file, type s followed by the message number and the mail file name. For example, enter the following to save message 6 into a mail file called UNIX:

{mail}& s 6 UNIX

To view a mail file, specify the mail file name on the UNIX command line with the -f option. For example, if you want to view the messages in your personnel mail file, enter the following:

% mail -f personnel

Once you have finished reading your mail and deleting or saving messages, you need to end your mail session:

{mail}& x
%
Advanced features

You can also send mail from the UNIX command line, either by composing a specific mail message or by redirecting the output from another UNIX program into mail.
Mailing a Single Message

Mailing a single message from the command line is as simple as running the mail command and specifying the addresses of the recipients on the command line. Example:

% mail steve
Subject: Phone number
Our new office phone number is 7548776.
Please update your records.
.
Cc:
%

You can also have the output of a UNIX command mailed to you, or to a specific user; this capability is used to mail the results of a program to yourself or to another user. Example:

% ls -al | mail -s "Steve's home directory" [email protected]

In this example, the user is sending a directory listing of the current directory to the user [email protected]. The -s option enables a Subject: heading to be added to the message.
When you are composing a mail message, you may want to include the text from another message for the recipient to view or reference. To include a mail message in the current message, enter the following on a blank line:

~m message_number

If you want to include a text file in your mail message, enter the following:

~r filename
Decoding an Encoded Mail Message

If you receive an encoded mail message, its contents are valuable to you only if you can extract the file. If the file has been encoded with the uuencode command, you should be able to read its initial encoding line, which has the following format:

begin xyz filename

where xyz is the UNIX numerical representation of the file modes, and filename is the name of the extracted file. The first step in decoding the mail message is to save the message to a temporary mail file. Once the temporary mail file is created, you run the uudecode command from your UNIX shell. The uudecode command's only argument is the name of the file containing the encoded message. uudecode will search for the begin statement in the body of the message, discard all of the mail headers, decode the encoded file, and then create a file with permissions and a file name that match those of the begin statement.
Handling Large Files in mail
Many mail delivery systems limit the size of a mail message that they will process. Two common ways of mailing large files are to compress them and to split them. File compression uses various algorithms to detect repeated patterns in files and represent those patterns with shorter character strings. To send a compressed binary file, you must compress the file before beginning the encoding process. Example:

% ls -al csh*
-rw-r--r--  1 steve  2887  Aug 28 09:23  cshrc
% compress cshrc
% ls -al csh*
-rw-r--r--  1 steve  1699  Aug 28 09:23  cshrc.Z
% uuencode cshrc.Z cshrc.Z > cshrc.Z.uu
% ls -al csh*
-rw-r--r--  1 steve  1699  Aug 28 09:23  cshrc.Z
-rw-r--r--  1 steve  2368  Aug 28 09:24  cshrc.Z.uu
After a file is in compressed, uuencoded format, you can include it in your mail message and send it. Compressing a file with the compress command creates a new file with a .Z extension. The split command divides a file into individual files of a user-supplied length, each of which can then be included in its own mail message. The wc command is used to count the number of lines in the encoded file.
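The split-and-reassemble procedure above can be sketched as a shell session; the file names and line counts here are invented for illustration, and a numbered text file stands in for a real uuencoded message.

```shell
# Create a stand-in for a large encoded file, split it, and reassemble it.
cd "$(mktemp -d)"
seq 1 1200 > bigfile.uu             # hypothetical stand-in for a uuencoded message
wc -l < bigfile.uu                  # check the size first: 1200 lines
split -l 500 bigfile.uu piece.      # produces piece.aa, piece.ab, piece.ac
ls piece.*                          # each piece is small enough to mail
cat piece.aa piece.ab piece.ac > rejoined.uu   # the recipient reassembles them in order
cmp -s bigfile.uu rejoined.uu && echo "files match"
```

Each piece.* file could then be included in its own mail message; the recipient concatenates the pieces in order before running uudecode.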
8.2 Communicating with other users This section covers the means by which an administrator, or user, can communicate with another user, other than by e-mail.
Writing to Other Users
The way to send a message to another user currently on the system is with the write command. Use who to find out whether the user is on the system, and then run write. Example:

$ who
hanna   ttya   Aug 27 06:35
evan    ttyb   Aug 31 19:24
$
$ write cliff
cliff is not logged on
$
$ write evan

At this point, the other user's terminal beeps twice and displays the following:

Message from {you} on {node name} {terminal} [ {date} ]...
Message from hanna on NRP (ttya) [ Tue Aug 31 06:41:53 ]...

The prompt disappears from both terminals. Begin typing your message; the write utility copies the lines from your terminal to the terminal of the other user every time the Return key is pressed.
Using talk
talk is an interactive form of write that exists in many versions of UNIX. If a network is present, you can write or talk to a user on a system other than the one you are on by specifying the user and machine name, separated by the at sign (@). Example:

$ write cliff@NEWRIDER

To prevent other users from writing messages to you, and interrupting that very important job you are working on, use mesg n. mesg permits or denies messages. When no parameter is given, it reports the status of the mesg flag, as in the following example:

$ mesg
is y
$

With the argument n, it prevents messages by revoking write permission on your terminal. With the argument y, it reinstates the permission. To see who is accepting or denying messages sent with write, use the who -T command.
$ who -T
hanna  +  ttya   Aug  2 08:42
root   -  ttyb   Aug  5 19:24

Here, a - sign between the user name and terminal port indicates that the user is not accepting messages, while a + sign indicates that he can receive messages. When there is a message for everyone, you can use wall to send it, but it goes only to the users currently logged on. For users who are not logged on you can send mail, but that means sending mail to everyone. With news, only one file is created and all users read the same file, which reduces the amount of clutter on a system. When a user logs in, if the news command is in his login routine, the contents of any files in /usr/news or /var/news are displayed on his screen. Once he has seen a file, his name is removed from the list of users needing to see it, and the file is not shown to him again. If Delete is pressed while a news item is being shown, the display stops and the next item is started.
Using echo
To utilize echo, you need to know which terminal the user is using and how to address it. who shows you where the user is, and tty gives the full address. Example:

$ who
user1   tty8   Dec 30 08:30
user2   tty4   Dec 30 07:20
$
Although the terminal is abbreviated tty8, the true address is slightly longer:

$ tty
/dev/tty8
$

Using this complete address, a message can be sent with the following command:

$ echo "Hello" > /dev/tty8

To have a beep sound as well:

$ echo "\7\7 Hello" > /dev/tty8

To send the output of a command to another screen, backquotes (command substitution) are used:

$ echo `date` > /dev/tty8
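The same redirection pattern can be sketched without a second terminal; here a temporary regular file stands in for /dev/tty8, since writing to a real terminal requires that device to exist and to accept messages.

```shell
# Stand-in for another user's terminal device (the stub path is hypothetical).
tty_stub="$(mktemp)"               # a real session would use /dev/tty8
echo "Hello" > "$tty_stub"         # same form as: echo "Hello" > /dev/tty8
echo "`date`" >> "$tty_stub"       # backquotes substitute the date command's output
cat "$tty_stub"                    # shows what would appear on the other screen
```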
8.3 Accessing the Internet
The Internet is a community of people who do things their way because, for better or worse, that is the way they like them. Internet Protocols are the low-level tools that bind the machines on the Internet into a useful whole. IPs specify the kinds of communications that can occur between machines and how connections can be made to allow those communications. To be on the Internet, a machine must support IP. One of the most important of these protocols is the Transmission Control Protocol (TCP/IP). The Internet treats all of its communications as packets of data, each of which has an address. Machines on the Internet maintain tables that describe the addresses of local and remote machines, and routes for packets. The Internet has its roots in a networking initiative and its associated protocols. The Internet and its protocols grow and evolve in response to user needs and currently available resources. Mechanisms are available for formalizing, altering, and replacing IPs. The protocols themselves are stated in documents freely available from various sites on the Internet. You can see a list of the ones on your machine by entering the command:

% cat /etc/protocols
File Encoding
A file might be encoded on the Internet for different reasons: to assure its privacy, to encapsulate it in an archive, to compress it, or to send a binary file using an ASCII transmission method such as mail or Netnews. As long as the sender has e-mail, binary files can be exchanged by using the uuencode and uudecode commands. These programs convert an arbitrary stream of bytes into ASCII and back again. To use uuencode, type the following:

% uuencode file label > out_file

where file is the file to be encoded and label is the name the file will have when it is decoded with uudecode.
Using ping
ping provides the Internet version of a "hello, are you there" query. ping sends network packets to a machine on the Internet, which you designate either by name or by address, and reports whether each packet arrived at its destination successfully. A typical run sends five packets and reports that each one arrived at its destination.
finger Command
The finger command can be used over the Internet to show you information about users on other machines. The exact information you receive depends on the command options you use and on what the user you are asking about has made available in his .plan and .project files. For example, if you want to see who is currently logged on to a machine on the Internet, you can type the following:

% finger @bigbiz.com
The result is:

Login   Name        TTY   Idle   When        Where
buff    Buff boon   p2    1      Tue 19:26   bigbiz.com
ijfj    sanju       q3    21:    Mon 08:45   bigbiz.com
finger can also show information about a specific user whether or not that user is currently logged in. finger [email protected] might show the following:

Login name: buff                Directory: /user/mnmt/buff
On since Jan 26 14:32:05 on ttyrc    4 days 23 hours Idle Time
Plan:
Manager, Big biz company
Mail Code: 2746
Extension: 134 (phone: 6425458)
Office: HU, 43665e
Motto: "If it's big business, it's our business!"
8.4 e-mail on the Internet
Sending and receiving messages electronically is certainly one of the most visible and attractive benefits of computer networking. The propagation of e-mail requires a different mechanism than its creation and use. Electronic mail can originate from hand-held computers, home computers, desktops, terminals, and so on. All of these machines can be on networks other than the Internet, and certainly not every one of them uses UNIX and mail; each uses its own mail programs.
E-mail Address Directories
No single directory can be searched for an individual's e-mail address. Two resources on the Internet might help you locate an e-mail address: whois, and the Knowbot Information Service offered by the Corporation for National Research Initiatives (CNRI).
Using whois
The whois program searches a database for matches to a name you type at the command line. To search for all records that match the string bon, type the following:

$ whois bon

The result is:

$ whois bon
using default whois server rs.inter.net
Bon, Naida (NB76)  +1 xxx xxx xxx
[email protected]
Bon, Paul (PB61) (xxx) xxxx-xxx
[email protected]
To get help on the current state of the whois command, type:

$ whois help
Using CNRI's Knowbot
It is possible to type in the target string for which you seek matches just once and have many different databases searched. By submitting a single query to KIS (Knowbot Information Service), a user can search a set of remote "white pages" services and see the results of the search in a uniform format. Start up a telnet session and, at the prompt, enter the Knowbot address:

% telnet
telnet> open info.cnri.reston.va.us

At the KIS prompt, type the following:

> query bon
You see the output shown previously, and much more besides, since KIS searches many different databases.
Mail Lists
Mailing lists consist of users with a common interest in some topic and a shared desire to read everything that other people have to say on that topic. Of course, some lists are private, but many are open to anyone who wishes to join.
Mail Servers
Mail servers are programs that distribute files or information. They respond to e-mail messages that conform to a specific syntax by e-mailing the files or information requested in the message back to the sender.
Chapter 9
9. Makefile concepts
9.1 Introduction
You need a file called a makefile to tell make what to do. Most often, the makefile tells make how to compile and link a program. In this chapter, we will discuss a simple makefile that describes how to compile and link a text editor application that consists of eight C source files and three header files. The makefile can also tell make how to run miscellaneous commands when explicitly asked (for example, to remove certain files as a clean-up operation).

When make recompiles the editor, each changed C source file must be recompiled. If a header file has changed, each C source file that includes the header file must be recompiled to be safe. Each compilation produces an object file corresponding to the source file. Finally, if any source file has been recompiled, all the object files, whether newly made or saved from previous compilations, must be linked together to produce the new executable editor.

A simple makefile consists of "rules" with the following shape:

target ... : prerequisites ...
	command
	...

A target is usually the name of a file that is generated by a program; examples of targets are executable or object files. A target can also be the name of an action to carry out, such as `clean' (see section Phony Targets). A prerequisite is a file that is used as input to create the target. A target often depends on several files. A command is an action that make carries out. A rule may have more than one command, each on its own line. Usually a command is in a rule with prerequisites and serves to create a target file if any of the prerequisites change. However, the rule that specifies commands for the target need not have prerequisites. For example, the rule containing the delete command associated with the target `clean' does not have prerequisites. A rule, then, explains how and when to remake certain files that are the targets of the particular rule.
make carries out the commands on the prerequisites to create or update the target. A rule can also explain how and when to carry out an action. A makefile may contain other text besides rules, but a simple makefile need only contain rules. Rules may look somewhat more complicated than shown in this template, but all fit the pattern more or less. Here is a straightforward makefile that describes the way an executable file called edit depends on eight object files that, in turn, depend on eight C source and three header files.
In this example, all the C files include `defs.h', but only those defining editing commands include `command.h', and only low level files that change the editor buffer include `buffer.h'.

edit : main.o kbd.o command.o display.o \
       insert.o search.o files.o utils.o
	cc -o edit main.o kbd.o command.o display.o \
	      insert.o search.o files.o utils.o
main.o : main.c defs.h
	cc -c main.c
kbd.o : kbd.c defs.h command.h
	cc -c kbd.c
command.o : command.c defs.h command.h
	cc -c command.c
display.o : display.c defs.h buffer.h
	cc -c display.c
insert.o : insert.c defs.h buffer.h
	cc -c insert.c
search.o : search.c defs.h buffer.h
	cc -c search.c
files.o : files.c defs.h buffer.h command.h
	cc -c files.c
utils.o : utils.c defs.h
	cc -c utils.c
clean :
	rm edit main.o kbd.o command.o display.o \
	      insert.o search.o files.o utils.o

We split each long line into two lines using backslash-newline; this is like using one long line, but is easier to read. To use this makefile to create the executable file called `edit', type:

make

To use this makefile to delete the executable file and all the object files from the directory, type:

make clean

In the example makefile, the targets include the executable file `edit', and the object files `main.o' and `kbd.o'. The prerequisites are files such as `main.c' and `defs.h'. In fact, each `.o' file is both a target and a prerequisite. Commands include `cc -c main.c' and `cc -c kbd.c'. When a target is a file, it needs to be recompiled or relinked if any of its prerequisites change. In addition, any prerequisites that are themselves automatically generated should be updated first. In this example, `edit' depends on each of the eight object files; the object file `main.o' depends on the source file `main.c' and on the header file `defs.h'. A shell command follows each line that contains a target and prerequisites. These shell commands say how to update the target file.
A tab character must come at the beginning of every command line to distinguish command lines from other lines in the makefile.
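The tab requirement can be sketched in a quick shell session; the target name below is invented, and printf's \t supplies the literal tab that must begin the command line.

```shell
cd "$(mktemp -d)"
# Write a one-rule makefile; \t becomes the mandatory leading tab.
printf 'hello.txt :\n\techo hi > hello.txt\n' > Makefile
make -s hello.txt                  # -s suppresses echoing of the command
cat hello.txt
```

If the tab were replaced with spaces, make would reject the file with a "missing separator" error.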
The target `clean' is not a file, but merely the name of an action. Since you normally do not want to carry out the actions in this rule, `clean' is not a prerequisite of any other rule. Consequently, make never does anything with it unless you tell it specifically. Note that this rule not only is not a prerequisite, it also does not have any prerequisites, so the only purpose of the rule is to run the specified commands. Targets that do not refer to files but are just actions are called phony targets.

By default, make starts with the first target (not targets whose names start with `.'). This is called the default goal. Goals are the targets that make strives ultimately to update. In the simple example of the previous section, the default goal is to update the executable program `edit'; therefore, we put that rule first. Thus, when you give the command:

make

make reads the makefile in the current directory and begins by processing the first rule. In the example, this rule is for relinking `edit'; but before make can fully process this rule, it must process the rules for the files that `edit' depends on, which in this case are the object files. Each of these files is processed according to its own rule. These rules say to update each `.o' file by compiling its source file. The recompilation must be done if the source file, or any of the header files named as prerequisites, is more recent than the object file, or if the object file does not exist.

The other rules are processed because their targets appear as prerequisites of the goal. If some other rule is not depended on by the goal (or anything it depends on, etc.), that rule is not processed, unless you tell make to do so (with a command such as make clean). Before recompiling an object file, make considers updating its prerequisites, the source file and header files.
This makefile does not specify anything to be done for them--the `.c' and `.h' files are not the targets of any rules--so make does nothing for these files. But make would update automatically generated C programs, such as those made by Bison or Yacc, by their own rules at this time. After recompiling whichever object files need it, make decides whether to relink `edit'. This must be done if the file `edit' does not exist, or if any of the object files are newer than it. If an object file was just recompiled, it is now newer than `edit', so `edit' is relinked. Thus, if we change the file `insert.c' and run make, make will compile that file to update `insert.o', and then link `edit'. If we change the file `command.h' and run make, make will recompile the object files `kbd.o', `command.o' and `files.o' and then link the file `edit'.
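The rebuild-on-change behavior described above can be sketched with copy commands instead of compilation; all file names here are invented for the demo, with out.txt playing the role of an object file and in.txt the role of a source file.

```shell
cd "$(mktemp -d)"
# A one-rule makefile: out.txt is remade whenever in.txt is newer.
printf 'out.txt : in.txt\n\tcp in.txt out.txt\n' > Makefile
echo v1 > in.txt
make -s                   # out.txt does not exist, so the rule runs
make                      # nothing changed: make reports the target is up to date
sleep 1                   # ensure a visibly newer timestamp on the next write
echo v2 > in.txt          # change a prerequisite...
make -s                   # ...and make rebuilds the target
cat out.txt
```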
9.2 Using Variables in Makefiles
In our example, we had to list all the object files twice in the rule for `edit' (repeated here):

edit : main.o kbd.o command.o display.o \
       insert.o search.o files.o utils.o
	cc -o edit main.o kbd.o command.o display.o \
	      insert.o search.o files.o utils.o

Such duplication is error-prone; if a new object file is added to the system, we might add it to one list and forget the other. We can eliminate the risk and simplify the makefile by using a variable. Variables allow a text string to be defined once and substituted in multiple places later. It is standard practice for every makefile to have a variable named objects, OBJECTS, objs, OBJS, obj, or OBJ which is a list of all object file names. We would define such a variable objects with a line like this in the makefile:

objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o

Then, each place we want to put a list of the object file names, we can substitute the variable's value by writing `$(objects)'. Here is how the complete simple makefile looks when you use a variable for the object files:

objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o

edit : $(objects)
	cc -o edit $(objects)
main.o : main.c defs.h
	cc -c main.c
kbd.o : kbd.c defs.h command.h
	cc -c kbd.c
command.o : command.c defs.h command.h
	cc -c command.c
display.o : display.c defs.h buffer.h
	cc -c display.c
insert.o : insert.c defs.h buffer.h
	cc -c insert.c
search.o : search.c defs.h buffer.h
	cc -c search.c
files.o : files.c defs.h buffer.h command.h
	cc -c files.c
utils.o : utils.c defs.h
	cc -c utils.c
clean :
	rm edit $(objects)
It is not necessary to spell out the commands for compiling the individual C source files, because make can figure them out: it has an implicit rule for updating a `.o' file from a correspondingly named `.c' file using a `cc -c' command. For example, it will use the command `cc -c main.c -o main.o' to compile `main.c' into `main.o'. We can therefore omit the commands from the rules for the object files. When a `.c' file is used automatically in this way, it is also automatically added to the list of prerequisites. We can therefore omit the `.c' files from the prerequisites, provided we omit the commands. Here is the entire example, with both of these changes, and a variable objects as suggested above:

objects = main.o kbd.o command.o display.o \
          insert.o search.o files.o utils.o

edit : $(objects)
	cc -o edit $(objects)

main.o : defs.h
kbd.o : defs.h command.h
command.o : defs.h command.h
display.o : defs.h buffer.h
insert.o : defs.h buffer.h
search.o : defs.h buffer.h
files.o : defs.h buffer.h command.h
utils.o : defs.h

.PHONY : clean
clean :
	-rm edit $(objects)

This is how we would write the makefile in actual practice.
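The built-in `.c' to `.o' rule can be observed directly with make's -n (dry run) option, so no compiler is actually invoked; main.c below is an empty placeholder and there is no makefile at all.

```shell
cd "$(mktemp -d)"
touch main.c                       # empty placeholder source file
out=$(make -n main.o)              # -n prints the command without running it
echo "$out"                        # GNU make's implicit rule supplies a cc -c ... command
```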
9.3 Writing Makefiles The information that tells make how to recompile a system comes from reading a data base called the makefile.
What Makefiles Contain
Makefiles contain five kinds of things: explicit rules, implicit rules, variable definitions, directives, and comments. Rules, variables, and directives are described at length in later chapters.

• An explicit rule says when and how to remake one or more files, called the rule's targets. It lists the other files that the targets depend on, called the prerequisites of the targets, and may also give commands to use to create or update the targets. See section Writing Rules.
• An implicit rule says when and how to remake a class of files based on their names. It describes how a target may depend on a file with a name similar to the target and gives commands to create or update such a target. See section Using Implicit Rules.
• A variable definition is a line that specifies a text string value for a variable that can be substituted into the text later. The simple makefile example shows a variable definition for objects as a list of all object files (see section Variables Make Makefiles Simpler).
• A directive is a command for make to do something special while reading the makefile. These include:
  o Reading another makefile (see section Including Other Makefiles).
  o Deciding (based on the values of variables) whether to use or ignore a part of the makefile (see section Conditional Parts of Makefiles).
  o Defining a variable from a verbatim string containing multiple lines (see section Defining Variables Verbatim).
• `#' in a line of a makefile starts a comment. It and the rest of the line are ignored, except that a trailing backslash not escaped by another backslash will continue the comment across multiple lines. Comments may appear on any of the lines in the makefile, except within a define directive, and perhaps within commands (where the shell decides what is a comment). A line containing just a comment (with perhaps spaces before it) is effectively blank, and is ignored.
What Name to Give Your Makefile By default, when make looks for the makefile, it tries the following names, in order: `makefile' and `Makefile'. Normally you should call your makefile either `makefile' or `Makefile'. If make finds none of these names, it does not use any makefile. Then you must specify a goal with a command argument, and make will attempt to figure out how to remake it using only its built-in implicit rules. If you want to use a nonstandard name for your makefile, you can specify the makefile name with the `-f' or `--file' option. The arguments `-f name' or `--file=name' tell make to read the file name as the makefile. If you use more than one `-f' or `--file' option, you can specify several makefiles.
All the makefiles are effectively concatenated in the order specified. The default makefile names `makefile' and `Makefile' are not checked automatically if you specify `-f' or `--file'.
Including Other Makefiles
The include directive tells make to suspend reading the current makefile and read one or more other makefiles before continuing. The directive is a line in the makefile that looks like this:

include filenames...

filenames can contain shell file name patterns. Extra spaces are allowed and ignored at the beginning of the line, but a tab is not allowed. (If the line begins with a tab, it will be considered a command line.) Whitespace is required between include and the file names, and between file names; extra whitespace is ignored there and at the end of the directive. A comment starting with `#' is allowed at the end of the line. If the file names contain any variable or function references, they are expanded. For example, if you have three `.mk' files, `a.mk', `b.mk', and `c.mk', and $(bar) expands to bish bash, then the following expression

include foo *.mk $(bar)

is equivalent to

include foo a.mk b.mk c.mk bish bash

When make processes an include directive, it suspends reading of the containing makefile and reads from each listed file in turn. When that is finished, make resumes reading the makefile in which the directive appears. One occasion for using include directives is when several programs, handled by individual makefiles in various directories, need to use a common set of variable definitions or pattern rules. Another such occasion is when you want to generate prerequisites from source files automatically; the prerequisites can be put in a file that is included by the main makefile. This practice is generally cleaner than somehow appending the prerequisites to the end of the main makefile, as has traditionally been done with other versions of make. If the specified name does not start with a slash, and the file is not found in the current directory, several other directories are searched. First, any directories you have specified with the `-I' or `--include-dir' option are searched (see section Summary of Options).
Then the following directories (if they exist) are searched, in this order: `prefix/include' (normally `/usr/local/include' (1)) `/usr/gnu/include', `/usr/local/include', `/usr/include'. If an included makefile cannot be found in any of these directories, a warning message is generated, but it is not an immediately fatal error; processing of the makefile containing the include continues. Once it has finished reading makefiles, make will try to remake any that are out of date or don't exist. See section How Makefiles Are Remade. Only after it has tried to find a way to remake a makefile and failed, will make diagnose the missing makefile as a fatal error.
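A minimal sketch of the include directive, with invented file names: shared variable definitions live in defs.mk, and the main makefile pulls them in.

```shell
cd "$(mktemp -d)"
printf 'greeting = hello from defs.mk\n' > defs.mk
# The include line makes the variables defined in defs.mk visible here.
printf 'include defs.mk\nshow :\n\t@echo $(greeting)\n' > Makefile
make show
```

This is the same pattern used for auto-generated prerequisite files: the main makefile stays short, and the generated or shared material lives in the included file.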
If you want make to simply ignore a makefile which does not exist and cannot be remade, with no error message, use the -include directive instead of include, like this:

-include filenames...

This acts like include in every way except that there is no error (not even a warning) if any of the filenames do not exist. For compatibility with some other make implementations, sinclude is another name for -include.
The Variable MAKEFILES If the environment variable MAKEFILES is defined, make considers its value as a list of names (separated by whitespace) of additional makefiles to be read before the others. This works much like the include directive: various directories are searched for those files. In addition, the default goal is never taken from one of these makefiles and it is not an error if the files listed in MAKEFILES are not found. The main use of MAKEFILES is in communication between recursive invocations of make. It usually is not desirable to set the environment variable before a top-level invocation of make, because it is usually better not to mess with a makefile from outside. However, if you are running make without a specific makefile, a makefile in MAKEFILES can do useful things to help the built-in implicit rules work better, such as defining search paths. Some users are tempted to set MAKEFILES in the environment automatically on login, and program makefiles to expect this to be done. This is a very bad idea, because such makefiles will fail to work if run by anyone else. It is much better to write explicit include directives in the makefiles.
How Makefiles Are Remade Sometimes makefiles can be remade from other files, such as RCS or SCCS files. If a makefile can be remade from other files, you probably want make to get an up-to-date version of the makefile to read in. To this end, after reading in all makefiles, make will consider each as a goal target and attempt to update it. If a makefile has a rule which says how to update it (found either in that very makefile or in another one) or if an implicit rule applies to it, it will be updated if necessary. After all makefiles have been checked, if any have actually been changed, make starts with a clean slate and reads all the makefiles over again. (It will also attempt to update each of them over again, but normally this will not change them again, since they are already up to date.) If you know that one or more of your makefiles cannot be remade and you want to keep make from performing an implicit rule search on them, perhaps for efficiency reasons, you can use any normal method of preventing implicit rule lookup to do so. For example, you can write an explicit rule with the makefile as the target, and an empty command string. If the makefiles specify a double-colon rule to remake a file with commands but no prerequisites, that file will always be remade. In the case of makefiles, a makefile that has a double-colon rule with commands but no prerequisites will be remade every time make is run, and then again after make starts over and reads the makefiles in again. This would cause an infinite loop: make would constantly remake the makefile, and never do anything else. So, to avoid this, make will not
attempt to remake makefiles which are specified as targets of a double-colon rule with commands but no prerequisites. If you do not specify any makefiles to be read with `-f' or `--file' options, make will try the default makefile names; see section What Name to Give Your Makefile. Unlike makefiles explicitly requested with `-f' or `--file' options, make is not certain that these makefiles should exist. However, if a default makefile does not exist but can be created by running make rules, you probably want the rules to be run so that the makefile can be used. Therefore, if none of the default makefiles exists, make will try to make each of them in the same order in which they are searched for until it succeeds in making one, or it runs out of names to try. When you use the `-t' or `--touch' option, you would not want to use an out-of-date makefile to decide which targets to touch. So the `-t' option has no effect on updating makefiles; they are really updated even if `-t' is specified. Likewise, `-q' (or `--question') and `-n' (or `--just-print') do not prevent updating of makefiles, because an out-of-date makefile would result in the wrong output for other targets. Thus, `make -f mfile -n foo' will update `mfile', read it in, and then print the commands to update `foo' and its prerequisites without running them. The commands printed for `foo' will be those specified in the updated contents of `mfile'. However, on occasion you might actually wish to prevent updating of even the makefiles. You can do this by specifying the makefiles as goals in the command line as well as specifying them as makefiles. When the makefile name is specified explicitly as a goal, the options `-t' and so on do apply to them. Thus, `make -f mfile -n mfile foo' would read the makefile `mfile', print the commands needed to update it without actually running them, and then print the commands needed to update `foo' without running them. 
The commands for `foo' will be those specified by the existing contents of `mfile'.
How make Reads a Makefile
make does its work in two distinct phases. During the first phase it reads all the makefiles, included makefiles, etc., internalizes all the variables and their values and the implicit and explicit rules, and constructs a dependency graph of all the targets and their prerequisites. During the second phase, make uses these internal structures to determine which targets need to be rebuilt and to invoke the rules necessary to do so. It is important to understand this two-phase approach because it has a direct impact on how variable and function expansion happens; this is often a source of some confusion when writing makefiles. We say that expansion is immediate if it happens during the first phase: in this case make expands any variables or functions in that section of a construct as the makefile is parsed. We say that expansion is deferred if it is not performed immediately. Expansion of a deferred construct is not performed until either the construct appears later in an immediate context, or until the second phase.
Variable Assignment Variable definitions are parsed as follows:
immediate = deferred
immediate ?= deferred
immediate := immediate
immediate += deferred or immediate

define immediate
deferred
endef

For the append operator, `+=', the right-hand side is considered immediate if the variable was previously set as a simple variable (`:='), and deferred otherwise.

Conditional Syntax
All instances of conditional syntax are parsed immediately, in their entirety; this includes the ifdef, ifeq, ifndef, and ifneq forms.
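The immediate/deferred distinction for `=' versus `:=' can be sketched with a tiny makefile (variable names invented): the simple variable freezes the value x had at assignment time, while the recursive one re-expands x when the command runs in the second phase.

```shell
cd "$(mktemp -d)"
{
  printf 'x = one\n'
  printf 'simple := $(x)\n'      # immediate: $(x) expands right now, to "one"
  printf 'recursive = $(x)\n'    # deferred: the text "$(x)" is stored unexpanded
  printf 'x = two\n'
  printf 'show :\n\t@echo simple=$(simple) recursive=$(recursive)\n'
} > Makefile
make show                        # prints: simple=one recursive=two
```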
Rule Definition
A rule is always expanded the same way, regardless of the form:

immediate : immediate ; deferred
	deferred

That is, the target and prerequisite sections are expanded immediately, and the commands used to construct the target are always deferred. This general rule is true for explicit rules, pattern rules, suffix rules, static pattern rules, and simple prerequisite definitions.
9.4 Sample Makefile

# Put all your source files here.
SRC=main.c source1.c source2.cpp
OBJ1=$(SRC:.c=.o)
OBJ=$(OBJ1:.cpp=.o)

# This is the name of your output file.
OUT=runme

# This specifies all your include directories.
INCLUDES=-I/usr/local/include -I.

# Put any flags you want to pass to the C compiler here.
CFLAGS=-g -O2 -Wall

# And put any C++ compiler flags here.
CCFLAGS=$(CFLAGS)

# CC specifies the name of the C compiler; CCC is the C++ compiler.
CC=cc
CCC=CC

# Put any libraries here.
LIBS=-L/usr/local/lib -lm
LDFLAGS=

##### RULES #####
# All rules are in the format:
#   item: [dependency list]
#           command
# This means that "item" depends on what's in the dependency list; in other
# words, before "item" can be built, everything in the dependency list must
# be up to date.
# Note that each command line MUST begin with a tab, not a set of spaces!

.SUFFIXES: .c .cpp .o

default: dep $(OUT)

.c.o:
	$(CC) $(INCLUDES) $(CFLAGS) -c $< -o $@

.cpp.o:
	$(CCC) $(INCLUDES) $(CCFLAGS) -c $< -o $@

$(OUT): $(OBJ)
	$(CC) $(OBJ) $(LDFLAGS) $(LIBS) -o $(OUT)

depend: dep

dep:
	makedepend -- $(CFLAGS) -- $(INCLUDES) $(SRC)

clean:
	/bin/rm -f $(OBJ) $(OUT)