Advanced System Administration II - Student Manual

SuSE Linux Enterprise Server: Advanced System Administration II
Release: June 2003 (SuSE Linux Enterprise Server 8)

This work is protected by copyright. All rights in connection with the reproduction or copying of this training manual or parts thereof are reserved.

1 Compiling Software

Learning Aims

In this chapter, learn
• about the interrelation between sources, makefiles, and programs
• how to compile software from source

As most of the software for Linux is under the GPL (General Public License), it is usually available as source. Fortunately, distributors spare users a lot of work by compiling these sources, packing them in packages, and selling them in bundled form on CDs. Sooner or later, however, you may come across a program for which no suitable RPM package exists and which must therefore be compiled before you can use it.

This chapter covers the basic tools and components needed for compiling a program from source. A small C program is used as an example in this chapter. A different procedure may be needed for other programming languages.

1.1 Basics

Strictly speaking, computers only understand zeros and ones, which is rather difficult for humans. In the early days of computer programming, programmers had to master this language to communicate with their computers. As this was extremely cumbersome, various programming languages were developed that allow commands to be communicated to the computer in a more or less intelligible form, or at least in a form more intelligible than a text consisting exclusively of zeros and ones.

However, the computer does not understand these commands directly. First, they must be translated to machine code, a language that the computer is able to understand. This is done by means of a special program called the compiler. The translation of program code to machine code actually comprises several steps, as shown in Figure 1.1.

In practice, the individual programs do not have to be executed manually. The program gcc (GNU C Compiler) makes sure that the individual steps are executed in the correct sequence.


[Figure 1.1: From Source Code to the Executable Program (source: Oualline, 1997). Stages shown: source code in high-level language, program in assembler language, object code, executable program.]

1.2 “Hello World” in C

This procedure can be demonstrated by means of a simple example. For this purpose, let us take a look at the program "Hello World", which is frequently used as a simple example when introducing a programming language. This manual does not attempt to provide an introduction to C programming. It merely endeavors to show the basic workflow for creating a program.

    /* hello.c
     * Prints Hello World on the screen
     */

    /* Inclusion of a header file */
    #include <stdio.h>

    /* Main function */
    int main ()
    {
        /* Prints Hello World */
        printf("Hello World\n");
        return (0);
    }

Comments are inserted between “/*” and “*/”. The line #include <stdio.h> includes a header (or include) file. Files specified in this way are included in the source code during compilation. int main () is the main function of the program. The commands to execute are placed between braces. In this example, only two commands are used, one for printing "Hello World" on the screen and one for returning “0” as the return value. The program can be compiled by simply executing gcc:

    tux@earth:~ > gcc -Wall -o hello hello.c

The option -Wall makes sure that compiler warnings are displayed. The result is the program hello, whose only function is to print "Hello World" on the screen. The individual steps shown in Figure 1.1 are not visible when gcc is executed. However, some aspects can be seen later, such as the fact that the program is linked to libraries:

    tux@earth:~ > ldd ./hello
     => /lib/ (0x40022000)
    /lib/ => /lib/ (0x40000000)

In this example, the program was linked to the two libraries listed in the ldd output.
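Although gcc hides the individual steps, they can be run one at a time to make the stages of Figure 1.1 visible. The following is a minimal sketch, assuming gcc is installed. The temporary directory, the intermediate file names (hello.i, hello.s, hello.o), and the function name build_in_stages are chosen for illustration only.

```shell
#!/bin/sh
# Sketch: run the translation stages one at a time instead of a single
# gcc call. Assumes gcc is installed; all file names are illustrative.

build_in_stages() {
    stage_dir=$(mktemp -d) || return 1
    cd "$stage_dir" || return 1

    # A minimal C source file in the spirit of hello.c
    cat > hello.c <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("Hello World\n");
    return 0;
}
EOF

    gcc -E hello.c -o hello.i   # 1. preprocess: expand #include lines
    gcc -S hello.i -o hello.s   # 2. compile: C code to assembler language
    gcc -c hello.s -o hello.o   # 3. assemble: assembler to object code
    gcc hello.o -o hello        # 4. link: object code to executable program
    ./hello
}

if command -v gcc >/dev/null 2>&1; then
    build_in_stages
else
    echo "gcc not found; skipping" >&2
fi
```

After the run, the intermediate files remain in the temporary directory and can be inspected with a pager or text editor (hello.o and hello are binary files).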

Exercise Perform the steps described above. Write a program called hello.c then compile and test it.

1.3 ./configure and the Makefile

The example hello.c demonstrates the basic elements of a program. However, real software projects consist of multiple files that have to be compiled in the correct sequence. The resulting object files must be linked to form an executable program. This task can be automated with the help of the program make and a makefile. Moreover, make accelerates the development of programs, because only the program parts whose sources changed since the last compiling cycle are recompiled when the source is tested or edited. For example, to compile the program hello.c with the help of make, prepare a makefile containing the needed commands.

A makefile can
• compile the program from source
• install the program
• uninstall the program
• clean up the directory in which the compilation was performed

A makefile consists of targets, dependencies, and the commands for the targets. Targets and dependencies are separated by a colon. The commands must be placed under the target, each indented with one tab character. “#” introduces comments. A makefile for the file hello.c could appear as follows:

    # Makefile for hello
    all: hello

    hello: hello.c
    	gcc -Wall -o hello hello.c

    install: hello
    	install -m 755 hello /usr/local/bin/hello

    uninstall: /usr/local/bin/hello
    	rm -f /usr/local/bin/hello

    clean:
    	rm -f hello

If you execute the command make in the respective directory, the program make searches this directory for the files GNUmakefile, makefile, and Makefile, in this order, and executes the commands in the file it finds first. If make is executed without any parameters, the first target is used. In our example, this is all. This target depends on the target hello, which specifies the steps to take: if the file hello does not exist yet or hello.c is newer than the file hello, gcc is executed with the respective options.

The command make can also be used with individual targets: for example, make install (as root) installs the file at the specified location and make uninstall removes the file.

The use of make is not limited to the development of programs. It can also be used for other purposes, such as updating NIS maps or other projects in which specific actions should be performed on the basis of the changes made to files.

Basically, even large software projects are built in the same way. Naturally, the respective makefiles are much more extensive and complex. If the software has to be compiled to a functional program on multiple architectures, things get really complicated.
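The way make decides whether a target has to be rebuilt can be observed with a very small makefile. The following is a sketch only, assuming make is installed. The files input.txt and output.txt, the function name demo_make, and the use of cp in place of a compiler call are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: make runs the commands of a target only if the target is missing
# or older than one of its dependencies.
# Assumes make is installed. The file names and the cp "build step" are
# illustrative stand-ins for a real compiler call.

demo_make() {
    make_dir=$(mktemp -d) || return 1
    cd "$make_dir" || return 1

    echo "version 1" > input.txt
    # printf writes the mandatory tab (\t) in front of the command
    printf 'output.txt: input.txt\n\tcp input.txt output.txt\n' > Makefile

    make    # output.txt is missing: cp is executed
    make    # output.txt is up to date: nothing is executed

    echo "version 2" > input.txt
    # Backdate the target so that input.txt is definitely the newer file
    touch -t 200301300000 output.txt
    make    # input.txt is newer than output.txt: cp is executed again

    cat output.txt
}

if command -v make >/dev/null 2>&1; then
    demo_make
else
    echo "make not found; skipping" >&2
fi
```

The target is backdated with touch instead of waiting between the steps, so the result does not depend on the timestamp resolution of the file system.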

To make it easier for programmers to create makefiles that work on different machines and architectures, a number of tools were developed, such as the programs autoconf and automake. The end product of these tools is a script called configure. This script generates a makefile that is adapted to the machine on which the software will be compiled. Options used when executing configure are taken into consideration (such as ./configure --prefix=/usr). If a configure script exists, the software installation from source is usually limited to three short commands:

    earth:~ # ./configure
    creating cache ./config.cache
    checking for a BSD compatible install... /usr/bin/install -c
    checking whether build environment is sane... yes
    checking whether make sets ${MAKE}... yes
    checking for working aclocal... found
    ...
    earth:~ # make
    make all-recursive
    make[1]: Change to the directory »/tmp/xpenguins-2.2«
    Making all in src
    make[2]: Change to the directory »/tmp/xpenguins-2.2/src«
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I/usr/X11R6/include -g -O2
        -DPKGDATADIR=\""/usr/local/share/xpenguins"\" -c main.c
    ...
    earth:~ # make install

In contrast, uninstallation is a bit more difficult. Usually, make uninstall is not available, so the individual files often have to be removed manually. This problem can be avoided by using the package manager RPM.

1.4 diff and patch

Software is never static. As time goes on, bugs are fixed, new features are programmed, and security issues are solved. Often these changes merely affect a small part of the project files. If the source archive is very large, it saves time and bandwidth to download only the differences from the previous version instead of the complete archive. The programs used for this purpose are diff and patch:
• diff generates a file containing the differences between two files.
• patch generates the new version from the old file and the file generated by diff.


Consider the program hello.c. The project is internationalized by placing the English version of hello.c in the directory en and a German version in the directory de:

    /* hello.c
     * Prints Hello World on the screen
     */

    /* Includes a header file */
    #include <stdio.h>

    /* Main function */
    int main ()
    {
        /* Prints Hello World in German */
        printf("Hallo Welt\n");
        return (0);
    }

The difference between the two directories, which only contain the two said files in our example, is generated with the command diff:

    tux@earth:~ > diff -rNu de/ en/
    --- de/hello.c 2003-01-30 16:34:26.000000000 +0100
    +++ en/hello.c 2003-01-30 16:34:49.000000000 +0100
    @@ -7,7 +7,7 @@
     /* Main function */
     int main ()
     {
    -    /* Prints Hello World in German */
    -    printf("Hallo Welt\n");
    +    /* Prints Hello World */
    +    printf("Hello World\n");
         return (0);
     }

The output can be redirected to a file such as hello.diff. The options used have the following meanings:

    Option   Meaning
    -r       Recursively compare any subdirectories found
    -N       Treat absent files as empty
    -u       Output in unified output format

The individual lines have the following meanings:

    Line          Meaning
    --- and +++   Source and target file
    @@ ... @@     Range of the change
    - and +       Removed and added lines

Additional options and their meanings are listed in the man pages of diff.

patch generates the new version from this file and the old file in the directory de:

    tux@earth:~ > patch -d de -p1 < hello.diff
    patching file hello.c

The options used with the command have the following meanings:

    Option   Meaning
    -d       First change to the specified directory
    -p1      Strip the first component (everything up to and including the first slash) from the path specifications in the patch file

Additional options and their meanings are listed in the manual pages of patch. The new file can be compiled and used as usual.
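The complete cycle of creating and applying a patch can be tried out in a scratch directory. The following sketch assumes that diff and patch are installed. The directory names follow the de/ and en/ example, whereas the one-line hello.txt files and the function name demo_patch are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: create a patch with diff and apply it with patch.
# Assumes diff and patch are installed; the directory layout mirrors the
# de/ and en/ example, but the one-line hello.txt files are illustrative.

demo_patch() {
    patch_dir=$(mktemp -d) || return 1
    cd "$patch_dir" || return 1
    mkdir de en
    echo "Hallo Welt"  > de/hello.txt    # old version
    echo "Hello World" > en/hello.txt    # new version

    # diff returns exit status 1 when it finds differences; this is expected
    diff -rNu de/ en/ > hello.diff

    # -d de changes to de/ first, -p1 strips the leading "de/" or "en/"
    patch -d de -p1 < hello.diff

    cat de/hello.txt
}

if command -v diff >/dev/null 2>&1 && command -v patch >/dev/null 2>&1; then
    demo_patch
else
    echo "diff or patch not found; skipping" >&2
fi
```

After the patch is applied, the file in de/ has the same content as the file in en/, which can be verified by running diff on the two directories again.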

Exercise Download the source of the program xpenguins and install the program. Take a look at the makefile. (Note: You may have to activate the option Programs in desktop window under Desktop Behavior in the KDE Control Center for xpenguins to produce a visual effect.)

1.5 For More Information

• Steve Oualline: Practical C Programming, 3rd Ed., O’Reilly & Associates, Sebastopol, 1997
• Andrew Oram & Steve Talbott: Managing Projects with make, O’Reilly & Associates, Sebastopol, 1991

Summary
• A compiler is used to create a program from source.
• Makefiles greatly facilitate the creation of programs.
• Usually, the three commands ./configure, make, and make install are sufficient for compiling programs from source.
• The programs diff and patch can be used to create a new file version based on the difference from the previous version of the respective file.


2 RPM (Red Hat Package Manager)

Learning Aims

In this chapter, learn
• about the Red Hat Package Manager
• about the structure of the RPM spec file
• how to build your own RPM packages

© 2003, SuSE Linux AG



2.1 Introduction

The Red Hat Package Manager (RPM) is a powerful tool for installing and uninstalling software on Linux machines. In the course “SuSE Linux Enterprise Server: System Administration Basics”, you learned how RPM can be used for installing software. In contrast, this section teaches you to build your own RPM packages.

Although the installation of a program from source archives allows the program to be adapted to the machine in the best way possible, this approach has some disadvantages. For instance, you need to install a compiler and other tools on the machine, and uninstallation usually requires additional effort. Often you have to delete the individual files manually.

RPM is characterized by the following features and advantages:
• The sources are used as they are obtained from the program author.
• The installation is greatly simplified for the user.
• Uninstallation is possible without leaving behind any “dead files”.
• Dependencies on other packages are taken into consideration.
• Information about the installed packages can be queried.
• Updates are relatively easy to perform.

Nevertheless, to benefit from these advantages, additional work is necessary initially, as the RPM package must be built first.

2.2 Basics

To build an RPM package, you need at least two (sometimes three) components:
• The program sources
• A spec file
• Possibly the patches required for the compilation

The sources, spec file, and patches are located in a defined directory structure. In SuSE Linux Enterprise Server, this structure already exists under /usr/src/packages/:
• SOURCES: This directory contains the original source archives.
• SPECS: This directory contains the spec files controlling the build process.
• BUILD: This is where the source code is unpacked.
• RPMS: Upon completion, the RPM package is placed in this directory (or one of its subdirectories).
• SRPMS: Upon completion, the source RPM package is placed in this directory.



Authors usually make the sources available as packed .tgz archives on FTP servers. Copy these archives to the SOURCES directory without modifying them.



Patches may be necessary to take specific requirements of the target system into consideration or to solve platform-dependent problems.


The Spec File

The spec file is the core of the RPM build process. It contains the information RPM needs for building the package. Its basic structure is as follows:

• The Preamble
  The preamble contains information displayed when the user queries package information. This includes information about the version, copyright, and a description of the package.

• The Prep Section
  This is where the actual build process begins. Most importantly, the sources are unpacked and any required patches are applied. Basically, this is a shell script, so even very specific actions can be executed here.

• The Build Section
  Just like the prep section, the build section is a shell script. This section can consist of a single make command or several commands, whatever is needed for compiling the program from the sources.

• The Install Section
  This section, too, is a shell script. It contains the commands required for installing the package, such as the command make install or any other commands needed for this purpose, such as mv, cp, or install.

• Install and Uninstall Scripts
  In contrast to the previous sections, which are associated with the build system, this section contains scripts executed during the installation or uninstallation of the finished RPM package. Four points in time are available for this purpose:
  – Before the installation of the package
  – After the installation of the package
  – Before the uninstallation of the package
  – After the uninstallation of the package

• The Verify Script
  This script, too, is executed on the system of the user when RPM verifies the correct installation of the package. Although verification is a default function of RPM, this script can also cover aspects that exceed the built-in possibilities of RPM.

• The Clean Section
  Following the build process, RPM automatically performs a clean-up operation. However, this script also enables the execution of any additional actions that may be needed.

• The File List
  This final section consists of a list of files comprising the package. This list is vital, as no package can be built without it.


2.3 Building Your Own RPM Package



As an example, we will consider the SuSE FTP Proxy, which is part of the SuSE Proxy Suite (to date, the suite merely contains this item). The sources are available on the SuSE FTP server. Copy this source archive to the directory SOURCES.

As the compilation of the package is part of the build process of an RPM package, first check if the package can be compiled without RPM according to the instructions in the README file. This should work without problems with the Proxy Suite. However, some additional packages may need to be installed.

If a package can only be installed after some modifications (e.g., of the makefile), these modifications must be included in a patch that is applied when the package is built with RPM. In this way, the RPM can be built without modifying the sources. The procedure for creating the patch exceeds the scope of this manual. However, information about this subject is available in other documentation, such as the RPM HOWTO, section 7.2, Test Building.


The Spec File

    #
    # Spec file for the Proxy Suite.
    #
    Summary: The SuSE Proxy Suite: ftp-proxy.
    Name: proxy-suite
    Version: 1.9
    Release: 1
    Copyright: GPL
    Group: Productivity/Networking/Ftp/Servers
    Source:
    %description
    The SuSE FTP Proxy (part of the SuSE Proxy Suite, still waiting
    to become a Suite after all.)
    Use to proxy ftp connections across an ALG.

Generally, the structure of this section is as follows: keyword : entry (in the same line). The sequence of the lines within the preamble is not important. Name, Version, and Release specify the name of the software, the version number assigned by the author of the software, and the RPM release number, respectively. The file name of the RPM package and the name of the package are generated from these three components. The other lines are more or less self-explanatory. %description is followed by a package description (possibly consisting of several lines) for the user.
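How the file name is composed from the preamble can be sketched with a few lines of shell. This is an illustration only: the helper names spec_field and rpm_file_name are hypothetical and not part of rpm, and the architecture component of the file name (such as i386 or src) is not taken from the preamble but is stated here as an assumption about the usual RPM naming scheme.

```shell
#!/bin/sh
# Sketch: derive the package file name from the preamble of a spec file.
# The sample preamble is copied from the manual; spec_field and
# rpm_file_name are hypothetical helper names, not part of rpm itself.

spec=$(mktemp) || exit 1
cat > "$spec" <<'EOF'
Summary: The SuSE Proxy Suite: ftp-proxy.
Name: proxy-suite
Version: 1.9
Release: 1
EOF

# Print the entry that follows "keyword:" on the first matching line
spec_field() {
    sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

# Compose name-version-release.architecture.rpm
rpm_file_name() {
    printf '%s-%s-%s.%s.rpm\n' \
        "$(spec_field Name "$1")" \
        "$(spec_field Version "$1")" \
        "$(spec_field Release "$1")" \
        "$2"
}

rpm_file_name "$spec" i386   # binary package built for i386
rpm_file_name "$spec" src    # source package
```

For the sample preamble, the two calls print proxy-suite-1.9-1.i386.rpm and proxy-suite-1.9-1.src.rpm.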

In this example, the %prep section merely contains a preconfigured macro for unpacking the source archive to the directory BUILD:

    %prep
    %setup

Shell scripts (without the usual #!/bin/bash at the beginning) can be inserted in this section as well as in most of the following sections. The %build section contains the commands needed for compiling the sources.

    %build
    ./configure
    make

The generated files must be copied to the correct location in the file system. This is handled by the %install section:

    %install
    make install

Finally, the generated files must be included in the RPM package. As the previous steps do not provide any basis for RPM to guess which files actually belong to the package, the files must be listed explicitly in the %files section. Documentation files and configuration files can be marked as such. In the following example, the files AUTHORS, COPYING, INSTALL, and CREDITS are part of the documentation.

    %files
    %doc AUTHORS COPYING
    %doc INSTALL CREDITS
    /usr/local/sbin/ftp-proxy
    /usr/local/man/man5/ftp-proxy.conf.5.gz
    /usr/local/man/man8/ftp-proxy.8.gz
    /usr/local/etc/proxy-suite/ftp-proxy.conf

In this example, the %clean section can be used to delete the files copied to the file system:

    %clean
    rm /usr/local/man/man5/ftp-proxy.conf.5.gz
    rm /usr/local/man/man8/ftp-proxy.8.gz
    rm /usr/local/etc/proxy-suite/ftp-proxy.conf

Changes are logged under %changelog:

    %changelog
    * Mon Jan 27, 2003 - [email protected]
    - My first try

rpm -ba builds a binary RPM package as well as a source RPM package. rpm -bb only builds a binary RPM package. Now, run the test:

    rpm -ba /usr/src/packages/SPECS/proxy-suite.spec

If everything worked correctly, the directory /usr/src/packages/RPMS/i386/ should contain the package proxy-suite-1.9-1.i386.rpm and the directory /usr/src/packages/SRPMS/ should contain the package proxy-suite-1.9-1.src.rpm.


Exercise Download the package proxy-suite-1.9.tgz from one of the FTP servers indicated above and build an RPM package as described above.


2.4 Fine-Tuning Your Own RPM Package

The procedure described so far leads to a functional RPM package. Nevertheless, some fine-tuning may be required. For instance, one of the greatest disadvantages is that the program is installed on the build machine when the RPM package is built. This does not really matter if the program is not yet installed on the build system. However, if you want to build a package for a program that is already installed, such as Postfix, you have a problem. In this case, the existing Postfix installation and the configuration would be overwritten, resulting in the corruption of the existing mail configuration. Furthermore, you may want to change the directories to which the program and data files are copied. For example, you may want the finished files to be copied to /usr/ instead of /usr/local/. This, too, requires some modifications.



The terms BuildRoot and BuildDir and the associated variables RPM_BUILD_ROOT and RPM_BUILD_DIR are quite similar to each other and can easily be confused. The build directory mentioned above is usually the directory /usr/src/packages/BUILD/. This is where the sources are unpacked and compiled. Provided nothing else is specified, BuildRoot is the root directory /. If a different directory is specified in the preamble of the spec file, RPM will look for the files listed under %files relative to this directory.

    ...
    Source:
    BuildRoot: /tmp/%{name}-buildroot
    %description
    ...

Execute the command rpm -ba proxy-suite.spec once more. The modifications performed are not sufficient to solve all problems. make still installs the programs at the same location in the file system and RPM cannot find anything in the BuildRoot directory. Therefore, additional changes are required in the %build and %install sections. The procedure depends on the program for which to build the package. One possibility is to enter the command ./configure --help | less in the directory containing the unpacked sources. The output reveals the command line options you can use with the script. The options relevant here are those that can be used to change the installation directory. Enter the suitable options at the correct position in the spec file:

    %build
    ./configure --prefix=$RPM_BUILD_ROOT/
    make

    %install
    make install

Another advantage of using a BuildRoot directory is that it can be removed completely.

    %clean
    rm -rf $RPM_BUILD_ROOT

During the test stage, you can add the line rm -rf $RPM_BUILD_ROOT to the %prep section when building the RPM package. Following a test run of RPM, you will see where the files were copied. If this line is entered in the spec file, the directory will be removed prior to the next run, thus preventing the files from the previous run from interfering.

Although the above configuration installs the files under the specified directory, thus preventing them from corrupting the build system, the files still do not appear where they should on the target system. The build process will not proceed without errors unless the %files list is modified. However, even if you modify this list, the above configuration will cause the binary ftp-proxy to end up in the directory /sbin/, a place where it does not belong. Therefore, additional measures are necessary.

Modifying the Paths

The command ./configure --help | less reveals useful information about modifying the paths:

    --prefix=PREFIX         install architecture-independent files in PREFIX
                            [/usr/local]
    --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX
                            [PREFIX]

    By default, ‘make install’ will install all the files in
    ‘/usr/local/bin’, ‘/usr/local/lib’ etc. ...

    For better control, use the options below.

    Fine tuning of the installation directories:
      --bindir=DIR          user executables [EPREFIX/bin]
      --sbindir=DIR         system admin executables [EPREFIX/sbin]
      --mandir=DIR          man documentation [PREFIX/man]



Accordingly, the customized %build section of the spec file could appear as follows:

    %build
    ./configure --prefix=$RPM_BUILD_ROOT/ --sbindir=$RPM_BUILD_ROOT/usr/sbin \
        --mandir=$RPM_BUILD_ROOT/usr/share/man
    make

And this is what the %files section could look like:

    %doc AUTHORS COPYING INSTALL CREDITS
    %doc /usr/share/man/man5/ftp-proxy.conf.5.gz
    %doc /usr/share/man/man8/ftp-proxy.8.gz
    %config /etc/proxy-suite/ftp-proxy.conf
    /usr/sbin/ftp-proxy

The entry %config marks configuration files that the user can query with the command rpm -qpc package.rpm, just as the entry %doc marks documentation files that can be queried with rpm -qpd package.rpm. Once the build process proceeds without any errors, the final step can be performed. Test the installation of the package with rpm on another system and check if the RPM package and the installed package work correctly.

Exercise Modify the spec file from the previous exercise in such a way that the files are installed in a separate directory during the build and end up in FHS-compliant locations on the target system.


2.5 For More Information

Check the following sources for additional information:
• The book “Maximum RPM” (published by SAMS), which is also available in PDF format
• The RPM HOWTO
• Any spec file, such as those of the source RPMs enclosed on the installation CDs


Summary
• RPM is an ideal tool for installing and uninstalling software packages.
• You need the source code, a spec file, and possibly some patch files to create your own RPM package.
• The effort required for preparing the spec file and the RPM package is compensated by the convenient installation and uninstallation of the software.


3 The Linux Kernel

Learning Aims

In this chapter, learn
• how to handle kernel modules
• how to compile a new kernel
• how to install a new kernel
• how to patch a kernel
• how to remove patches from the kernel sources
• how to customize the initial ramdisk initrd


3.1 About the Kernel

The kernel is the core of the operating system, a layer between the hardware and the application processes. The kernel performs the following tasks:
• Management of the hardware resources (CPU, RAM, devices, computing time, etc.)
• Process management
• File system management

[Figure 3.1: The Kernel. Labels shown: Application, Kernel.]

3.2 General Information about the Linux Kernel

3.2.1 The Kernel Version Numbers

The first kernel was released in September 1991 under the version number 0.01. The size of the compressed tar archive was only 70 KB (decompressed: 465 KB). In contrast, the size of the compressed archive of the kernel sources of version 2.4.20 is almost 25 MB. In the realm of Open Source, the version number 1.0 is reserved for the first stable version. Linux reached this status in March 1994, when version 1.0 was released.

Today, the generally accepted version number system is as follows: linux X.Y.Z
• X: Is only stepped up when the kernel undergoes drastic changes. For example, new features in version 2.0 included multiprocessor support, dynamic loading and unloading of kernel modules, and quotas.
• Y: Even numbers indicate stable versions. Odd numbers indicate developer versions.
• Z: Specifies the exact version number.

Example: In January 2003, the latest stable version was 2.4.20 and the developer version was 2.5.59.
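The even/odd rule for the middle number can be expressed in a few lines of shell. The following is a small sketch that only covers version numbers of the form X.Y.Z as described above; the function name kernel_series is hypothetical.

```shell
#!/bin/sh
# Sketch: classify a kernel version of the form X.Y.Z according to the
# numbering scheme described above. kernel_series is a hypothetical name.

kernel_series() {
    rest=${1#*.}        # drop "X."  -> "Y.Z"
    minor=${rest%%.*}   # drop ".Z"  -> "Y"
    if [ $((minor % 2)) -eq 0 ]; then
        echo "stable"
    else
        echo "development"
    fi
}

for version in 2.4.20 2.5.59; do
    echo "$version: $(kernel_series "$version")"
done
```

For the two versions from the example, the loop reports 2.4.20 as stable and 2.5.59 as development.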


3.2.2 The Kernel Sources

To compile a new kernel, you need the kernel sources, which are available on the Internet or on the SuSE installation media (CDs).

• From the Internet: The official kernel sources are available for download. For example, the sources for version 2.4.20 can be downloaded with the following command:

      wget

  The sources are available both in gzip (.gz) and in bzip2 format (.bz2). The archives compressed with bzip2 are usually a bit smaller than those compressed with gzip. Kernels compiled from the unmodified kernel sources are referred to as vanilla kernels.

• Using the SuSE kernel sources on the installation CDs: For this purpose, the package kernel-source must be installed. The kernel sources are installed to the directory /usr/src/linux-version.SuSE. The SuSE kernel sources contain a number of patches that do not exist in the original kernel sources.


3.2.3 Files and Directories

On a standard SuSE system, the kernel is located in the directory /boot and is called vmlinuz. The following overview shows a number of additional files and directories associated with the kernel:
• /boot/vmlinuz: The actual kernel.
• /boot/initrd: The initial ramdisk containing all modules required for booting.
• /lib/modules/: Directory containing the kernel modules.
• /etc/modules.conf: Configuration file for the kernel modules.

3.3 Kernel Modules

In the past, all hardware drivers were compiled into the Linux kernel. When new hardware components not supported by the kernel were added, a new kernel had to be compiled. This situation was remedied through the introduction of kernel modules, which can be loaded during operation whenever necessary. These kernel modules are usually hardware drivers, but there are also modules for various file systems, IPv6 support, firewall functions, and so forth. Thus, modules can be defined as loadable device drivers and kernel functions.

3.3.1 Information about Loaded Kernel Modules

On a standard SuSE Linux system, a number of kernel modules are usually loaded in the memory. The command /sbin/lsmod shows which modules are currently loaded:

    earth:~ # lsmod
    Module                  Size  Used by    Not tainted
    ipv6                  150036  -1  (autoclean)
    st                     28428   0  (autoclean)
    sr_mod                 14616   0  (autoclean) (unused)
    cdrom                  28736   0  (autoclean) [ide-cd sr_mod]
    sg                     29568   0  (autoclean)
    usb-uhci               23052   0  (unused)
    usbcore                61696   1  [usb-uhci]
    8139too                15208   1
    mii                     1232   0  [8139too]
    lvm-mod                65184   5  (autoclean)
    reiserfs              193424   3
    aic7xxx               124856   0

The number in the third column indicates how many other modules use this module. The fourth column contains further information about the module:
• (autoclean) shows that this module is managed by the kerneld (kernel version 2.0.x) or kmod (kernel version 2.2.x or later).
• (unused) means that the module is currently not being used.
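The column layout makes the output of lsmod easy to process in scripts. The following sketch lists all modules whose third column is 0. To stay independent of the machine on which it runs, it works on a few sample lines copied from the output above instead of calling lsmod. The function names unused_modules and sample_lsmod are hypothetical.

```shell
#!/bin/sh
# Sketch: list modules whose use count (third column of lsmod) is 0.
# The sample lines are taken from the lsmod output above; unused_modules
# and sample_lsmod are hypothetical helper names.

unused_modules() {
    # sed 1d drops the header line; read splits each line into columns
    sed 1d | while read -r name size used rest; do
        if [ "$used" = "0" ]; then
            echo "$name"
        fi
    done
}

sample_lsmod() {
    cat <<'EOF'
Module                  Size  Used by    Not tainted
usbcore                61696   1  [usb-uhci]
8139too                15208   1
mii                     1232   0  [8139too]
reiserfs              193424   3
aic7xxx               124856   0
EOF
}

sample_lsmod | unused_modules
```

On a system with the 2.4 kernel described here, the sample data could be replaced by the real output, for example with lsmod | unused_modules. Other kernel versions may format the columns differently.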

Exercise List the loaded modules with lsmod and compare the output with the content of the file /proc/modules.


3.3.2 Information about the Function of Kernel Modules

While lsmod lists the loaded kernel modules, the command /sbin/modinfo shows information on the functions of these modules.

    earth:~ # modinfo 8139too
    filename:    /lib/modules/2.4.19-4GB/kernel/drivers/net/8139too.o
    description: "RealTek RTL-8139 Fast Ethernet driver"
    author:      "Jeff Garzik <[email protected]>"
    license:     "GPL"
    parm:        multicast_filter_limit int, description "8139too maximum number of filtered multicast addresses"
    parm:        max_interrupt_work int, description "8139too maximum events handled per interrupt"
    parm:        media int array (min = 1, max = 8), description "8139too: Bits 4+9: force full duplex, bit 5: 100Mbps"
    parm:        full_duplex int array (min = 1, max = 8), description "8139too: Force full duplex for board(s) (1)"
    parm:        debug int, description "8139too bitmapped message enable number"

There are five different information types:

    Information type   Meaning
    filename:          The file name of this module.
    description:       A brief description.
    author:            The authors.
    license:           The license of this module.
    parm:              Possible parameters for configuring the module.

Section 3.3.4 shows how the parameters can be used to configure the kernel modules.


Exercise Execute the command modinfo for some of the modules loaded on your system.

3.3.3 Removing Modules

As a general rule, only modules that are not used and not needed by other modules can be removed. The command for removing modules is /sbin/rmmod. If you try to remove a module that is currently being used, the following message is displayed:

    earth:~ # rmmod reiserfs
    reiserfs: Device or resource busy

Kernel modules that are no longer needed are not removed automatically. To have such modules removed automatically, configure a cron job that executes the following command, which removes all unused modules:

    rmmod -a

The option -a does the following:
• If a module is not used, it is marked as “removable”.
• Modules marked as “removable” are removed.

Exercise

1. Use the command rmmod to remove an unused module.
2. Execute rmmod -a several times. Compare the output of lsmod before and after you execute the command.

3.3.4 Loading Modules

Kernel modules are loaded automatically whenever necessary or manually by means of the commands /sbin/insmod or modprobe. Starting with kernel version 2.0, the automatic loading of the modules was handled by the kerneld (kernel daemon). As of kernel version 2.2, this task is handled by the kernel thread kmod.


Example for the automatic loading of a module: On a system on which the Logical Volume Manager (LVM) is not installed, normally no module is loaded for the LVM:

earth:~ # lsmod | grep lvm
earth:~ #

If you search for logical volumes with the command lvscan, the required module is loaded automatically:

earth:~ # lvscan
lvscan -- no volume groups found
earth:~ # lsmod | grep lvm
lvm-mod 63040
earth:~ #



There are two commands for manual loading: insmod and modprobe. insmod can be used to load individual modules; however, dependencies on other modules are not taken into consideration.

Example 1: Loading the module bluetooth with insmod: This works smoothly, as there are no further dependencies:

earth:~ # insmod bluetooth
earth:~ # lsmod
Module                  Size  Used by    Not tainted
bluetooth              17632   0  (unused)
...

Example 2: Loading the module ip_conntrack_ftp with insmod:

earth:~ # insmod ip_conntrack_ftp
Using /lib/modules/2.4.19-4GB/kernel/net/ipv4/netfilter/ip_conntrack_ftp.o
/lib/modules/2.4.19-4GB/kernel/net/ipv4/netfilter/ip_conntrack_ftp.o: unresolved symbol ip_conntrack_helper_register
...

In this case, various error messages (unresolved symbol) are displayed, indicating that other kernel modules are needed. This can be remedied with the command modprobe, which takes dependencies into consideration and automatically loads the respective modules (see Example 3).

Example 3: Loading the module ip_conntrack_ftp with modprobe:

earth:~ # modprobe ip_conntrack_ftp
earth:~ # lsmod
Module                  Size  Used by    Not tainted
ip_conntrack_ftp        3456   0  (unused)
ip_conntrack           14140   1  [ip_conntrack_ftp]
...

The module ip_conntrack is loaded automatically.
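The column Used by in the lsmod output also shows which modules can currently be removed: ip_conntrack has a use count of 1 because ip_conntrack_ftp depends on it, so ip_conntrack_ftp has to be removed first. The following sketch lists the modules with a use count of 0. The function name unused_modules is a hypothetical helper, and the sample text is copied from Example 3 instead of being read from a live system.

```shell
# Hypothetical helper: print the names of modules whose use count is 0.
# Reads lsmod output (columns: Module, Size, Used by) from standard input
# and skips the header line.
unused_modules() {
    awk 'NR > 1 && $3 == 0 { print $1 }'
}

# Sample input copied from Example 3.
sample_lsmod='Module                  Size  Used by    Not tainted
ip_conntrack_ftp        3456   0  (unused)
ip_conntrack           14140   1  [ip_conntrack_ftp]'

printf '%s\n' "$sample_lsmod" | unused_modules
```

On a running system, the filter could be used as lsmod | unused_modules.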

Exercise

Try to load modules. Here are some example modules that you can load and remove:

ipsec            An IPSEC module (VPN).
zft-compressor   A compression module for floppy tapes.
raid5            A module for software RAID.

3.3.5 The File /etc/modules.conf

Settings affecting the kernel modules are stored in the file /etc/modules.conf (previously: /etc/conf.modules). Here are some of the parameters that can be set in the file /etc/modules.conf:

Parameter   Description
depfile     Sets the path to the file containing the module dependencies (default: depfile=/lib/modules/version/modules.dep).
path        This parameter can be used to specify additional directories to search for kernel modules.
alias       With this parameter, an additional name (an alias) can be set for the module. Syntax: alias alias_name module_name. Example: alias eth0 e100
options     This parameter can be used to specify options for a module (or an alias). Example: options 3c505 io=0x300 irq=10

3.4 Custom Linux Kernels



Today, the compilation of custom kernels is usually unnecessary, as new device drivers can be made available by means of suitable modules. Furthermore, modern hardware does not require the compilation of resource-friendly lean kernels. Nevertheless, there are some situations in which the compilation of a kernel does make sense. There are some interesting kernel patches that enhance the operating system with a number of useful features. Here is an incomplete list of available kernel patches:

• LIDS (Linux Intrusion Detection System): LIDS is a kernel patch and an administration tool for the Linux Intrusion Detection System. LIDS expands the kernel with a number of security features. For example, LIDS can be used to restrict the rights of the user root.
• Openwall security patches
• BadRam: A kernel patch that is able to handle defective RAM modules.
• MOSIX: MOSIX (Multicomputer Operating System for UnIX) is a software package for the administration of Linux clusters. A kernel patch is included.

This section shows how you can compile and install your own kernel and how to handle kernel patches.

Important note: SuSE does not provide any support for systems running non-SuSE kernels.


Compiling a Kernel

This section covers the compilation of custom kernels without kernel modules. The description is based on the kernel sources of version 2.4.20. The procedure is as follows:

1. Decompress kernel sources
2. Configure kernel
3. Compile kernel
4. Install kernel
5. Test kernel

Decompressing the Kernel Sources

The directory /usr/src is the correct location for installing the kernel sources.

Exercise Ask your trainer where the kernel sources are located and decompress the source archive in the directory /usr/src.

Following the decompression, there should be a new directory /usr/src/linux containing the kernel sources.

Overview of the Kernel Configuration

To configure the new kernel, change to the directory /usr/src/linux. Basically, there are three ways of performing the configuration:

• make config This method is not advisable, as it is very time-consuming. All configuration parameters are queried one by one.

• make menuconfig This approach provides a simple menu that can be used easily with the keyboard.

• make xconfig This command starts a graphical configuration tool, shown in Figure 3.2 on the facing page. This configuration tool is used for the further procedure.


Figure 3.2: make xconfig Main Menu

To begin with, here is an overview of the most important available configuration options:

Option                        Description
Code maturity level options   Here, determine whether you want options that are not fully mature to be displayed.
Loadable module support       Support for kernel modules.
Processor type and features   General processor settings: processor type, multiprocessor support, high-memory support, etc.
General setup                 General settings: network support, PCI support, PCMCIA support, etc.
Parallel port support         For devices connected to the parallel port, such as zip drives, or for setting up a PLIP network (Parallel Line Internet Protocol).
Plug and Play configuration   Configuration of plug and play devices via the kernel.
Block devices                 Support for block-oriented devices (RAID controllers, floppy disk drives, etc.).
Multidevice support           Support for software RAID and LVM.
Networking options            General network options (no drivers for network adapters): network protocols, router functions, net filters, etc.
ATA/IDE/MFM/RLL support       Support of mass storage media, such as IDE hard disks and ATAPI CD-ROM drives.
SCSI support                  General SCSI support and drivers for SCSI controllers.
Network device support        Drivers for network adapters.
ISDN subsystem                ISDN support for Linux.
Character devices             Character devices such as terminals and mice.
File systems                  Support for various file systems (ext2, ReiserFS, XFS, JFS, VFAT, NTFS, etc.).
Console drivers               VGA console and framebuffer device support.
Sound                         Drivers for sound cards.
USB support                   USB support (Universal Serial Bus).

Important note concerning the kernel configuration: The kernel must be able to address and mount the root partition. You need a driver for the hardware (e.g., IDE or SCSI) and a driver for the file system (e.g., ext2 or ReiserFS) of the root partition. If, for example, the file system is not supported by the kernel, the kernel will terminate the boot process with a kernel panic message:

Kernel panic: VFS: Unable to mount root fs on 03:03

Drivers and kernel properties can be integrated into the kernel or compiled as kernel modules. For example, Figure 3.3 shows a driver for a network adapter that is configured as a module (see arrow).

Figure 3.3: Configuring Kernel Modules

When the kernel configuration is completed, all configuration parameters are written to the file .config in the kernel source directory. On some Linux systems, the configuration of the running kernel is also available as a file. This file is located in the virtual proc file system: /proc/config.gz


Compiling a New Kernel

Following the configuration, the kernel must be compiled. This is done in several steps, using the command make:

1. make dep: Dependency check.
2. make clean: Removal of old log files and object files.
3. make bzImage: Compilation of the kernel. The compressed kernel image is located under arch/i386/boot/bzImage.
4. make modules: Compilation of the kernel modules.
5. make modules_install: Installation of the modules in the directory /lib/modules.

Hint: All five steps can be performed with a single command:

earth:/usr/src/linux # make dep clean bzImage modules modules_install
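When the steps are run one after the other, it is helpful to see which step failed. The following is a minimal sketch of such a wrapper, not a SuSE tool: the function name build_kernel and the environment variable MAKE (used here to substitute another command for a dry run) are assumptions made for this example. The function body runs in a subshell, so the working directory of the calling shell is not changed.

```shell
# Hypothetical wrapper around the five make targets listed above.
# The first argument is the kernel source directory (default: /usr/src/linux).
# MAKE can be overridden, e.g. MAKE=echo, to print the targets instead of building.
build_kernel() (
    src_dir="${1:-/usr/src/linux}"
    cd "$src_dir" || return 1
    for target in dep clean bzImage modules modules_install; do
        echo "==> make $target"
        ${MAKE:-make} "$target" || {
            echo "make $target failed, stopping" >&2
            return 1
        }
    done
)
```

A dry run such as MAKE=echo build_kernel /tmp only prints the five targets in order. Without the override, build_kernel runs the real build in /usr/src/linux and stops at the first step that fails.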

Additional options of make:

Option      Description
mrproper    A clean-up function similar to make clean. Additionally, configuration files are removed.
distclean   Like make mrproper, but the following files are also removed: core, *.orig, *~, *.SUMS, *.bak, *.rej
spec        Generates an RPM spec file (required for generating a kernel RPM).

Exercise Configure a new kernel in such a way that this kernel can at least mount the root partition. At the moment, you do not need any of the other configuration options. To do this, you need the following configuration options:

• For IDE systems: Normally, this option should already be active. Check the settings under ATA/IDE/MFM/RLL support.
• For SCSI systems: Normally, this option also should already be active. General SCSI support should also be active. In addition, you need the driver for the SCSI adapter (SCSI support, SCSI low-level drivers).
• For all: Activate the file system driver for the root partition (File systems).

Installing a New Kernel

To test the compiled kernel, the kernel image must be copied to the correct location (directory /boot) and the boot manager needs to be adapted accordingly. Step-by-step description of the kernel installation:

1. Copy the kernel image to the directory /boot: The kernel image bzImage is located in the directory arch/i386/boot under the directory containing the Linux sources. Depending on the hardware architecture, the image may also be located in other directories. The path for the Alpha hardware architecture is arch/alpha/boot/ and the path for S/390 is arch/s390/boot/.

cp



2. Configuration of the boot manager: If the boot manager GRUB is used, the configuration file /boot/grub/menu.lst must be expanded. A boot entry for the new kernel could look as follows:

title linux-2.4.20
kernel (hd0,1)/boot/vmlinuz-2.4.20


Exercise

1. Install your new kernel.
2. Test the kernel by rebooting the system and selecting your new kernel in the boot manager.
3. Configure your kernel anew. Now you should also compile a module for the network adapter in your machine.
4. Install and test this kernel too.




Updating (Patching) the Kernel Sources

To replace a kernel version with a newer kernel version, you do not need to download the entire sources from the Internet. You merely need to download a file containing the differences between the two versions. Such a diff file (patch file) can be generated with the command diff and is available for download on the Internet. A diff file contains exactly the changes from one kernel version to the next. Example: The file patch-2.4.21.bz2 contains the changes between 2.4.20 and 2.4.21.

Applying Patch Files

The differences of a patch file can be incorporated in the current version of the source files by means of the command patch. If the kernel sources were unpacked in the directory /usr/src, the patch should also be copied to the directory /usr/src. Example:

earth:~ # ls -l /usr/src
drwxr-xr-x  14 root root     632 Mar 14 08:25 linux-2.4.20
drwxr-xr-x   7 root root     168 Mar 12 10:14 packages
-rw-r--r--   1 root root 2741941 Mar 13 08:56 patch-2.4.21.bz2

As the patch file is compressed with bzip2 or gzip, it first must be decompressed:

earth:~ # cd /usr/src
earth:/usr/src # bunzip2 patch-2.4.21.bz2

Now the kernel sources can be updated to the new version with the command patch:

earth:/usr/src # patch -p0 <patch-2.4.21

The option -p specifies how many leading path components are to be removed from the file names in the patch file. For the file name /dir1/dir2/dir3/file, the result is as follows:

Option   Resulting file name     Function
-p0      /dir1/dir2/dir3/file    The path is not modified.
-p1      dir1/dir2/dir3/file     One slash (/) is removed.
-p2      dir2/dir3/file          The first path component, up to and including the second slash, is removed.
-p3      dir3/file               ...
-p4      file                    ...

The option -p0 must be used for patching kernel sources. The decompression and patching stages can also be combined by means of a pipe:

earth:~ # cd /usr/src
earth:/usr/src # bunzip2 -cd patch-2.4.21.bz2 | patch -p0
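The effect of the option -p can be reproduced with a few lines of shell. The following sketch only imitates the stripping rule described above for illustration. The function name strip_components is an assumption for this example, and the real patch program handles further cases (such as several adjacent slashes) that are ignored here.

```shell
# Hypothetical illustration of patch -pN: remove everything up to and
# including the N-th slash from a file name.
strip_components() {
    local count="$1" path="$2"
    while [ "$count" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;   # drop the shortest prefix ending in a slash
            *)   break ;;             # no slash left, nothing more to strip
        esac
        count=$((count - 1))
    done
    printf '%s\n' "$path"
}

# Print the resulting file name for -p0 to -p4.
for n in 0 1 2 3 4; do
    printf '%s  %s\n' "-p$n" "$(strip_components "$n" /dir1/dir2/dir3/file)"
done
```

The loop prints the five file names in the same order as the table of -p values, from the unmodified path for -p0 down to the bare file name for -p4.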

Following the successful update, old object files and so forth should be removed. This can be done with the command make mrproper.

Exercise

1. Update the kernel sources with the patch provided by the trainer.
2. Compile and install the newly patched kernel.

Removing Patches

The command patch can also be used to undo the changes. The option required for the command patch is -R or --reverse.

earth:~ # cd /usr/src
earth:/usr/src # bunzip2 -cd patch-2.4.21.bz2 | patch -p0 --reverse

Exercise Remove the patch from the kernel sources.

3.4.3 The Initial Ramdisk initrd

Apart from the kernel, most Linux distributions load a ramdisk into RAM when the system is booted. This ramdisk usually contains kernel modules needed for booting the system. A ramdisk is set up in RAM and used like a hard disk. Most Linux distributions use generic kernels that only support IDE hard disks and the standard file system ext2. If the root file system is located on a SCSI hard disk with the Reiser file system, the kernel needs the respective drivers. However, the modules in /lib/modules cannot be accessed, as this directory is located in the root partition, which cannot be mounted without these drivers. The initial ramdisk makes the modules available before the root partition is mounted. Like the kernel, the initial ramdisk is located in the directory /boot:

earth:~ # ls -l /boot/initrd
-rw-r--r--   1 root root 475100 Mar 12 10:16 /boot/initrd

This ramdisk is a file system compressed with gzip. It can be decompressed with a command such as gunzip. In the following example, the file /boot/initrd is decompressed and stored in the file /tmp/ramdisk:

earth:~ # gunzip < /boot/initrd > /tmp/ramdisk

The decompressed ramdisk can be mounted on /mnt for test purposes. The option -o loop is required for the command mount:

earth:~ # mount -o loop /tmp/ramdisk /mnt


During the boot process, the kernel searches the initial ramdisk for a file called linuxrc and executes it. This is normally a shell script in which kernel modules are loaded with the command insmod.

Exercise

1. Mount the ramdisk /boot/initrd as described above and append two lines at the end of the script linuxrc:

echo "Have a lot of fun ..."
read TESTVAR

2. Install the new ramdisk by compressing the new image with gzip, for example, with:

earth:~ # gzip < /tmp/ramdisk > /boot/initrd

3. Test the new initrd by rebooting the computer.

Normally, the initial ramdisk does not have to be processed in this way, as there is a script that automatically performs the required adaptations: /sbin/mkinitrd. The corresponding configuration file is /etc/sysconfig/kernel. The kernel modules to load in the ramdisk are specified in the variable INITRD_MODULES:

earth:~ # cat /etc/sysconfig/kernel
#
# This variable contains the list of modules to be added to the initial
# ramdisk by calling the script "mk_initrd"
# (like drivers for scsi controllers, lvm, or reiserfs)
#
INITRD_MODULES="aic7xxx reiserfs"
...

If you modify the variable INITRD_MODULES, you have to execute the command mkinitrd:

earth:~ # mkinitrd
using "/dev/sda6" as root device (mounted on "/" as "reiserfs")
creating initrd "/boot/initrd" for kernel "/boot/vmlinuz" (version 2.4.19-4GB)
- insmod aic7xxx (kernel/drivers/scsi/aic7xxx/aic7xxx.o)
- insmod reiserfs (kernel/fs/reiserfs/reiserfs.o)
...
creating initrd "/boot/initrd.shipped" for kernel "/boot/vmlinuz.shipped" (version 2.4.19-4GB)
- insmod aic7xxx (kernel/drivers/scsi/aic7xxx/aic7xxx.o)
- insmod reiserfs (kernel/fs/reiserfs/reiserfs.o)
If you are using lilo as boot manager, you may want to run 'lilo' now.

Note: If you use the boot manager LILO instead of the default boot manager GRUB, you have to execute the command lilo.

3.5 For More Information

• Linux kernel 2.4 internals
• Linux kernel hackers’ guide
• The Linux kernel

Summary

• The kernel manages the hardware and resources of a machine.
• Kernel modules enable dynamic loading of device drivers and other kernel properties.
• New kernels can be configured with the command make menuconfig or make xconfig.
• New kernels can be compiled with: make dep clean bzImage modules modules_install
• The command patch can be used to update kernel sources or undo patches.
• The initial ramdisk enables modules to be loaded even before the root partition is mounted.


4 System Optimization

Learning Aims

In this chapter, learn
• how to configure the system by setting various kernel parameters during operation
• how to configure parameters that affect IDE hard disks
• how to prevent individual users from using system resources excessively at the expense of other users


4.1 The Directory /proc/

The files and directories under /proc/ contain a wealth of information about various aspects of the running system. In particular, the files under /proc/sys/ can be modified during operation, which changes the behavior of the running system.

4.1.1 Viewing the Current Configuration

The individual files are ASCII text files that can be viewed with cat or less. For example, the following listing shows information about the CD-ROM drive:

tux@earth:~> cat /proc/sys/dev/cdrom/info
CD-ROM information, Id: cdrom.c 3.12 2000/10/18

drive name:             hdc
drive speed:            8
drive # of slots:       1
Can close tray:         1
Can open tray:          1
Can lock tray:          1
Can change speed:       1
Can select disk:        0
Can read multisession:  1
Can read MCN:           1
Reports media changed:  1
Can play audio:         1
Can write CD-R:         0
Can write CD-RW:        0
Can read DVD:           1
Can write DVD-R:        0
Can write DVD-RAM:      0
unused devices: <none>

The command sysctl can be used to view all or specific modifiable values:

tux@earth:~> /sbin/sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 0
tux@earth:~> /sbin/sysctl -a
sunrpc.nlm_debug = 0
sunrpc.nfsd_debug = 0
sunrpc.nfs_debug = 0
sunrpc.rpc_debug = 0
abi.fake_utsname = 0
abi.trace = 0
abi.defhandler_libcso = 68157441
...



Editing the Current Configuration

The command echo can be used to edit individual values. For example, the following command activates routing:

earth:~ # echo 1 > /proc/sys/net/ipv4/ip_forward

This can also be done with the command sysctl:

earth:~ # sysctl -w net.ipv4.ip_forward=1

Another application scenario is the deployment of an Oracle database. This requires a number of kernel parameters to be set:

earth:~ # echo "65535" > /proc/sys/fs/file-max
earth:~ # echo "2147483648" > /proc/sys/kernel/shmmax
...

The corresponding sysctl commands are as follows:

earth:~ # sysctl -w fs.file-max=65535
earth:~ # sysctl -w kernel.shmmax=2147483648
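As the examples show, a sysctl name corresponds to a file under /proc/sys/ in which the dots are replaced with slashes. The following sketch makes this mapping explicit. The function name sysctl_to_proc_path is a hypothetical helper, and the simple replacement assumes that no component of the name contains a dot itself.

```shell
# Hypothetical helper: translate a sysctl name into the corresponding
# file under /proc/sys/ by replacing every dot with a slash.
sysctl_to_proc_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

sysctl_to_proc_path net.ipv4.ip_forward   # prints /proc/sys/net/ipv4/ip_forward
sysctl_to_proc_path fs.file-max           # prints /proc/sys/fs/file-max
```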

If, for example, you want to load a number of kernel parameters when the system is booted, the command sysctl is very useful. The parameters can be entered in the file /etc/sysctl.conf. They are applied by the script /etc/init.d/boot.sysctl, which executes the command sysctl -p.

# /etc/sysctl.conf
net.ipv4.ip_forward = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
fs.file-max = 65535
kernel.shmmax = 2147483648
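Before such a file is activated, it can be useful to see which settings it would apply. The following dry-run sketch prints the sysctl -w command that corresponds to each entry without changing anything. The function name print_sysctl_commands is a hypothetical helper for this example; sysctl -p itself reads the file directly and does not need it.

```shell
# Hypothetical dry-run helper: read sysctl.conf-style text from standard
# input and print the matching "sysctl -w" command for each entry.
# Blank lines and comment lines (starting with # or ;) are skipped.
print_sysctl_commands() {
    sed -e 's/^[[:space:]]*//' -e '/^[#;]/d' -e '/^$/d' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' -e 's/^/sysctl -w /'
}

print_sysctl_commands <<'EOF'
# /etc/sysctl.conf
net.ipv4.ip_forward = 1
fs.file-max = 65535
EOF
```

For the two sample entries, the function prints sysctl -w net.ipv4.ip_forward=1 and sysctl -w fs.file-max=65535.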

To execute the script /etc/init.d/boot.sysctl when the system is booted, activate it by means of the command insserv:

earth:~ # insserv -d boot.sysctl



SuSE Linux Enterprise Server offers a special tool for setting these parameters: Powertweak. This tool comprises the daemon powertweakd and a graphical YaST2 front-end with which the configuration can be carried out in a convenient and transparent manner (Powertweak is not part of UnitedLinux). A significant advantage of this method of setting kernel parameters is that a short description is provided for every parameter. When Powertweak is started for the first time in YaST2 (System, Powertweak Configuration), the configuration file /etc/powertweak/tweaks is generated and the daemon is started. From then on, the daemon is started every time the system is booted, as the links to the start script /etc/init.d/powertweakd are also set in the respective runlevel directories under /etc/init.d/.

Figure 4.1: Configuring Powertweak

The activation of routing shown in the example above (see page 39) can also be performed here, as shown in the following figure:

Figure 4.2: Changing Kernel Parameters with Powertweak


The result is an entry such as the following in the file /etc/powertweak/tweaks:

...
# Networking - IP
#
IP Forwarding
This option enables forwarding of IP packets. E.g. from eth0 to eth1.
net/ipv4/ip_forward = 0
...

Note: A disadvantage of these diverse configuration possibilities is that the same parameter can be set differently in various places. For instance, a change of the variable IP_FORWARD=no in the file /etc/sysconfig/sysctl will not have any effect if powertweakd is started and a different value was set for IP forwarding in the Powertweak configuration. Therefore, you should select one method and use it exclusively to avoid inconsistencies.

Exercise Start yast and the module for powertweak. Modify various values and observe the effect.



4.3 hdparm

hdparm offers a variety of options for changing the behavior of IDE hard disks. Most of these options are only needed in exceptional cases and the manual page explicitly warns against the use of certain options. However, there are some options that are used quite frequently. Enter hdparm --help for an overview of the options:

earth:~ # hdparm --help
hdparm - get/set hard disk parameters - version v5.2
Usage: hdparm [options] [device] ..
Options:
 -a   get/set fs readahead
 -A   set drive read-lookahead flag (0/1)
 -b   get/set bus state (0 == off, 1 == on, 2 == tristate)
 -B   set Advanced Power Management setting (1-255)
 -c   get/set IDE 32-bit IO setting
 -C   check IDE power mode status
 -d   get/set using_dma flag
 -D   enable/disable drive defect-mgmt
 -w   perform device reset (DANGEROUS)
 -W   set drive write-caching flag (0/1) (DANGEROUS)
 -x   tristate device for hotswap (0/1) (DANGEROUS)
 -X   set IDE xfer mode (DANGEROUS)
 -y   put IDE drive in standby mode
 -Y   put IDE drive to sleep
 -Z   disable Seagate auto-powersaving mode
 -z   re-read partition table



The DMA mode setting is especially important for the hard disk performance. This mode can be set with the option -d:

earth:~ # hdparm -d 0 /dev/hda
/dev/hda:
 setting using_dma to 0 (off)
 using_dma    =  0 (off)

A performance test can be carried out to assess the difference between active and inactive DMA access:

earth:~ # hdparm -t /dev/hda
/dev/hda:
 Timing buffered disk reads:  64 MB in  8.73 seconds =  7.33 MB/sec
earth:~ # hdparm -d 1 /dev/hda
/dev/hda:
 setting using_dma to 1 (on)
 using_dma    =  1 (on)
earth:~ # hdparm -t /dev/hda
/dev/hda:
 Timing buffered disk reads:  64 MB in  6.33 seconds = 10.11 MB/sec
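The throughput reported by hdparm -t is simply the amount of data read divided by the elapsed time. The following sketch recomputes the two values from the listing. The function name throughput is an assumption for this example, and awk is used only because the shell itself cannot calculate with decimal fractions.

```shell
# Hypothetical helper: throughput in MB/sec, rounded to two decimal places.
# Usage: throughput MEGABYTES SECONDS
throughput() {
    LC_ALL=C awk -v mb="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", mb / secs }'
}

throughput 64 8.73   # DMA off: prints 7.33
throughput 64 6.33   # DMA on:  prints 10.11
```

In this measurement, the throughput with active DMA is roughly 38 percent higher than without DMA.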

Exercise Use hdparm to check the current setting of the DMA mode. Then test the hard disk speed with hdparm, change the DMA mode, and test the hard disk speed again. Do you notice any difference?

4.4 ulimit

The command ulimit does not have a direct impact on the system performance. Rather, its task is to prevent individual users from using system resources excessively at the expense of other users. Accordingly, ulimit can be used to configure the memory usage, the number of possible processes, and other factors. The current limits can be viewed with the command ulimit -a:

earth:~ # ulimit -a
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) unlimited
max memory size       (kbytes, -m) unlimited
open files                    (-n) 1024
pipe size          (512 bytes, -p) 8
stack size            (kbytes, -s) unlimited
cpu time             (seconds, -t) unlimited
max user processes            (-u) 1023
virtual memory        (kbytes, -v) unlimited

The individual values can also be set anew for the current shell and its child processes:

earth:~ # ulimit -u
1023
earth:~ # ulimit -u 100
earth:~ # ulimit -u
100
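Because a limit applies to the current shell and its child processes, it can be confined to a single command by setting it in a subshell. The following sketch shows this approach. The function name with_limit is a hypothetical helper, and the demonstration uses the core file size limit (-c), which can always be lowered to 0.

```shell
# Hypothetical helper: run a command with one soft limit changed, without
# affecting the calling shell. Usage: with_limit FLAG VALUE COMMAND [ARGS...]
with_limit() {
    local flag="$1" value="$2"
    shift 2
    # The parentheses start a subshell; the limit ends with it.
    ( ulimit -S "$flag" "$value" && "$@" )
}

# The child shell sees the lowered limit ...
with_limit -c 0 sh -c 'ulimit -c'
# ... while the value in the calling shell is unchanged.
ulimit -c
```

In the same way, a call such as with_limit -u 100 followed by a command would restrict only that command to 100 processes.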

The details of the individual options are described in the manual page of bash, section ulimit. The settings can also be changed globally for the entire system. The configuration can be performed by means of the file /etc/profile or by way of the PAM configuration. The advantage of the configuration via PAM is that the file /etc/security/limits.conf enables user-specific or group-specific configuration and the files in the directory /etc/pam.d/ enable application-specific (login, sshd, etc.) configuration. The file /etc/profile contains preconfigured entries that you can customize according to your needs:

...
# Adjust some size limits (see bash(1) -> ulimit)
# Note: You may use /etc/initscript instead to set up ulimits and your PATH.
#
if test "$is" != "ash" ; then
    #ulimit -c 20000    # only core-files less than 20 MB are written
    #ulimit -d 15000    # max data size of a program is 15 MB
    #ulimit -s 15000    # max stack size of a program is 15 MB
    #ulimit -m 30000    # max resident set size is 30 MB
    ulimit -Sc 0        # don’t create core files
    ulimit -Sd unlimited
    # ksh does not support this command.
    test "$is" != "ksh" && ulimit -Ss unlimited
    ulimit -Sm unlimited
fi
...

soft    core         0
hard    rss          10000
hard    nproc        20
soft    nproc        20
hard    nproc        50
hard    nproc        0
-       maxlogins    4

# Max number of processes for the members
# of the group users
@users  hard  nproc  100

This file also contains an explanation of what can be entered in the individual columns.

Exercise

1. First, perform the following steps:

tux@earth:~> echo "main() {for(;;)fork();}" > fork.c
tux@earth:~> gcc fork.c

The program created in this way (a.out) merely serves demonstration purposes. This kind of program is referred to as a “fork bomb”. The program continuously starts new instances of itself, making the computer virtually unusable due to the multitude of processes, unless suitable precautions are taken before the program is started. Do not execute this program on production systems!

2. Set the limit for user processes to 10 (ulimit -u 10) and start a.out. Switch to another console and look at the process table with ps aux. Terminate a.out with Ctrl + C. Change the ulimit value, execute a.out again, and observe the change in the processes. (If the default ulimit value of 1023 is used, the computer will be virtually unusable following the execution of a.out. Often, the only thing you can do in such a case is to reboot the system.)

4.5 For More Information

Useful information about system tuning is available at:

Summary

• Kernel parameters can easily be modified during operation with the command sysctl or the package powertweak.
• hdparm can be used to configure IDE hard disks.
• The command ulimit can be used to control the use of various resources by users.


5 RAID

Learning Aims

In this chapter, learn
• about the basics of RAID
• how to configure RAID



5.1 Basics

RAID stands for “Redundant Array of Independent (or Inexpensive) Disks.” Of the six separate RAID levels originally defined, the Linux kernel supports the three most common levels (0, 1, and 5). The purpose of RAID is to combine several hard disk partitions into a “virtual” hard disk to increase performance or data security.

One way of implementing RAID is the use of special hardware controllers to which the SCSI or IDE hard disks are connected. The controller controls the hard disks and the organization of the data structures according to the set RAID level. In Linux, a single device addressed as one hard disk appears instead of the individual hard disks.

Another possibility is to implement RAID as software RAID with the help of the drivers in the Linux kernel. With this approach, individual hard disk partitions are combined into a multiple-disk device (e.g., /dev/md0). For practical reasons, these partitions should be located on separate hard disks. However, for training purposes they may also be located on the same hard disk.

5.1.1 RAID Levels

RAID 0: In contrast to what is implied by the name RAID, RAID 0 does not provide any redundancy. RAID 0 merely distributes the data across multiple partitions to achieve higher data throughput rates. The disadvantage: if one hard disk in this array fails, all data is lost. This kind of data management is also referred to as “striping”.

RAID 1: RAID 1 increases the data security by maintaining an exact mirror of the data of one disk on the other disk. If one disk fails, all data continues to be fully available on the other disk. A further advantage is the increased read speed. The disadvantages are the increased costs (two hard disks with 50 GB each merely provide a total capacity of 50 GB) and the somewhat reduced write speed. This kind of data management is often referred to as “mirroring”.

RAID 5: In terms of performance and redundancy, RAID 5 is a compromise between the other two levels. The capacity corresponds to the sum of the capacities of the utilized disks minus the capacity of one disk (provided the disks are of the same size). Similar to RAID 0, the data is distributed across the hard disks. In RAID 5, the data security is implemented by parity blocks that are distributed across the partitions. If one disk fails, its contents can be reconstructed from the data and parity blocks on the other disks. However, if two disks fail at the same time, the data is lost.
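The capacity rules of the three levels can be summarized in a short calculation. The following sketch assumes members of equal size. The function name raid_capacity is a hypothetical helper for this example and not part of any RAID tool.

```shell
# Hypothetical helper: usable capacity of an array with equally sized members.
# Usage: raid_capacity LEVEL MEMBERS SIZE   (the result has the same unit as SIZE)
raid_capacity() {
    local level="$1" members="$2" size="$3"
    case $level in
        0) echo $(( members * size )) ;;         # striping: the full capacity is usable
        1) echo "$size" ;;                       # mirroring: the capacity of one member
        5) echo $(( (members - 1) * size )) ;;   # one member's worth is used for parity
        *) echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

raid_capacity 1 2 50        # two 50 GB disks, mirrored: prints 50
raid_capacity 5 3 2104320   # three partitions of 2104320 blocks: prints 4208640
```

The second call uses the partition size from the /proc/mdstat listings in Section 5.3, where the RAID 5 array of three partitions is reported with 4208640 blocks.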



5.2 Configuration



A new RAID array can be set up during the installation or at a later time. The same YaST2 dialog is used in both cases.

Figure 5.1: YaST2 Dialog for Partitioning Hard Disks

To date, YaST2 does not offer the functionality needed for administering an existing RAID array. This can be done with the command-line program mdadm as described in Section 5.3 on page 52. The first step when setting up a RAID array is the selection of the involved partitions. Partitions in a software RAID are identified by the partition ID 0xFD (Linux RAID). Depending on the RAID level, two or more partitions of this kind are needed.



Figure 5.2: Partitioning for RAID

In the second step, the RAID level is determined and the partitions are joined to form a RAID array.

Figure 5.3: Determining the RAID Level



Figure 5.4: Arranging the RAID Array

The last step is the creation of a file system on the new device /dev/md0 and the addition of a suitable entry in the file /etc/fstab.

Figure 5.5: Integrating the RAID Array in the File System

Theoretically, the steps described above could also be performed from the command line with mdadm. However, YaST2 is more practical for these tasks.

5.3 Administration

A RAID 5 disk array prevents data loss in the event of a disk failure, thus enhancing the system availability. Nevertheless, RAID 5 does not constitute a substitute for a backup, as the failure of two disks would inevitably lead to a loss of data. The program mdadm can be used to remove and insert individual partitions from and to the array. The current state of the RAID array can be viewed any time with the following command:

earth:~ # cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdb1[1] sdc1[2] sda3[0]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>

Defective disks can be marked as “faulty” and removed from the array with an additional command:

earth:~ # mdadm /dev/md0 -f /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0
earth:~ # mdadm /dev/md0 -r /dev/sdb1
mdadm: hot removed /dev/sdb1

Subsequently, the new replacement disk or replacement partitions must be included in the array. The software RAID then restores the parity information on the new disk. During this procedure, which can be monitored with a command such as watch cat /proc/mdstat, the system continues to be usable, although with substantially reduced performance.

earth:~ # mdadm /dev/md0 -a /dev/sdb2
mdadm: hot added /dev/sdb2
earth:~ # cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdb2[3] sdc1[2] sda3[0]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/2] [U_U]
      [>....................] recovery = 3.3% (70892/2104320) \
      finish=3.8min speed=8861K/sec
unused devices: <none>


Upon completion of the recovery, the new partition will be part of the array:

earth:~ # cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdb2[1] sdc1[2] sda3[0]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>
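Why a replacement partition can be filled without any backup follows from the way RAID 5 parity works: the parity chunk of a stripe is the bytewise XOR of its data chunks, so any single missing chunk equals the XOR of all remaining chunks. The following Python sketch illustrates only this principle. The function names and sample data are invented for illustration, and the code does not reflect the kernel's actual implementation or on-disk layout.

```python
from functools import reduce


def xor_chunks(chunks):
    """Return the bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild_missing(surviving_chunks):
    """Rebuild the one missing chunk of a stripe from all surviving chunks."""
    return xor_chunks(surviving_chunks)


# One stripe of a three-partition array: two data chunks plus their parity.
data_a = b"SuSE"
data_b = b"RAID"
parity = xor_chunks([data_a, data_b])

# Simulate losing the partition that held data_b and rebuilding it.
rebuilt = rebuild_missing([data_a, parity])
print(rebuilt == data_b)
```

The same calculation fails as soon as two chunks of a stripe are missing, which is why the failure of two disks leads to data loss.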

If a fourth partition is added to a RAID array consisting of three partitions, the new partition will not be used immediately. If, however, one of the partitions in use fails, the fourth partition will seamlessly replace the failed partition. The parity information will be written to the free partition immediately, thus completing the array.

Initial state:

earth:~ # cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdc1[2] sdb2[1] sda3[0]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>

Addition of a spare partition:

earth:~ # mdadm /dev/md0 -a /dev/sdb1
mdadm: hot added /dev/sdb1
earth:~ # cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdb1[3] sdc1[2] sdb2[1] sda3[0]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>

One of the existing partitions is marked as faulty and removed from the array:

linux:~ # mdadm /dev/md0 -f /dev/sda3
mdadm: set /dev/sda3 faulty in /dev/md0
linux:~ # mdadm /dev/md0 -r /dev/sda3
mdadm: hot removed /dev/sda3

The reconstruction of the parity information on the newly added partition begins without any additional commands:

linux:~ # cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdb1[3] sdc1[2] sdb2[1]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/2] [_UU]
      [==>..................] recovery = 11.3% (240304/2104320) \
      finish=6.3min speed=4914K/sec
unused devices: <none>

Final state of the RAID array:

linux:~ # cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdb1[0] sdc1[2] sdb2[1]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>
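In the listings above, the pair [3/3] [UUU] marks a complete array, while [3/2] with an underscore in the second bracket marks an array that is running with a missing member. The following Python sketch shows one way to evaluate these fields automatically. It assumes the /proc/mdstat layout shown in this section (other kernel versions may format the file differently), and the function names are invented for this example.

```python
import re

# Sample text in the format of the /proc/mdstat listings shown above.
SAMPLE = """\
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdb2[3] sdc1[2] sda3[0]
      4208640 blocks level 5, 128k chunk, algorithm 2 [3/2] [U_U]
      [>....................] recovery = 3.3% (70892/2104320) finish=3.8min speed=8861K/sec
unused devices: <none>
"""


def array_states(mdstat_text):
    """Map each md device to (configured, working, status_flags)."""
    states = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and counts:
            states[current] = (int(counts.group(1)), int(counts.group(2)),
                               counts.group(3))
            current = None
    return states


def degraded(mdstat_text):
    """Return the names of arrays with fewer working than configured devices."""
    return [name for name, (total, working, _) in array_states(mdstat_text).items()
            if working < total]


print(array_states(SAMPLE))
print(degraded(SAMPLE))
```

On a running system, the text passed to the functions would be the content of /proc/mdstat instead of the sample string.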

YaST2 and the listed options of mdadm are sufficient for the basic administration of software RAID systems. Additional possibilities are described in the manual pages for mdadm.

Exercise

1. Create four partitions of the same size and set up a RAID 5 array consisting of three partitions. Remove one partition from the array. Add the fourth partition. Monitor the file /proc/mdstat.
2. Following the reconstruction of the array, reinsert the partition you just removed and remove another partition. Monitor the file /proc/mdstat.

5.4 For More Information

Software RAID HOWTO: /usr/share/doc/packages/raidtools/Software-RAID-HOWTO.html

Summary

• Software RAID is a useful and inexpensive alternative to hardware RAID controllers.
• Usually, the RAID levels 0, 1, and 5 are used.
• The program for administering software RAID is mdadm. The current state of an array can be read from the file /proc/mdstat.


6 Logical Volume Manager

Learning Aims

In this chapter, learn
• about the basics of the Logical Volume Manager
• how to configure LVM
• how to make a snapshot of a logical volume


6.1 Basics

The “conventional” partitioning of hard disks is rather inflexible: when a partition is full, you have to move the data to another medium before you can resize the partition, create a new file system, and copy the files back. Usually, such changes cannot be implemented without changing adjacent partitions, whose contents also need to be backed up to other media and written back to their original locations after the repartitioning. Although there are tools that facilitate these steps, their use can sometimes result in data loss and corrupt file systems.

The Logical Volume Manager solves this problem by inserting an abstraction layer, referred to as volume group (VG), between the partitions and file systems accessed by the applications and the actual physical storage media. In the ideal case, this allows resizing of the physical media during operation without affecting the applications.

The basic structure is as follows (see Figure 6.1): several physical volumes (PV), which can be entire hard disks or individual partitions, are combined into a superordinate unit referred to as volume group (VG). Further hard disks or partitions can be added to the volume group during operation whenever necessary. The volume group can also be reduced in size by removing hard disks or partitions. The volume group, in turn, can be split into several (up to 256) logical volumes (LV) that can be addressed with their device names (e.g., /dev/system/usr) like conventional partitions and on which file systems can be created.



Figure 6.1: Structure of the Logical Volume Manager (layers from top to bottom: logical volumes, volume group, physical volumes)
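The relationship between the three layers can be made concrete with a small model. The following Python sketch is a toy bookkeeping model only: the class, its methods, and the sizes are invented for illustration, it ignores physical extents, and it has nothing to do with how LVM stores its metadata. It merely shows that the volume group acts as a pool whose capacity is the sum of its physical volumes and from which logical volumes are carved.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeGroup:
    """Toy model: a pool of space fed by PVs and carved into LVs (sizes in MB)."""
    name: str
    physical_volumes: dict = field(default_factory=dict)  # device -> size
    logical_volumes: dict = field(default_factory=dict)   # LV name -> size

    @property
    def size(self):
        return sum(self.physical_volumes.values())

    @property
    def free(self):
        return self.size - sum(self.logical_volumes.values())

    def add_pv(self, device, size_mb):
        self.physical_volumes[device] = size_mb

    def create_lv(self, lv_name, size_mb):
        if size_mb > self.free:
            raise ValueError("not enough free space in volume group")
        self.logical_volumes[lv_name] = size_mb
        return f"/dev/{self.name}/{lv_name}"

    def extend_lv(self, lv_name, extra_mb):
        if extra_mb > self.free:
            raise ValueError("not enough free space in volume group")
        self.logical_volumes[lv_name] += extra_mb


vg = VolumeGroup("system")
vg.add_pv("/dev/sda5", 1500)
vg.add_pv("/dev/sda6", 1500)
device_name = vg.create_lv("usr", 2000)   # an LV may be larger than any single PV
vg.add_pv("/dev/sda7", 1000)              # the pool can grow later
vg.extend_lv("usr", 1500)
print(device_name, vg.free)
```

The logical volume in this example ends up larger than any single physical volume, which is exactly what conventional partitioning cannot offer.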


6.2 Configuring the Logical Volume Manager

Note: Just as with other direct manipulations of the file system, a data backup should be made before configuring LVM.

The LVM configuration comprises three basic steps:

1. Determination of the physical volumes (PV)
2. Definition of the volume groups (VG)
3. Setup of the logical volumes (LV)

During the installation of SuSE Linux Enterprise Server, determine the physical volumes in the hard disk partitioning dialog by assigning them the partition ID 8e for LVM. Then click LVM and add the physical volumes to a volume group. The proposed name system can be accepted or changed (as the individual logical volumes are addressed with /dev/VG_name/LV_name, you cannot assign any name that already exists in the directory /dev). In the following step, this volume group is split into individual logical volumes in which file systems are created as in conventional partitions and to which mount points are assigned.

Theoretically, it is possible to set up the root file system / in an LV. However, access with the rescue system in the event of an emergency is easier if this is not the case. As the root partition does not have to be very large and LVM provides a lot of flexibility for the remaining file systems, this should not pose any problem.

Even on an installed system, additional partitions can be managed by means of LVM. The basic configuration procedure is the same as during the installation. While the extension of volume groups by adding physical media (partitions, hard disks) and the extension of logical volumes is usually quite easy and can be done with YaST2 during operation, the reduction of logical volumes or the removal of physical storage media from the volume group has to be done very carefully. Before a file system is shrunk, it must first be unmounted with the command umount. If, for example, you want to shrink the partition /usr, you have to use the command-line tools instead of YaST2.


Figure 6.2: YaST Configuration Dialog for LVM

6.3 Command-Line Tools for LVM

6.3.1 Creating or Extending Volume Groups or Logical Volumes

pvcreate

The command pvcreate can be used to prepare partitions or entire hard disks for use in a volume group.

earth:~ # pvcreate /dev/sda8

Entire hard disks should not have any partition table in the MBR. Any existing table should be overwritten. The following example shows the command for overwriting the partition table for the second SCSI hard disk:

earth:~ # dd if=/dev/zero of=/dev/sdb count=1

vgcreate

If no volume group exists or you want to create a new one, you can use the command vgcreate together with the name of the volume group and the physical volumes prepared with pvcreate:

earth:~ # vgcreate system /dev/sda8

vgextend

To add a new physical volume to an existing volume group, use the command vgextend. The syntax of this command is the same as that of vgcreate (the only difference is that the volume group already exists):

earth:~ # vgextend system /dev/sda8

lvcreate

If there is still space in the volume group or you have freed additional space as described above, you can create additional logical volumes with the command lvcreate. Apart from the size of the new logical volume, specify the volume group with which to associate it and its name, unless you want to use the name assigned by default:

earth:~ # lvcreate -L 500M -n tmp system

Subsequently, create a file system in the new logical volume (using a command such as mkfs.reiserfs or mkfs.ext3). Now the logical volume can be mounted just like any other partition:

earth:~ # mount /dev/system/tmp /tmp

lvextend

An existing logical volume can be extended if the volume group still has space not allocated to any logical volume. The procedure comprises two basic steps:

1. Extension of the logical volume
2. Extension of the file system

If the logical volume contains an ext2 or ext3 file system, it has to be unmounted with the command umount before the file system is extended. Reiser file systems can be extended while they are mounted.

earth:~ # lvextend -L +500M /dev/system/tmp
earth:~ # resize_reiserfs -s +500M -f /dev/system/tmp


Exercise

1. Create two partitions with the ID 8e and include them in the volume group system. Create two logical volumes that do not cover the entire space of the volume group. Create file systems in the logical volumes.
2. Extend one of the logical volumes by several MB and adapt the file system accordingly.
3. Mount the logical volumes in the file system and copy data to the logical volumes.

6.3.2 Reducing or Removing Physical Volumes, Volume Groups, or Logical Volumes

pvmove and vgreduce

Before a physical volume can be removed from a volume group, the physical extents (administrative units in the physical volumes) used by the logical extents (administrative units in the logical volumes) must be moved to other physical volumes. The current state of the physical volumes can be displayed with the command pvscan:

earth:~ # pvscan
pvscan -- reading all physical volumes (this may take a while...)
pvscan -- ACTIVE   PV "/dev/sda5" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda6" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda7" of VG "system" [1.46 GB / 668 MB free]
pvscan -- ACTIVE   PV "/dev/sda8" of VG "system" [996 MB / 996 MB free]
pvscan -- total: 4 [5.39 GB] / in use: 4 [5.39 GB] / in no VG: 0 [0]

The command for emptying the physical volume is pvmove. LVM will decide where to move the contents of the extents. Refer to the manual pages if you want more control over the procedure.

earth:~ # pvmove /dev/sda5
pvmove -- moving physical extents in active volume group "system"
pvmove -- WARNING: if you lose power during the move you may need
          to restore your LVM metadata from backup!
pvmove -- do you want to continue? [y/n] y
pvmove -- doing automatic backup of volume group "system"
pvmove -- 375 extents of physical volume "/dev/sda5" successfully moved

Execute pvscan to check the result:

earth:~ # pvscan
pvscan -- reading all physical volumes (this may take a while...)
pvscan -- ACTIVE   PV "/dev/sda5" of VG "system" [1.46 GB / 1.46 GB free]
pvscan -- ACTIVE   PV "/dev/sda6" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda7" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda8" of VG "system" [996 MB / 164 MB free]
pvscan -- total: 4 [5.39 GB] / in use: 4 [5.39 GB] / in no VG: 0 [0]

The emptied physical volume can now be “checked out” from the volume group with the command vgreduce:

earth:~ # vgreduce system /dev/sda5
vgreduce -- doing automatic backup of volume group "system"
vgreduce -- volume group "system" successfully reduced by physical volume:
vgreduce -- /dev/sda5

The result:

earth:~ # pvscan
pvscan -- reading all physical volumes (this may take a while...)
pvscan -- inactive PV "/dev/sda5" is in no VG [1.47 GB]
pvscan -- ACTIVE   PV "/dev/sda6" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda7" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda8" of VG "system" [996 MB / 164 MB free]
pvscan -- total: 4 [5.39 GB] / in use: 3 [3.92 GB] / in no VG: 1 [1.47 GB]
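A physical volume may only be passed to vgreduce once pvmove has emptied it, which in the pvscan listing shows up as a free value equal to the total size. The following Python sketch picks out such volumes. It assumes the pvscan output format printed by the LVM version used in this section, compares the two size strings literally, and uses invented function and variable names, so it is an illustration rather than a general-purpose parser.

```python
import re

# Sample lines in the format of the pvscan listings shown in this section.
SAMPLE = '''\
pvscan -- ACTIVE   PV "/dev/sda5" of VG "system" [1.46 GB / 1.46 GB free]
pvscan -- ACTIVE   PV "/dev/sda6" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda7" of VG "system" [1.46 GB / 0 free]
pvscan -- ACTIVE   PV "/dev/sda8" of VG "system" [996 MB / 164 MB free]
'''

PV_LINE = re.compile(
    r'PV "(?P<pv>[^"]+)" of VG "(?P<vg>[^"]+)" '
    r'\[(?P<size>[^/\]]+?) / (?P<free>.+?) free\]'
)


def empty_pvs(pvscan_text, vg_name):
    """Return the PVs of vg_name whose free space equals their total size."""
    result = []
    for match in PV_LINE.finditer(pvscan_text):
        same = match["size"].strip() == match["free"].strip()
        if match["vg"] == vg_name and same:
            result.append(match["pv"])
    return result


print(empty_pvs(SAMPLE, "system"))
```

Only volumes returned by such a check are candidates for vgreduce. All others still hold extents that would have to be moved first.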

lvreduce

Individual logical volumes can be reduced with the command lvreduce. However, two additional steps are necessary first: the logical volume must be unmounted with the command umount and the file system must be shrunk. Depending on which part of the directory tree the logical volume contains, it may be helpful to change to single-user mode (init 1), as this ensures that the file system is not accessed in an uncontrolled way, which would prevent the directory branch from being unmounted. The needed commands are umount, resize_reiserfs or resize2fs, and lvreduce.

earth:~ # umount /dev/system/opt
earth:~ # resize_reiserfs -s -30M /dev/system/opt
resize_reiserfs 3.6.2 (2002)
You are running BETA version of reiserfs shrinker.
This version is only for testing or VERY CAREFUL use.
Backup of your data is recommended.
Do you want to continue? [y/N]:y
Processing the tree: 0%....20%....40%....60%....80%....100%   left 0, 16595 /sec
nodes processed (moved):
int 31 (0), leaves 4344 (1), unfm 78604 (7115), total 82979 (7116).
check for used blocks in truncated region
ReiserFS report:
blocksize            4096
block count          95232 (102912)
free blocks          4040 (11719)
bitmap block count   3 (4)
Syncing..done
earth:~ # lvreduce -L 370700k /dev/system/opt

Make sure that the size of the logical volume is still sufficient for the entire file system (shrunk earlier) after the resize. If this is not the case, parts of the file system will be truncated. This could result in a loss of data or it might be impossible to mount the file system. For ext2 file systems, the program e2fsadm can be used to perform the resizing of the file system and the adaptation of the logical volume in the correct sequence and with the correct parameters.
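The check demanded above is simple arithmetic: the file system occupies block count times block size bytes, and the logical volume must not become smaller than that. The following Python sketch performs this comparison. The function names are invented for this example, sizes are handled in KiB (assuming that the k suffix of lvreduce -L denotes units of 1024 bytes), and any rounding of the result to whole physical extents is ignored.

```python
def min_lv_size_kib(block_count, block_size):
    """Smallest LV size in KiB that still holds the whole file system."""
    total_bytes = block_count * block_size
    return -(-total_bytes // 1024)   # round up to a full KiB


def lv_is_large_enough(lv_size_kib, block_count, block_size):
    """True if a logical volume of lv_size_kib can hold the file system."""
    return lv_size_kib >= min_lv_size_kib(block_count, block_size)


# Figures from the resize_reiserfs report above: 95232 blocks of 4096 bytes.
needed = min_lv_size_kib(95232, 4096)
print(needed)
print(lv_is_large_enough(370700, 95232, 4096))
```

For the report shown above, the calculation yields 380928 KiB, which is more than the 370700k passed to lvreduce in the example. Under the stated assumptions, that value should therefore be verified against the installed LVM version before it is reused.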

Exercise

1. Reduce one of the logical volumes created in the previous exercise.
2. Move the physical extents from one physical volume to another and remove the emptied physical volume from the volume group.

6.4 Snapshots

LVM provides the possibility of freezing the state of a logical volume at a specific time by means of a snapshot logical volume especially created for this purpose, for example, to enable consistent backups. At the same time, the original logical volume can continue to be used (files can be created, deleted, or modified).

To do this, a new logical volume must be created for the snapshot. The size needed for this volume depends on how much data changes in the original volume while the snapshot exists. When the snapshot is created, the state of all data in the volume to back up is linked with the snapshot volume. As soon as data in the original volume is modified, the original data is first copied to the snapshot volume. The more data is changed, the more the snapshot volume fills with data from the time of the snapshot. Thus, a backup representing the state of the data at the time of the snapshot can be made very easily:

earth:~ # lvcreate -L 200M -s -n var_snapshot /dev/system/var
lvcreate -- WARNING: the snapshot will be automatically disabled \
            once it gets full
lvcreate -- INFO: using default snapshot chunk size of 64 KB for \
            "/dev/system/var_snapshot"
lvcreate -- doing automatic backup of "system"
lvcreate -- logical volume "/dev/system/var_snapshot" successfully created

The new logical volume can be mounted as usual:

earth:~ # mount /dev/system/var_snapshot /mnt
mount: block device /dev/system/var_snapshot is write-protected, \
       mounting read-only


When the backup is ready and the snapshot is no longer needed, remove it in two steps:

earth:~ # umount /mnt
earth:~ # lvremove /dev/system/var_snapshot
lvremove -- do you really want to remove "/dev/system/var_snapshot"? [y/n]: y
lvremove -- doing automatic backup of volume group "system"
lvremove -- logical volume "/dev/system/var_snapshot" successfully removed
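The copy-on-write behavior described in this section, including the warning that a full snapshot is disabled, can be illustrated with a toy model. The following Python sketch treats a volume as a list of blocks. The class and its methods are invented for illustration and say nothing about LVM's real data structures or chunk handling.

```python
class SnapshotVolume:
    """Toy copy-on-write snapshot of a block list (not LVM's real format)."""

    def __init__(self, origin, capacity):
        self.origin = origin          # the live volume, still being modified
        self.capacity = capacity      # how many changed blocks can be preserved
        self.saved = {}               # block index -> content at snapshot time
        self.valid = True

    def write_origin(self, index, data):
        """Write to the live volume, preserving the old block on first change."""
        if self.valid and index not in self.saved:
            if len(self.saved) >= self.capacity:
                self.valid = False    # a full snapshot is disabled
                self.saved.clear()
            else:
                self.saved[index] = self.origin[index]
        self.origin[index] = data

    def read(self, index):
        """Read a block as it was when the snapshot was created."""
        if not self.valid:
            raise IOError("snapshot is full and has been disabled")
        return self.saved.get(index, self.origin[index])


volume = ["a", "b", "c", "d"]
snap = SnapshotVolume(volume, capacity=2)
snap.write_origin(1, "B")
snap.write_origin(1, "BB")    # a second change to the same block costs no space
print(volume)
print([snap.read(i) for i in range(4)])
```

The model also shows why the snapshot volume can be much smaller than the original: it only has to hold the blocks that change while the backup is running.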

Exercise

Create a snapshot of a logical volume, mount it in the file system, then remove it.


6.5 For More Information

Detailed information about LVM with SuSE Linux is available at:

Summary

• The Logical Volume Manager facilitates the administration of storage space by liberating the system administrator from the limitations of rigid partitions.
• The configuration and administration can be done partly with YaST2 or completely from the command line.
• The creation of snapshots enables consistent backups without interruption of ongoing operations.
• Important commands in this chapter:

Command           Description
pvcreate          Preparation of a partition or hard disk for inclusion in a volume group.
vgcreate          Creation of a new volume group.
vgextend          Extension of a volume group by additional physical volumes.
lvcreate          Creation of a logical volume in a volume group.
lvextend          Extension of an existing logical volume.
pvscan            Display of the current state of a physical volume.
pvmove            Relocation of physical extents to other physical volumes.
vgreduce          Removal of a physical volume from a volume group.
lvreduce          Reduction of a logical volume.
lvremove          Removal of a logical volume from a volume group.
mount/umount      Mounting/unmounting of partitions or logical volumes.
resize_reiserfs   Resizing of Reiser file systems.
resize2fs         Resizing of ext2 or ext3 file systems.
e2fsadm           Resizing of ext2 file systems, adaptation of logical volumes.


7 Linux File Systems

Learning Aims

In this chapter, learn
• how file systems are mounted in the directory tree
• which mount options are used to configure file systems
• how to set up ext2 and Reiser file systems
• how to detect and eliminate errors in ext2 file systems
• how to convert from ext2 to ext3
• how to set up and activate a swap partition


7.1 Mounting File Systems in the Directory Tree

A prominent characteristic of Linux and Unix is a hierarchical directory tree grounded in the root directory. Additional file systems or partitions can be mounted in the directory tree in a transparent manner with the command mount.

Figure 7.1: Mounting File Systems in the Directory Tree (the root partition /dev/hda2 with the directories a, b, and c below /; the partition /dev/hda3 with its own directories d and e)

The file systems currently mounted in a Linux system can be queried with the command mount or by taking a look at the file /proc/mounts:

tux@earth:~ > mount
/dev/hda2 on / type ext2 (rw)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/hda1 on /opt type reiserfs (rw)
/dev/hda5 on /tmp type reiserfs (rw)
/dev/hda6 on /usr type reiserfs (rw)
/dev/hda7 on /var type reiserfs (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
/dev/hda8 on /home type ext3 (rw,noatime)
shmfs on /dev/shm type shm (rw)

The general syntax for mounting file systems with the command mount is:

mount [-t file_system_type] [-o mount_options] file_system mount_point

In the following example, the partition /dev/hda9 is mounted on the directory /space. The file system type does not have to be specified, as it is usually recognized automatically:

earth:~ # mount /dev/hda9 /space

Exercise

Check the mounted partitions in your system. Compare the output of the command mount with the content of the file /proc/mounts.




The File /etc/fstab

The file systems and their mount points in the directory tree are specified in the file /etc/fstab. This file contains one line comprising six fields for each mounted file system. Example:

/dev/hda2    /              ext2       defaults              1 1
/dev/hda3    /opt           reiserfs   defaults              1 2
/dev/hda1    swap           swap       pri=42                0 0
/dev/hda5    /tmp           reiserfs   defaults              1 2
/dev/cdrom   /media/cdrom   auto       ro,noauto,user,exec   0 0

The meaning of the individual fields:

Field 1: The name of the device file.
Field 2: The mount point, that is, the directory to which the file system should be mounted. The directory specified here must already exist.
Field 3: The file system type (e.g., ext2, reiserfs).
Field 4: Mount options. Multiple mount options are separated by commas without spaces (e.g., ro,noauto,user). For example, the option user means that normal users (e.g., tux) are entitled to mount the device in the Linux system. This option is usually used for the CD-ROM drive (/dev/cdrom) and the floppy disk drive (/dev/fd0).
Field 5: Determines whether to use the backup utility dump for the file system. 0 means no backup.
Field 6: Determines the sequence of the file system checks (with the fsck utility) when the system is booted:
  • 0 for file systems that are not to be checked
  • 1 for the root file system
  • 2 for all other modifiable file systems
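The six-field structure lends itself to automatic processing, for example to list all file systems that are checked at boot time. The following Python sketch splits fstab-formatted text into named fields. The class and function names are invented for this example, and the parser assumes that all six fields are present as described above, so lines with fewer fields are rejected.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str        # field 1
    mount_point: str   # field 2
    fs_type: str       # field 3
    options: list      # field 4, split at the commas
    dump: int          # field 5
    fsck_order: int    # field 6


def parse_fstab(text):
    """Parse fstab-formatted text, skipping blank lines and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected six fields: {line!r}")
        device, mount_point, fs_type, options, dump, order = fields
        entries.append(FstabEntry(device, mount_point, fs_type,
                                  options.split(","), int(dump), int(order)))
    return entries


SAMPLE = """\
/dev/hda2   /             ext2      defaults             1 1
/dev/hda3   /opt          reiserfs  defaults             1 2
/dev/cdrom  /media/cdrom  auto      ro,noauto,user,exec  0 0
"""

entries = parse_fstab(SAMPLE)
for entry in entries:
    print(entry.mount_point, entry.options, entry.fsck_order)
```

To evaluate the real configuration, the content of /etc/fstab would be passed to parse_fstab instead of the sample text.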


Mount Options

A number of options can be used when mounting file systems. These options can be entered in the file /etc/fstab (Field 4) or specified with -o when using the mount command. In the following example, the partition is mounted with the option ro (for read-only):

earth:~ # mount -o ro /dev/hda8 /usr/local

There are file system–specific and file system–independent options. This section merely covers file system–independent options. The file system–specific options are covered in the sections dealing with the individual file systems.

7.1.3 General Options

remount

The option remount causes file systems that are already mounted to be mounted anew. Example (remounting the partition /usr with the additional option ro):

earth:~ # mount -o remount,ro /usr

rw, ro

These options indicate whether a file system should be writable (rw) or only readable (ro).

Exercise

1. Mount your partition /usr as read-only using the options remount and ro.
2. Check the result by entering the command mount without any further options.
3. Remount your partition /usr as writable.

The following options affect the performance of a file system:

sync, async

Synchronous (sync) or asynchronous (async) input and output in a file system. The default setting is async.

atime, noatime

This option determines whether the access time of a file is updated in the inode (atime) or not (noatime). The option noatime usually improves performance, as read access to a file then no longer causes a write access to its inode.

Exercise

1. Execute the following command several times and make a note of the time that is needed:

   time find /usr -name "*.so" &>/dev/null

   The command time delivers the time needed by the command find.
2. Now mount the partition /usr with the option noatime and repeat the first exercise. You should be able to see improved performance when the command is executed.



Alternative File System Designations

Previously, local file systems were addressed by means of device file names (e.g., /dev/hda1 or /dev/sda3). However, this designation is not always unique. In systems with removable disks or SCSI systems in which disks can be added and removed during operation, file systems cannot be identified with certainty by means of device files. For this reason, file systems can be identified with various mechanisms:

• using the name of the device file (e.g., /dev/hda2)
• using the file system label (e.g., with the command e2label)
• using a UUID (Universally Unique Identifier)

File System Label

File system labels are names for file systems that are assigned when setting up the file system. Names or labels can be changed later. The maximum length of file system labels is limited to sixteen characters. In the file systems ext2 and ext3, labels can be set in various ways:

• By using the option -L when setting up the file system:

  earth:~ # mkfs -t ext2 -L testlabel /dev/sda1

• Later with e2label or tune2fs:

  earth:~ # e2label /dev/sda1 newlabel
  earth:~ # tune2fs -L newlabel /dev/sda1


The current label of an ext2 or ext3 file system can be displayed with the command e2label:

earth:~ # e2label /dev/sda1
newlabel

In the file system reiserfs, file system labels can also be set when setting up the file system or later:

• When setting up the Reiser file system:

  earth:~ # mkfs -t reiserfs -l testlabel /dev/sda1

  or

  earth:~ # mkreiserfs -l testlabel /dev/sda1

• The label can be changed later, provided the file system is not mounted:

  earth:~ # reiserfstune -l newlabel /dev/sda1

Instead of the name of the device file, the file /etc/fstab can also have an entry in the form LABEL=labelname:

LABEL=newlabel   mount_point   reiserfs   defaults   1 2

Manual mounting using the file system label is possible with the option -L:

earth:~ # mount -L newlabel /usr



Exercise

1. Set a label for a mounted ext2 file system.
2. Modify the entry for this file system in the file /etc/fstab in such a way that the file system is addressed by its label instead of its device name.



In ext2, ext3, and Reiser file systems, a UUID (Universally Unique Identifier) is automatically assigned to the respective file system. This UUID is unique for each file system (see man uuidgen). Depending on which file system is used, different commands are needed to find the UUID of existing file systems:

• ext2: The command tune2fs -l displays the parameters of an ext2 file system:

  earth:~ # tune2fs -l /dev/hda2 | grep UUID
  Filesystem UUID:          681352db-6533-4552-b37c-4a88b60b0b61

• ReiserFS: The command for ReiserFS is debugreiserfs.

  earth:~ # debugreiserfs /dev/hda3 | grep UUID
  debugreiserfs 3.6.4 (2002
  UUID: 0f407497-a818-40ac-b0a2-aadab88282c3

In the file /etc/fstab, a file system can be mounted via UUID with UUID=number. The command mount provides the option -U for manually mounting a file system using the UUID.

earth:~ # mount -U 754aaff2-b567-405c-abd2-3af19e472104 /mnt
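When fstab entries are generated by a script, it is worth validating the label or UUID before writing the first field. The following Python sketch does this with the standard module uuid and the sixteen-character label limit stated in this section. The function name and the constant are invented for this example, and the check covers only the form of the value, not whether a file system with this label or UUID actually exists.

```python
import uuid

MAX_LABEL_LENGTH = 16   # limit stated in the text above


def fstab_device_field(label=None, fs_uuid=None):
    """Build field 1 of an fstab line from a label or a UUID."""
    if (label is None) == (fs_uuid is None):
        raise ValueError("give exactly one of label or fs_uuid")
    if label is not None:
        if not 0 < len(label) <= MAX_LABEL_LENGTH:
            raise ValueError("label must be 1 to 16 characters long")
        return f"LABEL={label}"
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    return f"UUID={uuid.UUID(fs_uuid)}"


print(fstab_device_field(label="newlabel"))
print(fstab_device_field(fs_uuid="681352db-6533-4552-b37c-4a88b60b0b61"))
```

A malformed UUID or an overlong label raises a ValueError instead of ending up in the configuration file.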

Exercise

1. What are the UUIDs of your system’s file systems?
2. Enter a file system with its UUID in the file /etc/fstab. Check the new entry for correctness, for example, by rebooting the computer.


7.2 The Second Extended File System (ext2)

7.2.1 Blocks and Inodes

The Second Extended File System, ext2 for short, is the classic among Linux file systems. The file system structure resembles that of other Unix file systems. The file system is arranged in groups with an identical structure, shown in Figure 7.2. Each group consists of a number of blocks containing administrative information and the actual data blocks.

Figure 7.2: Structure of an ext2 File System (groups 0 to n; each group comprises group description, block bitmap, inode bitmap, inode table, and data blocks)

Superblock: Contains meta information about the file system, such as the file system label, status, UUID, block size, etc. All information in the superblock can be viewed with the command dumpe2fs:

earth:~ # dumpe2fs -h /dev/hda2
dumpe2fs 1.28 (31-Aug-2002)
Filesystem volume name:   root
Last mounted on:          <not available>
Filesystem UUID:          681352db-6533-4552-b37c-4a88b60b0b61
...

Group descriptor: Information about the location of the other administrative blocks (bitmaps and inode table).

Block bitmap: Overview of used and free blocks.

Inode bitmap: Overview of used and free inodes.

Inode table: Table of all inodes.

Data blocks: The actual content of a file is stored here. A data block has a fixed size of 1024, 2048, or 4096 bytes, which is determined when the file system is set up. A file occupies at least one entire block.
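Because space is handed out in whole blocks, the block size determines how much space small files waste. The following Python sketch computes the number of data blocks and the unused remainder for a file of a given size. The function names are invented for this example, and the calculation deliberately ignores the additional blocks needed for addressing very large files.

```python
def blocks_needed(file_size, block_size):
    """Number of data blocks occupied by a file of file_size bytes."""
    if block_size not in (1024, 2048, 4096):
        raise ValueError("block size must be 1024, 2048, or 4096 bytes")
    if file_size < 1:
        raise ValueError("file size must be at least one byte")
    return -(-file_size // block_size)   # integer division, rounded up


def wasted_bytes(file_size, block_size):
    """Unused space in the last block allocated to the file."""
    return blocks_needed(file_size, block_size) * block_size - file_size


# A small file of 974 bytes on file systems with different block sizes.
for block_size in (1024, 2048, 4096):
    print(block_size, blocks_needed(974, block_size), wasted_bytes(974, block_size))
```

For partitions holding many small files, a small block size therefore saves space, whereas large blocks reduce the administrative overhead for large files.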


Every file is associated with an inode containing information on the file. The number of inodes is predefined when the file system is set up and cannot be modified later. If too few inodes are reserved, it might be impossible to create files even though there are still free blocks. The inode of a file contains the following information:

File type: Indicates whether the file is a normal file, a directory, a symbolic link, a device file, or another type.

File permissions: The file access permissions in octal form (e.g., 0644).

Owner: The UID of the file owner.

Group: The GID of the owning group.

File size: The file size in bytes.

Number of hard links: A link counter shows the number of hard links.

Time stamp: Three time stamps are stored: ctime (change time), atime (access time), and mtime (modification time). The meaning of the individual times:
  • atime: Last access to this file (including read access).
  • ctime: Last change of the inode content (e.g., by changing the file permissions).
  • mtime: Last modification of the file content.

Data block addresses: Indicates the data blocks belonging to the file.


Figure 7.3: Inode of a File Indicating the Used Data Blocks
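Most of the inode fields listed above can also be read without special file system tools, as the kernel exposes them through the stat system call. The following Python sketch prints them for a scratch file using the standard modules os and stat. The file is created only for the purpose of this example, and the values shown depend on the system on which the code runs.

```python
import os
import stat
import tempfile
import time

# Create a small scratch file so the example does not depend on system files.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello\n")
    path = handle.name

info = os.stat(path)
print("inode number:", info.st_ino)
print("regular file:", stat.S_ISREG(info.st_mode))
print("permissions: ", oct(stat.S_IMODE(info.st_mode)))
print("owner/group: ", info.st_uid, info.st_gid)
print("size:        ", info.st_size)
print("hard links:  ", info.st_nlink)
for name in ("st_atime", "st_mtime", "st_ctime"):
    print(name + ":", time.ctime(getattr(info, name)))

os.remove(path)
```

The addresses of the data blocks are not part of this interface. They are only visible with file system tools that read the inode directly.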

The command debugfs can be used to display the contents of inodes.

earth:~ # debugfs /dev/hda2
debugfs 1.28 (31-Aug-2002)
debugfs: stat /etc/passwd
Inode: 22592   Type: regular   Mode: 0644   Flags: 0x0   Generation: 228727
User: 0   Group: 0   Size: 974
File ACL: 0   Directory ACL: 0
Links: 1   Blockcount: 2
Fragment:  Address: 0   Number: 0   Size: 0
ctime: 0x3e84440d -- Fri Mar 28 13:46:05 2003
atime: 0x3e919331 -- Mon Apr 7 17:03:13 2003
mtime: 0x3e84440d -- Fri Mar 28 13:46:05 2003
BLOCKS:
(0):92523
TOTAL: 1

Exercise

Use the command debugfs to display the content of an inode.

Caution: debugfs can destroy files and inodes. Be sure to read the manual page of the command.

7.2.2 Setting up an ext2 File System

The setup of an ext2 file system in a partition is relatively easy. For example, to set up an ext2 file system in the partition /dev/hda2, simply execute one of the following commands:

earth:~ # mkfs -t ext2 /dev/hda2

or:

earth:~ # mke2fs /dev/hda2

Caution: All data in the respective partition is lost when a new file system is set up.

In both examples, no additional options were used, so the file system was set up with default values. Of course, all modifiable values can be influenced with options:

Option                 Description
-b block_size          Specification of the block size in bytes. Possible values: 1024, 2048, and 4096.
-c                     Before the file system is set up, the partition is screened for bad blocks (also see the command badblocks).
-i bytes_per_inode     Determines the number of inodes. A value of 4096 means that one inode is reserved for every 4096 bytes.
-L label               Specification of the file system label.
-m %_reserved_blocks   Indicates what percent of the available blocks should be reserved for the user root (default: 5%).

Refer to man 8 mke2fs for a list of all options.
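As the number of inodes cannot be changed later, it helps to estimate in advance how many inodes a given value of -i produces. The following Python sketch performs this estimate by dividing the size of the file system by the bytes-per-inode value. The function name and the partition size are invented for this example, and the result is only an approximation, as the actual number chosen by mke2fs may be rounded for each group.

```python
def approximate_inode_count(fs_size_bytes, bytes_per_inode):
    """Rough number of inodes reserved for a file system of the given size."""
    if bytes_per_inode <= 0:
        raise ValueError("bytes_per_inode must be positive")
    return fs_size_bytes // bytes_per_inode


partition = 1024 ** 3   # a hypothetical test partition of 1 GB
for ratio in (1024, 4096, 16384):
    print(ratio, approximate_inode_count(partition, ratio))
```

A small value suits partitions with many small files (such as mail or news spools), while a large value leaves more room for data on partitions holding few large files.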

Exercise

1. Set up an ext2 file system with a block size of 2048 bytes on a test partition (according to the specifications of your trainer). Select a suitable value for the number of inodes. Discuss the value with the class.
2. Mount the new partition permanently in the system (/etc/fstab). The mount point is the directory /ext2test.


Maintenance Tools for ext2 File Systems

Usually, the setup of the file system does not mean that all is done. File system errors or other situations repeatedly necessitate the maintenance or repair of the ext2 file systems. A number of commands are available for this purpose.

File System Check

Following a power failure or a system crash, a file system may contain errors. The file system is checked automatically when the system is booted. However, this automatic file system check is not always successful. In this case, the administrator must perform a manual file system check. The file system check is performed with the command fsck or fsck.ext2 and should only be carried out when the file system is not mounted.

Changing File System Parameters

Various parameters of an ext2 file system can be changed. The command for changing parameters is tune2fs. The following values can be modified:

Maximum number of mounts: A file system check is performed automatically after a fixed number of mount cycles. This parameter can be changed with the option -c.

Time interval between file system checks: A file system check is performed after a certain time. This value can be changed with -i. Example:

earth:~ # tune2fs -i 100d /dev/sda11

Times are specified in days (d), weeks (w), or months (m).

File system label: The file system label can be changed with -L.

Reserved blocks: The option -r changes the number of blocks reserved for root.

The current parameters can be displayed with the same command:

earth:~ # tune2fs -l /dev/sda11
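
The listing printed by tune2fs -l can be evaluated in scripts. The following sketch computes how many mount cycles remain before the next automatic check. The function name mounts_until_check is made up for this example, and the field names "Mount count" and "Maximum mount count" are assumed to appear in the listing as shown in the comments.

```shell
#!/bin/sh
# Reads "tune2fs -l" style output on stdin and prints how many mounts
# remain before the next automatic file system check.
# Assumed input lines:
#   Mount count:              5
#   Maximum mount count:      20
mounts_until_check() {
    awk -F': *' '
        $1 == "Mount count"         { count = $2 }
        $1 == "Maximum mount count" { max = $2 }
        END { print max - count }
    '
}
```

A possible call is tune2fs -l /dev/sda11 | mounts_until_check. If the mount-count check is disabled, the maximum is reported as -1 and the result is not meaningful.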

The ext2 Debugger

The command debugfs can be used to analyze and modify blocks and inodes in a detailed way. Because it changes file system structures directly, this tool should be used with utmost caution.

Exercise

In the following exercise, debugfs is used to corrupt the consistency of a file system, which is then repaired. This exercise should only be performed on the test file system specifically prepared for training purposes. Never try to perform this exercise on a production system.

1. Copy the file /etc/passwd to the directory /ext2test in the test file system (see Section 7.2.2) and unmount the partition with umount.
2. Open this partition with the command debugfs -w. The option -w opens the file system in write mode, allowing you to modify data.
   (a) View the inode information of the file passwd:
       debugfs: stat passwd
       Make a note of the block number used.
   (b) Mark the used data block as free:
       debugfs: freeb block_number
   (c) Mark the entire file system as "dirty." This implies that the file system was not unmounted correctly in the last mount cycle.
       debugfs: dirty_filesys
3. In the first part of the exercise, you simulated a system crash with data inconsistency (a block was marked as free even though it is still used). Try to repair the file system with fsck.


7.3 The ext3 File System

One of the disadvantages of the ext2 file system is the time-consuming file system check after an event such as a system crash, as the entire file system is checked. For very large file systems, this can take thirty minutes or longer.

In contrast, modern journaling file systems use the journal, a special area in the file system, to log information about the actions in the file system. The advantage of this approach is that after a system crash, only the data areas that were actually in use need to be checked. Thus, the check of a journaling file system takes only a few seconds.

Some journaling file systems only store the meta data in the journal. Although this guarantees the integrity of the file system, it does not guarantee the consistency of the actual data. Journaling file systems that log both the meta data and the actual data in the journal guarantee data integrity.

The successor of ext2 with journaling functionality is ext3. ext3 provides the following advantages:

• ext3 is an ext2 file system with journaling. Therefore, an existing ext2 file system can easily be converted to ext3.
• Journaling of meta data and data.


7.3.1 Setting up an ext3 File System

ext3 file systems are set up in almost the same way as ext2 file systems (see Section 7.2.2). Additionally, the option -j (for journal) must be set:

earth:~ # mke2fs -j /dev/sda11

Apart from this, you can use the same options as when setting up an ext2 file system.


7.3.2 Converting ext2 to ext3

One advantage of ext3 is the easy conversion of an existing ext2 file system to ext3. The journal can be created on an ext2 file system during operation. However, the new feature is only applied after the file system is remounted. On an ext2 file system, the journal can be created with the command tune2fs -j:

earth:~ # tune2fs -j /dev/sda11
tune2fs 1.28 (31-Aug-2002)
Creating journal inode: done

Do not forget to change the file system type entry in the file /etc/fstab from ext2 to ext3.

There are some file system-specific mount options for ext3. The following three options affect the journaling of data (not of meta data):

data=journal: The data is written to the journal before it is actually written to the file system. This increases data security, but may reduce the speed of the file system.

Figure 7.4: Data Flow with data=journal

data=ordered: Only meta data is written to the journal. The actual data is written to the file system before the corresponding meta data is committed to the journal. This is the default setting for the ext3 file system.


Figure 7.5: Data Flow with data=ordered

data=writeback: This is the fastest and least secure of the three variants. Only meta data is written to the journal, but the meta data is written to the file system regardless of whether the data has already been written.
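
The three data modes are selected as mount options, typically in the fourth column of the /etc/fstab entry. The following sketch prints such an entry and rejects unknown modes. The function name ext3_fstab_line is made up for this example, and the values 1 2 in the last two columns are an assumption for a file system other than the root file system.

```shell
#!/bin/sh
# Prints an /etc/fstab line for an ext3 file system with the given
# journaling mode. Device and mount point are supplied by the caller.
ext3_fstab_line() {
    device=$1
    mount_point=$2
    mode=${3:-ordered}   # ordered is the ext3 default
    case $mode in
        journal|ordered|writeback) ;;
        *) echo "unknown data mode: $mode" >&2; return 1 ;;
    esac
    printf '%s\t%s\text3\tdata=%s\t1 2\n' "$device" "$mount_point" "$mode"
}

ext3_fstab_line /dev/sda11 /ext2test journal
```

The printed line can be compared with the existing entry before /etc/fstab is edited by hand.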

Exercise

1. Convert the ext2 file system set up in Section 7.2.2 to ext3.
2. Update the entry in the file /etc/fstab.


7.4 The Reiser File System (ReiserFS)

The Reiser file system (ReiserFS) was the first journaling file system available for Linux. The ReiserFS developer Hans Reiser used an approach that is completely different from ext2 and ext3. The file system is organized in balanced trees (b-trees), which allow quicker access to the files, especially in large directories.

Moreover, ReiserFS does not always use an entire block for a file. It tries to use the available space as efficiently as possible. However, this efficient storage management also takes its toll on the speed. If you prefer more speed, use the mount option notail. Presently, the Reiser file system only provides meta data journaling.

The advantages of ReiserFS are as follows:

• speed, especially for directories with many entries
• efficient storage management
• easy resizing of the file system during operation (important for Logical Volume Manager)


7.4.1 Setting up a Reiser File System

The command mkreiserfs sets up a Reiser file system. In contrast to ext2, a security query must be confirmed before the file system is actually set up.

earth:~ # mkreiserfs /dev/sda11
mkreiserfs 3.6.2 (2002)
mkreiserfs: Guessing about desired format..
mkreiserfs: Kernel 2.4.19-4GB is running.
Format 3.6 with standard journal
Count of blocks on the device: 257032
Number of blocks consumed by mkreiserfs formatting process: 8219
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: 1cf1e415-e2da-420e-a5ee-849e0aa07256
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
ALL DATA WILL BE LOST ON '/dev/sda11'!
Continue (y/n):

Exercise

Set up a Reiser file system on a test partition (according to the instructions of your trainer) and mount the file system on the directory /reisertest.

7.4.2 Checking a Reiser File System

Although journaling file systems rarely require a file system check, you should be familiar with the command for checking the Reiser file system. This command is also needed to shrink the file system. Before you shrink the file system, be sure to perform a file system check. The file system check should only be performed on file systems that are not mounted. The command for checking a Reiser file system is reiserfsck.

Exercise

Check the Reiser file system you set up on the test partition.



7.5 Other File Systems

7.5.1 The Journaling File Systems XFS and JFS

The file systems XFS from SGI and JFS from IBM have been placed under the GNU General Public License and are also available for Linux. These file systems can be created with the commands mkfs.jfs (for JFS) and mkfs.xfs (for XFS).


7.6 The Linux Automounter

Many users consider the manual mounting (mount) and unmounting (umount) of CD-ROM drives cumbersome. This is where the kernel-based automounter autofs comes into play. The automounter mounts file systems automatically when they are needed and unmounts them from the directory tree when they are no longer used.


7.6.1 Preliminary Steps

To use the kernel-based automounter, the programs in the package autofs are required. Use the following command to check if the package is installed:

earth:~ # rpm -qi autofs

If the package is not yet installed, install it with YaST2:

earth:~ # yast -i autofs

For the automounter service to be started automatically, it must be activated in the runlevels:

earth:~ # insserv autofs

Exercise

Check if the package autofs is installed and install it if necessary.


7.6.2 Automounting CD-ROM and Floppy Disk Drives

To automount file systems on floppy disks or CD-ROMs, only two configuration files need to be modified:

/etc/auto.master: The directories to be mounted via the automounter are configured in this file. The file contains a sample configuration for the directory /misc:

# $Id: filesyst.tex,v 1.4 2003/06/11 09:14:41 mreyzl Exp $
# Sample auto.master file
# Format of this file:
# mountpoint map options
# Also see variable AUTOFS_OPTIONS in /etc/sysconfig/autofs
# For details of the format look at autofs(8).

# /misc   /etc/auto.misc

Remove the comment mark (#) introducing the line for the directory /misc. The first column specifies the directory, while the second column specifies the corresponding configuration file (/etc/auto.misc).

The directory-specific configuration file (e.g., /etc/auto.misc): For every directory configured in /etc/auto.master, there is a separate configuration file like /etc/auto.misc:

#cdrom    -fstype=auto,ro     :/dev/cdrom
#floppy   -fstype=auto,sync   :/dev/fd0
#server   -fstype=nfs         server.local:/export

The general structure of this configuration file is as follows:

directory options file_system

The following example demonstrates how a CD-ROM drive is automounted, allowing the CD-ROM to be mounted on the directory /misc/cdrom when necessary and unmounted automatically after a certain time.

Step 1: Generate an entry for the directory /misc in the file /etc/auto.master:

/misc   /etc/auto.misc   --timeout 10

The option --timeout specifies after how many seconds without access the CD-ROM is unmounted (umount) and the mount point is released.

Step 2: Configure a mount point for the CD-ROM drive in the directory-specific configuration file /etc/auto.misc:

cdrom   -fstype=auto,ro   :/dev/cdrom

The first column contains the name of the mount point, that is, the respective subdirectory of /misc. The second column contains mount options: -fstype=auto indicates that the file system type is recognized automatically. ro stands for read-only: only read access is permitted. The third column specifies the file system to mount. For local devices, the device name must be preceded by a colon.

Step 3: If necessary, create the directory /misc:

earth:~ # mkdir /misc

The mount points below /misc (such as /misc/cdrom) do not need to be created. They are dynamically generated and deleted by the automounter.

Step 4: To start the automounter, execute the start script:

earth:~ # rcautofs start

Step 5: Now test the automounter by inserting any data CD-ROM in the drive. A glance at the directory /misc/cdrom should reveal the content of the CD-ROM:

earth:~ # ls /misc/cdrom

After about 10 seconds without access, the mount point /misc/cdrom should disappear.
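
The three-column structure of a directory-specific configuration file can be checked with a short script before the automounter is restarted. The following is a sketch under simplifying assumptions: the function name describe_map is made up, and only entries with exactly three columns and an options column starting with a hyphen are treated as well-formed.

```shell
#!/bin/sh
# Splits automounter map entries (as in /etc/auto.misc) read from stdin
# into their three columns. Comments and blank lines are skipped.
describe_map() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }
        NF == 3 && $2 ~ /^-/ {
            printf "mount point: %s  options: %s  file system: %s\n", $1, substr($2, 2), $3
            next
        }
        { printf "line %d: unexpected format\n", NR }
    '
}
```

A possible call is describe_map < /etc/auto.misc. Entries that are still commented out, such as the floppy line in the sample file, are not reported.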

Exercise

1. Configure the automounter for the floppy disk drive and the CD-ROM drive with a time-out of ten seconds.
2. Check the automounting function of the floppy disk drive and the CD-ROM drive. Is the mount point removed?


7.6.3 Automounting Network Directories

The Linux automounter is also able to automount network directories, such as Samba shares or exported NFS directories. For this purpose, the configuration files should have the following entries:

1. The file system type: -fstype=smbfs for Samba shares or -fstype=nfs for NFS directories.
2. The path for the network directory in the following form: server:/path

Example: In the file /etc/auto.master, the directory /net is entered with the option --timeout 60 and associated with an automounter configuration file in /etc/. This configuration file contains an entry for the mount point earth.

When accessing the directory /net/earth, the directory /export of the server is automounted.
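
How the two files fit together for an NFS directory can be sketched with a small generator. Everything passed to the function in the example call is a placeholder chosen for illustration: the map file name /etc/auto.net and the server name server.example.com are assumptions, not values taken from this manual, and the function name nfs_automount_config is made up.

```shell
#!/bin/sh
# Prints the auto.master line and the matching map entry for one NFS
# export. Nothing is written to /etc; the output is for review only.
nfs_automount_config() {
    base_dir=$1      # directory managed by the automounter, e.g. /net
    map_file=$2      # map file name (hypothetical)
    key=$3           # subdirectory below base_dir
    server=$4        # NFS server host name (hypothetical)
    export_path=$5   # exported directory on the server
    timeout=${6:-60}
    printf '%s:\n%s\t%s\t--timeout %s\n\n' /etc/auto.master "$base_dir" "$map_file" "$timeout"
    printf '%s:\n%s\t-fstype=nfs\t%s:%s\n' "$map_file" "$key" "$server" "$export_path"
}

nfs_automount_config /net /etc/auto.net earth server.example.com /export
```

With these placeholder values, accessing /net/earth would trigger the mount of server.example.com:/export.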

Exercise

1. The trainer configures an NFS export on the trainer host. Create an automounter configuration file for this directory.
2. Check if the automounter configuration is correct.

7.7 Swap Partitions

Linux provides the possibility of swapping part of the memory to the hard disk. This "virtual" memory is located on one or several swap partitions. Multiple (up to 32) swap partitions can be used to distribute the load to several hard disks.

In the past, the recommended size of the swap partition was twice the size of the RAM. Today, this rule is obsolete. Usually, 200 to 500 MB of swap space should be sufficient. If a large portion of this space is used, consider increasing the RAM.




7.7.1 Checking the Used Swap Space

A number of commands provide information about whether and how the swap partition is used:

free: The command free prints the load status of the memory and the swap partition.

earth:~ # free
             total       used       free     shared    buffers     cached
Mem:        126360     117448       8912          0       5684      33476
-/+ buffers/cache:      78288      48072
Swap:       136040      41884      94156

/proc/swaps: The file /proc/swaps lists all available swap partitions, their size, priority, and load status.

earth:~ # cat /proc/swaps
Filename      Type         Size      Used     Priority
/dev/hda1     partition    136040    41884    42

swapon -s: The command swapon -s corresponds to the command cat /proc/swaps.

top: The command top can be used to track the load of the RAM and the swap space over an extended period.
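
The columns Size and Used of /proc/swaps can be turned into a usage percentage with a one-line awk program. The function name swap_usage is made up for this sketch, and the column order is assumed to be the one shown in the listing above (Filename, Type, Size, Used, Priority).

```shell
#!/bin/sh
# Prints the usage of each swap area in percent (rounded down).
# Input is read from the given file, which defaults to /proc/swaps.
# The first line of the file is a header and is skipped.
swap_usage() {
    awk 'NR > 1 && $3 > 0 { printf "%s %d%%\n", $1, $4 * 100 / $3 }' "${1:-/proc/swaps}"
}
```

For the values shown above (136040 KB in total, 41884 KB used), the function prints /dev/hda1 30%.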


7.7.2 Creating a Swap Partition

Normally, the swap partition is created automatically by the installation program YaST2 during the installation. To create a swap partition manually, proceed as follows:

Step 1: Create a partition with the ID 82 (Linux swap), using a program such as fdisk, parted, or cfdisk.

Step 2: Set up the swap space with mkswap:

earth:~ # mkswap /dev/sda1
Setting up swapspace version 1, size = 530140 KiB

Step 3: Activate the swap space:

• The swap partition can be activated manually with swapon:

  earth:~ # swapon /dev/sda1

• The partition can also be entered as swap space in the file /etc/fstab:

  /dev/sda1   swap   swap   pri=42   0 0

The priority (pri=42) only plays a role if multiple swap partitions are used.

Apart from entire swap partitions, you can also create swap files. However, the performance of swap files is not as good as that of pure swap partitions.
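
The usual sequence for setting up a swap file can be sketched as follows. The sketch only prints the commands so that they can be reviewed first. The function name swap_file_commands, the path /var/swapfile, and the size of 256 MB are assumptions chosen for illustration.

```shell
#!/bin/sh
# Prints the commands that would create and activate a swap file.
# Nothing is executed; run the printed commands as root to apply them.
swap_file_commands() {
    file=$1      # path of the swap file (hypothetical example below)
    size_mb=$2   # size in megabytes
    echo "dd if=/dev/zero of=$file bs=1024k count=$size_mb"
    echo "chmod 600 $file"
    echo "mkswap $file"
    echo "swapon $file"
}

swap_file_commands /var/swapfile 256
```

The file is first filled with zeros by dd, protected against access by other users, prepared with mkswap, and finally activated with swapon, in the same way as a swap partition.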


7.8 For More Information

7.8.1 General Information

• Manual pages:
  – man 5 fstab
  – man 8 mount
• Internet:
  – Large file support: aj/linux_lfs.html

7.8.2 Second Extended File System

• Manual pages:
  – man 8 dumpe2fs
  – man 8 e2label
  – man 8 mke2fs
  – man 8 tune2fs
• Internet:
  –

7.8.3 Reiser File System

• Manual pages:
  – man 8 mkreiserfs
  – man 8 reiserfsck
  – man 8 reiserfstune
• Internet:
  –


• Manual pages:
  – man 8 mkfs.jfs
  – man 8 mkfs.xfs
• Internet:
  – JFS:
  – XFS:

• Manual pages:
  – man 5 autofs
  – man 8 autofs
  – man 8 automount

• Manual pages:
  – man 8 mkswap
  – man 8 swapon
  – man 8 swapoff



Related Documents