MEDIA AND STORAGE UNIX File Systems
Intro
Last few weeks
Microsoft (MS) file systems
FAT16, FAT32, NTFS, EFS
This week
Generic group feedback on posters
Unix/Linux file systems
Learning outcomes
Be able to show that you understand the structure and layout of a Unix file system
Poster Feedback
Look at who you're talking to
Know what you're talking about (do not just read what you have copied from the web)
Practice before you deliver
Do not fidget
Some use of crib cards (a good thing)
ACW
Any questions about what you have done or are about to do?
UNIX File Systems
EXT2
The second extended file system (ext2) is a file system for the Linux kernel. It was initially designed by Rémy Card as a replacement for the extended file system (ext). Although ext2 is not a journaling file system, its successor, ext3, provides journaling and is almost completely compatible with ext2.
History of ext2
This file system was developed by Dr. Stephen Tweedie (and others) in response to the GNU/Linux operating system's need for a feature-rich file system that could handle large files and devices. During Linus Torvalds' initial design work on the Linux kernel he made it compatible with the Minix file system, as that OS was commonly used in academia, and was widely documented and tested. Unfortunately, the Minix file system was limited to drives and file sizes of 64 megabytes, which even in 1991 seemed rather absurd.
History of ext2
April 1992: the extended file system (ext) was released. It was the first file system to use the VFS API and was included in Linux version 0.96c.
January 1993: two new file systems were developed, xiafs and the second extended file system (ext2), which was an overhaul of the extended file system incorporating many ideas from the Berkeley Fast File System.
Since then, ext2 has been a test bed for many of the new extensions to the VFS API. Features such as POSIX ACLs and extended attributes were generally implemented first on ext2 because it was relatively simple to extend and its internals were well understood.
EXT2
The space in ext2 is split up into blocks, which are organized into block groups, analogous to cylinder groups in the Unix File System. This is done to reduce external fragmentation and to minimize the number of disk seeks when reading a large amount of consecutive data.
Each block group contains a superblock, a block bitmap, an inode bitmap and an inode table, followed by the actual data blocks. The superblock contains information that is crucial to booting the operating system, so backup copies of it are kept in the block groups of the file system. However, only the first copy, found at the start of the file system, is used when booting.
The group descriptor stores the location of the block bitmap, the inode bitmap and the start of the inode table for every block group; the descriptors, in turn, are stored in a group descriptor table.
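As a rough illustration of the layout described above, the sketch below decodes a few superblock fields from raw bytes. The field offsets, the 1024-byte starting offset and the magic number 0xEF53 follow the ext2 on-disk format as it is commonly documented; none of them come from these slides. The function names, the image path argument and the sample values are made up for the example.

```python
import struct

EXT2_MAGIC = 0xEF53        # s_magic value of an ext2/ext3 superblock
SUPERBLOCK_OFFSET = 1024   # the primary superblock starts 1024 bytes into the volume


def parse_superblock(raw: bytes) -> dict:
    """Decode a handful of little-endian fields from a raw superblock."""
    inodes_count, blocks_count = struct.unpack_from("<II", raw, 0)
    (first_data_block,) = struct.unpack_from("<I", raw, 20)
    (log_block_size,) = struct.unpack_from("<I", raw, 24)
    (blocks_per_group,) = struct.unpack_from("<I", raw, 32)
    (inodes_per_group,) = struct.unpack_from("<I", raw, 40)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 superblock")
    usable_blocks = blocks_count - first_data_block
    return {
        "block_size": 1024 << log_block_size,
        "blocks_count": blocks_count,
        "inodes_count": inodes_count,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        # Round up: the last block group may be only partly filled.
        "group_count": -(-usable_blocks // blocks_per_group),
    }


def read_superblock(image_path: str) -> dict:
    """Read the primary superblock from a disk image (path is hypothetical)."""
    with open(image_path, "rb") as image:
        image.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(1024))


def fake_superblock() -> bytes:
    """Build an in-memory superblock with invented values for testing."""
    raw = bytearray(1024)
    struct.pack_into("<II", raw, 0, 4992, 20000)   # inodes, blocks
    struct.pack_into("<I", raw, 20, 1)             # first data block
    struct.pack_into("<I", raw, 24, 0)             # log2(block size) - 10
    struct.pack_into("<I", raw, 32, 8192)          # blocks per group
    struct.pack_into("<I", raw, 40, 1664)          # inodes per group
    struct.pack_into("<H", raw, 56, EXT2_MAGIC)
    return bytes(raw)


if __name__ == "__main__":
    print(parse_superblock(fake_superblock()))
```

The parser works on a byte string so that it can be tried without a real disk image; read_superblock shows how the same bytes might be fetched from an image file.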
ext2
Create files up to 4 terabytes in size
Use file names up to 255 characters in length (or up to 1012 by changing a single line of code)
Choose the block size (1024, 2048 or 4096 bytes)
On top of this, the file system code was specifically written with modularity and extensibility in mind, and various features necessary to a commercial Unix environment were added.
[Figure: example of ext2 inode structure]
Inodes
Crucial to understanding how ext2 (or almost any other Unix-like file system) works is the concept of the inode, the lowest common denominator of file storage description. Each inode holds the access rights, timestamps, size and type of the file, as well as pointers to the blocks which actually hold its data. Each pointer refers either to a direct block, which itself holds data, or to an indirect block, which holds a list of pointers to either direct blocks or more indirect blocks. Because of this recursive nature, it's possible through triply indirect blocks for the file system to hold relatively huge files.
Directories seen in user space are themselves represented as files, and thus inodes, in the file system. Each directory contains a list of entries, each composed of a file name and its corresponding inode number. When a request is made to the file system for a specific file, that file's name is converted to the inode number, and the inode is referenced from that point on.
Notably, this design allows for hard links, by which a single inode is referenced by two or more file names that can be located anywhere in the directory hierarchy of the same file system. To delete a file or a hard link, the file system removes the file name and decrements its inode's link count. When the link count reaches zero, the inode is de-allocated.
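The link count behaviour described above can be observed from user space. The following is a minimal sketch using Python's standard os module in a temporary directory; the file names are invented for the example, and it assumes the temporary directory lives on a file system that supports hard links.

```python
import os
import tempfile


def hard_link_demo():
    """Return (link counts after each step, same-inode flag, data-survived flag)."""
    counts = []
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "report.txt")   # hypothetical file names
        alias = os.path.join(tmp, "alias.txt")
        with open(original, "w") as handle:
            handle.write("same data, two names\n")
        counts.append(os.stat(original).st_nlink)    # one name so far

        os.link(original, alias)                     # second directory entry
        same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
        counts.append(os.stat(original).st_nlink)

        os.remove(original)                          # remove one name only
        with open(alias) as handle:
            survived = handle.read() == "same data, two names\n"
        counts.append(os.stat(alias).st_nlink)
    return counts, same_inode, survived


if __name__ == "__main__":
    print(hard_link_demo())
```

Both names report the same inode number, and the data stays readable through the remaining name after the first one is removed.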
File system limits
Some limits of the ext2 file system stem from the on-disk format of the data, others from the operating system's kernel.
Most of these factors are determined once, when the file system is built. They depend on the block size and on the ratio of the number of blocks to the number of inodes. Block sizes of 8 KB are only possible on Alpha architectures by default.
There are also many userspace programs that can't handle files larger than 2 GB.
The limit on the number of subdirectories in a directory is about 32768. If the number of files in a directory exceeds 10000 to 15000, the user will normally be warned that operations can take a long time. The theoretical limit on the number of files in a directory is 1.3 × 10^20, although this is not relevant in practice.
Theoretical ext2 filesystem limits under Linux
Block size    Max. file size    Max. filesystem size
1 KiB         16 GiB            2 TiB
2 KiB         256 GiB           8 TiB
4 KiB         2 TiB             16 TiB
8 KiB         64 TiB            32 TiB
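Several of the maximum file sizes listed for ext2 can be approximated from the direct and indirect block pointers described under Inodes. The sketch below assumes 12 direct pointers per inode and 4-byte block pointers; these are the usual ext2 values but are not stated in the slides. Under those assumptions the arithmetic comes out close to the listed figures for 1 KiB, 2 KiB and 8 KiB blocks. The listed figure for 4 KiB blocks is lower than the pointer arithmetic gives, so some other limit presumably applies there.

```python
DIRECT_POINTERS = 12   # assumed number of direct block pointers per inode
POINTER_SIZE = 4       # assumed size of one block pointer, in bytes


def addressable_bytes(block_size: int) -> int:
    """Bytes reachable through direct, single, double and triple indirection."""
    per_block = block_size // POINTER_SIZE   # pointers that fit in one block
    blocks = (DIRECT_POINTERS
              + per_block          # single indirect block
              + per_block ** 2     # double indirect
              + per_block ** 3)    # triple indirect
    return blocks * block_size


if __name__ == "__main__":
    for size in (1024, 2048, 4096, 8192):
        gib = addressable_bytes(size) / 2 ** 30
        print(f"{size} byte blocks: about {gib:.1f} GiB addressable")
```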
EXT3
ext3
The ext3, or third extended file system, is a journaling file system that is commonly used by the Linux operating system. It is the default file system for many popular Linux distributions. Stephen Tweedie first revealed that he was working on extending ext2 in a February 1999 kernel mailing list posting, and the file system was merged into the mainline Linux kernel in November 2001, from version 2.4.15 onward. Its main advantage over ext2 is journaling, which improves reliability and eliminates the need to check the file system after an unclean shutdown.
Advantages of ext3
Although its performance (speed) is less attractive than that of competing Linux file systems such as JFS, ReiserFS and XFS, ext3 has the significant advantage of allowing in-place upgrades from the ext2 file system without having to back up and restore data. It also uses less CPU power than ReiserFS and XFS, and is considered safer than the other Linux file systems due to its relative simplicity and wider testing base.
The ext3 file system adds, over its predecessor:
• Journaling
• Online file system growth
• htree indexing for larger directories (a specialized version of a B-tree)
EXT3
Without these three features (from the previous slide), any ext3 file system is also a valid ext2 file system. This has allowed well-tested and mature file system maintenance utilities for maintaining and repairing ext2 file systems to also be used with ext3 without major changes. The ext2 and ext3 file systems share the same standard set of utilities, e2fsprogs, which includes an fsck tool. The close relationship also makes conversion between the two file systems (both forward to ext3 and backward to ext2) straightforward.
Journaling levels
There are three levels of journaling available in the Linux implementation of ext3:
• Journal
• Ordered
• Writeback
Journal (lowest risk)
Both metadata and file contents are written to the journal before being committed to the main file system. Because the journal is relatively continuous on disk, this can improve performance in some circumstances. In other cases, performance gets worse because the data must be written twice: once to the journal, and once to the main part of the filesystem.
Ordered (medium risk)
Only metadata is journaled; file contents are not, but it's guaranteed that file contents are written to disk before associated metadata is marked as committed in the journal. This is the default on many Linux distributions. If there is a power outage or kernel panic while a file is being written or appended to, the journal will indicate the new file or appended data has not been "committed", so it will be purged by the cleanup process.
(Thus appends and new files have the same level of integrity protection as the "journaled" level.)
However, files being overwritten can be corrupted because the original version of the file is not stored. Thus it's possible to end up with a file in an intermediate state between new and old, without enough information to restore either one or the other (the new data never made it to disk completely, and the old data is not stored anywhere). Even worse, the intermediate state might intersperse old and new data, because the order of the write is left up to the disk's hardware.
Writeback (highest risk)
Only metadata is journaled; file contents are not. The contents might be written before or after the journal is updated. As a result, files modified right before a crash can become corrupted. For example, a file being appended to may be marked in the journal as being larger than it actually is, causing garbage at the end. Older versions of files could also appear unexpectedly after a journal recovery. The lack of synchronization between data and journal is faster in many cases. XFS and JFS use this level of journaling, but ensure that any "garbage" due to unwritten data is zeroed out on reboot.
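On Linux, the journaling level of an ext3 file system is selected with the mount option data=journal, data=ordered or data=writeback. The sketch below picks that option out of a comma-separated mount option string, such as the fourth field of a line in /proc/mounts. The function name and the sample option strings are invented for the example, and a file system mounted with the default level may not list a data= option at all.

```python
from typing import Optional

# Short reminders of what each level means, taken from the slides above.
JOURNAL_MODES = {
    "journal": "metadata and file contents journaled (lowest risk)",
    "ordered": "metadata journaled, contents written first (medium risk)",
    "writeback": "metadata journaled, contents unordered (highest risk)",
}


def journal_mode(mount_options: str) -> Optional[str]:
    """Return the data= journaling level named in the options, if any."""
    for option in mount_options.split(","):
        key, _, value = option.partition("=")
        if key == "data" and value in JOURNAL_MODES:
            return value
    return None


if __name__ == "__main__":
    for sample in ("rw,noatime,data=writeback", "rw,relatime"):   # made-up samples
        mode = journal_mode(sample)
        print(sample, "->", JOURNAL_MODES.get(mode, "no data= option listed"))
```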
Size limits
ext3 has a maximum size for both individual files and the entire filesystem. For the filesystem as a whole that limit is 2^31 − 1 blocks. Both limits depend on the block size of the filesystem (KiB = kibibyte, GiB = gibibyte, TiB = tebibyte):
Block size    Max file size    Max filesystem size
1 KiB         16 GiB           <2 TiB
2 KiB         256 GiB          <4 TiB
4 KiB         2 TiB            <8 TiB
8 KiB         2 TiB            <16 TiB
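The filesystem-size column follows directly from the block-count limit quoted above, read as 2^31 − 1 blocks, multiplied by the block size. A minimal sketch of that arithmetic (the function name is made up for the example):

```python
MAX_BLOCKS = 2 ** 31 - 1   # block-count limit for the whole filesystem


def max_filesystem_bytes(block_size: int) -> int:
    """Largest ext3 filesystem, in bytes, for a given block size in bytes."""
    return MAX_BLOCKS * block_size


if __name__ == "__main__":
    for kib in (1, 2, 4, 8):
        limit = max_filesystem_bytes(kib * 1024)
        print(f"{kib} KiB blocks: {limit / 2 ** 40:.6f} TiB")
```

Each result falls one block short of 2, 4, 8 and 16 TiB respectively, which is why the limits are written with a "less than" sign.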
UNIX Overview
UNIX
A simple description of the UNIX system, also applicable to Linux, is this: "On a UNIX system, everything is a file; if something is not a file, it is a process."
Structure
This statement is a slight simplification, because there are special files that are more than just files (named pipes and sockets, for instance), but to keep things simple, saying that everything is a file is an acceptable generalization.
A Linux system, just like UNIX, makes no difference between a file and a directory, since a directory is just a file containing names of other files. Programs, services, texts, images, and so forth, are all files. Input and output devices, and generally all devices, are considered to be files, according to the system.
Structure
In order to manage all those files in an orderly fashion, we like to think of them in an ordered tree-like structure on the hard disk, as we know from MS-DOS (Disk Operating System), for instance. The large branches contain more branches, and the branches at the end contain the tree's leaves or normal files. For now we will use this image of the tree, but we will find out later why this is not a fully accurate image.
Sorts of files
Most files are just files, called regular files; they contain normal data, for example text files, executable files or programs, input for or output from a program and so on.
While it is reasonably safe to suppose that everything you encounter on a Linux system is a file, there are some exceptions:
• Directories: files that are lists of other files.
• Special files: the mechanism used for input and output. Most special files are in /dev; we will discuss them later.
• Links: a system to make a file or directory visible in multiple parts of the system's file tree. We will talk about links in detail.
• (Domain) sockets: a special file type, similar to TCP/IP sockets, providing interprocess networking protected by the file system's access control.
• Named pipes: act more or less like sockets and form a way for processes to communicate with each other, without using network socket semantics.
Example
The -l option to ls displays the file type, using the first character of each output line:
jaime:~/Documents> ls -l
total 80
-rw-rw-r-- 1 jaime jaime 31744 Feb 21 17:56 intro Linux.doc
-rw-rw-r-- 1 jaime jaime 41472 Feb 21 17:56 Linux.doc
drwxrwxr-x 2 jaime jaime  4096 Feb 25 11:50 course
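The same type character can be obtained from a program. The sketch below uses Python's standard os and stat modules; the helper names and the file names created in the temporary directory are invented for the example.

```python
import os
import stat
import tempfile

# Meaning of the first character of a long listing line.
TYPE_NAMES = {
    "-": "regular file",
    "d": "directory",
    "l": "symbolic link",
    "c": "character device",
    "b": "block device",
    "s": "socket",
    "p": "named pipe",
}


def type_char(path: str) -> str:
    """Return the file type character for path, as in a long listing."""
    mode = os.lstat(path).st_mode     # lstat: do not follow symbolic links
    return stat.filemode(mode)[0]


def demo_types() -> dict:
    """Create a file, a directory and a symbolic link, then classify each."""
    with tempfile.TemporaryDirectory() as tmp:
        file_path = os.path.join(tmp, "notes.txt")    # hypothetical names
        dir_path = os.path.join(tmp, "course")
        link_path = os.path.join(tmp, "shortcut")
        open(file_path, "w").close()
        os.mkdir(dir_path)
        os.symlink(file_path, link_path)
        entries = (("file", file_path), ("dir", dir_path), ("link", link_path))
        return {name: type_char(path) for name, path in entries}


if __name__ == "__main__":
    for name, char in demo_types().items():
        print(name, char, TYPE_NAMES[char])
```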
Partitions
Why partition?
Most people have a vague knowledge of what partitions are, since every operating system has the ability to create or remove them. It may seem strange that Linux uses more than one partition on the same disk, even when using the standard installation procedure, so some explanation is called for. One of the goals of having different partitions is to achieve higher data security in case of disaster.
By dividing the hard disk in partitions, data can be grouped and separated.
When an accident occurs, only the data in the partition that got the hit will be damaged, while the data on the other partitions will most likely survive.
Why partition?
This principle dates from the days when Linux didn't have journaled file systems and power failures might have led to disaster.
The use of partitions remains for security and robustness reasons, so a breach on one part of the system doesn't automatically mean that the whole computer is in danger. This is currently the most important reason for partitioning.
A simple example: a user creates a script, a program or a web application that starts filling up the disk.
If the disk contains only one big partition, the entire system will stop functioning if the disk is full. If the user stores the data on a separate partition, then only that (data) partition will be affected, while the system partitions and possible other data partitions keep functioning.
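One way to watch for the disk-full situation in the example is to check how full each partition is. The sketch below uses shutil.disk_usage from Python's standard library; the function names, the 90 percent threshold and the paths checked are arbitrary choices for the example.

```python
import os
import shutil


def percent_used(path: str) -> float:
    """Percentage of the file system holding path that is in use."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def nearly_full(path: str, threshold: float = 90.0) -> bool:
    """True when the file system holding path is at or above the threshold."""
    return percent_used(path) >= threshold


if __name__ == "__main__":
    for mount_point in ("/", "/home"):      # example paths only
        if os.path.exists(mount_point):
            print(f"{mount_point}: {percent_used(mount_point):.1f}% used")
```

If /home is on its own partition, the two figures differ, and a runaway script filling /home does not change the figure for /.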
Why partition?
Mind that having a journaled file system only provides data security in case of power failure and sudden disconnection of storage devices. This does not protect your data against bad blocks and logical errors in the file system. In those cases, you should use a RAID (Redundant Array of Inexpensive Disks) solution.
Partition layout and types
There are two kinds of major partitions on a Linux system:
• data partition: normal Linux system data, including the root and home partition containing all the data to start up and run the system; and
• swap partition: expansion of the computer's physical memory, extra memory on hard disk.
Root partition
The standard root partition (indicated with a single forward slash, /) is about 100-500 MB, and contains the system configuration files, most basic commands and server programs, system libraries, some temporary space and the home directory of the administrative user. A standard installation requires about 250 MB for the root partition.
Swap space
Swap space (indicated with swap) is only accessible for the system itself, and is hidden from view during normal operation. Swap is the system that ensures, like on normal UNIX systems, that you can keep on working, whatever happens.
On Linux, you will virtually never see irritating messages like Out of memory, please close some applications first and try again, because of this extra memory. The swap or virtual memory procedure has long been adopted by operating systems outside the UNIX world by now. Using memory on a hard disk is naturally slower than using the real memory chips of a computer, but having this little extra is a great comfort.
Swap space
Linux generally counts on having twice the amount of physical memory in the form of swap space on the hard disk. When installing a system, you have to know how you are going to do this. An example on a system with 512 MB of RAM:
1st possibility: one swap partition of 1 GB
2nd possibility: two swap partitions of 512 MB
3rd possibility: with two hard disks, one partition of 512 MB on each disk
The last option will give the best results when a lot of I/O is to be expected.
Kernel
The kernel is on a separate partition as well in many distributions, because it is the most important file of your system. If this is the case, you will find that you also have a /boot partition, holding your kernel(s) and accompanying data files.
The rest
The rest of the hard disk(s) is generally divided in data partitions, although it may be that all of the non-system critical data resides on one partition, for example when you perform a standard workstation installation. When non-critical data is separated on different partitions, it usually happens following a set pattern:
a partition for user programs (/usr)
a partition containing the users' personal data (/home)
a partition to store temporary data like print- and mail-queues (/var)
a partition for third party and extra software (/opt)
Once the partitions are made, you can only add more. Changing sizes or properties of existing partitions is possible but not advisable.
Mount points
All partitions are attached to the system via a mount point. The mount point defines the place of a particular data set in the file system. Usually, all partitions are connected through the root partition. On this partition, which is indicated with the slash (/), directories are created. These empty directories will be the starting point of the partitions that are attached to them. An example: given a partition that holds the following directories:
videos/ cd-images/ pictures/
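Whether a directory is currently acting as a mount point can be checked from a program. The sketch below uses os.path.ismount and the device number reported by os.stat, both from Python's standard library; the helper names and the list of directories are chosen only for illustration.

```python
import os


def is_mount_point(path: str) -> bool:
    """True when a file system is attached at path."""
    return os.path.ismount(path)


def device_id(path: str) -> int:
    """Device number of the file system that holds path."""
    return os.stat(path).st_dev


if __name__ == "__main__":
    for directory in ("/", "/boot", "/home", "/var"):   # example directories
        if os.path.isdir(directory):
            print(directory,
                  "mount point" if is_mount_point(directory) else "plain directory",
                  "device", device_id(directory))
```

Directories that live on the same partition report the same device number, while a directory with another partition mounted on it reports a different one.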
Example file system layout
[Figure: example file system layout]
This is a layout (last slide) from a RedHat system. Depending on the system admin, the operating system and the mission of the UNIX machine, the structure may vary, and directories may be left out or added at will. The names are not even required; they are only a convention. The tree of the file system starts at the trunk or slash, indicated by a forward slash (/). This directory, containing all underlying directories and files, is also called the root directory or "the root" of the file system.
Subdirectories of the root directory
Directory      Content
/bin           Common programs, shared by the system, the system administrator and the users.
/boot          The startup files and the kernel, vmlinuz. In some recent distributions also grub data. Grub is the GRand Unified Boot loader.
/dev           Contains references to all the CPU peripheral hardware, which are represented as files with special properties.
/etc           Most important system configuration files are in /etc; this directory contains data similar to those in the Control Panel in Windows.
/home          Home directories of the common users.
/initrd        (on some distributions) Information for booting. Do not remove!
/lib           Library files, includes files for all kinds of programs needed by the system and the users.
/lost+found    Every partition has a lost+found in its upper directory. Files that were saved during failures are here.
/misc          For miscellaneous purposes.
/mnt           Standard mount point for external file systems, e.g. a CD-ROM or a digital camera.
/net           Standard mount point for entire remote file systems.
/opt           Typically contains extra and third party software.
/proc          A virtual file system containing information about system resources. More information about the meaning of the files in proc is obtained by entering the command man proc in a terminal window. The file proc.txt discusses the virtual file system in detail.
/root          The administrative user's home directory. Mind the difference between /, the root directory, and /root, the home directory of the root user.
/sbin          Programs for use by the system and the system administrator.
/tmp           Temporary space for use by the system, cleaned upon reboot, so don't use this for saving any work!
/usr           Programs, libraries, documentation etc. for all user-related programs.
/var           Storage for all variable files and temporary files created by users, such as log files, the mail queue, the print spooler area, space for temporary storage of files downloaded from the Internet, or to keep an image of a CD before burning it.
Closing
Any questions about what we have covered so far?
Self study
Brian Carrier, File System Forensic Analysis, Chapters 14 and 15 (pages 397 to 478)