Session 15amzb

  • Uploaded by: amzeus
  • April 2020
MEDIA AND STORAGE UNIX File Systems

Intro

Last few weeks
• MS (Microsoft) file systems
  • FAT16
  • FAT32
  • NTFS
  • EFS

This week
• Generic group feedback on posters
• Unix/Linux file systems

Learning outcomes
• Be able to show an understanding of the structure and layout of the Unix file system

Poster Feedback
• Look at who you're talking to
• Know what you're talking about (do not just read what you have copied from the web)
• Practice before you deliver
• Do not fidget
• Some use of crib cards (good thing)

ACW

Any questions about what you have done or are about to do?

UNIX File Systems

EXT2

The ext2 or second extended file system is a file system for the Linux kernel. It was initially designed by Rémy Card as a replacement for the extended file system (ext).

Although ext2 is not a journaling file system, its successor, ext3, provides journaling and is almost completely compatible with ext2.

History of ext2

This file system was developed by Rémy Card, Dr. Stephen Tweedie and others in response to the GNU/Linux operating system's need for a feature-rich file system that could handle large files and devices. During his initial design work on the Linux kernel, Linus Torvalds made it compatible with the Minix file system, as Minix was commonly used in academia and was widely documented and tested. Unfortunately, the Minix file system was limited to drives and file sizes of 64 megabytes, which was restrictive even in 1991.

History of ext2

April 1992
• The extended file system (ext) was released. It was the first file system to use the VFS API and was included in Linux version 0.96c.

January 1993
• Two new file systems were developed: xiafs and the second extended file system (ext2), which was an overhaul of the extended file system incorporating many ideas from the Berkeley Fast File System.

Since then 

ext2 has been a test bed for many of the new extensions to the VFS API. Features such as POSIX ACLs and extended attributes were generally implemented first on ext2 because it was relatively simple to extend and its internals were well understood.

EXT2

The space in ext2 is split up into blocks, which are organized into block groups, analogous to cylinder groups in the Unix File System. This is done to reduce external fragmentation and minimize the number of disk seeks when reading a large amount of consecutive data.

Each block group contains a copy of the superblock and the group descriptor table, followed by the block bitmap, the inode bitmap, the inode table and finally the actual data blocks.

The superblock contains important information that is crucial to the booting of the operating system, so backup copies are kept in multiple block groups across the file system. However, only the first copy, which is found at the first block of the file system, is used when booting.

Each group descriptor stores the location of the block bitmap, the inode bitmap and the start of the inode table for its block group. The group descriptors are, in turn, stored in the group descriptor table.
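As a rough illustration of the block group idea, the sketch below works out how many block groups a file system has and which group a given block falls into. It assumes the usual ext2 convention that a single bitmap block tracks each group (so a group spans 8 × block size blocks) and that the first data block is block 1 with 1 KiB blocks and block 0 otherwise. The function names are invented for this example.

```python
def blocks_per_group(block_size: int) -> int:
    """A bitmap block has block_size * 8 bits, one bit per block in the group."""
    return block_size * 8


def first_data_block(block_size: int) -> int:
    """Assumed convention: with 1 KiB blocks, block 0 is left for boot code."""
    return 1 if block_size == 1024 else 0


def group_count(total_blocks: int, block_size: int) -> int:
    """Number of block groups needed to cover the file system, rounded up."""
    usable = total_blocks - first_data_block(block_size)
    per_group = blocks_per_group(block_size)
    return -(-usable // per_group)  # ceiling division without floats


def group_of_block(block: int, block_size: int) -> int:
    """Index of the block group that contains the given block number."""
    offset = block - first_data_block(block_size)
    return offset // blocks_per_group(block_size)


if __name__ == "__main__":
    # A hypothetical file system of one million 4 KiB blocks.
    print(group_count(1_000_000, 4096), "block groups")
    print("block 40000 is in group", group_of_block(40_000, 4096))
```

Keeping a file's blocks inside one group is what lets the file system avoid long seeks.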

ext2
• Supports file systems of up to 4 terabytes in the original design
• File names of up to 255 characters in length (or up to 1012 by changing a single line of code)
• You can choose the block size when the file system is created (1024, 2048 or 4096 bytes)
• On top of this, the file system code was specifically written with modularity and extensibility in mind, and various features necessary to a commercial Unix environment were added.

Example of ext2 inode structure:

Inodes 

Crucial to understanding how ext2 (or almost any other Unix-like file system) works is the concept of the inode, the lowest common denominator of file storage description. Each inode holds the access rights, timestamps, size and type of the file, as well as pointers to the blocks which actually hold its data.

Each of these pointers refers either to a direct block, which itself holds data, or to an indirect block, which holds a list of pointers to either direct blocks or more indirect blocks. Because of this recursive nature, it's possible through triply indirect blocks for the file system to hold relatively huge files.

Directories seen in user space are themselves represented as files, and thus inodes, in the file system. Each directory contains a list of entries, each composed of a file name and its corresponding inode number. When a request is made to the file system for a specific file, that file's name is converted to the inode number, and the inode is referenced from that point on.

Notably, this design allows for the possibility of hard links, by which a single inode is referenced by two or more file names that can be located anywhere in the directory hierarchy. To delete a file or a hard link, the file system removes the file name and decrements the inode's link count. When the link count reaches zero, the inode is de-allocated.
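To see how direct and indirect pointers add up, here is a small sketch of the arithmetic. It assumes the classic ext2 inode layout of 12 direct pointers plus one single, one double and one triple indirect pointer, with 4-byte block numbers. These figures are not stated on the slides and are labelled as assumptions in the code.

```python
DIRECT_POINTERS = 12  # assumption: classic ext2 inode layout
POINTER_SIZE = 4      # assumption: 32-bit block numbers


def max_file_blocks(block_size: int) -> int:
    """Blocks reachable through direct, single, double and triple indirection."""
    per_block = block_size // POINTER_SIZE  # pointers that fit in one indirect block
    return DIRECT_POINTERS + per_block + per_block ** 2 + per_block ** 3


def max_file_bytes(block_size: int) -> int:
    """Upper bound on file size imposed by the pointer structure alone."""
    return max_file_blocks(block_size) * block_size


if __name__ == "__main__":
    for block_size in (1024, 2048, 4096):
        gib = max_file_bytes(block_size) / 2 ** 30
        print(f"{block_size} byte blocks: about {gib:.1f} GiB")
```

For 1 KiB and 2 KiB blocks this bound lines up with the 16 GiB and 256 GiB figures in the limits table. For larger blocks other limits, such as 32-bit counters in the inode, cap the file size below what the pointers alone would allow.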

File system limits

Some limits of the ext2 file system stem from the on-disk format of the data, others from the operating system's kernel.

Mostly these factors are determined once, when the file system is built. They depend on the block size and on the ratio of the number of blocks to the number of inodes. Block sizes of 8 KB are only possible on Alpha architectures by default.

There are also many userspace programs that can't handle files larger than 2 GB.

The limit on the number of subdirectories in a directory is about 32000, because ext2 caps an inode's link count. If a directory holds more than about 10000 to 15000 files, the user will normally be warned that operations can take a long time. The theoretical limit on the number of files in a directory is 1.3 × 10^20, although this is not relevant for practical situations.
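The block size and the blocks-to-inodes ratio of a mounted file system can be inspected from user space. The sketch below uses Python's os.statvfs (available on Unix-like systems) for this. The helper name and the choice of "/" as the default path are just for illustration.

```python
import os


def filesystem_geometry(path: str = "/") -> dict:
    """Report block size, block count and inode count for the file system holding path."""
    st = os.statvfs(path)
    total_bytes = st.f_frsize * st.f_blocks
    return {
        "block_size": st.f_frsize,
        "total_blocks": st.f_blocks,
        "total_inodes": st.f_files,
        # Some file systems report zero inodes, so guard the division.
        "bytes_per_inode": total_bytes // st.f_files if st.f_files else None,
    }


if __name__ == "__main__":
    for key, value in filesystem_geometry("/").items():
        print(f"{key}: {value}")
```

A low bytes-per-inode value means many inodes were reserved, which suits file systems holding lots of small files.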

Theoretical ext2 filesystem limits under Linux

Block size    Max. file size    Max. filesystem size
1 KiB         16 GiB            2 TiB
2 KiB         256 GiB           8 TiB
4 KiB         2 TiB             16 TiB
8 KiB         64 TiB            32 TiB

EXT3

ext3

The ext3 or third extended file system is a journaled file system that is commonly used by the Linux operating system. It is the default file system for many popular Linux distributions. Stephen Tweedie first revealed that he was working on extending ext2 in a February 1999 kernel mailing list posting, and the file system was merged into the mainline Linux kernel in November 2001, from 2.4.15 onward. Its main advantage over ext2 is journaling, which improves reliability and eliminates the need to check the file system after an unclean shutdown.

Advantages of ext3

Although its performance (speed) is less attractive than that of competing Linux filesystems such as JFS, ReiserFS and XFS, ext3 has the significant advantage of allowing in-place upgrades from the ext2 file system without having to back up and restore data. It also uses less CPU power than ReiserFS and XFS, and is considered safer than the other Linux file systems due to its relative simplicity and wider testing base.

The ext3 file system adds, over its predecessor:
• A journaling file system
• Online file system growth
• htree indexing for larger directories (a specialized version of a B-tree)

EXT3

Without these three features (from the previous slide), any ext3 file system is also a valid ext2 file system.

This has allowed well-tested and mature file system maintenance utilities for maintaining and repairing ext2 file systems to also be used with ext3 without major changes. The ext2 and ext3 file systems share the same standard set of utilities, e2fsprogs, which includes an fsck tool. The close relationship also makes conversion between the two file systems straightforward.

Journaling levels

There are three levels of journaling available in the Linux implementation of ext3:
• Journal
• Ordered
• Writeback

Journal (lowest risk)

Both metadata and file contents are written to the journal before being committed to the main file system. Because the journal is relatively continuous on disk, this can improve performance in some circumstances. In other cases, performance gets worse because the data must be written twice: once to the journal, and once to the main part of the filesystem.

Ordered (medium risk)

Only metadata is journaled; file contents are not, but it's guaranteed that file contents are written to disk before associated metadata is marked as committed in the journal. This is the default on many Linux distributions. If there is a power outage or kernel panic while a file is being written or appended to, the journal will indicate the new file or appended data has not been "committed", so it will be purged by the cleanup process. (Thus appends and new files have the same level of integrity protection as the "journaled" level.)

However, files being overwritten can be corrupted because the original version of the file is not stored. Thus it's possible to end up with a file in an intermediate state between new and old, without enough information to restore either one or the other (the new data never made it to disk completely, and the old data is not stored anywhere). Even worse, the intermediate state might intersperse old and new data, because the order of the write is left up to the disk's hardware.

Writeback (highest risk)









Only metadata is journaled; file contents are not. The contents might be written before or after the journal is updated. As a result, files modified right before a crash can become corrupted. For example, a file being appended to may be marked in the journal as being larger than it actually is, causing garbage at the end. Older versions of files could also appear unexpectedly after a journal recovery. The lack of synchronization between data and journal is faster in many cases. XFS and JFS use this level of journaling, but ensure that any "garbage" due to unwritten data is zeroed out on reboot.
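The difference between the three levels can be made concrete with a toy model of a file append that crashes part-way through. This is only a sketch: the step names are invented, the writeback ordering shown is just one of the orders that mode permits, and overwrites (which ordered mode does not fully protect) are not modelled.

```python
# Steps performed for one append, in order, under each journaling level.
# Step names are illustrative labels, not kernel terminology.
MODES = {
    "journal": ["data->journal", "meta->journal", "commit", "data->disk", "meta->disk"],
    "ordered": ["data->disk", "meta->journal", "commit", "meta->disk"],
    # Writeback does not order the data write; this is one possible order.
    "writeback": ["meta->journal", "commit", "data->disk", "meta->disk"],
}


def state_after_crash(done):
    """Classify the file after journal recovery, given the steps that completed."""
    if "commit" not in done:
        return "old"  # uncommitted transaction is discarded, file keeps its old size
    data_safe = "data->disk" in done or "data->journal" in done
    # Committed metadata says the file grew; without the data that region is garbage.
    return "new" if data_safe else "garbage"


def outcomes(mode):
    """All states reachable by crashing after 0, 1, ..., n steps."""
    steps = MODES[mode]
    return {state_after_crash(steps[:k]) for k in range(len(steps) + 1)}


if __name__ == "__main__":
    for mode in MODES:
        print(mode, sorted(outcomes(mode)))
```

In this model only writeback can leave a file whose metadata has grown past the data actually written, which matches the "garbage at the end" case described above.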

Size limits

ext3 has a maximum size for both individual files and the entire filesystem. For the filesystem as a whole, that limit is 2^31−1 blocks. Both limits depend on the block size of the filesystem:

Block size          Max file size    Max filesystem size
1 KiB               16 GiB           <2 TiB
2 KiB               256 GiB          <4 TiB
4 KiB               2 TiB            <8 TiB
8 KiB [limits 1]    2 TiB            <16 TiB
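The filesystem column follows directly from the block count limit quoted above. A minimal sketch of the arithmetic, using only that 2^31−1 figure (the function name is made up for the example):

```python
MAX_BLOCKS = 2 ** 31 - 1  # block count limit quoted on the slide
TIB = 2 ** 40


def max_filesystem_bytes(block_size: int) -> int:
    """Largest ext3 file system, in bytes, for a given block size in bytes."""
    return MAX_BLOCKS * block_size


if __name__ == "__main__":
    for kib in (1, 2, 4, 8):
        size = max_filesystem_bytes(kib * 1024)
        print(f"{kib} KiB blocks: {size / TIB:.6f} TiB")
```

Each result falls exactly one block short of a round number of TiB, which is why the table shows "<2 TiB", "<4 TiB" and so on.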


UNIX Overview

UNIX

A simple description of the UNIX system, also applicable to Linux, is this:

"On a UNIX system, everything is a file; if something is not a file, it is a process."

Structure

This statement is a simplification, because there are special files that are more than just files (named pipes and sockets, for instance), but to keep things simple, saying that everything is a file is an acceptable generalization.

A Linux system, just like UNIX, makes no distinction between a file and a directory, since a directory is just a file containing names of other files. Programs, services, texts, images, and so forth, are all files. Input and output devices, and generally all devices, are considered to be files, according to the system.

Structure

In order to manage all those files in an orderly fashion, people like to think of them in an ordered tree-like structure on the hard disk, as we know from MS-DOS (Disk Operating System) for instance. The large branches contain more branches, and the branches at the end contain the tree's leaves or normal files. For now we will use this image of the tree, but we will find out later why this is not a fully accurate image.

Sorts of files

Most files are just files, called regular files; they contain normal data, for example text files, executable files or programs, input for or output from a program and so on.

While it is reasonably safe to suppose that everything you encounter on a Linux system is a file, there are some exceptions:
• Directories: files that are lists of other files.
• Special files: the mechanism used for input and output. Most special files are in /dev, we will discuss them later.
• Links: a system to make a file or directory visible in multiple parts of the system's file tree. We will talk about links in detail.
• (Domain) sockets: a special file type, similar to TCP/IP sockets, providing inter-process networking protected by the file system's access control.
• Named pipes: act more or less like sockets and form a way for processes to communicate with each other, without using network socket semantics.
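Hard links tie back to the inode's link count discussed earlier. The sketch below creates a file and a second name for it in a temporary directory, then removes the first name. It assumes a POSIX file system that supports hard links, and the file names are arbitrary.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file and a hard link to it, then remove the original name."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "report.txt")
        alias = os.path.join(workdir, "report-link.txt")
        with open(original, "w") as handle:
            handle.write("hello")

        os.link(original, alias)  # second directory entry, same inode
        first, second = os.stat(original), os.stat(alias)
        same_inode = first.st_ino == second.st_ino
        links_before = first.st_nlink

        os.remove(original)  # drops one name and decrements the link count
        links_after = os.stat(alias).st_nlink
        with open(alias) as handle:
            content = handle.read()  # data is still reachable via the other name
        return same_inode, links_before, links_after, content


if __name__ == "__main__":
    print(hard_link_demo())
```

The data only disappears once the last name is removed and the link count drops to zero.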

Example

The -l option to ls displays the file type, using the first character of each output line:

  jaime:~/Documents> ls -l
  total 80
  -rw-rw-r-- 1 jaime jaime 31744 Feb 21 17:56 intro Linux.doc
  -rw-rw-r-- 1 jaime jaime 41472 Feb 21 17:56 Linux.doc
  drwxrwxr-x 2 jaime jaime 4096 Feb 25 11:50 course
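The same type character can be obtained programmatically. This sketch uses Python's standard stat.filemode, which renders a mode as an ls-style string, and maps the first character to a description. The mapping table and function name are this example's own.

```python
import os
import stat

# First character of the ls-style mode string and what it stands for.
TYPE_NAMES = {
    "-": "regular file",
    "d": "directory",
    "l": "symbolic link",
    "c": "character device",
    "b": "block device",
    "s": "socket",
    "p": "named pipe",
}


def file_type(path: str):
    """Return the ls-style type character and a description for path."""
    mode = os.lstat(path).st_mode  # lstat so that links are not followed
    flag = stat.filemode(mode)[0]
    return flag, TYPE_NAMES.get(flag, "unknown")


if __name__ == "__main__":
    for candidate in ("/", "/etc/hostname", "/dev/null"):
        if os.path.lexists(candidate):
            print(candidate, file_type(candidate))
```

Using lstat rather than stat matters here: stat would follow a symbolic link and report the type of its target instead.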


Partitions

Why partition?

Most people have a vague knowledge of what partitions are, since every operating system has the ability to create or remove them.



It may seem strange that Linux uses more than one partition on the same disk, even when using the standard installation procedure, so some explanation is called for.



One of the goals of having different partitions is to achieve higher data security in case of disaster.



By dividing the hard disk into partitions, data can be grouped and separated.

When an accident occurs, only the data in the partition that got the hit will be damaged, while the data on the other partitions will most likely survive.

Why partition?

This principle dates from the days when Linux didn't have journaled file systems and power failures might have led to disaster.



The use of partitions remains for security and robustness reasons, so a breach on one part of the system doesn't automatically mean that the whole computer is in danger. This is currently the most important reason for partitioning.

A simple example: a user creates a script, a program or a web application that starts filling up the disk. 

If the disk contains only one big partition, the entire system will stop functioning if the disk is full. If the user stores the data on a separate partition, then only that (data) partition will be affected, while the system partitions and possible other data partitions keep functioning.
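To watch for the disk-filling scenario above, each partition can be checked separately. The sketch below uses shutil.disk_usage from the Python standard library. The 90% threshold and the list of paths are arbitrary choices for illustration.

```python
import shutil


def percent_used(path: str) -> float:
    """Percentage of the file system holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # some virtual file systems report no capacity
        return 0.0
    return 100.0 * usage.used / usage.total


def nearly_full(path: str, threshold: float = 90.0) -> bool:
    """True when usage is at or above the (arbitrary) warning threshold."""
    return percent_used(path) >= threshold


if __name__ == "__main__":
    # Paths on different partitions are reported independently.
    for mount in ("/", "/home", "/var"):
        try:
            print(f"{mount}: {percent_used(mount):.1f}% used")
        except FileNotFoundError:
            print(f"{mount}: not present on this system")
```

If /home sits on its own partition, a runaway script there pushes only that figure to 100%, while / keeps its free space.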

Why partition?

Mind that having a journaled file system only provides data security in case of power failure and sudden disconnection of storage devices.



This does not protect your data against bad blocks and logical errors in the file system. In those cases, you should use a RAID (Redundant Array of Inexpensive Disks) solution.

Partition layout and types

There are two kinds of major partitions on a Linux system:
• data partition: normal Linux system data, including the root and home partition containing all the data to start up and run the system; and
• swap partition: expansion of the computer's physical memory, extra memory on hard disk.

Root partition



The standard root partition (indicated with a single forward slash, /) is about 100-500 MB, and contains the system configuration files, most basic commands and server programs, system libraries, some temporary space and the home directory of the administrative user. A standard installation requires about 250 MB for the root partition.

Swap space

Swap space (indicated with swap) is only accessible for the system itself, and is hidden from view during normal operation.



Swap is the system that ensures, like on normal UNIX systems, that you can keep on working, whatever happens.



On Linux, you will virtually never see irritating messages like Out of memory, please close some applications first and try again, because of this extra memory.



The swap or virtual memory procedure has long since been adopted by operating systems outside the UNIX world.



Using memory on a hard disk is naturally slower than using the real memory chips of a computer, but having this little extra is a great comfort.

Swap space

Linux generally counts on having twice the amount of physical memory in the form of swap space on the hard disk. When installing a system, you have to know how you are going to do this. An example on a system with 512 MB of RAM:
• 1st possibility: one swap partition of 1 GB
• 2nd possibility: two swap partitions of 512 MB
• 3rd possibility: with two hard disks: 1 partition of 512 MB on each disk.

The last option will give the best results when a lot of I/O is to be expected.
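The sizing in this example is simple arithmetic, sketched below. The factor of two is the rule of thumb from the slide rather than a fixed requirement, and the helper names are invented.

```python
def recommended_swap_mb(ram_mb: int, factor: int = 2) -> int:
    """Swap size under the 'twice the physical memory' rule of thumb."""
    return ram_mb * factor


def split_swap(total_mb: int, partitions: int) -> list:
    """Divide the swap space evenly over a number of partitions."""
    if partitions < 1 or total_mb % partitions:
        raise ValueError("swap space does not divide evenly")
    return [total_mb // partitions] * partitions


if __name__ == "__main__":
    total = recommended_swap_mb(512)  # the 512 MB machine from the example
    print("one partition:", split_swap(total, 1))
    print("two partitions:", split_swap(total, 2))
```

Placing the two halves on separate disks, as in the third possibility, lets swap I/O proceed on both disks at once.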

Kernel



The kernel is on a separate partition as well in many distributions, because it is the most important file of your system. If this is the case, you will find that you also have a /boot partition, holding your kernel(s) and accompanying data files.

The rest

The rest of the hard disk(s) is generally divided in data partitions, although it may be that all of the non-system critical data resides on one partition, for example when you perform a standard workstation installation.

When non-critical data is separated on different partitions, it usually happens following a set pattern:
• a partition for user programs (/usr)
• a partition containing the users' personal data (/home)
• a partition to store temporary data like print- and mail-queues (/var)
• a partition for third party and extra software (/opt)

Once the partitions are made, you can only add more. Changing sizes or properties of existing partitions is possible but not advisable.

Mount points

All partitions are attached to the system via a mount point. The mount point defines the place of a particular data set in the file system. Usually, all partitions are connected through the root partition. On this partition, which is indicated with the slash (/), directories are created. These empty directories will be the starting point of the partitions that are attached to them.

An example: given a partition that holds the following directories:
• videos/
• cd-images/
• pictures/
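Which mount point a given file lives under can be found by walking up the directory tree until a mount point is reached. A minimal sketch using os.path.ismount from the Python standard library (the function name is this example's own):

```python
import os


def find_mount_point(path: str) -> str:
    """Walk up from path until a directory that is a mount point is reached."""
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        current = os.path.dirname(current)  # step one level towards the root
    return current


if __name__ == "__main__":
    for candidate in (os.getcwd(), "/usr/bin", "/"):
        print(candidate, "is on the file system mounted at", find_mount_point(candidate))
```

The loop always terminates, because the root directory (/) is itself a mount point.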

Example file system layout

This is a layout (last slide) from a RedHat system. Depending on the system admin, the operating system and the mission of the UNIX machine, the structure may vary, and directories may be left out or added at will. The names are not even required; they are only a convention. The tree of the file system starts at the trunk or slash, indicated by a forward slash (/). This directory, containing all underlying directories and files, is also called the root directory or "the root" of the file system.

Subdirectories of the root directory

Directory      Content
/bin           Common programs, shared by the system, the system administrator and the users.
/boot          The startup files and the kernel, vmlinuz. In some recent distributions also grub data. Grub is the GRand Unified Boot loader.
/dev           Contains references to all the CPU peripheral hardware, which are represented as files with special properties.
/etc           Most important system configuration files are in /etc; this directory contains data similar to those in the Control Panel in Windows.
/home          Home directories of the common users.
/initrd        (on some distributions) Information for booting. Do not remove!
/lib           Library files, includes files for all kinds of programs needed by the system and the users.
/lost+found    Every partition has a lost+found in its upper directory. Files that were saved during failures are here.
/misc          For miscellaneous purposes.
/mnt           Standard mount point for external file systems, e.g. a CD-ROM or a digital camera.
/net           Standard mount point for entire remote file systems.
/opt           Typically contains extra and third party software.
/proc          A virtual file system containing information about system resources. More information about the meaning of the files in proc is obtained by entering the command man proc in a terminal window. The file proc.txt discusses the virtual file system in detail.
/root          The administrative user's home directory. Mind the difference between /, the root directory and /root, the home directory of the root user.
/sbin          Programs for use by the system and the system administrator.
/tmp           Temporary space for use by the system, cleaned upon reboot, so don't use this for saving any work!
/usr           Programs, libraries, documentation etc. for all user-related programs.
/var           Storage for all variable files and temporary files created by users, such as log files, the mail queue, the print spooler area, space for temporary storage of files downloaded from the Internet, or to keep an image of a CD before burning it.

Closing

Any questions about what we have covered so far?

Self study

Brian Carrier, File System Forensic Analysis: Chapters 14 and 15 (pages 397 to 478)
