FAT/NTFS The Wily Internals of Windows’s File Systems
1
FAT12 FAT16 FAT32(VFAT) HPFS/HPFS386 NTFS WinFS
©2005 Christopher Taylor
Windows File Systems
FAT12 – 12-bit allocation system used on floppy diskettes. Windows 2000 will default to formatting a very small volume (a 64MB flash card, for instance) with FAT12 to avoid wasting space on the overhead incurred by other file systems. Every operating system uses FAT12 for floppies to allow for interoperability.

FAT16 – 16-bit allocation used on small (<2GB) hard drives. It was sufficient for small volumes on personal computers, but lacked the performance and stability needed for a file server. The 8.3 file naming convention was a serious limitation that tormented users for years.

FAT32 – Often lumped together with VFAT (strictly, VFAT is the long file name extension that first shipped on 16-bit FAT in Windows 95). It uses a 32-bit allocation table to overcome the size limitations of the 16-bit system. Above 500MB, FAT16 required cluster sizes that started to induce large amounts of wasted slack space, and performance lagged. Introduced with Windows 95 OSR2, mainstreamed with Windows 98. Long filename support is provided as a horrendous kludge.

HPFS – High Performance File System, built for speed! Designed as a joint venture between Microsoft and IBM, it included many marked improvements over FAT: long filename support, 512-byte allocation units (which almost completely did away with file slack), prevention of file fragmentation, and tracking of considerably more file metadata than FAT allowed. The 386 version had nothing to do with CPU architecture, but rather used the name as the latest marketing buzzword meaning ‘faster’. It had a larger file I/O cache and some other minor improvements that resulted in a marked performance boost for the server version of OS/2. But it was released about the same time IBM and MS split, and IBM had to pay royalties to MS any time it sold HPFS386, so it was not widely used. Windows support for this FS was dropped with Windows 2000.

NTFS – While many consider it a ‘new file system built from the ground up’, it is largely based on HPFS. It incorporates all the features of HPFS and Macintosh’s HFS and overcomes many of their performance and security shortcomings. Where HPFS’s performance starts degrading above 400MB volumes, NTFS’s performance theoretically doesn’t drop off right up to the theoretical volume limit of 16EB (2TB is the practical limit, but we’ll see what happens when drives get that big).

WinFS – The new and improved file system that Microsoft is still working on. It was supposed to ship with Vista, but due to production delays it will most likely not be available until several years later.

The information provided on the following links details more than I could ever fit on this page.
http://en.wikipedia.org/wiki/Comparison_of_file_systems
http://support.microsoft.com/kb/100108
2
FAT Overview
Originally conceived when floppy disks were the only media
Extremely limited
No security
Short file names
No POSIX support
MS-DOS went through three versions before FAT16 was introduced with DOS 4.0. For all three of these, DOS ran from a floppy in a PC that had no hard disk drive. In this sort of environment, issues such as file security were more easily dealt with by locking the floppy in a filing cabinet than by technical means. Thus, FAT was never designed with the features that are now considered ‘a must’ in a file system, such as access controls, encryption, compression, foreign language support, etc. Also, since FAT was designed to operate in the confined spaces of a floppy diskette, its performance on larger volumes degrades rapidly.
3
FAT Volume Layout
First sector contains the MBR and partition table
The FAT and a backup are in the next few sectors
The root directory is immediately after that
Everything else
FAT12 and FAT16 volumes will always follow the physical layout above. FAT32 changed some of the rules about where to find these pieces, but still contains all the same basic structures.

There is always a ‘backup’ FAT in case the first becomes corrupted. The two copies of the FAT are mirrored and, unless the system dies while actually writing to the FAT, they will always be the same. The only real reason for the FATs not to match is a physical defect on the media. If this happens, it will most likely affect more than just a single sector, and having the FATs adjacent means you run the risk of losing both the primary and your backup.

Bytes 17-18 in the first sector specify the maximum number of entries in the root directory, usually 224 on a floppy disk and 512 on a hard disk. That number times 32 bytes (the size of one directory entry) gives the amount of space immediately following the backup FAT that is reserved for the contents of the root directory. In a FAT12 or FAT16 file system, the root directory always starts in the first sector following the backup FAT. The rest of the data and subdirectories lie after that space. Regular directories shrink and grow as necessary based on the number of entries in the directory and can be located anywhere on the drive.

Due to formatting constraints, there are often a number of sectors after the data that cannot be allocated as part of the partition. These are usually referred to as hidden sectors. It is the same concept as file slack: think of this as partition slack. This space usually goes unused and is inaccessible unless formatted as a small second partition or accessed via specially written software that bypasses the OS and reads and writes directly to the drive. Bytes 28-29 of the first sector specify how many hidden sectors there are.

The changes made in FAT32 include allowing the second FAT and root directory to reside physically anywhere on the drive.

Figure 17.3 from Windows 2000 Professional Resource Kit
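The layout arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `parse_bpb` and the dictionary keys are made up for the example, and the offsets of the fields other than bytes 17-18 follow the standard DOS boot sector layout rather than anything shown on this page.

```python
import struct

DIR_ENTRY_SIZE = 32  # every FAT directory entry is 32 bytes


def parse_bpb(boot_sector: bytes) -> dict:
    """Read the FAT12/FAT16 layout fields from the first sector of a volume.

    Offsets are the standard DOS boot sector positions (an assumption here):
    11-12 bytes/sector, 13 sectors/cluster, 14-15 reserved sectors,
    16 number of FATs, 17-18 root entries, 19-20 total sectors,
    21 media descriptor, 22-23 sectors per FAT.
    """
    (bytes_per_sector, sectors_per_cluster, reserved, fat_count,
     root_entries, total_sectors, media, sectors_per_fat) = struct.unpack_from(
        "<HBHBHHBH", boot_sector, 11)

    # Root directory size: entry count times 32 bytes, rounded up to sectors.
    root_dir_sectors = -(-root_entries * DIR_ENTRY_SIZE // bytes_per_sector)
    # The root directory begins right after the reserved area and all FATs.
    root_dir_start = reserved + fat_count * sectors_per_fat
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "root_entries": root_entries,
        "media_descriptor": media,
        "root_dir_start": root_dir_start,
        "root_dir_sectors": root_dir_sectors,
        "data_start": root_dir_start + root_dir_sectors,
    }
```

Fed the first sector of a 1.44MB floppy (224 root entries, two FATs of 9 sectors each), this places the root directory at sector 19 and the first data sector at sector 33.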
4
FAT12
Designed for floppy disks
Every OS uses it for floppies
Allows for portability between OSs
Sometimes used on very small volumes
Less than 16MB
FAT12 uses a 12-bit file allocation table entry (2^12 clusters). It is primarily used on floppy disks, but can be found on very small volumes in Windows 2000. On volumes with fewer than 32,680 sectors (less than 16 MB), the cluster sizes can be up to 8 sectors per cluster. In this circumstance, the format program creates a 12-bit FAT. It generally doesn’t matter what OS you are running: a floppy is formatted with FAT12. The table below shows the specifications of each type of floppy diskette.

size (inches)                     5-1/4  5-1/4  5-1/4  5-1/4  5-1/4  3-1/2  3-1/2  3-1/2
capacity (in KB)                    160    180    320    360   1200    720   1440   2880
density (Double/High/Extended)        D      D      D      D      H      D      H      E
media descriptor byte                FE     FC     FF     FD     F9     F9     F0     F0
sides (heads)                         1      1      2      2      2      2      2      2
tracks                               40     40     40     40     80     80     80     80
sectors per track                     8      9      8      9     15      9     18     36
total sectors                       320    360    640    720   2400   1440   2880   5760
bytes per sector                    512    512    512    512    512    512    512    512
reserved sectors                      1      1      1      1      1      1      1      1
entries in root dir                  64     64    112    112    224    112    224
sectors per FAT                       1      2      1      2      7      3      9
hidden sectors                        0      0      0      0      0      0      0      0
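The geometry rows of the table are related by simple arithmetic: total sectors is sides × tracks × sectors per track, and capacity is total sectors × 512 bytes. A small Python check of that relationship, using the table's own numbers (the names `FLOPPY_GEOMETRY`, `total_sectors` and `capacity_kb` are invented for the example):

```python
BYTES_PER_SECTOR = 512

# capacity in KB -> (sides, tracks, sectors per track), copied from the table
FLOPPY_GEOMETRY = {
    160: (1, 40, 8),
    180: (1, 40, 9),
    320: (2, 40, 8),
    360: (2, 40, 9),
    1200: (2, 80, 15),
    720: (2, 80, 9),
    1440: (2, 80, 18),
    2880: (2, 80, 36),
}


def total_sectors(sides: int, tracks: int, sectors_per_track: int) -> int:
    """Number of sectors on the whole diskette."""
    return sides * tracks * sectors_per_track


def capacity_kb(sides: int, tracks: int, sectors_per_track: int) -> int:
    """Raw capacity in KB, before the boot sector, FATs and root directory."""
    return total_sectors(sides, tracks, sectors_per_track) * BYTES_PER_SECTOR // 1024
```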
5
FAT16
Used for the large hard drives available at the time MS-DOS 4.0 was released – 30MB
Larger FAT table entries allowed for larger volume size – up to a whopping 2GB
Otherwise, identical to FAT12
Exception: Win2K/XP have LFN support in FAT16 volumes created under them
FAT16 uses a 16-bit file allocation table entry. A 16-bit entry allows 2^16 (65,536) values, which leaves a practical limit of 65,524 clusters, with cluster size increasing as volume size increases. If the drive is small enough, each sector will be its own cluster. But as soon as the number of sectors increases above 65,524 (at roughly 32MB), a cluster becomes multiple sectors. The maximum volume size is 4GB on Windows 2000 and 2GB on everything prior. This process of saving files is extremely efficient in terms of memory use and read/write speed, but tends to waste space on the drive. Typical cluster sizes are:

Drive size             Cluster Size    Sectors/Cluster
128 MB - 255 MB        4 KB            8
256 MB - 511 MB        8 KB            16
512 MB - 1,023 MB      16 KB           32
1,024 MB - 2,048 MB    32 KB           64
Yet, in a typical Windows install the number of files under 10KB is in the thousands. A 5KB file saved on a drive using a 16KB cluster size wastes 11KB. This is called file slack or slack space.
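The slack calculation is just the unused tail of the file's last cluster. A minimal Python sketch (the function name `slack_bytes` is made up for the example):

```python
def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes wasted in the last cluster allocated to a file."""
    remainder = file_size % cluster_size
    # A file that exactly fills its last cluster (or is empty) wastes nothing.
    return 0 if remainder == 0 else cluster_size - remainder
```

For the example above, `slack_bytes(5 * 1024, 16 * 1024)` gives 11,264 bytes, which is the 11KB quoted.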
6
FAT32
Larger FAT entries give way to larger volumes and ability to track more files
Introduces concept of Long File Names to FAT
Root directory no longer ‘nailed down’
Backup FAT no longer just a mirror
FAT32 was “released” as a bandage fix to those limitations in FAT16. I put release in quotes because FAT32 was never really released by Microsoft. Rather, they supplied it in an OEM version of Win95 and in doing so pushed the support and compatibility issues onto the manufacturers. Microsoft offers no support for OEM releases; users have to call the manufacturer of the hardware that their copy of Windows was bundled with.

The root directory, while still special in its own right, was now treated as a regular directory in that it could be located anywhere on the drive and no longer had a max entry limit. Bytes 44-47 of the BIOS Parameter Block portion of the Boot Sector contain the number of the first cluster of the root directory.

Either FAT can be designated as the primary now. Before, there was always a ‘backup’, but never really a way to use it. FAT32 allowed for either FAT to be the primary or backup, which gave rise to a whole new level of reliability and flexibility.

The theoretical max volume size is 2TB, but this is not practical due to the overhead that would occur at those sizes. Current limits on volume size (32GB in Win2k/XP and 127GB in Win98) are imposed by the software creating the partitions (format.com), not by the file system. Win2K will attempt to format a volume larger than 32GB, then (after a long while) will produce the error “Volume too large”, and a search for this error code in MS TechNet will eventually lead you to an article named “Advantages of Using NTFS”. Win2K/XP can read an already formatted FAT32 volume of any size; it just won’t allow you to create one.

Volume        Cluster Size
256MB-8GB     4KB
8-16GB        8KB
16-32GB       16KB
32-??GB       32KB
7
FAT File Names
Stored in the Directory Entries
Cannot contain any of the following: . " / \ [ ] : ; | = , or a space
8.3 – Eight ASCII characters for name, a period, and three ASCII characters for extension
The name and other metadata about a file are all stored in the 32-byte directory entry for that file. The list of characters that cannot be used in a file name ( . " / \ [ ] ; : | = , or 0x20 ) is really an OS issue, not a file system issue. Linux, via its FAT support, can create files with some of these characters in their names. This may cause problems with portability if that disk is later read in a Windows environment, so it is better to go with the longer list of what is not allowed.

Dating back to the creation of the first FAT12 volumes in the 70s, all files were given a name in the 8.3 naming convention. That is, eight characters for the name, a “dot”, and three characters for an extension that identified the type of file. Eight characters is not a lot of space and left many a secretary confused about what that file really was. Long file name support was later shoe-horned in, but not in any semblance of an elegant way.
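As a rough illustration of the 8.3 rule, the check below rejects any name that breaks the length limits or uses a character from the forbidden list on the slide. It is a sketch only: the function name `is_valid_83` is invented, and real DOS and Windows apply further rules (reserved device names, control characters) that are not covered here.

```python
# Characters the slide lists as not allowed; the dot may only appear once,
# as the separator between name and extension.
FORBIDDEN = set('."/\\[]:;|=, ')


def is_valid_83(filename: str) -> bool:
    """Return True if filename fits the 8.3 convention described above."""
    base, _, ext = filename.partition(".")
    if not 1 <= len(base) <= 8 or len(ext) > 3:
        return False
    # A second dot ends up inside ext and is rejected as a forbidden character.
    return all(ch.isascii() and ch not in FORBIDDEN for ch in base + ext)
```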
8
FAT Directory
Directories are files that contain a list of other files. The OS interprets this as a container, but they are stored the same as files with regards to how they consume entries in the FAT
Each directory entry contains a file’s name, attributes, date/time stamps, starting cluster, and file’s size
Directories, with the exception of root, grow, shrink, and get fragmented just like any file. The root directory is located either immediately after the FAT area in FAT12 and FAT16 or at the location specified in the BIOS Parameter Block in FAT32. Usually, FAT32 will still place the root directory in the first available cluster, which places it right behind the FAT area, but this doesn’t HAVE to be its location. All other directories in all the FAT file systems will be allocated clusters as they need them and can reside anywhere on the disk.

When deleting a file, the first byte in the directory entry is changed to 0xE5, which tells the file system that that entry is available to be overwritten by a new file. Commands like DIR ignore entries that start with this byte as if they do not exist. Nothing else is changed or deleted. The UNDELETE command searches for all entries that start with 0xE5 and lists them. The first byte used to be the first letter of the file’s name, so undelete utilities will usually ask you to supply a letter to start the file name with, then change the 0xE5 to that letter, and the file is ‘magically’ restored.

There is not a lot of metadata, but all of the metadata about a file is stored in its directory entry. This includes a name for the file, a create date and time, a modified date and time, a last accessed date, the file’s attributes, the starting cluster where the file is located on the disk, and the file’s size in bytes.

Image is from Tim Paterson’s 1983 article titled An Inside Look at MS-DOS in Byte Magazine. Tim wrote DOS 1.0.
http://www.alumni.caltech.edu/~pje/dosfiles.html
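The undelete scan described above can be sketched as a walk over the 32-byte entries of a directory, collecting those whose first byte is 0xE5. This is an illustration, not the UNDELETE implementation: the function name `list_deleted` is invented, and two details are assumptions not stated in the notes, namely that a first byte of 0x00 marks the end of the used entries and that entries with attribute 0x0F (long file name pieces) should be skipped.

```python
ENTRY_SIZE = 32
DELETED_MARK = 0xE5
LFN_ATTRIBUTE = 0x0F


def list_deleted(directory: bytes) -> list:
    """Return the names of deleted entries, with '?' for the lost first letter."""
    names = []
    for offset in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[offset:offset + ENTRY_SIZE]
        if entry[0] == 0x00:
            break  # assumption: nothing has ever been stored past this point
        if entry[0] != DELETED_MARK or entry[11] == LFN_ATTRIBUTE:
            continue
        base = "?" + entry[1:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        names.append(base + ("." + ext if ext else ""))
    return names
```

Restoring a file then amounts to writing a chosen letter back over the 0xE5 byte, provided the file's clusters have not been reused in the meantime.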
9
FAT Directory
File Name
File Extension
Attributes
Created Time
Created Date
Accessed Date
Modified Time
Modified Date
Starting Cluster
File Size
Folders contain a 32-byte entry for each file and folder they contain. The entry includes the following information:

Byte(s)   Contents
0-7       file name or first 8 characters of volume name
8-10      file extension or last 3 characters of volume name
11        attribute byte
12        reserved
13        C-Time’s seconds (10 millisecond resolution)
14-17     C-Time/Date
18-19     A-Date
20-21     unused
22-25     M-Time/Date
26-27     number of first cluster
28-31     number of bytes in file, or zero for subdirectory or volume label

The attribute byte breaks into:

Bit     Meaning if bit = 1
7, 6    unused
5       file has been changed since last backup (archive bit)
4       entry represents a subdirectory
3       entry represents a volume label
2       system file
1       hidden file
0       read-only

The Time and Date attributes are read as follows:

Time bits   Contents
15-11       hour (0-23)
10-5        minute (0-59)
4-0         double seconds (0-29)

Date bits   Contents
15-9        years elapsed since 1980 (0-127)
8-5         month (1=January, 2=February, ..., 12=December)
4-0         day (1-31)

According to DOS, the world began on January 1, 1980 at 00:00:00 and will end on December 31, 2107 at 23:59:58.
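The byte and bit layouts above map directly onto a `struct` format string. The following Python sketch unpacks one entry and decodes the packed time and date words; the function names and dictionary keys are invented for the example, and each 4-byte Time/Date pair is assumed to store the time word first.

```python
import struct

# name, ext, attribute, reserved, C-Time fine seconds, C-Time, C-Date,
# A-Date, unused, M-Time, M-Date, first cluster, size (little-endian, 32 bytes)
ENTRY = struct.Struct("<8s3sBBBHHHHHHHI")

ATTRIBUTE_BITS = ["read-only", "hidden", "system", "volume label",
                  "subdirectory", "archive"]  # bits 0 through 5


def decode_time(word: int) -> tuple:
    """Split a packed time into (hour, minute, second)."""
    return (word >> 11, (word >> 5) & 0x3F, (word & 0x1F) * 2)


def decode_date(word: int) -> tuple:
    """Split a packed date into (year, month, day)."""
    return (1980 + (word >> 9), (word >> 5) & 0x0F, word & 0x1F)


def parse_entry(raw: bytes) -> dict:
    """Decode one 32-byte short-name directory entry."""
    (name, ext, attr, _reserved, _fine, ctime, cdate, adate, _unused,
     mtime, mdate, cluster, size) = ENTRY.unpack(raw)
    base = name.decode("ascii", "replace").rstrip()
    suffix = ext.decode("ascii", "replace").rstrip()
    return {
        "name": base + ("." + suffix if suffix else ""),
        "attributes": [label for bit, label in enumerate(ATTRIBUTE_BITS)
                       if attr & (1 << bit)],
        "created": (decode_date(cdate), decode_time(ctime)),
        "accessed": decode_date(adate),
        "modified": (decode_date(mdate), decode_time(mtime)),
        "first_cluster": cluster,
        "size": size,
    }
```

The largest values the fields can hold, time 0xBF7D and date 0xFF9F, decode to 23:59:58 on December 31, 2107, which is the end of the world quoted above.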
10
Long File Names
Causes the name to no longer fit in a 32-byte entry.
This is fixed by prepending 32-byte headers to accommodate the LFN.
“The quick brown.fox” →
Long file name support was added to FAT first via the FastFAT drivers in NT 3.5 and later via VFAT in Win95. VFAT was released with the initial release of Windows 95 and included long file name support and some performance enhancements, but was still a 16-bit FAT. It wasn’t until OEM Service Release 2 of Win95 that FAT32 was released.

The image on this page shows a typical LFN entry. It is actually 3 short filename entries. The bottom two rows are the primary entry that contains the derived short name and the file’s other metadata. Extra entries are added from the bottom up until the entire LFN is accommodated.

To derive the short name, Windows will take the first 6 characters of the LFN, omitting spaces and special characters, and add a tilde (~) and a number. If other files have the same first 6 characters, the number will be incremented to deconflict. If you copy a file into a directory that already has a file with the same alias, the file being copied will have its short name changed to another number. This can cause problems with programs that have links to the file’s short name instead of its LFN. The registry often points to the alias.

The first byte of each long entry is a count of how many entries were added to accommodate the entire name. That counter on the last entry added will be increased by 0x40 to indicate that it is the last entry.

The LFN is stored in a Unicode (POSIX compliance!) format to support languages that don’t use English letters. For this reason 2 bytes are allocated for each character of the LFN. If the LFN is in English, only 1 byte is needed and the second is a null (i.e., ABC in hex is 41 42 43, but in the LFN it will be stored as 41 00 42 00 43 00). The LFN is always null terminated (0x0000), and the rest of the bytes in that entry that could hold characters will be padded with 0xFFFF.

Figure 17.5 from Windows 2000 Professional Resource Kit
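The alias rule described above (first 6 usable characters, a tilde, and a number that is bumped on collision) can be sketched as follows. This is a simplification under stated assumptions: the function name `short_alias` is invented, "special characters" is taken to mean anything that is not a letter or digit, and Windows applies further rules that these notes do not cover.

```python
def short_alias(long_name: str, existing: set) -> str:
    """Derive an 8.3 alias for long_name that is not already in existing."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:                      # no extension at all
        base, ext = long_name, ""
    # Keep only letters and digits, upper-cased, then truncate to 6 and 3.
    stem = "".join(ch for ch in base.upper() if ch.isalnum())[:6]
    ext3 = "".join(ch for ch in ext.upper() if ch.isalnum())[:3]
    suffix = "." + ext3 if ext3 else ""
    number = 1
    while f"{stem}~{number}{suffix}" in existing:
        number += 1                  # deconflict with aliases already present
    return f"{stem}~{number}{suffix}"
```

With an empty directory, "New Text Document.txt" becomes NEWTEX~1.TXT; if that alias is already taken, the next file gets NEWTEX~2.TXT.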
11
Long File Name Entries
Long File Name
Short File Name
Extension
Entry counter
Short name checksum
Always the same meaningless drivel for each LFN entry
Rest of the file’s metadata
Here is a real world example of a common long file name, “New Text Document.txt”. This name consumes three entries. The bottom entry contains the short name for the file and all the normal information that is required to track the file. The second and third entries are the long name for the file, but there is a hole cut in the middle of the name where the attributes, reserved bytes, and lack of A-Time would be in a normal entry.

The attributes will always be 0x0F and the reserved byte is a checksum of the short file name, in this case 0x9F. This checksum tells the file system if something corrupted the list of entries and verifies that this entry goes with the short-named entry below it.

The first byte is a counter that shows how many entries are in use. The last entry will have this counter’s first nibble incremented by 0x4 to denote that it is the last entry. For normal names this will place the counter in the 0x41-0x44 range, but given the 255 character limit on names the actual top end of this counter is 0x54.

The LFN is null terminated (notice the 0x0000 after the last character in the top entry), and the rest of the last entry is padded with 0xFF in all of the fields that could store a character of the filename.
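The checksum and counter values quoted above can be reproduced in a few lines of Python. Two things here come from Microsoft's published FAT specification rather than from these notes, so treat them as assumptions: the checksum is a rotate-right-and-add over the 11 bytes of the short name, and each long-name entry holds 13 characters. The function names are invented for the example.

```python
CHARS_PER_LFN_ENTRY = 13  # assumption: 13 two-byte characters fit in one entry


def lfn_checksum(short_name: bytes) -> int:
    """Checksum of the 11-byte short name (8 name bytes + 3 extension bytes)."""
    total = 0
    for byte in short_name:
        # Rotate the running sum right by one bit, then add the next byte.
        total = (((total & 1) << 7) + (total >> 1) + byte) & 0xFF
    return total


def lfn_entry_count(long_name: str) -> int:
    """Number of long-name entries needed to hold long_name."""
    return -(-len(long_name) // CHARS_PER_LFN_ENTRY)


def last_entry_counter(long_name: str) -> int:
    """Counter byte stored in the final (topmost) long-name entry."""
    return 0x40 | lfn_entry_count(long_name)
```

For the short name NEWTEX~1.TXT (stored as the 11 bytes `NEWTEX~1TXT`) the checksum works out to 0x9F, matching the example, and a 255-character name needs 20 entries, which gives the 0x54 top end mentioned above.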
12
An Extreme LFN Example
Here we have a file whose name is 255 characters long. The view has been expanded to 32 bytes per row so that each line is a separate directory entry. The top three lines are the original name of the file as it was created when clicking ‘new’ -> ‘text document’ from the right-click context menu. This operation actually creates a file named “New Text Document.txt” and then immediately kicks off a rename operation. Since FAT is reluctant to overwrite existing entries as long as there is room, we have this artifact to contend with.

Column 0 provides the counter to show how far into the LFN each entry is.
Column 11 is 0x20 for the two short name entries to denote that the archive attribute is the only one set for these files. The rest of the entries are set to 0x0F, which sets the Volume, System, Hidden, and Read-Only attributes. This is the setting for all LFN entries.
Column 12 is always reserved. It is usually either 0x00 or 0x18.
Column 13 is reserved in a short-name entry or is the checksum of the short name when in a long-name entry. This checksum provides a check that the long name is associated with the correct short name.
Columns 26-27 are the starting cluster on short-name entries and always 0x0000 in long-name entries.
13
The Skinny on the FAT
File Allocation Table – cluster map that shows which clusters are in use and which are free
File’s directory entry contains the starting cluster number.
Each entry contains the number of the next cluster occupied by the file or an EOF marker
This process is referred to as cluster chaining
When creating a new file, the file’s name, attributes, starting cluster, and size are saved in an entry in the directory where the file resides. The OS goes through the FAT, finds the first unused cluster, and sets this as the file’s starting cluster. The starting cluster is the address of the first cluster used by the file. Each cluster’s entry in the FAT contains a pointer to the next cluster in the file, or a marker (0xFFFF) which indicates that this cluster is the end of the chain.

Files can be put together without knowing the name, attributes, or even size by just following the chain in the FAT. Start with entry 2, the first addressable cluster, and follow it to completion. Keep a tally of each cluster you have crossed so far. Save the data extracted from those clusters with some annotation that they started in cluster 2. Then go to cluster 3 and do the same. Then go to the next cluster you haven’t extracted already and do the same. Eventually you will have all the files, but with no file names, times, or attributes. Now traverse all the directory listings and note the starting cluster of each entry in order to associate a name with the extracted files.
Figure 17.4 from Windows 2000 Professional Resource Kit Image is from Tim Paterson’s 1983 article titled An Inside Look at MS-DOS in Byte Magazine. Tim wrote DOS 1.0.
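The walk described above can be sketched in Python over a FAT16 table held as a list of integers. The names `follow_chain` and `walk_all_chains` are invented for the example. One limitation carries over from the description itself: if a fragmented file has a later cluster with a lower number than its starting cluster, a front-to-back walk begins in the middle of that chain, so a real tool would first work out which clusters no other entry points to.

```python
END_OF_CHAIN_MIN = 0xFFF8  # 0xFFF8-0xFFFF all mean end of chain in FAT16


def follow_chain(fat: list, start: int) -> list:
    """Return the clusters of the chain that begins at start."""
    chain = []
    cluster = start
    while True:
        chain.append(cluster)
        nxt = fat[cluster]
        if nxt >= END_OF_CHAIN_MIN:
            return chain
        if nxt < 2 or nxt >= len(fat) or nxt in chain:
            raise ValueError(f"corrupt chain at cluster {cluster}")
        cluster = nxt


def walk_all_chains(fat: list) -> dict:
    """Map each discovered starting cluster to its chain, front to back."""
    visited = set()
    chains = {}
    for cluster in range(2, len(fat)):     # entries 0 and 1 are reserved
        if fat[cluster] == 0 or cluster in visited:
            continue                        # free, or already extracted
        chain = follow_chain(fat, cluster)
        chains[cluster] = chain
        visited.update(chain)
    return chains
```

Matching each key of the result against the starting-cluster field of the directory entries then puts names back on the recovered chains.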
14
16-bit entries
FAT16 entry is two bytes written ‘backwards’ (little-endian)
First two entries are reserved
Entry 0 is the Media Descriptor
Entry 1 is used for a few different things
Entry 2 is the first addressable cluster in the volume
Reserved: FFF8, FFFF
File1 (2): FFFF
File2 (3): 0004, 0005, 0006, 0007, 0008, 0009, FFFF
Entry number that each byte belongs to: 00 00 11 11 22 22 33 33 44 44 55 55 66 66 77 77
Being an x86-based system, all of the multi-byte numbers that we encounter will be written ‘backwards’: 0x1234 will be stored on the disk as the bytes 34 12. This is a function of how the processor picks the bytes up off the drive.

The first entry is reserved and contains the Media Descriptor, in this case 0xFFF8 since 0xF8 is the media descriptor for a fixed disk (see page 5 for a partial list of other media descriptor bytes). In the original, early DOS days this is where the Media Descriptor lived, full time. Later on it was moved to the Boot Sector, but for backward compatibility reasons this field is still used. Some utilities, ScanDisk for instance, use this field to get the Media Descriptor byte instead of the one in the Boot Sector.

The second entry is also reserved. It sometimes contains what is to be used as the end-of-chain marker throughout the rest of the FAT. In Windows this will always be 0xFFFF, but it could be any value between 0xFFF8 and 0xFFFF. Linux used to use 0xFFF8 in this field and as the EOF marker elsewhere in the FAT, but it was found that some devices (certain MP3 players) didn’t recognize anything but 0xFFFF as an EOF marker, so this was changed. Some references say that this entry is a ‘dirty bit’ to track the state of consistency of the file system.

Starting with entry 2, the first addressable cluster, the FAT is a series of either pointers to other clusters or EOF markers as described on the previous page. The example above is a FAT that contains 2 files:

File1’s directory entry tells us it starts in cluster 2. In entry 2 is 0xFFFF. This tells us that this file is only 1 cluster long and that we should go to that cluster and read the number of bytes in the directory entry’s file size field to extract the file.

File2’s directory entry tells us it starts in cluster 3. In entry 3 is 0x0004. This tells us that the data in cluster 3 continues in cluster 4. Entry 4 contains 0x0005, entry 5 contains 0x0006, and so on until we reach the EOF marker. We then divide the file size, found in the directory entry, by the cluster size, and the remainder is the number of bytes from the last cluster that we need to read to complete the file.
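Both steps, undoing the byte order and working out how much of the last cluster belongs to the file, can be shown with the example FAT from the slide. This is a sketch: the helper names are invented, and the handling of a file that exactly fills its last cluster (remainder of zero) is an addition not spelled out in the notes.

```python
import struct


def decode_fat16(raw: bytes) -> list:
    """Turn raw FAT16 bytes into a list of 16-bit entries (little-endian)."""
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}H", raw[:count * 2]))


def bytes_in_last_cluster(file_size: int, cluster_size: int) -> int:
    """How many bytes of the final cluster in the chain are file data."""
    remainder = file_size % cluster_size
    # A remainder of zero means the file fills its last cluster completely.
    return remainder if remainder else cluster_size


# The slide's example FAT exactly as it would appear in a hex editor.
EXAMPLE_FAT_BYTES = bytes.fromhex(
    "F8FF FFFF FFFF 0400 0500 0600 0700 0800 0900 FFFF")
```

Decoding `EXAMPLE_FAT_BYTES` gives FFF8, FFFF, FFFF, 0004 through 0009, and FFFF, which is the table shown on the slide.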
15
12-bit entries
FAT12 Entries are read ‘backwards’ (little-endian)
Entries are read 16 bits at a time and 4 bits are ignored – either first or last 4 depending which entry
This causes us to have to split the middle byte and place its parts on the far side of the adjacent bytes
Reserved: FF8, FFF
File1 (2): 004, 005, 006, FFF
File2 (3): FFF
Entry number that each nibble belongs to: 00 10 11 22 32 33 44 54 55 66 76 77 88 98 99 aa
FAT12 was around before FAT16, but since bytes are multiples of 8 bits the math was a lot simpler on the previous page. FAT12’s cluster chaining works identically to the previous example. The only difference, and it is a confusing one, is how we read the 12-bit entries in the 16-bit ‘hex’ editor.

File1’s directory entry says it starts in cluster 2, so we read in entry 2 from above:
The computer has to read in two bytes – 04 F0
These are then reversed to correct the ‘endianness’ – F0 04
The first nibble of the (now) first byte actually belongs to the next entry so it is ignored, which leaves: 004

But, to read in odd numbered entries, like File2’s starting cluster of 3:
The computer reads in two bytes – F0 FF
These are then reversed to correct the ‘endianness’ – FF F0
The last nibble of the (now) last byte is actually from entry 2, so it is ignored – FFF
As a bonus to add to the confusion – File2
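The nibble shuffling above reduces to one rule: entry n starts at byte offset n + n/2 (rounded down); read two bytes little-endian, then keep the low 12 bits for an even entry or the high 12 bits for an odd one. A Python sketch follows. The function name is invented, and the sample bytes are an assumption built to match the two-byte reads quoted above (04 F0 for entry 2 and F0 FF for entry 3), with File1 continuing through entries 4, 5 and 6.

```python
def fat12_entry(fat: bytes, index: int) -> int:
    """Return the 12-bit FAT entry at the given index."""
    offset = index + index // 2                  # 1.5 bytes per entry
    pair = fat[offset] | (fat[offset + 1] << 8)  # two bytes, little-endian
    if index % 2:
        return pair >> 4      # odd entry: drop the nibble owned by entry n-1
    return pair & 0x0FFF      # even entry: drop the nibble owned by entry n+1


# Hypothetical raw FAT12 bytes for entries 0-7:
# FF8, FFF, 004, FFF, 005, 006, FFF, 000
SAMPLE_FAT12 = bytes.fromhex("F8 FF FF 04 F0 FF 05 60 00 FF 0F 00")
```

Reading entry 2 from `SAMPLE_FAT12` yields 0x004 and entry 3 yields 0xFFF, the same results as the manual steps.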
16
28-bit entries
FAT32
Why 28? Will the madness ever end!?!
Entries consume 32 bits, but last 4 bits are ignored
Still little-endian
Reserved: FFFFFFF8, FFFFFFFF
Root Dir (2): 0FFFFFFF
File1 (3): 00000004, 00000005, 0FFFFFFF
File2 (6): 00000007, 0FFFFFFF
Entry number that each byte belongs to: 00 00 00 00 11 11 11 11 22 22 22 22 33 33 33 33
FAT32 actually uses 28-bit entries in the FAT. 32 bits are consumed, but the upper 4 bits are reserved and left 0. This gives us an interesting quirk: the second of the opening two reserved entries is supposed to be the EOF marker in use, which is 0xFFFFFFFF, but due to the 4 reserved bits the actual EOF marker seen on Windows formatted volumes is 0x0FFFFFFF.

The root directory was previously untracked because it was locked into its location and size, but with FAT32 it is now tracked in the FAT like every other directory. The BIOS Parameter Block in a FAT32 formatted Boot Sector contains a field that gives the cluster number of the starting point of the root directory. This value is almost always the first addressable cluster, 2.

Other than reversing the numbers to counter the little-endian effect, the 32-bit fields line up nicely and are rather easy to pick out.
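In code, the only change from FAT16 is reading four bytes per entry and masking off the reserved upper 4 bits before interpreting the value. A Python sketch using the slide's example table (the names are invented for the example):

```python
import struct

ENTRY_MASK = 0x0FFFFFFF        # only the low 28 bits carry the cluster number
END_OF_CHAIN_MIN = 0x0FFFFFF8


def decode_fat32(raw: bytes) -> list:
    """Turn raw FAT32 bytes into a list of masked 28-bit entries."""
    count = len(raw) // 4
    values = struct.unpack(f"<{count}I", raw[:count * 4])
    return [value & ENTRY_MASK for value in values]


def follow_chain32(entries: list, start: int) -> list:
    """Return the clusters of the chain that begins at start."""
    chain = [start]
    while entries[chain[-1]] < END_OF_CHAIN_MIN:
        nxt = entries[chain[-1]]
        if nxt < 2 or nxt >= len(entries) or nxt in chain:
            raise ValueError(f"corrupt chain at cluster {chain[-1]}")
        chain.append(nxt)
    return chain


# The slide's example: two reserved entries, the root directory in cluster 2,
# File1 in clusters 3-5 and File2 in clusters 6-7.
EXAMPLE_FAT32 = struct.pack(
    "<8I", 0xFFFFFFF8, 0xFFFFFFFF, 0x0FFFFFFF,
    0x00000004, 0x00000005, 0x0FFFFFFF, 0x00000007, 0x0FFFFFFF)
```

After masking, the reserved entry 0xFFFFFFFF and the on-disk marker 0x0FFFFFFF compare equal, which is why both work as end-of-chain markers.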
17
FAT Values
Each entry in the FAT contains one of the following values:

Value                     FAT12      FAT16        FAT32
unassigned                000        0000         00000000
invalid entry             001        0001         00000001
assigned                  002-FEF    0002-FFEF    00000002-FFFFFFEF
reserved                  FF0-FF6    FFF0-FFF6    FFFFFFF0-FFFFFFF6
cluster has bad sector    FF7        FFF7         FFFFFFF7
end of cluster-chain      FF8-FFF    FFF8-FFFF    FFFFFFF8-FFFFFFFF
When the volume is formatted, the FAT is initialized with all zeros. The first two entries are reserved. Because of this, an entry of 0 means the entry is available for use and an entry of 1 is invalid. The first addressable cluster becomes entry 2. In FAT12 and FAT16, the root directory is permanently anchored both in location and size, so there is no need to track it in the FAT. In FAT32 the root directory, which in theory can be anywhere, is almost always in the first addressable cluster and starts off only 1 cluster in size, so it will be present in entry 2.

An entry ending in F7 means that at least one of the sectors in that cluster has produced an error that tells the file system that the sector will no longer reliably hold data. With modern hard drive controllers, the tracking of bad sectors is done on the controller, so the file system would never see a bad sector and thus never mark one as such. There are programs written to hide data that do so by marking sectors as bad so that the file system will not allow the user access to that data, and it usually won’t be seen even by anti-virus or similar scanners. Nowadays, any “bad” sectors on a volume should be scrutinized and verified as bad.

The file system will treat any entry of all F’s ending in anything from 8 to F in hex as an end-of-cluster-chain marker. The system that formatted the volume should place the preferred EOF marker in entry 1 and use that one marker throughout. DOS and Windows will always use 0xFFFF. Originally, the FAT support in Linux used 0xFFF8, but it was found that some MP3 players did not recognize this marker, and the current Linux kernel now uses 0xFFFF as well.
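The value ranges on this page can be folded into one small classifier. This is a sketch with an invented function name; for FAT32 it masks off the reserved upper 4 bits first, so both 0xFFFFFFFF and the 0x0FFFFFFF seen on disk are treated as end of chain.

```python
def classify_fat_entry(value: int, fat_bits: int) -> str:
    """Classify a FAT entry; fat_bits is 12, 16 or 32."""
    if fat_bits == 32:
        value &= 0x0FFFFFFF       # upper 4 bits are reserved
        top = 0x0FFFFFFF
    else:
        top = (1 << fat_bits) - 1  # 0xFFF or 0xFFFF
    if value == 0:
        return "unassigned"
    if value == 1:
        return "invalid entry"
    if value >= top - 7:           # ...F8 through ...FF
        return "end of cluster-chain"
    if value == top - 8:           # ...F7
        return "cluster has bad sector"
    if value >= top - 15:          # ...F0 through ...F6
        return "reserved"
    return "assigned"
```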
18
NTFS
19
NTFS Versions
1.0 / found in NT 3.1
1.1 / found in NT 3.5
1.2 / 4.0 found in NT 3.51 and NT 4
3.0 / 5.0 found in Windows 2000
3.1 / 5.1 (5.2?) found in Windows XP and Windows Server 2003
Easily the most confusing version numbering ever used. NTFS has its own version numbers but is often referred to by the version of Windows that it was used in. i.e.: version 3 of NTFS came out with Windows 5.0 and thus is often called NTFS5.0. This is all fine and good until Windows 6.0, which will undoubtedly use NTFS v4.0, comes out. Then when we refer to NTFS4.0, are we talking about the new one or the one from the 90’s? Many sources refer to the NTFS found in NT4 as version 1.1. The copy of NT4 that I have with no service packs installed uses NTFS version 1.2. I unfortunately don’t have copies of 3.1, 3.5, or 3.51 and, although the thought crossed my mind, it wasn’t worth the effort to find and install them just to verify the file system’s version number. So, where version numbers changed from 1.0 to 1.1 to 1.2 as noted above is based on multiple, conflicting sources and is my best guess. We will discuss how to verify the version number later.
My other complaint is the missing version 2. Where’d it go?
20
NTFS Feature Overview
4KB cluster size right up to the max volume size
No dependence on underlying hardware (512-byte sectors)
No limit on file size (practically)
No reliance on a single sector (FAT, superblock)
Native support for encryption, compression, recoverability
More permissions and attributes on each file
While most people think of NTFS as having been “built from scratch”, it is really more of an improvement on HPFS than a completely new technology. HPFS performance degraded above 400MB, but NTFS performance holds up right to the 16EB theoretical max volume limit. The practical limit is 2TB per volume.

NTFS uses clusters rather than sectors to avoid dependence on the underlying hardware. NTFS was built with the thought that other disk technologies may be used in the future and a 512-byte sector might not be used on them. The cluster size can be specified when formatting the volume, but certain advanced features are not available if the cluster size is greater than 4KB.

Volume size    Default cluster size
512MB-1GB      1KB
1-2GB          2KB
2GB+           4KB
There is no single sector, save the boot sector, that is always found in the same place on every drive, and the boot sector is backed up at the end of the volume. This was done to remove the limitations of FAT’s FAT (always found right after the VBR) and HPFS’s superblock (always found in block 16). Under NTFS all objects on the disk are tracked and protected against failures, and multiple copies of the Master File Table are kept in separate locations on the disk.
21
NTFS maintains a transaction log and change journal and in the event of a failure can automatically restore the consistency of the file system.

NTFS can natively compress and encrypt files without the use of a separate application. The file is encrypted/compressed during the file write and decrypted/uncompressed during the file read routine, transparently to the user.

File permissions track users and groups and can prevent unauthorized access to a file.

In order to gain POSIX compliance, case sensitive file names are used. FAT saved everything in uppercase, and HPFS, while it did save the case of a file’s name, referred to the same file no matter how you mixed the case when opening it. Under NTFS README.TXT, Readme.txt, and readme.txt are all different files. Also, additional time stamps and hard links were added.
Windows 2000 Resource Kit – chapter 17
22
NTFS Encryption
Encrypting File System (EFS)
- Uses hybrid encryption: public-key (RSA) key exchange and symmetric-key (DESX) encryption
- Option for stronger encryption (3DES) exists but is off by default
- Protects files as they reside on the drive, not in transit as viewed through the OS
The Encrypting File System (EFS) provides the file encryption used in NTFS. EFS prevents unauthorized viewing of files by those who try to bypass NT permissions by physically stealing the hardware (laptop, HDDs, etc.) or by elevating privileges. For instance, the bootdisk-borne utility known as ‘chntpw’ allows a user with physical access to a system to change any user’s (including the administrator’s) password. The normal change-password process updates the keys, but this backdoor method does not. If it is used on a drive using EFS, the attacker can log into the newly acquired account but cannot access any of the encrypted files, and there is no recovery of the plaintext at that point.

EFS uses symmetric key encryption (DESX with a 128-bit key) in conjunction with public key technology (RSA). Users of EFS are issued a digital certificate with a public and private key pair. The data itself is encrypted with a File Encryption Key (FEK), and the key pair protects the FEK: each encrypted file carries a copy of its FEK encrypted with the public key of each user allowed to use the file, and the matching private key decrypts it. When the FIPS-compliant algorithms setting is turned on in LSA policy, the encryption uses 3DES with a 168-bit key.

The process is completely transparent to the user. Key management is handled by the Local Security Authority Subsystem (lsass.exe) without the user’s intervention. The data is encrypted/decrypted as it passes to/from the drive. To turn EFS on, the user sets the encryption attribute in the file or folder’s properties.

In addition to the copies held for individual users, each file’s FEK is also stored under a recovery key. An entity, usually a domain administrator, is assigned as a Recovery Agent. The recovery agent can decrypt ANY file, so an attacker who hijacks the recovery agent account has also hijacked the ability to read all encrypted data. Decryption requires that the user or administrator authenticate with their normal password, so most privilege-escalation attacks render the data unrecoverable. There is, however, an attack on the cached login credentials that an attacker can use. An XP machine that is not on a domain does not create a recovery agent, so home users have no recourse if they lock themselves out of a file.
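The hybrid scheme described above can be illustrated with a toy sketch. This is not EFS: the symmetric cipher is a stand-in (a SHA-256 keystream instead of DESX), the RSA is unpadded textbook RSA built from small Mersenne primes, and every function name here is hypothetical. It only shows the shape of the idea: the data is encrypted once with a random FEK, and the FEK is wrapped separately for the user and for the recovery agent.

```python
import hashlib
import secrets

def make_keypair(p, q, e=65537):
    """Textbook RSA from two primes; returns (public, private). Toy only."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, e), (n, pow(e, -1, phi))

def keystream_xor(key, data):
    """Stand-in for DESX: XOR the data with a SHA-256 counter keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        pad = hashlib.sha256(key + start.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[start:start + 32], pad))
    return bytes(out)

def encrypt_file(plaintext, public_keys):
    """Encrypt once with a random 128-bit FEK, then wrap the FEK per key holder."""
    fek = secrets.token_bytes(16)
    m = int.from_bytes(fek, "big")
    wrapped = {who: pow(m, e, n) for who, (n, e) in public_keys.items()}
    return keystream_xor(fek, plaintext), wrapped

def decrypt_file(ciphertext, wrapped_fek, private_key):
    """Unwrap the FEK with a private key, then decrypt the data with it."""
    n, d = private_key
    fek = pow(wrapped_fek, d, n).to_bytes(16, "big")
    return keystream_xor(fek, ciphertext)

# Mersenne primes are used only because they are easy to write down.
user_pub, user_priv = make_keypair(2**127 - 1, 2**107 - 1)
agent_pub, agent_priv = make_keypair(2**89 - 1, 2**61 - 1)

ciphertext, wrapped = encrypt_file(
    b"quarterly numbers", {"user": user_pub, "recovery_agent": agent_pub}
)
```

Either private key recovers the same plaintext from its own wrapped copy of the FEK, which is why a hijacked recovery agent account exposes every file.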
http://www.ntfs.com/ntfs-encrypted.htm http://www.microsoft.com/windows2000/docs/encrypt.doc http://www.microsoft.com/windows2000/techinfo/howitworks/security/encrypt.asp
23
Compression
- Compression works in blocks of 16 clusters
- Each block gets compressed independently
- If a compressed block is not at least 1 cluster smaller, it is stored uncompressed
- Data is compressed by referencing repeated strings rather than repeating them (a modified version of the LZ77 algorithm)
Files are compressed on a “per chunk” basis, with a chunk being 16 clusters. Only the pieces that benefit from compression actually get compressed: if compressing a chunk saves at least one cluster it is stored compressed, otherwise it is left uncompressed. For example, an uncompressed file that takes 50 clusters gets compressed as follows:
- the first 16 clusters get compressed to 13 clusters
- the next 16 clusters contain a lot of random data and don’t get compressed
- the 3rd set of 16 clusters contains some data and some whitespace and gets compressed to 5 clusters
- the remaining 2 clusters are not a full block of 16 clusters and thus get stored as is

         1234567890123456789012345678901234567890123456789012345...
UNCOMP   XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXOOOOO
COMP     XXXXXXXXXXXXXOOOXXXXXXXXXXXXXXXXXXXXXOOOOOOOOOOOXXOOOOO
OnDisk   XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Note the space in the middle of the file: compressing a file adds serious complexity to the way the file is stored. The MFT is the only place that contains information about what parts are compressed and by how much. If the MFT is corrupted there is little hope of retrieving the data.

Data is compressed using a modified LZ77 algorithm. Strings in the block that have been seen before are compressed by referencing the string rather than saving it again. For example, the text

    #include <ntfs.h>\n
    #include <stdio.h>\n

is compressed to

    #include <ntfs.h>\n(-18,10)stdio(-17,4)

So the algorithm recognizes that 18 bytes back from the current position it has already seen the 10-byte text '#include <'. Then 'stdio' is new, but '.h>\n' has been seen before.
http://linux-ntfs.sourceforge.net/ntfs/concepts/compression.html
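The per-chunk rule above comes down to simple arithmetic. The sketch below, with a hypothetical function name, takes the number of clusters each full 16-cluster chunk would need if compressed and returns how many clusters the file occupies on disk. It assumes, as in the worked example, that a trailing partial chunk is stored as is.

```python
UNIT = 16  # clusters per compression chunk, as described in the notes

def clusters_on_disk(total_clusters, compressed_sizes):
    """compressed_sizes[i] is the cluster count chunk i would need if compressed."""
    full_units, tail = divmod(total_clusters, UNIT)
    used = 0
    for i in range(full_units):
        size = compressed_sizes[i]
        # Keep the compressed form only if it saves at least one cluster.
        used += size if size <= UNIT - 1 else UNIT
    return used + tail  # the partial chunk at the end is stored uncompressed

# The 50-cluster file from the example: 13 + 16 + 5 + 2
print(clusters_on_disk(50, [13, 16, 5]))  # 36
```

The result matches the 36-cluster OnDisk row in the diagram.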
24
Sparse Files
- Files with long runs of zeros don’t save all the zeros, just a count of how many zeros there are
- Example:
  - A 10MB file of all zeros will create only an MFT entry labeled as sparse and recording its size, and take up no space on the disk
  - A 10MB file that contains a small header followed by zeros takes only 1 cluster to store, and the rest of the file is recreated from the count
Since long runs of zeros are common on disks, an easy way to compress a file is to simply remove them. For this to work, the run of zeros has to engulf an entire cluster. The ‘sparseness’ of the file is annotated in the data run that is used to map the clusters used by the file. More on data runs and how to read sparse files off the disk later.
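The requirement that a run of zeros cover an entire cluster can be sketched as follows. This is an illustration only, not the on-disk data run format (which comes later); the function name and the 4KB default cluster size are assumptions. It splits a file into cluster-sized pieces and groups them into runs that would need real storage ('data') and runs that could be recorded as a count ('sparse').

```python
def sparse_runs(data: bytes, cluster_size: int = 4096):
    """Return a list of (kind, cluster_count) runs; kind is 'data' or 'sparse'."""
    runs = []
    for offset in range(0, len(data), cluster_size):
        cluster = data[offset:offset + cluster_size]
        # Only a cluster that is zero from end to end can be left out.
        kind = "sparse" if cluster.count(0) == len(cluster) else "data"
        if runs and runs[-1][0] == kind:
            runs[-1] = (kind, runs[-1][1] + 1)
        else:
            runs.append((kind, 1))
    return runs

# A small header followed by zeros: one stored cluster, nine counted ones.
print(sparse_runs(b"HDR" + bytes(10 * 4096 - 3)))  # [('data', 1), ('sparse', 9)]
```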
25
Disk Quotas
- Amount of disk space allowed for each user is enforced at the file system level
When a limit is set, it applies to all users on all volumes. A warning level and a quota limit are set – Users are nagged if they are above the warning level and “physically” prevented from writing to the volume if over the limit. Quota is checked before NTFS compression is applied, so the limit is the logical size of the files not the amount of disk space they take up.
26
Reparse Points
- A special tag in a file/directory that allows a custom driver to ‘reparse’ the file/directory in some custom way instead of just treating it as a file/directory
- Used to implement:
  - Mount Points
  - Directory Junctions
  - Hierarchical Storage Systems
  - Anything you want
Originally these were similar to the symbolic links in Unix-based file systems, and were even called that until later versions of NTFS. But reparse points in the current iteration of NTFS do everything a symbolic link can do and more. When a file has a reparse point, as the file is being ‘parsed’ the first time the reparse point tells the file system to go back and process the file again in a different way, or using a different driver, or whatever else the reparse point means. That ‘whatever else’ is intentionally vague, because you can write your own driver and have it registered as a reparse point. So anything you can think to write code for, you can have a file do.
27
Volume Mount Points
- Similar to Unix Mount Points
- Directory on an NTFS volume is actually the root of another volume
- Allows for one big directory tree instead of needing multiple drive letters
Similar to how multiple hard drives and cdrom and floppy all appear as subdirectories in a single tree in Unix (i.e. /dev/hda, /dev/hdb, /dev/cdrom, /dev/floppy), this allows multiple volumes to appear as directories instead of separate drive letters (i.e. those same four devices appearing as C:, D:, E:, A:, respectively).
28
Directory Junctions
- Similar to a Volume Mount Point, but the target is a directory instead of a volume
- Similar to a Unix Symbolic Link
- Link C:\firstdir to D:\anotherdir and any files created/accessed in C:\firstdir will actually reside in D:\anotherdir
Two directories that sit under different parent directories but point to the same list of files. Anything done to the files in one will be immediately present in the other (because they are the same thing!). The two directories can be either on the same volume or on different NTFS volumes. Just like creating a symbolic link between two directories in Unix.
29
Hierarchical Storage System
- Transfers seldom used files to less “expensive” media
- Files that aren’t used often get moved off the HDD onto magnetic tape (MT), but still appear in the directory listing on the HDD; if accessed, they are transferred back
For an example, let’s say we have a moderate sized hard drive (because they are expensive but fast) and a large tape array (because tape is cheap but slow). Using HSS, my NTFS volume will appear to be as large as both together. NTFS will transfer any file whose last accessed date is older than a specified period off of the hard drive to the tape. That file’s MFT record is still present on the drive, but a reparse point tells it where in the tape array to get it. If the file is ever accessed, it will be transferred back to the drive and then accessed. This allows for dynamic archiving of seldom used files to make room for the files that are currently being accessed regularly. The “expense” of the storage media doesn’t necessarily have to be monetary.
30
Volume Shadow Copy
- Keeps “point-in-time” backups of files by copying modified files to the Shadow Copy
- Clone – read/write – full copy (mirror) of original volume (online backup)
- Copy-On-Write – read only – any file that is modified is copied to the shadow first, before modification occurs
Used for making backups on critical servers that can’t afford the time to run a tape or the loss of data between tape runs. It is accomplished in two different ways (either can be hardware or software based):

Clone drives are exact mirror images of the original. The mirror can either operate like a RAID, where any transaction on one happens on the other, or lag one modification behind the original. This second mode allows the original media’s information to be rolled back if a transaction goes bad.

Copy-On-Write copies the original data that is about to be modified to a “difference area” and then modifies the original. The original + “difference area” = shadow copy. This is faster because only the changes are written instead of everything, but the original must still be available to create the shadow copy.
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/library/TechRef/2b0d2457-b7d8-42c3-b6c9-59c145b7765f.mspx
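The copy-on-write mode can be sketched in a few lines. This is a minimal model, assuming the volume is just a list of blocks; the class and method names are hypothetical and have nothing to do with the real Volume Shadow Copy interfaces. The point it shows is that the snapshot is read from the original plus the difference area.

```python
class CowSnapshot:
    """Read-only point-in-time view of a volume modelled as a list of blocks."""

    def __init__(self, volume):
        self.volume = volume   # live blocks, modified in place
        self.diff_area = {}    # block index -> contents at snapshot time

    def write(self, index, data):
        # Save the original block once, before its first modification.
        if index not in self.diff_area:
            self.diff_area[index] = self.volume[index]
        self.volume[index] = data

    def read_snapshot(self, index):
        # Shadow copy = difference area where it exists, else the original.
        return self.diff_area.get(index, self.volume[index])

volume = [b"boot", b"mft", b"data"]
snap = CowSnapshot(volume)
snap.write(2, b"new data")
snap.write(2, b"newer data")
print(volume[2], snap.read_snapshot(2))  # b'newer data' b'data'
```

Only one block was copied even though it was written twice, which is why this mode is cheaper than a full clone.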
31
Single Instance Storage
- In a case where several directories contain the same files, only one instance of that file need really exist
- Consists of two parts:
  - File system filter that manages the copies
  - User space service (the Groveler) that searches for identical files to merge
Designed for servers containing multiple installation programs in which many of the files were same from one install to another. If SIS is disabled after it has been running, only the Groveler is disabled. The SIS file system filter remains as long as there are merged files on the volume. There will be a SIS Common Store folder that contains information about what files are linked; if this folder is deleted all of the linked files will become inaccessible.
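The Groveler’s job, finding identical files that can be merged, can be illustrated like this. How the real service compares files is not described in these notes; hashing the contents with SHA-256 is an assumption made for the sketch, and the function name and paths are hypothetical.

```python
import hashlib
from collections import defaultdict

def find_duplicates(files):
    """files maps path -> contents (bytes). Returns groups of identical paths."""
    by_digest = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    # Only groups with more than one member are candidates for merging.
    return sorted(sorted(paths) for paths in by_digest.values() if len(paths) > 1)

installs = {
    r"\install\v1\setup.dll": b"same bytes",
    r"\install\v2\setup.dll": b"same bytes",
    r"\install\v2\readme.txt": b"different",
}
print(find_duplicates(installs))
```

Each group returned would be replaced by one stored instance plus links to it.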
http://support.microsoft.com/default.aspx?scid=kb;en-us;299726
32
Master File Table
- Rather than have a special sector like FAT or HPFS, all the file tracking information is saved in a file, the Master File Table ($MFT)
- Everything is a file
- Very structured (database-like)
- Consists of 1K records for each file, each record made up of several attributes
- Resides in the “MFT Zone” (to prevent fragmentation)
Everything in NTFS is a file. The MFT is a file. The boot sector is a file. Directory entries are files that contain a list of other files. NTFS was designed as a database. Microsoft’s documentation says, “The MFT is a relational database that consists of rows of file records and columns of file attributes. It contains at least one entry for every file on an NTFS volume, including the MFT itself.” Records are 1K according to the M$ documentation. MFT Zone is a chunk of free space immediately following the $MFT that is reserved for its growth. Once the rest of the free space is used up, this buffer zone is cut in half and files can be created in the half not near the $MFT. If the entire MFT Zone is used up and the $MFT grows, it may end up fragmented – it cannot be defrag’ed if this happens.
33
NT 4.0 & 5.0
NT 5.1
FILE records
- This is a FILE record
- Offset to update sequence
- Size of USN + USA
- $LogFile Sequence Number
- Sequence Number
- Hard Link count
- Offset to 1st attribute
- Flags
- Real size of record
- Allocated size of record
- Inode of base record
- Next attribute ID
- Inode of this record (XP only)
- Update Sequence Number
- Update Sequence Array
The big difference between XP (NTFS 3.1) and previous versions is the inclusion of the inode number at offset 0x2C. Due to this change, the offset to the update sequence, found directly behind the word ‘FILE’, moves from 0x2A to 0x30, which changes the ASCII character shown there from ‘*’ to ‘0’. This is a quick visual way of seeing what version of Windows is installed on the system, given that XP and 2K are the two most prolific installations at the time of the writing of this document. The length of the Update Sequence Array (USA) is variable, but the Update Sequence Number will always be 2 bytes. The number at offset 0x06 minus 2 will give you the length of the USA.

Flags:
Value   Meaning
0x00    not in use
0x01    in use
0x02    directory
0x03    directory in use
Deleting a file changes the flag to 0x00, but does nothing to clear out the data, thus many deleted file’s metadata is still recoverable as long as the inode hasn’t been recycled already.
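The header fields listed above can be read with Python’s `struct` module. The byte offsets used here (0x04 for the update sequence offset, 0x16 for the flags, 0x2C for the XP-only record number, and so on) are an assumption taken from the linux-ntfs documentation rather than from the slide, and the function and key names are hypothetical. Treat it as a sketch for poking at a record in a hex dump, not as a complete parser.

```python
import struct

# Little-endian header, offsets 0x00 through 0x29 (42 bytes).
HEADER = struct.Struct("<4sHHQHHHHIIQH")

def parse_file_record(record: bytes):
    (magic, usa_offset, usa_count, lsn, sequence, links, attr_offset,
     flags, real_size, alloc_size, base_ref, next_attr_id) = HEADER.unpack_from(record)
    if magic != b"FILE":
        raise ValueError("not a FILE record")
    info = {
        "usa_offset": usa_offset,
        "hard_links": links,
        "first_attribute": attr_offset,
        "in_use": bool(flags & 0x01),
        "directory": bool(flags & 0x02),
        "real_size": real_size,
        "allocated_size": alloc_size,
        # '0' (0x30) behind 'FILE' suggests XP; '*' (0x2A) suggests NT4/2K.
        "xp_layout": usa_offset == 0x30,
    }
    if info["xp_layout"]:
        info["record_number"] = struct.unpack_from("<I", record, 0x2C)[0]
    return info
```

A deleted file would show up with `in_use` false while the rest of the header is still readable, which is what makes the metadata recoverable.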
34
NTFS System Files
- Hidden from view on an NTFS volume
- The MFT will always have at least 16 records, the first 16 being reserved for the system files
- MFT records 0 - 11 are the system files
- MFT records 12 - 15 are marked as in use, but are empty (reserved for future system files)
- MFT records 16 - 23 are marked as not in use, but are never used (reserved, because they can)
These are special metadata files that describe various pieces of the file system.
http://www.windowsnetworking.com/kbase/WindowsTips/WindowsXP/AdminTips/TroubleShooting/Whichversi onofNTFSamIrunning.html http://linux-ntfs.sourceforge.net/ntfs/attributes/volume_information.html
35
NTFS System Files

Inode   Filename   OS   Description
0       $MFT            Master File Table - An index of every file
1       $MFTMirr        A backup copy of the first 4 records of the MFT
2       $LogFile        Transactional logging file
3       $Volume         Serial number, creation time, dirty flag
4       $AttrDef        Attribute definitions
5       .               Root directory of the disk
6       $Bitmap         Contains volume's cluster map (in-use vs. free)
7       $Boot           Boot record of the volume
8       $BadClus        Lists bad clusters on the volume
9       $Quota     NT   Quota information
9       $Secure    2K   Security descriptors used by the volume
10      $UpCase         Table of uppercase characters used for collating
11      $Extend    2K   A directory: $ObjId, $Quota, $Reparse, $UsnJrnl
12-15                   Marked as in use but empty
16-23                   Marked as unused
Any     $ObjId     2K   Unique Ids given to every file
Any     $Quota     2K   Quota information
Any     $Reparse   2K   Reparse point information
Any     $UsnJrnl   2K   Journaling of Encryption
This is the list of all the system files and which inode they reside in. In NTFS version 1.x, inode 9 was $Quota and there was no file in inode 11, but in version 3.x this file was moved out and $Secure took its place in inode 9. The last four files can reside in any inode, but in practice they are usually found in inodes 24-26. These files reside in the directory $Extend. $UsnJrnl only exists if that feature is being used.
36
$MFT
- Inode 0
- Index of every file in the volume
- A pointer in NTLOADER tells the system where to find the MFT; the MFT tells the system where to find everything else
- Small files and directories (typically 700-800 bytes or smaller) can reside entirely within the master file table record
Everything is stored in the master file table. The master file table itself is a file just like every other file that it is referencing. Having the master file table be a regular file removes the limitations that were present in FAT and HPFS where the index of the files are stored in a particular spot on the hard drive. That particular spot on the hard drive presented a single point of failure. NTFS also maintains a copy of the MFT and keeps the two copies on opposite sides of the drive in case of physical damage to the drive. The downside to the master file table lies in its size.
37
$MFT
Note the timestamp in the $standard_information and $file_name attributes are all the same. The timestamps in all the system files will be the same and will be the date/time the volume was formatted. 1 non-resident data attribute points to the start of $MFT. Since this is the first entry in $MFT, the starting LCN in the data run actually points to this cluster.
38
$MFTMirr
- Inode 1
- Copy of (usually) the 1st 4 records in $MFT
- Provides for error recovery if the sector holding the beginning of the $MFT fails
In NT4.0, the $MFTMirr was the first 16 records. But in current versions the $MFTMirr is always either 4 records or one cluster, whichever is larger. Generally, the record size in the MFT is 1024 bytes and the cluster size of the drive it is on is 4096 bytes. So, 90% of the time the $MFTMirr will contain 4 MFT records. In the case of the cluster size being smaller than 4K, as many clusters as necessary to get 4 records will be used. But if the cluster size is bigger than 4K and there is room left over, it may contain more than 4 records. Since having clusters larger than 4K breaks some of the other features, such as compression, this rarely occurs.

The purpose of this file is to back up the starting point of the $MFT. If that sector dies, the whole $MFT would become unreadable. If that sector dies, the pointer to where to find the $MFTMirr would also be dead, so I’m not really sure what is gained. Also, if another section of the MFT that is holding regular FILE records dies, there is nothing that saves those files. The $MFTMirr is there to save the file system, not the files saved in that file system.
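The “4 records or one cluster” rule is easy to check with arithmetic. The helper below is a hypothetical sketch of that rule as described in these notes, assuming the usual 1024-byte record size; it returns how many records the mirror holds and how many clusters it occupies.

```python
def mftmirr_layout(cluster_size, record_size=1024):
    """Return (records_mirrored, clusters_used) for a given cluster size."""
    # The mirror is at least 4 records and at least one whole cluster.
    size = max(4 * record_size, cluster_size)
    return size // record_size, size // cluster_size

for cluster in (512, 1024, 4096, 8192):
    print(cluster, mftmirr_layout(cluster))
# 512 (4, 8)
# 1024 (4, 4)
# 4096 (4, 1)
# 8192 (8, 1)
```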
39
$MFTMirr
Looks just like the $MFT record, but look where we are.
Usually located half way through the volume. Again, it only contains the vital records needed to get the $MFT readable again (usually the first 4 records).
40
$LogFile
- Inode 2
- Not a history of every file access
- Is a detailed look at very recent transactions
- All file system accesses are a series of individual transactions
- This file keeps track of where in that series an operation is so that, if it fails, the transactions that already occurred can be rolled back
The internal structure of the $LogFile is not well understood. It is a scrolling window usually viewed in a circular fashion: once the log is full, the first entry is overwritten with the next new entry. What gets logged are the individual transactions that make up each file access, file write, or other operation. For instance, when modifying a file the following steps might occur:
- read the MFT entry for the directory the file is in
- read the directory entry for the file
- read the MFT record for the file
- write the file
- update Atime in the file’s MFT record
- update Mtime in the file’s MFT record
- update Atime in the directory entry for that file
- update Mtime in the directory entry for that file
This list gets considerably longer if the file is encrypted or compressed. If the command fails before the entire string of transactions is completed, due to a system crash or any other reason, the file system has to have a way to change each of the transactions involved back to their previous values in order to maintain consistency of the file system. This is not to be confused with running CHKDSK; what is described here is just the file system itself providing a reliable, crash-resilient environment.
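The roll-back idea can be sketched as an undo log. Since the real $LogFile format is not well understood, nothing here reflects its actual layout; the class name and the dictionary standing in for file system metadata are purely illustrative. Each step records the previous value before changing it, so a failure part-way through can be unwound in reverse order.

```python
_MISSING = object()  # marks a key that did not exist before the transaction

class UndoLog:
    def __init__(self, metadata):
        self.metadata = metadata
        self.undo = []

    def set(self, key, value):
        # Log the old value first, then apply the change.
        self.undo.append((key, self.metadata.get(key, _MISSING)))
        self.metadata[key] = value

    def commit(self):
        self.undo.clear()

    def rollback(self):
        # Undo the steps newest-first until the log is empty.
        while self.undo:
            key, old = self.undo.pop()
            if old is _MISSING:
                del self.metadata[key]
            else:
                self.metadata[key] = old

meta = {"file.mtime": 100, "dir.mtime": 100}
log = UndoLog(meta)
log.set("file.mtime", 200)
log.set("dir.mtime", 200)
log.rollback()  # simulate a crash before the last step completed
print(meta)     # {'file.mtime': 100, 'dir.mtime': 100}
```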
41
$LogFile
42
$Volume
- Inode 3
- Contains information about the volume:
  - Volume name
  - NTFS version number
  - Flags to signal certain operations on boot:
    - Run chkdsk
    - Upgrade to new version
    - Resize log file
The file $Volume contains the name of the volume. That is its most important function. When you open ‘My Computer’ you want to see names you remember – like ‘Bob’ or ‘Jeff’ – instead of just ‘C:’. There is also $Volume_Information data in this file that contains a version number and a set of flags. The version number will be broken into two pieces, a major and a minor version number. The major version number will be either ‘1’ or ‘3’ and will correspond to the list from page 3. The flags will hold the following values:

Value    Description
0x0001   Dirty
0x0002   Resize LogFile
0x0004   Upgrade on Mount
0x0008   Mounted on NT4
0x0010   Delete USN underway
0x0020   Repair Object Ids
0x8000   Modified by chkdsk

The dirty bit tells Windows that ‘chkdsk /f’ needs to be run on the next boot.
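Because the flags are individual bits, more than one can be set at once. A short sketch of decoding the flag word using the values from the table above (the function name is hypothetical):

```python
VOLUME_FLAGS = {
    0x0001: "Dirty",
    0x0002: "Resize LogFile",
    0x0004: "Upgrade on Mount",
    0x0008: "Mounted on NT4",
    0x0010: "Delete USN underway",
    0x0020: "Repair Object Ids",
    0x8000: "Modified by chkdsk",
}

def decode_volume_flags(value):
    """Return the names of every flag bit set in value."""
    return [name for bit, name in VOLUME_FLAGS.items() if value & bit]

print(decode_volume_flags(0x8001))  # ['Dirty', 'Modified by chkdsk']
```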
43
$Volume
$Volume_Name
$Volume_Information
44
$AttrDef
- Inode 4
- List of the available attributes a file can have on this volume
- List changes between some versions of NTFS
This file contains the list of attributes available to the file system in this version of NTFS. It is because of this file that we know the catchy names for the attributes that we are using. The entry for each attribute also contains some information about the attribute’s allowable sizes and location (resident or not).
45
$AttrDef
- Name
- ID
- Rules (currently not used)
- Flags (resident or not)
- Max Size
- Min Size

Flags: 0x02 = Indexed; 0x40 = always Resident; 0x80 = can be Non-resident

On a 2K or XP system:

Type    Name                     Flags   IRN   Min Size   Max Size
0x10    $STANDARD_INFORMATION    0x40    R     0x30       0x48
0x20    $ATTRIBUTE_LIST          0x80    N     -          -
0x30    $FILE_NAME               0x42    IR    0x44       0x242
0x40    $OBJECT_ID               0x40    R     -          0x100
0x50    $SECURITY_DESCRIPTOR     0x80    N     -          -
0x60    $VOLUME_NAME             0x40    R     0x2        0x100
0x70    $VOLUME_INFORMATION      0x40    R     0xC        0xC
0x80    $DATA                    0x00          -          -
0x90    $INDEX_ROOT              0x40    R     -          -
0xA0    $INDEX_ALLOCATION        0x80    N     -          -
0xB0    $BITMAP                  0x80    N     -          -
0xC0    $REPARSE_POINT           0x80    N     -          0x4000
0xD0    $EA_INFORMATION          0x40    R     0x8        0x8
0xE0    $EA                      0x00          -          0x10000
0xF0    $PROPERTY_SET            ?       ?     ?          ?
0x100   $LOGGED_UTILITY_STREAM   0x80    N     -          0x10000
46
.
- Inode 5
- The root directory
- Laid out like any other directory listing in the MFT, but in inode 5 and named “.”
- More on directory entries later
The root directory is always found in Inode 5. Other than that, it is just like any other directory.
47
.
48
$Bitmap
- Inode 6
- Bitwise list of every cluster on the drive:
  - 0 means cluster is available
  - 1 means cluster is in use
- Data portion of this file is always a multiple of 8 bytes (64 clusters)
Because the file always maps a multiple of 64 clusters and drives aren’t always a multiple of 64 clusters in size, there is usually a section at the end of the file that maps to space off the end of the drive. All of these bits are marked as ‘1’. In theory this file could be resident in the MFT on a small volume, but in practice Windows crashes if this occurs. Thus, the data attribute of this file is always non-resident.
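A minimal sketch of the bitmap layout described above. The bit order within each byte (cluster 0 in the least significant bit of byte 0) is an assumption not stated in these notes, and both function names are hypothetical. It shows the rounding up to 8-byte units and the padding bits past the end of the volume being set to 1.

```python
def build_bitmap(in_use, total_clusters):
    """Build a $Bitmap-style byte string for a volume of total_clusters."""
    size = -(-total_clusters // 64) * 8  # round up to a multiple of 8 bytes
    bits = bytearray(size)
    # Mark the used clusters, plus everything that maps past the end of the drive.
    for cluster in list(in_use) + list(range(total_clusters, size * 8)):
        bits[cluster // 8] |= 1 << (cluster % 8)
    return bytes(bits)

def cluster_in_use(bitmap, cluster):
    return bool((bitmap[cluster // 8] >> (cluster % 8)) & 1)

bitmap = build_bitmap({0, 3, 70}, total_clusters=100)
print(len(bitmap), cluster_in_use(bitmap, 3), cluster_in_use(bitmap, 4))  # 16 True False
```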
49
$Bitmap
Each bit marked with a 1 represents a cluster that is in use. Each byte that is not 0xFF is a hole that will hopefully not be there the next time I run Defrag.
50
$Boot
- Inode 7
- Treated like a file, but the non-resident data attribute of this ‘file’ points to the boot sector
- Allows the system to access information found in the boot sector (such as volume serial number, sectors per cluster, media descriptor, etc.) just like accessing any other file
The data attribute of this file points to sector 0. This allows for access to the Boot Sector without having to write special code that bypasses the file system to access a ‘special’ portion of the disk. There is information in the partition table and BIOS Parameter Block that certain utilities need and this allows them a safer means to access that information via normal file system API calls.
51
$Boot
Notice the offset in blue – this is ‘the file’ as followed by the MFT record, but we are in sector 0. The MFT record is just a pointer to this section of the disk, which makes partition and disk information readable at the file system layer just like any other file.
52
$BadClus
- Inode 8
- Tracks all the bad clusters in the volume
- A cluster is bad if at least one sector in it is bad
- Sparse file the size of the volume, with data in the clusters that are bad. This causes $Bitmap to mark those clusters as in use.
This file is the size of the NTFS volume, but is a sparse file of all zeros. Since zeros in sparse files are counted instead of saved, this file takes up no space on the disk. If a cluster is ever deemed ‘bad’, data will be written to this file at the same offset into this file as the offset of the bad cluster into the volume. This causes the file to allocate clusters in the $Bitmap file, which in turn prevents other files from trying to use the bad cluster in the future.
53
$Secure
- Inode 9
- Has indexes of every file’s owner, ACL, etc.
- For every file access, there is a lookup in this file to see if it is allowed
- In NTFS version 1.x, this inode was the $Quota file (more on $Quota later)
In Windows NT, every file had a $Security_Descriptor attribute that did this job. Since many files had the same values in that attribute it was moved to this file so that data wasn’t repeated.
54
$Secure
55
$UpCase
- Inode 10
- 128K file that lists every uppercase character in the UNICODE alphabet
- Used to compare and sort filenames independently of their code page
Case in the file name is preserved, but is converted to all uppercase for sorting as the directory entry is created. This file contains the uppercase characters of ‘every’ UNICODE alphabet so that NTFS knows the proper alphabetical order of each code page of UNICODE without having to inherently know every code page of UNICODE.
56
$UpCase
57
$Extend
- Inode 11
- Directory entry containing other system files:
  - $ObjId
  - $Quota
  - $Reparse
  - $UsnJrnl
$Extend is a directory that contains other system files. This allows for more system files to be added but without pushing the limit of the 16 Inodes reserved for system files. In reality, the four files in this directory could have all been given their own Inodes, but that limits future growth.
58
$Extend
59
$ObjId
- In \$Extend\
- Index of every file’s $Object_ID attribute in the volume
- Not to be confused with the $Quota file
- Both use Object IDs and an index named $O, but they are different lists for different things
Each file that has an $Object_ID attribute will have that ID listed here for reference. This allows files to be tracked by their ID instead of their name. This is most commonly used with Office documents, which can be linked and then the files renamed without breaking the links because they are linked by ID.
60
$ObjId
Contained within is the Object ID, the Inode of the file that is associated, and other information.
61
$Quota
- In \$Extend\
- Existed on Windows NT as inode 9, but wasn’t used. In Windows 2000+ it can be any inode
- Used to track/limit how much space in the volume each user is allocated
- Not to be confused with the $ObjId file
- Both use Object IDs and an index named $O, but they are different lists for different things
Will explain the relationships in more detail when we get to the attributes. Uses two indexes, $O and $Q. $O contains an entry for everyone that has a quota enforced on them. $Q has an entry for every user login on the system. When a file is accessed, a lookup is done in $O for the owner to see if they have a quota, and then a lookup is done in $Q to see what the quota is.
62
$Quota
63
$Reparse
- In \$Extend\
- Allows the system to mount a portion of the file system as another volume or in some other ‘special’ way
- This file contains an index of all the Reparse Points on the volume
This file lists all the reparse points available on the volume. More information in the $Reparse_Point attribute’s page
64
$Reparse
65
$UsnJrnl
- In \$Extend\
- Similar in function to the $LogFile, but instead of tracking the individual transactions that make up a file access it tracks the changes to files and structures for “data mirroring”-like applications
- Short-term repository: some application tells NTFS to log the changes, reads this file to see the changes, then NTFS empties this file
Often called the Change Journal. Useful for file replication (for mirroring of data or other uses), tracking of which files to include in an incremental backup, by virus scanners, and other imaginative solutions. This file exists while it is being used and goes away as soon as the program that needed it no longer does. Thus, I have no screenshot for you.
66
MFT File Attributes
- Attributes within the MFT record are “resident” and those outside are “non-resident”
- Everything in a file is an attribute, including the actual file data
- Each MFT record has a Standard Header, followed by a list of attributes (in order of ascending Attribute ID) and an end marker
- The end marker is 0xFFFFFFFF
http://linux-ntfs.sourceforge.net/ntfs/attributes/index.html
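The record layout described on this slide (header, list of attributes, 0xFFFFFFFF end marker) can be walked with a short loop. The following is a minimal Python sketch, not a definitive parser. It assumes the caller already knows the byte offset of the first attribute within the record, and that every attribute starts with a 4-byte type followed by a 4-byte length, as the attribute header slides later in this deck show. The name `iter_attributes` and the toy record are made up for illustration.

```python
import struct

END_MARKER = 0xFFFFFFFF  # end-of-attributes marker from the slide


def iter_attributes(record: bytes, first_attr_offset: int):
    """Yield (offset, type, length) for each attribute in one MFT record."""
    pos = first_attr_offset
    while pos + 4 <= len(record):
        (attr_type,) = struct.unpack_from("<I", record, pos)
        if attr_type == END_MARKER:
            break
        if pos + 8 > len(record):
            break  # not enough bytes left for a length field
        (length,) = struct.unpack_from("<I", record, pos + 4)
        if length == 0 or pos + length > len(record):
            break  # corrupt or truncated; stop rather than loop forever
        yield pos, attr_type, length
        pos += length


# Toy data (hypothetical): two fake attributes, then the end marker.
toy = (
    struct.pack("<II", 0x10, 16) + b"\x00" * 8
    + struct.pack("<II", 0x80, 24) + b"\x00" * 16
    + struct.pack("<I", END_MARKER)
)
found = [(hex(t), n) for _, t, n in iter_attributes(toy, 0)]
print(found)  # [('0x10', 16), ('0x80', 24)]
```

The length check is there because a damaged record with a zero length would otherwise never advance.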
MFT File Attributes
Type   OS  Name
0x10   -   $STANDARD_INFORMATION
0x20   -   $ATTRIBUTE_LIST
0x30   -   $FILE_NAME
0x40   NT  $VOLUME_VERSION
0x40   2K  $OBJECT_ID
0x50   -   $SECURITY_DESCRIPTOR
0x60   -   $VOLUME_NAME
0x70   -   $VOLUME_INFORMATION
0x80   -   $DATA
0x90   -   $INDEX_ROOT
0xA0   -   $INDEX_ALLOCATION
0xB0   -   $BITMAP
0xC0   NT  $SYMBOLIC_LINK
0xC0   2K  $REPARSE_POINT
0xD0   -   $EA_INFORMATION
0xE0   -   $EA
0xF0   NT  $PROPERTY_SET
0x100  2K  $LOGGED_UTILITY_STREAM
These are all the attributes by Type number (how they are identified in the MFT records) and by name (courtesy of the $AttrDef system file). Between NTFS versions 1 and 3 some of the attributes moved from one type number to another, some changed names, some were deprecated, and some were born. The OS in front of the attribute’s name is the OS in which that attribute was found.
What are the ‘typical’ attributes?
File
  $Standard_Information
  $File_Name (maybe two)
  $Data
Directory
  $Standard_Information
  $File_Name (maybe two)
  $Index_Root
  $Index_Allocation
These are the common attributes that are found on the common file and directory. If there was a James Smith file, it would have these attributes. Every entry, be it a file or a directory, must have a $Standard_Information and at least one $File_Name. If the file name is more than eight characters, it will also have a second $File_Name attribute. Even empty files will have a $Data attribute with nothing in it. Likewise, even empty directories will have the two attributes needed to create an index in them. That is all that is necessary to get basic file I/O to work; everything else is fluff, so to speak.
Typical Small File
$Standard_Information
$File_Name
$Data
Name is >8 characters so a long and short $file_name are created. The horizontal, grey line at the bottom denotes a sector boundary. Each MFT record is 2 sectors. Using this visual reference, you can see that a file can’t be much more than 550 bytes before the record fills up and the $data has to become nonresident. Size of the file name contributes greatly to this.
Typical ‘Big’ File
$Standard_Information
$File_Name
$Data
Here the name was already <8 characters, so only one $file_name was created. The $data attribute is nonresident.
Typical EFS Encrypted File
$Standard_Information
$File_Name
$Data
$Logged_Utility_Stream
Typical Small Directory
$Standard_Information
$File_Name
$Index_root
Typical ‘Big’ Directory
$Standard_Information
$File_Name
$Index_Root
$Index_Allocation
Standard Attribute Header
Tells NTFS what Attribute is next
  Is it named?
  Is it resident?
And how to read the attribute
  Length
  Name (if there is one)
  Flags
  Where it is on the drive (if it is)
  etc.
Standard Attribute Header
Attribute will be one of four possible types
Either
  Resident
  Non-resident
And either
  Named
  Unnamed
Attributes will be one of the following four types:
  Resident and Named
  Resident and Unnamed
  Non-Resident and Named
  Non-Resident and Unnamed
The basic structure of named and unnamed attributes is almost identical, but the difference between resident and non-resident is significant and important. Except for very small files (under roughly 900 bytes), the actual file will be a nonresident attribute of the file record, so knowing how to read this type of attribute is essential to finding the file on the drive.
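Telling the four types apart takes only two bytes of the attribute header. The sketch below assumes the layout given in the header tables of this deck: the non-resident flag at offset 0x08 and the name length at offset 0x09. The function name `classify_attribute` is hypothetical.

```python
def classify_attribute(header: bytes) -> str:
    """Return which of the four attribute types a header describes."""
    non_resident = header[0x08] != 0   # 0 = resident, 1 = non-resident
    named = header[0x09] != 0          # name length of 0 means unnamed
    residency = "Non-Resident" if non_resident else "Resident"
    naming = "Named" if named else "Unnamed"
    return f"{residency} and {naming}"


# Toy headers (hypothetical): only bytes 0x08 and 0x09 matter here.
unnamed_resident = bytes(8) + bytes([0x00, 0x00])
named_non_resident = bytes(8) + bytes([0x01, 0x04])
print(classify_attribute(unnamed_resident))    # Resident and Unnamed
print(classify_attribute(named_non_resident))  # Non-Resident and Named
```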
Resident, No Name Header
Type
Length, including header
0 = resident
0’s = unnamed
Length, without header
Offset to attribute
The actual attribute
Resident, No Name
Offset  Size  Value  Description
0x00    4     -      Attribute Type (e.g. 0x10, 0x60)
0x04    4     -      Length (including this header)
0x08    1     0x00   Non-resident flag
0x09    1     0x00   Name length
0x0A    2     0x00   Offset to the Name
0x0C    2     0x00   Flags
0x0E    2     -      Attribute Id (a)
0x10    4     L      Length of the Attribute
0x14    2     0x18   Offset to the Attribute
0x16    1     -      Indexed flag
0x17    1     0x00   Padding
0x18    L     -      The Attribute
(a) Each attribute has a unique identifier
Resident, Named Header
Type
Length, including header
0 = resident
Name length
Offset to name
Length, without header
Offset to attribute
Attribute’s name
The actual attribute
Resident, Named
Offset   Size  Value    Description
0x00     4     -        Attribute Type (e.g. 0x90, 0xB0)
0x04     4     -        Length (including this header)
0x08     1     0x00     Non-resident flag
0x09     1     N        Name length
0x0A     2     0x18     Offset to the Name
0x0C     2     0x00     Flags
0x0E     2     -        Attribute Id (a)
0x10     4     L        Length of the Attribute
0x14     2     2N+0x18  Offset to the Attribute (b)
0x16     1     -        Indexed flag
0x17     1     0x00     Padding
0x18     2N    Unicode  The Attribute's Name
2N+0x18  L     -        The Attribute (c)
(a) Resident attributes cannot be compressed
(b) Each attribute has a unique identifier
(c) Rounded up to a multiple of 4 bytes
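To make the resident layout concrete, here is a hedged Python sketch that pulls the name and the content out of a resident attribute using only the offsets from the two resident tables (name length at 0x09, name offset at 0x0A, content length at 0x10, content offset at 0x14). The function name and the toy attribute are invented for illustration; real records should be validated more carefully.

```python
import struct


def read_resident_attribute(buf: bytes, pos: int = 0):
    """Return (type, name, content) for a resident attribute at buf[pos:]."""
    attr_type, _total_len = struct.unpack_from("<II", buf, pos)
    non_resident, name_len, name_off = struct.unpack_from("<BBH", buf, pos + 0x08)
    if non_resident:
        raise ValueError("attribute is non-resident; content lives in data runs")
    content_len, content_off = struct.unpack_from("<IH", buf, pos + 0x10)
    # The name is stored as N two-byte Unicode characters (empty if unnamed).
    name_start = pos + name_off
    name = buf[name_start:name_start + 2 * name_len].decode("utf-16-le")
    content = buf[pos + content_off:pos + content_off + content_len]
    return attr_type, name, content


# Toy attribute (hypothetical): type 0x90, name "$I30", five content bytes,
# padded to a total length of 40 bytes.
header = struct.pack("<IIBBHHHIHBB",
                     0x90, 40, 0, 4, 0x18, 0, 0, 5, 0x20, 0, 0)
toy = header + "$I30".encode("utf-16-le") + b"hello" + b"\x00" * 3
result = read_resident_attribute(toy)
print(result)  # (144, '$I30', b'hello')
```

With a name length of 0 the same function handles the unnamed case, since the name slice is then empty and the content offset is simply 0x18.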
Non-Resident, No Name Header
Type
Length, including header
1 = non-resident
0’s = unnamed
Starting VCN
Last VCN
Offset to data runs
Allocated size of attribute
True size of attribute
Data Runs
Non-Resident, No Name
Offset  Size  Value  Description
0x00    4     -      Attribute Type (e.g. 0x20, 0x80)
0x04    4     -      Length (including this header)
0x08    1     0x01   Non-resident flag
0x09    1     0x00   Name length
0x0A    2     0x00   Offset to the Name
0x0C    2     -      Flags
0x0E    2     -      Attribute Id (a)
0x10    8     -      Starting VCN
0x18    8     -      Last VCN
0x20    2     0x40   Offset to the Data Runs
0x22    2     -      Compression Unit Size (b)
0x24    4     0x00   Padding
0x28    8     -      Allocated size of the attribute (c)
0x30    8     -      Real size of the attribute
0x38    8     -      Initialized data size of the stream (d)
0x40    ...   -      Data Runs
(a) Each attribute has a unique identifier
(b) Compression unit size = 2^x clusters. 0 implies uncompressed
(c) This is the attribute size rounded up to the cluster size
(d) When is this not equal to the allocated size?
Non-Resident, Named Header
Type
Length, including header
1 = non-resident
Name Length
Offset to Name
Starting VCN
Last VCN
Offset to Data Runs
Allocated size of attribute
True size of attribute
Attribute name
Data Runs
Non-Resident, Named
Offset   Size  Value    Description
0x00     4     -        Attribute Type (e.g. 0x80, 0xA0)
0x04     4     -        Length (including this header)
0x08     1     0x01     Non-resident flag
0x09     1     N        Name length
0x0A     2     0x40     Offset to the Name
0x0C     2     -        Flags
0x0E     2     -        Attribute Id (a)
0x10     8     -        Starting VCN
0x18     8     -        Last VCN
0x20     2     2N+0x40  Offset to the Data Runs (b)
0x22     2     -        Compression Unit Size (c)
0x24     4     0x00     Padding
0x28     8     -        Allocated size of the attribute (d)
0x30     8     -        Real size of the attribute
0x38     8     -        Initialized data size of the stream (e)
0x40     2N    Unicode  The Attribute's Name
2N+0x40  ...   -        Data Runs (b)
(a) Each attribute has a unique identifier
(b) Rounded up to a multiple of 4 bytes
(c) Compression unit size = 2^x clusters. 0 implies uncompressed
(d) This is the attribute size rounded up to the cluster size
(e) When is this not equal to the allocated size?
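The non-resident tables translate fairly directly into a reader. The sketch below is an illustration under the assumption that the offsets in the two tables above are accurate; it returns the VCN range, the two sizes, and the raw data-run bytes without decoding them. `NonResidentHeader`, `read_non_resident_header`, and the toy attribute are hypothetical names and data.

```python
import struct
from typing import NamedTuple


class NonResidentHeader(NamedTuple):
    attr_type: int
    name: str
    start_vcn: int
    last_vcn: int
    allocated_size: int
    real_size: int
    data_runs: bytes  # still encoded; decoded separately


def read_non_resident_header(buf: bytes, pos: int = 0) -> NonResidentHeader:
    attr_type, total_len = struct.unpack_from("<II", buf, pos)
    non_resident, name_len, name_off = struct.unpack_from("<BBH", buf, pos + 0x08)
    if non_resident != 1:
        raise ValueError("attribute is resident")
    start_vcn, last_vcn, runs_off = struct.unpack_from("<QQH", buf, pos + 0x10)
    allocated, real = struct.unpack_from("<QQ", buf, pos + 0x28)
    name_start = pos + name_off
    name = buf[name_start:name_start + 2 * name_len].decode("utf-16-le")
    runs = buf[pos + runs_off:pos + total_len]
    return NonResidentHeader(attr_type, name, start_vcn, last_vcn,
                             allocated, real, runs)


# Toy unnamed $DATA attribute (hypothetical values): 13 clusters of 4096
# bytes holding a 50000-byte file, with one data run and padding.
toy = struct.pack("<IIBBHHHQQHHIQQQ",
                  0x80, 0x48, 1, 0, 0, 0, 0, 0, 12, 0x40, 0, 0,
                  53248, 50000, 50000)
toy += bytes.fromhex("110D2300") + b"\x00" * 4
hdr = read_non_resident_header(toy)
print(hex(hdr.attr_type), hdr.last_vcn, hdr.real_size, hdr.data_runs.hex())
```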
Data Runs
Represents where on the drive the pieces of the non-resident attribute are
Offset and length of run
Normal, Unfragmented File
One data run: 0x410DABBCE6010000
41 0D AB BC E6 01 00
  41 = Header; 1 byte for length; 4 bytes for offset
  0D = length of run
  0x01E6BCAB = offset (bytes AB BC E6 01, stored little-endian)
  00 = Footer
Normal, Fragmented File
Multiple runs: 0x110D2321E6011A210A211400
11 0D 23
  11 = Header; 1 byte for length; 1 byte for offset
  0D = length of run
  23 = offset of run relative to 0
21 E6 01 1A
  21 = Header; 1 byte for length; 2 bytes for offset
  E6 = length of run
  1A01 = offset of run relative to 23
21 0A 21 14
  21 = Header; 1 byte for length; 2 bytes for offset
  0A = length of run
  1421 = offset of run relative to 1A01+23
00 = Footer
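The run format walked through on this slide can be decoded mechanically. The following Python sketch is one way to do it, under these assumptions: the low nibble of each header byte is the size of the length field, the high nibble is the size of the offset field, multi-byte values are little-endian, and offsets are signed so a run can point backwards on the disk (the slides do not mention the sign; that part comes from general NTFS documentation). The name `decode_data_runs` is made up.

```python
def decode_data_runs(raw: bytes):
    """Decode a data-run list into (starting_cluster_or_None, length) pairs.

    A starting cluster of None marks a sparse run (no offset field).
    """
    runs = []
    pos = 0
    cluster = 0  # offsets are relative to the previous run's start
    while pos < len(raw) and raw[pos] != 0x00:  # 00 = footer
        header = raw[pos]
        len_size = header & 0x0F
        off_size = header >> 4
        pos += 1
        if len_size == 0 or pos + len_size + off_size > len(raw):
            break  # malformed or truncated run list
        length = int.from_bytes(raw[pos:pos + len_size], "little")
        pos += len_size
        if off_size == 0:
            runs.append((None, length))
        else:
            delta = int.from_bytes(raw[pos:pos + off_size], "little", signed=True)
            pos += off_size
            cluster += delta
            runs.append((cluster, length))
    return runs


# The fragmented example from the slide.
runs = decode_data_runs(bytes.fromhex("110D2321E6011A210A211400"))
for start, length in runs:
    print("sparse" if start is None else hex(start), hex(length))
# 0x23 0xd
# 0x1a24 0xe6
# 0x2e45 0xa
```

Note that the printed starts are absolute cluster numbers: 0x1A24 is 0x23 + 0x1A01, and 0x2E45 adds 0x1421 on top of that.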
Sparse, Unfragmented File
Multiple runs: 0x1112210162110A1200
11 12 21
  11 = Header; 1 byte for length; 1 byte for offset
  12 = length of run
  21 = offset of run
01 62
  01 = Header; 1 byte for length; 0 offset = sparse
  62 = length of run
  No offset because this section is make believe
11 0A 12
  11 = Header; 1 byte for length; 1 byte for offset
  0A = length of run
  12 = offset of run relative to 21
00 = Footer
Compressed, Unfragmented File
Shoot me now: 0x1109400107111008110C10010400
11 09 40
  11 = Header; 1 byte length; 1 byte offset
  09 = Length
  40 = Offset
01 07
  01 = Header; 1 byte length; 0 byte offset (sparse)
  07 = Length (sparse run to complete the 0x10 cluster compression block size)
11 10 08
  11 = Header; 1 byte length; 1 byte offset
  10 = Length
  08 = Offset relative to 40 = 48
11 0C 10
  11 = Header; 1 byte length; 1 byte offset
  0C = Length
  10 = Offset relative to 48 = 58
01 04
  01 = Header; 1 byte length; 0 byte offset (sparse)
  04 = Length (sparse run to complete the 0x10 cluster compression block size)
00 = Footer
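One way to make sense of a compressed run list is to group the clusters into compression blocks of 0x10 clusters and look at what each block contains. The sketch below rests on an assumption drawn from general NTFS documentation rather than from this slide: a block that mixes real and sparse clusters is stored compressed, a block of only real clusters is stored uncompressed, and a block of only sparse clusters is a hole. The run list is taken from the hex string on the slide (0x1109400107111008110C10010400), and the function name is hypothetical.

```python
def classify_compression_units(runs, unit_size=16):
    """Label each compression unit given (is_sparse, length) runs."""
    # Expand runs into one flag per cluster; fine for small examples.
    clusters = []
    for is_sparse, length in runs:
        clusters.extend([is_sparse] * length)

    kinds = []
    for start in range(0, len(clusters), unit_size):
        unit = clusters[start:start + unit_size]
        if all(unit):
            kinds.append("sparse")
        elif not any(unit):
            kinds.append("uncompressed")
        else:
            kinds.append("compressed")
    return kinds


# Runs from the slide's hex string: lengths 0x09, 0x07 (sparse), 0x10,
# 0x0C, 0x04 (sparse).
slide_runs = [(False, 0x09), (True, 0x07), (False, 0x10),
              (False, 0x0C), (True, 0x04)]
kinds = classify_compression_units(slide_runs)
print(kinds)  # ['compressed', 'uncompressed', 'compressed']
```

Read this way, the first 16 clusters of the file compressed down to 9, the second 16 did not compress at all, and the third 16 compressed down to 12.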
Compressed, Sparse, Fragmented File
Run away.
$STANDARD_INFORMATION
Type 0x10
Contains:
  File timestamps
  DOS file permissions (rwx)
  Tracking information for version, quota, security, logging, etc. (if applicable)
A required attribute that shows up as the first attribute in every file. These are the timestamps that are reported by Windows. Timestamps are also kept in $file_name and in the parent directory listing, but these are the important ones. The ‘DOS permissions’ are read as follows:
Flag    Description
0x0001  Read-Only
0x0002  Hidden
0x0004  System
0x0020  Archive
0x0040  Device
0x0080  Normal
0x0100  Temporary
0x0200  Sparse File
0x0400  Reparse Point
0x0800  Compressed
0x1000  Offline
0x2000  Not Content Indexed
0x4000  Encrypted
If version tracking is on, the maximum number of versions of this ‘same’ file allowed, and which version this particular one is, will be present. Also present in ver 3.x+ will be an owner ID from the $O and $Q indexes of the $Quota file; the Security ID from the $SII and $SDS indexes of the $Secure file; the number of bytes to charge to the user’s quota; and this file’s Update Sequence Number (USN) from the $UsnJrnl file. Most of this extra info relates to optional functionality that is not normally turned on.
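The flag values in the table above combine by bitwise OR, so a single stored value can carry several of them. A small Python sketch of decoding such a value follows; the class and function names are invented, and the values are copied from the table.

```python
from enum import IntFlag


class DosFlags(IntFlag):
    READ_ONLY = 0x0001
    HIDDEN = 0x0002
    SYSTEM = 0x0004
    ARCHIVE = 0x0020
    DEVICE = 0x0040
    NORMAL = 0x0080
    TEMPORARY = 0x0100
    SPARSE_FILE = 0x0200
    REPARSE_POINT = 0x0400
    COMPRESSED = 0x0800
    OFFLINE = 0x1000
    NOT_CONTENT_INDEXED = 0x2000
    ENCRYPTED = 0x4000


def flag_names(value: int):
    """List the names of every flag bit set in value."""
    return [flag.name for flag in DosFlags if value & flag]


names = flag_names(0x0823)
print(names)  # ['READ_ONLY', 'HIDDEN', 'ARCHIVE', 'COMPRESSED']
```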
$STANDARD_INFORMATION
Attribute header
File Creation Time
File Modified Time
MFT Record Modified Time
File Accessed Time
DOS permissions
Version info
Class ID
Owner ID
Security ID
Quota Charge Size
USN
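As a rough illustration of reading the first fields labelled on this slide, the sketch below unpacks the four timestamps and the DOS permissions from the attribute's content (the bytes after the attribute header). Two things are assumed rather than taken from the slides: that each timestamp is an 8-byte count of 100-nanosecond intervals since 1601-01-01 UTC (the Windows FILETIME convention), and that the fields sit back to back in the order listed, with a 4-byte flags field after them. All names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

NTFS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert 100 ns ticks since 1601 to a datetime (microsecond precision)."""
    return NTFS_EPOCH + timedelta(microseconds=filetime // 10)


def parse_standard_information(content: bytes) -> dict:
    created, modified, mft_modified, accessed, flags = struct.unpack_from(
        "<4QI", content, 0)
    return {
        "created": filetime_to_datetime(created),
        "modified": filetime_to_datetime(modified),
        "mft_modified": filetime_to_datetime(mft_modified),
        "accessed": filetime_to_datetime(accessed),
        "dos_flags": flags,
    }


# Toy content (hypothetical): every timestamp set to 1970-01-01 00:00 UTC,
# with the Archive flag (0x0020) set.
UNIX_EPOCH_AS_FILETIME = 116444736000000000
toy = struct.pack("<4QI", *([UNIX_EPOCH_AS_FILETIME] * 4), 0x0020)
info = parse_standard_information(toy)
print(info["created"].isoformat(), hex(info["dos_flags"]))
```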
$ATTRIBUTE_LIST
Type 0x20
If there are a lot of resident attributes and not enough space in the record to fit them all, they get moved to another record and this attribute tells you where to find them
  Lots of hard-linked files = many $file_names
  Extremely fragmented = very long data runs
  Many named streams
  Multiple indexes
Pretty rare that this is needed
$ATTRIBUTE_LIST
Attribute Header
Type
Entry Length
Name Length
Offset to name
Starting VCN
Base File Reference
Attribute ID
Attribute Name
$FILE_NAME
Type 0x30
Contains:
  Reference to parent directory
  Time stamps
  File size (both real and disk usage)
  Flags
  File’s Name
Contains the inode number of the parent directory that contains this file. The same 4 time stamps from $standard_information are repeated here, but these times are not updated unless the file name is changed and thus are often out of sync. Once the file is renamed, these times are updated, then promptly ignored again. Both the actual size of the file and the amount of disk space allocated for the file (file’s actual size + cluster slack) are recorded. The flags are the same as in $standard_information. And most importantly, the file’s name is listed. After the name, the entry is padded with 0x00 to an 8-byte boundary.
$FILE_NAME
Attribute header
Parent Directory Reference
File Creation Time
File Modified Time
MFT Record Modified Time
File Accessed Time
Allocated Size
Real Size
Flags
Name Length
Name
Parent Directory Reference is in two parts: 6 bytes for the inode number of the parent and 2 bytes for a sequence number. The sequence number is used to track references that haven’t been updated correctly (e.g. if the sequence number of this file is different from all the other files that are in this inode’s list, then it must be from a different iteration of this inode number and thus not valid anymore). If 8.3 name creation is turned on, any files with long names will have a second $file_name attribute that contains the short name. Only one of the two is shown in the directory listing, depending on what program is traversing the directory to do the listing.
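Splitting the 8-byte reference into its two parts is a matter of masking and shifting. The sketch below assumes the reference is stored little-endian with the 6-byte inode number first and the 2-byte sequence number after it; the notes above give the sizes but not the byte order, so treat that ordering as an assumption. The function name and sample bytes are made up.

```python
def split_file_reference(ref: bytes):
    """Return (inode_number, sequence_number) from an 8-byte file reference."""
    value = int.from_bytes(ref[:8], "little")
    inode = value & 0xFFFFFFFFFFFF   # low 6 bytes
    sequence = value >> 48           # high 2 bytes
    return inode, sequence


# Hypothetical reference: inode 5 with sequence number 5.
parent = split_file_reference(bytes.fromhex("0500000000000500"))
print(parent)  # (5, 5)
```

A stale reference would show up as the right inode number paired with a sequence number that no longer matches the one stored in that MFT record.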
Short File Names
8.3 file name
Long file name
Hard Links
The file named ‘file’
Same file named ‘linked’
File has multiple $file_names and both names appear as separate files in the directory listing. If ‘file’ is deleted, that $file_name attribute is removed but the file remains until the last $file_name is removed.
$VOLUME_VERSION
Type 0x40
Only existed in ver 1.2
Not present on my NT system
$OBJECT_ID
Type 0x40
ID that follows the file to allow tracking even if the file name and location change
Mostly used by MS Office files for embedded files and by links
Every file that has an $object_id will have it ‘registered’ in the $ObjId file
$OBJECT_ID
Attribute header
16-byte identifier
This ID tracks the NTFS volume and will be present in an .lnk file whose target is on this volume
This ID tracks this .dot file and will be used to id this file if it is linked into another Office doc
Incidentally, I copied that .dot file and the copy’s ID was incremented by one from the ID shown.
$SECURITY_DESCRIPTOR
Type 0x50
Tracks the owner of the file and the permissions granted by that owner to anyone else that accesses the file
$SECURITY_DESCRIPTOR
Attribute Header
Revision
Flags
Offset to User SID
Offset to Group SID
Offset to SACL
Offset to DACL
DACL
User SID
Group SID
The structure changes depending on how many and what type of Access Control Lists (ACLs) are present on the file. Treat the layout above as an example, not a fixed structure. http://linux-ntfs.sourceforge.net/ntfs/attributes/security_descriptor.html
$VOLUME_NAME
Type 0x60
Contains the volume’s name
Because ‘C:’ isn’t descriptive enough and ‘Local Disk (C:)’ is boring and ‘My Jive Volume (C:)’ has so much better a ring to it. Allows volumes to have long names with special characters in different languages, unlike the previously used ‘oem label’ found in the boot sector.
$VOLUME_NAME
Attribute header
The Volume Name
Kinda boring really
$VOLUME_INFORMATION
Type 0x70
Contains information about the volume
  NTFS version number
  Flags
This is where we get the 1.2 or 3.1 NTFS version numbers. The flags can be:
Value   Description
0x0001  Dirty
0x0002  Resize LogFile
0x0004  Upgrade on Mount
0x0008  Mounted on NT4
0x0010  Delete USN underway
0x0020  Repair Object Ids
0x8000  Modified by chkdsk
If the ‘dirty flag’ is set, ‘chkdsk /f’ will run on the next bootup.
$VOLUME_INFORMATION
Attribute header
Major Version Number
Minor Version Number
Flags
$DATA
Type 0x80
The actual file is in this ‘attribute’ of itself
Can be any of the 4 attribute types
  Nonresident, unnamed <- usually
  Resident, unnamed <- if small enough
  Nonresident, named
  Resident, named
$DATA
Resident
Attribute Header
The File
Nonresident
Cluster Number
Data Runs
Alternate $DATA Streams
The first $data stream is the one normal file operations see
Additional named $data streams
$INDEX_ROOT
Type 0x90
Sets the parameters for the index
  Collation rules
  Entry size
  Length of entire index
$INDEX_ROOT
Attribute Header
Index Entry Type
Collation Rule
Index Entry size
Clusters per index record
Offset to first entry
Total size of all entries
Allocated size of all entries
$Index_Allocation needed?
???
$INDEX_ALLOCATION
Type 0xA0
The components that make up the index
The ‘storage containers’ and B+ tree structures are laid out here
Never resident. If the index is small enough to warrant it being resident, the data is put in the $index_root.
Attribute is named so the bitmap is correlated to a specific index, since some files with indexes have multiple indexes.
$INDEX_ALLOCATION
Every index will have this header
Hi. I’m an INDEX
Offset to Update Sequence
Size of Update Sequence
$logfile sequence number
VCN of this INDX buffer in Index Allocation
Offset to index entries
Size of index entries
Allocated size of index entries
1 = not leaf node
Update Sequence
$INDEX_ALLOCATION
Directory’s entries are like this:
MFT inode of file
Length of entry
Offset to file name
Parent Directory Reference
File Creation Time
File Modified Time
MFT Record Modified Time
File Accessed Time
Allocated Size
Real Size
Flags
Name Length
Filename namespace
Name
$BITMAP
Type 0xB0
Used in Indexes and $MFT
Shows which entries are in use
Individual records also have an ‘in use’ flag, but this allows for a quick view of the whole index
String of bits that correlates to the number of records
Attribute is named so the bitmap is correlated to a specific index, since some files with indexes have multiple indexes.
$BITMAP
Attribute Header
Bitmap
0x7FFF = 0111111111111111 = First entry is not in use; next 15 are in use.
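Checking whether a given record is in use comes down to testing one bit. The sketch below assumes the on-disk convention that record 0 is the least significant bit of the first byte, record 8 is the least significant bit of the second byte, and so on. That ordering comes from general NTFS documentation, not from the slide, which reads the bits left to right. The function name and sample bitmap are hypothetical.

```python
def record_in_use(bitmap: bytes, index: int) -> bool:
    """Return True if the bit for record `index` is set in the bitmap."""
    byte = bitmap[index // 8]
    return bool((byte >> (index % 8)) & 1)


# Hypothetical bitmap: record 0 free, records 1-14 in use, record 15 free.
bitmap = bytes([0xFE, 0x7F])
in_use = [i for i in range(16) if record_in_use(bitmap, i)]
print(in_use)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```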
$SYMBOLIC_LINK
Type 0xC0
Called this in version 1.x
Functionality was broadened and name changed to…
$REPARSE_POINT
Type 0xC0
Allows a file to be reprocessed in some way other than as a normal file (as it most likely started out being treated)
  Symbolic links
  Volume mount links
  Remote Storage Service
$REPARSE_POINT
Attribute Header
Type and flags
Data Length
Name length
Offset to name
Length, without header
Offset to attribute
The actual attribute
Symbolic Link is not the same as Hard Link. A Hard Linked file has two file names and editing either name updates the same data. A Symbolic Link is a separate file that contains no data, only a pointer to the other file. That is what we have pictured above. As this file is getting read, this attribute will be processed and the file system will then go read f:\file instead. This is a very powerful feature. The ‘reparsing’ can be done by anything. There are several ‘canned’ reparse point types, but a driver can be written to do anything and a file can be ‘reparsed’ however that driver wants to read it. This can allow the same data to be read different ways, depending on other variables sent to the driver.
$EA_INFORMATION
Type 0xD0
Used to implement the extended attributes used by HPFS, for OS/2 clients that are saving files on this NTFS drive on this WinNT server
This doesn’t happen very often these days
$EA
Type 0xE0
One of the HPFS extended attributes referenced in the $ea_information attribute
$LOGGED_UTILITY_STREAM
Type 0x100
Layout and treatment of this attribute is just like $data
Every EFS encrypted file will use this attribute to store the File Encryption Key (FEK) used during encryption/decryption
$LOGGED_UTILITY_STREAM
Attribute Header
The File
Resources
http://linux-ntfs.sourceforge.net - open source project to add NTFS support to the Linux kernel; has excellent (albeit unfinished) documentation
http://www.pcguide.com/ref/hdd/file/ntfs/index.htm
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/library/TechRef/8cc5891d-bf8e-4164-862d-dac5418c5948.mspx - How NTFS Works ‘chapter’ of Win2k3 Technical Reference
http://www.windowsitpro.com/Article/ArticleID/15719/15719.html
http://www.windowsitpro.com/Article/ArticleID/15900/15900.html - Inside Win2k NTFS articles by Mark Russinovich
www.ntfs.com - documentation and resources for NTFS
File System Forensic Analysis by Brian Carrier, 2005, Addison Wesley, ISBN 0321268172
Microsoft Windows Internals, Fourth Edition by Mark E. Russinovich, David A. Solomon, Microsoft Press, ISBN 0735619174
Windows NT File System Internals: A Developer's Guide (1st ed) by Rajeev Nagar, 1997, O'Reilly, ISBN 1565922492