The NTFS file system is managed by several metadata files that collectively make up a sophisticated relational database. Simply put, a metadata file is a file that contains descriptive information about other data. The metadata files do not reside within the confines of the traditional file system. Even if an advanced user selects options in Windows Explorer to view hidden files, system files, and so forth, the metadata files will not be visible in Explorer. However, a cluster-level search tool will allow the user to view and even edit them. The most important of these is the Master File Table (MFT).
The Master File Table (MFT)
The MFT contains all details of all “objects” on the volume, and thus it is the first port of call for evidence regarding the presence of files, the relevant dates and times of files, file sizes, and identifications, as well as their storage locations on the volume.
To prevent the MFT from becoming fragmented, Windows maintains a buffer around it. No new files will be created in this buffer region until the other disk space is used up. This area, by default, is about 12.5%. The buffer size is configurable and can be set to 12.5%, 25%, 37.5%, or 50% of the disk space. Each time the rest of the disk becomes full, the buffer size is halved. This area is known as the “MFT Zone.”
Microsoft documentation says it reserves only the first 16 MFT entries for the file system metadata files, but in practice, the first entry that is allocated to a user file or directory is entry 24. Entries 17 to 23 are sometimes used as overflow when the reserved entries are not enough. The table below lists the metadata files as defined by Microsoft. From a digital forensics standpoint, the two metadata files most commonly examined are $MFT and $BadClus. The other files are included for completeness.
Entry | File Name | Description |
0 | $MFT | Actual file table. Contains at least one base file record for each file and folder on the volume that identifies what clusters host file data. A fragmented file will have a record for each fragment. |
1 | $MFTMirr | A duplicate image of the first four records of the MFT. This file guarantees access to the MFT in case of a single-sector failure. |
2 | $LogFile | Contains a list of transaction steps used for NTFS recoverability. Log file size depends on the volume size and can be as large as 4 MB. It is used by Windows NT/2000 to restore consistency to NTFS after a system failure. It provides information for file system journaling. It retains records of changes to the file system. |
3 | $Volume | Contains information about the volume, such as the volume label and the volume version. |
4 | $AttrDef | A table of attribute names, numbers, and descriptions. Defines attributes of files and folders on volume. These include hidden, read-only, and other file system attributes. |
5 | $. | Index of files in system root. Too many files in the root directory can impact system performance. |
6 | $Bitmap | A representation of the volume showing which clusters are in use. |
7 | $Boot | Code used to mount the volume that defines the file system along with bootstrap loader code and a pointer to the OS boot files used if the volume is bootable. |
8 | $BadClus | Maps bad clusters for the volume. |
9 | $Secure | Contains security descriptors for files contained within the volume. Services running in the OS that maintain file-level security are dependent on this metafile. |
10 | $UpCase | Conversion table for translating lowercase characters into matching Unicode uppercase characters. |
11 | $Extend | Defines optional extensions, including quota definitions, reparse point data, and others. |
12 - 15 | | Not currently used. |
Microsoft calls each entry in the MFT a file record. The MFT record is usually fixed at 1024 bytes, but only the first 42 bytes have a defined purpose. The remaining bytes store attributes, which are small data structures that each have a specific purpose. Everything on the volume except the Partition Boot Record has a file record in the MFT.
Although facilities are available to accommodate other record sizes, 1024 bytes is by far the most common size seen in digital forensics. There are two ways to determine the record size. The first is the value at byte offset 0x40 of the boot record, a signed 8-bit number that is interpreted in one of two ways. If the number is positive (0x00 to 0x7F), it gives the number of clusters for each MFT record. If the number is negative (0x80 to 0xFF), the record size in bytes is 2 raised to the power of the number's absolute value; for example, 0xF6 is -10, which gives 2^10 = 1024 bytes. The second way is to examine the MFT record’s header at offset 0x1C (4 bytes), which shows the physical size of the record. Every $MFT FILE record has a 56 (0x38) byte long header as described in the table below. This header is followed by a number of attributes, each of which has its own Attribute Header. The Attribute Header contains a field that stores the length of the Attribute. The bytes that follow that stated length should be either the header of the next Attribute or the end-of-record marker, 0xFFFFFFFF.
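The two interpretations of the byte at offset 0x40 can be sketched in a few lines of Python. This is a minimal illustration rather than a complete boot record parser; the function name `mft_record_size` and its `bytes_per_cluster` parameter are made up for this example.

```python
def mft_record_size(raw: int, bytes_per_cluster: int) -> int:
    """Decode the boot-record byte at offset 0x40 into a record size in bytes."""
    # Reinterpret the unsigned byte (0..255) as a signed 8-bit value.
    signed = raw - 256 if raw >= 0x80 else raw
    if signed == 0:
        raise ValueError("a value of zero does not describe a usable record size")
    if signed > 0:
        # Positive: the value is a count of clusters per MFT record.
        return signed * bytes_per_cluster
    # Negative: the record is 2 ** abs(value) bytes long.
    return 1 << -signed


print(mft_record_size(0xF6, 4096))  # 0xF6 is -10, so 2 ** 10 bytes
print(mft_record_size(0x01, 1024))  # one 1024-byte cluster per record
```

Both calls describe a 1024-byte record, once through the negative encoding and once through the cluster count.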
Offset | Length | Description |
0x00 | DWORD | File Record Signature |
0x04 | WORD | Offset to MFT Record update sequence number (relative to start of record) |
0x06 | WORD | The number of entries in the fixup array. |
0x08 | LONGLONG | $LogFile Sequence Number (LSN) |
0x10 | WORD | FILE Record Sequence Number |
0x12 | WORD | HardLink Count |
0x14 | WORD | Offset to 1st attribute (relative to start of record) |
0x16 | WORD | Allocation Status Flags (0x0000: Deleted file; 0x0001: Allocated file; 0x0002: Deleted directory; 0x0003: Allocated directory) |
0x18 | DWORD | Logical Size of MFT record |
0x1C | DWORD | Physical Size of MFT record |
0x20 | 6 BYTES | Base Record number |
0x26 | WORD | Base Record Sequence Number |
0x28 | WORD | Next available attribute ID |
0x2A | WORD | Alignment to 4-byte boundary |
0x2C | DWORD | MFT record Number |
0x30 | WORD | MFT Record Update Sequence Number |
0x32 | WORD | Update sequence Array - FixUp1 |
0x34 | WORD | Update sequence Array - FixUp2 |
0x36 | WORD | Reserved/Unused |
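As a rough sketch, the fixed fields of the table above (offsets 0x00 to 0x2F) can be unpacked with Python's `struct` module. The field names in `FileRecordHeader` are my own labels, and the sample bytes are synthetic values chosen to resemble the record discussed in this post, not data copied from the test image.

```python
import struct
from collections import namedtuple

# Little-endian layout of offsets 0x00-0x2F from the table above.
# The 6-byte base record number and its 2-byte sequence number are read
# together as one 8-byte value (base_reference).
HEADER_FORMAT = "<4sHHQHHHHIIQHHI"

FileRecordHeader = namedtuple(
    "FileRecordHeader",
    "signature usa_offset usa_count lsn sequence link_count "
    "first_attr_offset flags logical_size physical_size "
    "base_reference next_attr_id alignment record_number",
)


def parse_header(record: bytes) -> FileRecordHeader:
    """Unpack the fixed part of a FILE record header."""
    return FileRecordHeader._make(struct.unpack_from(HEADER_FORMAT, record, 0))


# Synthetic header bytes (hypothetical values, for illustration only).
sample = struct.pack(
    HEADER_FORMAT, b"FILE", 0x30, 3, 0, 2, 1, 0x38, 1, 432, 1024, 0, 4, 0, 35
)
header = parse_header(sample)
print(header.signature, header.first_attr_offset, header.logical_size)
```

The update sequence number and array that start at 0x30 are left out here because their length depends on the record size.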
When the $MFT file record is 1024 bytes long, the ‘number of entries in the fixup array’ (offset 0x06) equals 3: one is the ‘MFT Record Update Sequence Number,’ and 2 are the FixUp values (one for each 512-byte sector). When the $MFT file record is 4096 bytes long, the number of entries is 9: one is the ‘MFT Record Update Sequence Number,’ and 8 are the FixUp values (8 × 512 = 4096). In that case, the $MFT file record header is 66 bytes long, and the update sequence array continues as follows:
Byte Offset | Length | Description |
0x32 | WORD | Update sequence Array FixUp1 |
0x34 | WORD | Update sequence Array FixUp2 |
0x36 | WORD | Update sequence Array FixUp3 |
0x38 | WORD | Update sequence Array FixUp4 |
0x3A | WORD | Update sequence Array FixUp5 |
0x3C | WORD | Update sequence Array FixUp6 |
0x3E | WORD | Update sequence Array FixUp7 |
0x40 | WORD | Update sequence Array FixUp8 |
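The arithmetic behind these counts is simple enough to check in code. The sketch below assumes 512-byte sectors and an update sequence area that starts at offset 0x30, as in the tables above; the helper names are hypothetical.

```python
SECTOR_SIZE = 512   # each fixup protects one 512-byte sector
USA_OFFSET = 0x30   # assumed start of the update sequence number and array


def usa_entry_count(record_size: int) -> int:
    """One update sequence number plus one fixup word per sector."""
    return 1 + record_size // SECTOR_SIZE


def usa_end_offset(record_size: int) -> int:
    """Offset of the first byte after the update sequence array."""
    return USA_OFFSET + 2 * usa_entry_count(record_size)


for size in (1024, 4096):
    print(size, usa_entry_count(size), usa_end_offset(size))
```

For a 1024-byte record the array ends at offset 54, and the two reserved bytes at 0x36 bring the header to 56 bytes. For a 4096-byte record the array ends at offset 66, which matches the header length given above.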
Consider the hex dump of MFT entry 35 for our test bitstream image as shown in the figure below. The MFT entry header has been highlighted in red. Multi-byte values in this example are stored in little-endian byte order.
The MFT record starts with a signature (aka “magic number”) 0x46494C45 or “FILE” (at offset 0x00-0x03), which identifies the entry as a file record. From an inspection of the sample MFT, we note that all records start in this way. If the entry is unusable, it would be “BAAD”.
The next two bytes at offsets 0x04 - 0x05 are 30 00, which read as a little-endian value give 0x0030 = decimal 48. This is the offset pointer to the Update Sequence Number and Array, and it counts from the beginning of the header. It is typically different for different operating systems. Examination of the two bytes at offset 0x30 (decimal 48) shows them to contain the bytes 03 00.
Byte offsets 0x06 - 0x07 specify the size of the update sequence. As can be seen in the above figure, the value here when the endianness is reversed is 0x0003 or decimal 3. This number counts words rather than bytes. It indicates that three words are used for the Update Sequence Number and the Update Sequence Array. From an inspection of the MFT, we note that all records in the sample file contain the bytes 03 00 at this location.
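The byte reversal described above is what Python's `int.from_bytes` does when asked for little-endian order. The eight bytes below are typed in from the values quoted in the text (the signature, then 30 00 and 03 00); the variable names are illustrative only.

```python
# First eight bytes of the record as described in the text:
# "FILE", then 30 00 (offset to update sequence) and 03 00 (word count).
header_bytes = bytes.fromhex("46494C4530000300")

usa_offset = int.from_bytes(header_bytes[4:6], "little")
usa_count = int.from_bytes(header_bytes[6:8], "little")

print(header_bytes[:4], hex(usa_offset), usa_count)
```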
The next 8 bytes, offsets 0x08 - 0x0F, hold the $LogFile Sequence Number (LSN). The LSN is used for the file system log (or journal). The log records when metadata updates are made to the file system so that a corrupt file system can be fixed more quickly.
Bytes 0x10 - 0x11 are referred to as the “Record Use Sequence Number”, which is set to “1” when the MFT record is first used. It is subsequently incremented each time that the record is reused. Its value is 0x0002 (endian reversed) in our test image. It is interesting to note that the actual increment operation is made when the record is marked as deleted and available for reuse.
Bytes 0x12 - 0x13 hold the hard link count for the file, stored as a two-byte integer. The link count shows how many directories have entries for this MFT entry. If hard links were created for the file, this number is incremented by one for each link. Microsoft defines hard links as:
“NTFS-based links to a file on an NTFS volume. By creating hard links, you can have a single file in multiple folders without duplicating the file. You can also create multiple hard links for a file in a folder if you use different file names for the hard links. Because all of the hard links reference the same file, applications can open any of the hard links and modify the file.”
Bytes situated at offsets 0x14 - 0x15 are a two-byte integer value specifying the “Offset to First Attribute” in this record. As with most of the pointers in the MFT, the figure is calculated from the start of the header. This pointer indicates that the first Attribute starts at 0x0038 (endian reversed), which is byte offset 56 (decimal). An examination of the bytes at that offset reveals them to be 10 00 00 00, that is, attribute type 0x10, the $STANDARD_INFORMATION attribute (described in the attribute type table later in this post).
The next two bytes, 0x16 - 0x17, are a set of flags. The two bytes, and in particular the byte at offset 0x16, describe the current state of this particular record. The possible byte sequences on disk are: 00 00 = a deleted FILE record, 01 00 = a FILE record in use, 02 00 = a deleted DIRECTORY record, and 03 00 = a DIRECTORY record in use. In this case, the bytes are 01 00, so this is a FILE record in use.
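The four values listed above are combinations of two bits: bit 0 marks the record as in use and bit 1 marks it as a directory. A small sketch of that reading follows; the constant and function names are my own.

```python
IN_USE = 0x0001        # bit 0: record is allocated
IS_DIRECTORY = 0x0002  # bit 1: record describes a directory


def describe_flags(flags: int) -> str:
    """Turn the flags word at offset 0x16 into a short description."""
    state = "allocated" if flags & IN_USE else "deleted"
    kind = "directory" if flags & IS_DIRECTORY else "file"
    return f"{state} {kind}"


# The bytes 01 00 from the sample record, read as little-endian.
sample_flags = int.from_bytes(bytes.fromhex("0100"), "little")
print(describe_flags(sample_flags))
```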
Bytes at offsets 0x18 - 0x1B indicate the “real” length of the file record. To clarify this, it is referred to here as the “logical size” in keeping with the conventions used within the forensic community for file size and allocated space size. This “logical” size is the actual number of bytes of data stored in the record. By inspection of the sample MFT record, the probable size may be identified visually as 432 bytes. Stored as a 4-byte integer in little-endian format (bytes B0 01 00 00), the value 0x000001B0 equates to 432 decimal, exactly as expected.
The final four bytes of this line, from offsets 0x1C to 0x1F, indicate the allocated storage size of the file record. This is referred to as the “physical” size, and you will recall that this size has already been preset to 1024 bytes by the BPB. In this case, translation from the little-endian format gives 0x00000400, which does indeed equate to 1024 decimal.
The bytes at offsets 0x20 - 0x27 store the “Base File Reference”, which is used when the attributes of a file do not fit in a single MFT record and spill over into one or more additional records. When this occurs, each additional record contains a reference at this location to its “parent” record. The next record in the MFT is not necessarily the one used to store such “overflow” data; the record used could be anywhere in the MFT and some distance, in terms of records, from the first one used. The name “Base File Reference” is a rather poor description of this field, because the 8-byte number is made up of two values: the first six bytes identify the record number of the “parent” record in the MFT, and the last two bytes hold a sequence number, matching the Base Record number and Base Record Sequence Number fields in the header table. In addition to these “backward” pointers to the parent, there is a series of “forward” pointers to the children (the extended records) in the parent record itself. These forward pointers are in the order in which the children are to be read. In this particular case, there is no parent record, and the value in the Base File Reference is 0x0000000000000000.
The two bytes at offsets 0x28 - 0x29 appear to hold the ID that will be given to the next Attribute added to this record. In this record the value is 4, so we should expect attributes with IDs 1 to 3 to be present. This will be confirmed for this record as the deconstruction analysis progresses. However, visual inspection of 100 other records in this MFT revealed that the value is unreliable as an indicator of the number of Attributes present. It may well be a count of Attributes that have been used within the record, reverting to zero on record reuse. Experiments have shown that the count is incremented when an Attribute is added, but does not seem to be decremented when one is removed.
Although the header table labels the two bytes at offsets 0x2A - 0x2B as alignment padding, they are actually used. Since file record numbers are 48-bit values, the 4 bytes at offset 0x2C are not enough to store such a number, so these two bytes store the upper 16 bits of the record number. In practice they are zero, because it is almost impossible for a volume to reach the 32-bit limit. Note that the upper part is stored before the lower part.
The MFT record number is found at a byte offset of 0x2C within the file record header for modern NTFS versions (Windows XP and later). This field is 4 bytes (DWORD) in size and holds the specific numerical identifier for that entry within the Master File Table ($MFT). In older versions (like NT 4), this specific field did not exist, and the header size was smaller (42 bytes instead of 48/56 bytes for modern systems).
The value at bytes 0x30 - 0x31 is the “Update Sequence Number”. This is apparently used to check the integrity of each MFT record. As may be noted by inspection, this same number (in this case 0x03 00) also appears as the last two bytes of each physical sector of the MFT record. The explanation for this will be given later.
The four bytes at offsets 0x32 - 0x35 are used for the “Update Sequence Array”. In this case, the sequence array consists of the four bytes 23 00 00 00. In order to check the consistency of MFT records and other protected records, NTFS uses a “fix-up” code that it places in the final two bytes of each sector. In this case, that code is 03 00, the number identified above as the “Update Sequence Number”. The bytes which originally occupied these locations in each sector are placed in a buffer, which is the buffer we are now examining at bytes 0x32 - 0x35. There are two sectors in an MFT record, and therefore four bytes are replaced. The replaced bytes are 23 00 for the first sector and 00 00 for the second sector.
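To make the fix-up mechanism concrete, here is a minimal sketch that checks the last two bytes of each sector against the Update Sequence Number and then restores the original bytes from the array. It assumes 512-byte sectors, and the record it runs on is synthetic, built from the values quoted above rather than taken from the test image. The function name `apply_fixups` is my own.

```python
import struct

SECTOR_SIZE = 512


def apply_fixups(record: bytes) -> bytes:
    """Verify each sector's trailing bytes and restore the saved originals."""
    usa_offset, usa_count = struct.unpack_from("<HH", record, 4)
    usn = record[usa_offset:usa_offset + 2]
    fixed = bytearray(record)
    # Entry 0 is the update sequence number; entries 1..n-1 are the saved bytes.
    for i in range(1, usa_count):
        sector_end = i * SECTOR_SIZE
        if fixed[sector_end - 2:sector_end] != usn:
            raise ValueError(f"sector {i} fails the update sequence check")
        saved = record[usa_offset + 2 * i:usa_offset + 2 * i + 2]
        fixed[sector_end - 2:sector_end] = saved
    return bytes(fixed)


# Synthetic 1024-byte record using the values quoted in the text.
raw = bytearray(1024)
raw[0:4] = b"FILE"
struct.pack_into("<HH", raw, 4, 0x30, 3)
raw[0x30:0x32] = bytes.fromhex("0300")      # update sequence number
raw[0x32:0x36] = bytes.fromhex("23000000")  # saved bytes for sectors 1 and 2
raw[510:512] = bytes.fromhex("0300")        # fix-up code at end of sector 1
raw[1022:1024] = bytes.fromhex("0300")      # fix-up code at end of sector 2

restored = apply_fixups(bytes(raw))
print(restored[510:512].hex(), restored[1022:1024].hex())
```

A mismatch between a sector's last two bytes and the Update Sequence Number suggests an incomplete multi-sector write, which is why the sketch raises an error instead of silently restoring the bytes.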
Each MFT entry (file record) has a 48-bit MFT record number, starting at 0 and increasing sequentially. Each MFT entry also contains a 16-bit sequence number at offset 0x10. This sequence number starts at 1 when the entry is first allocated and increments every time the entry is reused (e.g., after file deletion and reallocation). NTFS forms a 64-bit file reference by concatenating the sequence number in the upper 16 bits and the 48-bit record number in the lower bits. This file reference uniquely identifies an MFT entry. The position of an MFT record can be calculated by multiplying its record number by the fixed MFT record size, which gives the corresponding byte offset within the $MFT file. The sequence number helps the file system detect stale references: if a reference to an MFT entry carries a sequence number that does not match the current sequence number of that entry, the reference is considered invalid. This helps maintain file system consistency and aids in data recovery.
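The packing of the file reference and the record offset calculation can be sketched as follows. The helper names are hypothetical, and the 1024-byte default is the usual record size discussed earlier rather than a value read from a volume.

```python
RECORD_NUMBER_MASK = (1 << 48) - 1  # lower 48 bits hold the record number


def make_file_reference(record_number: int, sequence: int) -> int:
    """Combine a 48-bit record number and a 16-bit sequence number."""
    return (sequence << 48) | (record_number & RECORD_NUMBER_MASK)


def split_file_reference(reference: int) -> tuple:
    """Return (record_number, sequence) from a 64-bit file reference."""
    return reference & RECORD_NUMBER_MASK, reference >> 48


def record_byte_offset(record_number: int, record_size: int = 1024) -> int:
    """Byte offset of a record within the $MFT file."""
    return record_number * record_size


# Record 4 with sequence number 4, as listed for $AttrDef in this post.
ref = make_file_reference(4, 4)
print(hex(ref), split_file_reference(ref), record_byte_offset(35))
```

Comparing the sequence half of a stored reference with the sequence number in the target record's header is the stale-reference check described above.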
An interesting feature of the first 15 MFT records after the record for the $MFT itself (record 0) is that their MFT Record number and their MFT Record Sequence number are the same, and these records are not reused for other files.
File Name | MFT Record Number | MFT Sequence Number |
$MFTMirr | 1 | 1 |
$LogFile | 2 | 2 |
$Volume | 3 | 3 |
$AttrDef | 4 | 4 |
$. | 5 | 5 |
$Bitmap | 6 | 6 |
$Boot | 7 | 7 |
$BadClus | 8 | 8 |
$Secure | 9 | 9 |
$UpCase | 10 | 10 |
$Extend | 11 | 11 |
*Reserved* | 12 | 12 |
*Reserved* | 13 | 13 |
*Reserved* | 14 | 14 |
*Reserved* | 15 | 15 |
The figure below shows the file reference number for the $AttrDef MFT record on a live system.
Each entry in the Master File Table (MFT) consists of a header and a collection of attributes that describe a file or directory. NTFS represents files as sets of attributes—some store metadata such as timestamps, names, and security information, while one (the $DATA attribute) contains the file’s actual content. As a result, every file or directory on an NTFS volume has at least one associated MFT entry, and large files may use multiple MFT records. In essence, files in NTFS are collections of attributes that hold both their descriptive information and their data.
MFT Entry Attributes Concepts
Every MFT entry begins with a small header that describes the structure and content of the entry. One of the key fields in this header is the offset (located at byte 0x14) to the first attribute. Attributes then follow sequentially, one after another, until an attribute type 0xFFFFFFFF marks the end of the list.
Each attribute consists of two main components: an attribute header, which defines the attribute’s type and metadata, and the attribute content, also referred to as the attribute’s stream. The supported Attributes on an $MFT file record can be seen by examining the $AttrDef metadata file in the same Volume, which also provides their general properties. The order in which Attributes appear in a file record is the same as the order they appear in the $AttrDef file. NTFS supports two forms of attribute storage:
Resident attributes store their content directly inside the MFT entry.
Non-resident attributes store their content outside the MFT entry on disk, requiring the file system to track the external cluster locations.
Therefore, the data structure for the non-resident attribute is slightly different from the resident attribute, particularly because the content of the attribute is stored outside the MFT entry, so the addresses of these clusters allocated to store the content must be specified. The contents of non-resident attributes are stored in intervals of clusters called data runs. Each run is represented by its starting cluster and its length in clusters. The lengths of data runs vary, and are determined by the first byte of a run, where the lower 4 bits represent the number of bytes for the length of the run and the upper 4 bits represent the number of bytes containing the starting cluster address for the run. Each run uses contiguous disk allocation.
Because each data run describes a contiguous block of clusters, the attribute’s content may span multiple runs scattered across the volume, depending on the level of fragmentation. This flexible, variable-length encoding allows NTFS to efficiently map non-resident data, even when stored in fragmented segments.
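The run encoding described above can be decoded with a short loop. This sketch relies on three details that the text does not spell out and that I am stating as assumptions: a header byte of 0x00 ends the run list, the starting cluster of each run is stored as a signed offset relative to the start of the previous run, and a run with no starting-cluster bytes is sparse. The example bytes are made up for illustration.

```python
def decode_data_runs(data: bytes) -> list:
    """Decode a run list into (starting_cluster, length_in_clusters) pairs."""
    runs = []
    pos = 0
    current_cluster = 0
    while pos < len(data) and data[pos] != 0:  # 0x00 header ends the list
        header = data[pos]
        length_size = header & 0x0F  # lower 4 bits: bytes in the length field
        offset_size = header >> 4    # upper 4 bits: bytes in the cluster field
        pos += 1
        length = int.from_bytes(data[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            runs.append((None, length))  # sparse run, no clusters on disk
        else:
            delta = int.from_bytes(
                data[pos:pos + offset_size], "little", signed=True
            )
            current_cluster += delta  # relative to the previous run's start
            runs.append((current_cluster, length))
        pos += offset_size
    return runs


# Two runs: 0x18 clusters starting at cluster 0x5634, then 0x10 clusters
# starting 16 clusters before that.
example = bytes.fromhex("21 18 34 56 11 10 F0 00")
print(decode_data_runs(example))
```

A fragmented file simply produces more than one tuple in the returned list, one per contiguous run.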
The table below shows the layout of an attribute, including resident attributes and non-resident attributes.
Byte Offset | Length | Description |
0x00 | DWORD | Attribute type identifier, classified according to the type of information stored (16 = $STANDARD_INFORMATION for general information, 48 = $FILE_NAME for file name and MAC times, 64 = $OBJECT_ID for file and directory identifiers, 128 = $DATA for file content, etc.) |
0x04 | DWORD | Length of Attribute |
0x08 | BYTE | Non-resident flag (0x00: Resident; 0x01: Non-resident) |
0x09 | BYTE | Length of stream name (number of Unicode characters). Unicode characters are 2-byte values, so the length in bytes is 2x this number. |
0x0A | WORD | Offset to stream name (from the start of the Attribute) |
0x0C | WORD | Attribute Flags. Three values are currently observed: 0x0001 = Attribute content is compressed; 0x8000 = Attribute content is sparse; 0x4000 = Attribute content is encrypted. |
0x0E | WORD | Attribute identifier |
Resident Attribute | | Non-Resident Attribute | |
Bytes Offset | Description | Bytes Offset | Description |
0x10 - 0x13 | Size of file content | 0x10 - 0x17 | Starting virtual cluster number (VCN) of the run list |
0x14 - 0x15 | Offset of file content (from the start of the Attribute) | 0x18 - 0x1F | Last VCN of the run list |
| | 0x20 - 0x21 | Offset to the data runs |
| | 0x22 - 0x23 | Compression unit size |
| | 0x24 - 0x27 | Unused |
| | 0x28 - 0x2F | Allocated size of the attribute content |
| | 0x30 - 0x37 | Actual size of the attribute content |
| | 0x38 - 0x3F | Initialized size of the attribute content |
| | 0x40 | Attribute Total Allocated Size* |
The ‘Attribute Total Allocated Size’ field is only found in attributes where the Attribute Flag 0x0001 (Attribute Content is Compressed) or 0x8000 (Attribute Content is Sparse) is set. The ‘DataRun offset’ in those Attributes is changed from the typical 0x40 (64) to 0x48 (72) to accommodate the 8 extra bytes.
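A rough sketch of reading the attribute layout tables with `struct` is shown below. The dictionary keys are my own labels, the optional ‘Attribute Total Allocated Size’ field is not handled, and the sample attribute is synthetic (a resident attribute of type 0x10 with made-up lengths).

```python
import struct


def parse_attribute_header(buf: bytes, offset: int = 0) -> dict:
    """Read the common attribute header, then the resident or non-resident part."""
    type_id, length, non_resident, name_len, name_off, flags, attr_id = (
        struct.unpack_from("<IIBBHHH", buf, offset)
    )
    info = {
        "type": type_id,
        "length": length,
        "non_resident": bool(non_resident),
        "name_length": name_len,
        "name_offset": name_off,
        "flags": flags,
        "id": attr_id,
    }
    if non_resident:
        start_vcn, last_vcn, run_offset, comp_unit = struct.unpack_from(
            "<QQHH", buf, offset + 0x10
        )
        allocated, actual, initialized = struct.unpack_from(
            "<QQQ", buf, offset + 0x28
        )
        info.update(
            start_vcn=start_vcn, last_vcn=last_vcn, run_offset=run_offset,
            compression_unit=comp_unit, allocated_size=allocated,
            actual_size=actual, initialized_size=initialized,
        )
    else:
        size, content_offset = struct.unpack_from("<IH", buf, offset + 0x10)
        info.update(content_size=size, content_offset=content_offset)
    return info


# Synthetic resident attribute: type 0x10, 0x60 bytes long, 0x48 bytes of
# content starting 0x18 bytes into the attribute (illustrative values).
sample_attr = struct.pack(
    "<IIBBHHHIH", 0x10, 0x60, 0, 0, 0x18, 0, 0, 0x48, 0x18
).ljust(0x60, b"\x00")
attr = parse_attribute_header(sample_attr)
print(hex(attr["type"]), attr["content_size"], attr["content_offset"])
```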
The offset of the first Attribute in a FILE record is determined by the ‘Offset to 1st attribute’ in the Record Header. To get to the 2nd Attribute, one must add the ‘Attribute Length’ of the 1st Attribute to the ‘Offset to 1st attribute’, etc. For example:
- ‘Offset to 1st attribute’+ ‘1st Attribute Length’ → Start of 2nd Attribute.
- ‘Offset to 1st attribute’+ ‘1st Attribute Length’ + ‘2nd Attribute length’ → Start of 3rd Attribute.
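The chain of additions above amounts to a loop that starts at the ‘Offset to 1st attribute’ and keeps adding attribute lengths until the end marker 0xFFFFFFFF is reached. A minimal sketch follows, assuming the record has already had its fix-up bytes restored; the function name and the synthetic record are illustrative only.

```python
import struct

END_MARKER = 0xFFFFFFFF


def iter_attributes(record: bytes):
    """Yield (offset, type, length) for each attribute in a FILE record."""
    (offset,) = struct.unpack_from("<H", record, 0x14)  # offset to 1st attribute
    while offset + 4 <= len(record):
        (type_id,) = struct.unpack_from("<I", record, offset)
        if type_id == END_MARKER:
            return
        if offset + 8 > len(record):
            raise ValueError("attribute header runs past the end of the record")
        (length,) = struct.unpack_from("<I", record, offset + 4)
        if length == 0 or offset + length > len(record):
            raise ValueError(f"implausible attribute length at offset {offset}")
        yield offset, type_id, length
        offset += length  # start of the next attribute


# Synthetic record: three attributes (types 0x10, 0x30, 0x80) with made-up
# lengths, followed by the end marker.
rec = bytearray(1024)
struct.pack_into("<H", rec, 0x14, 0x38)
pos = 0x38
for type_id, length in ((0x10, 0x60), (0x30, 0x68), (0x80, 0x48)):
    struct.pack_into("<II", rec, pos, type_id, length)
    pos += length
struct.pack_into("<I", rec, pos, END_MARKER)

for found in iter_attributes(bytes(rec)):
    print(found)
```

The length checks are there because a corrupt or zero length would otherwise send the loop outside the record or make it spin forever.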
The minimum Attribute Header length is 24 bytes, whether the attribute has a Stream Name or not. An NTFS volume uses many types of attributes, each serving a different purpose. These are defined by a hidden system file named $AttrDef. $AttrDef is made up of multiple 160-byte records, one for each attribute. Each record contains the attribute’s name, numeric type identifier, flags (e.g., resident or non-resident, indexed or not), minimum size, and maximum size. If an attribute has no size limitations, the minimum size will be set to 0, and the maximum will have all bits set to 1. The table below lists the default MFT entry attribute types, although it is unlikely that all will be used in an MFT record.
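The following sketch shows how one 160-byte $AttrDef record might be unpacked. The exact field layout is an assumption on my part and is not given in this post: a 128-byte UTF-16 name, then four 4-byte values (type identifier, display rule, collation rule, flags), then 8-byte minimum and maximum sizes. The sample record is synthetic.

```python
import struct

# ASSUMED layout of one $AttrDef record (160 bytes in total).
ATTRDEF_FORMAT = "<128sIIIIQQ"
NO_LIMIT = 0xFFFFFFFFFFFFFFFF  # maximum size with all bits set


def parse_attrdef_record(raw: bytes) -> dict:
    """Unpack a single 160-byte $AttrDef record under the assumed layout."""
    label, type_id, display_rule, collation_rule, flags, min_size, max_size = (
        struct.unpack(ATTRDEF_FORMAT, raw)
    )
    return {
        "name": label.decode("utf-16-le").split("\x00", 1)[0],
        "type": type_id,
        "flags": flags,
        "min_size": min_size,
        "max_size": max_size,
        "unlimited": min_size == 0 and max_size == NO_LIMIT,
    }


# Synthetic record for an attribute with no size limits.
sample_def = struct.pack(
    ATTRDEF_FORMAT, "$DATA".encode("utf-16-le"), 0x80, 0, 0, 0, 0, NO_LIMIT
)
definition = parse_attrdef_record(sample_def)
print(definition["name"], hex(definition["type"]), definition["unlimited"])
```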
Attribute Type Identifier | Attribute Name | Description |
0x10 | $STANDARD_INFORMATION | General information, such as flags; file system timestamps, including the last accessed, written, and, created times; and the owner and security ID. |
0x20 | $ATTRIBUTE_LIST | List where other attributes for a file can be found. |
0x30 | $FILE_NAME | Filename, in Unicode, and file system timestamps, including the last accessed, written, and created times. |
0x40 | $VOLUME_VERSION | Volume information. Exists only in version 1.2 (Windows NT). |
0x40 | $OBJECT_ID | A 16-byte unique identifier for the file or directory. Exists only in version 3.0+ (Windows 2000+). |
0x50 | $SECURITY_DESCRIPTOR | The access control and security properties of the file. |
0x60 | $VOLUME_NAME | Volume name. |
0x70 | $VOLUME_INFORMATION | File system version and other flags. |
0x80 | $DATA | File contents. |
0x90 | $INDEX_ROOT | Root node of an index tree. |
0xA0 | $INDEX_ALLOCATION | Nodes of an index tree rooted in the $INDEX_ROOT attribute. |
0xB0 | $BITMAP | A bitmap for the $MFT file and for indexes. |
0xC0 | $SYMBOLIC_LINK | Soft link information. Exists only in version 1.2 (Windows NT). |
0xD0 | $REPARSE_POINT | Contains data about a reparse point, which is used as a soft link in version 3.0+ (Windows 2000+). |
0xE0 | $EA_INFORMATION | Used for backward compatibility with OS/2 applications (HPFS). |
0xF0 | $EA | Used for backward compatibility with OS/2 applications (HPFS). |
0x100 | $LOGGED_UTILITY_STREAM | Contains keys and information about encrypted attributes in version 3.0+ (Windows 2000+). |
Recall that every file on the file system will have at least one MFT entry, and the size of each MFT entry is only 1024 bytes. In case a file has too many attributes that won’t fit into a single MFT entry, an additional MFT entry would be used, linked from the base MFT through the use of the $ATTRIBUTE_LIST attribute. In other words, the $ATTRIBUTE_LIST attribute is used to indicate where other attributes can be found for the given MFT entry.




