The NTFS Master File Table (MFT)

Forensic examiners require an in-depth understanding of the MFT to interpret and verify what forensic tools are displaying and to retrieve additional information about deleted files beyond what is available using automated software.


 Formatting a volume as NTFS leads to the creation of two sections:


  1.  Partition Boot Sector – On an NTFS volume, the first data is the Partition Boot Sector, which starts at sector 0 of the volume and can extend to 16 sectors. It defines the file system’s data structure: the cluster size, the MFT entry size, and the starting cluster address of the MFT. Because the MFT is not placed in a predefined sector, it can be moved if a bad sector occupies its usual location.
  2.  Data Area – This is made up of two components:
    • File Area
    •  Master File Table (MFT) – This is the main subject of discussion in this post.

Figure 1: The Layout of an NTFS Volume


All “objects” stored on the volume are regarded as files, except for the Partition Boot Record.


Master File Table (MFT)

The Master File Table (MFT) is the primary source of metadata in NTFS. It contains or indirectly references everything about a file: its timestamps, size in bytes, attributes (such as permissions), parent directory, and contents. A sizeable area of the NTFS volume is reserved for the MFT to avoid it becoming fragmented as it grows in size. This area, by default, is about 12.5% of the volume size and is known as the “MFT Reserved Area”. As data is added, the MFT can expand to take up 50% of the disk.


Figure 2: The Master File Table
 

On a standard hard drive with 512-byte sectors, the MFT is structured as a series of 1,024-byte records, also known as “entries,” one for each file and directory on the volume. Only the first 42 bytes (the MFT header) have a defined purpose; the remaining 982 bytes store attributes, which are small data structures that each have a very specific purpose. On advanced format (AF) drives with 4KB sectors, however, each MFT record is 4,096 bytes instead.


Figure 3: Basic layout of MFT entry


The first field in each MFT entry is the signature (or magic number). A standard entry has the ASCII string “FILE” (0x46494c45); if an error is found in the entry, it has the string “BAAD” instead. There is also a flag field that identifies whether the entry is in use and whether it is for a directory.


MFT entry header


Each MFT record starts with a 42-byte header. The first 4 bytes of every MFT record are the ASCII characters FILE (0x46494c45, as stated earlier). The header also specifies where the first attribute starts: this offset is stored at byte 0x14 (decimal 20) of the record, and in the example analysed below its value is 0x38. Each attribute begins with its attribute ID, followed 4 bytes later by a length value that defines where the attribute ends and the next one starts. The data structure of the MFT header shown in the figure above is analysed in the table below.


Data Structure of MFT Header

Byte range | Description
0 - 3 | Signature (“FILE”). Size: 4 bytes.
4 - 5 | Offset to fixup array: 0x30 00. Size: 2 bytes. The value is stored in little-endian order, so the bytes are read in reverse as 0x0030, which is 48 in decimal. The fixup array is therefore located 48 bytes into the MFT entry.
6 - 7 | Number of entries in fixup array: 0x03 00. Size: 2 bytes. Read as little-endian, this is 0x0003, or 3 in decimal, so the array has three values in it. The fixup array is used to validate sectors within the MFT record.
8 - 15 | $LogFile sequence number (LSN): 0x4A 43 22 D4 0C 00 00 00. Size: 8 bytes. Holds the sequence number of the $LogFile entry that tracks changes to the file. The log records when metadata updates are made to the file system so that a corrupt file system can be fixed more quickly. Read as little-endian, the value is 0x0000000CD422434A, which is 55098622794 in decimal.
16 - 17 | Sequence value: 0x01 00. Size: 2 bytes. The sequence value is incremented when the entry is either allocated or unallocated, as determined by the OS. Read as little-endian, this is 0x0001, or 1 in decimal, which means this is the first time this entry has been used.
18 - 19 | Link count: 0x01 00. Size: 2 bytes. The link count shows how many directories have entries for this MFT entry. If hard links were created for the file, this number is incremented by one for each link. Microsoft defines hard links as: “NTFS-based links to a file on an NTFS volume. By creating hard links, you can have a single file in multiple folders without duplicating the file. You can also create multiple hard links for a file in a folder if you use different file names for the hard links. Because all of the hard links reference the same file, applications can open any of the hard links and modify the file.” Read as little-endian, the value is 0x0001, or 1 in decimal, so only one directory has an entry for this MFT record.
20 - 21 | Offset to first attribute: 0x38 00. Size: 2 bytes. This is where the first attribute for the file starts. All other attributes follow the first one, and we find them by advancing using the size field in each attribute header. The end-of-file marker 0xffffffff follows the last attribute. If a file needs more than one MFT entry, the additional entries hold the file reference of the base entry. Read as little-endian, the value is 0x0038, or 56 in decimal, so the first attribute starts at byte offset 56.
22 - 23 | Flags (in-use and directory): 0x0000: deleted file; 0x0001: allocated file; 0x0002: deleted directory; 0x0003: allocated directory. Size: 2 bytes. In this case the value is 0x01 00 (0x0001), so this is a file record that is in use.
24 - 27 | Used size of MFT entry: 0xB0 01 00 00. Size: 4 bytes. Indicates the “real” length of the file record, referred to here as the “logical” size: the actual number of bytes of data stored in the record. Read as little-endian, the value is 0x000001B0, or 432 in decimal, so 432 bytes of the entry are in use.
28 - 31 | Allocated size of MFT entry: 0x00 04 00 00. Size: 4 bytes. Indicates the allocated storage size of the file record. This is referred to as the “physical” size and has already been preset to 1,024 bytes by the BPB. Read as little-endian, the value is 0x00000400, which does indeed equate to 1,024 in decimal.
32 - 39 | File reference to base record. Size: 8 bytes. If this MFT record is the base entry for the file, this field is zero; if the record is an extension, this field holds the file reference of the base record. It is used when a file’s attributes do not fit within a single MFT record.
40 - 41 | Next attribute ID. Size: 2 bytes.
42 - 43 | Alignment to 4-byte boundary.
44 - 47 | MFT file record number (only in NTFS 3.1 and later).
42 - 1023 | Attributes and fixup values (in NTFS 3.1 and later, where bytes 42 - 47 belong to the header, these start at byte 48, as the fixup array offset in this example shows).
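The table above can be turned into a small parser. The following is a minimal sketch in Python, assuming the 48-byte NTFS 3.1+ header layout; the function name `parse_mft_header` is illustrative, and the sample bytes reuse the values from the table except for the next attribute ID (5) and the record number (313), which are made up because the table gives no values for them.

```python
import struct

# Little-endian layout of the first 48 bytes of an MFT entry (NTFS 3.1+).
# 4s=signature, H=2 bytes, I=4 bytes, Q=8 bytes; "<" disables padding.
HEADER_FORMAT = "<4sHHQHHHHIIQHHI"
FIELD_NAMES = (
    "signature", "fixup_offset", "fixup_count", "lsn", "sequence",
    "link_count", "first_attr_offset", "flags", "used_size",
    "allocated_size", "base_record_ref", "next_attr_id", "padding",
    "record_number",
)

def parse_mft_header(record: bytes) -> dict:
    """Decode the header fields at the start of one MFT record."""
    values = struct.unpack_from(HEADER_FORMAT, record, 0)
    header = dict(zip(FIELD_NAMES, values))
    # Bit 0 of the flags marks an allocated entry, bit 1 a directory.
    header["in_use"] = bool(header["flags"] & 0x0001)
    header["is_directory"] = bool(header["flags"] & 0x0002)
    return header

# Sample header built from the byte values in the table above.
sample = bytes.fromhex(
    "46494c45"          # signature "FILE"
    "3000"              # offset to fixup array
    "0300"              # number of fixup entries
    "4a4322d40c000000"  # $LogFile sequence number
    "0100"              # sequence value
    "0100"              # link count
    "3800"              # offset to first attribute
    "0100"              # flags: allocated file
    "b0010000"          # used size
    "00040000"          # allocated size
    "0000000000000000"  # base record reference (0 = this is the base)
    "0500"              # next attribute ID (made-up value)
    "0000"              # alignment padding
    "39010000"          # record number 313 (made-up value)
)

header = parse_mft_header(sample)
print(header["fixup_offset"], header["lsn"], header["used_size"])
```

Letting `struct` handle the little-endian conversion avoids reversing bytes by hand, which is where manual decoding most often goes wrong.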

 

 The figure below shows a pictorial representation of the above table.


MFT entry


Each entry contains metadata and attributes that both describe a file or directory and indicate where its contents reside on the physical disk. The essential elements of an MFT entry include the following:


Record Type

Specifies whether a given entry represents a file or directory.

Record #

An integer used to identify a given MFT entry. Record numbers grow sequentially as new entries are added.

Parent Record #

The record number of the parent directory. Each MFT entry only tracks the record number of its immediate parent, rather than its full path on disk. You can re-construct a file or directory’s full path by following this sequence of record numbers until you reach the root entry for a volume.

Active/Inactive Flag

MFT entries for deleted files or directories are marked “Inactive.” NTFS will automatically reclaim and replace inactive entries with new active entries to keep the MFT from growing indefinitely.

Attributes

Each MFT entry contains a number of “attributes” that hold metadata about a file: everything from timestamps to the physical location of the file’s contents on disk. Some important attributes include $STANDARD_INFORMATION, $FILE_NAME, and $DATA.
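The path reconstruction described under “Parent Record #” can be sketched as follows. This is a simplified illustration, assuming the record names and parent record numbers have already been extracted into a dictionary; the `entries` mapping, the record numbers in it, and the `<orphan>` placeholder are hypothetical, while record 5 is the root directory entry.

```python
ROOT_RECORD = 5  # MFT record number of the volume's root directory

def build_path(record_number: int, entries: dict) -> str:
    """Follow parent record numbers upward until the root is reached.

    `entries` maps a record number to a (name, parent_record_number) tuple.
    """
    parts = []
    seen = set()
    current = record_number
    while current != ROOT_RECORD:
        # Stop on a missing parent or a loop, e.g. when the parent
        # directory's entry was deleted and reused.
        if current in seen or current not in entries:
            parts.append("<orphan>")
            break
        seen.add(current)
        name, parent = entries[current]
        parts.append(name)
        current = parent
    return "\\" + "\\".join(reversed(parts))

# Hypothetical extracted entries: record -> (name, parent record).
entries = {
    64: ("Users", 5),
    313: ("alice", 64),
    400: ("notes.txt", 313),
}
print(build_path(400, entries))  # \Users\alice\notes.txt
```

The loop guard matters in practice: for deleted files the parent entry may have been reallocated, so a reconstructed path should be treated as a lead to verify rather than as proof of the original location.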


In addition to the MFT, NTFS uses a number of metadata files to manage the file system, and each has a record in the MFT as follows:


Description of Internal NTFS metadata files

Entry | File name | Description
0 | $MFT | The entry for the MFT itself. The MFT contains one base file record for each file and folder on the NTFS volume; other record positions in the MFT are allocated if more space is needed.
1 | $MFTMirr | Backup copy of the first entries in the MFT. The first four records of the MFT are saved in this position. If a single sector fails in the first MFT, the records can be restored, allowing recovery of the MFT.
2 | $LogFile | Previous transactions are stored here to allow recovery after a system failure in the NTFS volume. The size of the log file depends on the size of the volume, but you can increase the size of the log file by using the chkdsk command.
3 | $Volume | Information specific to the volume, such as label and version, is stored here.
4 | $AttrDef | A table listing attribute names, numbers, and descriptions.
5 | . | Root directory of the file system.
6 | $Bitmap | A map which shows the allocation status of each cluster in the file system (1 = cluster is allocated, 0 = cluster is unallocated).
7 | $Boot | Used to mount the NTFS volume during the bootstrap process; additional code is listed here if it’s the boot drive for the system.
8 | $BadClus | For clusters that have irrecoverable errors, an entry of the cluster location is made in this file.
9 | $Secure | Unique security descriptors for the volume are listed in this file. It’s where the access control list (ACL) is maintained for all files and folders on the NTFS volume (Windows 2000 and later).
10 | $Upcase | A table used to convert lowercase characters to uppercase Unicode characters for the NTFS volume.
11 | $Extend | Optional extensions are listed here, such as $Quota (disk quota limit), $ObjId (link tracking), and $Reparse (symbolic link).
12 - 15 | ---------- | Reserved for extension entries or future metadata.


These files are invisible to the user and manage the partition in terms of allocation of storage space, identification of space available, recovery information and descriptions of the file “attributes” available.
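As an illustration of how the $Bitmap file listed above is read, the sketch below tests the allocation bit for a given cluster. It assumes the bitmap content has already been extracted into a `bytes` object and that bits are numbered from the least significant bit of each byte, so cluster 0 is bit 0 of byte 0; the function name and the sample bitmap are made up.

```python
def cluster_is_allocated(bitmap: bytes, cluster: int) -> bool:
    """Return True if the bit for `cluster` is set (1 = allocated)."""
    byte_index, bit_index = divmod(cluster, 8)
    return bool((bitmap[byte_index] >> bit_index) & 1)

# Made-up bitmap: clusters 0, 2 and 15 are allocated.
bitmap = bytes([0b00000101, 0b10000000])
allocated = [c for c in range(16) if cluster_is_allocated(bitmap, c)]
print(allocated)  # [0, 2, 15]
```

A check like this is useful when deciding whether the clusters of a deleted file are still unallocated and therefore likely to hold the original content.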


Location of the MFT

The location of the MFT is given within the BIOS Parameter Block (BPB). The logical cluster number of the start of the MFT is stored as a 64-bit little-endian number in the 8 bytes at offsets 48 to 55 (0x30 to 0x37 in hex). This is given by the box highlighted red in the figure below.



Logical cluster number for $MFT: 0x00 00 0C 00 00 00 00 00. Read as a little-endian number, this is 0x0C0000, which is 786432 in decimal.
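To turn that cluster number into a byte offset on the volume, it has to be multiplied by the cluster size. The sketch below assumes the standard BPB positions for bytes per sector (offset 0x0B) and sectors per cluster (offset 0x0D), which are not covered in this post, and uses a made-up boot sector with 512-byte sectors and 8 sectors per cluster; only the MFT cluster number comes from the example above.

```python
import struct

def mft_byte_offset(boot_sector: bytes) -> int:
    """Compute the byte offset of the MFT from the start of the volume."""
    (bytes_per_sector,) = struct.unpack_from("<H", boot_sector, 0x0B)
    # Assumes a plain sector count; very large cluster sizes use a
    # different encoding of this byte, which this sketch ignores.
    sectors_per_cluster = boot_sector[0x0D]
    (mft_lcn,) = struct.unpack_from("<Q", boot_sector, 0x30)
    return mft_lcn * sectors_per_cluster * bytes_per_sector

# Made-up boot sector carrying only the three fields used here.
boot = bytearray(512)
struct.pack_into("<H", boot, 0x0B, 512)              # bytes per sector
boot[0x0D] = 8                                       # sectors per cluster
boot[0x30:0x38] = bytes.fromhex("00000c0000000000")  # MFT start cluster

offset = mft_byte_offset(bytes(boot))
print(offset)  # 786432 clusters * 4096 bytes = 3221225472
```

With these assumed values the MFT starts 3 GiB into the volume; on a real image the same calculation gives the position to seek to before reading the first MFT record.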


MFT Entry Addresses

Each MFT entry, also known as a file record, is assigned a unique 48-bit address (the file record number). The first entry has the address zero, and addresses increase sequentially. In addition, each MFT entry has a 16-bit sequence number stored within the entry at byte offset 16. It starts at 1 when the entry is first allocated and is incremented by 1 whenever the entry is reallocated (or the file it represents is deleted). For example, consider MFT entry 313 with a sequence number of 1. The file that allocated entry 313 is deleted, and the entry is reallocated to a new file; when it is reallocated, it has a new sequence number of 2. Concatenating the sequence number in the upper 16 bits and the file record number in the lower 48 bits gives a 64-bit file reference address, which NTFS uses to refer to MFT entries.


Figure 3: MFT file reference address
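The composition of a file reference address can be expressed with a few bit operations. This is a minimal sketch using the entry 313 example from the text; the function names are illustrative.

```python
RECORD_MASK = 0xFFFFFFFFFFFF  # lower 48 bits hold the record number

def make_file_reference(record_number: int, sequence: int) -> int:
    """Pack a 16-bit sequence number and a 48-bit record number."""
    return (sequence << 48) | (record_number & RECORD_MASK)

def split_file_reference(reference: int) -> tuple:
    """Return (record_number, sequence) from a 64-bit file reference."""
    return reference & RECORD_MASK, reference >> 48

# Entry 313 after it has been reallocated once (sequence number 2).
reference = make_file_reference(313, 2)
print(hex(reference))                       # 0x2000000000139
print(reference.to_bytes(8, "little").hex())  # on-disk byte order
print(split_file_reference(reference))      # (313, 2)
```

When a file reference found in unallocated data carries sequence number 1 but entry 313 now reports 2, the entry has been reused since that reference was written.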


NTFS uses the file reference address to refer to MFT entries because the sequence number makes it easier to determine when the file system is in a corrupt state. For example, if the system crashes at some point while the various data structures for a file are being allocated, the sequence number can determine whether a data structure contains an MFT entry address because the previous file used it or because it is part of the new file. We also can use it when recovering deleted content. For example, if we have an unallocated data structure with a file reference number in it, we can determine if the MFT entry has been reallocated since this data structure used it. The sequence number can be useful during an investigation.



 MFT and File Attributes 

MFT entries are comprised of a header and sets of attributes that describe the files or directories on the disk. These attributes are stored as metadata, and contain information about the file. Thus, each file on NTFS file system has an associated MFT entry. In other words, files in NTFS are collections of attributes, so they contain their own descriptive information, as well as their own data.


File or folder information is typically stored in an MFT record in one of two ways: resident or nonresident. For very small files, about 512 bytes or less, all file metadata and data are stored in the MFT record. These are called resident files because all their information is stored in the MFT record. If the amount of data for a file is too large to be accommodated within the MFT record, the data is stored elsewhere on the volume in a separate cluster or series of clusters. The file or folder’s MFT record then provides the cluster addresses where the file is stored on the drive’s partition. These cluster addresses are called data runs. This type of MFT record is referred to as “nonresident” because the file’s data is stored in clusters outside the MFT.


Each MFT entry attribute consists of two parts: the attribute header and the attribute content. The header is the important area: it is generic to all attributes and describes the attribute’s properties, such as its type and size. The attribute header has a size of 16 bytes. The actual content or value of the attribute is also called a stream. Many types of attributes have their own special internal structures for their contents, such as “FILE_NAME” and “INDEX_ROOT”, and the attribute content varies depending on the size of the file.


Figure: Generic attribute header
 

An NTFS volume uses many types of attributes, each with its own purpose. They are defined by a hidden system file named $AttrDef, which is made up of multiple 160-byte records, one for each attribute. Each record contains the attribute’s name, numeric type identifier, flags (e.g., resident or non-resident, indexed or not), minimum size, and maximum size. If an attribute has no size limitations, the minimum size is set to 0 and the maximum has all bits set to 1. In addition, each attribute type has a name, which is written in all capital letters and starts with “$”.
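A 160-byte $AttrDef record can be decoded along these lines. The sketch assumes the commonly documented layout: a 128-byte UTF-16 name, followed by 4-byte type identifier, display rule, collation rule, and flags fields, then 8-byte minimum and maximum sizes. The display and collation rule fields are not mentioned in this post, and the sample record (including its zero flags value) is made up for illustration.

```python
import struct

# 128-byte name + four 4-byte fields + two 8-byte sizes = 160 bytes.
ATTRDEF_FORMAT = "<128sIIIIQQ"
NO_LIMIT = 0xFFFFFFFFFFFFFFFF  # maximum size with all bits set to 1

def parse_attrdef_record(record: bytes) -> dict:
    """Decode one $AttrDef record (layout assumed, see lead-in)."""
    (raw_name, type_id, display_rule, collation_rule,
     flags, min_size, max_size) = struct.unpack_from(ATTRDEF_FORMAT, record, 0)
    name = raw_name.decode("utf-16-le").split("\x00", 1)[0]
    return {
        "name": name,
        "type_id": type_id,
        "flags": flags,
        "min_size": min_size,
        "max_size": max_size,
        "unlimited": min_size == 0 and max_size == NO_LIMIT,
    }

# Made-up record describing $DATA with no size limitations.
sample_record = struct.pack(
    ATTRDEF_FORMAT,
    "$DATA".encode("utf-16-le"),  # struct pads the name to 128 bytes
    0x80, 0, 0, 0, 0, NO_LIMIT,
)
definition = parse_attrdef_record(sample_record)
print(definition["name"], hex(definition["type_id"]), definition["unlimited"])
```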


Figure 4: MFT entry with attribute header and content


Some of the default attribute types and their identifiers are given in the table below. It is important to note that not all these attribute types and identifiers will exist for every file.



ID (little-endian bytes) | Attribute Type | Description
10 00 00 00 | $STANDARD_INFORMATION | This field contains data on file creation, alterations, MFT changes, read dates and times, and DOS file permissions.
20 00 00 00 | $ATTRIBUTE_LIST | Lists the location of the attribute records that do not fit in the base MFT record.
30 00 00 00 | $FILE_NAME | The long and short names for a file are contained here. Up to 255 Unicode characters are available for long filenames. For POSIX requirements, additional names or hard links can also be listed. Files with short filenames have only one attribute ID 0x30. Long filenames have two attribute ID 0x30s in the MFT record: one for the short name and one for the long name.
40 00 00 00 | $OBJECT_ID (VOLUME_VERSION IN WINDOWS NT) | A 16-byte unique identifier (GUID) for the file or folder, used for example by link tracking. Depending on your NTFS setup, some file records might not contain this attribute ID. Exists only in NTFS version 3.0 and later (Windows 2000+).
50 00 00 00 | $SECURITY_DESCRIPTOR | Ownership and who has access rights to the file or folder are listed here, including the access control list (ACL) for the file.
60 00 00 00 | $VOLUME_NAME | The name (label) of the volume is stored here. It is used by the $Volume metadata file rather than by ordinary files.
70 00 00 00 | $VOLUME_INFORMATION | This field indicates the version and state of the volume.
80 00 00 00 | $DATA | File data for resident files or data runs for nonresident files.
90 00 00 00 | $INDEX_ROOT | Root node of an index tree.
A0 00 00 00 | $INDEX_ALLOCATION | Nodes of an index tree rooted in the $INDEX_ROOT attribute.
B0 00 00 00 | $BITMAP | A bitmap indicating which records are in use, for example in the MFT or in a directory index.
C0 00 00 00 | $REPARSE_POINT | This field is used for volume mount points and Installable File System (IFS) filter drivers. For the IFS, it marks specific files used by drivers.
D0 00 00 00 | $EA_INFORMATION | Used for backward compatibility with OS/2 applications (HPFS).
E0 00 00 00 | $EA | Used for backward compatibility with OS/2 applications (HPFS).
00 01 00 00 | $LOGGED_UTILITY_STREAM | Contains keys and information about encrypted attributes in version 3.0+ (Windows 2000+).
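The attribute type identifiers in the table above, together with the length field in each attribute header and the 0xffffffff end marker, are enough to walk the attributes of an MFT record. The following sketch assumes each attribute starts with a 4-byte type ID followed by a 4-byte length; the sample record is synthetic, with made-up attribute lengths and empty attribute bodies.

```python
import struct

ATTRIBUTE_NAMES = {
    0x10: "$STANDARD_INFORMATION", 0x20: "$ATTRIBUTE_LIST",
    0x30: "$FILE_NAME", 0x40: "$OBJECT_ID",
    0x50: "$SECURITY_DESCRIPTOR", 0x60: "$VOLUME_NAME",
    0x70: "$VOLUME_INFORMATION", 0x80: "$DATA",
    0x90: "$INDEX_ROOT", 0xA0: "$INDEX_ALLOCATION",
    0xB0: "$BITMAP", 0xC0: "$REPARSE_POINT",
    0xD0: "$EA_INFORMATION", 0xE0: "$EA",
    0x100: "$LOGGED_UTILITY_STREAM",
}
END_MARKER = 0xFFFFFFFF

def walk_attributes(record: bytes, first_attr_offset: int) -> list:
    """Return (offset, name, length) for each attribute in the record."""
    found = []
    offset = first_attr_offset
    while offset + 8 <= len(record):
        type_id, length = struct.unpack_from("<II", record, offset)
        if type_id == END_MARKER:
            break
        # A zero or oversized length means the record is damaged.
        if length == 0 or offset + length > len(record):
            break
        found.append((offset, ATTRIBUTE_NAMES.get(type_id, hex(type_id)), length))
        offset += length
    return found

def fake_attribute(type_id: int, length: int) -> bytes:
    """Build a placeholder attribute: header fields plus zero padding."""
    return struct.pack("<II", type_id, length) + bytes(length - 8)

# Synthetic 1,024-byte record: zeroed header area, three attributes.
record = (bytes(56) + fake_attribute(0x10, 96) + fake_attribute(0x30, 104)
          + fake_attribute(0x80, 24) + struct.pack("<I", END_MARKER))
record = record.ljust(1024, b"\x00")

attributes = walk_attributes(record, 56)
for entry in attributes:
    print(entry)
```

In a real record the starting offset would come from bytes 20 - 21 of the MFT header rather than being hard-coded as 56.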


In this post, we have introduced the NTFS Master File Table. In subsequent posts, we will elaborate on all the concepts discussed here from a forensics standpoint.
