NTFS Forensics: NTFS Fix-Ups

In digital forensics, it is standard practice to use MD5 (or similar cryptographic hash functions) to verify that two files contain identical binary content. This technique is frequently applied when investigators need to confirm that multiple copies of a file found on a storage volume are exact duplicates, particularly when file locations, timestamps, or other metadata are relevant to the case. It is also essential when processing large collections of files—such as tens of thousands of images—where automated hash comparison against a database of known files can identify items of interest without requiring manual review of every file.
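The automated comparison described above can be sketched in a few lines. This is only an illustration: the function names `md5_hex` and `flag_known_files` are hypothetical, and the known-hash set stands in for whatever reference database an examiner actually uses.

```python
import hashlib


def md5_hex(data):
    """Return the MD5 digest of a byte string as lowercase hex."""
    return hashlib.md5(data).hexdigest()


def flag_known_files(recovered, known_hashes):
    """Return the names of recovered files whose MD5 appears in known_hashes.

    recovered    -- dict mapping a file name to its recovered bytes
    known_hashes -- set of lowercase hex MD5 digests (hypothetical reference set)
    """
    return sorted(
        name for name, data in recovered.items()
        if md5_hex(data) in known_hashes
    )
```

A single differing byte in the recovered content is enough to drop a file out of the result, which is why the accuracy of the recovery matters so much.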

For any hash-based comparison to be reliable, the recovered file must be an accurate and complete representation of its original content. NTFS employs a built-in integrity mechanism known as the Update Sequence Array (USA), also referred to as the fixup array, within structures such as Master File Table (MFT) records. This feature was not widely discussed in the forensic community until it was highlighted through detailed experimental analysis.

In a typical 1024-byte MFT record (spanning two 512-byte sectors), the final two bytes of each sector (offsets 510-511 and 1022-1023) are overwritten with a two-byte Update Sequence Number (USN) when the record is written to disk. The original bytes that occupied those positions are saved in the Update Sequence Array (USA), located in the MFT record header. The header itself records where to find it: a two-byte offset at byte 4 and a two-byte entry count at byte 6. The update sequence (the USN followed by the USA) usually starts at byte offset 42 or 48, depending on the NTFS version. When the operating system reads the record:

It checks that the end-of-sector bytes match the current USN (integrity check).
It replaces those USN bytes with the saved original values from the USA, ensuring the data appears intact to applications.
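The two read-side steps above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `apply_fixups` is hypothetical, and the code assumes a 512-byte fix-up stride and a header that stores the update sequence offset at byte 4 and the entry count at byte 6.

```python
import struct

SECTOR_SIZE = 512  # assumed fix-up stride, matching the 1024-byte record described above


def apply_fixups(record):
    """Return a copy of a multi-sector NTFS record with the original
    end-of-sector bytes restored from the Update Sequence Array."""
    buf = bytearray(record)
    # Bytes 4-5: offset of the update sequence; bytes 6-7: entry count
    # (one USN plus one saved two-byte value per sector).
    usa_offset, usa_count = struct.unpack_from("<HH", buf, 4)
    usn = bytes(buf[usa_offset:usa_offset + 2])
    for i in range(1, usa_count):
        tail = i * SECTOR_SIZE - 2  # last two bytes of sector i - 1
        if tail + 2 > len(buf):
            raise ValueError("USA describes more sectors than the buffer holds")
        # Step 1: integrity check against the current USN.
        if bytes(buf[tail:tail + 2]) != usn:
            raise ValueError("sector %d fails the USN check" % (i - 1))
        # Step 2: put the saved original bytes back.
        saved = usa_offset + 2 * i
        buf[tail:tail + 2] = buf[saved:saved + 2]
    return bytes(buf)
```

Raising an error on a USN mismatch is a design choice for this sketch; a recovery tool might prefer to log the mismatch and continue, since a failed check is itself useful evidence of an incomplete write.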

If an analyst extracts raw data directly from an MFT record (e.g., for a small resident file whose content lives inside the $DATA attribute of the MFT entry) without manually applying the fixup/USA restoration, the extracted file will contain the USN values in the last 2 bytes of each sector instead of the original data. The issue becomes significant when the resident file content crosses a sector boundary within the MFT record, a situation that occurs frequently with small files. In such cases, the extracted data will differ from the true file content, causing the computed MD5 hash to mismatch when compared against a reference database or another copy of the file. The same problem can affect any data recovered from record slack space or other structures that span sector boundaries.
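The mismatch is easy to reproduce on a synthetic record. The sketch below simulates the write side (saving the end-of-sector bytes and stamping the USN) and then hashes a naive raw extraction. Everything in it is made up for illustration: the helper `stamp_fixups`, the USN value, and the placement of the content at offsets 400-599 are assumptions, not values from a real volume.

```python
import hashlib
import struct

SECTOR_SIZE = 512  # assumed fix-up stride


def stamp_fixups(record, usn):
    """Simulate the on-disk form: save each sector's last two bytes in the
    USA, then overwrite them with the USN."""
    buf = bytearray(record)
    usa_offset, usa_count = struct.unpack_from("<HH", buf, 4)
    struct.pack_into("<H", buf, usa_offset, usn)
    for i in range(1, usa_count):
        tail = i * SECTOR_SIZE - 2
        saved = usa_offset + 2 * i
        buf[saved:saved + 2] = buf[tail:tail + 2]
        struct.pack_into("<H", buf, tail, usn)
    return bytes(buf)


# Synthetic 1024-byte record: "FILE" signature, update sequence at offset 48,
# three entries (USN plus two sectors). Not a complete MFT record.
logical = bytearray(1024)
logical[0:4] = b"FILE"
struct.pack_into("<HH", logical, 4, 48, 3)

# Hypothetical resident content that crosses the first sector boundary.
CONTENT_START, CONTENT_LEN = 400, 200
content = bytes((i * 7 + 1) % 251 for i in range(CONTENT_LEN))
logical[CONTENT_START:CONTENT_START + CONTENT_LEN] = content

on_disk = stamp_fixups(bytes(logical), usn=0x0005)
raw_extract = on_disk[CONTENT_START:CONTENT_START + CONTENT_LEN]

original_md5 = hashlib.md5(content).hexdigest()
raw_md5 = hashlib.md5(raw_extract).hexdigest()
# Positions (relative to the content) where the raw extraction is wrong.
differing = [i for i in range(CONTENT_LEN) if raw_extract[i] != content[i]]
```

Only the two bytes that land on record offsets 510-511 differ, yet that is enough for `raw_md5` and `original_md5` to disagree.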

This discrepancy can have serious consequences in legal proceedings. For example, if a document, such as a blackmail note, is created on one computer and later transferred to another, investigators may need to prove the files are identical. If the files are recovered in raw form from an NTFS volume, their hashes may not match, either because the content crosses the sector boundary at a different point in each MFT record or because the Update Sequence Numbers differ between the two systems. A defense team could exploit such a mismatch to question the overall reliability of the forensic process. The same hashing failure can occur when comparing a raw recovery from NTFS against a logically copied version of the file from a non-NTFS file system (such as FAT), which does not use this update mechanism.

Many common forensic tools automatically apply the fixup correction when displaying file content in a parsed “File View,” but may not clearly document when or where they did so, or show the on-disk USN in hex view. A hex view that displays corrected data can misleadingly suggest that the original data bytes reside at the physical end of each sector, when in reality those on-disk positions hold the USN values and the true data bytes are stored in the Update Sequence Array. Analysts must understand the low-level disk structures to verify tool behavior. The fixup mechanism applies not only to MFT records but also to other NTFS structures such as INDX records and $LogFile records.

Modern forensic suites (e.g., EnCase, FTK, Autopsy/Sleuth Kit with proper NTFS parsers) generally handle fixups correctly when carving or exporting resident files. However, custom scripts, manual hex carving, or certain open-source tools can still trip over it if the parser skips the fixup step. The problem does not affect non-resident files (data stored in allocated clusters outside the MFT) or normal logical file copies made via the OS.

These subtleties reinforce the importance of forensic examiners possessing a solid understanding of low-level disk structures and storage principles. Analysts should always verify tool behavior independently rather than relying solely on automated output, especially when hash matching is critical to the investigation.

Practical Advice for Analysts

Always use tools that properly parse NTFS MFT records and apply fixups when recovering resident content.
When in doubt, manually verify: Parse the MFT header → locate the USA offset and size → restore the original end-of-sector bytes → recompute the hash.
For hash-based identification (e.g., NSRL, known-bad image databases), ensure the recovered file represents the logical content, not the raw on-disk MFT bytes.
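The manual verification chain above (parse the header, locate the USA, restore the end-of-sector bytes, recompute the hash) can be sketched end to end. This is an illustration under stated assumptions, not a full MFT parser: the function names are hypothetical, the fix-up stride is assumed to be 512 bytes, and only the first unnamed resident $DATA attribute (type 0x80) is considered.

```python
import hashlib
import struct

SECTOR_SIZE = 512      # assumed fix-up stride
ATTR_DATA = 0x80       # $DATA attribute type
ATTR_END = 0xFFFFFFFF  # end-of-attributes marker


def restore_fixups(record):
    """Verify the USN in each sector and restore the saved bytes."""
    buf = bytearray(record)
    usa_offset, usa_count = struct.unpack_from("<HH", buf, 4)
    usn = buf[usa_offset:usa_offset + 2]
    for i in range(1, usa_count):
        tail = i * SECTOR_SIZE - 2
        if buf[tail:tail + 2] != usn:
            raise ValueError("USN mismatch at end of sector %d" % (i - 1))
        saved = usa_offset + 2 * i
        buf[tail:tail + 2] = buf[saved:saved + 2]
    return buf


def resident_data(record):
    """Return the logical content of the first unnamed resident $DATA
    attribute, or None if the record has none."""
    buf = restore_fixups(record)
    (pos,) = struct.unpack_from("<H", buf, 0x14)  # offset of first attribute
    while pos + 8 <= len(buf):
        attr_type, attr_len = struct.unpack_from("<II", buf, pos)
        if attr_type == ATTR_END or attr_len == 0:
            break
        non_resident, name_len = buf[pos + 8], buf[pos + 9]
        if attr_type == ATTR_DATA and non_resident == 0 and name_len == 0:
            (size,) = struct.unpack_from("<I", buf, pos + 16)  # content length
            (off,) = struct.unpack_from("<H", buf, pos + 20)   # content offset
            return bytes(buf[pos + off:pos + off + size])
        pos += attr_len
    return None


def md5_of_resident_data(record):
    """Hash the logical content, not the raw on-disk bytes."""
    data = resident_data(record)
    return None if data is None else hashlib.md5(data).hexdigest()
```

Comparing this digest with the one a forensic suite reports for the same file is one way to confirm independently that the tool applied the fixup.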

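The fix-up USN should not be confused with the USN Journal ($UsnJrnl). The USN/USA fix-up protects the integrity of individual multi-sector records such as MFT entries, whereas the USN Journal provides a historical timeline of system activity, including actions on deleted files.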

The Update Sequence Number (USN) is a 16-bit unsigned integer (uint16) that serves as an integrity marker in NTFS. It is stored both in the header of multi-sector NTFS structures (such as MFT records) and in the final two bytes of every sector occupied by that structure. When NTFS writes the USN to the end of each sector, it overwrites the original data bytes that previously occupied those positions. Depending on where the sector boundary falls, these overwritten bytes may contain critical information, including portions of filenames, timestamps, attribute data, or any other content. To preserve this original data, NTFS maintains an Update Sequence Array (USA), also known as the fixup array, which stores copies of the bytes displaced by the Update Sequence Numbers. When the operating system reads the structure, it verifies the USN values for consistency and then restores the original bytes from the USA, ensuring the data is presented correctly to applications and processes. The data structures that typically carry a USN and USA are:

$MFT FILE Records
INDX Records for directories and other indexes
$LogFile RCRD Records
$LogFile RSTR Records
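Each of the structures listed above starts with a four-byte signature followed by the update sequence offset and entry count, so a script can recognise them before deciding whether a fixup is needed. The sketch below is illustrative only: the function name and the returned dictionary keys are hypothetical, and the table covers just the four signatures listed here.

```python
import struct

# Signatures of the fix-up protected structures listed above.
FIXUP_SIGNATURES = {
    b"FILE": "$MFT FILE record",
    b"INDX": "INDX record",
    b"RCRD": "$LogFile RCRD record",
    b"RSTR": "$LogFile RSTR record",
}


def describe_fixup_header(buf):
    """Return basic fix-up details for a recognised structure, else None."""
    if len(buf) < 8:
        return None
    kind = FIXUP_SIGNATURES.get(bytes(buf[:4]))
    if kind is None:
        return None
    # Bytes 4-5: offset of the update sequence; bytes 6-7: entry count.
    usa_offset, usa_count = struct.unpack_from("<HH", buf, 4)
    return {
        "kind": kind,
        "usa_offset": usa_offset,
        "protected_sectors": usa_count - 1,  # the first entry is the USN itself
    }
```

A buffer that returns `None` here either belongs to a different structure or has a damaged header, and in both cases applying a fixup blindly would be a mistake.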

Joseph Moronwi
