Deleted File Forensic Recovery In FAT File Systems



 

Data recovery techniques are broadly classified into two categories: logical data recovery and physical data recovery. The appropriate method depends on the nature of data loss.


Physical data recovery is necessary when data is lost due to actual damage to the storage device's hardware components, so that the drive cannot function properly or is completely inaccessible. Common causes include mechanical failure (e.g., a broken read/write head or a failed drive motor), electronic failure (e.g., a faulty circuit board), physical damage (e.g., fire, water, or impact), SSD NAND chip degradation, bad sectors, and platter degradation. The recovery process must be conducted by professionals in a specialized, contaminant-free environment (a "cleanroom"). Technicians may need to repair or replace components to get the drive running just long enough to create a complete image of the data, after which logical techniques are used to extract the files from that image. For SSDs and flash media, the data may instead be extracted by bypassing the faulty hardware through chip-off or JTAG recovery. Attempting DIY physical recovery can cause further, permanent data loss. Physical recovery is beyond the scope of this post and will be treated in a later article.


Logical data recovery is performed when the storage medium itself is physically healthy, but the data it contains is inaccessible due to non-hardware issues. The data loss is typically the result of software problems or user error. Common causes include accidental file deletion, accidental formatting or re-partitioning of a drive, file system corruption (e.g., due to an operating system crash or power interruption), and malware attacks. In these scenarios, the data might still be present on the storage medium, but the pointers or file system entries that the operating system uses to locate the files are damaged or missing. Recovery techniques often involve analyzing and reconstructing file system metadata, scanning for file signatures (file carving), rebuilding damaged partition tables, and recovering deleted entries that have not been overwritten.


Logical Data Recovery

Logical data recovery is commonly divided into two major categories, based on how the lost data is reconstructed: metadata-based (file system-based) recovery and file carving (signature-based recovery). This post focuses on metadata-based recovery, which relies on residual file system metadata, in the FAT file system. From a forensics point of view, however, it is not just deleted files that are of concern. The investigator also needs to be able to recognize the presence of hidden files, disguised files, and invisible files.


When a user deletes a file, the clusters assigned by the file system to that file are marked as free, and the space becomes unallocated. The data itself is not affected and remains intact until one of two things happens. The first is that the file system reuses a free cluster for another file. Even then, if the new data does not fill the cluster completely, it may be possible to extract old data from the part of the cluster that was not overwritten (called file slack space). The second is the use of a wipe utility. Such a utility overwrites the data repeatedly with patterns of 0s and 1s, one pass after another. Some utilities perform as many as 32 passes. The average digital forensic lab will be unable to retrieve any data from clusters subjected to such a wipe. However, some highly specialized facilities have equipment that can extract information from drives on a molecular level. This, however, is far beyond the scope of this post. In this post, I am also assuming that neither of the two scenarios just discussed applies.


 

What is important from a forensic viewpoint is that on deletion of a file, the operating system does not delete the information contained in the clusters; it merely marks them as available for reallocation. It is therefore quite possible to restore a deleted file, provided that its clusters have not been reused. The deleted directory entry will often still contain the first cluster and the file length, which greatly assists the process. Nevertheless, these traditional recovery methods rely on the file system structure present on the storage device, and they become ineffective when that structure is corrupted or damaged, something easily accomplished by a technically astute criminal or disgruntled insider with freely available tools. A more sophisticated recovery approach that does not rely on the file system structure has thus become necessary. Such approaches are collectively known as file carving. File carving recovers data directly from raw disk blocks by identifying file signatures and internal structures, without relying on any file system metadata. This post also assumes that there is no damage to or corruption of the file system.


To understand how file and data recovery work, it is essential to have at least a rudimentary understanding of how the file system manages data in memory and in storage systems.


File Creation and Deletion in FAT File Systems

For simplicity, we will focus on examples involving files located in the root directory. The same principles apply to files in other directories; the only additional step is identifying the directory that contains the target file. This is done by traversing the full file path, starting at the root and moving through each subdirectory until the correct location is reached.


File Creation

When a new file is created, the operating system determines where to place it on the file system to allow efficient access later. It first looks for a large enough block of consecutive unallocated clusters to store the entire file. Over time, however, free space becomes scattered as files are added and removed, and the largest available block may not be sufficient. When this happens, the file must be divided into smaller parts and stored across multiple separate free-space segments. This process results in a fragmented file as shown by the example (file 2015_16.xlsx) in the figure below.


Figure 1: Layout of FAT file system


In a FAT file system, the file-creation process works as follows:


  1. Disk space check:
    When a new file is created, the operating system first checks whether enough free disk space exists to store its contents. If the available space is insufficient, the system returns an “Insufficient disk space” error, and the file is not created. If space is adequate, the required number of clusters is allocated to the file, and these clusters are marked as in use so they cannot be assigned to other files or directories.

  2. Directory entry creation:
    Next, the operating system locates the directory in which the file will reside and creates a directory entry for it. This entry stores key metadata such as the filename, extension, timestamps, file attributes/permissions, file size, and—most importantly—the address of the first cluster of the file.

  3. FAT chain setup:
    Finally, the FAT (File Allocation Table) is updated to build the cluster chain for the file. Each cluster has a corresponding entry in the FAT; for example, FAT entry 2 corresponds to cluster 2. Each FAT entry contains either the number of the next cluster in the file or a special end-of-chain marker (such as 0xFFFFFFF in FAT32) to indicate the final cluster in the sequence. In the figure below, the newly created file file1.dat occupies clusters 8 through 10. The FAT entry for cluster 8 contains the value 9, indicating that the file continues in cluster 9. Likewise, the FAT entry for cluster 9 points to cluster 10. The FAT entry for cluster 10 contains an EOF marker, signifying that cluster 10 is the final cluster assigned to the file.


Figure 2: File creation process in the FAT file system
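The chain lookup described in step 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FAT is modelled as a plain list of integers indexed by cluster number (not read from a real image), FAT32-style values are used, and the names follow_chain and EOC_MIN are made up for this example.

```python
EOC_MIN = 0x0FFFFFF8  # FAT32 values at or above this mark the end of a chain

def follow_chain(fat, first_cluster):
    """Return the clusters of a file by walking its FAT chain."""
    chain = []
    cluster = first_cluster
    while True:
        if cluster in chain:
            raise ValueError("FAT chain loops back on itself")
        chain.append(cluster)
        nxt = fat[cluster]
        if nxt >= EOC_MIN:      # end-of-chain marker: last cluster reached
            return chain
        if nxt < 2:             # 0 = free, 1 = reserved: the chain is broken
            raise ValueError(f"chain broken at cluster {cluster}")
        cluster = nxt

# Toy FAT mirroring the file1.dat example: 8 -> 9 -> 10 -> end of chain
fat = [0] * 16
fat[8], fat[9], fat[10] = 9, 10, 0x0FFFFFFF
print(follow_chain(fat, 8))  # [8, 9, 10]
```

The starting cluster passed in is the value stored in the file's directory entry, which is why that field matters so much later on.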

File Deletion

When a file is deleted in a FAT file system, the operating system simply updates the directory table by replacing the first character of the filename (the first byte of the entry) with the special marker 0xE5. This marks the entry as deleted and tells the system that the slot is available for a new entry, but none of the other fields in the directory entry are modified. The file name is thus rendered “invisible”: it does not show up in directory listings, nor is it visible to applications. However, the remaining filename characters, timestamps, attributes, file size, and even the starting cluster number all remain intact.


At the same time, the FAT entries that make up the file’s cluster chain are cleared (set to zero), indicating that those clusters are now free for allocation. However, the file’s actual data stored in the clusters is not overwritten; the contents of the data area remain on disk until those clusters are eventually reused by new data. As illustrated in Figure 3b, the system still retains the residual information for File.txt after it is deleted (compare Figure 3a). The two main changes are that the filename changes from “File.txt” to “_ile.txt”, reflecting the 0xE5 marker replacing the first character, and that the file’s cluster chain is cleared in the FAT.


Figure 3: File deletion process in FAT file systems. (a) Relationship between the directory entry structures, clusters, and FAT structure before the file is deleted. (b) Relationship between the directory entry structures, clusters, and FAT structure after a file is deleted.
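To make the two changes concrete, here is a small simulation of the deletion step. It is a sketch, not the code of any operating system: the directory entry is a hand-built 32-byte short entry with zeroed timestamps, the FAT is a Python list, and make_entry and delete_file are hypothetical helper names. The values (File.txt, 16,000 bytes, starting cluster 26) follow the example used with Figure 3.

```python
import struct

# 32-byte short directory entry: name(11) attr(1) misc(8)
# first-cluster-high(2) time(2) date(2) first-cluster-low(2) size(4)
ENTRY_FMT = "<11sB8sHHHHI"
EOC = 0x0FFFFFFF

def make_entry(name83, first_cluster, size):
    """Build a minimal directory entry (timestamps left as zero)."""
    return bytearray(struct.pack(
        ENTRY_FMT, name83, 0x20, b"\x00" * 8,
        first_cluster >> 16, 0, 0, first_cluster & 0xFFFF, size))

def delete_file(entry, fat):
    """Mimic FAT deletion: free the chain, then mark the entry with 0xE5."""
    _, _, _, hi, _, _, lo, _ = struct.unpack(ENTRY_FMT, entry)
    cluster = (hi << 16) | lo
    while 2 <= cluster < 0x0FFFFFF8:
        nxt = fat[cluster]
        fat[cluster] = 0          # FAT entry cleared: cluster is now free
        cluster = nxt
    entry[0] = 0xE5               # only the first name byte changes

fat = [0] * 32
fat[26], fat[27], fat[28], fat[29] = 27, 28, 29, EOC
entry = make_entry(b"FILE    TXT", 26, 16000)
delete_file(entry, fat)

fields = struct.unpack(ENTRY_FMT, entry)
print(fields[0])             # name now starts with 0xE5
print(fields[6], fields[7])  # starting cluster and size are untouched
print(fat[26:30])            # chain entries are all zero
```

After the call, the starting cluster and size can still be read from the entry, while the FAT no longer records which clusters followed the first one.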

Deleted File Recovery In FAT File Systems

Recovering deleted files in a FAT file system is often straightforward because most files are stored in contiguous clusters. The basic approach is to scan the file system directory entries and identify those marked as deleted (indicated by the 0xE5 marker). For each such entry, the first character of the filename can be restored to its original value (or even another legal character). Since the directory entry still contains the starting cluster number—and the file was stored in consecutive clusters—the cluster chain can be reconstructed in the FAT table. The detailed steps for this recovery process will be discussed presently.
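The scan for deleted entries can be sketched as follows. The assumptions: the directory has already been read into a bytes object, entries are the standard 32 bytes long, a first byte of 0x00 means no further entries are in use, and entries with attribute 0x0F are long-file-name records that are skipped. find_deleted and _entry are illustrative names, and the sample directory is synthetic.

```python
import struct

ENTRY_SIZE = 32

def find_deleted(directory):
    """Yield (offset, name_tail, first_cluster, size) for deleted entries."""
    for off in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[off:off + ENTRY_SIZE]
        if entry[0] == 0x00:        # no more entries in use after this one
            break
        if entry[11] == 0x0F:       # long-file-name record, not a file entry
            continue
        if entry[0] != 0xE5:        # live entry
            continue
        hi, = struct.unpack_from("<H", entry, 0x14)
        lo, = struct.unpack_from("<H", entry, 0x1A)
        size, = struct.unpack_from("<I", entry, 0x1C)
        name_tail = entry[1:11].decode("ascii", errors="replace")
        yield off, name_tail, (hi << 16) | lo, size

def _entry(name, cluster, size):
    """Build a synthetic entry for the demo (timestamps left as zero)."""
    e = bytearray(ENTRY_SIZE)
    e[0:11] = name
    e[11] = 0x20
    struct.pack_into("<H", e, 0x1A, cluster)
    struct.pack_into("<I", e, 0x1C, size)
    return bytes(e)

directory = (_entry(b"README  TXT", 3, 100)
             + _entry(b"\xe5ULT1   DAT", 8, 3801)
             + bytes(ENTRY_SIZE))
print(list(find_deleted(directory)))  # [(32, 'ULT1   DAT', 8, 3801)]
```

Only the ten name bytes after the marker are reported, since the original first character is no longer stored in the entry.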


This post assumes that the deleted file was stored contiguously on disk, which is the most common case. In such situations, the cluster numbers follow a simple linear sequence (for example, 26, 27, 28, 29 for a four-cluster file beginning at cluster 26). If the deleted file was fragmented, however, recovery becomes much more difficult and calls for more advanced techniques, such as file carving, which are beyond the scope of this post and will be discussed in a subsequent post.


To recover a deleted file, several steps must be followed. Assuming the cluster size is known:


  1. Locate the directory entry:
    Search the directory table that originally contained the deleted file and use the remaining filename information to identify the directory entry associated with it.

  2. Restore the filename:
    In that directory entry, replace the first byte of the filename—currently marked with 0xE5—with its original character or another valid value.

  3. Extract file metadata:
    Next, read the file size and the starting cluster number from the directory entry. Using the file size and the known cluster size, calculate how many clusters were allocated to the file. For example, in Fig. 3a, the file size is 16,000 bytes and the cluster size is 4 KB, so:


Number of clusters = ceil(file size / cluster size) // where ceil() is the ceiling function
= ceil(16000 bytes / 4096 bytes)
= ceil(3.90625)
= 4


Since the starting cluster is 26, we can deduce that clusters 26, 27, 28, and 29 belong to the deleted file, assuming, as before, that the file was stored contiguously.
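The same arithmetic in Python, as a minimal sketch. The function names are illustrative, and the contiguity assumption from the text is built in: the run is simply the starting cluster plus the next few cluster numbers.

```python
def clusters_needed(file_size, cluster_size):
    """Ceiling division: clusters required to hold file_size bytes."""
    return -(-file_size // cluster_size)

def contiguous_run(first_cluster, file_size, cluster_size):
    """Clusters of a file, assuming it was stored without fragmentation."""
    count = clusters_needed(file_size, cluster_size)
    return list(range(first_cluster, first_cluster + count))

print(clusters_needed(16000, 4096))     # 4
print(contiguous_run(26, 16000, 4096))  # [26, 27, 28, 29]
```

Integer ceiling division avoids floating-point rounding, which matters little at these sizes but keeps the result exact for very large files.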


The final step is to reconstruct the file’s cluster chain in the FAT table. Beginning with the starting cluster recorded in the directory entry, each cluster must be linked to the next in sequential order, until the last cluster is reached. The FAT entry for the final cluster is then updated with the end-of-cluster-chain marker (e.g., 0xFFFFFFF in FAT32).


In this example, the FAT entries for clusters 26, 27, 28, and 29 must be updated. Each entry should contain the number of the next cluster in the chain, while the entry for cluster 29 should contain the end-of-chain marker, indicating that it is the last cluster associated with the file.
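A sketch of this relinking step, again with the FAT modelled as a Python list and FAT32-style values. rebuild_chain is a hypothetical helper; the check that every cluster in the run is still free is an added safeguard of this sketch, since a non-zero entry would mean the cluster has been reallocated to another file.

```python
EOC = 0x0FFFFFFF  # FAT32 end-of-chain marker used in this sketch

def rebuild_chain(fat, first_cluster, count, eoc=EOC):
    """Relink `count` consecutive clusters starting at first_cluster."""
    if count < 1:
        raise ValueError("a file with data needs at least one cluster")
    clusters = range(first_cluster, first_cluster + count)
    if any(fat[c] != 0 for c in clusters):
        raise ValueError("a cluster in the run has been reallocated")
    for c in clusters:
        fat[c] = c + 1            # each entry points at the next cluster
    fat[clusters[-1]] = eoc       # the last entry ends the chain

fat = [0] * 32
rebuild_chain(fat, 26, 4)
print(fat[26:30])  # [27, 28, 29, 268435455], the last value being 0x0FFFFFFF
```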


For a practical demonstration, consider the following root directory entries of a FAT16 file system. Looking through them, you will find the entry for the deleted “mult1.dat” file, indicated by the 0xE5 value in the first byte of the file name.




The address of the first cluster of the mult1.dat file can be obtained by concatenating the high two bytes of its first cluster address, stored at byte offsets 0x94-0x95 (0x0000), with the low two bytes, stored at byte offsets 0x9A-0x9B (0x0008). Note that these bytes are stored in little-endian order and must be reversed when read. This gives the value 0x00000008, so the file starts at cluster 8. Likewise, byte offsets 0x9C-0x9F hold the file size in bytes, which is 0x00000ED9 once the byte order is reversed. This gives us a file size of 3801 bytes. Before proceeding with the recovery attempt, below is the output of the TSK fsstat command for the test image.
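The little-endian decoding of the directory entry fields described above can be checked with a short Python sketch. The raw byte strings here are reconstructed from the values quoted in the text (0x0000, 0x0008, and 0x00000ED9), not copied from the hex dump.

```python
# Bytes as they would appear on disk, least significant byte first
raw_high = bytes([0x00, 0x00])              # offsets 0x94-0x95
raw_low = bytes([0x08, 0x00])               # offsets 0x9A-0x9B
raw_size = bytes([0xD9, 0x0E, 0x00, 0x00])  # offsets 0x9C-0x9F

high = int.from_bytes(raw_high, "little")
low = int.from_bytes(raw_low, "little")
first_cluster = (high << 16) | low          # high word, then low word
file_size = int.from_bytes(raw_size, "little")

print(hex(first_cluster), first_cluster)    # 0x8 8
print(hex(file_size), file_size)            # 0xed9 3801
```

On FAT16 the high word is always zero, so the low word alone gives the starting cluster; the concatenation only matters on FAT32.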




In order to recover a deleted file in the FAT file system, we have to modify several areas.  However, in practice, we need to work on a disk image, which could be very large. It is hard to edit the whole image file. In reality, we only need to edit a small disk area. Therefore, a good approach is to locate the area that we will work on and then extract and save it into a small image file. Afterwards, we can edit the small image file and make any necessary changes for file recovery. Once all changes are complete, we can create a new disk partition image by integrating two images: the original image and the modified small image.
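The extract, edit, and reintegrate workflow can also be sketched in Python, as an alternative to command-line imaging tools. Everything here is an assumption for illustration: 512-byte sectors, the helper names extract_sectors and splice_sectors, and a throwaway 64-sector blank file standing in for a real image.

```python
import os
import shutil
import tempfile

SECTOR = 512  # bytes per sector (assumed)

def extract_sectors(image_path, out_path, first_sector, count):
    """Copy `count` sectors starting at `first_sector` into a small file."""
    with open(image_path, "rb") as src, open(out_path, "wb") as dst:
        src.seek(first_sector * SECTOR)
        dst.write(src.read(count * SECTOR))

def splice_sectors(image_path, patch_path, out_path, first_sector):
    """Write a copy of the image with the patch laid over it."""
    shutil.copyfile(image_path, out_path)   # the original is never modified
    with open(patch_path, "rb") as patch, open(out_path, "r+b") as dst:
        dst.seek(first_sector * SECTOR)
        dst.write(patch.read())

# Demo on a blank 64-sector file (not a real file system)
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "image.dd")
    piece = os.path.join(tmp, "piece.dd")
    result = os.path.join(tmp, "result.dd")
    with open(image, "wb") as f:
        f.write(bytes(64 * SECTOR))
    extract_sectors(image, piece, 8, 1)
    with open(piece, "r+b") as f:           # stand-in for a hex-editor change
        f.write(b"\xAA\xBB")
    splice_sectors(image, piece, result, 8)
    with open(result, "rb") as f:
        patched = f.read()
    with open(image, "rb") as f:
        original = f.read()

print(patched[8 * SECTOR:8 * SECTOR + 2].hex())  # aabb
```

Writing the result to a new file rather than patching in place keeps the evidence image untouched, which is the point of working on extracted pieces in the first place.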


Suppose we want to work on FAT 0 (sectors 8-31) to restore the cluster chain of the mult1.dat file. We can extract its first sector with the dcfldd utility as follows:



Where “fat0.dd” is the file storing the extracted sector. Note that the size of “fat0.dd” is only one sector, which is very small compared with the original file system image.  The following is a snippet of the first sector of FAT 0.



Now, we have to put back the cluster chain, which was wiped out during the file deletion. Since the mult1.dat file has 3801 bytes and the cluster size is 1024 bytes, the number of clusters allocated to “mult1.dat” is therefore:


ceil (3801 / 1024) = 4
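For reference, the bytes that need to appear in the FAT sector can be computed rather than worked out by hand. This sketch assumes FAT16 (2-byte little-endian entries, so the entry for cluster n sits at byte offset 2n), a starting cluster of 8 (the 0x0008 read from the directory entry), four contiguous clusters, and 0xFFFF as the end-of-chain value. fat16_chain_patch is a made-up name, and the sector is a blank buffer rather than the real fat0.dd.

```python
import struct

FAT16_EOC = 0xFFFF  # end-of-chain value used in this sketch

def fat16_chain_patch(fat_sector, first_cluster, count):
    """Write a contiguous chain into a FAT16 sector buffer, in place."""
    for i in range(count):
        cluster = first_cluster + i
        value = FAT16_EOC if i == count - 1 else cluster + 1
        struct.pack_into("<H", fat_sector, cluster * 2, value)

fat_sector = bytearray(512)
fat16_chain_patch(fat_sector, 8, 4)
print(fat_sector[16:24].hex())  # 09000a000b00ffff
```

In other words, byte offsets 0x10-0x17 of the sector hold the entries for clusters 8 through 11: three links to the next cluster followed by the end-of-chain marker.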


Further, it is worth noting that in this exercise, we only consider a scenario where a file is stored in a hard disk contiguously and without fragmentation. Then, we can use a hex editor to edit the “fat0.dd” file as follows.  



Next, suppose we want to work on the root directory to change the first character of the file name. We can extract the first sector of the root directory (which occupies sectors 56-87) as follows:



The following is the snippet of the first sector of root directory.




Then, we can use ghex to edit rootdir.dd, a very small dd file. Change “0xe5” located at byte offset 0x80 to “0x4D”, which is “M”.
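The same one-byte change can be expressed in Python. This is a sketch on a synthetic sector: it assumes the deleted entry starts at offset 0x80 of the directory sector (consistent with the field offsets 0x94, 0x9A, and 0x9C quoted earlier), and restore_first_char is an illustrative name.

```python
def restore_first_char(dir_sector, entry_offset, char):
    """Replace the 0xE5 deletion marker with the given first character."""
    if dir_sector[entry_offset] != 0xE5:
        raise ValueError("entry is not marked as deleted")
    dir_sector[entry_offset] = ord(char)

# Synthetic directory sector with a deleted entry at offset 0x80
dir_sector = bytearray(512)
dir_sector[0x80:0x8B] = b"\xe5ULT1   DAT"
restore_first_char(dir_sector, 0x80, "M")
print(bytes(dir_sector[0x80:0x8B]))  # b'MULT1   DAT'
```

Checking for the 0xE5 marker before writing guards against patching the wrong offset.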



Finally, it is time to reassemble the images we’ve already made changes to. Note that there are a total of 12,032 sectors (Total Range: 0–12,031) in this FAT file system.



Where “fatimage.dd” is the original file system image after file deletion and “recover.dd” is the resultant image following the successful recovery of the deleted “mult1.dat” file. To verify that our file recovery effort was indeed successful, we can use the TSK fls command to list the files in the forensic images before and after recovery as follows.



The star (*)  in front of a file name shows that it is deleted, and the first letter is missing because the first letter of the name is used to set the unallocated status. The number before any file name shows the address of the directory entry where the details can be found.

