Windows PE Forensics - The Section Table and Sections




Section Table

The Section Table follows immediately after the PE header (that is, right after the optional header). It is an array of IMAGE_SECTION_HEADER structures, each containing information about one section in the PE file, such as its attributes and virtual offset. The number of entries in the section table equals the number of sections in the image file and is given by the NumberOfSections field in the PE header. The section table is sometimes referred to by the name of its entry structure, IMAGE_SECTION_HEADER. Each header structure is 40 bytes long and there is no padding between them. The structure is defined in winnt.h thus:

IMAGE_SECTION_HEADER STRUCT
    Name			BYTE[8]
    union Misc
        PhysicalAddress		DWORD
        VirtualSize		DWORD
    ends
    VirtualAddress		DWORD
    SizeOfRawData		DWORD
    PointerToRawData		DWORD
    PointerToRelocations	DWORD
    PointerToLineNumbers	DWORD
    NumberOfRelocations		WORD
    NumberOfLineNumbers		WORD
    Characteristics		DWORD
IMAGE_SECTION_HEADER ends

The entries are explained in the table below. Take note that the offset is relative to the start of each entry.

| Offset | Size | Field Name | Description | Possible Values |
|---|---|---|---|---|
| 00h | QWORD (8 bytes) | Name | The name of the section. It is an 8-byte, null-padded UTF-8 string. There is no terminating null character if the string is exactly 8 characters. | .text, .data |
| 08h | DWORD | Misc | A union of PhysicalAddress and VirtualSize. VirtualSize is the total size of the section when loaded into memory, in bytes. If this value is greater than the SizeOfRawData entry, the rest of the section is filled with zeroes. This field is used only by executables. | |
| 0Ch | DWORD | VirtualAddress | The address of the first byte of the section when loaded into memory, relative to the image base. | |
| 10h | DWORD | SizeOfRawData | The size of the initialized data on disk, in bytes. This value must be a multiple of the FileAlignment entry of the IMAGE_OPTIONAL_HEADER structure. If this value is less than the VirtualSize entry, the remainder of the section is filled with zeroes. If the section contains only uninitialized data, the entry is zero. | |
| 14h | DWORD | PointerToRawData | A file pointer to the first page of the section within the COFF file. This value must be a multiple of the FileAlignment entry of the IMAGE_OPTIONAL_HEADER structure. If the section contains only uninitialized data, the entry is zero. | |
| 18h | DWORD | PointerToRelocations | A file pointer to the beginning of the relocation entries for the section. If there are no relocations, the value is zero. | |
| 1Ch | DWORD | PointerToLineNumbers | A file pointer to the beginning of the line-number entries for the section. If there are no COFF line numbers, this value is zero. | |
| 20h | WORD | NumberOfRelocations | The number of relocation entries for the section. This value is zero for executable images. | |
| 22h | WORD | NumberOfLineNumbers | The number of line-number entries for the section. | |
| 24h | DWORD | Characteristics | The flags that describe the characteristics of the section. Table 9-5 contains the flags and their meanings. | |



It is important to note that the section names have only eight characters reserved for them. If a section name is longer than eight characters, the Name field instead contains a forward slash (/) followed by an ASCII representation of a decimal number that is an offset into the string table. Such long names are found only in non-executable files (object files, for example). Executable files only support section names up to eight characters, and they do not use a string table.
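To make the layout above a little more concrete, here is a minimal Python sketch that unpacks one 40-byte entry with the standard struct module. The names SECTION_HEADER, SectionHeader and parse_section_header are made up for this example, and the sample bytes are hand-built rather than taken from a real file, so treat it as an illustration and not as a full parser.

```python
import struct
from collections import namedtuple

# One IMAGE_SECTION_HEADER: 8-byte name, six DWORDs, two WORDs, one DWORD.
# "<" selects little-endian with no padding, so the size is exactly 40 bytes.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

SectionHeader = namedtuple(
    "SectionHeader",
    "name virtual_size virtual_address size_of_raw_data pointer_to_raw_data "
    "pointer_to_relocations pointer_to_line_numbers "
    "number_of_relocations number_of_line_numbers characteristics",
)

def parse_section_header(data, offset=0):
    """Unpack one section header from data at offset (hypothetical helper)."""
    fields = SECTION_HEADER.unpack_from(data, offset)
    # The name is null-padded and has no terminator when it is exactly 8 bytes.
    name = fields[0].rstrip(b"\x00").decode("utf-8", errors="replace")
    return SectionHeader(name, *fields[1:])

# Hand-built sample entry for a .text section (illustrative values only).
sample = SECTION_HEADER.pack(
    b".text", 0x1A2C, 0x1000, 0x1C00, 0x400, 0, 0, 0, 0, 0x60000000
)
header = parse_section_header(sample)
print(header.name, hex(header.virtual_address), hex(header.characteristics))
```

A name that starts with a forward slash would come back from this helper unchanged (for example "/4"); resolving it against the string table is left out here.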

The last field of each section header describes the characteristics of the section. It is aptly called the Characteristics field. The table below describes the flag values of this field.


| Flag | Characteristic Name | Meaning |
|---|---|---|
| 00300000h | IMAGE_SCN_ALIGN_4BYTES | Align data on a 4-byte boundary. This is valid only for object files. |
| 00400000h | IMAGE_SCN_ALIGN_8BYTES | Align data on an 8-byte boundary. This is valid only for object files. |
| 00500000h | IMAGE_SCN_ALIGN_16BYTES | Align data on a 16-byte boundary. This is valid only for object files. |
| 00600000h | IMAGE_SCN_ALIGN_32BYTES | Align data on a 32-byte boundary. This is valid only for object files. |
| 00700000h | IMAGE_SCN_ALIGN_64BYTES | Align data on a 64-byte boundary. This is valid only for object files. |
| 00800000h | IMAGE_SCN_ALIGN_128BYTES | Align data on a 128-byte boundary. This is valid only for object files. |
| 00900000h | IMAGE_SCN_ALIGN_256BYTES | Align data on a 256-byte boundary. This is valid only for object files. |
| 00A00000h | IMAGE_SCN_ALIGN_512BYTES | Align data on a 512-byte boundary. This is valid only for object files. |
| 00B00000h | IMAGE_SCN_ALIGN_1024BYTES | Align data on a 1,024-byte boundary. This is valid only for object files. |
| 00C00000h | IMAGE_SCN_ALIGN_2048BYTES | Align data on a 2,048-byte boundary. This is valid only for object files. |
| 00D00000h | IMAGE_SCN_ALIGN_4096BYTES | Align data on a 4,096-byte boundary. This is valid only for object files. |
| 00E00000h | IMAGE_SCN_ALIGN_8192BYTES | Align data on an 8,192-byte boundary. This is valid only for object files. |
| 01000000h | IMAGE_SCN_LNK_NRELOC_OVFL | This section contains extended relocations. The count of relocations for the section exceeds the 16 bits reserved for it in the section header. If the NumberOfRelocations field in the section header is 0xFFFF, the actual relocation count is stored in the VirtualAddress field of the first relocation. It is an error if IMAGE_SCN_LNK_NRELOC_OVFL is set and there are fewer than 0xFFFF relocations in the section. |
| 02000000h | IMAGE_SCN_MEM_DISCARDABLE | The section can be discarded as needed. |
| 04000000h | IMAGE_SCN_MEM_NOT_CACHED | The section cannot be cached. |
| 08000000h | IMAGE_SCN_MEM_NOT_PAGED | The section cannot be paged. |
| 10000000h | IMAGE_SCN_MEM_SHARED | The section can be shared in memory. |
| 20000000h | IMAGE_SCN_MEM_EXECUTE | The section can be executed as code. |
| 40000000h | IMAGE_SCN_MEM_READ | The section can be read. |
| 80000000h | IMAGE_SCN_MEM_WRITE | The section can be written to. |
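As a quick way to read a Characteristics value by hand, the flags in the table can be decoded with a few lines of Python. This is only a sketch: describe_characteristics is a name I made up, and it covers just the flags listed above. One detail worth noting is that the IMAGE_SCN_ALIGN_* values are not independent bits but a 4-bit code (mask 00F00000h), so they need to be decoded separately from the single-bit flags.

```python
# Single-bit flags taken from the table above.
BIT_FLAGS = {
    0x01000000: "IMAGE_SCN_LNK_NRELOC_OVFL",
    0x02000000: "IMAGE_SCN_MEM_DISCARDABLE",
    0x04000000: "IMAGE_SCN_MEM_NOT_CACHED",
    0x08000000: "IMAGE_SCN_MEM_NOT_PAGED",
    0x10000000: "IMAGE_SCN_MEM_SHARED",
    0x20000000: "IMAGE_SCN_MEM_EXECUTE",
    0x40000000: "IMAGE_SCN_MEM_READ",
    0x80000000: "IMAGE_SCN_MEM_WRITE",
}

# The alignment values occupy a 4-bit field rather than separate bits.
ALIGN_MASK = 0x00F00000

def describe_characteristics(value):
    """Return the names of the flags set in value (hypothetical helper)."""
    names = [name for bit, name in BIT_FLAGS.items() if value & bit]
    align_code = (value & ALIGN_MASK) >> 20
    if 1 <= align_code <= 0xE:
        # Code n means 2**(n-1) bytes: 3 -> 4 bytes, 0xE -> 8192 bytes.
        names.append("IMAGE_SCN_ALIGN_%dBYTES" % (1 << (align_code - 1)))
    return names

print(describe_characteristics(0x60000000))  # a typical executable code section
print(describe_characteristics(0xC0300000))  # readable, writable, 4-byte aligned
```

The flags that live in the low bits of the field (the content-type flags, for instance) are not in the table above and are therefore ignored by this sketch.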


Note: In the section table, the sections are sorted according to their relative virtual address rather than alphabetically. 

After the section headers we find the sections themselves.
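If you want to find the section table yourself rather than rely on a tool, the arithmetic is short. The sketch below assumes the usual header layout (e_lfanew at offset 3Ch of the DOS header, a 4-byte PE signature, then a 20-byte file header holding NumberOfSections and SizeOfOptionalHeader); locate_section_table is a hypothetical helper name and the buffer at the end is synthetic, built only to exercise the function.

```python
import struct

def locate_section_table(data):
    """Return (file offset of the first section header, number of sections)."""
    # e_lfanew, at offset 3Ch of the DOS header, points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    file_header = e_lfanew + 4
    # The file header is 20 bytes long: NumberOfSections sits at +2 and
    # SizeOfOptionalHeader at +16.
    (number_of_sections,) = struct.unpack_from("<H", data, file_header + 2)
    (size_of_optional_header,) = struct.unpack_from("<H", data, file_header + 16)
    # The section table starts right after the optional header.
    table_offset = file_header + 20 + size_of_optional_header
    return table_offset, number_of_sections

# Synthetic header block, just enough to exercise the function.
buf = bytearray(0x200)
buf[0:2] = b"MZ"
struct.pack_into("<I", buf, 0x3C, 0x80)          # e_lfanew
buf[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<H", buf, 0x84 + 2, 3)         # NumberOfSections
struct.pack_into("<H", buf, 0x84 + 16, 0xE0)     # SizeOfOptionalHeader
table_offset, count = locate_section_table(bytes(buf))
print(hex(table_offset), count)
```

Since every entry is 40 bytes with no padding, header number i (counting from zero) starts at table_offset + 40 * i.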


Sections

The sections contain the main content of the file, including code, data, resources, and other executable information. Sections, in general, are simply blocks of data, which can have certain attributes as described in the section table. A section can be code, data, or a combination of both. What the contents of a given section have in common is their attributes.

Each section has a unique name. The name is there to describe what the section is. For example, a section named CODE can represent the code section, while a section named .rdata can represent a read-only data section. By default, the Borland linker uses section names such as CODE and DATA, while Microsoft prefixes section names with a period, such as .text and .rdata. It is important to remember that the PE loader and Windows itself do not care what a section is called. The names are there for us humans. A developer is free to name the sections of her programs as she sees fit, as long as the name does not go over the eight-character limit.

Each section has a header and a body (the raw data). The section headers are contained in the Section Table but section bodies lack a rigid file structure. They can be organized almost any way a linker wishes to organize them, as long as the header is filled with enough information to be able to decipher the data.

Note: A typical PE file has at least two sections: one for code and the other for data. The loader does not strictly require this, though, and hand-crafted or packed files may get by with fewer.

As you look at different binaries, you will see different section names. Some make sense and follow the standard naming, while some will have names that are hard to comprehend. To prepare you, it is good to be familiar with common sections. The following is a short list of common sections. Take note that, apart from the Borland-style CODE, these are section names produced by Microsoft compilers/linkers.

  • .text or CODE
  • .data
  • .bss
  • .CRT
  • .rsrc
  • .idata
  • .edata
  • .reloc
  • .tls
  • .rdata
  • .debug



In the file on disk, each section starts at an offset that is some multiple of the FileAlignment value found in OptionalHeader. Between each section's data there will be 00 byte padding.

When loaded into RAM, the sections always start on a page boundary so that the first byte of each section corresponds to a memory page. On x86 CPUs pages are 4kB aligned, whilst on IA-64, they are 8kB aligned. This alignment value is stored in SectionAlignment also in OptionalHeader.

For example, if the headers (including the section table, which follows the optional header) end at file offset 981 and FileAlignment is 512, the first section will start at byte 1024, the next multiple of 512. Note that you can find the sections via the PointerToRawData or the VirtualAddress fields of their headers, so there is no need to work out the alignment yourself.
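The rounding in that example is easy to check. Here is a small Python sketch; align_up is a name I picked for the illustration, and the second call uses made-up numbers to show the same rule applied to SectionAlignment.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

# Headers ending at offset 981 with FileAlignment = 512.
print(align_up(981, 512))

# The same rule in memory: a section of 1A2Ch bytes with SectionAlignment = 1000h
# occupies this many bytes of address space.
print(hex(align_up(0x1A2C, 0x1000)))
```

A value that is already aligned is returned unchanged, which is why the formula adds alignment - 1 and not alignment.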

In the figure above, the Import Data Section (.idata) will start at offset 0002AC00h (highlighted pink, NB reverse byte order) from the start of the file. Its size, given by the DWORD before, will be 1A00h bytes. 


Executable Code

This section contains all the instructions executed by the program. In Windows NT all code segments reside in a single section called .text or CODE. Since Windows NT uses a page-based virtual memory management system, having one large code section is easier to manage for both the operating system and the application developer. This section also contains the entry point and the jump thunk table (where present) which points to the Import Address Table.

Data

The .bss section represents uninitialized data for the application, including all variables declared as static within a function or source module.

The .rdata section contains read-only data such as literal strings, constants, and the debug directory, which can be found only in EXE files. As defined by Microsoft, the debug directory is an array of IMAGE_DEBUG_DIRECTORY structures. These structures hold information about the type, size, and location of the various types of debug information stored in the file. 

All other variables (except automatic variables, which appear on the stack) are stored in the .data section. These are application or module global variables.

The .CRT is a weird one because it contains initialized data as well. It is a mystery why the data contained in the .CRT section is not joined with the data in the .data section.

Resources

The .rsrc section contains resource information used by the program. This section begins with a resource directory structure called IMAGE_RESOURCE_DIRECTORY. It contains the following information:

  • Characteristics
  • TimeDateStamp
  • MajorVersion
  • MinorVersion
  • NumberOfNamedEntries
  • NumberOfIdEntries

The first 16 bytes comprise a header like most other sections, but this section's data is further structured into a resource tree which is best viewed using a resource editor. A good one is Resource Hacker.



This is a powerful tool for cracking purposes as it will quickly display dialog boxes including those concerning incorrect registration details or nag screens. A shareware app can often be cracked just by deleting the nagscreen dialog resource in Resource Hacker. 
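For those who prefer to look at the raw bytes, the 16-byte IMAGE_RESOURCE_DIRECTORY header listed earlier can be unpacked with a short Python sketch. The field widths used here (two DWORDs followed by four WORDs) are the ones I know from winnt.h; the helper name and the sample bytes are invented for the example.

```python
import struct
from collections import namedtuple

# Two DWORDs followed by four WORDs: 16 bytes in total.
RESOURCE_DIRECTORY = struct.Struct("<IIHHHH")

ResourceDirectory = namedtuple(
    "ResourceDirectory",
    "characteristics time_date_stamp major_version minor_version "
    "number_of_named_entries number_of_id_entries",
)

def parse_resource_directory(data, offset=0):
    """Unpack the directory header and report how many entries follow it."""
    directory = ResourceDirectory(*RESOURCE_DIRECTORY.unpack_from(data, offset))
    entry_count = directory.number_of_named_entries + directory.number_of_id_entries
    first_entry_offset = offset + RESOURCE_DIRECTORY.size
    return directory, first_entry_offset, entry_count

# Hand-built sample: version 4.0, one named entry and five ID entries.
sample = RESOURCE_DIRECTORY.pack(0, 0, 4, 0, 1, 5)
directory, first_entry_offset, entry_count = parse_resource_directory(sample)
print(directory.major_version, first_entry_offset, entry_count)
```

The directory entries that follow the header, and the tree they form, are not handled here; a resource editor remains the comfortable way to browse them.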


Export Data

The .edata section contains the list of functions and data that the program exports or makes available to other programs or modules. Take note that the .edata section typically appears only in DLL files because there is rarely a reason for EXE files to export functions to other programs. When present, this section contains information about the names and addresses of exported functions.


Import Data

The .idata section contains function and data information that the program imports from other dynamic link libraries (DLLs) including the Import Directory and the Import Address Table (IAT). Each function that a program imports is specifically listed in this section. 

Debug Information

Debug information is initially placed in the .debug section. The PE file format also supports separate debug files (normally identified with a .DBG extension) as a means of collecting debug information in a central location. The debug section contains the debug information, but the debug directories live in the .rdata section mentioned earlier. Each of those directories references debug information in the .debug section.

Note: It is important to remember that TLINK32 EXEs put the debug directory in the .debug section and not the .rdata section. So if you cannot find the debug directory in the .rdata section, look for it in the .debug section.


Thread Local Storage

The .tls section contains data that was defined using the compiler directive __declspec(thread). The .tls section got its name from TLS, the acronym for thread local storage. When it comes to dealing with the .tls section, Microsoft explains it best:

"The .tls section is related to the TlsAlloc family of Win32 functions. When dealing with a .tls section, the memory manager sets up the page tables so that whenever a process switches threads, a new set of physical memory pages is mapped to the .tls section’s address space. This permits per-thread local variables. In most cases, it is much easier to use this mechanism than to allocate memory on a per-thread basis and store its pointer in a TlsAlloc’ed slot.

There’s one unfortunate note that must be added about the .tls section and __declspec(thread) variables. In Windows NT and Windows 95, this thread local storage mechanism won’t work in a DLL if LoadLibrary loads the DLL dynamically. In an EXE or an implicitly loaded DLL, everything works fine. If you can’t implicitly link to the DLL but need per-thread data, you’ll have to fall back to using TlsAlloc and TlsGetValue with dynamically allocated memory."


Base Relocations

The .reloc section contains a table of base relocations. As Microsoft puts it, a base relocation is an adjustment to an instruction or initialized variable value that’s needed if the PE loader could not load the file where the linker assumed it would. If the PE loader is able to load the image at the linker’s preferred base address, the PE loader completely ignores the relocation information in the .reloc section. 

When the linker creates an EXE file, it makes an assumption about where the file will be mapped into memory. Based on this, the linker puts the real addresses of code and data items into the executable file. If for whatever reason the executable ends up being loaded somewhere else in the virtual address space, the addresses the linker plugged into the image are wrong. The information stored in the .reloc section allows the PE loader to fix these addresses in the loaded image so that they're correct again. On the other hand, if the loader was able to load the file at the base address assumed by the linker, the .reloc section data isn't needed and is ignored.

The entries in the .reloc section are called base relocations since their use depends on the base address of the loaded image. Base relocations are simply a list of locations in the image that need a value added to them. The format of the base relocation data is somewhat quirky. The base relocation entries are packaged in a series of variable length chunks. Each chunk describes the relocations for one 4KB page in the image.
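The text above does not spell out what a chunk looks like, so the following Python sketch rests on my own assumption of the usual layout from winnt.h: each chunk starts with two DWORDs (the RVA of the page and the size of the whole chunk in bytes), followed by 16-bit entries whose top 4 bits are a relocation type and whose low 12 bits are an offset within the page. The helper name and the sample chunk are made up.

```python
import struct

def parse_relocation_block(data, offset=0):
    """Parse one chunk; return ([(type, rva), ...], offset of the next chunk)."""
    # Assumed chunk header: page RVA, then the size of the chunk including this header.
    page_rva, block_size = struct.unpack_from("<II", data, offset)
    count = (block_size - 8) // 2
    entries = struct.unpack_from("<%dH" % count, data, offset + 8)
    relocations = []
    for entry in entries:
        reloc_type = entry >> 12          # top 4 bits
        page_offset = entry & 0x0FFF      # low 12 bits: position within the 4KB page
        if reloc_type == 0:
            continue                      # type 0 is padding and is skipped
        relocations.append((reloc_type, page_rva + page_offset))
    return relocations, offset + block_size

# Sample chunk for the page at RVA 2000h: one type-3 entry at page offset 134h,
# plus one padding entry to keep the chunk a multiple of 4 bytes.
chunk = struct.pack("<IIHH", 0x2000, 12, 0x3134, 0x0000)
relocations, next_offset = parse_relocation_block(chunk)
print([(t, hex(rva)) for t, rva in relocations], next_offset)
```

The 12-bit offset is what ties each chunk to a single 4KB page: 12 bits can address exactly 4096 positions.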

For example, suppose an executable file is linked assuming a base address of 10000h. At offset 2134h within the image is a pointer containing the address of a string. The string starts at address 14002h, so the pointer contains the value 14002h. You then load the file, but the loader decides that it needs to map the image starting at address 60000h. The difference between the actual load address and the linker-assumed base address is called the delta. In this case, the delta is 50000h. Since the entire image is 50000h bytes higher in memory, so is the string (now at address 64002h). The pointer to the string is now incorrect. The executable file contains a base relocation for the memory location where the pointer to the string resides. To resolve a base relocation, the loader adds the delta value to the original value at the base relocation address. In this case, the loader would add 50000h to the original pointer value (14002h), and store the result (64002h) back into the pointer's memory. Since the string really is at 64002h, everything is fine.
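The same fix-up can be replayed in a few lines of Python. This is a toy model, not what the loader actually runs: the image is a plain bytearray, the pointer is assumed to be 32 bits wide, and apply_relocation is a name invented for the example.

```python
import struct

def apply_relocation(image, offset, preferred_base, actual_base):
    """Add the load delta to the 32-bit pointer stored at offset in image."""
    delta = actual_base - preferred_base
    (old_value,) = struct.unpack_from("<I", image, offset)
    new_value = (old_value + delta) & 0xFFFFFFFF   # keep the result within 32 bits
    struct.pack_into("<I", image, offset, new_value)
    return old_value, new_value

# Toy image: the pointer at offset 2134h holds the linker-assumed address 14002h.
image = bytearray(0x5000)
struct.pack_into("<I", image, 0x2134, 0x14002)

old_value, new_value = apply_relocation(image, 0x2134, 0x10000, 0x60000)
print(hex(old_value), hex(new_value))
```

When the actual base equals the preferred base the delta is zero and the pointer is left as it was, which matches the point made earlier that the relocation data can be ignored in that case.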


Relative Virtual Address (RVA)

You first encountered the relative virtual address while I was discussing the PE header in previous posts, and you will encounter it some more as I discuss the PE file and as you analyse malware. So, what is a relative virtual address?

To understand what it is, you first need to know what a Virtual Address (VA) space is. As defined by Microsoft, a Virtual Address space is a set of virtual memory addresses that a process can use. A virtual address does not represent the actual physical location of an object in memory. Instead, the system maintains a page table for each process, which is an internal data structure used to translate virtual addresses into their corresponding physical addresses. Each time a thread references an address, the system translates the virtual address to a physical address.

Note: The virtual address space for 32-bit Windows is 4GB, while for 64-bit Windows, the default user-mode space was 8TB on older versions and is 128TB on newer ones.

The Relative Virtual Address is simply a distance from a reference point in the virtual address space. A similar concept is a file offset. The file offset describes the location of something relative to the start of the file, while the relative virtual address describes the location of something relative to a point in the virtual address space. To illustrate further, let’s take a PE file that usually loads at 400000h virtual address, and let’s say that the start of the PE file’s .text or code section is at 401000h. From this, the RVA of the code section is 1000h because that is where it is relative to the loading location of the file in the virtual address space. The formula for this is simply as follows:

RVA = Target Address – Load Address 
RVA = 401000h – 400000h
RVA = 1000h 


To convert the RVA to the actual address, which is the target address, simply reverse the process by adding the load address to the RVA.

Note: The virtual address is simply an RVA with the HMODULE added in. HMODULE is the same as the load address.
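Both conversions, plus the related step of turning an RVA into a file offset using a section's VirtualAddress and PointerToRawData, can be sketched in Python as follows. The function names and the one-entry section list are my own, chosen only for illustration.

```python
def va_to_rva(target_address, load_address):
    """RVA = Target Address - Load Address."""
    return target_address - load_address

def rva_to_va(rva, load_address):
    """The reverse: add the load address back to the RVA."""
    return load_address + rva

def rva_to_file_offset(rva, sections):
    """Map an RVA to a file offset via the section that contains it.

    sections holds (VirtualAddress, VirtualSize, PointerToRawData, SizeOfRawData)
    tuples. Returns None when the RVA is outside every section or falls in the
    zero-filled part of a section that has no bytes on disk.
    """
    for virtual_address, virtual_size, pointer_to_raw_data, size_of_raw_data in sections:
        span = max(virtual_size, size_of_raw_data)
        if virtual_address <= rva < virtual_address + span:
            delta = rva - virtual_address
            if delta >= size_of_raw_data:
                return None
            return pointer_to_raw_data + delta
    return None

print(hex(va_to_rva(0x401000, 0x400000)))
print(hex(rva_to_va(0x1000, 0x400000)))

# Illustrative section: RVA 1000h, 1A2Ch bytes in memory, stored at file offset 400h.
sections = [(0x1000, 0x1A2C, 0x400, 0x1C00)]
print(hex(rva_to_file_offset(0x1010, sections)))
```

The file-offset lookup shows why RVAs and file offsets should not be mixed up: the same byte sits at 1010h relative to the load address but at 410h in the file on disk.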

