These functions themselves might be considered suspicious; therefore, it is better to implement a custom version of these functions by parsing the PE structure manually to resolve functions. We will go over this in the following section.
Custom implementation
In this section, we will cover how to manually resolve a function. As with the GetModuleHandle and GetProcAddress functions, we will need to know the name of the function and the DLL the function is exported by. Our implementation uses the actual name of the function or DLL. However, there are other implementations out there that use hashes of the DLL and function names instead. This has the advantage that these implementations do not ship the function names in their executable, which might be suspicious. To archive a similar effect, we chose to encrypt the strings in the code used by our loader instead of using a hash.
Custom GetModuleHandle()
The first step is to resolve the loaded module using the DLL name. For this, we will first take a look at the Thread Environment Block (TEB), which is stored in the GS register on 64bit systems. At offset 0x60 there is a pointer to the Process Environment Block (PEB) located in the TEB.
typedef struct _TEB {
PVOID Reserved1[12];
PPEB ProcessEnvironmentBlock;
[...]
} TEB, *PTEB;
In the PEB we will find a pointer to a PEB_LDR_DATA structure:
typedef struct _PEB {
BYTE Reserved1[2];
BYTE BeingDebugged;
BYTE Reserved2[1];
PVOID Reserved3[2];
PPEB_LDR_DATA Ldr;
[..]
} PEB, *PPEB;
This structure then contains a list of the modules that are loaded by the current process:
typedef struct _PEB_LDR_DATA {
BYTE Reserved1[8];
PVOID Reserved2[3];
LIST_ENTRY InMemoryOrderModuleList;
} PEB_LDR_DATA, *PPEB_LDR_DATA;
The LIST_ENTRY structure is a doubly linked list, which is defined as follows:
typedef struct _LIST_ENTRY {
struct _LIST_ENTRY *Flink;
struct _LIST_ENTRY *Blink;
} LIST_ENTRY, *PLIST_ENTRY, *RESTRICTED_POINTER PRLIST_ENTRY;
Each of these LIST_ENTRY structs is part of an LDR_DATA_TABLE_ENTRY. The structure provided by Microsoft is as follows:
typedef struct _LDR_DATA_TABLE_ENTRY {
PVOID Reserved1[2];
LIST_ENTRY InMemoryOrderLinks;
PVOID Reserved2[2];
PVOID DllBase;
PVOID EntryPoint;
PVOID Reserved3;
UNICODE_STRING FullDllName;
BYTE Reserved4[8];
PVOID Reserved5[3];
union {
ULONG CheckSum;
PVOID Reserved6;
};
ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;
However, we can find a more complete structure in the ProcessHacker source code. Here we see, that directly after the FullDllName there is also a BaseDllName. Our understanding is, that the FullDllName should include the full path, while the BaseDllName does not and therefore, the BaseDllName is more convenient for our use case.
We can compare the BaseDllName to the module we are searching for and return the DllBase field if we find our DLL. If we end at the LIST_ENTRY structure we initially found in the PEB, then we have looked at all modules without finding the target DLL and should return NULL to indicate that we have not found the module.
Custom GetProcAddress()
With the handle to our module, we then can resolve an actual function as GetProcAddress would do. Again, we will traverse several different structures to find the relevant fields. The first structure we will look at is the IMAGE_DOS_HEADER structure. The definition can e.g. be found in the ReactOS source code:
typedef struct _IMAGE_DOS_HEADER {
[..]
LONG e_lfanew; // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
The last field here is named e_lfanew and contains the offset to the IMAGE_NT_HEADERS structure, which we need to look at next. Our understanding is that the IMAGE_DOS_HEADER structure is a legacy structure and for most purposes, we will move on to the IMAGE_NT_HEADER. The definition for this structure looks as follows:
typedef struct _IMAGE_NT_HEADERS64 {
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;
Of interest to us is the OptionalHeader field. The definition looks as follows:
typedef struct _IMAGE_OPTIONAL_HEADER64 {
[...]
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;
Here we specifically want to look at the DataDirectory field, which is the last field. The definition looks as follows:
typedef struct _IMAGE_DATA_DIRECTORY {
DWORD VirtualAddress;
DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
This is an array and there are multiple entries here that are at specific offsets. The offset that is of interest to us is IMAGE_DIRECTORY_ENTRY_EXPORT, which contains the exported functions. The value contained here is an offset from the base address of our module. Using the base address and this offset, we can find the IMAGE_EXPORT_DIRECTORY structure for which ReactOS again has a definition:
typedef struct IMAGE_EXPORT_DIRECTORY {
[...]
DWORD NumberOfFunctions;
DWORD NumberOfNames;
DWORD AddressOfFunctions;
DWORD AddressOfNames;
DWORD AddressOfNameOrdinals;
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;
The AddressOfNames, AddressOfNameOrdinal, and AddressOfFunctions fields are again an offset from the base address of the module. This is also called a Relative Virtual address (RVA). The AddressOfNames field points to an array containing the function names of the exported functions. The NumberOfNames field contains the number of function names that are contained in this array. We can iterate over these names and compare them to the name of the function we are searching for. If we find our function, we can then use the offset we found the name at to locate the ordinal that belongs to our function in the AddressOfNameOrdinals array. The ordinal can then be used as an index into the AddressOfFunctions array to find the address of our function, in most cases allowing us to return the address of the function as GetProcAddress() does.
In some cases, the function is forwarded to another DLL. In our use case here, we are looking up the DLL name for our own implementation, so this is somewhat unlikely and we could likely fix it by providing the name of the DLL that our call gets forwarded to. However, if we want to consider this in our implementation, we can recognize it, as the function pointer we retrieve in the final step should in this case point to a string in our IMAGE_EXPORT_DIRECTORY structure. Thus, we can compare the limits of this structure using the Size field from our IMAGE_DATA_DIRECTORY structure with our pointer to see if this is the case and then handle these cases differently.
If the function is forwarded, our understanding is that the address of our functions points to a string of the form DLLNAME.FUNCTIONNAME. Therefore, we can parse this string and then invoke our logic again with the new DLL and function name.
A full implementation of the discussed logic can e.g. be found in @C5pider’s KaynLdr .
Strings
As already mentioned before, the strings that we use to dynamically resolve the used functions can give an indication that we are trying to hide a suspicious import. We can manually find these strings using the string command on Linux:
$ strings basic_loader.exe | grep "Virtual"
VirtualAlloc
VirtualQuery failed for %d bytes at address %p
VirtualProtect failed with code 0x%x
VirtualProtect
VirtualQuery
VirtualAddress
VirtualSize
VirtualAddress
VirtualSize
VirtualProtect
VirtualQuery
VirtualAddress
VirtualQuery
VirtualProtect
__imp_VirtualProtect
__imp_VirtualQuery
As can be seen, the VirtualAlloc function is still visible here and a security product could easily recognize what we are up to. As mentioned before, one way to get around this is to use hashes instead of the function name to find the function we want to resolve. However, these hashes themselves might be an indicator of malicious intent if they are frequently used by malware. Therefore, it would be advantageous to use a less known hash algorithm here.
Another option is to encrypt the strings and decrypt them during runtime. This is the route we went in our loader.
Summary
In this blog post, we discussed imports and their usage for static analysis by AV solutions. We then went over the structures and fields we need to look at to resolve a module similar to GetModuleHandle() manually. Subsequently, we did take a look at resolving a function using a function name and a pointer to the module in memory as done by GetProcAddress(). Finally, we briefly mentioned the need for obfuscating the function names that we want to resolve. The structures seen in this post will be relevant again in the following posts.