Search

Loader Dev. 2 – Dynamically resolving functions

Search
Blog article

Loader Dev. 2 – Dynamically resolving functions

In this post, we discuss dynamically resolving functions, which help to avoid static detections based on the functions imported by our executable.

Disclaimer

These posts are written to provide information to other professionals of the discussed topics.

The techniques used here are not novel and were documented by other people before. Therefore, the benefit of these posts for threat actors will likely be minimal. Nonetheless, we decided against releasing a full PoC implementation and will instead only provide code snippets as part of the posts. All credit should go to the people who did the original research on the techniques used.

There will also be an accompanying blog post on detecting or hunting for malware using the discussed techniques to enable readers to protect their environment.

Imports

The functions our executable uses are by default easily viewable in its imports section. The following code could be used in a basic loader:

#include <stdio.h>
#include <windows.h>

int main() {
  unsigned char shellcode[] = […];
  unsigned char* base_address = VirtualAlloc(NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  memcpy(base_address, shellcode, sizeof(shellcode));
  ((void(WINAPI*)(void))base_address)();
}

After compilation, we can view the imports of our executable e.g. using PE-bear:

Kolja Grassmann

Consultant

Category

Date

Navigation

Note that the VirtualAlloc function is imported by our executable. AV solutions consider these imports when evaluating whether our executable is malicious. Therefore, we should avoid suspicious function imports like the VirtualAlloc function.

Dynamic function resolution

It is possible to dynamically resolve function addresses by using the GetModuleHandle or LoadLibraryA and GetProcAddress functions. By using these functions, we could avoid importing VirtualAlloc:

#include <stdio.h>
#include <windows.h>

int main() {
  unsigned char shellcode[] = [...];
  unsigned char* base_address = (unsigned char*(WINAPI*)(LPVOID,SIZE_T,DWORD,DWORD))GetProcAddress(GetModuleHandle("Kernel32.dll"), "VirtualAlloc")(NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  memcpy(base_address, shellcode, sizeof(shellcode));
  ((void(WINAPI*)(void))base_address)();
}

As can be seen in the following screenshot, the VirtualAlloc function is now no longer imported, but the functions used for resolving it will be imported:

These functions themselves might be considered suspicious; therefore, it is better to implement a custom version of these functions by parsing the PE structure manually to resolve functions. We will go over this in the following section.

Custom implementation

In this section, we will cover how to manually resolve a function. As with the GetModuleHandle and GetProcAddress functions, we will need to know the name of the function and the DLL the function is exported by. Our implementation uses the actual name of the function or DLL. However, there are other implementations out there that use hashes of the DLL and function names instead. This has the advantage that these implementations do not ship the function names in their executable, which might be suspicious. To archive a similar effect, we chose to encrypt the strings in the code used by our loader instead of using a hash.

Custom GetModuleHandle()

The first step is to resolve the loaded module using the DLL name. For this, we will first take a look at the Thread Environment Block (TEB), which is stored in the GS register on 64bit systems. At offset 0x60 there is a pointer to the Process Environment Block (PEB) located in the TEB.

typedef struct _TEB {
  PVOID Reserved1[12];
  PPEB  ProcessEnvironmentBlock;
  [...]
} TEB, *PTEB;

In the PEB we will find a pointer to a PEB_LDR_DATA structure:

typedef struct _PEB {
  BYTE                          Reserved1[2];
  BYTE                          BeingDebugged;
  BYTE                          Reserved2[1];
  PVOID                         Reserved3[2];
  PPEB_LDR_DATA                 Ldr;
  [..]
} PEB, *PPEB;

This structure then contains a list of the modules that are loaded by the current process:

typedef struct _PEB_LDR_DATA {
  BYTE       Reserved1[8];
  PVOID      Reserved2[3];
  LIST_ENTRY InMemoryOrderModuleList;
} PEB_LDR_DATA, *PPEB_LDR_DATA;

The LIST_ENTRY structure is a doubly linked list, which is defined as follows:

typedef struct _LIST_ENTRY {
   struct _LIST_ENTRY *Flink;
   struct _LIST_ENTRY *Blink;
} LIST_ENTRY, *PLIST_ENTRY, *RESTRICTED_POINTER PRLIST_ENTRY;

Each of these LIST_ENTRY structs is part of an LDR_DATA_TABLE_ENTRY. The structure provided by Microsoft is as follows:

typedef struct _LDR_DATA_TABLE_ENTRY {
    PVOID Reserved1[2];
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;

However, we can find a more complete structure in the ProcessHacker source code. Here we see, that directly after the FullDllName there is also a BaseDllName. Our understanding is, that the FullDllName should include the full path, while the BaseDllName does not and therefore, the BaseDllName is more convenient for our use case.

We can compare the BaseDllName to the module we are searching for and return the DllBase field if we find our DLL. If we end at the LIST_ENTRY structure we initially found in the PEB, then we have looked at all modules without finding the target DLL and should return NULL to indicate that we have not found the module.

Custom GetProcAddress()

With the handle to our module, we then can resolve an actual function as GetProcAddress would do. Again, we will traverse several different structures to find the relevant fields. The first structure we will look at is the IMAGE_DOS_HEADER structure. The definition can e.g. be found in the ReactOS source code:

typedef struct _IMAGE_DOS_HEADER {
    [..]
    LONG e_lfanew; // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

The last field here is named e_lfanew and contains the offset to the IMAGE_NT_HEADERS structure, which we need to look at next. Our understanding is that the IMAGE_DOS_HEADER structure is a legacy structure and for most purposes, we will move on to the IMAGE_NT_HEADER. The definition for this structure looks as follows:

typedef struct _IMAGE_NT_HEADERS64 {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;

Of interest to us is the OptionalHeader field. The definition looks as follows:

typedef struct _IMAGE_OPTIONAL_HEADER64 {
  [...]
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;

Here we specifically want to look at the DataDirectory field, which is the last field. The definition looks as follows:

typedef struct _IMAGE_DATA_DIRECTORY {
  DWORD VirtualAddress;
  DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

This is an array and there are multiple entries here that are at specific offsets. The offset that is of interest to us is IMAGE_DIRECTORY_ENTRY_EXPORT, which contains the exported functions. The value contained here is an offset from the base address of our module. Using the base address and this offset, we can find the IMAGE_EXPORT_DIRECTORY structure for which ReactOS again has a definition:

typedef struct  IMAGE_EXPORT_DIRECTORY {
  [...]
  DWORD NumberOfFunctions;
  DWORD NumberOfNames;
  DWORD AddressOfFunctions;
  DWORD AddressOfNames;
  DWORD AddressOfNameOrdinals;
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

The AddressOfNames, AddressOfNameOrdinal, and AddressOfFunctions fields are again an offset from the base address of the module. This is also called a Relative Virtual address (RVA). The AddressOfNames field points to an array containing the function names of the exported functions. The NumberOfNames field contains the number of function names that are contained in this array. We can iterate over these names and compare them to the name of the function we are searching for. If we find our function, we can then use the offset we found the name at to locate the ordinal that belongs to our function in the AddressOfNameOrdinals array. The ordinal can then be used as an index into the AddressOfFunctions array to find the address of our function, in most cases allowing us to return the address of the function as GetProcAddress() does.

In some cases, the function is forwarded to another DLL. In our use case here, we are looking up the DLL name for our own implementation, so this is somewhat unlikely and we could likely fix it by providing the name of the DLL that our call gets forwarded to. However, if we want to consider this in our implementation, we can recognize it, as the function pointer we retrieve in the final step should in this case point to a string in our IMAGE_EXPORT_DIRECTORY structure. Thus, we can compare the limits of this structure using the Size field from our IMAGE_DATA_DIRECTORY structure with our pointer to see if this is the case and then handle these cases differently.

If the function is forwarded, our understanding is that the address of our functions points to a string of the form DLLNAME.FUNCTIONNAME. Therefore, we can parse this string and then invoke our logic again with the new DLL and function name.

A full implementation of the discussed logic can e.g. be found in @C5pider’s KaynLdr .

Strings

As already mentioned before, the strings that we use to dynamically resolve the used functions can give an indication that we are trying to hide a suspicious import. We can manually find these strings using the string command on Linux:

$ strings basic_loader.exe | grep "Virtual"
VirtualAlloc
  VirtualQuery failed for %d bytes at address %p
  VirtualProtect failed with code 0x%x
VirtualProtect
VirtualQuery
    VirtualAddress
VirtualSize
VirtualAddress
VirtualSize
VirtualProtect
VirtualQuery
VirtualAddress
VirtualQuery
VirtualProtect
__imp_VirtualProtect
__imp_VirtualQuery

As can be seen, the VirtualAlloc function is still visible here and a security product could easily recognize what we are up to. As mentioned before, one way to get around this is to use hashes instead of the function name to find the function we want to resolve. However, these hashes themselves might be an indicator of malicious intent if they are frequently used by malware. Therefore, it would be advantageous to use a less known hash algorithm here.

Another option is to encrypt the strings and decrypt them during runtime. This is the route we went in our loader.

Summary

In this blog post, we discussed imports and their usage for static analysis by AV solutions. We then went over the structures and fields we need to look at to resolve a module similar to GetModuleHandle() manually. Subsequently, we did take a look at resolving a function using a function name and a pointer to the module in memory as done by GetProcAddress(). Finally, we briefly mentioned the need for obfuscating the function names that we want to resolve. The structures seen in this post will be relevant again in the following posts.

Further blog articles

Blog

Loader Dev. 1 – Basics

February 10, 2024 – This is the first post in a series of posts that will cover the development of a loader for evading AV and EDR solutions.

Author: Kolja Grassmann

Mehr Infos »
Do you want to protect your systems? Get in touch with us.
Search
Search