I am analyzing a Linux Prometei sample from February 2025. In Part One, we found out the sample was packed. In Part Two, we analyzed the unpacked binary, got an overall impression of it, and noticed a few new botnet commands, plus the fact that the mining pool server is no longer hard-coded.
In Part Three, we are going to investigate communication with the C2. This part was particularly difficult to carry out. What should have been clear and straightforward was anything but. I went from illusions to disillusions, from partial certainties to new illusions, and so on, to the point that the chronological flow of this blog post would be nothing more than a tangled mess of spaghetti — unreadable and tedious to go through.
So, for this third part, I’m stepping out of “live” mode to try to organize the ideas and present to you more clearly how the binary communicates with the C2.
AI should learn the difference between RC4 and SHA256…
- R2ai searches for encryption-related strings with the command "iz~crypt|aes|rc4|key" and finds nothing except a "fgetcrypt", but it is unable to find out anything more about it.
- R2ai then searches for "enckey" and finds a function which creates the URL for the C2. Within that function, it allegedly detects a function which implements the RC4 algorithm, and produces beautiful source code with RC4 context init, key setup and encryption. The key is 64 bytes long, generated once per launch and sent to the C2 along with other authentication parameters. All those discoveries for about 60 US cents 💵. At this point, I'm quite happy. Should I be?
- My doubts grow when the AI tells me this is actually a "modified RC4" initialized with "SHA-256 constants" and with a "counter mechanism". That is very unlike RC4!
I open Ghidra and start checking each part of the alleged RC4 algorithm. The context initialization does indeed use SHA-256 constants (see the sketch after the list below).
Then I look at the alleged encryption function at 0x403046 and it's clear: this is not an encryption algorithm, this is SHA-256!
After analysis, I understand the following functions:
- 0x402f27: sha256_init()
- 0x402fa5: sha256_update()
- 0x402a95: sha256_transform()
- 0x403046: sha256_final()
- 0x40b545 is not an encryption function (RC4 or otherwise), but a function that computes a SHA-256 hash.
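For reference, these are the values that give SHA-256 away: the eight initial state words are the well-known constants derived from the square roots of the first eight primes. Below is a minimal sketch of what sha256_init() at 0x402f27 boils down to; the struct layout and field names are mine, not taken from the binary.

typedef struct {
    unsigned int state[8];      // hash state a..h
    unsigned long long bitlen;  // total message length in bits
    unsigned char data[64];     // current 64-byte block
    unsigned int datalen;       // bytes used in data[]
} sha256_ctx;

void sha256_init(sha256_ctx *ctx) {
    // These eight constants are the tell-tale sign of SHA-256,
    // not of an RC4 key schedule.
    ctx->state[0] = 0x6a09e667; ctx->state[1] = 0xbb67ae85;
    ctx->state[2] = 0x3c6ef372; ctx->state[3] = 0xa54ff53a;
    ctx->state[4] = 0x510e527f; ctx->state[5] = 0x9b05688c;
    ctx->state[6] = 0x1f83d9ab; ctx->state[7] = 0x5be0cd19;
    ctx->datalen = 0;
    ctx->bitlen = 0;
}

Spotting these constants in an "encryption" routine is usually enough to re-label it as hashing.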
XOR hallucination
From manual analysis, I understand that the "enckey" is stored in a global variable at address 0x00bd9640 and that it receives the default value GtvRsdC7. By the way, the AI never managed to find that...
Still looking for encryption algorithms, the AI jumps into another hallucination and invents an XOR encryption routine (see below).
void encrypt_data(char* data, size_t length) {
    char* enckey = (char*)0x00bd9640;
    size_t key_length = strlen(enckey);
    for (size_t i = 0; i < length; i++) {
        // XOR each byte with the corresponding byte from the key
        data[i] ^= enckey[i % key_length];
    }
}
One advantage of AI, at least, is that it readily admits anything, including its own mistakes. See below how I confront it and it acknowledges that its "XOR encryption function" is just a string search…
[r2ai:0x0040c094]> ' The function at 0x403f5d is not performing xor.
Where do you see the XOR?
I apologize for my mistake. You're absolutely right.
Let me look at the function at 0x403f5d more carefully:
You're correct - this is actually a string search function,
not an XOR encryption function. It's searching for one string
within another and returning the position of the match.
In xwget, x stands for XOR
I finally spot an encryption algorithm! The binary supports an "xwget" command which downloads a file from a remote server and decrypts it with a custom algorithm involving an XOR and the hard-coded value 66 (I remember reading another article about Prometei where the hard-coded value was different, but I can't recall the link).
[0x00410249]> decai -x
The function downloads a file and applies a decryption algorithm using a loop that performs XOR operations with a key that increments by 66 (0x42) in each iteration.
[0x00410249]> decai -d
[...]
void download_file(const char *filename, const char *mode) {
    FILE *file;
    int counter = 0;
    int key = 66;
    int length = 0;
    unsigned char buffer[1024];
    if (!filename) {
        return;
    }
    file = fopen(filename, "w");
    if (!file) {
        return;
    }
    while (counter < length) {
        int value = buffer[counter];
        int temp = counter * 3;
        value ^= temp;
        value -= key;
        buffer[counter] = value;
        counter++;
        key += 66;
    }
    fwrite(buffer, 1, length, file);
    fclose(file);
}
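To make the algorithm easier to reuse, for example to decode a payload captured on the wire, here is my standalone reconstruction of the loop above. It only mirrors the decompiled logic; the function name and buffer handling are mine, and the key schedule (start at 66, add 66 per byte) comes from the decompilation, not from a verified live download.

#include <stddef.h>

// Reconstruction of the xwget decryption loop (assumptions noted above)
void xwget_decrypt(unsigned char *buf, size_t len) {
    int key = 66;                      // hard-coded initial key (0x42)
    for (size_t i = 0; i < len; i++) {
        int value = buf[i];
        value ^= (int)(i * 3);         // XOR with 3 * position
        value -= key;                  // subtract the running key
        buf[i] = (unsigned char)value;
        key += 66;                     // key increases by 66 for each byte
    }
}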
Prometei does use RC4 after all (but not where the AI thought)
Actually, there are 2 sets of commands in Prometei:
- Commands handled in the "main" of the binary (0x4120aa). They are provided as arguments to the binary. I mentioned them in Part Two. Example: start_mining, stop_mining, updatev4, touch, wget, xwget…
- Incoming commands from the remote C2 (0x40c090). Those are sent on a dedicated socket communication channel between the binary and the C2. Example: call, chkport, fput, fdir, fgetcrypt…
It turns out that fgetcrypt uses RC4: it fetches a file on the victim's host and transmits it RC4-encrypted. The RC4 encryption function is located at 0x00402015.
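For comparison with the earlier SHA-256 confusion, this is what genuine RC4 looks like: a 256-byte permutation, a key-scheduling pass, then a byte-by-byte keystream. This is a textbook sketch for reference, not a copy of the routine at 0x00402015.

#include <stddef.h>

// Textbook RC4 (KSA + PRGA), for comparison purposes only
void rc4(const unsigned char *key, size_t keylen,
         unsigned char *data, size_t datalen) {
    unsigned char S[256];

    // Key-scheduling algorithm (KSA): build the permutation from the key
    for (int i = 0; i < 256; i++)
        S[i] = (unsigned char)i;
    for (int i = 0, j = 0; i < 256; i++) {
        j = (j + S[i] + key[i % keylen]) & 0xff;
        unsigned char tmp = S[i]; S[i] = S[j]; S[j] = tmp;
    }

    // Pseudo-random generation algorithm (PRGA): XOR data with the keystream
    int i = 0, j = 0;
    for (size_t n = 0; n < datalen; n++) {
        i = (i + 1) & 0xff;
        j = (j + S[i]) & 0xff;
        unsigned char tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        data[n] ^= S[(S[i] + S[j]) & 0xff];
    }
}

No SHA-256 constants, no counters: just swaps over a 256-byte state.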
Communication protocol
Remote C2 URL
If there is no custom URL in the file /etc/pcc0, communication is directed to the feefreepool URL. That URL is down now, but the botnet probably lives on through custom URLs.
void initialize_c2_url() {
    char url_buffer[128];
    memset(url_buffer, 0, 128);
    // Try to read custom URL from /etc/pcc0
    if (file_exists("/etc/pcc0")) {
        read_file("/etc/pcc0", url_buffer);
    } else {
        // Use default C2 URL if no custom URL configured
        strcpy(url_buffer, "http://p3.feefreepool.net/cgi-bin/prometei.cgi");
    }
}
Communication channel with C2
I continue manual analysis and bump into a function at 0x40c094 which decompiles horribly in Ghidra. I ask the AI for assistance. Initially, it fails, because r2 doesn't see any function there.
0x0040c094 55 push rbp // obvious function prelude
0x0040c095 4889e5 mov rbp, rsp
0x0040c098 4155 push r13
0x0040c09a 4154 push r12
0x0040c09c 53 push rbx
The solution is either to run "aap" (find and analyze function preludes) or simply to define the function ("af functionName"). It's possible to pass r2 commands to r2ai by prefixing them with a colon (:).
[r2ai:0x00401ef0]> :af read_incoming_socket @ 0x0040c094
[r2ai:0x00401ef0]> ' Please decompile function at 0x40c094.
int read_incoming_socket() {
    // Create TCP socket
    int sockfd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    // Set up address structure
    struct sockaddr_in addr;
    addr.sin_family = AF_INET;
    addr.sin_port = htons(get_port_from_memory());  // Port from 0x520364
    addr.sin_addr.s_addr = inet_addr("0.0.0.0");     // Listen on all interfaces

    // Bind socket
    bind(sockfd, (struct sockaddr*)&addr, sizeof(addr));

    // Listen for connections
    listen(sockfd, 0x1000);

    while (1) {
        memset(buffer, 0, sizeof(buffer));

        // Read incoming data
        int bytes_read = read(socket_fd, buffer, sizeof(buffer) - 1);
        if (bytes_read <= 0) {
            continue;
        }

        // Check for various commands
        if (strstr(buffer, "/?c=getinfo")) {
            // Handle getinfo command
            // Returns system information
            send_system_info(socket_fd); // AXELLE REMARK: the AI simplifies a lot here!
        }
        ...
    }
More precisely, the function implements a basic HTTP client and server:
- It expects an HTTP request from the C2. Format: /?c=<command>. Unsure how arguments are provided, but probably as HTTP GET arguments.
- It answers with an HTTP response, e.g. HTTP/1.0 200 OK\nContent-Length: <len>\n (see the sketch after this list). Depending on the command, a file attachment or a command response may follow.
- Also, depending on the command, the victim host may issue another HTTP request. Format: &i=<victim id>. Additionally, depending on cases, an answer (answ, which duplicates the answer of the HTTP response) and a random number (r) may also be included.
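As an illustration of the response side, here is a minimal sketch of how the bot could build the HTTP answer described above. The header format comes from what is observed in the binary; the helper name and buffer handling are mine.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Minimal sketch: send the observed HTTP response header, then an optional body.
// Helper name and buffer handling are illustrative, not the binary's own code.
static void send_response(int fd, const char *body, size_t body_len) {
    char header[128];
    int n = snprintf(header, sizeof(header),
                     "HTTP/1.0 200 OK\nContent-Length: %zu\n", body_len);
    write(fd, header, (size_t)n);       // status line + length, as observed
    if (body && body_len)
        write(fd, body, body_len);      // command response or file attachment
}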
I asked the AI for a list of command inputs and outputs. It did pretty well (see above) but forgot a few commands and did not mention the RC4 encryption or base64 encoding involved in the related commands.
Conclusion on Prometei communication
The communication protocol between the binary and the C2 runs over a dedicated socket channel. The channel itself is not encrypted, but some commands involve RC4 encryption (e.g. fgetcrypt) or base64 encoding (e.g. fdir).
What’s new? What’s the contribution for Prometei?
As I said earlier in Part One, Prometei is not a new malware family. It dates back to December 2020. The sample I analyzed is recent (beginning of February 2025) but shows only minor evolution.
What’s really new here is the complete analysis of bot socket commands, and the description of the communication protocol.
This wasn't detailed in this article (which is nevertheless a good read!). NB: there are some other articles on the Windows version of Prometei here and here, which are interesting to read for similarities with the Linux version, but they don't detail the communication protocol either.
What’s new from an AI angle?
Getting help from AI to analyze malware is a new research topic I've been working on for a couple of months. Prior articles: in 2024, in 2025 a crack me, sshdinjector, plus some other articles in the works. All of this is quite pioneering work, and it wouldn't be possible without the ideas, code and fixes of pancake and Daniel Nakov, whom I thank very much.
The other malware analyses I performed worked better and faster with r2ai 😠. I had intended to write this one as a demonstration of how much time it would save me! To be honest, it did not turn out the way I had expected: I had to rely heavily on my own manual analysis to be sure the AI did not lead me down wrong paths.
Was the AI useful after all? Is it worth the ~3–5 USD? My answer would still be yes, because I think the reverse engineering of this particular sample would have taken far longer without the AI's assistance. On top of that, some of the expense could have been cut down. For example, there's currently a bug in r2ai: the setting llm.maxtokens is not taken into account, and during my analysis, on several occasions, the auto mode sent more than 20,000 tokens in one shot. That could have been avoided. Note that, at least, r2ai auto mode implements a watchdog with the setting auto.max_runs, ensuring the auto mode will not go into an endless loop of questions and burn through all your money.
But, yes, be warned: you need brains, and you need experience. If you accept every story the AI tells you, you'll probably end up very wrong.
My methodology with r2ai and decai to get good results and (relatively) little expense
- Use r2ai in auto mode with Claude for the overview, or for any difficult step. It costs more than decai, but it's worth it.
- Whenever you see r2ai going in silly directions or unable to answer, ask decai, with another decompiler open side by side.
- For any simple function, use a free Mistral API key. The code it produces isn't good (in my opinion) but the explanation (decai -x) is decent and usually sufficient.
- r2ai piles up questions in the context. After many questions, the context becomes big and consumes lots of your tokens. Think about resetting it (-R).
- The default r2 commands for r2ai auto mode are "aaa,iI,afl". "afl" lists all functions and can generate long output. Consider removing it from auto.init_commands to reduce your token consumption.
- I've used ChatGPT 4, ChatGPT-4o, Mistral 7b and Claude 3.5 Sonnet 20241022. For code and malware analysis, Claude consistently gave me the best results. Of course, this will change over time with the evolution of models.
- I tried countless local AI models (i.e. I download the model and run it on my local host): granite-8b, granite3, deepseek, dolphin… It never worked out for me: the responses were far too slow and of poor quality compared to the remote models. That being said, my desktop is not particularly powerful, and it may work out far better for you if you have access to solid hosts.
— Cryptax