Cracking my own CrackMe with r2ai


Recently, I watched the excellent “Cracking binaries with r2ai visual mode” by Daniel Nakov, presented at r2con 2024. In September 2024, I had tried and failed to get an AI to crack my own simple crackme, but Daniel Nakov did it differently, so I immediately decided to give it another try.

Want the spoiler now? Yes, it works very well. Fabulously well. 😃

In this blog post, I won’t cover how to install r2ai. Please refer to this previous post if needed.

Supply a binary to r2ai

r2ai lets you talk with an AI. The first “trick” is to supply it with a binary, rather than using decai (the AI-assisted decompiler).

I installed r2ai as an r2 package, using r2pm. In that case, you cannot simply run r2ai mybinary as in Daniel Nakov’s video: you need to run r2pm -r r2ai mybinary, and the path to the binary must be absolute (not relative to your current directory, because r2ai is run via r2pm).
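To avoid the relative-path pitfall, the command line can be built from an absolute path. This is only a small sketch: the helper name r2ai_command and the file name ./crackme are made up for illustration, and the only r2pm invocation it relies on is the one quoted above.

```python
import os
import shlex


def r2ai_command(binary_path):
    """Return the 'r2pm -r r2ai <binary>' command with an absolute binary path."""
    absolute = os.path.abspath(binary_path)
    return ["r2pm", "-r", "r2ai", absolute]


if __name__ == "__main__":
    # "./crackme" is a hypothetical file name used for illustration.
    print(shlex.join(r2ai_command("./crackme")))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run if you prefer to launch r2ai from a script.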

Check the model

From there, everything works as expected, just as in Daniel Nakov’s video. I tested my crackme with claude-3-5-sonnet-20241022.

r2ai auto mode

The second “trick” is to use r2ai’s “auto” mode with a prompt beginning with a quote.

I acquired an API key for Claude and use it with r2ai. I feared this would be expensive, but it was a pleasant surprise: several weeks of research cost me about $5, and spending limits can be set.

In auto mode, r2ai automatically issues r2 commands to answer the question. This is extremely helpful for r2 beginners, or simply to verify and understand what the AI is doing.

r2ai shows how it solves the crackme. It first issues the r2 command “afl” (list functions), then “aaa” (analyze the binary), then “afl” again.

From the resulting list of functions, it identifies main and a function named “substitute”, and decides to disassemble main.

The workflow is excellent: this is precisely how a human being should do it.

Based on the disassembly, it explains its findings. For example, the length it reports, 17, is correct: that is the 16 characters of the password plus the terminating null byte. It understands that it has to “reverse” the substitution of xEkqeatrOzLgTqln, and writes a Python script to do that.
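For readers who want to see what “reversing the substitution” means, here is a minimal sketch. It is not the script the AI wrote: it is my own reconstruction, using the key and the expected string from the crackme source code given at the end of this post. The function names substitute and invert are mine.

```python
import string

KEY = "QWERTYUIOPASDFGHJKLZXCVBNM"  # substitution key from the crackme source
EXPECTED = "xEkqeatrOzLgTqln"       # string the crackme compares against


def substitute(text, key=KEY):
    """Mirror the crackme's substitute(): map each letter through the key, keep case."""
    out = []
    for c in text:
        if c in string.ascii_uppercase:
            out.append(key[ord(c) - ord("A")])
        elif c in string.ascii_lowercase:
            out.append(key[ord(c) - ord("a")].lower())
        else:
            out.append(c)  # non-letters are left unchanged
    return "".join(out)


def invert(cipher, key=KEY):
    """Undo substitute(): a letter's position in the key gives the original letter."""
    out = []
    for c in cipher:
        if c in string.ascii_uppercase:
            out.append(chr(ord("A") + key.index(c)))
        elif c in string.ascii_lowercase:
            out.append(chr(ord("a") + key.index(c.upper())))
        else:
            out.append(c)
    return "".join(out)


if __name__ == "__main__":
    password = invert(EXPECTED)
    # Sanity check: encrypting the recovered password must give the expected string.
    assert substitute(password) == EXPECTED
    print(password)
```

The round-trip check at the end is the same kind of self-verification that lets the AI notice when one of its attempts is wrong.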

Its first two attempts at finding the password fail, but it detects each failure and automatically tries again after fixing its own program.

Marvelous! The AI notices it fails and fixes its own error. This is automatic: the end-user does not have to supply any information whatsoever.

Actually, the second attempt fails too, and only the third attempt finally succeeds. But all of this is automatic. We don’t have to tell the AI that the answer is wrong, that it should retry, or what to fix.

The AI’s third automatic attempt succeeds. The correct password is at the end: uCrackedItSoEasy.

Finally, the AI recaps it all.

Final solution, cracking the crackme in a few seconds.

Conclusion

True, this crackme was simple. Yet r2ai solves it within seconds, and the solution is educational, both on how to use radare2 and on how to solve a crackme. Wonderful!

Want to try with another model? In r2ai, set your model with -m. Some will succeed, others will fail (e.g. ChatGPT 4 actually failed to solve my crackme).

Direct approaches with free web interface to ChatGPT and Claude

Without r2ai, can we do the same with adequate questions to ChatGPT and Claude?

I set up free accounts on both, using ChatGPT 4 and Claude 3.5 Sonnet. I supplied the binary first and, if that failed, the corresponding source code (with comments removed).

Typical inquiry to AI, to solve my crackme

ChatGPT 4 found the correct solution only when I supplied the source code — which is far easier (NB. In September 2024, ChatGPT had failed even with the source code, so this is an improvement). All other tests failed.

On the binary, ChatGPT failed for an unclear reason, and to be honest, I should perhaps have tried again.

ChatGPT 4 failed to analyze the binary, but this was perhaps a temporary issue.

Why did Claude fail with a direct approach?

Isn’t it strange that Claude works so well with r2ai, and fails on its own?

Claude 3.5 finds password “bAlchemist” which is wrong. It does not have the ability to test with the binary I provided, which would have helped it see this does not work.

The reason is that r2ai generates a prompt for the AI and then runs the commands the AI suggests. It can therefore run r2 commands, or run the binary itself, and see whether an attempt succeeds or fails. The standalone online version of Claude does not support this.
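The value of that feedback can be illustrated with a few lines of Python. This is not how r2ai is implemented. It is only a sketch of the idea “run the binary with a candidate and look at the result”. It assumes the crackme’s behavior from the source code at the end of this post: it reads the password on standard input, prints “Congrats!!!” and exits with status 0 on success, and exits with status 1 otherwise. The function names accepts and first_accepted are hypothetical.

```python
import subprocess
import sys


def accepts(argv, candidate):
    """Run the program given by argv, feed it candidate on stdin, report success."""
    result = subprocess.run(
        argv,
        input=candidate + "\n",
        capture_output=True,
        text=True,
        timeout=5,
    )
    return result.returncode == 0 and "Congrats" in result.stdout


def first_accepted(argv, candidates):
    """Return the first candidate the program accepts, or None if all are rejected."""
    for candidate in candidates:
        if accepts(argv, candidate):
            return candidate
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check.py /absolute/path/to/crackme
    # The two candidates are the wrong and the right answers quoted in this post.
    print(first_accepted([sys.argv[1]], ["bAlchemist", "uCrackedItSoEasy"]))
```

With such a check available, a wrong guess like “bAlchemist” is rejected immediately instead of being presented as the answer.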

Final word

Please do not read this blog post as a comparison between ChatGPT and Claude. A test on a single case would be highly insufficient. Rather, the important point is how to use r2ai, and why r2ai gets the best out of AI. More to come in 2025, stay tuned!

Source code of the crackme

NB. When supplied to AI, all comments were removed.

#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

#define ALPHABET_SIZE 26

void substitute(const char *input, const char *substitution, char *output) {
    char mapping[ALPHABET_SIZE];

    // Use the direct substitution table for encryption
    for (int i = 0; i < ALPHABET_SIZE; i++) {
        mapping[i] = substitution[i];
    }

    for (int i = 0; input[i] != '\0'; i++) {
        char c = input[i];

        // Handle uppercase letters
        if (isupper(c)) {
            output[i] = mapping[c - 'A'];
        }
        // Handle lowercase letters
        else if (islower(c)) {
            output[i] = tolower(mapping[c - 'a']);
        }
        // Leave non-alphabet characters unchanged
        else {
            output[i] = c;
        }
    }

    output[strlen(input)] = '\0';  // Null-terminate the output string
}

int main() {
    char input[40];
    char output[40];
    char expected[] = "xEkqeatrOzLgTqln"; // uCrackedItSoEasy
    int expected_len = sizeof(expected) / sizeof(char);

    // Configurable substitution key (A-Z)
    const char *key = "QWERTYUIOPASDFGHJKLZXCVBNM";  // Example substitution key

    // Get input from the user
    printf("Password: ");
    fgets(input, sizeof(input), stdin);

    // Remove newline character from input if it exists
    input[strcspn(input, "\n")] = 0;

    substitute(input, key, output);
    if (memcmp(output, expected, expected_len) == 0 && output[expected_len] == '\0') {
        printf("Congrats!!!\n");
    } else {
        printf("Wrong\n");
        exit(1);
    }

    return 0;
}

— Cryptax