The title is not meant literally. This writeup covers research that is not my own, and you will see later on why it is called “Through the Heaven’s Gate”.
Background
My interest in this topic started from reversing a game. The game hooks many userland functions, including the ones I’m interested in: VirtualProtect and NtProtectVirtualMemory. With those hooked, I am unable to change page protections and the like.
This pushed me to solve the problem with a kernel driver. I mapped my own kernel driver and called ZwProtectVirtualMemory from there, and sure enough, it worked. But I wanted everything to stay in usermode, since their anti-cheat also stays in ring 3.
The path to solution
Luckily, I have several contacts who helped me solve my problem.
Me: How can I use VirtualProtect or NtProtectVirtualMemory when they're hooked?
az: use syscall
Me: *after quite some time* I can't find decent articles about syscall.
az: You can syscall, and since league is wow64, you can do heaven's gate on it
Me: ???
After that conversation I was like, “WHAAAATT???”. So I went on to read some articles on the subject. I’m thankful to this person because he did not hand me the solution directly, but pointed me toward the process for working it out myself. So let’s break it down!
Syscall
In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the kernel of the operating system on which it is executed. This may include hardware-related services (for example, accessing a hard disk drive), creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.
…
For example, the x86 instruction set contains the instructions SYSCALL/SYSRET and SYSENTER/SYSEXIT (these two mechanisms were independently created by AMD and Intel, respectively, but in essence they do the same thing). These are “fast” control transfer instructions that are designed to quickly transfer control to the kernel for a system call without the overhead of an interrupt.[8] Linux 2.5 began using this on the x86, where available; formerly it used the INT instruction, where the system call number was placed in the EAX register before interrupt 0x80 was executed.[9][10]
https://en.wikipedia.org/wiki/System_call
But there is a problem here: a 32-bit application running in a 64-bit environment cannot simply issue the syscall instruction itself.
Wow64
In computing on Microsoft platforms, WoW64 (Windows 32-bit on Windows 64-bit) is a subsystem of the Windows operating system capable of running 32-bit applications on 64-bit Windows. It is included in all 64-bit versions of Windows—including Windows XP Professional x64 Edition, IA-64 and x64 versions of Windows Server 2003, as well as 64-bit versions of Windows Vista, Windows Server 2008, Windows 7, Windows 8, Windows Server 2012, Windows 8.1 and Windows 10. In Windows Server 2008 R2 Server Core, it is an optional component, but not in Nano Server. WoW64 aims to take care of many of the differences between 32-bit Windows and 64-bit Windows, particularly involving structural changes to Windows itself.
https://en.wikipedia.org/wiki/WoW64
Let’s start reversing!
Okay, so first, I will be using Cheat Engine because it has a powerful tool for enumerating DLLs. Second, I will be dissecting the Discord app as an example.
Look at that! Faker, what was that? We see two copies of ntdll.dll, plus wow64.dll, wow64win.dll and wow64cpu.dll. Also, if you look closely, 3 DLLs are in 64-bit address space. Remember that we cannot execute 64-bit code directly in a 32-bit application. So what’s happening?
Answer: WOW64
We’ll follow the trail from the 32-bit ntdll. Let’s trace NtProtectVirtualMemory in it.
It’s no surprise that we do not find a syscall instruction here. But we’ll follow the call.
Look at that! RAX?!! 64bit code! What is this?
As Alex Ionescu puts it on his blog:
In fact, on 64-bit Windows, the first piece of code to execute in *any* process, is always the 64-bit NTDLL, which takes care of initializing the process in user-mode (as a 64-bit process!). It’s only later that the Windows-on-Windows (WoW64) interface takes over, loads a 32-bit NTDLL, and execution begins in 32-bit mode through a far jump to a compatibility code segment. The 64-bit world is never entered again, except whenever the 32-bit code attempts to issue a system call. The 32-bit NTDLL that was loaded, instead of containing the expected SYSENTER instruction, actually contains a series of instructions to jump back into 64-bit mode, so that the system call can be issued with the SYSCALL instruction, and so that parameters can be sent using the x64 ABI, sign-extending as needed.
So, whenever you make a system call through the 32-bit ntdll, execution travels from the 32-bit ntdll to the 64-bit ntdll via the WoW64 layer DLLs.
To summarize,
32-bit ntdll.dll -> wow64cpu.dll’s Heaven’s Gate -> 64-bit ntdll.dll syscall -> kernel-land
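At the instruction level, the Heaven’s Gate step in this chain is a far jump into a 64-bit code segment. As a rough sketch (not taken from wow64cpu.dll itself), the helper below encodes such a jump as raw bytes. The name encodeFarJump is hypothetical, and the selector value 0x33 is an assumption based on the commonly documented 64-bit code segment used by WoW64 on x64 Windows; the quoted blog only says “far jump”.

```cpp
#include <array>
#include <cstdint>

// Assumption: 0x33 selects the 64-bit code segment in a WoW64 process
// (0x23 being the 32-bit one). Verify on your target system.
constexpr std::uint16_t kAssumedCs64 = 0x33;

// Encodes a 32-bit-mode direct far jump:
//   EA <offset32, little-endian> <selector16, little-endian>
std::array<std::uint8_t, 7> encodeFarJump(std::uint32_t target,
                                          std::uint16_t selector = kAssumedCs64)
{
    return {
        0xEA,                                               // jmp far ptr16:32
        static_cast<std::uint8_t>(target & 0xFF),           // offset, byte 0
        static_cast<std::uint8_t>((target >> 8) & 0xFF),    // offset, byte 1
        static_cast<std::uint8_t>((target >> 16) & 0xFF),   // offset, byte 2
        static_cast<std::uint8_t>((target >> 24) & 0xFF),   // offset, byte 3
        static_cast<std::uint8_t>(selector & 0xFF),         // selector, low byte
        static_cast<std::uint8_t>((selector >> 8) & 0xFF)   // selector, high byte
    };
}
```

The function only builds the bytes; it never executes them, so it is safe to experiment with on any platform.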
The solution
We just need to copy the opcodes of ZwProtectVirtualMemory from the 32-bit ntdll. As I said, the function is already hooked, so we cannot call it directly. Instead, we can reproduce its original opcodes from before it was hooked.
template<typename T>
void makesyscall<T>::CreateShellSysCall(byte sysindex1, byte sysindex2, byte sysindex3, byte sysindex4, LPCSTR lpFuncName, DWORD offsetToFunc, byte retCode, byte ret1, byte ret2)
{
    // An all-zero syscall index means there is nothing to build.
    if (!sysindex1 && !sysindex2 && !sysindex3 && !sysindex4)
        return;
#ifdef _WIN64
    // Native x64 stub: the syscall instruction is issued directly.
    byte ShellCode[]
    {
        0x4C, 0x8B, 0xD1,               // mov r10, rcx
        0xB8, 0x00, 0x00, 0x00, 0x00,   // mov eax, SysCallIndex
        0x0F, 0x05,                     // syscall
        0xC3                            // ret
    };
    m_pShellCode = (char*)VirtualAlloc(nullptr, sizeof(ShellCode), MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    if (!m_pShellCode)
        return;
    memcpy(m_pShellCode, ShellCode, sizeof(ShellCode));
    // Patch the four index bytes into the immediate of "mov eax".
    *(byte*)(m_pShellCode + 4) = sysindex1;
    *(byte*)(m_pShellCode + 5) = sysindex2;
    *(byte*)(m_pShellCode + 6) = sysindex3;
    *(byte*)(m_pShellCode + 7) = sysindex4;
#elif defined(_WIN32)
    // 32-bit (WoW64) stub: mirrors the unhooked ntdll stub and calls the transition code.
    byte ShellCode[]
    {
        0xB8, 0x00, 0x00, 0x00, 0x00,   // mov eax, SysCallIndex
        0xBA, 0x00, 0x00, 0x00, 0x00,   // mov edx, function address
        0xFF, 0xD2,                     // call edx
        0xC2, 0x14, 0x00                // ret 0x14 (overwritten below)
    };
    m_pShellCode = (char*)VirtualAlloc(nullptr, sizeof(ShellCode), MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    if (!m_pShellCode)
        return;
    memcpy(m_pShellCode, ShellCode, sizeof(ShellCode));
    // Target of "call edx": an exported ntdll function plus a fixed offset.
    *(uintptr_t*)(m_pShellCode + 6) = (uintptr_t)((DWORD)GetProcAddress(GetModuleHandleA("ntdll.dll"), lpFuncName) + offsetToFunc);
    *(byte*)(m_pShellCode + 1) = sysindex1;
    *(byte*)(m_pShellCode + 2) = sysindex2;
    *(byte*)(m_pShellCode + 3) = sysindex3;
    *(byte*)(m_pShellCode + 4) = sysindex4;
    // Return instruction: opcode plus the 16-bit count of argument bytes to pop.
    *(byte*)(m_pShellCode + 12) = retCode;
    *(byte*)(m_pShellCode + 13) = ret1;
    *(byte*)(m_pShellCode + 14) = ret2;
#endif
}
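// Usage: syscall index 0x50 passed as four little-endian bytes; 0xC2, 0x14, 0x00 encodes "ret 0x14" (5 arguments * 4 bytes).
makesyscall<NTSTATUS>(0x50, 0x00, 0x00, 0x00, "RtlInterlockedCompareExchange64", 0x170, 0xC2, 0x14, 0x00)(GetCurrentProcess(), &addr, &size, PAGE_EXECUTE_READ | PAGE_GUARD, &oldProtection);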
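To make the byte offsets in the shellcode easier to check, here is a small portable sketch that builds both stub layouts into a plain byte buffer instead of executable memory. The helper names (appendLe32, buildNativeStub, buildWow64Stub) are hypothetical and not part of the original class; the sketch also assumes the 32-bit stub is stdcall, so the return pops 4 bytes per argument.

```cpp
#include <cstdint>
#include <vector>

// Appends a 32-bit value in little-endian byte order.
static void appendLe32(std::vector<std::uint8_t>& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
}

// x64 layout: mov r10, rcx / mov eax, index / syscall / ret
std::vector<std::uint8_t> buildNativeStub(std::uint32_t index)
{
    std::vector<std::uint8_t> code{0x4C, 0x8B, 0xD1, 0xB8};
    appendLe32(code, index);   // immediate of "mov eax" at offsets 4..7
    code.push_back(0x0F);      // syscall
    code.push_back(0x05);
    code.push_back(0xC3);      // ret
    return code;
}

// 32-bit layout: mov eax, index / mov edx, target / call edx / ret argBytes
std::vector<std::uint8_t> buildWow64Stub(std::uint32_t index,
                                         std::uint32_t target,
                                         std::uint16_t argCount)
{
    // Assumption: stdcall, so the callee pops 4 bytes per argument.
    const std::uint16_t argBytes = static_cast<std::uint16_t>(argCount * 4);

    std::vector<std::uint8_t> code{0xB8};
    appendLe32(code, index);   // offsets 1..4
    code.push_back(0xBA);      // mov edx, imm32
    appendLe32(code, target);  // offsets 6..9
    code.push_back(0xFF);      // call edx
    code.push_back(0xD2);
    code.push_back(0xC2);      // ret imm16
    code.push_back(static_cast<std::uint8_t>(argBytes & 0xFF));
    code.push_back(static_cast<std::uint8_t>(argBytes >> 8));
    return code;
}
```

With an index of 0x50 and five arguments, buildWow64Stub ends in C2 14 00, which matches the retCode, ret1 and ret2 values passed in the call above.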
POC
Okay, so here it is. We’ve injected the DLL and got some debug prints.
We dumped the running executable to check the print results. And, hell yeah!
We can therefore conclude that we have successfully bypassed a basic usermode hook.
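As a side note, one simple way to sanity-check whether a 32-bit stub still looks untouched is to inspect its first byte. The sketch below is a heuristic only, and the names StubState and classifyStub are hypothetical. It assumes an unhooked stub starts with B8 (mov eax, imm32), as in the shellcode from “The solution”, and treats anything else as patched; a hook placed deeper inside the function would not be detected.

```cpp
#include <cstddef>
#include <cstdint>

enum class StubState { LooksOriginal, LooksPatched, TooShort };

// Heuristic: an unhooked 32-bit syscall stub is assumed to begin with
// "mov eax, imm32" (opcode B8 followed by a 4-byte syscall index).
StubState classifyStub(const std::uint8_t* bytes, std::size_t length)
{
    if (bytes == nullptr || length < 5)
        return StubState::TooShort;      // not enough bytes for mov eax, imm32
    if (bytes[0] == 0xB8)
        return StubState::LooksOriginal; // prologue matches the expected opcode
    return StubState::LooksPatched;      // e.g. a jump written over the prologue
}
```

The function works on a copy of the bytes, so the same check can be run against a dump of the module or against live memory.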
Extended usage
With all this knowledge, we can also implement a Heaven’s Gate hook! All syscalls would then be caught, with the option to act on each one as you wish. But we will not cover that topic here, as it is covered in another writeup: WOW64!Hooks: WOW64 Subsystem Internals and Hooking Techniques
Conclusion
We therefore conclude that WoW64 applications are able to execute 64-bit syscalls via Heaven’s Gate.
A big thanks to admiralzero@UC for pointing me in the right direction. When I figured out that they hook usermode functions, I felt locked out and pushed toward kernel code, but no, there was a way. And here it is: going through the Heaven’s Gate!