In this challenge, I learned that the binary is somewhat similar to Ghostpulse, which hides its payload in a PNG. I uncovered this by checking some functions and seeing that the binary loads a PNG resource file and runs a decryption routine on it.
EDIT: My solution to this is somewhat weird. After the event, the challenge creator revealed that the actual encrypted data wasn’t in the PNG itself. To this day, I am still puzzled and looking for a “better” way to solve the challenge than the methodology explained below.
As stated in the article above, it performs CRC and hash checks on parts of the PNG to identify the locations of the encrypted data.
I first put a breakpoint on cmp dword ptr [rsp+238h+var_1E8+0Ch], 0AAAAAAAAh. It seems that 0AAAAAAAAh acts as an index for an encrypted message that the program wants to print out. You will notice similar values in other parts, such as 0AAAAh, 0AAAAAh, 0AAAAAAh.
In the first breakpoint hit, you will see this:
Since 0AAAAh is not equal to 0AAAAAAAAh, it will just skip and proceed to the next iteration of the loop. But what we can do here is set rip so that execution proceeds to the decryption block instead of iterating the loop again.
It leaked the first encrypted message from the PNG file.
We just repeat these steps to leak the others as well.
We could go on to look at the other messages; I think there are 12 encrypted messages. But to cut it short, this message is interesting.
In this challenge we are given a binary to reverse. The flag is in the binary and we need to find it.
After some guessing we were able to get a clue. I tried finding the bytes in memory but I couldn’t get the whole flag.
So what I did was look around more and check some function calls.
These 2 function calls looked somewhat weird to me, so I checked the arguments passed to them. I found out that the first call receives the XOR ciphertext and the second call receives the XOR key. The calls ran in a loop 14 times, meaning there are 14 ciphertexts. You can just write down those values and manually XOR them for around 30 minutes, or build an automated solution for 2 hours. Pick your poison. lol.
So I chose the automated solution:
// FunctionHooks.cpp : Defines the exported functions for the DLL application.
//
#define NOMINMAX // Prevents Windows headers from defining min and max macros
//#define AllowDebug // uncomment to show debug messages
#include "pch.h"
#include <windows.h>
#include "detours.h"
#include <cstdint>
#include <mutex>
#include <fstream>
#include <string>
#include <vector>
#include <queue>
#include <thread>
#include <condition_variable>
#include <atomic>
#include <sstream>
#include <iomanip>
#include <cctype> // For isprint
#include <intrin.h>
// 1. Define the function pointer type matching the target function's signature.
typedef __int64(__fastcall* sub_0x1AC0_t)(__int64 a1, __int64 a2, __int64 a3);
// 2. Replace with the actual module name containing the target function.
const char* TARGET_MODULE_NAME = "rusty_bin.exe"; // Ensure this matches the actual module name
// 3. Calculated RVA of the target function (0x1AC0 based on previous calculation)
const uintptr_t FUNCTION_RVA = 0x1AC0;
// 4. Declare a pointer to the original function.
sub_0x1AC0_t TrueFunction = nullptr;
// 5. Logging components
std::queue<std::string> logQueue;
std::mutex queueMutex;
std::condition_variable cv;
std::thread logThread;
std::atomic<bool> isLoggingActive(false);
std::ofstream logFile;
// 6. Data management components
std::vector<std::vector<unsigned char>> byteVectors;
bool isOdd = true;
std::mutex dataMutex;
// 9. Helper function to convert uintptr_t to hex string
std::string ToHex(uintptr_t value)
{
std::stringstream ss;
ss << "0x"
<< std::hex << std::uppercase << value;
return ss.str();
}
// 7. Helper function to convert a single byte to hex string
std::string ByteToHex(unsigned char byte)
{
char buffer[3];
sprintf_s(buffer, sizeof(buffer), "%02X", byte);
return std::string(buffer);
}
// 8. Helper function to convert a vector of bytes to hex string with spaces
std::string BytesToHex(const std::vector<unsigned char>& bytes)
{
std::string hexStr;
for (auto byte : bytes)
{
hexStr += ByteToHex(byte) + " ";
}
if (!hexStr.empty())
hexStr.pop_back(); // Remove trailing space
return hexStr;
}
// 19. Helper function to convert a vector of bytes to a human-readable string
std::string BytesToString(const std::vector<unsigned char>& bytes)
{
std::string result;
result.reserve(bytes.size());
for (auto byte : bytes)
{
if (isprint(byte))
{
result += static_cast<char>(byte);
}
else
{
result += '.'; // Placeholder for non-printable characters
}
}
return result;
}
// 10. Enqueue a log message
void LogMessage(const std::string& message)
{
{
std::lock_guard<std::mutex> guard(queueMutex);
logQueue.push(message);
}
cv.notify_one();
}
// 11. Logging thread function
void ProcessLogQueue()
{
while (isLoggingActive)
{
std::unique_lock<std::mutex> lock(queueMutex);
cv.wait(lock, [] { return !logQueue.empty() || !isLoggingActive; });
while (!logQueue.empty())
{
std::string msg = logQueue.front();
logQueue.pop();
lock.unlock(); // Unlock while writing to minimize lock contention
if (logFile.is_open())
{
logFile << msg;
// Optionally, implement log rotation or size checks here
}
lock.lock();
}
}
// Flush remaining messages before exiting
while (true)
{
std::lock_guard<std::mutex> guard(queueMutex);
if (logQueue.empty())
break;
std::string msg = logQueue.front();
logQueue.pop();
if (logFile.is_open())
{
logFile << msg;
}
}
}
// 12. Initialize logging system
bool InitializeLogging()
{
{
std::lock_guard<std::mutex> guard(queueMutex);
logFile.open("rusty_bin.log", std::ios::out | std::ios::app);
if (!logFile.is_open())
{
return false;
}
}
isLoggingActive = true;
logThread = std::thread(ProcessLogQueue);
return true;
}
// 13. Shutdown logging system
void ShutdownLogging()
{
isLoggingActive = false;
cv.notify_one();
if (logThread.joinable())
{
logThread.join();
}
{
std::lock_guard<std::mutex> guard(queueMutex);
if (logFile.is_open())
{
logFile.close();
}
}
}
// 14. Implement the HookedFunction with the same signature.
__int64 __fastcall HookedFunction(__int64 a1, __int64 a2, __int64 a3)
{
// Retrieve the return address using the MSVC intrinsic
void* returnAddress = _ReturnAddress();
// Get the base address of the target module
HMODULE hModule = GetModuleHandleA(TARGET_MODULE_NAME);
if (!hModule)
{
// If unable to get module handle, log and call the true function
std::string errorLog = "Failed to get module handle for " + std::string(TARGET_MODULE_NAME) + ".\n";
#ifdef AllowDebug
LogMessage(errorLog);
#endif
return TrueFunction(a1, a2, a3);
}
uintptr_t moduleBase = reinterpret_cast<uintptr_t>(hModule);
uintptr_t retAddr = reinterpret_cast<uintptr_t>(returnAddress);
uintptr_t rva = retAddr - moduleBase;
// Define the specific RVAs to check against
const std::vector<uintptr_t> validRVAs = { 0x17B1, 0x17C8 };
// Check if the return address RVA matches 0x17B1 or 0x17C8
bool shouldProcess = false;
for (auto& validRVA : validRVAs)
{
if (rva == validRVA)
{
shouldProcess = true;
break;
}
}
if (shouldProcess)
{
// Convert a1 and a3 to uintptr_t using static_cast
uintptr_t ptrA1 = static_cast<uintptr_t>(a1);
uintptr_t ptrA3 = static_cast<uintptr_t>(a3);
// Log the function call parameters using ToHex
std::string logMessage = "HookedFunction called with a1=" + ToHex(ptrA1) +
", a2=" + std::to_string(a2) + ", a3=" + ToHex(ptrA3) + "\n";
#ifdef AllowDebug
LogMessage(logMessage);
#endif
// Initialize variables for reading bytes
std::vector<unsigned char> currentBytes;
__int64 result = 0;
// Check if a1 is valid and a2 is positive
if (a1 != 0 && a2 > 0)
{
unsigned char* buffer = reinterpret_cast<unsigned char*>(a1);
// Reserve space to minimize reallocations
currentBytes.reserve(static_cast<size_t>(a2));
for (size_t i = 0; i < static_cast<size_t>(a2); ++i)
{
unsigned char byte = buffer[i];
currentBytes.push_back(byte);
}
// Convert bytes to hex string
std::string bytesHex = BytesToHex(currentBytes);
// Log the bytes read
#ifdef AllowDebug
LogMessage("Bytes read: " + bytesHex + "\n");
#endif
}
else
{
// Log invalid parameters
std::string invalidParamsLog = "Invalid a1 or a2. a1: " + ToHex(ptrA1) +
", a2: " + std::to_string(a2) + "\n";
#ifdef AllowDebug
LogMessage(invalidParamsLog);
#endif
}
// Data management: Handle isOdd and byteVectors
{
std::lock_guard<std::mutex> guard(dataMutex);
if (isOdd)
{
// Odd call: push the bytes read to byteVectors
byteVectors.push_back(currentBytes);
#ifdef AllowDebug
LogMessage("Pushed bytes to array.\n");
#endif
}
else
{
// Even call: perform XOR with the last vector in byteVectors
if (!byteVectors.empty())
{
const std::vector<unsigned char>& lastVector = byteVectors.back();
size_t minSize = (currentBytes.size() < lastVector.size()) ? currentBytes.size() : lastVector.size();
std::vector<unsigned char> xorResult;
xorResult.reserve(minSize);
for (size_t i = 0; i < minSize; ++i)
{
xorResult.push_back(currentBytes[i] ^ lastVector[i]);
}
// Convert XOR result to hex string
std::string xorHex = BytesToHex(xorResult);
// Convert XOR result to human-readable string
std::string xorString = BytesToString(xorResult);
// Log both hex and string representations
#ifdef AllowDebug
LogMessage("XOR output (Hex): " + xorHex + "\n");
#endif
LogMessage("XOR output (String): " + xorString + "\n");
}
else
{
#ifdef AllowDebug
// Log that there's no previous vector to XOR with
LogMessage("No previous byte vector to XOR with.\n");
#endif
}
}
// Toggle isOdd for the next call
isOdd = !isOdd;
}
// Call the original function
result = TrueFunction(a1, a2, a3);
// Log the function result
std::string resultLog = "Original function returned " + std::to_string(result) + "\n";
#ifdef AllowDebug
LogMessage(resultLog);
#endif
// Return the original result
return result;
}
else
{
// If the return address RVA is not 0x17B1 or 0x17C8, directly call the true function
return TrueFunction(a1, a2, a3);
}
}
// 15. Function to dynamically resolve the target function's address
sub_0x1AC0_t GetTargetFunctionAddress()
{
HMODULE hModule = GetModuleHandleA(TARGET_MODULE_NAME);
if (!hModule)
{
#ifdef AllowDebug
LogMessage("Failed to get handle of target module: " + std::string(TARGET_MODULE_NAME) + "\n");
#endif
return nullptr;
}
// Calculate the absolute address by adding the RVA to the module's base address.
uintptr_t funcAddr = reinterpret_cast<uintptr_t>(hModule) + FUNCTION_RVA;
return reinterpret_cast<sub_0x1AC0_t>(funcAddr);
}
// 16. Attach hooks
BOOL AttachHooks()
{
// Initialize logging system
if (!InitializeLogging())
{
// If the log file cannot be opened, return FALSE to prevent hooking
return FALSE;
}
// Dynamically resolve the original function address
TrueFunction = GetTargetFunctionAddress();
if (!TrueFunction)
{
#ifdef AllowDebug
LogMessage("TrueFunction is null. Cannot attach hook.\n");
#endif
ShutdownLogging();
return FALSE;
}
// Begin a Detour transaction
DetourTransactionBegin();
DetourUpdateThread(GetCurrentThread());
// Attach the hooked function
DetourAttach(&(PVOID&)TrueFunction, HookedFunction);
// Commit the transaction
LONG error = DetourTransactionCommit();
if (error == NO_ERROR)
{
#ifdef AllowDebug
LogMessage("Hooks successfully attached.\n");
#endif
return TRUE;
}
else
{
#ifdef AllowDebug
LogMessage("Failed to attach hooks. Error code: " + std::to_string(error) + "\n");
#endif
ShutdownLogging();
return FALSE;
}
}
// 17. Detach hooks
BOOL DetachHooks()
{
// Begin a Detour transaction
DetourTransactionBegin();
DetourUpdateThread(GetCurrentThread());
// Detach the hooked function
DetourDetach(&(PVOID&)TrueFunction, HookedFunction);
// Commit the transaction
LONG error = DetourTransactionCommit();
if (error == NO_ERROR)
{
#ifdef AllowDebug
LogMessage("Hooks successfully detached.\n");
#endif
// Shutdown logging system
ShutdownLogging();
return TRUE;
}
else
{
#ifdef AllowDebug
LogMessage("Failed to detach hooks. Error code: " + std::to_string(error) + "\n");
#endif
return FALSE;
}
}
// 18. DLL entry point
BOOL WINAPI DllMain(HINSTANCE hinst, DWORD dwReason, LPVOID reserved)
{
switch (dwReason)
{
case DLL_PROCESS_ATTACH:
DisableThreadLibraryCalls(hinst);
DetourRestoreAfterWith();
if (!AttachHooks())
{
// Handle hook attachment failure if necessary
// Note: At this point, logging might not be fully operational
}
break;
case DLL_PROCESS_DETACH:
if (!DetachHooks())
{
// Handle hook detachment failure if necessary
}
break;
}
return TRUE;
}
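For anyone who prefers the manual route, the pairing logic of the hook can also be sketched offline in Python. This is only a minimal sketch: it assumes you already dumped the buffers of the odd and even calls in order, and the sample bytes in the demo are made up, not values from the challenge binary.

```python
from typing import Iterable, List


def xor_bytes(cipher: bytes, key: bytes) -> bytes:
    """XOR two buffers, truncating to the shorter one (as the hook does)."""
    return bytes(c ^ k for c, k in zip(cipher, key))


def to_printable(data: bytes) -> str:
    """Replace non-printable bytes with '.', like BytesToString in the DLL."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


def decode_calls(buffers: Iterable[bytes]) -> List[str]:
    """Treat captured buffers as (odd call, even call) pairs and XOR each pair."""
    items = list(buffers)
    return [
        to_printable(xor_bytes(items[i], items[i + 1]))
        for i in range(0, len(items) - 1, 2)
    ]


if __name__ == "__main__":
    # Made-up sample data, NOT values taken from the challenge binary.
    key = b"\x13\x37\x13\x37"
    cipher = xor_bytes(b"flag", key)
    print(decode_calls([cipher, key]))
```

A trailing buffer without a partner is simply ignored, which mirrors how the hook only logs on even calls.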
Upon initial triage, the binary uses protobuf and saves its state on every tick to a file named game_state.pb. Upon observation, there are 12 X and O marks below the game screen, and sometimes they switch. By inference, all 12 must be O for us to win. A single O means a condition was met, so we must investigate to find the conditions so we can win this game.
We were able to recover the protobuf definition. Now we try to parse game_state.pb:
// cmd/deserialize.go
package main
import (
"fmt"
"log"
"os"
"strings"
"example.com/m/v2/pb"
"google.golang.org/protobuf/proto"
)
func main() {
// Path to the serialized Grid file
filePath := "game_state.pb"
// Read the serialized Grid from the file (os.ReadFile replaces the deprecated ioutil.ReadFile)
data, err := os.ReadFile(filePath)
if err != nil {
log.Fatalf("Failed to read file %s: %v", filePath, err)
}
// Create an empty Grid object
var grid pb.Grid
// Deserialize the data into the Grid object
err = proto.Unmarshal(data, &grid)
if err != nil {
log.Fatalf("Failed to deserialize Grid: %v", err)
}
// Generate Go code representation
goCode := formatGridAsGoCode("grid", &grid)
// Print the generated Go code
fmt.Println(goCode)
}
// formatGridAsGoCode formats the Grid object into a Go code snippet
func formatGridAsGoCode(varName string, grid *pb.Grid) string {
var sb strings.Builder
sb.WriteString(fmt.Sprintf("%s := &pb.Grid{\n", varName))
sb.WriteString(fmt.Sprintf(" Width: %d,\n", grid.Width))
sb.WriteString(fmt.Sprintf(" Height: %d,\n", grid.Height))
sb.WriteString(" Rows: []*pb.CellRow{\n")
for _, row := range grid.Rows {
sb.WriteString(" {\n")
sb.WriteString(" Cells: []*pb.Cell{\n")
for _, cell := range row.Cells {
sb.WriteString(fmt.Sprintf(" {Alive: %t, Color: %d},\n", cell.Alive, cell.Color))
}
sb.WriteString(" },\n")
sb.WriteString(" },\n")
}
sb.WriteString(" },\n")
sb.WriteString("}\n")
return sb.String()
}
We have successfully deserialized the pb file. Therefore we can also build a solution by forging our own data based on the winning conditions and serializing it.
Upon further reverse engineering, there is a win-criteria generation routine in the binary.
Upon investigating qwordWinCriteria, it seems to store the data for the win conditions.
Entry 0:
- Field 0 (Row): 0x0A00000000000000 -> 10
- Field 8 (Column): 0x0F00000000000000 -> 15
- Field 16 (Value): 0x1F00000000000000 -> 31
Entry 1:
- Field 0 (Row): 0x1400000000000000 -> 20
- Field 8 (Column): 0x1900000000000000 -> 25
- Field 16 (Value): 0x2000000000000000 -> 32
... and so on.
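The hex values above are the raw bytes in memory order, so each field reads as a little-endian 64-bit integer. Here is a minimal Python sketch of that decoding. It assumes each entry is exactly 24 bytes (three unsigned 8-byte fields at offsets 0, 8 and 16, no padding); parse_win_criteria is a name I made up for illustration.

```python
import struct

ENTRY_SIZE = 24  # assumption: three 8-byte fields at offsets 0, 8 and 16


def parse_win_criteria(raw: bytes):
    """Decode consecutive (row, column, value) little-endian 64-bit triples."""
    entries = []
    for off in range(0, len(raw) - ENTRY_SIZE + 1, ENTRY_SIZE):
        row, col, value = struct.unpack_from("<QQQ", raw, off)
        entries.append((row, col, value))
    return entries


if __name__ == "__main__":
    # The first two entries from the dump above, written in memory order.
    dump = bytes.fromhex(
        "0A00000000000000" "0F00000000000000" "1F00000000000000"
        "1400000000000000" "1900000000000000" "2000000000000000"
    )
    for i, (row, col, value) in enumerate(parse_win_criteria(dump)):
        print(f"Entry {i}: row={row} column={col} value={value}")
```

Running it on the two entries shown gives (10, 15, 31) and (20, 25, 32), matching the decimal values noted in the dump.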
With this information, we are now ready to craft the solution.
// serialize.go
package main
import (
"log"
"os"
"example.com/m/v2/pb"
"google.golang.org/protobuf/proto"
)
func main() {
// Define the win criteria
winCriteria := []struct {
Row int32
Column int32
Color int32
}{
{10, 15, 31},
{20, 25, 32},
{30, 35, 33},
{40, 45, 34},
{25, 50, 35},
{5, 55, 36},
{15, 60, 37},
{35, 65, 31},
{45, 70, 32},
{0, 75, 33},
{1, 80, 34},
{2, 85, 35},
}
// Initialize the grid
width := int32(400)
height := int32(50)
grid := &pb.Grid{
Width: width,
Height: height,
Rows: make([]*pb.CellRow, height),
}
// Initialize all cells to dead and color 0
for i := int32(0); i < height; i++ {
row := &pb.CellRow{
Cells: make([]*pb.Cell, width),
}
for j := int32(0); j < width; j++ {
row.Cells[j] = &pb.Cell{
Alive: false,
Color: 0,
}
}
grid.Rows[i] = row
}
// Apply the win criteria
for _, wc := range winCriteria {
if wc.Row >= 0 && wc.Row < height && wc.Column >= 0 && wc.Column < width {
grid.Rows[wc.Row].Cells[wc.Column].Alive = true
grid.Rows[wc.Row].Cells[wc.Column].Color = wc.Color
} else {
log.Fatalf("Win criteria position out of bounds: (%d, %d)", wc.Row, wc.Column)
}
}
// Serialize the Grid to binary format
data, err := proto.Marshal(grid)
if err != nil {
log.Fatalf("Failed to serialize Grid: %v", err)
}
// Write to a file
file, err := os.Create("game_state.pb") // The game expects this filename
if err != nil {
log.Fatalf("Failed to create file: %v", err)
}
defer file.Close()
_, err = file.Write(data)
if err != nil {
log.Fatalf("Failed to write data to file: %v", err)
}
log.Println("Grid serialized to game_state.pb successfully.")
}
We are given a PNG file and a binary. Upon initial triage, the binary seems to be a steganography tool. Our task is to retrieve the hidden file from the PNG by reversing the binary and writing a decryption tool.
Finding the entry point:
Now we reverse this gigantic function.
First, let’s understand the PNG file.
1. Understand the PNG File Structure
PNG files consist of an 8-byte signature followed by a series of chunks. Each chunk has the following format:
Length: 4 bytes (big-endian integer)
Chunk Type: 4 bytes (ASCII characters)
Chunk Data: Variable length
CRC: 4 bytes (Cyclic Redundancy Check)
Standard chunk types include IHDR, PLTE, IDAT, and IEND. However, PNG files can also contain custom ancillary chunks, which can be used to store additional data without affecting the image’s visual appearance.
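To make the layout concrete, here is a small Python sketch that builds and parses a single chunk. The CRC in a PNG chunk is a CRC-32 computed over the chunk type and the chunk data (not the length field), which is what zlib.crc32 gives us. build_chunk and parse_chunk are just illustrative helper names, not something taken from the challenge binary.

```python
import struct
import zlib


def build_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def parse_chunk(buf: bytes, offset: int = 0):
    """Return (type, data, crc_ok, next_offset) for the chunk at offset."""
    (length,) = struct.unpack_from(">I", buf, offset)
    chunk_type = buf[offset + 4:offset + 8]
    data = buf[offset + 8:offset + 8 + length]
    (stored_crc,) = struct.unpack_from(">I", buf, offset + 8 + length)
    crc_ok = stored_crc == (zlib.crc32(chunk_type + data) & 0xFFFFFFFF)
    return chunk_type, data, crc_ok, offset + 12 + length


if __name__ == "__main__":
    raw = build_chunk(b"IEND", b"")
    print(raw.hex())         # the familiar 12-byte IEND trailer
    print(parse_chunk(raw))
```

Each chunk costs 12 bytes of overhead (length, type and CRC) on top of its data, which is why next_offset advances by 12 + length.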
2. Identify Custom Chunks
From the code snippet, it seems the application is adding custom chunks to the PNG file. Look for chunk types that are not standard. In the code, you can see references to functions that handle chunks, such as sub_140005D60, which appears to add a chunk with a given type.
sub_140005D60(&v70, &v50, "IHDR", 4i64);
But since IHDR is a standard chunk, look for other custom chunk types being used. Since the code is obfuscated, we might not see the actual chunk names directly. However, we can infer that custom chunks are being added to store the flag data.
So what I did was put a breakpoint at `v19 = v69`, as this variable would likely contain information on how the chunks are stored.
1st bp hit:
debug023:0000023C584F7EC0 db 62h ; b
debug023:0000023C584F7EC1 db 69h ; i
debug023:0000023C584F7EC2 db 54h ; T
debug023:0000023C584F7EC3 db 61h ; a
debug023:0000023C584F7EC4 db 3Ch ; <
debug023:0000023C584F7EC5 db 2
2nd bp hit:
debug023:0000023C584F7EC0 db 62h ; b
debug023:0000023C584F7EC1 db 69h ; i
debug023:0000023C584F7EC2 db 54h ; T
debug023:0000023C584F7EC3 db 62h ; b
debug023:0000023C584F7EC4 db 3Ch ; <
debug023:0000023C584F7EC5 db 2
3rd bp hit:
debug023:0000023C584F7EC0 db 62h ; b
debug023:0000023C584F7EC1 db 69h ; i
debug023:0000023C584F7EC2 db 54h ; T
debug023:0000023C584F7EC3 db 63h ; c
debug023:0000023C584F7EC4 db 3Ch ; <
debug023:0000023C584F7EC5 db 2
It just repeats, with only the byte at 0000023C584F7EC3 changing alphabetically until it reaches `i`.
Notable Patterns:
The bytes at offsets 0 to 2 are constant: 'b', 'i', 'T'.
The byte at offset 3 changes from 'a' to 'b' to 'c', incrementing alphabetically up to 'i'.
The rest of the bytes remain constant or contain padding.
Interpreting the Data
Given that the data starts with 'biT' followed by a changing letter, it’s likely that this forms a chunk type in the PNG file.
Chunk Type Formation:
Chunk Type: 4 ASCII characters.
The observed chunk types are:
'biTa'
'biTb'
'biTc'
…
'biTi'
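These names also fit the PNG chunk naming convention, where the case of each letter carries a property bit: a lowercase first letter means ancillary, a lowercase second letter means private, the third letter is reserved and should be uppercase, and a lowercase fourth letter means safe to copy. A quick Python sketch to decode that (chunk_type_properties is an illustrative name of my own):

```python
def chunk_type_properties(chunk_type: str) -> dict:
    """Decode the four property bits encoded in the case of each letter."""
    if len(chunk_type) != 4 or not (chunk_type.isascii() and chunk_type.isalpha()):
        raise ValueError("chunk type must be four ASCII letters")
    return {
        "ancillary": chunk_type[0].islower(),
        "private": chunk_type[1].islower(),
        "reserved_bit_set": chunk_type[2].islower(),
        "safe_to_copy": chunk_type[3].islower(),
    }


if __name__ == "__main__":
    # 'biTa' through 'biTi', as observed at the breakpoint.
    for suffix in "abcdefghi":
        name = "biT" + suffix
        print(name, chunk_type_properties(name))
```

So 'biTa' and friends are ancillary, private chunks, which is why an image viewer can ignore them and still render the picture normally.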
Understanding the Application’s Behavior
From the decompiled code and these observations, the application seems to:
Create Custom PNG Chunks:
It generates multiple custom chunks with types 'biTa', 'biTb', …, 'biTi'.
These chunks are likely used to store encrypted portions of the flag.
Encrypt Flag Data:
The flag is divided into segments.
Each segment is XORed with a key derived from the chunk type or chunk data.
The encrypted segments are stored in the corresponding custom chunks.
Key Derivation:
The key used for XORing seems to be derived from the chunk data (v69) or possibly the chunk type.
Since v19 = v69, and v69 points to the data starting with 'biTa', it’s possible that the chunk data itself is used as the key.
Reversing the Process
To extract and decode the embedded flag, we’ll need to:
Parse the PNG File and Extract Custom Chunks:
Read the PNG file and extract all chunks, including custom ones with types 'biTa', 'biTb', …, 'biTi'.
Collect Encrypted Data and Keys:
For each custom chunk:
Extract the encrypted data (chunk data).
Derive the key from the chunk data or type.
Decrypt the Data:
XOR the encrypted data with the derived key to recover the original flag segments.
Concatenate the decrypted segments to reconstruct the full flag.
1. Read the PNG File and Extract Chunks
import struct

def read_chunks(file_path):
    with open(file_path, 'rb') as f:
        # Read the PNG signature
        signature = f.read(8)
        if signature != b'\x89PNG\r\n\x1a\n':
            raise Exception('Not a valid PNG file')
        chunks = []
        while True:
            # Read the length (4 bytes)
            length_bytes = f.read(4)
            if len(length_bytes) < 4:
                break  # End of file
            length = struct.unpack('>I', length_bytes)[0]
            # Read the chunk type (4 bytes)
            chunk_type = f.read(4).decode('ascii')
            # Read the chunk data
            data = f.read(length)
            # Read the CRC (4 bytes)
            crc = f.read(4)
            chunks.append({
                'type': chunk_type,
                'data': data,
                'crc': crc
            })
    return chunks
2. Identify Custom Chunks
def extract_custom_chunks(chunks):
    standard_chunks = {
        'IHDR', 'PLTE', 'IDAT', 'IEND', 'tEXt', 'zTXt', 'iTXt',
        'bKGD', 'cHRM', 'gAMA', 'hIST', 'iCCP', 'pHYs', 'sBIT',
        'sPLT', 'sRGB', 'tIME', 'tRNS'
    }
    custom_chunks = []
    for chunk in chunks:
        if chunk['type'] not in standard_chunks:
            custom_chunks.append(chunk)
    return custom_chunks
3. Sort Chunks Based on Sequence
def sort_custom_chunks(chunks):
    # Sort chunks based on the fourth character of the chunk type
    return sorted(chunks, key=lambda c: c['type'][3])

def xor_decrypt(data, key):
    decrypted = bytearray()
    key_length = len(key)
    for i in range(len(data)):
        decrypted_byte = data[i] ^ key[i % key_length]
        decrypted.append(decrypted_byte)
    return bytes(decrypted)
6. Combine Decrypted Segments
def extract_flag_from_chunks(chunks):
    flag_parts = []
    for chunk in chunks:
        key = derive_key_from_chunk_type(chunk['type'])
        # Or use derive_key_from_chunk_data(chunk)
        encrypted_data = chunk['data']
        decrypted_data = xor_decrypt(encrypted_data, key)
        flag_parts.append(decrypted_data)
    flag = b''.join(flag_parts)
    return flag.decode()
7. Full Extraction Script
def extract_flag(file_path):
    chunks = read_chunks(file_path)
    custom_chunks = extract_custom_chunks(chunks)
    sorted_chunks = sort_custom_chunks(custom_chunks)
    flag = extract_flag_from_chunks(sorted_chunks)
    return flag

# Example usage
flag = extract_flag('embedded_flag.png')
print("Recovered Flag:", flag)
Full Code
import struct
import sys


def read_chunks(file_path):
    """
    Reads all chunks from a PNG file.
    :param file_path: Path to the PNG file.
    :return: List of chunks with their type, data, and CRC.
    """
    chunks = []
    with open(file_path, 'rb') as f:
        # Read the PNG signature (8 bytes)
        signature = f.read(8)
        if signature != b'\x89PNG\r\n\x1a\n':
            raise Exception('Not a valid PNG file')
        while True:
            # Read the length of the chunk data (4 bytes, big-endian)
            length_bytes = f.read(4)
            if len(length_bytes) < 4:
                break  # End of file reached
            length = struct.unpack('>I', length_bytes)[0]
            # Read the chunk type (4 bytes)
            chunk_type = f.read(4).decode('ascii')
            # Read the chunk data
            data = f.read(length)
            # Read the CRC (4 bytes)
            crc = f.read(4)
            chunks.append({
                'type': chunk_type,
                'data': data,
                'crc': crc
            })
    return chunks


def extract_custom_chunks(chunks):
    """
    Filters out standard PNG chunks to extract custom chunks.
    :param chunks: List of all chunks from the PNG file.
    :return: List of custom chunks.
    """
    standard_chunks = {
        'IHDR', 'PLTE', 'IDAT', 'IEND', 'tEXt', 'zTXt', 'iTXt',
        'bKGD', 'cHRM', 'gAMA', 'hIST', 'iCCP', 'pHYs', 'sBIT',
        'sPLT', 'sRGB', 'tIME', 'tRNS'
    }
    custom_chunks = []
    for chunk in chunks:
        if chunk['type'] not in standard_chunks:
            custom_chunks.append(chunk)
    return custom_chunks


def sort_custom_chunks(chunks):
    """
    Sorts custom chunks based on the fourth character of the chunk type.
    :param chunks: List of custom chunks.
    :return: Sorted list of custom chunks.
    """
    return sorted(chunks, key=lambda c: c['type'][3])


def derive_key_from_chunk_type(chunk_type):
    """
    Derives the key from the chunk type.
    :param chunk_type: Type of the chunk (string).
    :return: Key as bytes.
    """
    return chunk_type.encode('ascii')


def xor_decrypt(data, key):
    """
    Decrypts data by XORing it with the key.
    :param data: Encrypted data as bytes.
    :param key: Key as bytes.
    :return: Decrypted data as bytes.
    """
    decrypted = bytearray()
    key_length = len(key)
    for i in range(len(data)):
        decrypted_byte = data[i] ^ key[i % key_length]
        decrypted.append(decrypted_byte)
    return bytes(decrypted)


def extract_flag_from_chunks(chunks):
    """
    Extracts and decrypts the flag from custom chunks.
    :param chunks: List of sorted custom chunks.
    :return: Decrypted flag as a string.
    """
    flag_parts = []
    for chunk in chunks:
        key = derive_key_from_chunk_type(chunk['type'])
        encrypted_data = chunk['data']
        decrypted_data = xor_decrypt(encrypted_data, key)
        flag_parts.append(decrypted_data)
    flag = b''.join(flag_parts)
    # Remove padding if any (e.g., 0xAB bytes)
    flag = flag.rstrip(b'\xAB')
    return flag.decode('utf-8', errors='replace')


def extract_flag(file_path):
    """
    Main function to extract the flag from the PNG file.
    :param file_path: Path to the PNG file.
    :return: Decrypted flag as a string.
    """
    # Read all chunks from the PNG file
    chunks = read_chunks(file_path)
    # Extract custom chunks where the flag is hidden
    custom_chunks = extract_custom_chunks(chunks)
    # Sort the custom chunks based on their sequence
    sorted_chunks = sort_custom_chunks(custom_chunks)
    # Extract and decrypt the flag from the custom chunks
    flag = extract_flag_from_chunks(sorted_chunks)
    return flag


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Usage: python extract_flag.py <path_to_png_file>")
        sys.exit(1)
    png_file_path = sys.argv[1]
    try:
        recovered_flag = extract_flag(png_file_path)
        print("Recovered Flag:", recovered_flag)
    except Exception as e:
        print("An error occurred:", str(e))
        sys.exit(1)
This challenge is an executable file with regions that can never be reached because of built-in logic conditions. The challenge is to redirect the flow and force it to reach the memory regions that contain the flag.
In the main function:
Notice that whatever happens, it always lands in that else block. How about we force the condition to be true? Or simply NOP the jump to the else block.
Before:
After:
Another interesting function is this one.
However, the logic prevents execution from reaching that block, so we patch it.
Before:
After:
We also notice a function return that prevents us from going further down, so we patch it too.
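The same kind of patch can be scripted instead of done by hand in the disassembler. Below is a minimal Python sketch that overwrites a byte range with single-byte x86 NOPs (0x90). The toy buffer and the offset are made up for illustration; the real offsets depend on where the jump sits in the challenge file.

```python
NOP = 0x90  # single-byte x86 NOP


def nop_out(image: bytes, offset: int, length: int) -> bytes:
    """Return a copy of image with length bytes at offset replaced by NOPs."""
    if offset < 0 or length < 0 or offset + length > len(image):
        raise ValueError("patch range is outside the image")
    patched = bytearray(image)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


if __name__ == "__main__":
    # Toy buffer: test eax, eax / jz +5 / xor eax, eax (hypothetical bytes).
    code = bytes.fromhex("85c0" "7405" "31c0")
    print(nop_out(code, 2, 2).hex())
```

Keeping the patch the same length as the original instruction matters, because every other offset in the file stays valid.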
So it’s been a while since I posted a blog. I was so busy with other things, especially adjusting the schedule with my work and my studies.
In this short article, I’ll discuss some very basic techniques for evading anti-cheat. Of course, you would still need to adjust the evasion mechanism depending on the anti-cheat you are trying to defeat.
In this blog, we will focus on internal anti-cheat evasion techniques.
Part 1: The injector
The first part of making your “cheat” is creating an executable that injects your .dll into the process, a.k.a. the game.
There are a lot of injection mechanisms (descriptions copied from Cynet). The list below is not exhaustive:
Classic DLL injection
Classic DLL injection is one of the most popular techniques in use. First, the malicious process injects the path to the malicious DLL in the legitimate process’ address space. The Injector process then invokes the DLL via a remote thread execution. It is a fairly easy method, but with some downsides:
Reflective DLL injection
Reflective DLL injection, unlike the previous method mentioned above, refers to loading a DLL from memory rather than from disk. Windows does not have a LoadLibrary function that supports this. To achieve the functionality, adversaries must write their own function, omitting some of the things Windows normally does, such as registering the DLL as a loaded module in the process, potentially bypassing DLL load monitoring.
Thread execution hijacking
Thread Hijacking is an operation in which a malicious shellcode is injected into a legitimate thread. Like Process Hollowing, the thread must be suspended before injection.
PE Injection / Manual Mapping
Like Reflective DLL injection, PE injection does not require the executable to be on the disk. This is the most often used technique seen in the wild. PE injection works by copying its malicious code into an existing open process and causing it to execute. To understand how PE injection works, we must first understand shellcode.
Shellcode is a sequence of machine code, or executable instructions, that is injected into a computer’s memory with the intent of taking control of a running program. Most shellcodes are written in assembly language.
Manual Mapping + Thread execution hijacking = Best Combo
Of all of these, I think the stealthiest technique is manual mapping combined with thread hijacking. When you manually map a DLL into memory, you don’t need to call the DLL-related WinAPI functions, because you are emulating the whole loading process yourself. Windows isn’t aware that a DLL has been loaded, so it doesn’t link the DLL into the PEB, and it doesn’t create loader structs or thread local storage. On top of that, since you use thread hijacking to execute the DLL, you are not creating a new thread, so you are safe from anti-cheats that check for suspicious newly spawned threads. After the DLL finishes its initialization and sets up its hooks, it returns the hijacked thread to its original state, as if nothing happened.
Line 7 is where you put the image base address, line 9 is for dwReason, line 11 is for the DLL’s entry point, and line 14 is for the original thread RIP that execution jumps back to after the DLL finishes.
This injection mechanism is prone to a lot of crashes. Approximately 1 out of 5 injections succeeds. You need to load the game up to the lobby screen, then open the injector. If it crashes, just restart the game and repeat the process until the injection succeeds.
Part 2: The DLL
Of course, in the DLL itself, you still need to do some cleanup. The injection part is done but the “main event of the evening” is just getting started.
This one unlinks the DLL from the PEB. But since we are doing a manual map, it has no effect at all, because Windows didn’t even know that a DLL was loaded. It is useful, though, if we inject the DLL using the classic injection method.
FakePeHeader
This one replaces the PE header of the DLL with a fake one. Most memory scanners try to find suspicious memory locations by checking whether a PE exists there. An MS-DOS header begins with the magic value 0x5A4D (the bytes “MZ”), so if a memory region begins with those magic bytes, chances are a PE is occupying that space. After that, the memory scanner might read the header for more information on what is really loaded at that memory location.
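To see what such a scanner is looking for, here is a minimal Python sketch of the check from the scanner’s side. It relies only on the documented PE layout: the “MZ” magic at offset 0, the e_lfanew field at offset 0x3C, and the “PE\0\0” signature at the offset e_lfanew points to. looks_like_pe is a hypothetical helper name, and real scanners certainly do more than this.

```python
import struct


def looks_like_pe(buf: bytes) -> bool:
    """Check for the 'MZ' DOS magic and the 'PE\\0\\0' signature at e_lfanew."""
    if len(buf) < 0x40 or buf[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    return buf[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


if __name__ == "__main__":
    # Minimal synthetic header: DOS magic, e_lfanew = 0x40, then the PE signature.
    header = bytearray(0x44)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    header[0x40:0x44] = b"PE\x00\x00"
    print(looks_like_pe(bytes(header)))  # synthetic header
    print(looks_like_pe(bytes(0x44)))    # all-zero buffer
```

A region whose first bytes no longer match these signatures fails this particular check, which is the whole idea behind swapping the header.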
No Thread Creation
THIS IS IMPORTANT! Since we are hooking IDXGISwapChain::Present, we don’t see any reason to keep another thread running, so after our DLL finishes its setup, we return control of the thread to its original state. We can use the PresentHook to continue our “dirty business” inside the program’s memory. Besides, as mentioned earlier, having extra threads can lead to anti-cheat flagging.
CALLBACKS_INSTANCE = new CALLBACKS();
MAINMENU_INSTANCE = new MAINMENU();
XORSTR
Ah, yes, the XORSTR. We can use this to hide the real string, which is only computed upon use. To demonstrate the XORSTR, here is a sample usage. Focus on the line with the “##overlay” string.
And this is what it looks like after compiling and putting it under decompiler.
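The idea behind it is plain XOR encoding. The Python sketch below only illustrates the concept and is not the XORSTR macro itself (which does the encoding at compile time in C++); the single-byte key and the helper names are my own assumptions.

```python
def xor_obfuscate(text: str, key: int) -> bytes:
    """Encode text so the plain string never appears in the stored bytes."""
    if not 0 <= key <= 255:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in text.encode("utf-8"))


def xor_reveal(blob: bytes, key: int) -> str:
    """Decode at the point of use; XOR with the same key is its own inverse."""
    if not 0 <= key <= 255:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in blob).decode("utf-8")


if __name__ == "__main__":
    blob = xor_obfuscate("##overlay", 0x5A)
    print(blob.hex())            # what a strings-style scan would see
    print(xor_reveal(blob, 0x5A))
```

Because XOR is symmetric, the same operation both hides and recovers the string, so only the encoded bytes need to be stored.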
Other methodologies
There are a few more basic methodologies that weren’t applied in the project. These include, but are not limited to:
Anti-debugging
Anti-VM
Polymorphism and code mutation (to avoid heuristic pattern scanners)
So, with the basic knowledge we have here, we tried to inject this into a common game whose anti-cheat still runs in ring3 (because ring0 ACs are much harder to defeat).
BEWARE THAT THE ABOVE SCREENSHOTS WERE TAKEN IN A NON-COMPETITIVE MODE AND ARE FOR EDUCATIONAL PURPOSES ONLY. I AM NOT RESPONSIBLE FOR ANY ACTION YOU TAKE WITH THE KNOWLEDGE THAT I SHARED WITH YOU.
And now we have reached the end of this blog. Before I finish this article, I want to say thank you for reading the entire post. I also want to share that I passed the CISSP last October 2023, but wasn’t able to post an update here due to a heavy workload.
Again, I am really grateful for your time. Until next time!
Quick Context: Okay, so recently, we came across a fancy NFT project wherein “Students” are invited to join “Quizzes” and “Projects” to “Graduate”.
A “Graduate” is whitelisted for the mint of the NFT collection.
Our Goal
Our goal is to get to the top of the leaderboard so we can secure our whitelist slot. We want this by all means, so we use our hacker instinct to gain an advantage on the quiz.
However, we didn’t want to overkill the contest. We didn’t spawn bots to automatically answer the quizzes (which is easy to do); we stuck with our bare hands and answered the quizzes manually. We also kept to one account per human. We don’t want to disrupt the experience of other people.
The quiz
The quiz is a client-side web app, meaning the password for the quiz and all of its questions are handed to the client without any authorization checks. Below are the steps of our reconnaissance and enumeration to extract the password and the set of questions for a quiz.
Cracking the Password
Every quiz has a different password. Our goal is to crack the password before the quiz starts (hours before, so we have enough time to crack it).
Upon logging in and browsing to the /quiz page, we could see the web API requests. One response includes juicy information: a JSON body with the quiz details. We wrote down the _id and the password (stored as a bcrypt hash) in our notes.
The first thing we did was list likely passwords and compare them against the hash. Sadly, none of our guesses matched.
What is Bcrypt?
The input to the bcrypt function is the password string (up to 72 bytes), a numeric cost, and a 16-byte (128-bit) salt value. The salt is typically a random value. The bcrypt function uses these inputs to compute a 24-byte (192-bit) hash. The final output of the bcrypt function is a string of the form:
$2<a/b/x/y>$[cost]$[22 character salt][31 character hash]
For example, with input password abc123xyz, cost 12, and a random salt, the output of bcrypt is the string
$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW
\__/\/ \____________________/\_____________________________/
Alg Cost      Salt                        Hash
Where:
$2a$: The hash algorithm identifier (bcrypt)
12: Input cost (2^12, i.e. 4096 rounds)
R9h/cIPz0gi.URNNX3kh2O: A base-64 encoding of the input salt
PST9/PgBkqquzi.Ss7KIUgO2t0jWMUW: A base-64 encoding of the first 23 bytes of the computed 24 byte hash
The base-64 encoding in bcrypt uses the table ./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789, which is different from the RFC 4648 Base64 encoding.
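The field layout above can be checked mechanically. The following is a small sketch, assuming the 60-character `$2<a/b/x/y>$[cost]$[salt][hash]` form described here; the struct BcryptParts and the functions parse_bcrypt and bcrypt_rounds are hypothetical names, not part of any bcrypt library.

```cpp
#include <cstdint>
#include <string>

// Hypothetical container for the fields of a bcrypt hash string.
struct BcryptParts {
    std::string version;  // e.g. "2a"
    int cost = 0;         // log2 of the number of rounds
    std::string salt;     // 22 characters, bcrypt base-64
    std::string hash;     // 31 characters, bcrypt base-64
};

// Splits a 60-character bcrypt string into its fields; returns false if the layout does not match.
bool parse_bcrypt(const std::string& s, BcryptParts& out) {
    // Layout: '$' + 2-char version + '$' + 2-digit cost + '$' + 22 salt chars + 31 hash chars.
    if (s.size() != 60) return false;
    if (s[0] != '$' || s[1] != '2' || s[3] != '$' || s[6] != '$') return false;
    if (s[4] < '0' || s[4] > '9' || s[5] < '0' || s[5] > '9') return false;

    out.version = s.substr(1, 2);
    out.cost = (s[4] - '0') * 10 + (s[5] - '0');
    out.salt = s.substr(7, 22);
    out.hash = s.substr(29, 31);
    return true;
}

// Number of rounds implied by a cost value (2^cost); returns 0 for values outside 0..31.
std::uint64_t bcrypt_rounds(int cost) {
    if (cost < 0 || cost > 31) return 0;
    return std::uint64_t{1} << cost;
}
```

Feeding it the example string gives version 2a, cost 12 (4096 rounds), and the same salt and hash fields listed above.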
Back to our discussion
So now that we know the basics of bcrypt, we can start attacking the password hash.
Luckily, we have a tool named hashcat. With no other ideas about the password, we fall back to brute force. We also know that the password contains only digits, so we can brute-force incrementally from 0 up to 10^n, where n is the number of digits.
Here, we tell hashcat that our attack mode is brute-force (-a 3), to increment the password length on each pass (--increment), to start from 1 digit (--increment-min 1), and to stop at a maximum of 8 digits (--increment-max 8). We then give it the password hash that we found earlier ($2a$10$msFPZnG.NKHaCcVupGsQyuvpB8IwtZ7v3UxPBwf3fXe8hGdCMEwsu) and the mask that we want hashcat to follow (?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d?d).
And after a couple of minutes, we cracked the hash!
It took only 8 minutes for my GTX1050 to crack a 5-digit password. It would definitely take longer if the password had more than 5 digits. Luckily, the password for this quiz is shorter than those of the first set of quizzes, so we were able to brute-force it in a very short amount of time.
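To get a feel for how much work an incremental digit mask implies, here is a small sketch that counts the candidates for lengths 1 through 8. The function names are hypothetical, and the hash rate passed to worst_case_seconds is whatever you measure on your own hardware; no particular rate is assumed here.

```cpp
#include <cstdint>

// Number of all-digit candidates with lengths from min_len to max_len (inclusive).
// Each length n contributes 10^n candidates.
std::uint64_t digit_keyspace(int min_len, int max_len) {
    std::uint64_t total = 0;
    std::uint64_t per_length = 1;
    for (int len = 1; len <= max_len; ++len) {
        per_length *= 10;
        if (len >= min_len) {
            total += per_length;
        }
    }
    return total;
}

// Worst-case time to exhaust the keyspace at a given (measured) hash rate.
double worst_case_seconds(std::uint64_t keyspace, double hashes_per_second) {
    return static_cast<double>(keyspace) / hashes_per_second;
}
```

For lengths 1 to 8 this gives 111,111,110 candidates, while stopping at 5 digits leaves only 111,110. That gap is why a short numeric password falls quickly even against a slow hash like bcrypt.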
Extracting Questions
We found a page where we can browse the quiz. We just enter the password that we found for this quiz.
The web app then makes a request to the web API, and the response contains more juicy information, including the quiz questions (testData).
We just parse the testData. And boom! We successfully extracted the PASSWORD and the QUESTIONS.
Conclusion
I understand the developers’ intention: they don’t want participants to effectively “DDoS” their servers with a lot of authentication and authorization traffic. They hand all the password and quiz data to the client because they want validation to happen on the client side rather than put load on their servers.
The web app’s architecture does not really follow Zero Trust principles, because it lets clients authorize themselves and “trusts” them without proper validation.
Thanks for reading this short writeup! I hope you enjoyed it, and see you in my next writeup!
The goal of this writeup is to create an additional layer of defense against analysis. A lot of malware uses this technique to make binary analysis harder.
Polymorphism is an important concept of object-oriented programming. It simply means more than one form. That is, the same entity (function or operator) behaves differently in different scenarios
www.programiz.com
We can implement polymorphism in C++ using the following ways:
Now, let’s get it working. For this article, we are using two basic classes named HEAVENSGATE_BASE and HEAVENSGATE.
Then we will call a function on an instantiated object.
Normal Declarations
When we examine the function call (Fig2) in IDA, we get the following result:
And when we cross-reference the function, we see:
The xref in .rdata is a reference from the virtual table of the instantiated object, and the xref in InitThread is the call to the function (Fig2).
Basic Obfuscation
So, how do we apply basic obfuscation?
We just need to change the declaration of the object to the “_BASE” type.
Earlier, the pointer was declared with the class HEAVENSGATE. This time we declare it with the “_BASE” class.
In IDA, we can see the following instructions:
Well, technically, it isn’t obfuscated. But when an analyst doesn’t have the .pdb file that contains the symbol names, it is harder to follow the calls and work out the purpose of a given call without using a debugger.
This disassembly shows exactly what is going on under the hood with respect to polymorphism. To invoke the function, the compiler moves the address of the object into the EDX register. This is then dereferenced to get the base of the VMT, which is stored in the EAX register. The appropriate VMT entry for the function is found by using EAX as an index, and its address is stored in EDX. This function is then called. Since HEAVENSGATE_BASE and HEAVENSGATE have different VMTs, this code calls the appropriate function for the actual object type. Seeing how it’s done under the hood also allows us to easily write a function to print the VMT.
We can now see that the direct call (compare with Fig5) is gone. Traces and footprints become harder to follow.
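As a rough source-level illustration of the pattern, the sketch below reuses the two class names from this article. The member function Describe and the helper CallThroughBase are hypothetical stand-ins, since the real class bodies are not shown here.

```cpp
#include <string>

// Class names follow the article; the members are hypothetical.
class HEAVENSGATE_BASE {
public:
    virtual ~HEAVENSGATE_BASE() = default;
    virtual std::string Describe() const { return "base"; }
};

class HEAVENSGATE : public HEAVENSGATE_BASE {
public:
    std::string Describe() const override { return "derived"; }
};

// The parameter has the "_BASE" type, so the callee is picked from the object's
// virtual table at run time instead of being named directly at the call site.
std::string CallThroughBase(const HEAVENSGATE_BASE& object) {
    return object.Describe();
}
```

Keep in mind that an optimizing compiler may devirtualize the call when it can prove the dynamic type, for example when the object is created and used in the same function, so the indirect call is not guaranteed in every build.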
Conclusion
Dividing each class into two, a Base and the Original class, is a time-consuming task. It also makes the code look ugly. But it can add meaningful protection against analysis of our binary.
This won’t take long. Just a quick fix for the heaven’s gate hook (http://mark.rxmsolutions.com/through-the-heavens-gate/) after Microsoft updated wow64cpu.dll, which manages the translation from 32-bit to 64-bit syscalls for WoW64 applications.
To better visualize the change, here is the comparison of before and after.
With that being said, you cannot simply place a hook at 0x3010, as it would overwrite 8 bytes and destroy the call mechanism, even if you fix the displacement of the call.
The solution
The solution is pretty simple. As in very, very simple. Copy all the bytes from 0x3010 down to 0x302D. Fix the displacement only for the copied jmp at 0x3028. Then place the hook at 0x3010. Basically, the copied gate (placed in memory from VirtualAlloc or in a code cave) takes over the work of the original 0x3010, so the original 0x3015 and onwards will never be executed again.
Pretty easy right?
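The displacement fix is the only arithmetic involved. Here is a minimal sketch of it, under the assumption that the jmp at 0x3028 is the 5-byte `E9 rel32` form and that addresses are 32-bit; the function names are hypothetical. Since a relative jmp encodes target minus the address of the next instruction, moving the bytes from old_base to new_base means adding (old_base - new_base) to the stored displacement.

```cpp
#include <cstddef>
#include <cstdint>

// Reads a little-endian 32-bit value from a byte buffer.
std::uint32_t read_le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Writes a 32-bit value to a byte buffer in little-endian order.
void write_le32(std::uint8_t* p, std::uint32_t value) {
    for (int i = 0; i < 4; ++i) {
        p[i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}

// Absolute target of an E9 rel32 jmp at code[offset], when code[0] is mapped at `base`.
std::uint32_t jmp_target(const std::uint8_t* code, std::size_t offset, std::uint32_t base) {
    return base + static_cast<std::uint32_t>(offset) + 5 + read_le32(code + offset + 1);
}

// Rewrites the displacement of a copied E9 rel32 jmp so it still reaches the original target.
// Arithmetic is modulo 2^32, matching how 32-bit code resolves relative jumps.
bool fix_copied_jmp(std::uint8_t* copy, std::size_t size, std::size_t offset,
                    std::uint32_t old_base, std::uint32_t new_base) {
    if (size < 5 || offset > size - 5 || copy[offset] != 0xE9) {
        return false;  // not the instruction form this sketch assumes
    }
    const std::uint32_t old_disp = read_le32(copy + offset + 1);
    write_le32(copy + offset + 1, old_disp + (old_base - new_base));
    return true;
}
```

With the gate copied starting at 0x3010, the jmp sits at offset 0x18 (0x3028 - 0x3010) inside the copy, and that is the offset to pass in.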
Notes
In the past, Microsoft used a far jump to set CS to 0x33. The 0x33 selector means that execution switches to 64-bit long mode, which is how 32-bit calls get translated to 64-bit. Now they have managed to build the bridge without a far jmp. I still need to do a lot of reading to understand this new mechanism, so if you know more, please do let me know!
I am now close to finishing the HTB Junior Pentester role course, but I decided to take a quick break and focus on one of my favorite fields: reversing games and evading anti-cheat.
The goal
The end goal is simple: to get Cheat Engine past usermode anti-cheats and to debug a game using a type-1 hypervisor.
This writeup will be divided into 3 parts.
First is the concept of Direct Kernel Object Manipulation, used to unlink a process from the EPROCESS list.
Second is the concept of using a hypervisor for debugging.
And last are the concepts of PatchGuard and Driver Signature Enforcement, and how to disable them.
So without further ado, let’s get our hands dirty!
Difference Between Kernel mode and User mode
Kernel-mode vs User mode
Kernel mode: the program has direct and unrestricted access to system resources.
User mode: application programs start and run here, with only restricted access to system resources.
Interruptions
Kernel mode: the whole operating system might go down if an unhandled fault occurs.
User mode: only the single process fails if an unhandled fault occurs.
Modes
Kernel mode is also known as the master mode, privileged mode, or system mode.
User mode is also known as the unprivileged mode, restricted mode, or slave mode.
Virtual address space
Kernel mode: all processes share a single virtual address space.
User mode: each process gets its own separate virtual address space.
Level of privilege
Kernel mode: code has more privileges than in user mode.
User mode: applications have fewer privileges.
Restrictions
Kernel mode: can access both user programs and kernel programs, so there are no restrictions.
User mode: cannot access kernel programs directly and has to go through system calls.
Mode bit value
Kernel mode: the mode bit is 0 (ring 0 on x86).
User mode: the mode bit is 1 (on x86, user code runs in ring 3).
Memory References
Kernel mode: can reference both kernel and user memory areas.
User mode: can only reference memory allocated for user mode.
System Crash
Kernel mode: a crash is severe and makes things more complicated.
User mode: a crash can be recovered from by simply resuming the session.
Access
Kernel mode: only essential functionality is permitted to operate in this mode.
User mode: user programs can access and execute in this mode.
Functionality
Kernel mode: can refer to any memory block in the system and can also direct the CPU to execute any instruction, making it a very potent and significant mode.
User mode: the standard mode for applications. Code cannot execute privileged instructions or reference arbitrary memory blocks on its own; it needs an Application Programming Interface (API) to achieve these things.
Basically, if the anti-cheat resides only in usermode, it doesn’t have total control of the system. If you manage to get into kernelmode, you can easily manipulate all objects and events in usermode. However, it is not advisable to write the whole cheat in the kernel. A single mistake can cause a Blue Screen Of Death. Still, we do need the kernel to give us easy read and write access to processes.
EPROCESS
The EPROCESS structure is an opaque structure that serves as the process object for a process.
Some routines, such as PsGetProcessCreateTimeQuadPart, use EPROCESS to identify the process to operate on. Drivers can use the PsGetCurrentProcess routine to obtain a pointer to the process object for the current process and can use the ObReferenceObjectByHandle routine to obtain a pointer to the process object that is associated with the specified handle. The PsInitialSystemProcess global variable points to the process object for the system process.
Note that a process object is an Object Manager object. Drivers should use Object Manager routines such as ObReferenceObject and ObDereferenceObject to maintain the object’s reference count.
Each LIST_ENTRY element links forward to the next process entry (Flink) and backward to the previous one (Blink), which together form a circular list. Each process is added to the list when it starts and removed when it exits.
Now here comes the juicy part!
Unlinking the process
Basically, removing a process’s entry from ActiveProcessLinks makes the process invisible to process enumeration. But don’t get me wrong: this is still detectable, especially when an anti-cheat has a kernel driver, because it can easily scan for unlinked entries and/or perform memory pattern scanning.
A lot of rootkits use this method to hide their process.
Visualization
Check out this link for image credits and for a different perspective on the attack.
Kernel Driver
NTSTATUS processHiderDeviceControl(PDEVICE_OBJECT, PIRP irp) {
    auto stack = IoGetCurrentIrpStackLocation(irp);
    auto status = STATUS_SUCCESS;
    switch (stack->Parameters.DeviceIoControl.IoControlCode) {
    case IOCTL_PROCESS_HIDE_BY_PID:
    {
        // The input buffer must hold exactly one PID (passed as a HANDLE).
        const auto size = stack->Parameters.DeviceIoControl.InputBufferLength;
        if (size != sizeof(HANDLE)) {
            status = STATUS_INVALID_BUFFER_SIZE;
            break; // do not read from a buffer of the wrong size
        }
        const auto pid = *reinterpret_cast<HANDLE*>(stack->Parameters.DeviceIoControl.Type3InputBuffer);
        // On success, PsLookupProcessByProcessId takes a reference that must be released later.
        PEPROCESS eprocessAddress = nullptr;
        status = PsLookupProcessByProcessId(pid, &eprocessAddress);
        if (!NT_SUCCESS(status)) {
            KdPrint(("Failed to look for process by id (0x%08X)\n", status));
            break;
        }
Here, we find the eprocessAddress using PsLookupProcessByProcessId. We then locate the offset by searching for the pid inside the struct, since we know that ActiveProcessLinks sits right after UniqueProcessId. This might not be the best approach, because it may break in a future patch if a new member is inserted after UniqueProcessId.
Here is a table of offsets for different Windows versions, in case you want to use manual offsets rather than the method above.
Windows version    Offset
Win7Sp0            0x188
Win7Sp1            0x188
Win8p1             0x2e8
Win10v1607         0x2f0
Win10v1703         0x2e8
Win10v1709         0x2e8
Win10v1803         0x2e8
Win10v1809         0x2e8
Win10v1903         0x2f0
Win10v1909         0x2f0
Win10v2004         0x448
Win10v20H1         0x448
Win10v2009         0x448
Win10v20H2         0x448
Win10v21H1         0x448
Win10v21H2         0x448
ActiveProcessLinks offsets
        // Scan the EPROCESS for UniqueProcessId; ActiveProcessLinks is the member right after it.
        auto addr = reinterpret_cast<HANDLE*>(eprocessAddress);
        LIST_ENTRY* activeProcessList = nullptr;
        for (SIZE_T offset = 0; offset < consts::MAX_EPROCESS_SIZE / sizeof(SIZE_T*); offset++) {
            if (addr[offset] == pid) {
                activeProcessList = reinterpret_cast<LIST_ENTRY*>(addr + offset + 1);
                break;
            }
        }
        if (!activeProcessList) {
            ObDereferenceObject(eprocessAddress);
            status = STATUS_UNSUCCESSFUL;
            break;
        }
        // %p prints the full pointer width; %08X would truncate it on 64-bit systems.
        KdPrint(("Found address for ActiveProcessList! (0x%p)\n", activeProcessList));
        // An entry that already points to itself has been unlinked before.
        if (activeProcessList->Flink == activeProcessList && activeProcessList->Blink == activeProcessList) {
            ObDereferenceObject(eprocessAddress);
            status = STATUS_ALREADY_COMPLETE;
            break;
        }
        // Make the neighbors point at each other, skipping our entry.
        LIST_ENTRY* prevProcess = activeProcessList->Blink;
        LIST_ENTRY* nextProcess = activeProcessList->Flink;
        prevProcess->Flink = nextProcess;
        nextProcess->Blink = prevProcess;
We also want the hidden process’s entry to link to itself, because the neighboring pointers might not exist anymore once a neighboring process dies.
There are 2 problems that you need to solve before you can use this method.
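To see the unlink and self-link steps in isolation, here is a user-mode sketch that mimics the list layout. ListEntry is a stand-in for the kernel’s LIST_ENTRY, and all function names are hypothetical. It only demonstrates the pointer manipulation and touches no kernel structures.

```cpp
#include <cstddef>

// User-mode stand-in for the kernel's LIST_ENTRY.
struct ListEntry {
    ListEntry* Flink;  // next entry
    ListEntry* Blink;  // previous entry
};

// An empty circular list is a head that points to itself.
void init_head(ListEntry* head) {
    head->Flink = head;
    head->Blink = head;
}

// Appends an entry just before the head, i.e. at the tail of the circular list.
void insert_tail(ListEntry* head, ListEntry* entry) {
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

// Removes the entry from the list, then points it at itself so that it never
// holds stale pointers to neighbors that may be freed later.
void unlink_and_self_link(ListEntry* entry) {
    ListEntry* prev = entry->Blink;
    ListEntry* next = entry->Flink;
    prev->Flink = next;
    next->Blink = prev;
    entry->Flink = entry;
    entry->Blink = entry;
}

// Counts entries by walking Flink pointers, the way a process enumerator would.
std::size_t count_entries(const ListEntry* head) {
    std::size_t count = 0;
    for (const ListEntry* cur = head->Flink; cur != head; cur = cur->Flink) {
        ++count;
    }
    return count;
}
```

After unlink_and_self_link runs on the middle entry, a walk over the list no longer reaches it, and the entry’s own Flink and Blink point back at itself. That is the same state the driver checks for before returning STATUS_ALREADY_COMPLETE.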
First: You need to disable Driver Signature Enforcement
You need to load your driver to be able to execute kernel functions. You can either buy a certificate and sign your own driver, so you do not need to disable DSE, or you can disable DSE in Windows itself. The only problem with disabling DSE is that some games require DSE to be enabled before you can play.
Second: Bypass Patchguard
Manually messing with DKOM will result in a BSOD, because PatchGuard runs tons of checks. Luckily, we have some ways to bypass PatchGuard.
These 2 will be tackled in the 3rd part of the writeup. Stay tuned!