From Packed Binary to Readable Code: A Hands-On Walkthrough of Unpacking, Shellcode Analysis, and Memory Forensics
Khalif AL Mahmud · 2026-06-27 · via DEV Community

A few weeks ago I spent a full lab session doing something that sounds simple on paper and is genuinely satisfying in practice: taking a packed, obfuscated piece of malware and peeling back every layer until I could see what it actually does.

This post is my write-up of that session. It's long, because the lab itself covered a lot of ground — static analysis, manual unpacking with a debugger, multi-stage shellcode extraction, code injection patterns, API hooking, and finally memory forensics with Volatility. I'm documenting it the way I wish more "intro to malware analysis" posts were documented: with the actual commands, the actual reasoning behind each step, and the dead ends along the way.

If you're getting into reverse engineering or malware analysis, this should give you a realistic feel for what a packed-malware investigation actually looks like end to end — not just the highlight reel.

A quick but important note: this entire exercise was done on isolated, throwaway virtual machines (a Windows analysis VM with no network access beyond an internal isolated segment, plus a REMnux Linux VM) using known teaching specimens. Never run unknown executables, run unpackers, or experiment with shellcode on a machine connected to a real network or containing real data. Everything here assumes a fully isolated, snapshot-able VM setup.

Problem Statement

Modern malware rarely ships as a plain, readable executable. Authors wrap their code in packers (like UPX) to shrink the file and make static analysis harder, and they layer in techniques like shellcode, code injection, and API hooking to evade detection and persist on a system. As an analyst, your job is to answer a chain of questions:

  • Is this file packed, and with what?
  • Can I unpack it automatically, or do I need to dump it manually from memory?
  • What does the unpacked code actually do?
  • If it drops or injects further payloads (like shellcode), can I extract and analyze those too?
  • After infection, what artifacts are left behind in memory that I can find forensically?

This walkthrough tackles that chain using three teaching specimens: a UPX-packed sample (brbbot.exe), a multi-technology dropper that chains JavaScript → PowerShell → shellcode (PDFXCview.exe), and a code-injecting, API-hooking sample analyzed both statically and via a memory image (great.exe / great.vmem).

Lab Setup

Two VMs, both reverted to clean snapshots before starting:

  • Windows analysis VM — tools used: PeStudio, Process Hacker, Process Monitor, ProcDOT, Scylla, x64dbg/x32dbg, OllyDumpEx, IDA, Regedit, Notepad++, PowerShell ISE, WinSCP
  • REMnux VM (Linux malware analysis distro) — tools used: pescanner.py, diec, strings, SpiderMonkey (js), base64dump.py, Volatility (vol.py)

Both VMs were on an isolated internal network segment so the Windows VM and REMnux VM could talk to each other (for file transfer and the JavaScript dropper's local web server) without any route to the internet.


Part 1: Identifying and Unpacking a UPX-Packed Sample

Step 1 — Confirm the file is packed

First pass: load the suspicious binary in PeStudio and check three things — imports, section names, and strings.

A packed file typically shows:

  • Fewer imports than you'd expect from a normal-sized executable (because the real import table is hidden until the unpacking stub runs)
  • Unusual section names — instead of the standard .text, .rdata, .data, you'll see something like UPX0 and UPX1
  • Fewer readable strings, since the actual strings are compressed/encoded until runtime

All three were true here, and the UPX naming convention in the section headers was a strong hint about which packer was used.
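The section-name check can also be scripted when there are many files to triage. Below is a minimal sketch that reads the section table straight from the PE headers using only the standard library. The function names are my own and not part of PeStudio or any other tool mentioned here, and a real triage script would want stricter bounds checking.

```python
import struct

def section_names(pe_bytes: bytes) -> list[str]:
    """Return the section names listed in a PE file's section table."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C) points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the signature: NumberOfSections is at +2
    # and SizeOfOptionalHeader at +16, both 16-bit little-endian.
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", pe_bytes, coff + 2)
    (opt_size,) = struct.unpack_from("<H", pe_bytes, coff + 16)
    table = coff + 20 + opt_size
    names = []
    for i in range(num_sections):
        # Each section header is 40 bytes; the first 8 are the NUL-padded name.
        raw = pe_bytes[table + 40 * i: table + 40 * i + 8]
        names.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))
    return names

def looks_upx_packed(names: list[str]) -> bool:
    """Heuristic only: UPX normally names its sections UPX0, UPX1, ..."""
    return any(name.startswith("UPX") for name in names)
```

Section names are trivially renamed by malware authors, so treat a hit as a hint and a miss as inconclusive.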

Step 2 — Confirm with entropy analysis

On REMnux, pescanner.py measures the entropy of each section. High entropy (approaching the maximum of 8 bits per byte that random data would show) is a hallmark of compressed or encrypted data:

pescanner.py brbbot.exe | more

The tool flagged sections as "SUSPICIOUS" — one for unusually high entropy (consistent with packed/compressed code), and one with an entropy of exactly 0 (because its raw size was 0 — also anomalous for a legitimate section).
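For intuition about what that entropy number means, here is a small sketch of Shannon entropy over raw bytes. It is not pescanner.py's code, just the standard formula, and the function name is my own.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for empty or constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the frequency p of every byte value present.
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())
```

Feeding it the raw bytes of each section gives the same kind of per-section figure. Plain code and text usually land well below 8, while compressed or encrypted data sits close to it. Any cutoff you pick (for example 7.0) is a judgment call, not a rule.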

Step 3 — Identify the packer

diec brbbot.exe

diec (Detect It Easy, command-line version) reported UPX as the most likely packer — confirming the hint from the section names.

Step 4 — Try the obvious thing first: just run the unpacker

upx -d %AppData%\brbbot.exe

This is always worth trying, but it commonly fails on malware, because authors deliberately corrupt the UPX header/footer to block the standard unpacker while leaving the actual UPX decompression stub intact:

CantUnpackException: file is possibly modified/hacked/protected; take care!

Since the automated route was blocked, the next move is manual dumping: let the malware unpack itself in memory at runtime, then dump that unpacked memory image to disk.

Step 5 — Disable ASLR so addresses stay predictable

setdllcharacteristics -d %AppData%\brbbot.exe

This flips the DYNAMIC_BASE flag in the PE header from 1 to 0. Without this, the binary would load at a randomized base address every run, which makes it harder to find a stable breakpoint address across debugging sessions.
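To make the flag concrete: DYNAMIC_BASE is bit 0x0040 of the DllCharacteristics field in the PE optional header. The sketch below checks and clears that bit on an in-memory copy of the file. It illustrates the header layout rather than reimplementing setdllcharacteristics; the function names are hypothetical, and it does not recompute the PE checksum.

```python
import struct

DYNAMIC_BASE = 0x0040  # IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE

def _dll_characteristics_offset(pe: bytes) -> int:
    """File offset of the 16-bit DllCharacteristics field."""
    if pe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # Signature (4 bytes) + COFF header (20 bytes) + 70 bytes into the optional
    # header. The field sits at the same offset in PE32 and PE32+.
    return pe_offset + 4 + 20 + 70

def aslr_enabled(pe: bytes) -> bool:
    (flags,) = struct.unpack_from("<H", pe, _dll_characteristics_offset(pe))
    return bool(flags & DYNAMIC_BASE)

def clear_dynamic_base(pe: bytes) -> bytes:
    """Return a copy of the file with the DYNAMIC_BASE bit cleared."""
    patched = bytearray(pe)
    offset = _dll_characteristics_offset(pe)
    (flags,) = struct.unpack_from("<H", patched, offset)
    struct.pack_into("<H", patched, offset, flags & ~DYNAMIC_BASE)
    return bytes(patched)
```

Working on a copy keeps the original specimen untouched, which matters if you want its hash to keep matching your notes.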

Step 6 — Run it and dump it with Scylla

With the sample running (via a desktop shortcut set to "Run as administrator"), attach Scylla x64 to the process and click Dump. This grabs the in-memory, already-unpacked version of the code.

But a raw memory dump alone usually isn't runnable. The packer's stub resolved the real imports at runtime, so the dumped file's import directory does not describe the function addresses now sitting in the Import Address Table (IAT), and the Windows loader cannot rebuild them. So:

  1. Click IAT Autosearch in Scylla
  2. Click Get Imports
  3. Click Fix Dump, pointing it at the dumped file

Scylla writes a new file with _SCY appended to the name (e.g., brbbot-dumped_SCY.exe) — this is the "fixed" version with a repaired import table.

Step 7 — Verify, but don't assume success

Loading the fixed dump back into PeStudio showed more imports than the packed original — a good sign. But running the fixed dump directly produced a different outcome than expected (it exited immediately, without dropping the configuration file the real malware drops). This is a useful and realistic lesson: successfully fixing the IAT doesn't guarantee a perfectly runnable standalone binary. Sometimes further reconstruction is needed. Don't take "it loads more imports now" as proof the unpacking job is fully done — verify behavior too.


Part 2: Manual Unpacking via Debugger (x64dbg + OllyDumpEx)

Scylla's automatic dump-and-fix approach doesn't always work cleanly, so it's worth knowing the manual debugger-based path too.

Step 1 — Find the jump to the Original Entry Point (OEP)

Load the packed binary in x64dbg. Scroll through the disassembly until the unpacking stub's instructions end and you hit a long run of zero bytes — that boundary is usually right where the final jump sits:

jmp brbbot.140003F94

That 140003F94 target address is the OEP — the address where the real, unpacked program logic begins.

Step 2 — Breakpoint right before the jump

Set a breakpoint on the JMP instruction, then run (F9). The process will execute all the unpacking logic and pause right at that breakpoint, immediately before transferring control to the unpacked code.

Step 3 — Step into the OEP

Single-step once (F7 or F8; on a JMP instruction both behave the same) to land at the OEP. Execution is now paused inside the unpacked code.

Step 4 — Confirm you're actually looking at unpacked code

Don't just trust the address — verify it. Right-click in the CPU view and run:

  • Search for → Current Region → String references — you should see far more readable strings than the packed version showed
  • Search for → Current Region → Intermodular calls — you should see real API call references that weren't visible before

Both showing up is good confirmation you're looking at genuinely unpacked code.

Step 5 — Dump with OllyDumpEx

From x64dbg's Plugins menu: OllyDumpEx → Dump process. Key details:

  • Use "Get EIP as OEP" so the dump records the correct entry point (in the 64-bit build the button may be labelled "Get RIP as OEP" instead)
  • Find the UPX1 section row and enable the MEM_WRITE characteristic flag before dumping (without write permission flagged, some dumpers won't capture the section properly)
  • Save the dump (e.g. brbbot_dump_64.exe)

Step 6 — Fix the IAT with the Scylla plugin (inside x64dbg)

Same logic as before — IAT Autosearch → Get Imports → Fix Dump, pointed at the OllyDumpEx output. Result: a _SCY-suffixed file with a repaired import table.


Part 3: Debugging Packed Code Directly (Finding Decryption Routines)

Sometimes you don't want to fully unpack a sample — you just want to watch a specific operation happen, like decryption of an embedded configuration.

Step 1 — Let it run and self-unpack

Run the packed binary in x64dbg with no breakpoints set (F9). It unpacks itself into memory and continues normally.

Step 2 — Find the unpacked region via Memory Map

In the Memory Map tab, look for memory regions that don't belong to a Windows DLL and have "E" (execute) in the Protection column. In this sample, two regions matched that profile — the unpacker code region and a second region holding the freshly unpacked code. Right-click the latter and choose Follow in Disassembler.

Step 3 — Search for interesting API calls

Right-click → Search for → Current Region → Intermodular calls, then filter the results by typing a keyword (e.g. Crypt) in the search box. This surfaced a call to CryptDecrypt — a strong signal that the malware decrypts an embedded configuration at runtime.

Step 4 — Breakpoint after the call, then restart cleanly

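Select the instruction right after the CryptDecrypt call (the result-checking instruction), and set a hardware breakpoint on execution. A hardware breakpoint is the safer choice here: a software breakpoint works by patching an INT3 byte into the code, and the unpacking stub would likely overwrite that byte when it rewrites the region after a restart. Then restart the process (Ctrl+F2) and run again (F9).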

Why restart rather than just continuing? Because the process may have already executed past this point once — restarting guarantees you hit the breakpoint fresh, from the actual entry point, so register/stack state is consistent with a real first-run analysis.

Once paused there, the decrypted configuration data is sitting in memory (commonly reachable via the stack), ready to inspect just as you would when analyzing an already-unpacked sample of the same malware family.


Part 4: Multi-Stage Dropper Analysis (JavaScript → PowerShell → Shellcode)

This is where things get more interesting: a single executable that chains together several different technologies to avoid writing an obviously malicious file to disk.

Step 1 — Infect while monitoring, then cut it off

Start Process Monitor capturing, then run the sample. Watch the process tree in Process Hacker: the initial process spawns mshta.exe and powershell.exe, then after roughly a minute or two, spawns a couple of regsvr32.exe processes. Once those appear, terminate the process tree and pause Process Monitor capture — you don't need to let it run indefinitely, you just need enough activity captured to reconstruct the chain.

Step 2 — Reconstruct the infection chain visually

Export the Process Monitor log as CSV, then load it into ProcDOT along with the initial malicious process. ProcDOT generates a visual graph of what touched what — registry keys created, files dropped, and a persistence entry added under the Run autostart key. It also revealed the malware created files with an unusual, randomly-generated extension and a batch file, plus matching registry entries describing how Windows should handle that custom file extension.

Step 3 — Find the "execute this when this file type opens" command

In Regedit, navigate to:

HKEY_CURRENT_USER\Software\Classes\.<random-extension>

The (Default) value there points to another key (a random-looking hex string), which under shell\open\command contains the actual command Windows runs. In this lab it looked roughly like:

"C:\WINDOWS\system32\mshta.exe" "javascript:...eval(IV2u4L)..."

This is a classic file-less technique: rather than dropping a .js file, the script content lives in a registry value, and mshta.exe is abused to execute inline JavaScript that reads and eval()s it.

Step 4 — Extract the script from the registry

reg_export HKCU\software\<random-key> <random-value> script.js

Step 5 — Move it to REMnux and deobfuscate

Transfer with WinSCP, then try SpiderMonkey directly:

js -f /usr/share/remnux/objects.js -f script.js

This threw an "illegal character" error — the script was UTF-16 encoded, which SpiderMonkey can't parse directly. Fix the encoding first:

strings --encoding=l script.js > script2.js

(The l in --encoding=l is a lowercase L, not the digit 1, which is an easy typo to make. It tells strings to look for 16-bit little-endian characters.)
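If you would rather convert the encoding than extract strings from it, a short Python sketch does the same job. It assumes the exported file is UTF-16, little-endian unless a byte-order mark says otherwise; the function name is my own.

```python
def utf16_to_text(raw: bytes) -> str:
    """Decode a UTF-16 script, honouring a BOM and defaulting to little-endian."""
    if raw[:2] in (b"\xff\xfe", b"\xfe\xff"):
        # The generic codec reads the BOM to pick the byte order, then drops it.
        return raw.decode("utf-16")
    return raw.decode("utf-16-le")
```

Unlike strings, this keeps every character, including short runs and line breaks, so the output is the script as written rather than a list of printable fragments. Write the result back out as UTF-8 or ASCII before handing it to SpiderMonkey.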

Then deobfuscate properly:

js -f /usr/share/remnux/objects.js -f script2.js > script3.js
scite script3.js &

The deobfuscated script revealed a call resembling [Convert]::FromBase64String, with the decoded result handed off to powershell.exe — meaning the JavaScript's whole job was to decode and launch a Base64-encoded PowerShell stage.

Step 6 — Pull the Base64 PowerShell payload out

base64dump.py script3.js

This lists every candidate Base64 blob found, each with an ID. Look in the Decoded column for the largest entry that decodes into readable ASCII — that's almost always the real payload, as opposed to short incidental Base64-looking noise.

base64dump.py script3.js -s 10 -d > script.ps1

(-s 10 selects that specific entry's ID — yours will likely be a different number.)
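The "largest blob that decodes to readable text" rule is easy to express in code, which helps when you want to see why one candidate wins. This is a rough sketch of the idea, not how base64dump.py is implemented. The 16-character minimum and the 90% printable threshold are arbitrary assumptions, and it will miss payloads that decode to UTF-16 text.

```python
import base64
import re

# Runs of at least 16 Base64 alphabet characters, plus optional padding.
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{16,}={0,2}")

def decoded_candidates(data: bytes) -> list[bytes]:
    """Decode every Base64-looking run whose length is a multiple of four."""
    decoded = []
    for match in B64_RUN.finditer(data):
        blob = match.group()
        if len(blob) % 4:
            continue
        try:
            decoded.append(base64.b64decode(blob, validate=True))
        except ValueError:
            continue
    return decoded

def is_mostly_text(blob: bytes, threshold: float = 0.9) -> bool:
    """True when most bytes are printable ASCII, tabs, or line breaks."""
    printable = sum(32 <= b < 127 or b in (9, 10, 13) for b in blob)
    return bool(blob) and printable / len(blob) >= threshold

def largest_text_payload(data: bytes):
    """Return the longest decoded candidate that looks like text, or None."""
    texts = [c for c in decoded_candidates(data) if is_mostly_text(c)]
    return max(texts, key=len, default=None)
```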

Step 7 — Read the PowerShell, understand the shellcode loader pattern

Transfer script.ps1 back to Windows and open in Notepad++. The pattern here is a textbook shellcode loader:

  1. A variable ($sc32) holds hex-encoded shellcode
  2. VirtualAlloc allocates memory with PAGE_EXECUTE_READWRITE
  3. The shellcode bytes are copied into that memory
  4. CreateThread is called, pointing at the shellcode's address, to execute it

Recognizing this pattern is genuinely useful — it shows up constantly across unrelated malware families because it's the simplest way to run raw shellcode from a scripting language.
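Because the pattern is so regular, a coarse text search can flag candidate loader scripts before you read them. The sketch below is only a triage aid built on substring matching. The indicator list is my own assumption, it will produce false positives (for example "0x40" also matches "0x400"), and it says nothing about obfuscated scripts.

```python
LOADER_INDICATORS = {
    "allocates memory": ("VirtualAlloc",),
    "starts a thread": ("CreateThread",),
    "requests RWX pages": ("0x40", "PAGE_EXECUTE_READWRITE"),
}

def loader_indicators(script_text: str) -> list[str]:
    """Return the labels of every indicator group found in the script text."""
    lowered = script_text.lower()
    return [
        label
        for label, needles in LOADER_INDICATORS.items()
        if any(needle.lower() in lowered for needle in needles)
    ]
```

A script that hits all three groups is worth opening in an editor first; one that hits none may simply be hiding the same calls behind string concatenation or encoding.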

Step 8 — Extract the raw shellcode with a debugger breakpoint

powershell_ise script.ps1

Set a breakpoint on the line right after $sc32 is assigned (before $pr gets defined), run to it (Debug → Run/Continue), then once paused, dump the variable's contents to a raw binary file:

[io.file]::WriteAllBytes('sc32.bin',$sc32)

Now you have the raw shellcode isolated in its own file, ready for dedicated shellcode analysis tools.
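As an alternative that never executes any part of the script, the same bytes can often be recovered statically. The sketch below assumes the variable is written as a comma-separated list of 0xNN literals, which is a common layout but an assumption about this sample, and it expects you to paste in only the right-hand side of the assignment.

```python
import re

# One or two hex digits after "0x", not followed by further word characters.
HEX_BYTE = re.compile(r"0x([0-9A-Fa-f]{1,2})\b")

def hex_list_to_bytes(text: str) -> bytes:
    """Convert text such as '0xfc,0xe8,0x82' into the raw bytes it encodes."""
    return bytes(int(digits, 16) for digits in HEX_BYTE.findall(text))
```

Comparing the result with the file written by the debugger approach is a cheap way to confirm that both routes extracted the same shellcode.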


Part 5: Shellcode Analysis and Unpacking

Step 1 — Quick emulation pass with scdbg

scdbg.exe -f sc32.bin

(Or via the GUI: load the file, leave default options, click Launch.) scdbg emulates the shellcode and reports the API calls it attempts, rather than running it natively. Here it showed the code loading advapi32.dll and calling RegOpenKeyExA against both HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER. That is useful, but it didn't reveal which specific registry keys were targeted.

Step 2 — Run it for real (carefully) with jmp2it

jmp2it sc32.bin 0x0 pause

0x0 means the shellcode starts at offset zero in the file. The pause argument makes jmp2it insert an infinite loop before jumping into the shellcode, buying you time to attach a debugger before anything actually runs.

Step 3 — Attach a debugger and patch the entry condition

Attach x32dbg to the jmp2it process, run briefly, then pause — you'll land inside the infinite loop jmp2it created. The shellcode in this case expected a parameter (its own memory address) to be pushed onto the stack before it starts, mimicking how the PowerShell loader called it via CreateThread. Since jmp2it happens to store that address in the EDI register, you can satisfy that expectation by patching the infinite-loop instruction:

  1. Select the loop instruction, press spacebar to open the Assemble dialog
  2. Enable "Fill with NOPs"
  3. Type the replacement instruction:
push edi

This single patched instruction is what lets the shellcode run as if it had been called the same way the original loader called it.

Step 4 — Trace API usage with a breakpoint

SetBPX advapi32.RegOpenKeyExA

Run (F9) to hit it, then check the Call Stack tab for the first frame that isn't inside a Windows DLL — that's the shellcode's own calling code. Following that call stack entry back into the disassembler showed, a short distance later, a call to VirtualAlloc — a strong hint that this shellcode unpacks another payload into memory, just like the outer executable did.

Step 5 — Dump the unpacked second-stage payload

Set a breakpoint on VirtualAlloc itself:

SetBPX VirtualAlloc

The pattern that emerged from hitting this breakpoint multiple times:

  • 1st hit, after return: allocated memory is empty (all zeros)
  • 2nd hit, after return: memory now has some content — code ran between calls and started populating it
  • 3rd hit: the memory now contains strings consistent with a Windows executable (an actual embedded PE file)

Each time, right-clicking EAX (which holds the returned memory address) → Follow in Dump → Dump 1/Dump 2 lets you watch that specific memory region fill in over successive breakpoint hits.

Once the third allocation showed clear PE-file characteristics, right-click that dump pane → Follow in Memory Map, then right-click the corresponding row → Dump Memory to File. That gives you a final extracted executable, ready to load into PeStudio to confirm it's a structurally valid PE file with imports and strings.
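Before opening a dumped region in another tool, it can help to confirm where the embedded executable actually starts, since the dump may carry leading bytes. This is a minimal sketch of that check using only the standard library. The function name is my own, and it validates just the MZ and PE signatures, nothing deeper.

```python
import struct

def find_pe_offsets(blob: bytes) -> list[int]:
    """Offsets of every 'MZ' whose e_lfanew field points at a 'PE\\0\\0' signature."""
    offsets = []
    start = blob.find(b"MZ")
    while start != -1:
        # A DOS header is 0x40 bytes; skip hits too close to the end of the blob.
        if start + 0x40 <= len(blob):
            (e_lfanew,) = struct.unpack_from("<I", blob, start + 0x3C)
            signature = start + e_lfanew
            if blob[signature:signature + 4] == b"PE\x00\x00":
                offsets.append(start)
        start = blob.find(b"MZ", start + 1)
    return offsets
```

If the first offset is not zero, slicing the blob from that offset gives a file that PE tools are more likely to parse.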


Part 6: Code Injection Analysis (Static, with IDA)

Switching specimens here — a sample that injects code into other running processes.

Step 1 — Find the CreateRemoteThread call

In IDA, jump to the Imports tab and locate CreateRemoteThread. Double-click it, then in the disassembler view, select it and press x to bring up cross-references. This shows every place in the code that calls this function.

Step 2 — Trace backward from the call to find the source process handle

CreateRemoteThread takes a process handle (hProcess) as a parameter. Tracing that register backward through the disassembly led to a call to OpenProcess — the function that obtains a handle to an existing process by PID. This is the classic injection setup: get a handle to a target process, then create a thread inside it.

Step 3 — Find where the payload gets written into the target

A separate function call (visible just before the CreateRemoteThread call, taking the same process handle as a parameter) turned out to contain calls to WriteProcessMemory — the actual mechanism for placing code into another process's address space.

A useful shortcut here: rather than manually walking every function called from that one, IDA's View → Graphs → Xrefs from generates a call graph showing everything reachable from a given function. That graph surfaced exactly which sub-function calls VirtualAllocEx — the memory allocation step that has to happen in the target process before you can write to it.

Step 4 — Confirm the allocation permissions

At the VirtualAllocEx call site, the flProtect parameter being pushed was 0x40. Right-clicking that value in IDA and choosing "Use standard symbolic constant" reveals it as PAGE_EXECUTE_READWRITE — memory that can be written to and executed. That combination, allocated in someone else's process, is the textbook signature of code injection intent.
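When you are reading raw values outside IDA (in a debugger log, for example), a small lookup does the same translation. The constants below are the documented Windows memory protection values; the helper names are my own.

```python
PAGE_PROTECTIONS = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}

def describe_protection(fl_protect: int) -> str:
    """Name the base protection; modifier bits such as PAGE_GUARD (0x100) are ignored."""
    base = fl_protect & 0xFF
    return PAGE_PROTECTIONS.get(base, f"unknown (0x{base:02X})")

def is_writable_and_executable(fl_protect: int) -> bool:
    """True for the two protections that allow both writing and executing."""
    return (fl_protect & 0xFF) in (0x40, 0x80)
```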

Step 5 — Find how the target process gets chosen

Walking back further up the call chain (using IDA's back-arrow navigation) led to a function that calls CreateToolhelp32Snapshot, which — combined with Process32FirstW/Process32NextW — is the standard Windows API trio for enumerating every running process. That's the malware searching for a suitable target before injecting into it.

Function called                                   Role in the injection chain
------------------------------------------------  --------------------------------------------------
CreateToolhelp32Snapshot + Process32FirstW/NextW  Enumerate running processes to pick a target
OpenProcess                                       Get a handle to the chosen target process
VirtualAllocEx                                    Allocate executable+writable memory inside the target
WriteProcessMemory                                Write the payload into that allocated memory
CreateRemoteThread                                Start execution of the injected code
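The chain above doubles as a quick import-table checklist. Here is a minimal sketch that reports which stages are not covered by a given list of imported function names. The stage labels and function name are my own, and a sample that resolves its APIs dynamically will look clean to this check even when it injects.

```python
INJECTION_STAGES = {
    "enumerate processes": {"CreateToolhelp32Snapshot", "Process32FirstW", "Process32NextW"},
    "open target": {"OpenProcess"},
    "allocate in target": {"VirtualAllocEx"},
    "write payload": {"WriteProcessMemory"},
    "start remote thread": {"CreateRemoteThread"},
}

def missing_injection_stages(imports) -> list[str]:
    """Return the stages whose required functions are not all imported."""
    present = set(imports)
    return [stage for stage, needed in INJECTION_STAGES.items() if not needed <= present]
```

An empty result means every stage of the classic chain is importable, which is a reason to look closer, not proof of injection.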

Part 7: API Hooking Analysis (Static, with IDA)

Same specimen, different capability: modifying other functions in memory so calls to them get redirected.

Step 1 — Find where it reads existing code (to back it up before overwriting)

Following cross-references to ReadProcessMemory (same Imports-tab → xrefs approach as before) led to a function that reads memory from a target process — almost always the first step before overwriting something, since you typically want to preserve the original bytes you're about to clobber.

Step 2 — Find where it writes the hook

The same function later calls WriteProcessMemory twice, with two different byte patterns:

  • One write starts with byte 0xE9 — the opcode for a relative JMP instruction
  • Another write starts with 0x68 (the opcode for PUSH with a 32-bit immediate), paired with a 0xC3 (RET) written five bytes later, immediately after the 4-byte pushed address

The second pattern — PUSH followed by RET — is a sneakier alternative to a plain JMP for redirecting execution, since it doesn't look like an obvious jump instruction at a glance.
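Both patterns encode a destination address, and working it out by hand is error-prone. The sketch below decodes either form from the patched bytes. It assumes a 32-bit process (addresses are wrapped to 32 bits), and the function name is my own.

```python
import struct

def decode_hook(patch: bytes, address: int):
    """Return (kind, target) for a JMP rel32 or PUSH imm32/RET patch, else None.

    `address` is where the first patched byte lives in the hooked function.
    """
    if len(patch) >= 5 and patch[0] == 0xE9:
        (rel32,) = struct.unpack_from("<i", patch, 1)
        # The displacement is relative to the end of the 5-byte JMP instruction.
        return "jmp rel32", (address + 5 + rel32) & 0xFFFFFFFF
    if len(patch) >= 6 and patch[0] == 0x68 and patch[5] == 0xC3:
        (imm32,) = struct.unpack_from("<I", patch, 1)
        # RET pops the value PUSH just placed on the stack and jumps to it.
        return "push/ret", imm32
    return None
```

For example, the bytes 68 78 56 34 12 C3 decode to a push/ret redirect to 0x12345678, regardless of where the patch sits, because the pushed address is absolute.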

Step 3 — Find the table of targeted functions

Walking the call chain upward (xrefs again) eventually reaches a function that builds a table of function addresses — saving various API addresses into memory, one after another, to be passed as the list of functions to hook. The functions referenced there were largely browser-related, suggesting the malware's actual goal: intercepting and observing the victim's web browsing activity.


Part 8: Memory Forensics with Volatility

Final piece: instead of analyzing a live process or a static file, this works from a memory snapshot (.vmem) captured from an already-infected machine.

Step 1 — Identify the right profile

vol.py -f great.vmem kdbgscan | more

This suggests one or more candidate OS profiles. The first suggestion isn't guaranteed to be correct — try it, and if Volatility throws errors like "need base" or "No Base Address Space", that profile doesn't match and you move to the next candidate:

vol.py -f great.vmem --profile=Win10x86 pslist

Once a profile returns clean, readable process output instead of errors, lock it in for the rest of the session:

export VOLATILITY_PROFILE=Win10x86

pslist output itself is worth scanning closely here — a process with an unusual, non-standard-looking name stood out immediately as worth investigating further.

Step 2 — Pull process command lines and dump suspicious process memory

vol.py -f great.vmem cmdline | more

This surfaced a cmd.exe invocation running a batch file out of %Temp% with a randomized filename — code running from the Temp folder with a random name is a strong red flag on its own.

vol.py -f great.vmem memdump -p <PID> -D /tmp
strings /tmp/<PID>.dmp | grep -B3 -A3 <batch-filename>

The surrounding strings matched typical batch-file syntax — consistent with a self-deleting cleanup script (delete the dropped executable, then delete itself), a very common malware self-cleanup pattern.

Step 3 — Find injected code across the whole memory image

vol.py -f great.vmem malfind -D /tmp > malfind.txt
scite malfind.txt &

malfind scans the entire memory image for telltale signs of injected code (executable memory regions with suspicious characteristics, frequently starting with the MZ signature of a PE header) and dumps each one it finds. In this case it flagged several legitimate-looking processes — explorer.exe and a couple of others — as containing injected PE content, each at a different memory address. Several of the dumped files were exactly the same size, hinting they're likely the same payload injected repeatedly into different processes.
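Matching file sizes are only a hint, and hashing the dumps is a quick way to firm it up. This is a small sketch that groups byte-identical files in the dump directory. The "*.dmp" pattern is an assumption about how the dumped files are named, and the function name is my own.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def group_identical_files(directory: str, pattern: str = "*.dmp") -> list[list[str]]:
    """Group file names by SHA-256 digest and return groups with more than one member."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).glob(pattern)):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest].append(path.name)
    return [names for names in groups.values() if len(names) > 1]
```

Identical hashes confirm byte-for-byte copies. Differing hashes don't rule out the same payload, since the bytes can diverge once a payload is relocated to a different base address or modifies itself in memory.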

A quick static check on one of those dumped files with a couple of additional command-line tools (string extraction, automated triage) turned up the same suspicious indicators seen earlier — references to a known risky DLL associated with silent file downloads, and string patterns matching the cleanup batch file extracted earlier. That overlap is good corroborating evidence that this is the same malware family operating across multiple injected processes.

Step 4 — Confirm hooked functions with apihooks

vol.py -f great.vmem apihooks -p <PID> --skip-kernel > apihooks.txt

Scrolling past the IAT-based entries (commonly false positives) to the first Inline/Trampoline entry revealed a hooked ntdll.dll!LdrLoadDll, patched with the same PUSH/RET redirection technique identified earlier via static analysis — confirming that what was theorized from the binary alone is actually happening at runtime, in memory.

Step 5 — Connect the hook target back to an extracted dump

The hook redirected execution to an address inside a small memory range. Running pslist again gave me the virtual offset of the process under investigation, and malfind's dump filenames record both that process offset and each region's start address. Matching the two narrowed the extracted files down to the one whose address range contains the hook target, which confirms exactly which memory dump holds the code the hijacked function jumps into.
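That matching step is mechanical enough to script. The sketch below makes two assumptions worth checking on your own output: that malfind names its dumps process.0x<process offset>.0x<region start>.dmp, and that each file's size equals the size of the dumped region. The function name is mine.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed malfind dump naming: process.0x<EPROCESS offset>.0x<region start>.dmp
DUMP_NAME = re.compile(r"^process\.0x([0-9a-fA-F]+)\.0x([0-9a-fA-F]+)\.dmp$")


def dump_containing(dump_dir: str, eprocess_offset: int, target: int) -> Optional[str]:
    """Return the dump whose address range holds `target` for the given process."""
    for path in sorted(Path(dump_dir).iterdir()):
        match = DUMP_NAME.match(path.name)
        if not match:
            continue
        offset, start = (int(group, 16) for group in match.groups())
        end = start + path.stat().st_size  # exclusive upper bound
        if offset == eprocess_offset and start <= target < end:
            return path.name
    return None
```

Here `eprocess_offset` is the Offset(V) value pslist shows for the process and `target` is the address the hook jumps to, both as integers (for example `int("0x15d0abc", 16)` for a made-up address). A None result means no extracted region covers the target, which is itself worth investigating.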


How to Verify Your Work

A checklist for confirming each major milestone in this kind of analysis:

  • [ ] Packed? Confirm via at least two independent signals (unusual section names + entropy, or + reduced imports/strings) before concluding a file is packed
  • [ ] Unpacked successfully? The dumped file should show more imports and strings than the packed version — but also try actually running it; a clean static profile doesn't guarantee a runnable binary
  • [ ] OEP correct? After jumping to a suspected OEP, confirm via string references and intermodular call references — both should increase noticeably compared to the packed view
  • [ ] Shellcode extracted correctly? Run it through an emulator (scdbg) first before live-running it, even in an isolated VM
  • [ ] Injection confirmed? Look for the full chain — process enumeration, handle acquisition, RWX memory allocation, memory write, remote thread creation — not just one piece in isolation
  • [ ] Hooking confirmed? Cross-check static findings (IDA) against live/memory evidence (Volatility's apihooks) when both are available
  • [ ] Memory artifacts make sense together? File sizes, address ranges, and process offsets from different Volatility plugins (malfind, apihooks, pslist) should agree with each other
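For the entropy signal in the first checklist item, a rough number is easy to compute yourself. This is a minimal sketch of Shannon entropy over raw bytes; reading the result as "likely packed" is a heuristic, and where to draw the line is the analyst's call, not something this function decides.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Values close to 8 bits per byte suggest compressed or encrypted content, which is what a packed section usually looks like. Running it per section rather than over the whole file gives a sharper picture, and it should still be paired with a second signal such as odd section names or a thin import table.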

What I Learned

A few things stuck with me after this session:

  • Automated unpackers often fail against real malware, because authors specifically anticipate and break them. Knowing the manual debugger-based path (find OEP → breakpoint → dump → fix IAT) isn't optional knowledge; it's the default expectation.
  • A successful technical unpack doesn't always mean a fully functional standalone binary. I had a moment where the "fixed" dump loaded better in PeStudio but still didn't run correctly — that gap between "looks unpacked" and "behaves correctly" is a distinction I won't forget.
  • Multi-stage droppers chain together completely different technology stacks specifically to break single-tool analysis. JavaScript → PowerShell → shellcode means no one tool sees the whole picture; you have to follow the thread through three completely different toolchains (SpiderMonkey, then Notepad++/PowerShell ISE, then a shellcode debugger).
  • The same low-level techniques (RWX memory, PUSH/RET redirection) show up again and again across unrelated samples. Once you've pattern-matched the code injection sequence once (enumerate → open → allocate → write → thread), you start recognizing it instantly elsewhere.
  • Static analysis and memory forensics genuinely complement each other. IDA told me what the code is capable of; Volatility told me that it actually did it, on a real infected system. Neither alone gives you the full confidence the combination does.
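The enumerate → open → allocate → write → thread sequence can be turned into a quick triage check against an import list. The API names below are the usual Win32 functions for each stage, chosen by me as an illustration; a sample can resolve functions at runtime or use native equivalents such as NtCreateThreadEx, so a missing stage proves nothing on its own.

```python
from typing import Dict, Iterable, List

# Typical Win32 APIs for each stage of the injection chain (illustrative, not exhaustive).
INJECTION_STAGES = {
    "enumerate": ("CreateToolhelp32Snapshot", "Process32First", "Process32Next"),
    "open": ("OpenProcess",),
    "allocate": ("VirtualAllocEx",),
    "write": ("WriteProcessMemory",),
    "thread": ("CreateRemoteThread",),
}


def injection_coverage(imports: Iterable[str]) -> Dict[str, List[str]]:
    """Map each stage to the imported APIs that implement it."""
    seen = set(imports)
    coverage = {}
    for stage, apis in INJECTION_STAGES.items():
        # Accept the plain name or its wide-character "W" variant.
        coverage[stage] = [api for api in apis if api in seen or api + "W" in seen]
    return coverage


def missing_stages(imports: Iterable[str]) -> List[str]:
    """List the stages with no matching import, in chain order."""
    return [stage for stage, hits in injection_coverage(imports).items() if not hits]
```

Feeding it the import names from PeStudio or a similar tool shows at a glance whether the full chain is present or only one piece in isolation.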

Common Mistakes

| Mistake | Why It Happens | How to Avoid It |
| --- | --- | --- |
| Assuming UPX (or any packer) can always be unpacked with the standard tool | Authors deliberately corrupt headers to break generic unpackers | Always have a manual debugger-based fallback ready |
| Forgetting to disable ASLR before debugging | Default behavior on modern Windows | Run setdllcharacteristics -d (or equivalent) before setting breakpoints by address |
| Trusting a dumped file just because PeStudio shows more imports | More imports indicates partial success, not full functional correctness | Actually try running the dumped/fixed binary, and watch for expected side effects (dropped files, registry changes) |
| Using strings without --encoding=l on UTF-16 obfuscated scripts | Many obfuscation toolkits output UTF-16 by default | If a deobfuscator throws an encoding/illegal-character error, check the source encoding first |
| Picking the wrong Base64 blob from a dump tool's output | Obfuscated scripts often contain several short, irrelevant Base64-looking strings | Sort by decoded size and check for actual readable ASCII content in the decode preview |
| Trying the first Volatility profile suggestion and giving up if it errors | kdbgscan often suggests multiple plausible profiles | Treat profile errors as informative, not blocking; try the next suggested profile |
| Treating IAT hook entries from apihooks as real hooks | IAT-style entries are common false positives in Volatility's hook detection | Specifically look for "Inline/Trampoline" hook type entries, which are far more reliable indicators |
| Analyzing shellcode by directly running it without emulating first | Skips a safe verification step | Run through scdbg (emulation) before live execution, even in an isolated VM |
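The "sort by decoded size and check for readable ASCII" advice for Base64 blobs can be sketched in a few lines. The 16-character minimum and the function name are my assumptions, and blobs that do not decode cleanly are simply skipped, so treat the output as a shortlist rather than a verdict.

```python
import base64
import binascii
import re
from typing import List, Tuple

# Candidate runs: at least 16 Base64 alphabet characters plus optional padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def rank_base64_blobs(text: str) -> List[Tuple[int, float, str]]:
    """Return (decoded size, printable ratio, blob prefix), largest first."""
    ranked = []
    for match in B64_RUN.finditer(text):
        blob = match.group()
        try:
            decoded = base64.b64decode(blob, validate=True)
        except (binascii.Error, ValueError):
            continue  # wrong length or padding: not a clean Base64 blob
        if not decoded:
            continue
        printable = sum(32 <= b < 127 or b in (9, 10, 13) for b in decoded)
        ranked.append((len(decoded), printable / len(decoded), blob[:24]))
    return sorted(ranked, reverse=True)
```

The top entry is usually the one worth decoding in full. A payload encoded as UTF-16 will score a printable ratio near 0.5 because every other byte is zero, so a mid-range ratio on a large blob is a reason to check the encoding, not to discard it.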

Conclusion

Going from "this file is packed" to "I understand exactly how it injects code, hooks APIs, and what it left behind in memory" took a genuinely long chain of tools and techniques — and that's honestly the most realistic takeaway here. Real malware analysis is rarely a single tool giving you a single clean answer. It's PeStudio pointing you toward a hypothesis, a debugger confirming it, IDA explaining the why, and Volatility proving it actually happened on a real system.

If you're working through similar material, my biggest piece of advice is: don't skip the verification steps. It's tempting to declare victory the moment a tool produces some output, but the real confidence comes from cross-checking — static findings against dynamic behavior, debugger observations against memory forensics, one tool's output against another's.

If you found this useful, I'm planning to keep documenting more of this kind of hands-on analysis work — let me know in the comments if there's a specific technique here you'd like a deeper dive into.