Disclaimer: This isn’t a vuln! This isn’t even a critical bug. Please no hyping this out of proportion! Kthxbye.
Ghidra has a bug. It was noticed by a few people due to its (devastating) effects manifesting when loading samples of the ShadowHammer malware into Ghidra.
In ShadowHammer’s Setup.exe, a small shellcode, which then unpacked more shellcode from a resource, was appended to the end of the .text section. Whoever embedded this shellcode, however, only adjusted the SizeOfRawData and not the VirtualSize. This means the .text section’s layout can be outlined as follows (addresses only for illustration):
0x0000 0x1000 0x2000 0x3000 0x4000
|.........|.........|.........|.........|....... . . . .
Before shellcode injection:
< original code ............... >
|-------------------------------| VirtualSize
|-------------------------------| SizeOfRawData
After shellcode injection:
< original code .....X......... ><*> * = shellcode, X = modified call target (to shellcode)
|-------------------------------| VirtualSize (not updated)
|----------------------------------| SizeOfRawData (updated)
The problem is in the way that Ghidra handles the alignment of the sections’ virtual sizes. According to Microsoft [1], sections must be aligned according to the PE SectionAlignment:
SectionAlignment
The alignment (in bytes) of sections when they are loaded into memory.
It must be greater than or equal to FileAlignment.
The default is the page size for the architecture.
This means the Windows PE loader rounds the VirtualSize up to the next multiple of SectionAlignment (0x1000 here) and thus loads the section as follows:
0x0000 0x1000 0x2000 0x3000 0x4000
|.........|.........|.........|.........|....... . . . .
< original code .....X......... ><*> * = shellcode, X = modified call target (to shellcode)
|-------------------------------| VirtualSize
|----------------------------------| SizeOfRawData
|---------------------------------------| VirtualSize aligned to SectionAlignment
The loader thus correctly places all SizeOfRawData bytes of code into a memory region whose length is the VirtualSize rounded up to SectionAlignment.
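To make the rounding concrete, here is a minimal sketch in C. The helper name align_up is made up for this post and is not taken from Ghidra or Windows, and it assumes the alignment is a non-zero power of two (such as the 0x1000 used in the diagrams).

```c
#include <stdint.h>

/*
 * Round `size` up to the next multiple of `alignment`.
 * Assumption: `alignment` is a non-zero power of two, e.g. 0x1000.
 * Sizes within `alignment` of UINT32_MAX would wrap around; a real
 * loader would have to reject those.
 */
uint32_t align_up(uint32_t size, uint32_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

A VirtualSize that already sits on a boundary stays as it is; anything even one byte over is bumped to the next boundary.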
Ghidra 9.0.2, on the other hand, does not align the VirtualSize and thus truncates the section to just the original code, as follows:
0x0000 0x1000 0x2000 0x3000 0x4000
|.........|.........|.........|.........|....... . . . .
< original code .....X......... ><*> * = shellcode, X = modified call target (to shellcode)
|-------------------------------| VirtualSize
|----------------------------------| SizeOfRawData
|-------------------------------| What Ghidra loads
< original code .....X......... > What you see in Ghidra
Thus, when loading the ShadowHammer sample into Ghidra, you will only see the call (denoted as X in the ASCII art above) to the shellcode at the end of the .text section. Because that part of the code was truncated, however, this call points to unknown memory.
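The difference between the two behaviours can be modelled in a few lines of C. This is only a simplified sketch of what the diagrams show, not Ghidra's or Windows' actual loader code; all function names are hypothetical, and the alignment is assumed to be a non-zero power of two.

```c
#include <stdint.h>

/* Round `size` up to a multiple of `alignment` (assumed power of two). */
uint32_t align_up(uint32_t size, uint32_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Smaller of two unsigned 32-bit values. */
uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Model of the truncating behaviour: raw data is cut off at the
 * unaligned VirtualSize. */
uint32_t visible_unaligned(uint32_t virtual_size, uint32_t raw_size)
{
    return min_u32(raw_size, virtual_size);
}

/* Model of the aligning behaviour: raw data may fill the section up to
 * VirtualSize rounded up to SectionAlignment. */
uint32_t visible_aligned(uint32_t virtual_size, uint32_t raw_size,
                         uint32_t section_alignment)
{
    return min_u32(raw_size, align_up(virtual_size, section_alignment));
}

/* Number of bytes that exist in memory under the aligning model but are
 * missing under the truncating one. */
uint32_t hidden_bytes(uint32_t virtual_size, uint32_t raw_size,
                      uint32_t section_alignment)
{
    return visible_aligned(virtual_size, raw_size, section_alignment)
         - visible_unaligned(virtual_size, raw_size);
}
```

For example, with a VirtualSize of 0x2001, a SizeOfRawData of 0x2800 (both picked for illustration) and a SectionAlignment of 0x1000, hidden_bytes() returns 0x7FF.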
Annoyed by having to fix the VirtualSize manually before being able to properly import said sample into Ghidra, I worked on a patch. While doing so, however, I realized the potential this bug has for intentional data hiding.
So I wrote a little PoC that hides a string and some code behind a manipulated VirtualSize. The (shortened) C code is as follows:
#include <stdio.h>

// actually 1025 A's to push the following hidden_str over
// the alignment boundary
static const char __attribute__ ((section ("ghidera")))
dummy_str[] = "AAA...AA";

static const char __attribute__ ((section ("ghidera")))
hidden_str[] = "Ghidra can't see me!";

int main(void)
{
    // a nop sled to push the following code out of the .text section
    asm volatile ("nop\nnop\nnop\nnop...");
    printf(hidden_str);
    return 0;
}
Download the full code here: ghidera.c
After building the above code, we need to manipulate the VirtualSize values of the ghidera and .text sections as follows:
0x0000 0x1000 0x2000 0x3000 0x4000
|.........|.........|.........|.........|....... . . . .
Original: (ghidera.exe)
< AAAAAAAAAAAAAAAAAA><Ghi.. > = ghidera section
< nop sled .........><printf> = .text section
|---------------------------| VirtualSize = 0x2734
|---------------------------| SizeOfRawData
Manipulated: (ghidera-patched.exe)
< AAAAAAAAAAAAAAAAAA><Ghi.. > = ghidera section
< nop sled .........><printf> = .text section
|-------------------| VirtualSize = 0x2001
|-------------------|-------| SizeOfRawData
|
|
What Ghidra sees: |
|
< AAAAAAAAAAAAAAAAAA> A long string of A's
< nop sled .........> A nop sled ending in nowhere land
What is actually in memory when loaded by Windows PE loader:
< AAAAAAAAAAAAAAAAAA><Ghi.. > = ghidera section
< nop sled .........><printf> = .text section
|---------------------------| VirtualSize = 0x3000 (correctly aligned)
|---------------------------| SizeOfRawData
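The manipulation in the diagram boils down to overwriting one 32-bit field per section header. Below is a rough sketch of how that could be done on a PE file held in memory. It is not the tool I used; the function set_virtual_size and its helpers are hypothetical names, and the offsets are the ones given in the PE format documentation [1] (e_lfanew at 0x3C, a 20-byte COFF header, 40-byte section headers with VirtualSize at offset 8).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian accessors, independent of host alignment and byte order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Overwrite the VirtualSize of the section called `name` in the PE file
 * held in `image` (`len` bytes). Returns 0 on success, -1 if the buffer
 * does not look like a PE file or the section is not found.
 */
int set_virtual_size(uint8_t *image, size_t len, const char *name,
                     uint32_t new_size)
{
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    uint32_t pe = rd32(image + 0x3C);            /* e_lfanew */
    if (pe > len || len - pe < 24)               /* signature + COFF header */
        return -1;
    if (memcmp(image + pe, "PE\0\0", 4) != 0)
        return -1;

    uint16_t nsections = rd16(image + pe + 4 + 2);   /* NumberOfSections */
    uint16_t opt_size  = rd16(image + pe + 4 + 16);  /* SizeOfOptionalHeader */
    size_t table = (size_t)pe + 24 + opt_size;       /* first section header */

    /* Section names are 8 bytes, NUL-padded, not necessarily terminated. */
    char padded[8] = {0};
    strncpy(padded, name, sizeof padded);

    for (uint16_t i = 0; i < nsections; i++) {
        size_t hdr = table + (size_t)i * 40;
        if (hdr > len || len - hdr < 40)
            return -1;
        if (memcmp(image + hdr, padded, 8) == 0) {
            wr32(image + hdr + 8, new_size);         /* VirtualSize */
            return 0;
        }
    }
    return -1;
}
```

Calling set_virtual_size(buf, len, ".text", 0x2001) and the same for "ghidera" on the file contents, then writing the buffer back out, would give the patched layout from the diagram. Reading and writing the file is left out to keep the sketch short.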
Doing this in practice is straightforward: patch the VirtualSize header field of both sections, then load the result in Ghidra and look at the Bytes view.
[Screenshots: the patched file in Ghidra and the expected results]
While this isn’t a serious bug (let alone a vulnerability at all) I hope everyone can see how this could frustrate an analyst, especially with malware such as ShadowHammer where the only (obvious) indicator in the parts that Ghidra “sees” is one unresolvable call.
Anyway, my patch to fix this (and a related “issue”) in Ghidra is here: ghidra-peloader-alignment.patch
I hope that patch, first of all, doesn’t break more than it fixes, and second, gets included in the next version.
A video outlining some of this is available at https://www.youtube.com/watch?v=vpt7-Hn-Uhg
[1] Microsoft Docs Contributors, “PE format,” 2019. [Online]. Available: https://docs.microsoft.com/en-us/windows/desktop/debug/pe-format.