Memory Forensics: Code Injection and Extraction - Malware Analyst’s Cookbook and DVD: Tools and Techniques for Fighting Malicious Code (2011)

Chapter 16. Memory Forensics: Code Injection and Extraction

Malware leverages code injection to perform actions from within the context of another process. By doing so, the malware can force a legitimate process to perform actions on its behalf, such as downloading additional trojans or stealing information from the system. Attackers can inject code into a process in many ways, such as writing to the remote process’s memory directly or adding a registry key that makes new processes load a DLL of the attacker’s choice. This chapter discusses how you can determine if any processes on the system are victims of code injection, and if so, how you can extract the memory segments that contain malicious code.

Investigating DLLs

Every _EPROCESS structure contains a member that points to the PEB (Process Environment Block). The PEB contains the full path to the process executable, the full command line used to start the process, the current working directory, and three doubly linked lists that contain the full paths to the DLLs loaded by the process. All three lists should contain the same DLLs, but ordered differently: by their position in memory (InMemoryOrderModuleList), by when they were loaded (InLoadOrderModuleList), and by when they initialized (InInitializationOrderModuleList).

To enumerate the loaded DLLs in a process, you can parse the three doubly linked lists. Using WinDbg (once again on an XP system for our examples), you can see that at offset 0xC of the PEB there is a member named Ldr, which points to a PEB_LDR_DATA structure. As shown in the following code, the Ldr structure contains three doubly linked lists of type LDR_DATA_TABLE_ENTRY where you can find the DLL base address, size, and name.

kd> dt _PEB

ntdll!_PEB

+0x000 InheritedAddressSpace : UChar

+0x001 ReadImageFileExecOptions : UChar

+0x002 BeingDebugged : UChar

+0x003 SpareBool : UChar

+0x004 Mutant : Ptr32 Void

+0x008 ImageBaseAddress : Ptr32 Void

+0x00c Ldr : Ptr32 _PEB_LDR_DATA

+0x010 ProcessParameters : Ptr32 _RTL_USER_PROCESS_PARAMETERS

[...]

kd> dt _PEB_LDR_DATA

ntdll!_PEB_LDR_DATA

+0x000 Length : Uint4B

+0x004 Initialized : UChar

+0x008 SsHandle : Ptr32 Void

+0x00c InLoadOrderModuleList : _LIST_ENTRY

+0x014 InMemoryOrderModuleList : _LIST_ENTRY

+0x01c InInitializationOrderModuleList : _LIST_ENTRY

+0x024 EntryInProgress : Ptr32 Void

kd> dt _LDR_DATA_TABLE_ENTRY

ntdll!_LDR_DATA_TABLE_ENTRY

+0x000 InLoadOrderLinks : _LIST_ENTRY

+0x008 InMemoryOrderLinks : _LIST_ENTRY

+0x010 InInitializationOrderLinks : _LIST_ENTRY

+0x018 DllBase : Ptr32 Void

+0x01c EntryPoint : Ptr32 Void

+0x020 SizeOfImage : Uint4B

+0x024 FullDllName : _UNICODE_STRING

+0x02c BaseDllName : _UNICODE_STRING

[...]
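The list walk itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the offsets come from the 32-bit XP output above, while the flat buffer that stands in for process memory, the helper names (read_u32, walk_load_order, build_demo_memory), and the two-module demo list are hypothetical. A real tool would also have to translate virtual addresses before reading them.

```python
import struct

# Offsets taken from the 32-bit XP WinDbg output above.
LDR_IN_LOAD_ORDER_LIST = 0x0C   # _PEB_LDR_DATA.InLoadOrderModuleList
ENTRY_DLL_BASE = 0x18           # _LDR_DATA_TABLE_ENTRY.DllBase
ENTRY_SIZE_OF_IMAGE = 0x20      # _LDR_DATA_TABLE_ENTRY.SizeOfImage


def read_u32(mem, addr):
    """Read a little-endian 32-bit value from a flat buffer (assumption)."""
    return struct.unpack_from("<I", mem, addr)[0]


def walk_load_order(mem, ldr_addr, limit=1024):
    """Follow Flink pointers until the walk returns to the list head."""
    head = ldr_addr + LDR_IN_LOAD_ORDER_LIST
    entry = read_u32(mem, head)          # head.Flink -> first entry
    seen, modules = set(), []
    while entry != head and entry not in seen and len(modules) < limit:
        seen.add(entry)                  # guard against corrupted, looping lists
        # InLoadOrderLinks sits at offset 0, so the link address is also
        # the address of the _LDR_DATA_TABLE_ENTRY itself.
        modules.append((read_u32(mem, entry + ENTRY_DLL_BASE),
                        read_u32(mem, entry + ENTRY_SIZE_OF_IMAGE)))
        entry = read_u32(mem, entry)     # entry.InLoadOrderLinks.Flink
    return modules


def build_demo_memory():
    """Hypothetical two-module list laid out in a small synthetic buffer."""
    mem = bytearray(0x400)
    ldr, first, second = 0x100, 0x200, 0x300
    head = ldr + LDR_IN_LOAD_ORDER_LIST
    struct.pack_into("<II", mem, head, first, second)    # Flink, Blink
    struct.pack_into("<II", mem, first, second, head)
    struct.pack_into("<II", mem, second, head, first)
    for entry, base, size in ((first, 0x01000000, 0x6000),
                              (second, 0x7C900000, 0xB0000)):
        struct.pack_into("<I", mem, entry + ENTRY_DLL_BASE, base)
        struct.pack_into("<I", mem, entry + ENTRY_SIZE_OF_IMAGE, size)
    return mem, ldr


if __name__ == "__main__":
    memory, ldr_address = build_demo_memory()
    for base, size in walk_load_order(memory, ldr_address):
        print("0x%08x 0x%x" % (base, size))
```

To walk the other two lists with the same approach, you would subtract the link's offset within the entry (0x8 for InMemoryOrderLinks, 0x10 for InInitializationOrderLinks) from each Flink value to get back to the start of the entry.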

Table 16-1 contains a list of PEB members that we’ll discuss further in the recipes that follow.

Table 16-1: Important members of the PEB

Structure member                                  Description
------------------------------------------------  -----------------------------------------------------------
PEB.ProcessParameters.CommandLine                 The command-line parameters passed to the process
PEB.ProcessParameters.CurrentDirectory.DosPath    The current working directory for the process
PEB.Ldr.InLoadOrderModuleList                     The process’s modules/DLLs, listed in load order
PEB.Ldr.InMemoryOrderModuleList                   The process’s modules/DLLs, listed in memory order
PEB.Ldr.InInitializationOrderModuleList           The process’s modules/DLLs, listed in initialization order

Recipe 16-1: Hunting Suspicious Loaded DLLs

To print loaded DLLs with Volatility, use the dlllist command. If you do not specify a particular process with the -p argument, it prints DLLs for all processes. It is important to note that the dlllist command can list DLLs only for active, linked processes. In other words, you cannot use dlllist on processes that have terminated (even if their _EPROCESS structure still exists) or that a rootkit unlinked. Here is an example of what you should see:

$ python volatility.py dlllist -p 820 -f memory.bin

svchost.exe pid: 820

Command line : C:\WINDOWS\system32\svchost -k DcomLaunch

None

Base Size Path

0x1000000 0x6000 C:\WINDOWS\system32\svchost.exe

0x7c900000 0xb0000 C:\WINDOWS\system32\ntdll.dll

0x7c800000 0xf4000 C:\WINDOWS\system32\kernel32.dll

0x77dd0000 0x9b000 C:\WINDOWS\system32\ADVAPI32.dll

0x77e70000 0x91000 C:\WINDOWS\system32\RPCRT4.dll

0x5cb70000 0x26000 C:\WINDOWS\system32\ShimEng.dll

0x6f880000 0x1ca000 C:\WINDOWS\AppPatch\AcGenral.DLL

0x77d40000 0x90000 C:\WINDOWS\system32\USER32.dll

0x77f10000 0x46000 C:\WINDOWS\system32\GDI32.dll

0x76b40000 0x2d000 C:\WINDOWS\system32\WINMM.dll

0x774e0000 0x13c000 C:\WINDOWS\system32\ole32.dll

[...]

Unless you’re looking for a malicious DLL by name, the number of DLLs loaded into a given process may overwhelm you. It’s a good idea to view the output from various systems prior to conducting an investigation so you are familiar enough to spot discrepancies. Use the following guidelines to interpret the information; you want to look for:

· DLLs with suspicious names or names that you have never seen before.

· DLLs with common names that are loaded from a non-standard directory (for example C:\WINDOWS\sys\kernel32.dll).

· DLLs that allow access to protected resources or otherwise alter system security. For example, malware can load sfc_os.dll to disable Windows File Protection and pstorec.dll to extract credentials from the Windows Protected Storage.

· Legitimate DLLs that are out of context. For example, ws2_32.dll, wsock32.dll, wininet.dll, and urlmon.dll provide network functionality, which is certainly not malicious per se. However, if you see them loaded into processes, such as notepad.exe, that don’t usually access the Internet, then it might indicate the presence of malware that injects code (with networking dependencies) into processes on the system.
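The second guideline (common names loaded from a non-standard directory) is easy to automate once you have the paths from dlllist. The following is a minimal sketch, not a complete detector: the KNOWN_SYSTEM_DLLS set, the EXPECTED_DIR value, and the function name are our own assumptions, and a real baseline would come from known-good systems.

```python
import ntpath  # Windows path rules, usable on any analysis platform

# Hypothetical baseline: names we expect to load only from system32.
KNOWN_SYSTEM_DLLS = {"ntdll.dll", "kernel32.dll", "user32.dll", "advapi32.dll"}
EXPECTED_DIR = r"c:\windows\system32"


def flag_misplaced(paths):
    """Return paths whose file name is well known but whose directory is not."""
    flagged = []
    for path in paths:
        directory, name = ntpath.split(path)
        if name.lower() in KNOWN_SYSTEM_DLLS and directory.lower() != EXPECTED_DIR:
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    sample = [
        r"C:\WINDOWS\system32\ntdll.dll",
        r"C:\WINDOWS\sys\kernel32.dll",
    ]
    for hit in flag_misplaced(sample):
        print("suspicious:", hit)
```

The comparison is case-insensitive because Windows paths are, and because the same DLL often appears with different capitalization in dlllist output.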

On the other hand, sometimes you will be surprised how easy it is to spot malicious activity based on loaded DLLs. Although it is rare, attackers occasionally program bots in Python or Perl and then compile them into executables using py2exe or perl2exe, respectively. This produces a standalone program that does not require the Python or Perl interpreter on a target system. The basic idea is that the compiled executable contains the interpreter and any DLLs that the interpreter needs at runtime. Programs compiled with perl2exe therefore drop and load a main module named p2x587.dll (5.8.7 is the Perl version number) and various DLLs named according to the Perl modules. For example, if the Perl source code included “use Glob”, then the compiled executable would drop and load Glob.dll. Although it might be quick and easy to write malicious code in Python or Perl, the results stick out like a sore thumb.

$ python volatility.py dlllist -p 1572 -f perl2exebot.vmem

d546d36461fb948 pid: 1572

Command line : 1.tmp

Service Pack 2

Base Size Path

0x400000 0x5000 C:\1.tmp

0x7c900000 0xb0000 C:\WINDOWS\system32\ntdll.dll

0x7c800000 0xf4000 C:\WINDOWS\system32\kernel32.dll

0x77d40000 0x90000 C:\WINDOWS\system32\USER32.dll

0x77f10000 0x46000 C:\WINDOWS\system32\GDI32.dll

0x77c10000 0x58000 C:\WINDOWS\system32\MSVCRT.dll

0x28000000 0xd6000 C:\WINDOWS\TEMP\p2xtmp-1572\p2x587.dll

0x77dd0000 0x9b000 C:\WINDOWS\system32\ADVAPI32.dll

0x77e70000 0x91000 C:\WINDOWS\system32\RPCRT4.dll

0x10000000 0x5000 C:\WINDOWS\TEMP\p2xtmp-1572\Cwd.dll

0x1a50000 0x7000 C:\WINDOWS\TEMP\p2xtmp-1572\Socket.dll

0x1a60000 0x6000 C:\WINDOWS\TEMP\p2xtmp-1572\IO.dll

0x1a70000 0x6000 C:\WINDOWS\TEMP\p2xtmp-1572\Fcntl.dll

0x1e80000 0x6000 C:\WINDOWS\TEMP\p2xtmp-1572\Glob.dll

0x71ab0000 0x17000 C:\WINDOWS\system32\WS2_32.dll

0x71aa0000 0x8000 C:\WINDOWS\system32\WS2HELP.dll

0x71a50000 0x3f000 C:\WINDOWS\System32\mswsock.dll

0x76F16000 0x27000 C:\WINDOWS\system32\DNSAPI.dll

0x76fb0000 0x8000 C:\WINDOWS\System32\winrnr.dll

0x76f60000 0x2c000 C:\WINDOWS\system32\WLDAP32.dll

0x76fc0000 0x6000 C:\WINDOWS\system32\rasadhlp.dll

The malware used in the example is a variant of Zbot, which you can read more about on the ThreatExpert website [1].

[1] http://www.threatexpert.com/report.aspx?md5=26dc4f3221c7b5a3252fb33379d88a0a

Recipe 16-2: Detecting Unlinked DLLs with ldr_modules

The PEB for a process exists in user mode. Therefore, a process can hide the DLLs it has loaded by unlinking entries from one or more of the three module lists. Unlinking DLLs is similar to the DKOM attack described in Recipe 15-6, except that, because the lists exist in user mode, it does not require kernel-level privileges. This technique is demonstrated by CloakDLL [2] and NtIllusion [3], and is discussed with source code examples in an OpenRCE post [4]. When malware unlinks a DLL, tools such as listdlls.exe, Process Explorer, Process Hacker, and even Volatility’s default dlllist command will not show the unlinked DLL. This recipe describes a method of detecting the malicious behavior by comparing the PEB lists with data in the VAD.

LoadLibrary and Mapped Files

To understand how you can detect unlinked DLLs, consider some of the first actions performed by LoadLibrary:

· Opens a handle to the DLL on disk using ZwCreateFile

· Creates a section (virtual memory block) associated with the file handle using ZwCreateSection

· Maps a view of the section into the process’s address space using ZwMapViewOfSection, which makes the contents of the file available in memory

As a result of these actions, the kernel stores information that links the newly created section with its associated file (the DLL). By checking each allocated memory range in a process to see if it contains a mapped file (and if so, the name of the file), you can detect DLLs that are loaded in a process, even if there’s no entry for the DLL in the process’s PEB. The kernel stores the information you need in the VAD (Virtual Address Descriptor).

Brief Introduction to the VAD

The VAD is an excellent forensic resource because you can use it to determine which memory ranges are accessible in a given process’s virtual address space. When a process allocates memory with VirtualAlloc, the memory manager creates an entry in the VAD tree. Along with information such as the starting and ending addresses of the allocated memory block, the VAD contains some nested structures that, if present, can identify which file is mapped into the memory region.

The following WinDbg output shows the relevant data structures. We explain the VAD more thoroughly in Recipe 16-3, so for now, just know that if the VAD for a given memory range contains non-NULL ControlArea and ControlArea.FilePointer members, that means the memory range contains a mapped file.

kd> dt _MMVAD

nt!_MMVAD

+0x000 StartingVpn : Uint4B

+0x004 EndingVpn : Uint4B

+0x008 Parent : Ptr32 _MMVAD

+0x00c LeftChild : Ptr32 _MMVAD

+0x010 RightChild : Ptr32 _MMVAD

+0x014 u : __unnamed

+0x018 ControlArea : Ptr32 _CONTROL_AREA

+0x01c FirstPrototypePte : Ptr32 _MMPTE

+0x020 LastContiguousPte : Ptr32 _MMPTE

+0x024 u2 : __unnamed

kd> dt _CONTROL_AREA

nt!_CONTROL_AREA

+0x000 Segment : Ptr32 _SEGMENT

+0x004 DereferenceList : _LIST_ENTRY

+0x00c NumberOfSectionReferences : Uint4B

+0x010 NumberOfPfnReferences : Uint4B

+0x014 NumberOfMappedViews : Uint4B

+0x018 NumberOfSubsections : Uint2B

+0x01a FlushInProgressCount : Uint2B

+0x01c NumberOfUserReferences : Uint4B

+0x020 u : __unnamed

+0x024 FilePointer : Ptr32 _FILE_OBJECT

+0x028 WaitingForDeletion : Ptr32 _EVENT_COUNTER

+0x02c ModifiedWriteCount : Uint2B

+0x02e NumberOfSystemCacheViews : Uint2B

kd> dt _FILE_OBJECT

ntdll!_FILE_OBJECT

+0x000 Type : Int2B

+0x002 Size : Int2B

+0x004 DeviceObject : Ptr32 _DEVICE_OBJECT

+0x008 Vpb : Ptr32 _VPB

+0x00c FsContext : Ptr32 Void

+0x010 FsContext2 : Ptr32 Void

+0x014 SectionObjectPointer : Ptr32 _SECTION_OBJECT_POINTERS

+0x018 PrivateCacheMap : Ptr32 Void

+0x01c FinalStatus : Int4B

+0x020 RelatedFileObject : Ptr32 _FILE_OBJECT

+0x024 LockOperation : UChar

+0x025 DeletePending : UChar

+0x026 ReadAccess : UChar

+0x027 WriteAccess : UChar

+0x028 DeleteAccess : UChar

+0x029 SharedRead : UChar

+0x02a SharedWrite : UChar

+0x02b SharedDelete : UChar

+0x02c Flags : Uint4B

+0x030 FileName : _UNICODE_STRING

+0x038 CurrentByteOffset : _LARGE_INTEGER

+0x040 Waiters : Uint4B

+0x044 Busy : Uint4B

+0x048 LastLock : Ptr32 Void

+0x04c Lock : _KEVENT

+0x05c Event : _KEVENT

+0x06c CompletionContext : Ptr32 _IO_COMPLETION_CONTEXT

Based on this information, all DLLs loaded with LoadLibrary will result in a VAD structure that associates the DLL’s load address (StartingVpn) in memory with its file on disk (ControlArea.FilePointer.FileName). When malware unlinks a DLL from one or more of the PEB lists, it doesn’t affect the data in the VAD. Therefore, when performing an investigation, you can enumerate the memory-mapped files in a process and compare them with the lists in the PEB. If the VAD reports any DLLs that the PEB fails to mention, then the DLL is likely unlinked.
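The comparison described above boils down to a set lookup per mapped file. Here is a minimal sketch, assuming you have already extracted the mapped-file base addresses from the VAD and the base addresses from each PEB list; the function names and the sample data are hypothetical and only mirror the kind of result you would expect for a process with an unlinked DLL.

```python
def compare_vad_with_peb(vad_mapped, in_load, in_init, in_mem):
    """Build one row per mapped file: (base, InLoad, InInit, InMem, path).

    vad_mapped maps a base address to the file name found through the VAD;
    the other three arguments are sets of base addresses from the PEB lists.
    """
    rows = []
    for base in sorted(vad_mapped):
        rows.append((base, base in in_load, base in in_init,
                     base in in_mem, vad_mapped[base]))
    return rows


def unlinked(rows):
    """Mapped files that appear in none of the three PEB lists."""
    return [row for row in rows if not (row[1] or row[2] or row[3])]


if __name__ == "__main__":
    # Hypothetical data for a process that unlinked kernel32.dll.
    vad = {
        0x00400000: r"\unlinker.exe",
        0x7C800000: r"\WINDOWS\system32\kernel32.dll",
        0x7C900000: r"\WINDOWS\system32\ntdll.dll",
    }
    rows = compare_vad_with_peb(
        vad,
        in_load={0x00400000, 0x7C900000},
        in_init={0x7C900000},
        in_mem={0x00400000, 0x7C900000},
    )
    for base, load, init, mem, path in unlinked(rows):
        print("0x%08x missing from all PEB lists: %s" % (base, path))
```

Keying the comparison on base addresses rather than names avoids false mismatches caused by differences between the path stored in the PEB and the name stored in the file object.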

The Hiding Effect

To test unlinked DLL detection, we compiled a program called unlinker.exe using source code snippets from the proof-of-concept kits mentioned earlier. It unlinks the entry for kernel32.dll from all three PEB lists. After executing unlinker.exe, you can use listdlls.exe on the live Windows machine to list the loaded DLLs:

C:\>listdlls.exe unlinker.exe

ListDLLs v2.25 - DLL lister for Win9x/NT

Copyright (C) 1997-2004 Mark Russinovich

Sysinternals - www.sysinternals.com

-----------------------------------------------------------------------

unlinker.exe pid: 2368

Command line: "C:\unlinker.exe"

Base Size Version Path

0x00400000 0x13000 C:\unlinker.exe

0x7c900000 0xb2000 5.01.2600.5755 C:\WINDOWS\system32\ntdll.dll

As expected, the tool does not report kernel32.dll, because the tool enumerates DLLs by walking the lists in the PEB. We’re not picking on listdlls.exe—almost all utilities you can run on a live machine (with exception of Vmmap, which is discussed next) enumerate DLLs using the PEB lists.

Using Vmmap to View DLLs

You can verify that kernel32.dll is, in fact, loaded in unlinker.exe by using Vmmap (see Figure 16-1). The Vmmap program is able to report the loaded DLL because it does not rely on the PEB lists. Instead, it calls ZwQueryVirtualMemory with the MemoryBasicInformation and MemorySectionName information classes to obtain details about every allocated memory segment in the process. By using this native API function, Vmmap gets read access to members of the VAD, including the FILE_OBJECT structure, which contains the mapped file name.

Figure 16-1: You can use Vmmap to view memory-mapped images/DLLs

Using the Volatility ldr_modules Plug-in

You can use the ldr_modules plug-in for Volatility to inspect discrepancies between the PEB lists and the VAD. The plug-in shows the base addresses and full paths to all mapped executables in a process. It displays a column for each of the three PEB lists (abbreviated InLoad, InInit, and InMem), which contain True or False based on whether a DLL with the same base address exists in the list. You can render output in text or HTML. If you use the HTML output, then the plug-in will highlight entries that are missing from the PEB lists, making it easier to spot discrepancies. We use the command in the following manner:

$ python volatility.py ldr_modules -f unlinker.bin --output=html --output-file=report.html -p 2368

When you open the report, you should see something similar to what is shown in Figure 16-2.

Figure 16-2: Using ldr_modules to investigate unlinked DLLs

Here you can see that the process’s main module (unlinker.exe) is mapped at 0x00400000. The InLoad and InMem lists contain an entry for unlinker.exe, but the InInit list does not. This is completely normal—the initialization order list does not count the process’s main module (*.exe) as an entry, whereas the others do. However, the output also shows that kernel32.dll is missing from all three PEB lists.

Limitations of ldr_modules

There are two main arguments against the method that ldr_modules uses for detection. First, a rootkit can use DKOM to overwrite members of the VAD after unlinking a DLL from the lists in the PEB. Then it will appear as if there is no memory-mapped file. For example, during our testing, we performed the following steps:

1. Used Vmmap to find the memory segment associated with a given DLL in a process (user32.dll in cmd.exe, in our case)

2. Located the VAD structure in kernel memory for the DLL

3. Overwrote the ControlArea value of the VAD structure with a NULL pointer

4. Refreshed the Vmmap output. As a result of our change to the ControlArea value, Vmmap reported that the type of the memory segment (see the Type column of Figure 16-1) was Other rather than Image. In addition, the Details column of Vmmap’s output, which used to store the path to user32.dll, became empty.

5. Verified that the cmd.exe process remained running and that user32.dll was still accessible in the memory of cmd.exe

Based on our testing, we know it’s possible for malware to modify specific members of the VAD structures without causing short-term instability issues for the process. Our testing did not analyze long-term effects, such as what might happen if the memory manager tries to page some of the memory-mapped DLL back to disk (and can’t find out which file it belongs to). Either way, modifying the VAD structures would require a kernel rootkit rather than one that works completely in user mode. Thus, it would require more work on the attacker’s part to produce reliable and portable code. You can find more information on VAD data modification in the article titled “Hidden Dynamic-Link Library Detection Test” [5].

The second argument against the method used by ldr_modules is that it is possible to load DLLs into a process without using LoadLibrary (see “Reflective DLL Injection”). Loading a DLL this way does not create a mapped file in the VAD or any entries in the PEB. However, it leaves various other artifacts that you can detect by exploring the page protections for the memory allocated by the reflective loader.

[2] http://www.battleforums.com/forums/diablo-hacking/104427-cloakdll-cpp.html

[3] http://rootkit.com/board_project_fused.php?did=proj22

[4] http://www.openrce.org/blog/view/844/How_to_hide_dll

[5] http://www.ntinternals.org/dll_detection_test.php

Code Injection and the VAD

As previously discussed, the VAD is an excellent source of forensic information. In this section, we’ll leverage data in the VAD to hunt down hidden and injected code. In particular, you’ll learn how to identify suspicious memory segments based on VAD attributes, how to scan process memory with YARA signatures, and how to interpret artifacts left by API-hooking malware.

Recipe 16-3: Exploring Virtual Address Descriptors (VAD)

In this recipe, we’ll cover more of the VAD and how you can use it in your malware investigations. To learn more about the VAD, you should review a paper called “The VAD tree: A process-eye view of physical memory” [6] by Brendan Dolan-Gavitt. As Brendan explains in his paper, the VAD is a self-balancing binary tree: at any given node, memory addresses lower than the address of the current node can be found in the left subtree and higher addresses can be found in the right subtree. A process’s _EPROCESS structure contains a member named VadRoot, which points to the root of the tree. There are a few VAD-related commands that you can use in Volatility:

· vadinfo: prints verbose information containing each VAD’s attributes, mapped files, and properties.

· vadwalk: prints basic information about the VADs and outputs data in text columns.

· vadtree: prints basic information about the VADs and outputs data in tree format (also supports rendering in Graphviz dot format).

The VAD commands in Volatility start reading from a process’s VadRoot and print details about each accessible memory range. The following command shows how to use vadtree to generate a Graphviz dot file for the process with Pid 680:

$ python volatility.py vadtree -f memory.bin -p 680 --output=dot --output-file=vad.dot

When you open the resulting file in Graphviz, you’ll see an image similar to what is shown in Figure 16-3. Each node in the figure contains either two or three boxes; from top to bottom these mean:

· First box: The tag (Vad, Vadl, or VadS) associated with the pool that contains the VAD structure and the address in kernel memory where the structure exists.

· Second box: The starting and ending virtual addresses in the process’s memory space

· Third box (if applicable): The name of a memory-mapped file or image. This information is only available if the tag is type “Vad” or “Vadl” and if there is actually a file mapped into the range.

Figure 16-3: A process’s VAD tree in Graphviz
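The ordering property of the tree (lower addresses on the left, higher on the right) is what makes both enumeration and lookup simple. The sketch below illustrates the idea on a hand-built tree; the VadNode class, its field names, and the three sample ranges are hypothetical stand-ins, not Volatility’s own objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VadNode:
    """Simplified stand-in for a VAD node: an address range plus two children."""
    start: int
    end: int
    left: Optional["VadNode"] = None
    right: Optional["VadNode"] = None


def in_order(node):
    """Yield nodes in ascending address order (left, self, right)."""
    if node is None:
        return
    yield from in_order(node.left)
    yield node
    yield from in_order(node.right)


def find_range(root, address):
    """Descend from the root to the node whose range contains the address."""
    node = root
    while node is not None:
        if address < node.start:
            node = node.left
        elif address > node.end:
            node = node.right
        else:
            return node
    return None


if __name__ == "__main__":
    # Hypothetical three-node tree.
    root = VadNode(0x7C900000, 0x7C9B1FFF,
                   left=VadNode(0x00400000, 0x00412FFF),
                   right=VadNode(0x7FFAB000, 0x7FFABFFF))
    for node in in_order(root):
        print("%08x - %08x" % (node.start, node.end))
    print(find_range(root, 0x7C901000))
```

A lookup touches only one node per tree level, which is why a balanced tree suits the memory manager better than a flat list of allocations.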

The tag is very important because it identifies the type of VAD structure stored within the pool. There are three types of VAD structures, shown here from smallest to largest in size:

· “VadS” is type _MMVAD_SHORT

· “Vad” is type _MMVAD

· “Vadl” is type _MMVAD_LONG

Each larger type of VAD structure builds on the smaller one. In Brendan’s publication, he explains several differences between the structures, but the most important aspect is that _MMVAD_SHORT structures are the only ones that do not contain a nested _CONTROL_AREA structure. The memory manager automatically chooses which type of VAD structure to use based on the purpose of the allocated memory. For example, if the memory needs to store a mapped file, then the system will choose one of the larger VAD structures so that it can store information about the mapped file. You can view the different VAD structures with WinDbg using the following commands:

kd> dt _MMVAD_SHORT

nt!_MMVAD_SHORT

+0x000 StartingVpn : Uint4B

+0x004 EndingVpn : Uint4B

+0x008 Parent : Ptr32 _MMVAD

+0x00c LeftChild : Ptr32 _MMVAD

+0x010 RightChild : Ptr32 _MMVAD

+0x014 u : __unnamed

kd> dt _MMVAD

nt!_MMVAD

+0x000 StartingVpn : Uint4B

+0x004 EndingVpn : Uint4B

+0x008 Parent : Ptr32 _MMVAD

+0x00c LeftChild : Ptr32 _MMVAD

+0x010 RightChild : Ptr32 _MMVAD

+0x014 u : __unnamed

+0x018 ControlArea : Ptr32 _CONTROL_AREA

+0x01c FirstPrototypePte : Ptr32 _MMPTE

+0x020 LastContiguousPte : Ptr32 _MMPTE

+0x024 u2 : __unnamed

kd> dt _MMVAD_LONG

nt!_MMVAD_LONG

+0x000 StartingVpn : Uint4B

+0x004 EndingVpn : Uint4B

+0x008 Parent : Ptr32 _MMVAD

+0x00c LeftChild : Ptr32 _MMVAD

+0x010 RightChild : Ptr32 _MMVAD

+0x014 u : __unnamed

+0x018 ControlArea : Ptr32 _CONTROL_AREA

+0x01c FirstPrototypePte : Ptr32 _MMPTE

+0x020 LastContiguousPte : Ptr32 _MMPTE

+0x024 u2 : __unnamed

+0x028 u3 : __unnamed

+0x030 u4 : __unnamed
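Two small conversions come up constantly when you read these structures by hand: mapping a tag to its structure type, and turning the StartingVpn and EndingVpn members (virtual page numbers) into byte addresses. The following sketch assumes 4 KB pages, as on 32-bit x86; the helper names are our own.

```python
PAGE_SHIFT = 12  # assumes 4 KB pages (32-bit x86)

# Tag-to-structure mapping described in the text.
VAD_TYPES = {
    "VadS": "_MMVAD_SHORT",
    "Vad": "_MMVAD",
    "Vadl": "_MMVAD_LONG",
}


def has_control_area(tag):
    """Only the two larger structures carry a ControlArea pointer."""
    return tag in ("Vad", "Vadl")


def vpn_range_to_addresses(starting_vpn, ending_vpn):
    """Convert page numbers to the first and last byte address of the range."""
    start = starting_vpn << PAGE_SHIFT
    end = ((ending_vpn + 1) << PAGE_SHIFT) - 1
    return start, end


if __name__ == "__main__":
    start, end = vpn_range_to_addresses(0x7C900, 0x7C9B1)
    print("%s %08x-%08x" % (VAD_TYPES["Vad"], start, end))
```

The end address is inclusive, which is why it is computed from the page after EndingVpn minus one byte.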

To view detailed information about process memory, you can use the vadinfo command. The following output shows the details for the top two VAD nodes from Figure 16-3.

$ python volatility.py vadinfo -p 680 -f memory.bin

[...]

VAD node @821b9e60 Start 7ffab000 End 7ffabfff Tag Vadl

Flags: NoChange, PrivateMemory, MemCommit

Commit Charge: 1 Protection: 4

First prototype PTE: 00000000 Last contiguous PTE: 00000000

Flags2: LongVad, OneSecured

File offset: 00000000

Secured: 7ffab000 - 7ffabfff

Pointer to _MMEXTEND_INFO (or _MMBANKED_SECTION ?): 00000000

VAD node @821c3d18 Start 7c900000 End 7c9b1fff Tag Vad

Flags: ImageMap

Commit Charge: 5 Protection: 7

ControlArea @823c72d8 Segment e14cdcc8

Dereference list: Flink 00000000, Blink 00000000

NumberOfSectionReferences: 1 NumberOfPfnReferences: 105

NumberOfMappedViews: 30 NumberOfSubsections: 5

FlushInProgressCount: 0 NumberOfUserReferences: 31

Flags: Accessed, HadUserReference, DebugSymbolsLoaded, Image, File

FileObject @823e5f90 (023e5f90), Name: \WINDOWS\system32\ntdll.dll

WaitingForDeletion Event: 00000000

ModifiedWriteCount: 0 NumberOfSystemCacheViews: 0

First prototype PTE: e14cdd00 Last contiguous PTE: fffffffc

Flags2: Inherit

File offset: 00000000

[...]

The first VAD node, which exists at 821b9e60 in kernel memory, describes the addresses in range 7ffab000–7ffabfff of the process. The second VAD node at 821c3d18 describes the addresses in range 7c900000–7c9b1fff. Based on the tags (“Vadl” and “Vad,” respectively), a _CONTROL_AREA structure is available for both nodes, but it is only used in the second—to identify the memory-mapped image of ntdll.dll. Many other fields in the vadinfo output are useful to you in an investigation, especially the protection, which we describe in the next recipe.
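If you want to post-process vadinfo output rather than read it by eye, a pair of regular expressions is usually enough. This is a rough sketch written against the text layout shown above, not against any documented output format; in particular, it assumes the Protection field is printed as a decimal number, and the function name is hypothetical.

```python
import re

# Patterns modeled on the vadinfo text shown above (an assumption, not a spec).
NODE_RE = re.compile(
    r"VAD node @(?P<node>[0-9a-fA-F]+) Start (?P<start>[0-9a-fA-F]+) "
    r"End (?P<end>[0-9a-fA-F]+) Tag (?P<tag>\w+)")
PROT_RE = re.compile(r"Protection: (?P<prot>\d+)")


def parse_vadinfo(text):
    """Return one dict per VAD node with its range, tag, and protection."""
    nodes = []
    for line in text.splitlines():
        match = NODE_RE.search(line)
        if match:
            nodes.append({
                "node": int(match.group("node"), 16),
                "start": int(match.group("start"), 16),
                "end": int(match.group("end"), 16),
                "tag": match.group("tag"),
                "protection": None,
            })
            continue
        match = PROT_RE.search(line)
        # Attach the first Protection value that follows a node header.
        if match and nodes and nodes[-1]["protection"] is None:
            nodes[-1]["protection"] = int(match.group("prot"))
    return nodes


if __name__ == "__main__":
    sample = (
        "VAD node @821c3d18 Start 7c900000 End 7c9b1fff Tag Vad\n"
        "Flags: ImageMap\n"
        "Commit Charge: 5 Protection: 7\n"
    )
    print(parse_vadinfo(sample))
```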

[6] http://dfrws.org/2007/proceedings/p62-dolan-gavitt.pdf

Recipe 16-4: Translating Page Protections

The field named “Protection” in the vadinfo output describes what type of access is permitted on the memory region. The protection value is derived from the flProtect parameter that a process passes to VirtualAlloc. We say derived because the value that you find in a memory dump is not exactly the same as the flProtect value. This recipe shows you how to perform the translation. Before we begin, here is the function prototype for VirtualAlloc:

LPVOID WINAPI VirtualAlloc(

__in_opt LPVOID lpAddress,

__in SIZE_T dwSize,

__in DWORD flAllocationType,

__in DWORD flProtect

);

The flProtect parameter can be one of the following values, which are defined in WinNt.h. You can find explanations of the values on the Memory Protection Constants page of MSDN, but most of them are self-explanatory.

#define PAGE_NOACCESS 0x01

#define PAGE_READONLY 0x02

#define PAGE_READWRITE 0x04

#define PAGE_WRITECOPY 0x08

#define PAGE_EXECUTE 0x10

#define PAGE_EXECUTE_READ 0x20

#define PAGE_EXECUTE_READWRITE 0x40

#define PAGE_EXECUTE_WRITECOPY 0x80

#define PAGE_GUARD 0x100

#define PAGE_NOCACHE 0x200

#define PAGE_WRITECOMBINE 0x400

One of the protection values in the vadinfo output is 7; however, there is no corresponding definition for that value in WinNt.h. Although the header file has definitions for 4, 2, and 1 (which sum to 7), you cannot combine memory protection constants this way. In fact, combining 4, 2, and 1 would not make any sense, because it would indicate a page is marked as read/write, read-only, and no-access at the same time.

To interpret the protection field from the vadinfo output, you need to perform a translation between the values that user mode programs pass to VirtualAlloc and the values that the kernel stores in the VAD structures. Consider the following program that allocates memory using a few possible page protections and prints the allocated address:

#include <windows.h>
#include <stdio.h>
#include <tchar.h>

// Commit one page (0x1000 bytes) with the requested protection
#define VA(x) VirtualAlloc(NULL, 0x1000, MEM_COMMIT, (x))

int _tmain(int argc, _TCHAR* argv[])
{
    // Allocate memory with various protections and print
    // the base address of each allocated region
    printf("PAGE_EXECUTE: %p\n", VA(PAGE_EXECUTE));
    printf("PAGE_EXECUTE_READ: %p\n", VA(PAGE_EXECUTE_READ));
    printf("PAGE_EXECUTE_READWRITE: %p\n", VA(PAGE_EXECUTE_READWRITE));

    // Sleep so we can dump memory before the process exits
    Sleep(INFINITE);
    return 0;
}

Example output:

C:\> ProtectTest.exe

PAGE_EXECUTE: 00370000

PAGE_EXECUTE_READ: 00380000

PAGE_EXECUTE_READWRITE: 00390000

After running this program, dump memory of the target system and use vadinfo to find the VAD node for each of the three allocated regions.

$ python volatility.py vadinfo -p 3340 -f alloc.bin

[...]

VAD node @81f7cc98 Start 00370000 End 00370fff Tag VadS

Flags: PrivateMemory, MemCommit

Commit Charge: 1 Protection: 2

VAD node @81efaae0 Start 00380000 End 00380fff Tag VadS

Flags: PrivateMemory, MemCommit

Commit Charge: 1 Protection: 3

VAD node @82308448 Start 00390000 End 00390fff Tag VadS

Flags: PrivateMemory, MemCommit

Commit Charge: 1 Protection: 6

[...]

The protection value for the memory range starting at 00370000 is 2, although we allocated it as PAGE_EXECUTE, which has a value of 0x10. To translate the value of 2 into its original 0x10 counterpart, we have to use 2 as an index into the translation table, which is stored at a symbol named MmProtectToValue in the kernel executive module (we found this on Ivanlef0u’s blog [7]). Remember to start counting at 0 and not 1.

kd> dd nt!MmProtectToValue

805514e8 00000001 00000002 00000010 00000020

805514f8 00000004 00000008 00000040 00000080

80551508 00000001 00000202 00000210 00000220

80551518 00000204 00000208 00000240 00000280

80551528 00000001 00000102 00000110 00000120

80551538 00000104 00000108 00000140 00000180

80551548 00000001 00000302 00000310 00000320

80551558 00000304 00000308 00000340 00000380

There it is! Index 2 of the table holds 0x10. Now you know that whenever you see Protection: 2 in the vadinfo output, the memory is executable, since it was originally allocated with the PAGE_EXECUTE flag. Any attempt to read from or write to the memory range would result in an access violation. Table 16-2 provides a translation for a few of the common protection values.

Table 16-2: Page Protection Translations

Name                      WinNt.h   VAD
------------------------  --------  ----
PAGE_NOACCESS             0x1       0x0
PAGE_READONLY             0x2       0x1
PAGE_EXECUTE              0x10      0x2
PAGE_EXECUTE_READ         0x20      0x3
PAGE_READWRITE            0x4       0x4
PAGE_WRITECOPY            0x8       0x5
PAGE_EXECUTE_READWRITE    0x40      0x6
PAGE_EXECUTE_WRITECOPY    0x80      0x7
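The same lookup is easy to script. The sketch below copies the 32 values from the dd nt!MmProtectToValue output shown earlier (taken from one XP system, so treat the table as an assumption for any other build) and decodes each value into WinNt.h constant names. The function names are our own.

```python
# Values copied from the "dd nt!MmProtectToValue" output for the XP system
# in this recipe; other Windows builds may differ (assumption).
MM_PROTECT_TO_VALUE = [
    0x001, 0x002, 0x010, 0x020, 0x004, 0x008, 0x040, 0x080,
    0x001, 0x202, 0x210, 0x220, 0x204, 0x208, 0x240, 0x280,
    0x001, 0x102, 0x110, 0x120, 0x104, 0x108, 0x140, 0x180,
    0x001, 0x302, 0x310, 0x320, 0x304, 0x308, 0x340, 0x380,
]

BASE_NAMES = {
    0x01: "PAGE_NOACCESS", 0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE", 0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE", 0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE", 0x80: "PAGE_EXECUTE_WRITECOPY",
}
MODIFIERS = ((0x100, "PAGE_GUARD"), (0x200, "PAGE_NOCACHE"))


def translate(protection):
    """Map a VAD protection index to (WinNt.h value, constant names)."""
    value = MM_PROTECT_TO_VALUE[protection]
    names = [BASE_NAMES[value & 0xFF]]
    names += [name for bit, name in MODIFIERS if value & bit]
    return value, " | ".join(names)


def is_executable(protection):
    """True if the translated value is one of the PAGE_EXECUTE* constants."""
    return (MM_PROTECT_TO_VALUE[protection] & 0xF0) != 0


if __name__ == "__main__":
    for index in (2, 3, 6, 7):
        value, names = translate(index)
        print("Protection: %d -> 0x%x (%s)" % (index, value, names))
```

A helper like is_executable is handy when you only want to review the executable regions of a process.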

Being able to translate the page protections will come in handy when tracking down malicious code that may be hiding in another process. For example, sometimes you may only want to focus on memory ranges marked as executable. This is the theory behind detecting the reflective DLL injection described in Recipe 16-2 (for more information, see “FATKit: Detecting Malicious Library Injection and Upping the ‘Anti’” [8] by AAron Walters). It’s also the basis for detecting blocks of shellcode that exist in a process’s memory due to an exploit or due to a trojan such as Zeus, which we’ll explore in the next recipes.

[7] http://www.ivanlef0u.tuxfamily.org/?p=39

[8] http://www.4tphi.net/fatkit/papers/fatkit_dll_rc3.pdf

Recipe 16-5: Finding Artifacts in Process Memory

Although vadwalk, vadinfo, and vadtree are very useful, they only supply metadata. There is a fourth command, vaddump, which allows access to the actual data contained within the memory ranges, provided it is not paged to disk. This recipe shows a simple example of how to hunt down artifacts in a process’s memory using vaddump. For a similar story, see “Malware Forensics: How Ironic Can It Get?” [9]

The Experiment

To begin the example, follow these steps:

1. Log into a website. In our case, we logged into a Gmail account using Firefox. We entered the credentials MySecretUserName and MySecretPass, as shown in Figure 16-4, and clicked Sign in. Of course, the sign on failed, but because Firefox accepted the input and constructed an HTTP request using the credentials, we should be able to find traces of it in Firefox’s memory.

Figure 16-4: Anything you enter into the browser will be saved in the process’s memory

2. Acquire memory. Dump memory on your testing platform using one of the techniques described in Chapter 15.

3. Identify the target process. Use Volatility’s pslist command to find the process you used to log into the website.

$ python volatility.py pslist -f gmail.bin | grep firefox

Name Pid PPid Thds Hnds Time

firefox.exe 2288 4084 16 333 Fri Jan 08 04:29:10 2010

4. Dump the process’s memory. Use vaddump to extract each segment of the target process’s memory. The following command chooses to dump the memory segments to a directory named outdir.

$ python volatility.py vaddump -f gmail.bin -p 2288 --dump-dir=outdir

**************************************************************

Pid: 2288

5. What you should find in the output directory is a separate file that contains the data described by each VAD node. Volatility names the files according to the process name, the physical address of the process’s _EPROCESS structure (to distinguish between multiple processes with the same name), the start address of the memory range, and the end address of the memory range.

$ ls outdir | wc -l

316

$ ls -al outdir

[...]

4096 Jan 8 17:43 firefox.exe.21ef640.00010000-00010fff.dmp

4096 Jan 8 17:43 firefox.exe.21ef640.00020000-00020fff.dmp

1048576 Jan 8 17:43 firefox.exe.21ef640.00030000-0012ffff.dmp

12288 Jan 8 17:43 firefox.exe.21ef640.00130000-00132fff.dmp

8192 Jan 8 17:43 firefox.exe.21ef640.00140000-00141fff.dmp

262144 Jan 8 17:43 firefox.exe.21ef640.00150000-0018ffff.dmp

65536 Jan 8 17:43 firefox.exe.21ef640.00190000-0019ffff.dmp

[...]

6. The vaddump command extracted 316 files of various sizes. These are binary files, so we can combine the strings and grep commands in order to find traces of the credentials:

$ strings outdir/* | grep -i secret

MySecretUserName

MySecretp

MySecretU

MySecretPass

MySecretUserNa)

https://mail.google.com/mail?gxlu=MySecretUserName&zx=1262988197643

HTTP:https://mail.google.com/mail?gxlu=MySecretUserName&zx=1262988197643

https://mail.google.com/mail?gxlu=MySecretUserName&zx=1262988210481

The fact that the credentials exist in memory even though Gmail uses an SSL-protected website and the login occurred many minutes ago isn’t a surprise. Jeff Bryner wrote a Python script10 that can extract Gmail message bodies, contact lists, and other artifacts, even if the user logged out of Gmail with the browser. You have to wonder—what else can you find in a process’s memory?
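The strings and grep pipeline from step 6 can also be approximated in Python, which makes it easy to report where in the process address space each hit lives. The following is a minimal sketch, not part of Volatility: the function names are hypothetical, and it assumes the dump files follow the naming scheme shown in step 5 (process name, _EPROCESS offset, start address, end address).

```python
import re
from pathlib import Path

# File names look like: firefox.exe.21ef640.00030000-0012ffff.dmp
NAME_RE = re.compile(
    r"^(?P<proc>.+)\.(?P<eproc>[0-9a-f]+)\."
    r"(?P<start>[0-9a-f]{8})-(?P<end>[0-9a-f]{8})\.dmp$"
)

def parse_dump_name(filename):
    """Return (process, eprocess offset, start, end) or None if no match."""
    m = NAME_RE.match(filename)
    if m is None:
        return None
    return (m["proc"], int(m["eproc"], 16),
            int(m["start"], 16), int(m["end"], 16))

def ascii_strings(data, min_len=4):
    """Yield (offset, text) for runs of printable ASCII, roughly like strings."""
    for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        yield m.start(), m.group().decode("ascii")

def grep_dumps(dump_dir, keyword):
    """Yield (virtual address, text) for strings containing keyword, ignoring case."""
    needle = keyword.lower()
    for path in sorted(Path(dump_dir).glob("*.dmp")):
        parsed = parse_dump_name(path.name)
        base = parsed[2] if parsed else 0  # fall back to file offsets
        for offset, text in ascii_strings(path.read_bytes()):
            if needle in text.lower():
                yield base + offset, text
```

Calling grep_dumps("outdir", "secret") would walk the same files as the shell pipeline. Because the start address is recovered from the file name, each hit can be reported as a virtual address rather than a file offset.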

9 http://mnin.blogspot.com/2009/04/malware-forensics-how-ironic-can-it-get.html

10 http://www.jeffbryner.com/code/pdgmail

Recipe 16-6: Identifying Injected Code with Malfind and YARA


You can find supporting materials for this recipe on the companion DVD.

The last example showed how you could find particular artifacts in process memory, but it is limited in scope. If you do not know which credentials you are looking for or in which process they might exist, the procedure can become tedious. The malfind plug-in addresses some of these concerns by automating several of the steps involved in identifying suspicious memory ranges based on both the contents of memory and VAD characteristics, and optionally, a configurable list of signatures that you provide in YARA format. Here are a few of the possibilities using malfind:

· Dump memory ranges marked as executable and that do not contain mapped files. This detects a majority of shellcode and DLLs injected into a process by a malicious process.

· Search for bank domains, encryption or hashing constants, IP addresses or hostnames, instruction sequences, regular expressions, case-insensitive strings, or anything you can detect with a YARA signature.

· View hex dumps or disassemblies of suspicious areas of memory for a quick preview of its contents.

· Render output into text or HTML reports.

· Import modules like PEScanner from Recipe 3-8 or one of the antivirus submission modules from Recipe 4-4.

Table 16-3 shows the syntax for the malfind command.

Table 16-3: Malfind Syntax

Syntax | Req/Opt | Description
-f FILENAME, --file=FILENAME | Required | Path to memory dump file
-D DIR, --dump-dir=DIR | Required | Directory to store dumped memory segments
-p PID, --pid=PID | Optional | Process to inspect (if not specified, then all processes are inspected)
-Y YARARULES, --yara-rules=YARARULES | Optional | Path to YARA rules file (if not specified, then malfind only detects injections based on VAD characteristics)

Adding YARA to malfind

We introduced YARA back in Chapter 3 and we have been mentioning it consistently throughout this book. You can pass the same rulesets to malfind as you use in other investigations. However, you should consider creating additional rules for criteria that you expect to find in unpacked memory. In the following example, we create a YARA signature based on the Gmail credentials from the previous recipe and then search for hits in the memory of any process on the system.

rule credentials

{

meta:

description = "Malfind w/ Yara Example"

strings:

$a = "secret" nocase

condition:

any of them

}

You can pass the YARA rules file to malfind like this:

$ python volatility.py malfind -f gmail.bin -p 2288 --dump-dir=outdir --yara-rules=./example.yara

#

# firefox.exe (Pid: 2288)

#

[!] 0x00030000 - 0x0012ffff (Tag: VadS, Protection: 0x4 - MM_READWRITE)

Dumping to outdir/malfind.2288.30000-12ffff.dmp

YARA rule: credentials

Description: Malfind w/ Yara Example

Hit: MySecretUserName

0x0003315c 4d79536563726574-557365724e616d65 MySecretUserName

0x0003316c e2eff1ffe2eff1ff-e2eff1ffe2eff1ff ................

[!] 0x00e00000 - 0x00efffff (Tag: VadS, Protection: 0x4 - MM_READWRITE)

Dumping to outdir/malfind.2288.e00000-efffff.dmp

YARA rule: credentials

Description: Malfind w/ Yara Example

Hit: MySecretPass

0x00e322a0 4d79536563726574-5061737300000000 MySecretPass....

0x00e322b0 0000000000000000-0000000000000000 ................

[...]

The output shows two suspicious memory ranges in firefox.exe. One is 0x00030000–0x0012ffff and the other is 0x00e00000–0x00efffff. The ranges were marked as suspicious because YARA detected signature hits at offsets within the memory ranges, at 0x0003315c and 0x00e322a0 respectively. The plug-in extracted the contents of both memory ranges to a separate file in the output directory. It is important to note that because the process executable, loaded DLLs, and mapped files all exist in the process’s memory space, there is a corresponding VAD entry for them as well. Therefore, when you use malfind with YARA, the signatures apply to everything.
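To make the hit lines in the output easier to read, the following sketch reproduces the two pieces involved: a case-insensitive search over a memory range, and a 16-byte hex dump line at the hit address. It uses Python's re module as a stand-in for YARA's nocase modifier and is not how malfind itself is implemented; both function names are hypothetical.

```python
import re

def nocase_hits(data, keyword, base=0):
    """Yield (virtual address, matched bytes) for each case-insensitive hit."""
    pattern = re.compile(re.escape(keyword.encode("ascii")), re.IGNORECASE)
    for m in pattern.finditer(data):
        yield base + m.start(), m.group()

def hexdump_line(data, offset, base=0):
    """Format 16 bytes as 'address hex-hex ascii', similar to the output above."""
    chunk = data[offset:offset + 16]
    hexed = chunk.hex()
    # Non-printable bytes are shown as dots
    text = "".join(chr(b) if 0x20 <= b <= 0x7e else "." for b in chunk)
    return "0x%08x %s-%s %s" % (base + offset, hexed[:16], hexed[16:], text)
```

The base argument is the start address of the memory range, so the addresses printed match the process's virtual address space rather than offsets into the dumped file.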

Finding Injected Code

You can use malfind to hunt down hidden or injected code, even without YARA rules. To perform a typical code injection, malware will call VirtualAllocEx to allocate memory in the target process. This API call leaves artifacts that you can detect by looking at the tags and protections stored in the VAD. To demonstrate, the next example deals with Zeus, one of the most prevalent information-stealing malware families. Zeus has used the same method of code injection since 2006 to achieve a certain level of stealth and to hide from process listings. The following command shows how to render output in HTML with malfind.

$ python volatility.py malfind -f zeus.vmem --dump-dir=outdir --yara-rules=./rules.yara --output=html --output-file=zeus.html

Notice we didn’t supply a --pid this time. In this case, malfind scans the memory of all processes on the system. Your output will appear like the image in Figure 16-5. In particular, you’ll see a header line describing the location of the suspicious memory segment, which includes the process in which it was found, the starting and ending address, the VAD tag, number of YARA hits, and the page protection. Below each header, you’ll find the details, including the name of the YARA rule that was triggered, a hex dump of the content in the memory dump, and information on the dumped PE file per the PEScanner module from Recipe 3-8.

Figure 16-5: Code injected into the System process as a result of Zeus


Although we only show one entry in Figure 16-5, you will notice that Zeus injects code into all processes on the system except csrss.exe. Zeus avoids csrss.exe because any programming errors within the injected code will cause the target process to crash. In the case of csrss.exe, that would shut down the entire system.
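The VAD-based check described at the start of this section (executable protection, no mapped file) can be sketched as a simple filter. This is an illustration only: the VadNode class is a hypothetical stand-in for the data Volatility reads from each VAD node, and the protection indexes assume the standard MM_* ordering, in which 0x4 is MM_READWRITE (as in the earlier malfind output) and 0x6 is MM_EXECUTE_READWRITE.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed protection indexes that include execute permission:
# 2 MM_EXECUTE, 3 MM_EXECUTE_READ, 6 MM_EXECUTE_READWRITE, 7 MM_EXECUTE_WRITECOPY
EXECUTABLE = {2, 3, 6, 7}

@dataclass
class VadNode:
    """Hypothetical record holding the VAD fields the check needs."""
    start: int
    end: int
    tag: str
    protection: int
    mapped_file: Optional[str] = None

def suspicious(nodes):
    """Return nodes that are executable and are not backed by a mapped file."""
    return [n for n in nodes
            if n.protection in EXECUTABLE and n.mapped_file is None]
```

A loaded DLL is executable too, but it has a file mapped into its range, so it is skipped. Memory allocated with VirtualAllocEx and marked executable has no backing file, which is why it stands out.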

If a PE header exists at the base address of the suspicious memory segment, then malfind uses Volatility’s executable rebuilding functions instead of just dumping a raw copy of the memory. This saves a step or two if you plan on analyzing the injected code in IDA, because the PE file will already be properly structured. Based on the suspicious PE section names in Figure 16-5 (.odkx, .itiz, and .ryd), it appears malfind worked as intended. To verify, you can run strings on the dumped files and see that many of the references are for stealing protected storage passwords and performing HTML injection/TAN-grabbing.

$ strings outdir/malfind.4.400000-427fff.dmp

[...]

PStoreCreateInstance

pstorec.dll

IE Cookies:

software\microsoft\internet explorer\main

POST

GetProcAddress

LoadLibraryA

=-=-PaNdA!$2+)(*

&email=

btn=

*<select

*<option selected

*<input *value="

[...]

Conficker and CoreFlood

Conficker and CoreFlood are two other examples of malware that inject code into a target process (albeit, by using completely different methods than Zeus). With these two families, and undoubtedly several others, you will not find a PE header at the base address of the memory segment. This is because Conficker overwrites the entire memory page containing its PE header with zeros. Similarly, CoreFlood actually frees the memory page using VirtualFree. Of course, the point is to make the detection and extraction procedure more difficult. Many dumping utilities such as ProcDump and LordPE will not even recognize these trojans as loaded DLLs, much less be able to determine the required information about sections and sizes (which usually comes from fields in the PE header).

A missing PE header doesn’t mean you’re doomed. You can manually rebuild the PE header after dumping the segments with Volatility (see Recovering CoreFlood Binaries with Volatility11) or even write a plug-in for Volatility that automates the steps (see the video on fixiat.py plug-in12).
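Checking whether a dumped segment still begins with a PE header takes only a few lines. The sketch below is one assumed way to do it, testing the two standard signatures (MZ at offset 0 and PE\0\0 at the offset stored in e_lfanew); the function name is hypothetical and this is not Volatility's own implementation.

```python
import struct

def has_pe_header(data):
    """Return True if data starts with an MZ header that points to a PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew (offset 0x3c) holds the file offset of the PE signature
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3c)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

A segment whose first page was zeroed (as with Conficker) or freed (as with CoreFlood) fails this check, which is the signal to fall back to the base address and size recorded in the VAD.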

The following command uses malfind to locate CoreFlood’s injected code in the memory of Internet Explorer:

$ python volatility.py malfind -f coreflood.vmem --dump-dir=outdir -p 248

#

# IEXPLORE.EXE (Pid: 248)

#

0x7ff80000 - 0x7ffadfff (Tag: VadS, Protection: MM_EXECUTE_READWRITE)

Dumping to outdir/malfind.248.7ff80000-7ffadfff.dmp

Hexdump:

0x7ff80000 81ec20010000538b9c24300100008bc3 .. ...S..$0.....

0x7ff80010 240455f6d856578bbc24340100006805 $.U..VW..$4...h.

Disassembly:

0x7ff80000 sub esp,0x120

0x7ff80006 push ebx

0x7ff80007 mov ebx,[esp+0x130]

0x7ff8000e mov eax,ebx

0x7ff80010 and al,0x4

0x7ff80012 push ebp

0x7ff80013 neg al

0x7ff80015 push esi

0x7ff80016 push edi

0x7ff80017 mov edi,[esp+0x134]

0x7ff8001e push dword 0x105

As you can see, it looks like plain shellcode or an EXE/DLL without a PE header. Because the page protection is executable (MM_EXECUTE_READWRITE), malfind prints a disassembly of a small portion of the code using the pydasm library. If the memory is read-only or read-write, then malfind only prints a hex dump.

API Hook Artifacts

Another artifact that you will frequently see using malfind is the trampoline code created by API-hooking libraries such as Microsoft Detours, Mhook, and any malware using the same common technique of inline/trampoline-style redirection (see Recipe 9-8 for more information and for links to the mentioned tools). The following examples show the output of malfind on two memory dumps (one infected with Silent Banker and one infected with Tigger).

$ python volatility.py malfind -f sb.vmem --dump-dir=outdir -p 1876

#

# IEXPLORE.EXE (Pid: 1876)

#

0x01390000 - 0x01390fff (Tag: VadS, Protection: MM_EXECUTE_READWRITE)

Dumping to out/malfind.1876.1390000-1390fff.dmp

Hexdump:

0x01390000 586805003a016800000000680000807c Xh..:.h....h...|

0x01390010 6868180b105068e7990a10c300000000 hh...Ph.........

Disassembly:

0x01390000 pop eax

0x01390001 push dword 0x13a0005

0x01390006 push dword 0x0

0x0139000b push dword 0x7c800000

0x01390010 push dword 0x100b1868

0x01390015 push eax

0x01390016 push dword 0x100a99e7

0x0139001b ret ; Execution continues at 0x100a99e7

0x01280000 - 0x01280fff (Tag: VadS, Protection: MM_EXECUTE_READWRITE)

Dumping to out/malfind.1876.1280000-1280fff.dmp

Hexdump:

0x01280000 68010000106a016800000a10b8cf4c0a h....j.h......L.

0x01280010 10ffd0c3000000000000000000000000 ................

Disassembly:

0x01280000 push dword 0x10000001

0x01280005 push byte 0x1

0x01280007 push dword 0x100a0000

0x0128000c mov eax,0x100a4ccf

0x01280011 call eax ; Execution continues at 0x100a4ccf

0x01280013 ret

$ python volatility.py malfind -f tigger.vmem --dump-dir=outdir -p 644

#

# explorer.exe (Pid: 644)

#

0x00d70000 - 0x00d70fff (Tag: VadS, Protection: MM_EXECUTE_READWRITE)

Dumping to out/malfind.644.d70000-d70fff.dmp

Hexdump:

0x00d70000 8bff558bec6a1355ff250000d8000000 ..U..j.U.%......

0x00d70010 00000000000000000000000000000000 ................

Disassembly:

0x00d70000 mov edi,edi

0x00d70002 push ebp

0x00d70003 mov ebp,esp

0x00d70005 push byte 0x13

0x00d70007 push ebp

0x00d70008 jmp [0xd80000] ; Execution continues at the address stored at 0xd80000

You might notice that Silent Banker used two different techniques to transfer control to the destination address. In the first example, it used a push/ret combination to arrive at 0x100a99e7. In the second example, it moved the destination address 0x100a4ccf into the eax register and then issued a call eax command. Tigger used yet another technique—an indirect jmp to the address stored at 0xd80000. The point is—regardless of the technique or instruction sets that the malware uses, it does not change the fact that the instructions exist in memory pages marked as executable and that do not already have files mapped into the region. Therefore, these memory segments stand out as suspicious and you can quickly identify them using Volatility with malfind. One component of the puzzle that malfind does not solve in these cases is telling you which API function is hooked. For that, you can use the apihooks plug-in, which is discussed in Chapter 17.
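The three control-transfer styles seen in these dumps have recognizable byte encodings, so a rough first pass can find them without a disassembler. The sketch below is a simplification under stated assumptions: it matches raw byte patterns only (68 imm32 C3 for push/ret, B8 imm32 FF D0 for mov eax/call eax, FF 25 addr32 for an indirect jmp), so unlike the pydasm-based disassembly it can report bytes that are really data or instruction operands. The function name and labels are hypothetical.

```python
import re
import struct

# (label, byte pattern); group 1 captures the 32-bit little-endian operand
PATTERNS = [
    ("push/ret", re.compile(rb"\x68(.{4})\xc3", re.DOTALL)),
    ("mov eax/call eax", re.compile(rb"\xb8(.{4})\xff\xd0", re.DOTALL)),
    ("jmp [mem]", re.compile(rb"\xff\x25(.{4})", re.DOTALL)),
]

def find_transfers(data, base=0):
    """Return sorted (address, label, operand) tuples for each pattern hit."""
    hits = []
    for label, pattern in PATTERNS:
        for m in pattern.finditer(data):
            (operand,) = struct.unpack("<I", m.group(1))
            hits.append((base + m.start(), label, operand))
    return sorted(hits)
```

For push/ret and mov eax/call eax the operand is the destination itself. For jmp [mem] the operand is the address of a pointer, so the real destination must be read from that location in the process's memory.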

11 http://mnin.blogspot.com/2008/11/recovering-coreflood-binaries-with.html

12 http://mhl-malware-scripts.googlecode.com/files/coreflood_fixiat.mov.zip

Reconstructing Binaries

One of the most useful features of Volatility is the ability to dump and rebuild PE files (executables, DLLs, and kernel drivers). Because of changes that occur during execution of a program, it is not likely that you will get an exact copy of the original binary, or even one that will run on another machine. However, the dumped copy should be close enough to the original to allow you to disassemble the malware and determine its capabilities, reverse any algorithms, and so forth.

The smallest page size on a typical 32-bit x86 Windows system is 4,096 bytes. Most PE files have sections that are not exact multiples of the smallest page size. Figure 16-6 shows the effect that this has on reconstructing binaries. The .text section, which is not an exact multiple of 4,096, must fully exist in memory marked as RX (read, execute) and the .data section must fully exist in memory marked as RW (read, write). Because protections are applied at the page level (in other words, if a page is marked as executable, then all bytes in the page are executable), the two sections must be separated once loaded into memory. Otherwise, the beginning of the .data section would end up being RX instead of RW.
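The amount of slack space follows directly from the page size. As a small worked sketch (the function name and the example section size are hypothetical), rounding a section up to whole pages gives both the number of pages it occupies and the unused bytes at the end of its last page:

```python
PAGE_SIZE = 4096  # smallest page size on 32-bit x86 Windows

def pages_and_slack(section_size, page_size=PAGE_SIZE):
    """Return (pages occupied, slack bytes in the last page)."""
    pages = -(-section_size // page_size)  # ceiling division
    return pages, pages * page_size - section_size
```

For example, a hypothetical 6,704-byte .text section occupies two pages (8,192 bytes), leaving 1,488 bytes of slack before the next section can begin on its own page.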

The dotted lines in Figure 16-6 indicate page boundaries and the filled-in areas represent slack space due to section sizes that are not multiples of the smallest page size. Thus, if you dump an image in memory directly to disk, your dumped copy will also contain the slack space. In some cases, the slack space will be irrelevant to your investigation, because it will just contain uninitialized data. However, there certainly could be artifacts in slack space (just like slack space on disk).

Figure 16-6: Executables expand in memory due to section alignment

Volatility can dump images with or without slack space, depending on which command you use (see Recipe 16-7). In general, to rebuild an executable from memory, you need to parse the PE section headers to learn the addresses and sizes of the PE sections. Then, you can carve out the appropriate amount of data from memory and re-combine the sections into a file on disk according to their original positions. For a deeper explanation of the steps involved in rebuilding binaries, see the following resources:

· Andreas Schuster’s multi-part tutorial on reconstructing binaries from memory dumps: http://computer.forensikblog.de/en/2006/04/reconstructing_a_binary.html

· Harlan Carvey’s blog on automatic reconstruction of binaries from memory dumps: http://windowsir.blogspot.com/2006/07/automatic-binary-reassembly-from-ram.html

· Jesse Kornblum’s blog “Recovering Executables from Windows Memory Images:” http://jessekornblum.com/presentations/dodcc07.html

The methods described in the existing publications rely on information in the PE header and don’t attempt to reconstruct the Import Address Table (IAT). Malware samples that erase the entire PE header, relocate the IAT, or that use run-time dynamic linking (which does not leave entries in the IAT at all) cause significant problems. You’ll still be able to dump the binary using the base address and size information from the PE header (if it exists) or the base address and size information from the VAD; however, you won’t be able to tell which API functions the malware calls. In the next few recipes, we present a method to work around these anti-analysis techniques based on scanning the process address space for API calls, without relying on data in the IAT.
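The basic rebuild procedure outlined earlier (read the section table, carve each section from memory, and write it back at its on-disk position) can be sketched as follows. This is a simplified illustration, not Volatility's code: the Section tuple is a hypothetical stand-in for the fields parsed from the PE section headers, the memory image is assumed to be fully resident with index 0 at the module's base address, and no attempt is made to repair the IAT.

```python
from collections import namedtuple

# Hypothetical subset of a PE section header
Section = namedtuple(
    "Section", "name virtual_address virtual_size raw_offset raw_size")

def rebuild(memory_image, header_size, sections):
    """Re-combine in-memory sections at their original file offsets."""
    out = bytearray(memory_image[:header_size])  # headers are copied as-is
    for s in sorted(sections, key=lambda s: s.raw_offset):
        # Carve only the bytes that belong to the section, not the slack
        size = min(s.virtual_size, s.raw_size)
        data = memory_image[s.virtual_address:s.virtual_address + size]
        if len(out) < s.raw_offset:
            out.extend(b"\x00" * (s.raw_offset - len(out)))
        # Pad with zeros up to the section's size on disk
        out[s.raw_offset:s.raw_offset + s.raw_size] = data.ljust(s.raw_size, b"\x00")
    return bytes(out)
```

Dropping the min() and copying whole pages instead would preserve slack space, which is the difference between the two dumping modes mentioned above.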

Recipe 16-7: Rebuilding Executable Images from Memory


You can find supporting materials for this recipe on the companion DVD.

You can use Volatility’s procexedump (does not preserve slack space) or procmemdump (preserves slack space) commands to extract processes from memory. Table 16-4 shows the most important command-line switches. To see all possible switches, pass --help to one of the commands.

Table 16-4: Procdump Syntax

Syntax | Req/Opt | Description
-f FILENAME, --file=FILENAME | Required | Path to memory dump file
-o OFFSET, --offset=OFFSET | Optional | _EPROCESS offset in physical memory for the process to dump
-p PID, --pid=PID | Optional | Process to dump (if not specified, then all processes are dumped)
-D DIR, --dump-dir=DIR | Optional | Output path for dumped files

The first step is to use pslist or psscan to generate a list of processes. Once you know the PID or _EPROCESS offset for the process that you want to dump, then you can pass it to procexedump or simply leave off the -p parameter to dump all processes. In the following example, we will investigate a system infected with the Laqma trojan. For the sake of brevity, we removed all processes from the output except lanmanwrk.exe (the potential malware sample) and jusched.exe (a legitimate component of Java that we chose at random for some comparisons). You will notice an obvious difference between the ability to rebuild the IAT of these two processes. The difference is often caused by packers or anti-analysis tricks, or simply because the required memory segments were paged to disk at the time of the acquisition.

$ python volatility.py pslist -f laqma.vmem

Name Pid PPid Thds Hnds Time

[...]

jusched.exe 1788 1624 1 26 Thu Sep 18 05:33:02 2008

lanmanwrk.exe 920 612 2 37 Wed Feb 11 20:31:35 2009

$ python volatility.py procexedump -f laqma.vmem --dump-dir=outdir

[...]

************************************************************************

Dumping jusched.exe, pid: 1788 output: executable.1788.exe

************************************************************************

Dumping lanmanwrk.exe, pid: 920 output: executable.920.exe

Now, retrieve the two dumped files and open them in your favorite PE viewer (we like CFF Explorer, as mentioned in Chapter 13). Examine the IAT for executable.1788.exe (originally jusched.exe), and you will notice that it appears to contain the right information. As shown in Figure 16-7, the IAT lists the DLLs required by the process and each API function imported from the respective DLLs.

Figure 16-7: The Legitimate Process’s IAT is Properly Rebuilt.


Examine the IAT for executable.920.exe (originally lanmanwrk.exe) and you will notice that it contains significantly less information than executable.1788.exe. As shown in Figure 16-8, the IAT of our dumped lanmanwrk.exe contains DLL names, but none of the imported function names.

At this point, you could load the dumped file in IDA Pro and try your best to determine its capabilities without IAT information. Or you could scan the file with multiple antivirus engines to see if they detect anything in the unpacked process image. However, what we typically want to do is perform more thorough reverse-engineering tasks, which requires information about the imported functions. The next recipe describes where to go from here.

Figure 16-8: The malware’s IAT is not rebuilt, perhaps due to packing


Recipe 16-8: Scanning for Imported Functions with impscan


You can find supporting materials for this recipe on the companion DVD.

The reason you should be concerned with an incomplete IAT is that it will hinder your ability to perform a thorough code analysis. If you try to examine the instructions in the dumped file using IDA Pro, then you will see placeholders instead of API calls. For example, Figure 16-9 shows how the start function of the dumped lanmanwrk.exe appears. You can tell it calls two functions, but which two functions does it call? The placeholders (dword_406034 and dword_406030) are locations in the program’s IAT that store the address of an API function at runtime. However, because IDA does not have access to the entire process’s memory, it cannot determine what APIs exist at those addresses in order to label them.

Figure 16-9: Missing IAT information can hinder your analysis in IDA Pro


The impscan plug-in for Volatility aims to solve the problem of incomplete import tables. As previously mentioned, it is very unlikely that the dumped program will match the original or even execute on another machine. That is fine because all you really need to complete a thorough analysis of the malware’s capabilities is to be able to see which API functions it is calling in the disassembly. Therefore, impscan does not attempt to produce a patched version of the dumped file as Import REConstructor does for live systems (see Recipe 12-10). Instead, it simply provides labels that you can import into IDA Pro. Table 16-5 shows the syntax for impscan.

Table 16-5: Impscan Syntax

Syntax | Req/Opt | Description
-f FILENAME, --file=FILENAME | Required | Path to memory dump file
-D DIR, --dump-dir=DIR | Required | Output directory for dumped files
-k, --kernel | Optional | By specifying this flag, you intend to scan a kernel module. If it is not specified, then you intend to scan a user mode process.
-p PID, --pid=PID | Optional | Process ID that identifies the target process context. It is required for user mode scans. If the -k flag is set, this parameter is ignored.
-a ADDR, --address=ADDR | Optional | Base address to start scanning. If the -k flag is set, this parameter is required. If a valid PE header does not exist at this address, then the -s parameter is also required. For user mode scans, this parameter is not required if you intend to scan the executable image itself. If you intend to scan a DLL or arbitrary memory segment in the target process memory, then this parameter is required.
-s SIZE, --size=SIZE | Optional | Size of memory to scan. This is only required if there is not a PE header at the address specified with the -a parameter.

The following command shows you how to scan the lanmanwrk.exe process for imported functions.

$ python volatility.py impscan -p 920 -f laqma.vmem --dump-dir=outdir

***********************************************************

Kernel & User Mode Import Scanner

#Exports Base DLL

675 77dd0000 \WINDOWS\system32\advapi32.dll

609 77f10000 \WINDOWS\system32\gdi32.dll

117 71ab0000 \WINDOWS\system32\ws2_32.dll

858 77f60000 \WINDOWS\system32\shlwapi.dll

94 5ad70000 \WINDOWS\system32\uxtheme.dll

242 771b0000 \WINDOWS\system32\wininet.dll

1315 7c900000 \WINDOWS\system32\ntdll.dll

23 71aa0000 \WINDOWS\system32\ws2help.dll

514 77e70000 \WINDOWS\system32\rpcrt4.dll

398 77120000 \WINDOWS\system32\oleaut32.dll

76 77fe0000 \WINDOWS\system32\secur32.dll

949 7c800000 \WINDOWS\system32\kernel32.dll

183 773d0000 \WINDOWS\WinSxS\x86_Microsoft.Win[REMOVED]

287 77a80000 \WINDOWS\system32\crypt32.dll

339 774e0000 \WINDOWS\system32\ole32.dll

732 7e410000 \WINDOWS\system32\user32.dll

266 77b20000 \WINDOWS\system32\msasn1.dll

830 77c10000 \WINDOWS\system32\msvcrt.dll

Scanning process memory: 0x400000 - 0x40a000

Imports found: 68

Forward vicinity scan from 0x406000...found 0 new entries

Reverse vicinity scan from 0x408a9c...found 2 new entries

Done. Identified 70 imports!

MakeName(0x406000, "ControlService");

MakeName(0x406004, "RegDeleteValueA");

MakeName(0x406008, "RegCloseKey");

MakeName(0x40600c, "DeleteService");

MakeName(0x406010, "OpenSCManagerA");

MakeName(0x406014, "CreateServiceA");

[...]

impscan works by determining the base address and size of all DLLs in a process. Using pefile, it parses the Export Address Table (EAT) of the DLLs to determine the offsets and names of exported functions (i.e., the APIs). Then, using pydasm, it scans the process executable (or any memory range in the process address space as specified with the -a and -s flags) looking for call or jmp instructions. If the destination of one of the call or jmp instructions leads to an API, then impscan records the address of the instruction and the corresponding API function name.
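The core idea can be illustrated with a much-reduced sketch. It is not the plug-in's code: instead of disassembling with pydasm it matches the raw encodings of indirect call (FF 15) and jmp (FF 25) instructions, the exports dictionary is a hypothetical stand-in for the table built from the DLLs' EATs, and the function names are made up for this example. For each match it reads the pointer slot the instruction references and labels that slot, which is the address form used in the MakeName statements above.

```python
import re
import struct
from typing import Dict

# FF 15 addr32 = call dword ptr [addr]; FF 25 addr32 = jmp dword ptr [addr]
INDIRECT = re.compile(rb"\xff[\x15\x25](.{4})", re.DOTALL)

def scan_imports(image: bytes, base: int, exports: Dict[int, str]) -> Dict[int, str]:
    """Map pointer-slot addresses to API names for indirect calls and jumps."""
    found = {}
    for m in INDIRECT.finditer(image):
        (slot,) = struct.unpack("<I", m.group(1))
        offset = slot - base
        if not 0 <= offset <= len(image) - 4:
            continue  # the slot lies outside the scanned range
        (target,) = struct.unpack_from("<I", image, offset)
        if target in exports:  # the pointer leads to a known export
            found[slot] = exports[target]
    return found

def make_names(found):
    """Render the results as IDC MakeName statements."""
    return ['MakeName(0x%x, "%s");' % (slot, name)
            for slot, name in sorted(found.items())]
```

Byte-pattern matching can be fooled by data that happens to contain FF 15 or FF 25, which is one reason a real scanner benefits from proper disassembly.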

As shown in the output, impscan produces MakeName statements, which you can transfer into IDA Pro. These statements contain the missing information that IDA needs to link the placeholders presented earlier (e.g., dword_406034) with the name of the API function stored at that address. To apply the labels, click File ⇒ IDC Command, paste in the MakeName statements, and click OK. Figure 16-10 shows how your window should appear.

Figure 16-10: Entering IDC statements into IDA Pro


Once you have clicked OK, you will immediately see changes applied throughout the program. For example, the call ds:dword_406034 instructions will turn into call ds:CreateThread. You can get even more information out of IDA Pro by choosing to re-analyze the program. Now that IDA can tell which API functions the program is calling, IDA can label arguments accordingly. To do this, click Options ⇒ General ⇒ Analysis ⇒ Reanalyze Program. Your result should appear like Figure 16-11. Note that the figure shows the same start function as Figure 16-9, but with the new changes applied.

Figure 16-11: The malware in IDA Pro after importing IAT information


Recipe 16-9: Dumping Suspicious Kernel Modules


You can find supporting materials for this recipe on the companion DVD.

Windows maintains a doubly linked list of LDR_DATA_TABLE_ENTRY structures that you can use to enumerate the list of loaded modules on a system. If these structures sound familiar, it’s because Windows also uses them to store the list of loaded DLLs in a process (see the Investigating DLLs section at the beginning of this chapter).

The modules command in Volatility prints a list of loaded kernel modules by walking the list of LDR_DATA_TABLE_ENTRY structures. Because of the nature of the doubly linked list, it is possible for malware to unlink entries and hide drivers. However, just as psscan (see Recipe 15-6) provides you with the capability to detect unlinked processes, the modscan2 command gives you the power to detect unlinked kernel modules. Just compare the output between modules and modscan2 and see if there are any discrepancies.
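Once both listings are reduced to a mapping of base address to module name, the comparison is a simple difference. The sketch below assumes you have already parsed the two outputs into dictionaries; the function name and the hidden.sys entry in the usage example are hypothetical.

```python
def unlinked_modules(listed, scanned):
    """Return modules found by scanning that are absent from the linked list.

    Both arguments map a module's base address to its name.
    """
    return {base: name for base, name in scanned.items() if base not in listed}

# Hypothetical example: one driver appears only in the scan results
listed = {0x804d7000: "ntoskrnl.exe", 0x806ce000: "hal.dll"}
scanned = {0x804d7000: "ntoskrnl.exe", 0x806ce000: "hal.dll",
           0xf8d00000: "hidden.sys"}
hidden = unlinked_modules(listed, scanned)
```

Keep in mind that a scanner can also surface stale structures left behind by drivers that have already unloaded, so a discrepancy is a lead to investigate rather than proof of hiding.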

Listing Loaded Modules

The following command shows you how to list loaded modules. In this example, we are using the same memory dump infected with Laqma as described in the previous two recipes. So that each line will fit on the page without wrapping, we removed the size field of the normal output.

$ python volatility.py modules -f laqma.vmem

File Base Name

\WINDOWS\system32\ntkrnlpa.exe 0x00804d7000 ntoskrnl.exe

\WINDOWS\system32\hal.dll 0x00806ce000 hal.dll

\WINDOWS\system32\KDCOM.DLL 0x00f8b9a000 kdcom.dll

\WINDOWS\system32\BOOTVID.dll 0x00f8aaa000 BOOTVID.dll

[...]

\SystemRoot\system32\DRIVERS\srv.sys 0x00f66fd000 srv.sys

\SystemRoot\System32\Drivers\HTTP.sys 0x00f643c000 HTTP.sys

\SystemRoot\system32\drivers\kmixer.sys 0x00f622e000 kmixer.sys

\??\C:\WINDOWS\System32\lanmandrv.sys 0x00f8c52000 lanmandrv.sys

On a typical system, there will be well over 100 drivers loaded, which makes it difficult to determine which driver is suspicious. Here are a few techniques you can use to spot the needle in the haystack:

· Use the modules command and look near the end of the list to see the most recently loaded driver. This technique is useful if you encounter a machine very shortly after a compromise. Otherwise, and especially if the machine has been rebooted since the infection, you cannot rely on this method.

· Use brute force—dump all drivers and scan them with your favorite antivirus program or your custom YARA signatures.

· Use one of the hook detection plug-ins (apihooks, driverirp, ssdt, idt) to determine which drivers are responsible for the hooks. These plug-ins are introduced in Chapter 17.

· Many kernel drivers are installed by a user mode process, which remains running on the system to communicate with the driver after it has loaded. In these cases, you can examine the user mode process and its memory to try to locate the name of the driver or the name of the device (e.g., \Device\zyyssb).

· Microsoft’s recommended method of installing drivers, which also happens to be the most popular among malware authors, is to use a service. Instead of trying to detect a malicious driver by name, look for new service entries with the svcscan plug-in (see Recipe 17-10), which shows the driver name associated with a service.

Dumping Kernel Modules

Once you’ve identified a malicious driver, you can use the moddump plug-in to perform the extraction. Table 16-6 shows the syntax (for all options, use moddump --help).

Table 16-6: Moddump Syntax

Syntax | Req/Opt | Description
-f FILENAME, --file=FILENAME | Required | Path to memory dump file
-D DIR, --dump-dir=DIR | Optional | Output directory for dumped files
-o OFFSET, --offset=OFFSET | Optional | Dump module whose base address is OFFSET (hex)
-p REGEX, --pattern=REGEX | Optional | Dump modules whose name matches REGEX
-i, --ignore-case | Optional | Ignore case in pattern matching

If you use moddump without the -o or -p parameters, then it will dump all kernel drivers. Here, we extract the lanmandrv.sys driver using its base address, as shown in the modules output.

$ python volatility.py moddump -o f8c52000 -f laqma.vmem

Dumping \??\C:\WINDOWS\System32\lanmandrv.sys

(lanmandrv.sys) @f8c52000 => driver.f8c52000.sys

The dumped file (driver.f8c52000.sys) will no doubt suffer from the same incomplete IAT problem as the user mode processes, especially if the driver was initially packed. You can use impscan to help resolve the imports so that IDA can recognize the API calls. Notice that this is nearly the same command used in Recipe 16-8, but with the -k flag for kernel mode and the -a flag specifying the base address of lanmandrv.sys.

$ python volatility.py impscan -k -a 0xf8c52000 -f laqma.vmem --dump-dir=outdir

*********************************************************

Kernel & User Mode Import Scanner

#Exports Base Driver

1485 804d7000 ntoskrnl.exe

92 806ce000 hal.dll

8 f8b9a000 kdcom.dll

[...]

Scanning kernel memory: 0xf8c52000 - 0xf8c53700

Imports found: 13

Forward vicinity scan from 0xf8c53080...found 0 new entries

Reverse vicinity scan from 0xf8c533bc...found 0 new entries

Done. Identified 13 imports!

MakeName(0xf8c53080, "IofCompleteRequest");

MakeName(0xf8c53084, "IoDeleteDevice");

MakeName(0xf8c53088, "IoDeleteSymbolicLink");

MakeName(0xf8c5308c, "IoCreateSymbolicLink");

MakeName(0xf8c53090, "MmGetSystemRoutineAddress");

MakeName(0xf8c53094, "IoCreateDevice");

MakeName(0xf8c53098, "ExAllocatePoolWithTag");

MakeName(0xf8c5309c, "wcscmp");

MakeName(0xf8c530a0, "ZwOpenKey");

MakeName(0xf8c530a4, "_except_handler3");

MakeName(0xf8c533ac, "NtQueryDirectoryFile");

MakeName(0xf8c533b4, "NtQuerySystemInformation");

MakeName(0xf8c533bc, "NtOpenProcess");

Now you can import the MakeName statements into IDA Pro just as we did for the user mode process. The result is a nicely labeled kernel driver (see Figure 16-12), where you can see the names of the devices that it creates and the API calls it makes. In this case, you can even see the KeServiceDescriptorTable string, which usually indicates that the rootkit hooks API functions in the SSDT. Chapter 17 shows you how to detect hooked SSDT functions.

Figure 16-12: The rebuilt kernel driver in IDA Pro
