The Antivirus Hacker's Handbook (2015)

Part I. Antivirus Basics

Chapter 3. The Plug-ins System

Antivirus plug-ins are small components of the antivirus software that offer support for some specific task. They are not typically part of the antivirus kernel itself. The core of the antivirus product loads them through various methods and uses them at runtime.

Plug-ins are not a vital part of the core libraries and are intended to enhance the features supported by the antivirus core. They can be considered add-ons. Some example plug-ins include a PDF parser, an unpacker for a specific EXE packer (such as UPX), an emulator for Intel x86, a sandbox on top of the emulator, or a heuristic engine using statistics gathered by other plug-ins. These plug-ins are usually loaded at runtime using manually created loading systems that typically involve decryption, decompression, relocation, and loading.

This chapter covers some loading implementations of typical antivirus plug-ins and analyzes the loading process. Heuristic-based detection algorithms, emulators, and script-based plug-ins will also be covered. After you complete this chapter, you should be able to

· Understand how plug-in loaders work

· Analyze a plug-in's code and know where to look for vulnerabilities

· Research and implement evasion techniques

Understanding How Plug-ins Are Loaded

Each antivirus company designs and implements a completely different way to load its plug-ins. The most common way is to allocate Read/Write/eXecute (RWX) memory pages, decrypt and decompress the plug-in file contents to the allocated memory, relocate the code if appropriate (like Bitdefender does), and finally remove the write (W) privilege from the page or pages. Those new memory pages, which now constitute a plug-in module, are added to the loaded plug-ins list.

Other AV companies ship the plug-ins as Dynamic Link Libraries (DLLs), making the loading process much simpler by relying on the operating system's library loading mechanism (for example, using the LoadLibrary API in Microsoft Windows). In that case, to protect the plug-in's code and logic, the DLLs often implement code and data obfuscation. For example, the Avira antivirus product encrypts all the strings in its plug-in DLLs and decrypts them in memory when the plug-in is loaded (with a simple XOR algorithm and a fixed key stored in the actual plug-in code).
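The string obfuscation described above can be illustrated with a minimal sketch. It assumes a repeating XOR key; the key value and the sample string are hypothetical and are not taken from any real product.

```python
def xor_decrypt(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical fixed key, standing in for one embedded in the plug-in code.
KEY = b"\x5a"

# XOR is its own inverse, so the same routine encrypts and decrypts.
encrypted = xor_decrypt(b"UPX unpacker", KEY)
decrypted = xor_decrypt(encrypted, KEY)
print(decrypted)
```

Because the key ships inside the plug-in, an analyst who locates it can decrypt every string statically, which is why this kind of obfuscation only slows the first steps of an analysis.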

In another example, Kaspersky Anti-Virus uses a different approach to loading plug-ins: the plug-in updates are distributed as object files in the COFF file format and are then linked to the antivirus core.

The following sections discuss the various plug-in loading approaches and their advantages and disadvantages.

A Full-Featured Linker in Antivirus Software

Instead of dynamically loading libraries or creating RWX pages and patching them with the contents of the plug-ins, Kaspersky distributes its updates in the Common Object File Format (COFF). After being decrypted and decompressed, these files are linked together, and the newly generated binary forms the new core, with all of the plug-ins statically linked. From an antivirus design point of view, this method offers low memory usage and faster start-up. On the other hand, it requires Kaspersky developers to write and maintain a full-featured linker.

Note

The Common Object File Format is used to store compiled code and data. COFF files are then used in the final compilation stage—the linking stage—to produce an executable module.
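As a rough illustration of what a linker (or an analyst) sees first in such a file, the following sketch parses the 20-byte COFF file header as laid out in Microsoft's PE/COFF specification. The function name and the returned dictionary keys are choices made for this example only.

```python
import struct

# COFF file header: Machine, NumberOfSections, TimeDateStamp,
# PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
COFF_HEADER = struct.Struct("<HHIIIHH")  # 20 bytes, little-endian

MACHINES = {0x014C: "i386", 0x8664: "x86-64"}


def parse_coff_header(data: bytes) -> dict:
    """Decode the fixed-size header at the start of a COFF object file."""
    (machine, n_sections, timestamp, symtab_offset,
     n_symbols, opt_header_size, characteristics) = COFF_HEADER.unpack_from(data, 0)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "sections": n_sections,
        "timestamp": timestamp,
        "symbol_table_offset": symtab_offset,
        "symbols": n_symbols,
        "optional_header_size": opt_header_size,
        "characteristics": characteristics,
    }
```

A nonzero symbol count is what makes object files attractive to analyze: the names the linker needs are still present in the file.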

The update files are distributed in the form of many little files with an *.avc extension, for example, base001.avc. These files start with a header like this:

0000  41 56 50 20 41 6E 74 69 76 69 72 61 6C 20 44 61  AVP Antiviral Da
0010  74 61 62 61 73 65 2E 20 28 63 29 4B 61 73 70 65  tabase. (c)Kaspe
0020  72 73 6B 79 20 4C 61 62 20 31 39 39 37 2D 32 30  rsky Lab 1997-20
0030  31 33 2E 00 00 00 00 00 00 00 00 00 00 00 0D 0A  13..............
0040  4B 61 73 70 65 72 73 6B 79 20 4C 61 62 2E 20 31  Kaspersky Lab. 1
0050  36 20 53 65 70 20 32 30 31 33 20 20 31 30 3A 30  6 Sep 2013  10:0
0060  32 3A 31 38 00 00 00 00 00 00 00 00 00 00 00 00  2:18............
0070  00 00 00 00 00 00 00 00 00 00 00 00 0D 0A 0D 0A  ................
0080  45 4B 2E 38 03 00 00 00 01 00 00 00 E9 66 02 00  EK.8.........f..

In this example, there is an ASCII header with the banner, “AVP Antiviral Database. (c)Kaspersky Lab 1997-2013”; a padding with the 0x00 characters; the date of distribution (“Kaspersky Lab. 16 Sep 2013 10:02:18”); and more padding with the 0x00 characters. Starting at offset 0x80, the header ends, and actual binary data follows. This binary data is encrypted with a simple XOR-ADD algorithm. After it is decrypted, the data is decompressed with a custom algorithm. After decompression, you have a set of COFF files that are linked together (using routines in the AvpBase.DLL library) so the target operating system can use them.
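The header layout can be sketched in a few lines. The field boundaries below (a 0x40-byte banner field followed by a 0x40-byte date field) are inferred from the single dump shown in this section and are an assumption, not a documented format. The XOR-ADD decryption and the custom decompression are deliberately left out because their details are not given here.

```python
HEADER_SIZE = 0x80  # binary data starts at this offset in the dump
MAGIC = b"AVP Antiviral Database"


def _text(field: bytes) -> str:
    """Drop NUL padding and CR/LF terminators from a fixed-width field."""
    return field.replace(b"\x00", b"").strip().decode("ascii")


def parse_avc_header(data: bytes) -> dict:
    """Split an .avc file into banner, build stamp, and the undecoded body."""
    if len(data) < HEADER_SIZE or not data.startswith(MAGIC):
        raise ValueError("not an AVC update file")
    return {
        "banner": _text(data[0x00:0x40]),
        "build": _text(data[0x40:0x80]),
        "body": data[HEADER_SIZE:],  # still encrypted and compressed
    }


# Header rebuilt from the hex dump; the body is cut to the 16 bytes shown.
sample = (
    b"AVP Antiviral Database. (c)Kaspersky Lab 1997-2013.".ljust(0x3E, b"\x00")
    + b"\r\n"
    + b"Kaspersky Lab. 16 Sep 2013  10:02:18".ljust(0x3C, b"\x00")
    + b"\r\n\r\n"
    + bytes.fromhex("45 4B 2E 38 03 00 00 00 01 00 00 00 E9 66 02 00")
)
info = parse_avc_header(sample)
print(info["banner"], "|", info["build"])
```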

This approach to loading plug-ins appears to be exclusive to the Kaspersky antivirus kernel. This plug-in loading process is discussed later in this chapter.

Understanding Dynamic Loading

Dynamic loading is the most typical way of loading antivirus plug-ins. The plug-in files are either inside a container file (such as the PAV.SIG file for Panda Antivirus, the *.VPS files for Avast, or the Microsoft antivirus *.VDM files) or spread across many small files (as in the case of Bitdefender). These files are usually encrypted (although each vendor uses a different type of encryption) and compressed, commonly with zlib. The plug-in files are first decrypted, when appropriate (for example, Microsoft does not use encryption for its antivirus database files; they are just compressed), then decompressed, and then loaded in memory. To load them in memory, the antivirus core typically creates RWX pages on the heap, copies the content of each decrypted and decompressed file to the newly created memory page, adjusts the privileges of the page, and, if required, relocates the code in memory.
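The decrypt, decompress, and relocate steps can be simulated without mapping any executable memory. The sketch below is a toy model under stated assumptions: a repeating-XOR cipher, zlib compression, and a relocation table given as a list of offsets to 32-bit little-endian absolute addresses. Real products use their own ciphers and formats, and the function names here are hypothetical.

```python
import struct
import zlib


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher standing in for a vendor-specific algorithm."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def load_plugin(blob: bytes, key: bytes, relocs: list,
                preferred_base: int, actual_base: int) -> bytearray:
    """Return the plug-in image as it would look once placed at actual_base."""
    # 1. Decrypt, 2. decompress into a writable buffer.
    image = bytearray(zlib.decompress(xor_bytes(blob, key)))
    # 3. Relocate: shift every absolute address by the load delta.
    delta = actual_base - preferred_base
    for offset in relocs:
        (address,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (address + delta) & 0xFFFFFFFF)
    # 4. A real loader would now drop the write permission on the pages.
    return image
```

The relocation step is what ties the image to one particular base address, and it is the reason notes made against one run do not line up with the next.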

Reverse-engineering an antivirus product that uses this approach is more difficult than reverse-engineering products that use the static object linking approach (as Kaspersky does), because Address Space Layout Randomization (ASLR) places the segments at different memory addresses each time the core is loaded. As a result, the comments, assigned function names, and so on in IDA are not relocated to the new page where the plug-in's code resides each time you run the debugger. There are partial solutions to this problem: for example, using the open-source IDA plug-in “Diaphora” or the commercial Zynamics BinDiff, you can perform binary diffing (also called bindiffing) on the process as it is in memory against a database that contains the comments and the function names.

The bindiffing process allows the reverse-engineer to import names from a previously analyzed IDA database into a new instance of the same code (loaded at a different memory address). However, a reverse-engineer needs to run the plug-in code each time the debugger is loaded, which is tedious. There are other open-source approaches such as the IDA plug-in MyNav, which has import and export capabilities that may help you access the plug-in code you need. However, it suffers from the very same problem: a reverse-engineer needs to reload plug-ins for each execution.

Some antivirus kernels do not protect their plug-ins; these plug-ins are simply libraries that can be opened in IDA and debugged. However, this approach is used very rarely—indeed, only in the case of Comodo antivirus.

A Note About Containers

Rather than distribute each plug-in as an independent file, some antivirus products use containers with all the updated files inside them. If the antivirus product you are targeting uses a container file format, you will need to research that format before you can access the files inside it. From the viewpoint of the antivirus company, both methods offer benefits and drawbacks. If a container is used, the intellectual property is somewhat more “protected” because research is needed to reverse-engineer the file format of the container and write an unpacker. On the other hand, distributing a single, large file to customers can make updates slower and more expensive. Distributing the plug-in files as many small files means that an update may involve only a few bytes or kilobytes instead of a multi-megabyte file. Depending on the size and quantity of the update files that are served, researchers can get a rough idea of the capabilities of the antivirus core in question: more code means more features.

Advantages and Disadvantages of the Approaches for Packaging Plug-ins

Antivirus engineers and reverse-engineers have different viewpoints when assessing the advantages and disadvantages of the two approaches to packaging plug-ins. For engineers, the dynamic loading approach is the easiest, but it is also the most problematic one. Antivirus products that offer plug-ins that are encrypted, compressed, and loaded dynamically in memory have the following disadvantages, from a developer's point of view:

· They consume more memory.

· Developers must write specific linkers so the code compiled with Microsoft Visual C++, Clang, or GCC can be converted to a form the antivirus kernel understands.

· They make it significantly more difficult for developers to debug their own plug-ins. Often, developers are forced to hard-code INT 3 instructions or use OutputDebugString or printf for debugging. However, such calls are not always available. For example, OutputDebugString is not an option in Linux or Mac OS X. Furthermore, some plug-ins are not native code, such as those for the Symantec Guest Virtual Machines (GVMs).

· Developers are forced to create their own plug-ins loader for each operating system. Naturally, the different loaders must be maintained, thus the work is multiplied by the number of different operating systems the antivirus company supports (commonly two or three: Windows, Mac OS X, and Linux), although most of the code can be shared.

· If the code copied to memory needs to be relocated, the complexity significantly increases, as does the time required to load a plug-in.

The complexity of developing such a system is increased because files that are encrypted and compressed require a whole new file format. Also, because generated binaries are not standard executables (like PE files, MachO files, or ELF files), antivirus developers must create a specific signing scheme for their antivirus plug-in files. However, antivirus developers are not doing this as often as they should. Indeed, most antivirus software does not implement any kind of signing scheme for its update files besides simple CRC32 checks.
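A short sketch shows why a CRC32 check is not a substitute for a signature: the checksum needs no secret, so anyone who can modify an update file can also recompute it. The payloads and the function name are hypothetical.

```python
import zlib


def crc32_ok(payload: bytes, stored_crc: int) -> bool:
    """Integrity check of the kind described above: detects corruption only."""
    return (zlib.crc32(payload) & 0xFFFFFFFF) == stored_crc


# A legitimate update and its checksum.
original = b"plug-in code v1"
original_crc = zlib.crc32(original)

# Accidental corruption is caught...
print(crc32_ok(b"plug-in cod3 v1", original_crc))

# ...but an attacker simply recomputes the checksum for the new payload.
tampered = b"attacker-supplied code"
forged_crc = zlib.crc32(tampered)
print(crc32_ok(tampered, forged_crc))
```

A real signing scheme would verify the payload against a public key whose private half never leaves the vendor, which a recomputed checksum cannot satisfy.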

From the viewpoint of an antivirus engineer, antivirus kernels using the Kaspersky approach have the following advantages:

· They consume less memory.

· Developers can debug their native code with any debugging tool.

On the other hand, this approach has the following disadvantages:

· Developers must write their own full-featured linker inside the antivirus core. This is not a trivial task.

· The linker must be written and maintained for any supported platform (although most code will be shared).

Each antivirus company must decide which scheme is best for it. Unfortunately, it sometimes seems like antivirus product designers simply implement the first method that they come up with, without thinking about the implications or how much work will be required later to maintain it or, even worse, port it to new operating systems, such as Linux and Android or Mac OS X and iOS. This is the case with various antivirus products implementing a loader for PE files for both Linux and Mac OS X. Their plug-ins were created as non-standard PE files (using the PE header as the container for the plug-in but with a totally different file format than usual PE files) for only the platform that was supported at the time (Windows), and they did not think about porting the code in the future to other platforms. Many antivirus companies are affected by the same design failure: an excessive focus on Windows platforms.

From a reverse-engineering point of view, however, there is a clear winner: object files that are linked together on the machine running the AV product are the easiest to analyze. There are several reasons why plug-ins distributed as object files or as standard operating system libraries make the antivirus product easier to reverse-engineer:

· If the antivirus product implements a linker and distributes all plug-in files as COFF objects, the COFF objects can be directly opened with IDA. They contain symbols because the linker needs them. These symbols will make it considerably easier to start analyzing the inner workings of the antivirus product being targeted.

· If the files are simple libraries supported by the operating system, you can just load them in IDA and start the analysis. Depending on the platform, symbols may be available (as is typical in the Linux, *BSD, and Mac OS X versions).

If the antivirus product uses a dynamic loading approach with modules that are not in a standard operating system format, you need to decrypt the plug-in files and decode them into a form that can be loaded in IDA or any other reverse-engineering tool. Also, because the code is loaded in the heap and ASLR is in effect, the modules will be loaded at a different address on every run. Debugging a piece of code can be really tedious because every time the debugger is launched, the code is located at a different position, and all the comments, names, and notes you made during the disassembly are lost unless the IDA database is manually rebased correctly. IDA does not correctly rebase code in debugging segments. The same applies to breakpoints: if you put a breakpoint on some instruction and relaunch the debugger, the breakpoint is likely to be at an invalid memory address because the code changed its base address.
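One common workaround is to keep notes and breakpoints as offsets from the module base and recompute the absolute addresses on every run. The following sketch shows only the arithmetic; the addresses and the function name are hypothetical, and it does not use any IDA API.

```python
def rebase_notes(notes: dict, old_base: int, new_base: int) -> dict:
    """Move address-keyed notes from one load address to another."""
    return {new_base + (address - old_base): text
            for address, text in notes.items()}


# Notes taken while the plug-in was loaded at 0x00A00000 (hypothetical).
notes = {
    0x00A01234: "decrypt_strings",
    0x00A02000: "breakpoint: parser entry",
}

# On the next run the plug-in landed at 0x02B50000 instead.
for address, text in sorted(rebase_notes(notes, 0x00A00000, 0x02B50000).items()):
    print(hex(address), text)
```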

Note

You might think that it is better to implement a dynamic loading approach in order to protect the intellectual property of your antivirus products. However, making an analyst's work a bit more difficult initially does not really protect anything. It just makes it more challenging to analyze the product, and it makes the analysis more difficult for only the first steps.

Types of Plug-ins

There are many different plug-in types: some plug-ins simply extend the list of compressors supported by antivirus products, and other plug-ins implement complex detection and disinfection routines for file infectors (such as Sality or Virut). Some plug-ins can be considered helpers for the antivirus engineers (because they export functionality useful for generic detections and disinfections, like disassembler engines, emulators, or even new signature types), or they can be loaders of new, completely different, plug-in types, such as plug-ins for antivirus-specific virtual machines (like routines to unpack the first layers of VMProtect in order to retrieve the license identifier) or support for scripting languages. Understanding the antivirus plug-in loading system and the supported plug-in types is essential to any analyst who wants to know how an antivirus product really works. This is because the most interesting features of an antivirus kernel are not in the kernel but in the components that it loads.

The following sections cover some of the more common (and less common) plug-ins supported by antivirus products.

Scanners and Generic Routines

The most common plug-in type in any antivirus is a scanner. A scanner is a plug-in that performs some kind of scanning of specific file types, directories, user and kernel memory, and so on. An example plug-in of this type is an Alternate Data Streams (ADS) scanner. The core kernel typically offers only the ability to analyze files and directories (and sometimes, userland memory) using the operating-system-supplied methods (that is, CreateFile or the open syscall). However, in some file systems, such as HFS+ (in Mac OS X) and NTFS (in Windows), files can be hidden in alternate data streams so the core routines know nothing about them. Such a plug-in is an add-on to the antivirus core that can list, iterate, and launch other scanning routines against all files discovered in an ADS.

Other scanner types can offer the ability to scan memory when this ability is not directly offered by the antivirus product, or they might offer direct access to kernel memory (as the Microsoft antivirus does) by communicating with a kernel driver. Other scanner types can be launched only after being triggered by another plug-in. For example, while scanning a file, if a URL is discovered inside the file, the URL scanner is triggered. The scanner checks the validity of the URL to determine whether it is red-flagged as malicious.

When reverse-engineering to find security bugs or evade antivirus software, the following information can be enlightening:

· How and when a file is detected as malicious

· How file parsers, de-compressors, and EXE unpackers are launched

· When generic routines are launched against a single sample

· When samples are selected to be executed under the internal sandbox if the antivirus has one

When analyzing scanners, you can determine the different types of signatures used and how they are applied to the file or buffer.

Other scanner types may fall into the generic routines category. Generic routines are plug-ins created to detect (and probably disinfect) a specific file, directory, registry key, and so on. For example, such a plug-in might be a routine to detect some variant of the popular Sality file infector, get the data required for disinfection, and, if available, put this information in internal structures so other plug-ins (such as disinfection routines) can use it.

From a reverse-engineering viewpoint, especially when talking about vulnerability research, generic routines are very interesting as they are typically a very good source of security bugs. The code handling complex viruses is error prone, and after a wave of infections, the routine may be untouched for years because the malware is considered almost dead or eradicated. Therefore, bugs in the code of such routines can remain hidden for a long time. It is not uncommon to discover security bugs (that lead to exploitation) in the generic routines that are used to detect viruses from the 29A team, MS-DOS, and the very first versions of Microsoft Windows.

Security Implications of Code Duplication

While generic routines and their corresponding generic disinfections may seem like a basic feature, some antivirus kernels do not offer any methods for plug-ins to communicate. Because of this design weakness, antivirus kernels that do not offer this intercommunication duplicate the code from the generic routines used to detect a file infector in another plug-in that is used to disinfect it. A bug in the handling of a file infector may be fixed in the detection routines but not in the code that was copied to the disinfection routines. This bug remains hidden unless you instruct the antivirus scanner to disinfect files. Bugs in disinfection routines are one of the least researched areas in the antivirus field.

File Format and Protocol Support

Some plug-ins are designed to understand file formats and protocols. These plug-ins increase the capabilities of the antivirus kernel to parse, open, and analyze new file formats (such as compressors or EXE packers) and protocols. Plug-ins designed to understand protocols are more common in gateways and server product lines than in desktop lines, but some antivirus products implement support for understanding the most common protocols (such as HTTP), even in the desktop version.

Such plug-ins can be unpackers for UPX, Armadillo, FSG, PeLite, or ASPack EXE packers; parsers for PDF, OLE2, LNK, SIS, CLASS, DEX, or SWF files; or decompression routines for zlib, gzip, RAR, ACE, XZ, 7z, and so on. The list of plug-ins of this type for antivirus engines is so long that it is the biggest source of bugs in any antivirus core. What are the odds of Adobe not having vulnerabilities in the handling of its own PDF file format in Acrobat Reader? If you take a look at the long list of Common Vulnerabilities and Exposures (CVEs) covering the vulnerabilities discovered in Acrobat Reader during the last few years, you may get an idea of how difficult it is to correctly parse this file format. What are the odds of an antivirus company writing a bug-free plug-in to parse a file format for which the partial documentation published is 1,310 pages long (1,159 pages without the index)?

Naturally, the odds are against the antivirus engineers. The implementation of a PDF engine has already been mentioned, but what about an OLE2 engine to support Microsoft Word, Excel, Visio, and PowerPoint files; an ASF video formats engine; a MachO engine to analyze executables for Mac OS X operating systems; ELF executables support; and a long list of even more complex file formats? The answer is easy: the number of potential bugs in antivirus software due to the number of file formats they must support is extremely high. If you consider the support for protocols, some of them undocumented or vaguely documented (such as the Oracle TNS Protocol or the CIFS protocol), then you can say that without doubt, this is the biggest attack surface of any antivirus product.

Parser and Decoder Plug-ins Are Complex

An antivirus product deals with hostile code. However, when writing parsers or decoders for file formats, antivirus engineers do not always keep this in mind, and many treat the files they are going to handle as well formed. This leads to mistakes when parsing file formats and protocols. Others over-engineer the parser to accommodate as many fringe cases as possible, increasing the complexity of the plug-in and, likely, introducing more bugs in a dense plug-in that tries to handle everything. Security researchers and antivirus engineers should pay special attention to file format decoder and parser plug-ins in antivirus software.

Heuristics

Heuristic engines can be implemented as add-ons (plug-ins) on top of the antivirus core routines that communicate with other plug-in types or use the information gathered previously by them. An example from the open-source antivirus ClamAV is the Heuristics.Encrypted.Zip heuristic engine. This heuristic engine is implemented by simply checking whether the ZIP file under scrutiny is encrypted with a password. This information is normally extracted by a previous plug-in, such as a file format plug-in for ZIP-compressed files that has statically gathered as much information from this file as possible and filled internal antivirus structures with this data. The ZIP engine is launched by a scanner engine that determines in the first analysis steps that the file format of the ZIP file is understood by the kernel. Finally, the heuristic engine uses all of this information to determine whether the buffer or file under analysis is “suspicious” enough to raise an alert, according to the heuristic level specified.
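A check of this kind is easy to approximate. The sketch below is not ClamAV's code; it uses Python's standard zipfile module and relies on bit 0 of each entry's general-purpose flag field, which the ZIP format defines as the "encrypted" marker. The function name is chosen for this example.

```python
import io
import zipfile

ENCRYPTED_FLAG = 0x1  # bit 0 of the general-purpose bit flag


def zip_has_encrypted_member(data: bytes) -> bool:
    """Return True if any entry in the archive is marked as encrypted."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return any(info.flag_bits & ENCRYPTED_FLAG
                   for info in archive.infolist())


# Build a small unencrypted archive in memory and test it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("readme.txt", b"hello")
print(zip_has_encrypted_member(buffer.getvalue()))
```

Note that the check reads only metadata: it never needs the password, which is also why it can say nothing about what the encrypted entries contain.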

Heuristic engines are prone to false positives because they are simply evidence-based. For example, a PDF may look malformed because it contains JavaScript, includes streams that are encoded with multiple encoders (some of which are repeated, for example, where FlateDecode or ASCII85Decode are used twice for the same stream), and contains strings that seem to be encoded in ASCII, hexadecimal, and octal. In this case, heuristic engines would likely consider it an exploit. However, buggy generator software could produce such malformed PDF files, and Adobe Reader would open them without complaint. This is a typical challenge for antivirus developers: detecting malware without causing false positives with goodware that generates highly suspicious files.

There are two types of heuristic engines: static and dynamic. Heuristic engines based on static data do not need to execute (or emulate) the sample to determine whether it looks like malware. Dynamic engines monitor the execution of a program in the host operating system or in a guest operating system, such as a sandbox created by the antivirus developers running on top of an Intel, ARM, or JavaScript emulator. The previous examples discussing PDFs or ZIP files fall into the category of static-based heuristic engines. Later in this chapter, in the “Weights-Based Heuristics” section, the dynamic heuristic engines category is discussed.

This section explained some of the simpler heuristic engines an antivirus can offer. However, antivirus products also offer very complex types of heuristic engines. Those are discussed next.

Bayesian Networks

Bayesian networks, as implemented by antivirus products, comprise a statistical model that represents a set of variables. These variables are typically conditional dependencies, PE header flags, and other heuristic flags, such as whether the file is compressed or packed, whether the entropy of some section is too high, and so on. Bayesian networks are used to represent probabilistic relationships between different malware files. Antivirus engineers exercise the Bayesian networks in their laboratories with both malware files and goodware files and then use the network to implement heuristic detection for malware files based on the training data. Such networks can be used in-house, exclusively for the antivirus companies (the most common case), or implemented in distributed products. Although this is a powerful heuristic method with solid roots in statistical models, it may cause many false positives. Bayesian networks as used by antivirus companies (after being trained) usually work in the following way:

1. Antivirus engineers feed the network a new sample.

2. The sample's heuristic flags are gathered, and the state is saved in internal variables.

3. If the flags gathered are from known malware families or are too similar to previously known malware families, the Bayesian network gives a score accordingly.

4. Using the score given by the Bayesian network, the sample is then considered “likely malware” or “likely goodware.”
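The scoring steps above can be sketched with a naive Bayes classifier, which is the simplest Bayesian network structure (one class node that is the parent of every flag). This is an illustrative assumption, not the model of any particular vendor; the flag names and training samples are invented, and both classes must be trained before scoring.

```python
import math
from collections import Counter

LABELS = ("malware", "goodware")


class NaiveBayesFlags:
    """Bernoulli naive Bayes over sets of heuristic flags."""

    def __init__(self):
        self.class_counts = Counter()
        self.flag_counts = {label: Counter() for label in LABELS}
        self.vocabulary = set()

    def train(self, flags, label):
        self.class_counts[label] += 1
        self.flag_counts[label].update(set(flags))
        self.vocabulary.update(flags)

    def log_score(self, flags, label):
        total = sum(self.class_counts.values())
        n = self.class_counts[label]
        score = math.log(n / total)  # prior
        present = set(flags)
        for flag in self.vocabulary:
            # Laplace smoothing keeps unseen flags from zeroing the product.
            p = (self.flag_counts[label][flag] + 1) / (n + 2)
            score += math.log(p if flag in present else 1 - p)
        return score

    def classify(self, flags):
        return max(LABELS, key=lambda label: self.log_score(flags, label))


model = NaiveBayesFlags()
for sample in ({"packed", "high_entropy"}, {"packed", "high_entropy"}, {"packed"}):
    model.train(sample, "malware")
for sample in ({"signed"}, {"signed"}, {"signed", "packed"}):
    model.train(sample, "goodware")

print(model.classify({"packed", "high_entropy"}))
print(model.classify({"signed", "packed"}))
```

With this toy training set, a packed sample that also carries the flags typical of goodware is scored as goodware, which is the weakness that evasion techniques exploit.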

The problem with such an approach is always the same: what if a true malware file uses the same PE header flags or the gathered heuristic flags (compression, entropy, and so on), or both, as the typical goodware samples? The antivirus will have a false negative (a malware sample wrongly classified as non-malicious). What if a goodware program is protected by some packer or virtualizer and the heuristic flags generated for this file correspond to some malware family? You guessed it: a false positive.

Bypassing Bayesian networks, as well as any kind of heuristic engine implemented in antivirus engines, is typically easy. The rule of thumb for writing malware that slips past heuristic engines is to always make your malware as similar as possible to goodware.

Commonly, Bayesian networks implemented in antivirus engines are used for two purposes:

· Detecting new samples that are likely to be malware

· Gathering new suspicious sample files

Antivirus companies often ask the users to join a company network or to allow the antivirus product to send sample files to the antivirus companies. Bayesian networks are the heuristic engines that classify potentially malicious files as candidates to be sent to antivirus companies for analysis (once the volume of such files becomes high enough or interesting enough).

Bloom Filters

A bloom filter is a probabilistic data structure that antivirus software uses to determine whether an element is a member of a known malware set. A bloom filter determines either that the element is definitely not in the set or that it is probably in the set. If the heuristic flags gathered from another plug-in do not pass the bloom filter, the sample is definitely not in the set, and the antivirus software does not need to send the file or buffer to other, more complex (and likely slower) routines. Only the files that pass through the bloom filter are sent to more complex heuristic engines.

The following is a hypothetical bloom filter and is useful only for explanation purposes. This is a filter for a database of MD5 hashes. Say that in your database, you have samples containing the following hashes:

99754106633f94d350db34d548d6091a

9fe934c7a727864763bff7eddba8bd49

e6e5fd26daa9bca985675f67015fd882

e87cdcaeed6aa12fb52ed552de99d1aa

If the MD5 hash of the new sample or buffer under analysis does not start with either 9 or E, you can conclude that the file is definitely not in the set of files you want to submit to slower routines. However, if the hash of the new sample starts with either 9 or E, the sample “might be” in the set, but you would need to perform more complex queries to check whether it is a member of the sample set. The previous example was hypothetical only and was meant to show how a bloom filter works. There are much better approaches for determining whether a hash is in a known database of fixed-size strings.
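For comparison with the first-character example, here is a minimal sketch of an actual bloom filter: a bit array plus several hash functions. The bit-array size, the number of hash functions, and the use of salted SHA-256 to derive bit positions are arbitrary choices for this illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for position in self._positions(item):
            self.bits |= 1 << position

    def might_contain(self, item: bytes) -> bool:
        """False means definitely absent; True means only 'probably present'."""
        return all((self.bits >> position) & 1
                   for position in self._positions(item))


known = [
    b"99754106633f94d350db34d548d6091a",
    b"9fe934c7a727864763bff7eddba8bd49",
    b"e6e5fd26daa9bca985675f67015fd882",
    b"e87cdcaeed6aa12fb52ed552de99d1aa",
]
bloom = BloomFilter()
for md5_hex in known:
    bloom.add(md5_hex)

print(all(bloom.might_contain(h) for h in known))
```

A hash that sets all of its bits may still be a false positive, so a positive answer must be confirmed by an exact lookup; a negative answer needs no confirmation.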

Almost all antivirus products implement some sort of heuristic engines based on hashes (either cryptographic or fuzzy hashes) using bloom filters. In general, bloom filters are exclusively used to determine whether a sample should be researched in more depth or just discarded from an analysis routine.

Weights-Based Heuristics

Weights-based heuristics appear in various antivirus engines. After a plug-in gathers information about a sample file or a buffer, internal heuristic flags are filled accordingly. Then, depending on each flag, a weight is assigned. For example, say that a sample is run under the antivirus emulator or in a sandbox, and the behavior of this sample (when running under the emulator or sandbox) is recorded. Weight-based heuristic engines assign different weights to different actions (the values can be negative or positive). After all the actions performed by the sample being analyzed have been weighted, the heuristic engine determines whether it looks like malware. Consider an example where an AV has recorded the following activity of a hypothetical malware:

1. The malware reads a plain text file in the directory where it is being executed.

2. It opens a window and then shows the user a dialog box for confirming or cancelling the process.

3. It downloads an executable file from an unknown domain.

4. It copies the executable file to %SystemDir%.

5. It executes the downloaded file.

6. Finally, it tries to remove itself by running a helper batch file that tries to terminate the malware process and then clean it from disk.

A weight-based heuristic engine assigns negative values to the first two actions (as they are likely benign actions) but assigns positive values to the subsequent actions (as they look like the typical actions of a malware dropper). After a weight is applied to each action, the final score of the sample's behavior is calculated, and, depending on the threshold specified by the user (antivirus researcher), the sample is judged as either probably malware or definitely not malware.
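The scoring just described reduces to a weighted sum compared against a threshold. In the sketch below, the action names, the individual weights, and the threshold are all invented for illustration; only the signs follow the description above.

```python
# Hypothetical weights: negative for benign-looking actions,
# positive for actions typical of a dropper.
WEIGHTS = {
    "read_local_text_file": -1,
    "show_confirmation_dialog": -1,
    "download_exe_from_unknown_domain": 3,
    "copy_exe_to_system_dir": 3,
    "execute_downloaded_file": 2,
    "self_delete_via_batch_file": 3,
}


def behavior_score(actions, weights=WEIGHTS) -> int:
    """Sum the weights of all recorded actions; unknown actions count as 0."""
    return sum(weights.get(action, 0) for action in actions)


def verdict(actions, threshold: int = 5) -> str:
    if behavior_score(actions) >= threshold:
        return "probably malware"
    return "not malware"


recorded = list(WEIGHTS)  # the six actions, in the order they were observed
print(behavior_score(recorded), verdict(recorded))
print(verdict(["read_local_text_file", "show_confirmation_dialog"]))
```

Because benign actions carry negative weights, a sample can lower its score by padding its behavior with harmless-looking operations, so the threshold is a trade-off between false positives and missed detections.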

Some Advanced Plug-ins

Antivirus products use many different kinds of plug-ins in addition to the types discussed previously in this chapter. This section looks at some of the most common advanced plug-ins used in antivirus products.

Memory Scanners

A scanner is the most common type of plug-in that antivirus products use. One example of an advanced scanner usually found in antivirus products is a memory scanner. Such a scanner type offers the ability to read the memory of the processes being executed and apply signatures, generic detections, and so on to buffers extracted from memory. Almost all antivirus engines offer memory analysis tools in some form.

There are two types of memory scanners: userland and kernel-land memory-based scanners. Userland scanners perform queries over memory blocks of userland programs, and kernel-land scanners perform queries over kernel drivers, threads, and so on. Both types are really slow and are often used only after some specific event, such as when the heuristics detect a potential problem. Often, users can employ the AV interface to initiate a complete memory scan. Userland-based memory scanning techniques can be implemented by using the operating system APIs (such as OpenProcess and ReadProcessMemory in Windows-based operating systems) or by kernel drivers created by antivirus developers.

Using the operating system APIs is not always ideal, because they can be intrusive, and malware writers have developed evasion techniques to work around them. For example, some malware samples are written to perform preventive actions when a memory read from an external process occurs. The malware might choose to terminate itself, remove some files, or act to prevent detection in some way. A goodware program with built-in protection may misinterpret such a scan as an analysis attempt and refuse to continue working. This is why antivirus programmers do not like this approach and prefer to implement kernel drivers to read memory from foreign processes. Unless the malware is communicating with another kernel component (a rootkit), it has no way to know whether the memory of its process is being read. To read kernel memory, AV companies have to write a kernel driver. Some antivirus companies develop a kernel driver that allows reading of both user and kernel memory, implements a communication layer through which userland processes retrieve this information, and then passes the read buffers to analysis routines.

Implementing these features without proper security checks is a good source of bugs. What if the kernel driver does not verify which application is calling the exported I/O Control Codes (IOCTLs) used to read the kernel memory? This can lead to serious security issues where any user-mode application that knows about this communication layer and the proper IOCTLs can read kernel memory. The problem becomes even more severe if the developers of this kernel driver also provided a mechanism (via additional IOCTLs) to write to kernel memory!
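The missing check can be illustrated with a toy model. The following Python class only mimics the dispatch logic of a driver. It is not driver code, and the IOCTL value, the PID allowlist, and all names are hypothetical. A real driver would verify the caller by other means, but the point is the same: the request must be rejected before any kernel memory is returned.

```python
class ToyDriver:
    """Toy model of a driver's IOCTL dispatch routine (not real driver code)."""

    IOCTL_READ_KERNEL = 0x1  # hypothetical control code

    def __init__(self, kernel_memory: bytes, trusted_pids):
        self._memory = kernel_memory
        self._trusted_pids = set(trusted_pids)

    def device_control(self, caller_pid: int, ioctl: int, offset: int, length: int) -> bytes:
        # The check the text warns about: without it, any process could read.
        if caller_pid not in self._trusted_pids:
            raise PermissionError("caller is not the antivirus service")
        if ioctl != self.IOCTL_READ_KERNEL:
            raise ValueError("unsupported IOCTL")
        # Validate the requested range as well as the caller.
        if offset < 0 or length < 0 or offset + length > len(self._memory):
            raise ValueError("read outside the allowed range")
        return self._memory[offset:offset + length]
```

Removing the first check reproduces the vulnerability described above: any caller that knows the control code can read the protected buffer.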

Loaded Modules Analysis Versus Memory Analysis

Some antivirus products, which are not listed here, claim to support memory analysis, but that is not accurate. Such products do not really perform memory analysis but, rather, query the list of processes being executed and analyze the modules loaded in each one using the files as they are on disk. Memory analysis techniques can be intrusive and must be used with great caution, because applications that use anti-debugging, anti-attaching, and other anti-reverse-engineering techniques can detect them and stop working properly. Legitimate software includes such protections, in part, to protect its intellectual property. Antivirus companies try to be as unobtrusive as possible. Some companies simply do not bother trying to read the memory of a process because of the implications of interfering with legitimate software. Their approach is that it is sufficient to read the bytes of the modules on disk.

Non-native Code

Antivirus kernels are almost always written in C or C++ languages for performance reasons. However, the plug-ins can be written in higher-level languages. Some antivirus products offer support for .NET or for specific virtual machines to create plug-ins (such as generic detections, disinfections, or heuristics). An antivirus company may decide to take this route for the following reasons:

· Complexity—It could be easier to write a detection, disinfection, or heuristic engine with a higher-level programming language.

· Security—If the language chosen is executed under a virtual machine, bugs in the code parsing a complex file format or disinfecting a file infector would affect not the entire product but only the processes running under the virtual machine, emulator, or interpreter they selected.

· Ability to debug—If a generic detection, disinfection, or heuristic engine is written in a specific language and a wrapper for the API offered by the antivirus is available, antivirus developers can debug their code with the tools available for the language they decided to use.

When the decision to use non-native code is driven by security, the first and third benefits (ease of development and the ability to debug) are sometimes lost. For example, some antivirus products may create different types of virtual machines to run their parsers and generic routines under the “matrix” (in a sandbox-like environment) instead of running directly as native code. That approach means that when a vulnerability is discovered in the code, such as a buffer overflow, it does not directly affect the entire scanner (such as the resident program, usually running as root or SYSTEM). This forces an exploit developer to research the virtual machine as well, in order to find escapes (requiring the use of two or more exploits instead of a single one). On the other hand, some antivirus products (at least during the first versions of their new virtual machines) create a full instruction set and offer an API but no way to debug code with a debugger, which causes problems for antivirus engineers.

If you mention GVM (Guest Virtual Machine) to some developers from the old days of Symantec, they will tell you horror stories about it. In the past, the GVM was a virtual machine that did not allow the debugging of code with a debugger. This forced developers to invent their own debugging techniques to determine why their code was not working. Even worse for some virtual machines, the detections were written directly in assembly, because there was no translator or compiler that generated code as supported by the virtual machine. If you combine this annoying inability to debug with familiar tools (such as OllyDbg, GDB, and IDA), you will get an idea of how little developers in the anti-malware industry appreciate virtual machines.

Lua and .NET are among the most common non-native languages being used in antivirus products. Some companies write .NET bytecode translators for a format supported by their virtual machines; others directly embed an entire .NET virtual machine inside their antivirus software. Still others use Lua as their embedded high-level language because it is lightweight and fast, it has good support for string handling, and its rather permissive license allows its use in commercial, closed-source products, which describes nearly the entire antivirus industry.

While it is a nightmare for antivirus programmers to debug their code if there is no way to use the typical debugging tools, it is easier to write code in .NET languages, such as C#, than in C or C++. Another point is that a bug in the code has less worrisome security implications in managed languages than in unmanaged languages: if the code is running inside a virtual machine, an exploit writer needs to chain at least one more bug to get out of the virtual machine, making the antivirus product considerably more complex to exploit. Also, security vulnerabilities are much less likely in managed languages than in C or C++.

From a reverse-engineering viewpoint, however, if the targeted antivirus product uses a virtual machine of some sort, it can be a true nightmare. Say that the antivirus “ACME AV” implemented a virtual machine of its own, and most of its generic detections, disinfections, and heuristic routines are written for this virtual machine. If the VM is a non-standard one, the unfortunate analyst will need to go through the following steps:

1. Discover that code is written for a virtual machine. Naturally, when a reverse-engineer starts his or her work on a new target, this information is not available.

2. Discover the whole instruction set supported by the virtual machine.

3. Write a disassembler, usually an IDA processor module plug-in, for the whole new instruction set.

4. Discover where the plug-ins' routine bytes are located (in the plug-in files or in memory), and dump or extract them.

5. Start the analysis of the plug-ins implemented for the specific virtual machine in IDA or with the custom disassembler that he or she developed in step 3.
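To give an idea of what step 3 involves, here is a minimal sketch of a table-driven disassembler for a made-up bytecode. The instruction set (PUSH, ADD, JMP, HALT), its opcode values, and its encoding are invented for the example; a real custom virtual machine would have to be reverse-engineered first, and a real IDA processor module is considerably more involved.

```python
# Hypothetical instruction set: opcode -> (mnemonic, number of operand bytes).
OPCODES = {
    0x01: ("PUSH", 1),
    0x02: ("ADD", 0),
    0x03: ("JMP", 1),
    0xFF: ("HALT", 0),
}


def disassemble(code: bytes):
    """Linear-sweep disassembly of the toy bytecode into text lines."""
    lines = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        if opcode not in OPCODES:
            # Unknown byte: emit it as data and keep going.
            lines.append(f"{pc:04x}: DB 0x{opcode:02x}")
            pc += 1
            continue
        mnemonic, operand_len = OPCODES[opcode]
        operands = code[pc + 1:pc + 1 + operand_len]
        if len(operands) < operand_len:
            lines.append(f"{pc:04x}: {mnemonic} <truncated>")
            break
        text = " ".join(f"0x{b:02x}" for b in operands)
        lines.append(f"{pc:04x}: {mnemonic} {text}".rstrip())
        pc += 1 + operand_len
    return lines
```

For example, the bytes 01 05 01 07 02 FF disassemble to PUSH 0x05, PUSH 0x07, ADD, and HALT. Every entry in the opcode table stands for analysis work the reverse-engineer must do by hand in step 2.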

It can be even worse. Although this is not necessarily the case in antivirus products, it does occur in software protection tools such as Themida or VMProtect: if the processor virtual machine is randomly generated and completely different for each build or update, the code becomes far more difficult to analyze. Every time a new version of the virtual machine is released, the disassembler, any emulator, and any other tools the reverse-engineer wrote for the previous instruction set must be updated or rewritten from scratch. But there are even more problems for security researchers: if the developers of the product cannot debug the code with their tools, the analyst is also unable to do so. Thus, the analyst needs to write an emulator or a debugger (or both) for it.

Researching these plug-ins is typically very complex. However, if the selected virtual machine is well known, such as the .NET virtual machine, the researcher may be lucky enough to discover complete .NET libraries or executables hidden somewhere in the database files and can then use a publicly available decompiler such as the open-source ILSpy or the commercial .NET Reflector. This makes the analyst's life easier, because he or she can read high-level code (with variable and function names!) instead of the always less friendly assembly code.

Scripting Languages

Antivirus products may use scripting languages, such as the aforementioned Lua or even JavaScript, to execute generic detections, disinfections, heuristic engines, and so on. As in the previous case, the reasons for implementing the aforementioned features using scripting languages are exactly the same: security, debugging, and development complexity. Naturally, there are also business-level reasons for using scripting languages: it is easier to find good high-level programmers than it is to find good software developers in languages such as C or C++. Thus, a new antivirus engineer joining an antivirus firm does not really need to know how to program in C or C++ or even assembly, because that person writes plug-ins in Lua, JavaScript, or some other scripting language supported by the antivirus core. That means a programmer needs to learn only the APIs that the core exports in order to write script plug-ins.

As with the previous case, there are two different viewpoints regarding plug-ins implemented in antivirus products with scripting languages: that of the antivirus developers and that of the researchers. For antivirus companies, it is easier to write code in high-level languages because they are more secure, and it is usually easier to find developers of high-level languages. For reverse-engineers, in contrast with what usually happens with virtual machines, if the antivirus product directly executes scripts, the researcher simply needs to find where the scripts are, dump them, and start the analysis with actual source code. If the scripts are compiled to some sort of bytecode, the researcher might be lucky enough to discover that the virtual machine is the standard one offered by the embedded scripting language, such as Lua, and find an already written decompiler such as (continuing the Lua example) the open-source unluac. The researcher may be required to make some small modifications to the code of the decompiler in order to correctly get back the source code of the script, but this is usually a matter of only a few hours' work.

Emulators

Emulators are one of the key parts of an antivirus product. They are used for many tasks such as analyzing the behavior of a suspicious sample, unpacking samples compressed or encrypted with unknown algorithms, analyzing shellcode embedded in file formats, and so on. Most antivirus engines, with the notable exception of ClamAV, implement at least one emulator: an Intel x86 emulator. The emulator is typically used to emulate PE files (with the help of another loader module, which is sometimes baked into the emulator's code), boot sectors, and shellcode. Some antivirus products also use it to emulate ELF files. There is no known emulator that does the same for Mach-O files.

The Intel x86 emulator is not the only one that antivirus kernels use; some emulators are used for ARM, x86_64, .NET bytecode, and even JavaScript or ActionScript. The emulators by themselves are not that useful and tend to be limited if the malware issues many system or API calls. This stems from the fact that the emulators set a limit to the number of API calls that are emulated before they halt the emulation. Supporting the instruction set—the architecture—is halfway to emulating a binary; the other half is properly emulating the API calls. The other responsibility of an emulator is to support either the APIs or the system calls that are offered by the actual operating system or environment it is mimicking. Usually, some Windows libraries, such as ntdll.dll or kernel32.dll, are “supported,” in the sense that most of the typical calls are somehow implemented by the antivirus. Very often, the implemented functions do not really do anything but return codes that are considered as successful return values. The same applies to emulators of userland programs instead of entire operating systems: the APIs offered by the product (such as Internet Explorer or Acrobat Reader) are mimicked so the code being executed under the “matrix” does not fail and performs its actions. Then the behavior, whether bad or good, can be recorded and analyzed.
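The two ideas in this paragraph, stubbed API calls that simply report success and a cap on the number of emulated calls, can be sketched as follows. This is a toy model: it replays a list of API calls instead of executing instructions, and the class name, the limit, and the generic return value of 1 are assumptions made for the illustration.

```python
class ApiLimitReached(Exception):
    """Raised when the emulation budget for API calls is exhausted."""


class ToyApiEmulator:
    def __init__(self, max_api_calls=1000):
        self.max_api_calls = max_api_calls
        self.call_log = []  # recorded behavior, available for later analysis

    def call_api(self, name, *args):
        """Record the call and return a generic 'success' value.

        Like many emulator stubs, this does no real work.
        """
        if len(self.call_log) >= self.max_api_calls:
            raise ApiLimitReached(name)
        self.call_log.append((name, args))
        return 1

    def run(self, api_trace):
        """Replay (name, args) pairs; return False if the limit halted emulation."""
        try:
            for name, args in api_trace:
                self.call_api(name, *args)
        except ApiLimitReached:
            return False
        return True
```

A sample that issues more junk calls than the limit allows before doing anything suspicious never has its interesting behavior recorded, which is exactly the limitation described above.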

The emulators are updated frequently because malware authors and commercial software protection developers discover and implement new anti-emulation techniques almost daily. When the antivirus engineers discover that some instruction or API is being used in new malware or a new protector, the emulator is updated to support it. The malware authors and software protection developers then discover more. This is the old cat-and-mouse game where the antivirus industry is naturally always behind. The reason is simple: supporting an entire recent CPU architecture is a gigantic task. Supporting not only an entire CPU but also an entire set of operating system APIs in an engine that runs in a desktop solution, without causing enormous performance losses, is simply an impossible task. Instead, the antivirus companies try to strike a balance: they support enough instructions and APIs to emulate as much malware as possible without implementing entire instruction sets or API sets. Then they wait until a new anti-emulation technique appears in some new malware, packer, or protector.

Summary

This chapter covered antivirus plug-ins—how they are loaded, types of plug-ins, and the functionality and features they provide.

In summary, the following topics were discussed:

· Antivirus plug-ins are not a vital part of the core of the AV. They are loaded by the AV on demand.

· There is not a single method that is used by AVs to load plug-ins. Some AVs rely on simple operating system APIs to load plug-ins; other AVs use a custom plug-in decryption and loading mechanism.

· The plug-in loading mechanism dictates how hard the reverse-engineer has to work to understand its functionality.

· There is a simple set of steps a reverse-engineer can follow when trying to understand the plug-in functionality.

· There are various types of plug-ins, ranging from simple ones to more complex ones. Examples of relatively simple plug-ins include scanners and generic detection routines, file format parsers, protocol parsers, decompressors for executable and archive files, heuristic engines, and so on.

· Heuristic engines work by looking at anomalies in the input files. These engines may be based on simple logic or more complex logic, such as those based on statistical modeling (Bayesian networks) or weight-based heuristics.

· There are two types of heuristic engines: static and dynamic. Static engines look into the files statically without running or emulating them. For example, PE files that have unusual fields in their headers or PDF files that have streams that are encoded multiple times using different encoders can trigger the detection. The dynamic heuristic engines try to deduce malicious activity based on the behavior of the emulated or executing code.

· File format or protocol parsers for complex or undocumented formats are usually an interesting source of security bugs.

· Some advanced plug-ins include memory scanners, plug-ins written using interpreted languages and run within a virtual machine, and emulators.

· Memory scanner plug-ins may scan the memory from userland or kernel-land. Userland memory scanners tend to be intrusive and may interfere with the execution of the program. Kernel-mode scanners are less intrusive but can expose security bugs if they are not properly implemented.

· Plug-ins written using scripting languages not only are easier to write and maintain but also offer an extra layer of protection because they run through an interpreter. Reverse-engineering such plug-ins can be very challenging, especially if the language is interpreted using a custom-built virtual machine.

· Emulators are key parts of an antivirus. Writing a foolproof and decent emulator for various architectures is not an easy task. Nonetheless, they can still help in unpacking compressed or encrypted executables and analyzing shellcode embedded in documents.

The next chapter covers antivirus signatures, how they work, and how they can be circumvented.