Researching Domains and IP Addresses - Malware Analyst’s Cookbook and DVD: Tools and Techniques for Fighting Malicious Code (2011)

Chapter 5. Researching Domains and IP Addresses

To fully investigate malware, it is essential that you know the ins and outs of researching domains and IP addresses. Conducting these investigations is a requirement for anyone who works in the information security field and deals with malware. The domains and IP addresses that malware uses can tell you a lot about the origin of an attack and how miscreants conduct their operations. This chapter provides you with the investigative techniques and tools to put IP addresses and domains under the microscope.

Before you read this chapter, note that some of the information that we present has been sanitized to protect the innocent. However, other information (such as data that appears in screenshots or that is readily available on other websites) is not sanitized. Do not try to visit or contact sites that we use as examples in this chapter. Also, the registrars and ISPs mentioned in this chapter are not necessarily malicious and are simply included as they were discovered in the course of our investigations. Finally, we use the terms domain and hostname interchangeably. A domain is, for example, malwarecookbook.com, while a hostname is ftp.malwarecookbook.com (otherwise known as a fully qualified domain name or FQDN).

Researching Suspicious Domains

The vast majority of malware makes use of the domain name system (DNS) for address resolution. DNS is what keeps us from having to remember IP addresses. Domains have DNS servers that tell you where to find resources on the Internet, much like a phone book. When you want to visit www.malwarecookbook.com, you type exactly that into your browser. In a split second, your computer finds out that the IP address for the website is 75.127.96.232. Without DNS, you would have to type the IP address for every website to which you connect. This, of course, would not work very well.

The miscreants behind malware, however, like using domain names for other reasons: resilience and sustainability. A good thing about DNS is that you can easily and quickly update it. However, miscreants know this and use it to their advantage. They register their own domain, such as baddomain.com, and point it to the IP address of a server that they control. Should the server they are using be taken down, they can quickly move the malware to a new server by simply updating a DNS entry.

The techniques described in this chapter can be applied to researching any domain name; however, they are especially useful when it comes to investigating suspicious domains. Here are a few heuristic techniques you can use to determine if a domain is suspicious:

· The domain is strikingly similar to a real domain (for example rnalwarecookbook.com instead of malwarecookbook.com).

· The domain consists entirely of random letters and/or numbers. This could indicate that a Domain Generation Algorithm (DGA) created the domain name (see Recipe 12-11).

· The domain was registered or updated just a few hours or days before the time you discovered it. Most legitimate businesses do not frequently update their domain’s registration information or DNS records.

· The domain expires within a few weeks or months. Most legitimate companies with the expectation of staying in business will renew their domains long before the expiration date approaches.

· The registrant’s information is unavailable or filled with garbage.

· Search engine results for the domain name return several websites indicating it’s associated with exploits or malware.

· The domain exists on RBLs or has been reported by automated scanning engines as hosting malicious content (see Recipe 5-10).

· The domain is exhibiting fast flux characteristics (see Recipe 5-11).
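The date-based heuristics in the list above are easy to automate once you have a domain's creation, update, and expiration dates. The following Python sketch is one possible way to do it. The function name and the thresholds (7 days for "recent" and 90 days for "expires soon") are assumptions chosen for illustration; tune them to your own environment.

```python
from datetime import datetime, timedelta


def registration_flags(created, updated, expires, now,
                       recent=timedelta(days=7),
                       expiring=timedelta(days=90)):
    """Return warnings based on a domain's WHOIS dates.

    The default thresholds are illustrative assumptions, not fixed rules.
    """
    flags = []
    if now - created <= recent:
        flags.append("recently registered")
    if now - updated <= recent:
        flags.append("recently updated")
    if expires - now <= expiring:
        flags.append("expires soon")
    return flags


# Hypothetical domain registered two days before it was discovered.
print(registration_flags(created=datetime(2010, 3, 1),
                         updated=datetime(2010, 3, 2),
                         expires=datetime(2011, 3, 1),
                         now=datetime(2010, 3, 3)))
```

A non-empty result is only a hint that the domain deserves a closer look; none of these flags proves that a domain is malicious.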

Recipe 5-1: Researching Domains with WHOIS

One of the first actions you should take when researching a domain is to obtain its WHOIS (pronounced who is) information. WHOIS information normally includes contact details for the domain’s registrant and the person(s) responsible for administrative, technical, and/or billing issues. These details may include a name, organization, address, phone number, and e-mail address. In some cases, the data is accurate for all of the contacts. In other cases, the data is blank or filled with false information. WHOIS queries also return the domain’s DNS servers, the domain’s creation date, and the domain’s expiration date—all of which can help you triage contact information and determine if it’s legitimate or not.

WHOIS on Linux and Mac OS X

The whois utility is resident on most Unix-based platforms. On Linux and Mac OS X, the file is usually located at /usr/bin/whois. If it is not present on your Ubuntu machine, you can install it by typing apt-get install whois. In the following example, assume you uploaded a malware sample to one of the sandboxes in Chapter 4. In the network traffic results, you saw that the malware communicated with www.my-traff.net. You’ll now want to do a WHOIS query to find out more about this domain. Note that the malware used www.my-traff.net, but when doing WHOIS queries you can only look up the domain and not anything else preceding it, such as www or ftp.

$ whois my-traff.net

[Querying whois.verisign-grs.com]

[whois.verisign-grs.com]

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered

with many different competing registrars. Go to

http://www.internic.net for detailed information.

Domain Name: MY-TRAFF.NET

Registrar: NAMEBAY

Whois Server: whois.namebay.com

Referral URL: http://www.namebay.com

Name Server: NS1.INSORG.NET

Name Server: NS2.INSORG.NET

Status: ok

Updated Date: 29-jun-2009

Creation Date: 15-jul-2006

Expiration Date: 15-jul-2010

>>> Last update of whois database: Wed, 03 Mar 2010 06:37:00 UTC <<<

The output shows the domain was registered through a company called Namebay (the registrar) on July 15, 2006. The domain was updated on June 29, 2009 and expires on July 15, 2010. However, you do not have the details on the registrant or the technical, administrative, or billing contacts for the domain. This is because the whois command used whois.verisign-grs.com by default, but Namebay actually stores the contact information in its own WHOIS server (whois.namebay.com).

To query a specific WHOIS server directly, you can use the host parameter (-h HOST, --host=HOST) to whois. The following command shows an example:

$ whois -h whois.namebay.com my-traff.net

[Querying whois.namebay.com]

[whois.namebay.com]

<a href='http://www.namebay.com'>NAMEBAY</a>

Domain Name : MY-TRAFF.NET

Created On : 2006-07-15

Expiration Date : 2010-07-15

Status : ACTIVE

Registrant Name : INSORG

Registrant Street1 : 63,Palatin prospekt

Registrant City : Moscow

Registrant State/Province :

Registrant Postal Code : 117917

Registrant Country : RU

Admin Name : INSORG

Admin Street1 : 63,Palatin prospekt

Admin City : Moscow

Admin State/Province : RU

Admin Postal Code : 117917

Admin Country : RU

Admin Phone : +7.2941258032

Admin Email : igor@pipen.net

Tech Name : INSORG

Tech Street1 : 63,Palatin prospekt

Tech City : Moscow

Tech State/Province : RU

Tech Postal Code : 117917

Tech Country : RU

Tech Phone : +7.2941258032

Tech Email : igor@pipen.net

Billing Name : INSORG

Billing Street1 : 63,Palatin prospekt

Billing City : Moscow

Billing State/Province : RU

Billing Postal Code : 117917

Billing Country : RU

Billing Phone : +7.2941258032

Billing Email : igor@pipen.net

Name Server : NS1.INSORG.NET

Name Server : NS2.INSORG.NET

Registrar Name : Namebay

You now have a lot more information to work with. In this case, the WHOIS record indicates that the domain is registered to someone in Moscow, Russia with the e-mail address igor@pipen.net. The registrant’s name is listed as “INSORG,” which has no obvious meaning, but notice that both name servers are part of INSORG.NET. There is no way to tell right off the bat whether this information is real or fake. It is possible, for example, that the miscreants used a stolen credit card to purchase the domain and then put the victim’s information into the WHOIS database.
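The two-step lookup in this recipe (ask the registry first, then the registrar's own WHOIS server) can also be scripted. The Python sketch below speaks the plain WHOIS protocol (RFC 3912: send the query followed by CRLF to TCP port 43 and read until the server closes the connection). The function names are hypothetical, and the parsing assumes the registry reply contains a "Whois Server:" or "Registrar WHOIS Server:" line like the output shown earlier; other registries format their replies differently.

```python
import re
import socket


def whois_query(server, query, port=43, timeout=10):
    """Send one WHOIS query and return the server's reply as text."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def referral_server(registry_reply):
    """Return the registrar WHOIS server named in a registry reply, or None."""
    match = re.search(r"^\s*(?:Registrar\s+)?Whois Server:\s*(\S+)",
                      registry_reply, re.IGNORECASE | re.MULTILINE)
    return match.group(1).lower() if match else None


def whois_with_referral(domain, registry="whois.verisign-grs.com"):
    """Query the registry, then follow its referral to the registrar if any."""
    first = whois_query(registry, domain)
    server = referral_server(first)
    if server and server != registry:
        return whois_query(server, domain)
    return first
```

Calling whois_with_referral("my-traff.net") would contact both servers over the network, so the same operational cautions apply as when running the whois command by hand.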

Cygwin on Windows

Cygwin1 is free software that provides a Linux-like environment for Microsoft Windows users. To get started, download the Cygwin installer file. When you reach the package selection screen, type whois into the search box. If you see the word Skip to the left of the package name, as shown in Figure 5-1, the package will not be installed. If this is the case, click the word Skip to change the settings so it is set to install. The installation window should now display the version number of the GNU Whois package instead of the word Skip.

Figure 5-1: Installing the whois package in Cygwin

Once the installation has completed, you can launch the Cygwin shell from your Start menu and execute commands as if you were logged into a Linux machine. Figure 5-2 shows the result of a WHOIS query performed with the whois command from the Cygwin shell.

Figure 5-2: Querying WHOIS on Windows via Cygwin

WHOIS with Sysinternals on Windows

If you do not want all the functionality and additional packages that Cygwin provides, you can use the Sysinternals WHOIS utility2 by Mark Russinovich. Place the whois.exe binary in your command shell’s PATH (such as the system32 directory) and then invoke it in the following manner:

C:\>whois my-traff.net

Whois v1.01 - Domain information lookup utility

Sysinternals - www.sysinternals.com

Copyright (C) 2005 Mark Russinovich

Connecting to NET.whois-servers.net

Connecting to whois.namebay.com...

<a href='http://www.namebay.com'>NAMEBAY</a>

Domain Name : MY-TRAFF.NET

Created On : 2006-07-15

Expiration Date : 2010-07-15

Status : ACTIVE

Registrant Name : INSORG

Registrant Street1 : 63,Palatin prospekt

Registrant City : Moscow

[REMOVED]

The tool takes only two parameters: a domain name and an optional WHOIS server to query. Instead of supplying the -h or --host flag as you would on Linux, you just type the server name after the domain you are querying.

Additional Tools for Windows

Here are some additional tools you can use on Windows to look up WHOIS information:

· Foundstone’s SuperScan3: This tool is primarily for port scanning but has additional features that have the same functionality as ping, traceroute, whois, and other popular networking tools.

· UnxUtils (GNU Utilities for Win32)4: This is a collection of over 50 common GNU utilities that have been ported to run on Windows, including, of course, whois.exe.

Web Tools

Most registrars have Web-based WHOIS database search tools. For example, you can scroll to the bottom of GoDaddy’s website (www.godaddy.com) and select WHOIS Search. In most cases, the search results are not limited to just domains registered through the registrar’s website. As a result, you should be able to pull up the WHOIS information for almost any domain.

Several other websites specialize in providing various DNS tools that include WHOIS database lookup options. Most of these websites function similarly, but may have some slight differences, such as requiring you to fill out a captcha, limiting the TLDs (.com, .net, .org, .uk, and so on), or filtering the search results to obfuscate e-mail addresses. The following is a list of a few websites that you can use to perform WHOIS queries.

· http://www.dnstools.com

· http://swhois.net

· http://www.whois-search.com

· http://www.betterwhois.com

· http://who.is

· http://www.domaintools.com

· http://www.allwhois.com

1 http://www.cygwin.com

2 http://technet.microsoft.com/en-us/sysinternals/bb897435.aspx

3 http://www.foundstone.com/us/resources/proddesc/superscan.htm

4 http://unxutils.sourceforge.net/

Recipe 5-2: Resolving DNS Hostnames

This recipe covers a few ways to determine a hostname’s IP address from the command line on Linux, Windows, and on any platform using a web browser. For your research, you will mostly be interested in getting the A records for a given hostname. A records store IP addresses. Other record types that you’ll likely encounter frequently are name server (NS), mail exchange (MX), and pointer (PTR) records. For more information on these types, see DNS Resource Records5.

There are several ways to quickly obtain a hostname’s IP address with tools that are often already built into the operating systems. On Unix-based systems, you can use the host or dig command. If you are running Ubuntu and it does not have either of these tools, you can install them by typing apt-get install dnsutils. On Windows systems, you can use the nslookup and ping commands. Note that nslookup and ping are also available on Unix-based systems.

The Host Command (Unix only)

The host command is a tool used to perform DNS lookups on Unix-based systems. To obtain an IP address using the host command, type the following:

$ host my-traff.net

my-traff.net has address 85.17.139.54

my-traff.net mail is handled by 10 mail.my-traff.net.

The output shows that the IP address of my-traff.net is 85.17.139.54, which is an A record. By default, the host command returns A, AAAA, and MX records. To show DNS records of all types, use the -t ANY flag.

$ host -t ANY my-traff.net

my-traff.net mail is handled by 10 mail.my-traff.net.

my-traff.net descriptive text "v=spf1 a mx ip4:85.17.139.35 ?all"

my-traff.net has address 85.17.139.54

my-traff.net has SOA record ns1.srv.com. \

root.my-traff.net. 2009010100 \

14400 3600 1209600 86400

my-traff.net name server ns2.srv.com.

my-traff.net name server ns1.srv.com.

The Dig Command (Unix only)

Another useful DNS lookup utility for Unix-based systems is dig. To obtain the IP address using the dig command, do the following from the command line:

$ dig my-traff.net

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.1 <<>> my-traff.net

;; global options: printcmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56019

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:

;my-traff.net. IN A

;; ANSWER SECTION:

my-traff.net. 14400 IN A 85.17.139.54

;; AUTHORITY SECTION:

my-traff.net. 86400 IN NS ns1.insorg.net.

my-traff.net. 86400 IN NS ns2.insorg.net.

Here you can see the IP address 85.17.139.54 was returned as the A record. If you want to return just the IP address of the site and nothing else, you can modify the command by adding the +short query option.

$ dig +short my-traff.net

85.17.139.54
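If you need the same answer inside a script rather than at a shell prompt, the operating system's resolver is available from most languages. The following Python sketch is a rough equivalent of dig +short for A records. The function name is hypothetical, and because it uses the system resolver it generates the same DNS traffic as the command-line tools.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for a hostname."""
    infos = socket.getaddrinfo(hostname, None,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

For example, resolve_ipv4("my-traff.net") would return a list of address strings, and an unresolvable name raises socket.gaierror.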

The nslookup command

nslookup is an administrative tool for testing and troubleshooting DNS servers. The utility takes a hostname as an argument and returns the associated IP address, as shown in the following command:

C:\>nslookup my-traff.net

Server: temp

Address: 192.168.1.1

Non-authoritative answer:

Name: my-traff.net

Address: 85.17.139.54

The Ping Command

The primary purpose of the ping command is to check if a computer is online and reachable. It works by sending a packet of data to the remote computer’s IP address and then waiting for a reply. When you use ping, you can supply either the IP address or the hostname of the remote computer. If you supply the hostname, ping will perform a DNS resolution of the hostname and print the associated IP address in its output. The command below shows an example.

C:\>ping -i 1 my-traff.net

Pinging my-traff.net [85.17.139.54] with 32 bytes of data:

Reply from 192.168.1.1: TTL expired in transit.

Reply from 192.168.1.1: TTL expired in transit.

Reply from 192.168.1.1: TTL expired in transit.

Reply from 192.168.1.1: TTL expired in transit.

Ping statistics for 85.17.139.54:

Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),

Approximate round trip times in milli-seconds:

Minimum = 0ms, Maximum = 0ms, Average = 0ms

You should use ping with caution because it will attempt to contact the remote system, which will reveal your IP address to attackers if they’re watching traffic. A good way to use ping, but avoid sending any traffic to the destination, is to set the packet’s time to live (TTL) value to 1. You will notice that this is what we did by adding the -i 1 option. This ensures that your router will not forward the traffic any further. To set the TTL value to 1 from a Linux system, use -t 1 instead.
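Because the TTL flag differs between Windows (-i) and Linux (-t), a script that wraps ping has to pick the right one. The sketch below only builds the argument list; the function name is hypothetical, and platforms other than Windows and Linux are deliberately rejected because their ping flags differ.

```python
import platform


def ping_command(host, ttl=1, count=4, system=None):
    """Build a ping command line that limits the packet's TTL."""
    system = system or platform.system()
    if system == "Windows":
        # -n sets the number of echo requests, -i sets the TTL.
        return ["ping", "-n", str(count), "-i", str(ttl), host]
    if system == "Linux":
        # -c sets the number of echo requests, -t sets the TTL.
        return ["ping", "-c", str(count), "-t", str(ttl), host]
    raise ValueError("unsupported platform: " + system)


print(ping_command("my-traff.net", system="Linux"))
```

The resulting list can be passed to subprocess.run() to execute the command and capture the resolved address from its output.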

Note When you perform a DNS resolution of a hostname, traffic may be sent to the DNS servers associated with that hostname. If you are doing a DNS lookup of a malicious hostname whose DNS servers are controlled by the miscreants, the servers can potentially see your lookup request. Refer to Chapter 1 for tips and considerations to take into account with respect to remaining anonymous while performing investigations.

Web-Based Tools

The list that follows provides a sample of websites that you can use to resolve a domain’s IP address.

· http://www.dnstools.com

· http://www.hcidata.info/host2ip.htm

· http://dns-tools.domaintools.com

· http://domaintoip.com/ip.php

· http://www.ipaddressreport.com

5 http://www.dns.net/dnsrd/rr.html

Researching IP Addresses

Whether malware uses a domain name or not, it will have to use an IP address in some capacity if the malware plans on contacting other hosts on the Internet. As you learned earlier, malware may find an IP address through DNS. However, many malware authors hard-code IP addresses into their programs, so they don’t need to use DNS at all. In either case, you will want to investigate the IP addresses once you figure out which one(s) the malware contacts.

There is some overlap between the tools used to research domains and the tools that are used to research IP addresses. However, the information that is returned is different. In this section, you will learn how to answer the following questions:

· Where is this IP address geographically located?

· What parties are responsible for an IP address?

· How many other IP addresses are in the same network?

· Does this IP address have a bad reputation?

· What DNS entries point to an IP address?

Recipe 5-3: Obtaining IP WHOIS Records

WHOIS information for an IP address will generally give you the following information:

· IP address range it falls under

· Organization name, along with address and phone number

· Technical contact information (phone number and e-mail)

· Other contacts and comments, such as how to report abusive IP addresses

This should already sound familiar, as this is very similar to the type of information that is returned when doing WHOIS queries on a domain name.

Command-line WHOIS

The whois tool, which we introduced earlier in the chapter, is also capable of conducting queries on IP addresses. The process to look up information on IP addresses is identical to how you look up domain names when using whois. The example that follows demonstrates how to conduct such a query and what the results should look like. This recipe continues to use the IP address 85.17.139.54 that we found during our DNS lookups associated with my-traff.net.

$ whois 85.17.139.54

[Querying whois.ripe.net]

[whois.ripe.net]

% This is the RIPE Database query service.

% The objects are in RPSL format.

%

% The RIPE Database is subject to Terms and Conditions.

% See http://www.ripe.net/db/support/db-terms-conditions.pdf

% Note: This output has been filtered.

% To receive output for a database update, use the "-B" flag.

% Information related to '85.17.139.0 - 85.17.139.255'

inetnum: 85.17.139.0 - 85.17.139.255

netname: LEASEWEB

descr: LeaseWeb

descr: P.O. Box 93054

descr: 1090BB AMSTERDAM

descr: Netherlands

descr: www.leaseweb.com

remarks: Please email abuse@leaseweb.com for complaints

remarks: regarding portscans, DoS attacks and spam.

remarks: INFRA-AW

country: NL

admin-c: LSW1-RIPE

tech-c: LSW1-RIPE

status: ASSIGNED PA

mnt-by: OCOM-MNT

source: RIPE # Filtered

person: RIP Mean

address: P.O. Box 93054

address: 1090BB AMSTERDAM

address: Netherlands

phone: +31 20 3162880

fax-no: +31 20 3162890

abuse-mailbox: abuse@leaseweb.com

nic-hdl: LSW1-RIPE

mnt-by: OCOM-MNT

source: RIPE # Filtered

% Information related to '85.17.0.0/16AS16265'

route: 85.17.0.0/16

descr: LEASEWEB

origin: AS16265

remarks: LeaseWeb

mnt-by: OCOM-MNT

source: RIPE # Filtered

The results from the IP WHOIS query have now provided you with the following information:

· IP address is located at a Netherlands-based web-hosting provider called LeaseWeb.

· The IP address falls into LeaseWeb’s 85.17.0.0/16 range of IP addresses.

· There is an e-mail address where you can send abuse complaints.
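RIPE-style output is a series of "key: value" attributes, so the items in the list above can be pulled out programmatically. The following Python sketch shows one way to do that and to answer the question of how many addresses are in the same network. The function names are hypothetical, and the parser assumes the simple one-attribute-per-line layout shown in the output above.

```python
import ipaddress


def parse_rpsl(text):
    """Collect 'key: value' attributes, skipping blank lines and % comments."""
    attrs = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("%"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            attrs.setdefault(key.strip(), []).append(value.strip())
    return attrs


def range_size(inetnum):
    """Count the addresses in an inetnum value such as '85.17.139.0 - 85.17.139.255'."""
    first, last = (ipaddress.ip_address(part.strip())
                   for part in inetnum.split("-"))
    return int(last) - int(first) + 1
```

With the output above, parse_rpsl() would expose the inetnum, netname, country, and abuse-mailbox values, and ipaddress.ip_network() can report the size of the route entry in the same way.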

You will also notice that the query went to whois.ripe.net, the WHOIS server of RIPE NCC, which is one of the five regional Internet registries (RIRs) and handles queries for Europe. The following section explains this in more detail.

IP WHOIS via the Web

As with domains, you can look up WHOIS information on IP addresses by using a web browser. However, a few of the websites listed in Recipe 5-1 are incapable of doing IP address lookups. Registration information for IP addresses is maintained by the regional Internet registries (RIRs). The Internet Assigned Numbers Authority (IANA) delegates blocks of IP addresses to one of five RIRs, each of which serves a particular geographic region. This means that you can go directly to the website of the appropriate RIR and perform IP address lookups. For example, if you wanted to obtain information on an IP address in Africa, you would need to go to the RIR that covers Africa to perform your lookup. If you need to determine the region or country in which an IP address is located, see Recipe 5-13. Table 5-1 is a list of the various RIRs and the regions they cover. For additional details, see https://www.arin.net/knowledge/rirs.html.

Table 5-1: RIRs and Their Functions

Registry | Geographic Location | Web Address
AfriNIC | Africa, portions of the Indian Ocean | www.afrinic.net/
APNIC | Portions of Asia, portions of Oceania | www.apnic.net/
ARIN | Canada, many Caribbean and North Atlantic islands, and the United States | https://www.arin.net/
LACNIC | Latin America, portions of the Caribbean | www.lacnic.net/en/
RIPE NCC | Europe, the Middle East, Central Asia | www.ripe.net/

Researching with Passive DNS and Other Tools

Passive DNS is an excellent tool for investigating domains and IP addresses. Collecting passive DNS data involves recording authoritative DNS responses that have been sent to a client system. A passive DNS collection system (or “Passive DNS Server” in Figure 5-3) is designed to record this data. It monitors the traffic and records the domain name and IP address for which an answer was returned. The system generally does not record information about the client doing the lookup or queries that did not return an IP address. Figure 5-3 demonstrates how passive DNS works using a charitable (non-malicious) website as an example.

A passive DNS server can be set up anywhere on a network as long as it can see DNS responses. A typical location is transparently in-line with the border gateway or router. Alternatively, you can plug your passive DNS server into a mirror port that can see all traffic on your network. The information that is recorded from passive DNS collection can then be queried to find out what domains exist on an IP address or what IP addresses a given domain has resolved to over time (i.e., forward and reverse queries). As previously mentioned, attackers will frequently change the IP addresses associated with their domains. Therefore, historical records can be very helpful when attempting to investigate malicious activity that happened in the past.
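To make the forward and reverse queries concrete, here is a minimal in-memory sketch of the kind of index a passive DNS collector maintains. It is not how any particular passive DNS product is implemented; the class and method names are hypothetical, and a real collector would parse DNS responses off the wire and persist them in a database.

```python
from collections import defaultdict


class PassiveDNSStore:
    """Toy index of observed (hostname, IP address) answers."""

    def __init__(self):
        self._by_name = defaultdict(dict)  # name -> {ip: (first_seen, last_seen)}
        self._by_ip = defaultdict(set)     # ip -> {names}

    @staticmethod
    def _normalize(name):
        return name.lower().rstrip(".")

    def record(self, name, ip, seen):
        """Store one observed answer; 'seen' is any comparable timestamp."""
        name = self._normalize(name)
        first, last = self._by_name[name].get(ip, (seen, seen))
        self._by_name[name][ip] = (min(first, seen), max(last, seen))
        self._by_ip[ip].add(name)

    def ips_for(self, name):
        """Forward query: every IP a name has resolved to, with first/last seen."""
        return dict(self._by_name.get(self._normalize(name), {}))

    def names_for(self, ip):
        """Reverse query: every name that has resolved to an IP."""
        return sorted(self._by_ip.get(ip, set()))
```

Note that, like the systems described above, this store keeps nothing about the client that performed the lookup.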

Recording passive DNS information in your environment and being able to query it can be very useful when you want to build logical relationships and understand where your traffic is going. Florian Weimer’s website (http://www.enyo.de/fw/software/dnslogger/) can help you learn more about passive DNS and set up your own “DNS replication” service. His website describes passive DNS replication as “a technology which constructs zone replicas without cooperation from zone administrators, based on captured name server responses.”

Figure 5-3: Passive DNS collection system diagram

You can gather information about IP addresses and domains using various other methods besides passive DNS. For example, you could attempt a zone transfer, use an automated script to brute-force subdomains, or query special services offered by Shadowserver and Team Cymru. The recipes in this section cover passive DNS as well as the additional methods.

Recipe 5-4: Querying Passive DNS with BFK

BFK, a Germany-based security company, maintains one of the few (perhaps the only) publicly accessible passive DNS services. The service was formerly run by RUS-CERT and has since been taken over by BFK. To check if the BFK database contains information on a given IP address or domain name, enter your search criteria into the service’s web site.6 In the following example, we perform a query using the IP address that you used in other examples, 85.17.139.54. Figure 5-4 shows the results.

You can see that the IP address associated with my-traff.net also has several other hostnames that resolve to it. If you read Recipe 5-1, you’ll recognize the domain insorg.net, and, consequently, ns1.insorg.net and ns2.insorg.net. These are the name servers revealed by the WHOIS query you performed on the my-traff.net domain. Additionally, you can see the domains drabland.net and bytecode.biz have also resolved to the IP address and may potentially be malicious as a result.

Figure 5-4: Passive DNS results for 85.17.139.54

Note Not all domains associated with a particular suspect IP address are necessarily malicious. Some servers host websites for multiple domains using the same IP address. A malicious domain could easily end up being hosted on a perfectly legitimate shared web-hosting server. Passive DNS results for the IP address in question would return dozens of domains that are not malicious. Do not automatically assume all domains hosted on the same server are malicious.

6 http://www.bfk.de/bfk_dnslogger_en.html

Recipe 5-5: Checking DNS Records with Robtex

The robtex website at www.robtex.com describes itself as a Swiss Army Knife internet tool, which is a rather accurate statement. The site offers many features for researching domains, IP addresses, and networks. One great feature is that robtex saves DNS records associated with IP addresses and makes them available on its website. Thus, robtex provides what is essentially a form of passive DNS. Figure 5-5 shows the robtex search results for 85.17.139.54.

Figure 5-5: The robtex search results

Notice that the first link is at the URL /ip/<ip address>.html. Instead of using the search form, you can just fill in an IP address where it says <ip address> and bring up a page with all the information that robtex has for that IP address. Figure 5-6 shows what robtex returns when you pull up information for 85.17.139.54.

Figure 5-6: Many domains and hosts are associated with 85.17.139.54

The search on robtex returns much of the same information that you learned from the BFK passive DNS query in Recipe 5-4. It also provides some information that you would see in an IP WHOIS query. Additionally, the website may have information about the IP address being on various blacklists, which can speak to the reputation of the IP address. This is covered later in Recipe 5-10.

Recipe 5-6: Performing a Reverse IP Search with DomainTools

The DomainTools website7 has a useful feature called Reverse IP. This feature allows you to enter in an IP address and see all of the domains that are hosted on it. The only downside is that it is not completely free. If you search an IP address, DomainTools will only return the first three results it finds for free. If there are more than three results and you want to see them, you must buy a membership or pay a one-time fee. The main benefit to using DomainTools is that it should have a full listing of all domains hosted on a particular IP address. In other words, the results are not limited to IP addresses and domains captured by passive DNS services.

While DomainTools does not show you the full list of domains if there are more than three, it does tell you the total number of results it has for your query. Figure 5-7 shows an example reverse IP lookup on 85.17.139.54.

Figure 5-7: Reverse IP search using DomainTools

Here you can see that DomainTools gave three results but is hiding a fourth result. From the earlier research, you can already deduce that the fourth domain is my-traff.net. However, if you did not know that already, you could use the Reverse IP feature to figure it out.

The DomainTools website also has other features that are useful for investigating and monitoring domains of interest, many of which also require a membership or one-time fee. These features include:

· Name Server Spy: Tracks transfer of a name server.

· Registrant Alert: You receive an alert when a domain record is created or modified with data of interest (such as a particular phone number or e-mail address).

· Reverse Whois: Finds domains by searching WHOIS data, such as names, addresses, phone numbers, e-mail addresses, etc.

· Domain History: Searches the WHOIS history of millions of domains going back to 1995.

7 http://www.domaintools.com/

Recipe 5-7: Initiating Zone Transfers with dig

A great way to obtain additional information about a domain is via zone transfers. To put it simply, a zone transfer is basically a more demanding DNS query. You are asking the DNS server to provide all the information it has about a particular domain (which includes information on its subdomains). Properly configured DNS servers do not allow unauthorized zone transfers because of the amount of information that they expose. Zone transfers have the potential to yield information that you cannot obtain elsewhere. For example, a domain could have dozens of subdomains that have never been used and will not show up anywhere else, such as in passive DNS results.

To demonstrate how to perform a zone transfer, we use the malicious domain name google-marks.com, which we obtained from the Malware Domain List (MDL) website.8 The first thing you must do is identify the DNS servers responsible for google-marks.com. You can obtain this information from the WHOIS record of the domain or through dig with the following command:

$ dig NS google-marks.com

google-marks.com. 900 IN NS ns4.google-marks.com.

google-marks.com. 900 IN NS ns3.google-marks.com.

You can see that the name servers are ns4.google-marks.com and ns3.google-marks.com. You can now check each name server to see if it allows zone transfers by using dig and the axfr option.

$ dig @ns4.google-marks.com axfr google-marks.com

google-marks.com. 86400 IN SOA ns1.google-marks.com.

admin.google-marks.com. 2009061201 3600 900 604800 86400

google-marks.com. 86400 IN NS ns3.google-marks.com.

google-marks.com. 86400 IN NS ns4.google-marks.com.

google-marks.com. 86400 IN MX 10 relay.google-marks.com.

google-marks.com. 86400 IN A 67.212.65.105

ftp.google-marks.com. 86400 IN CNAME google-marks.com.

mail.google-marks.com. 86400 IN CNAME google-marks.com.

ns3.google-marks.com. 86400 IN A 67.212.65.105

ns4.google-marks.com. 86400 IN A 67.212.65.106

relay.google-marks.com. 86400 IN A 67.212.65.105

www.google-marks.com. 86400 IN CNAME google-marks.com.

google-marks.com. 86400 IN SOA ns1.google-marks.com.

admin.google-marks.com. 2009061201 3600 900 604800 86400

The zone transfer succeeded, and as a result, you now have all of the DNS records associated with the domain. You can see there are several different subdomains that you might not have otherwise known about. The results show that relay.google-marks.com has an A record and is hosted on the same IP address as google-marks.com. You can now use this as an additional data point in your research.
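When a zone is large, it helps to group the transferred records by IP address so that shared hosting stands out. The Python sketch below parses dig-style answer lines of the form "name ttl IN type data" and groups hostnames by the address in their A records, following CNAME records one level. The function names are hypothetical, and the parser assumes each record sits on a single line (wrapped SOA continuation lines are simply skipped).

```python
from collections import defaultdict


def parse_zone_records(dig_output):
    """Return (name, ttl, type, data) tuples from dig answer lines."""
    records = []
    for line in dig_output.splitlines():
        fields = line.split()
        if (len(fields) < 5 or line.startswith(";")
                or fields[2] != "IN" or not fields[1].isdigit()):
            continue  # skip comments, blanks, and wrapped continuation lines
        name, ttl, _, rtype = fields[:4]
        records.append((name.rstrip(".").lower(), int(ttl), rtype,
                        " ".join(fields[4:])))
    return records


def hosts_by_address(records):
    """Map each IP address to the sorted hostnames that point at it."""
    a_records = defaultdict(set)
    for name, _ttl, rtype, rdata in records:
        if rtype == "A":
            a_records[name].add(rdata)
    grouped = defaultdict(set)
    for name, _ttl, rtype, rdata in records:
        if rtype == "A":
            grouped[rdata].add(name)
        elif rtype == "CNAME":
            # Follow the alias one level to the target's A records.
            target = rdata.rstrip(".").lower()
            for ip in a_records.get(target, ()):
                grouped[ip].add(name)
    return {ip: sorted(names) for ip, names in grouped.items()}
```

Fed the transfer output above, this would show that every host except ns4.google-marks.com sits on 67.212.65.105.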

8 http://www.malwaredomainlist.com/mdl.php

Recipe 5-8: Brute-forcing Subdomains with dnsmap

If you can’t perform a zone transfer, another way to find additional hosts in a given domain is to try subdomain brute-forcing. GNUCITIZEN created a tool called dnsmap,9 which was intended for use by penetration testers during the reconnaissance stage of an attack. However, you can use it to try to discover other hosts that attackers may have registered for command and control servers.

The following commands show you how to install the most current version of dnsmap (at the time of this writing).

$ wget http://dnsmap.googlecode.com/files/dnsmap-0.30.tar.gz

$ tar -xvzf dnsmap-0.30.tar.gz

$ cd dnsmap-0.30

$ make

$ sudo make install

The tool comes with a built-in list of about 1,000 commonly used hostnames (see dnsmap.h) and an external list of nearly 18,000 three-letter words (see wordlist_TLAs.txt). The README file also contains some URLs to similar tools and word lists that you can use. To detect if any of the built-in names exist for a target domain, you can use the following command:

$ dnsmap google.com

dnsmap 0.30 - DNS Network Mapper by pagvac (gnucitizen.org)

[+] searching (sub)domains for google.com using built-in wordlist

[+] using maximum random delay of 10 millisecond(s) between requests

ap.google.com

IP address #1: 74.125.115.106

IP address #2: 74.125.115.147

IP address #3: 74.125.115.99

IP address #6: 74.125.115.105

blog.google.com

IP address #1: 74.125.115.191

catalog.google.com

IP address #1: 74.125.115.102

IP address #2: 74.125.115.113

[REMOVED]

If you want to use the list of three-letter words or build your own word list, you can specify the file name like this:

$ dnsmap target-domain.com -f yourwordlist.txt

dnsmap will automatically detect if a domain uses wildcards (for example, if the DNS server responds with the same IP address for any subdomain). If you receive false positives, then you can also exclude IP addresses from the results. Keep in mind that if you brute-force too many subdomains in a short amount of time, your ISP (or the operators of the DNS servers you use) may view your activity as abusive and blacklist you in the future.
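To make the idea concrete, here is a rough Python 3 sketch of subdomain brute-forcing with a simple wildcard check. It is not how dnsmap is implemented; the function names, the made-up probe label, and the delay parameter are assumptions for illustration only. The resolver is passed in as an argument so that you can substitute your own lookup function.

```python
import socket
import time


def resolve(hostname):
    """Return the set of IPv4 addresses for hostname, or an empty set."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except (socket.gaierror, socket.herror, UnicodeError):
        return set()


def brute_force(domain, words, resolver=resolve, exclude=(), delay=0.0):
    """Try word.domain for each word, skipping wildcard and excluded IPs."""
    # A label that should not exist still resolves if the zone uses wildcards.
    wildcard_ips = resolver("zz-no-such-host-7f3a." + domain)
    ignored = set(wildcard_ips) | set(exclude)
    found = {}
    for word in words:
        host = "%s.%s" % (word, domain)
        ips = set(resolver(host)) - ignored
        if ips:
            found[host] = sorted(ips)
        if delay:
            time.sleep(delay)  # be gentle with the DNS servers you query
    return found
```

A call such as brute_force("target-domain.com", ["www", "ftp", "mail"], delay=0.1) returns a dictionary of the hostnames that resolved to something other than the wildcard or excluded addresses.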

9 http://code.google.com/p/dnsmap

Recipe 5-9: Mapping IP Addresses to ASNs via Shadowserver

The Shadowserver Foundation10 and Team Cymru11 both run their own WHOIS services that you can query to find out various things such as IP address to ASN mapping. An autonomous system (AS) is a grouping of IP address blocks that are assigned to an Internet Service Provider (ISP). The ISP must also be assigned an autonomous system number (ASN), which is used to uniquely identify the ISP’s networks for routing purposes. Using an ASN, you can find out what IP address ranges belong to an ISP.

The Shadowserver and Team Cymru services provide the following information about an IP address:

· ASN

· IP address block

· Country the IP is located in

· ISP it belongs to

· Peer networks

· Any other ISPs to which IP address space may have been delegated

Querying ASNs with Shadowserver

The following example shows how to use the Shadowserver WHOIS service at asn.shadowserver.org to find out more about the IP address 67.212.65.105 from Recipe 5-7.

$ whois -h asn.shadowserver.org 'origin 67.212.65.105'

10929 | 67.212.64.0/19 | NETELLIGENT | RU | | QNIX LTD WORLD DEDICATED

The output is in the following format:

ASN | Prefix | AS Name | Country | Domain | ISP

From the preceding output, you can see that the suspect IP address is tied to ASN 10929 and it is contained in the IP address block 67.212.64.0/19 in Russia. The AS Name, NETELLIGENT, represents the ISP that owns the ASN. However, the IP address block has been further delegated to QNIX LTD WORLD DEDICATED. A bit more research on the Web reveals that Netelligent Hosting Services Inc. out of Canada appears to have delegated the 67.212.64.0/19 range to a Russian company named Qnix Ltd, World Dedicated. Note that neither of these two companies is believed to be malicious; we are just using a real-life example of how to determine relationships.
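If you collect many of these responses, splitting them into named fields saves time. The sketch below (Python 3) assumes the six-column, pipe-separated layout shown above; the field names and the parse_origin function are our own labels, not part of the Shadowserver service.

```python
import ipaddress

# Column labels follow the format line shown above; the key names are ours.
FIELDS = ("asn", "prefix", "as_name", "country", "domain", "isp")


def parse_origin(line):
    """Split an 'ASN | Prefix | AS Name | Country | Domain | ISP' line."""
    values = [part.strip() for part in line.split("|")]
    if len(values) != len(FIELDS):
        raise ValueError("unexpected field count: %r" % line)
    record = dict(zip(FIELDS, values))
    record["asn"] = int(record["asn"])
    record["prefix"] = ipaddress.ip_network(record["prefix"])
    return record


if __name__ == "__main__":
    rec = parse_origin(
        "10929 | 67.212.64.0/19 | NETELLIGENT | RU | | QNIX LTD WORLD DEDICATED"
    )
    print(rec["asn"], rec["prefix"], rec["isp"])
    # Confirm that the queried address really falls inside the reported block.
    print(ipaddress.ip_address("67.212.65.105") in rec["prefix"])
```

Converting the prefix into a network object lets you test other suspect addresses against the same block without issuing another query.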

You can now do another query to see what other IP address blocks are covered by ASN 10929.

$ whois -h asn.shadowserver.org 'prefix 10929'

64.15.66.0/24

64.15.64.0/20

64.34.124.0/24

64.86.56.0/22

67.212.83.0/24

67.212.64.0/19

68.71.32.0/20

68.71.32.0/19

205.151.108.0/22

205.236.16.0/24

205.236.58.0/24

205.236.70.0/24

208.75.136.0/23

208.75.136.0/22

208.92.196.0/22

209.44.96.0/19
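Some of the prefixes in this list overlap (for example, 67.212.83.0/24 sits inside 67.212.64.0/19), so simply adding up the block sizes would overcount. The following Python 3 sketch uses the standard ipaddress module to merge overlapping blocks before counting; the summarize function name is our own.

```python
import ipaddress

# Prefixes copied from the 'prefix 10929' response above.
PREFIXES = """
64.15.66.0/24 64.15.64.0/20 64.34.124.0/24 64.86.56.0/22
67.212.83.0/24 67.212.64.0/19 68.71.32.0/20 68.71.32.0/19
205.151.108.0/22 205.236.16.0/24 205.236.58.0/24 205.236.70.0/24
208.75.136.0/23 208.75.136.0/22 208.92.196.0/22 209.44.96.0/19
""".split()


def summarize(prefixes):
    """Merge overlapping prefixes and count the addresses they cover."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    collapsed = list(ipaddress.collapse_addresses(networks))
    total = sum(net.num_addresses for net in collapsed)
    return collapsed, total


if __name__ == "__main__":
    collapsed, total = summarize(PREFIXES)
    for net in collapsed:
        print(net)
    print("%d blocks, %d addresses" % (len(collapsed), total))
```

For the sixteen prefixes listed here, merging leaves 12 non-overlapping blocks that together cover 33,792 addresses.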

The preceding output shows you that Netelligent Hosting Services has several different IP address blocks spanning thousands of IP addresses. If you want to find out who their peers are, you can run the following command:

$ whois -h asn.shadowserver.org 'peer 67.212.65.105 verbose'

10929 | 67.212.64.0/19 | NETELLIGENT | RU | | QNIX LTD WORLD DEDICATED

3257 TINET BACKBONE Tinet SpA

3356 LEVEL3 Level 3 Communications

The results show that Tinet and Level 3 Communications are likely peers (upstream providers in this case), as each AS is directly connected to Netelligent. This helps you understand how these networks are connected and gives you potential points of contact should you have an issue reporting abuse to a particular ISP.

Querying ASNs with Netcat

You can query for the ASNs of thousands of IP addresses at once using netcat. Netcat is available for Linux and Windows systems. You can install it on your Ubuntu system by running apt-get install netcat or you can download the Windows version.12

Note: Antivirus vendors may detect netcat as a malicious program and classify it as a threat to be quarantined or removed.

To use this method, create a text file containing the IPs you want to query in the following format:

begin origin

a.b.c.d

a.b.b.c

d.e.f.g

d.b.a.d

b.e.e.f

end

If you saved this file as ip.txt, you can now run the following:

$ nc asn.shadowserver.org 43 < ip.txt > asn.txt

This will save all of the output for each of the IP addresses to the file asn.txt. You can visit the Shadowserver IP/BGP Whois Service page or the Team Cymru IP to ASN Mapping page for additional information on the services.
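If your IP addresses come from a script rather than a hand-edited file, you can generate the query block programmatically. The Python 3 sketch below builds the same begin/end format and, optionally, sends it over TCP port 43 the way the netcat command does. The function names are our own, and the sketch does not verify that the service accepts the request; treat the network half as an untested assumption.

```python
import ipaddress
import socket


def build_bulk_query(ips, command="origin"):
    """Return the 'begin <command> ... end' block, skipping invalid entries."""
    lines = ["begin " + command]
    for ip in ips:
        try:
            lines.append(str(ipaddress.ip_address(ip.strip())))
        except ValueError:
            continue  # ignore anything that is not a valid IP address
    lines.append("end")
    return "\n".join(lines) + "\n"


def send_bulk_query(query, host="asn.shadowserver.org", port=43, timeout=30):
    """Send the query over TCP (as netcat would) and return the raw reply."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(query.encode("ascii"))
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", "replace")


if __name__ == "__main__":
    print(build_bulk_query(["67.212.65.105", "not-an-ip", "85.17.139.54"]))
```

Writing the result of build_bulk_query to ip.txt gives you a file that you can feed to the nc command shown above.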

10 http://www.shadowserver.org/wiki/pmwiki.php/Services/IP-BGP

11 http://www.team-cymru.org/Services/ip-to-asn.html

12 http://joncraton.org/files/nc111nt.zip

Recipe 5-10: Checking IP Reputation with RBLs

Various people and organizations maintain blacklists (or block lists). These lists keep track of whether an IP address, IP address range, or domain is considered malicious or abusive. When the lists keep up-to-the-minute information about IPs and hostnames, they are often referred to as real-time blacklists (RBLs). For example, an IP address that has been detected as sending spam often ends up being listed on the Spamhaus Block List,13 while an IP address for a system that is part of a botnet may end up in the abuse.ch DNS Block List.14 Searching these block lists can give you great information, but at the same time it can be quite time-consuming. Fortunately, there is an online service that will check dozens of these lists for you based on an IP address or domain, and will return any blacklists on which it is found.

The Anti-Abuse Project

The Anti-Abuse Project has created a website15 that automatically checks IP addresses and domains against over 50 different block lists. Using the Multi-RBL Check gives you a quick picture as to whether or not an IP address or domain has been reported for involvement in suspicious activity. Should an IP address show up on ten different block lists, you have a pretty good idea it is malicious. At the same time, just because an IP or domain is not listed on any of the block lists does not mean it is safe.

When you search an IP address or domain on the Multi-RBL Check, you will see a listing of all the block lists it checks against. In the following example, you will search the IP address 218.61.202.66. This IP address is a known open proxy located in China. The results appear as shown in Figure 5-8.

Figure 5-8: The IP 218.61.202.66 is listed on several block lists


You can see that the IP address is listed on 11 block lists. This is a red flag that this IP address may be malicious or abusive. You need to visit the block lists that have the IP address listed to see if they provide any more information. Some of the block lists are self-explanatory and give you a general idea of why the IP address is listed right off the bat. You can see that 218.61.202.66 is listed on the SpamCop Blocking List,16 so you know it was recently reported as a source of spam. You can still visit the SpamCop website and search the IP address to obtain additional information. Searching the SpamCop Blocking List returns the information shown in Figure 5-9.

Figure 5-9: Looking up the causes for a blacklisted IP


SpamCop removes a listing 24 hours after the last report. Because the listing shows 17 hours remaining, you can tell that this IP was reported for sending spam within the last seven hours. The results also tell you that spam has been received and reported by both SpamCop’s spam traps and its users.
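Many block lists can also be queried directly over DNS: you reverse the octets of the IPv4 address, append the list's DNS zone, and look up the resulting name. An answer means the address is listed; a "no such name" response means it is not. The Python 3 sketch below illustrates the convention. The function names are our own, the zone you pass in (bl.spamcop.net in the example) is an assumption you should confirm with the list operator, and a real check would also need to interpret the returned 127.0.0.x code, which differs by list.

```python
import ipaddress
import socket


def dnsbl_name(ip, zone):
    """Build the query name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the block list answers for this address."""
    try:
        resolver(dnsbl_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name (or a failed lookup) is treated as "not listed".
        return False


if __name__ == "__main__":
    print(dnsbl_name("218.61.202.66", "bl.spamcop.net"))
```

Looping is_listed over a handful of zones gives you a small, scriptable version of a multi-list check, although it will not cover as many lists as the website described above.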

13 http://www.spamhaus.org/sbl/index.lasso

14 http://dnsbl.abuse.ch/

15 http://www.anti-abuse.org/multi-rbl-check/

16 http://www.spamcop.net/bl.shtml

Fast Flux Domains

In recent years, criminals have begun using a technique called fast flux DNS to make their command and control networks more resilient. Instead of hosting their domain name at a single ISP, they host their infrastructure across multiple ISPs. When a domain that is part of a fast flux network is resolved, it often returns several IP addresses. These domains usually have round-robin DNS set up, which continually changes the order in which the IP addresses are returned. If one of the servers goes down, the others automatically pick up the slack and there is little impact to the miscreant’s operation. The IP addresses of servers that have gone offline will eventually be removed and replaced with new ones. The HoneyNet Project has written a paper titled Know Your Enemy: Fast-Flux Service Networks (http://www.honeynet.org/papers/ff/) that provides a great deal more information.

It is necessary to be able to recognize fast flux networks, as you may not want to waste your time attempting to block or take down IP addresses associated with them. The IP addresses associated with fast flux networks are often numerous and short-lived. Blocking or taking down one or more of these IP addresses will not likely have much effect. A block or takedown of the domain would prove to be much more effective. The recipes in this section help you determine if a particular domain name is part of a fast flux network and how to track the IP addresses that are associated with it.

Recipe 5-11: Detecting Fast Flux with Passive DNS and TTLs

Recipe 5-2 detailed how to find a domain’s IP address using the host and dig commands. This recipe uses the same basic steps and explains how to detect potential fast flux networks. The vast majority of fast flux domains will return several IP addresses when you resolve them. This may range from just a few IPs to dozens of them. Others may return only a single IP address when resolved but will frequently change that IP so that a new one is returned for each query. The example that follows shows the DNS resolution for a domain associated with a key logger that we suspect might be part of a fast flux network.

$ host wooobo.cn

wooobo.cn has address 71.238.179.69

wooobo.cn has address 98.255.196.56

wooobo.cn has address 184.56.230.63

wooobo.cn has address 62.42.16.78

wooobo.cn has address 68.61.77.93

As you can see, the domain name wooobo.cn returned five different IP addresses. This by itself does not mean that it is a fast flux domain. However, if you already know or suspect this domain is malicious, it increases the likelihood this domain does not just happen to be hosted on several IP addresses at once. Also note that the IP addresses are not part of the same network. Several hosting providers such as Yahoo! return multiple IP addresses for a given domain that is hosted with them. However, in those cases, IP addresses are often in close proximity to one another and are a part of the same network. The IP addresses from the preceding query do not appear to have any relation to one another.

If you resolve the wooobo.cn domain a few moments later, you will notice it is using the round-robin DNS technique.

$ host wooobo.cn

wooobo.cn has address 68.61.77.93

wooobo.cn has address 62.42.16.78

wooobo.cn has address 184.56.230.63

wooobo.cn has address 98.255.196.56

wooobo.cn has address 71.238.179.69

Notice that the ordering of the IP addresses has changed, but the query still returned the same five addresses. Most applications attempt to connect to the first IP address that is returned and only try the subsequent IP addresses if the connection times out. The round-robin technique helps load-balance the connections and keeps a bad IP address from always being returned first.

At this point, you can be fairly confident that the domain wooobo.cn is part of a fast flux network, but it is still possible it just happens to be hosted at multiple ISPs. You can investigate further by using the host command to perform a reverse lookup (PTR record) on these IP addresses and see where they are hosted. Alternatively, you could conduct WHOIS queries on the IP addresses to see whom they belong to.

$ for i in 68.61.77.93 98.255.196.56 184.56.230.63; do host $i; done

93.77.61.68.in-addr.arpa \

domain name pointer c-68-61-77-93.hsd1.mi.comcast.net.

56.196.255.98.in-addr.arpa \

domain name pointer c-98-255-196-56.hsd1.ca.comcast.net.

63.230.56.184.in-addr.arpa \

domain name pointer cpe-184-56-230-63.neo.res.rr.com.

Based on the output, these hosts are mostly cable modem IP addresses located in different states throughout the US. This makes it highly improbable that these systems are legitimately hosting content and increases the likelihood that we are dealing with a fast flux network.

Because fast flux networks often rotate out and change their IP addresses, you should expect to see different IP addresses at some point when you resolve the domain. To demonstrate this concept, we waited a few hours and then resolved the domain wooobo.cn again. The results are as follows:

$ host wooobo.cn

wooobo.cn has address 85.138.202.232

wooobo.cn has address 93.103.241.36

wooobo.cn has address 190.30.87.30

wooobo.cn has address 190.95.111.179

wooobo.cn has address 41.92.44.42

The domain resolution has returned five completely new IP addresses. You can now confirm that this is a fast flux domain. It returns multiple IP addresses located on different networks that frequently change over time.
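The checks in this recipe (do the addresses change over time, and are they scattered across unrelated networks?) are easy to automate once you have saved the results of several lookups. The Python 3 sketch below compares two snapshots. The function names are our own, and counting distinct /16 networks is only a crude stand-in for "unrelated networks"; an ASN lookup would be more precise.

```python
import ipaddress


def network_spread(ips, prefixlen=16):
    """Count how many distinct /prefixlen networks the addresses fall into."""
    networks = {
        ipaddress.ip_network("%s/%d" % (ip, prefixlen), strict=False)
        for ip in ips
    }
    return len(networks)


def compare_snapshots(earlier, later):
    """Report which addresses persisted, which are new, and the overall spread."""
    earlier, later = set(earlier), set(later)
    return {
        "kept": sorted(earlier & later),
        "new": sorted(later - earlier),
        "spread": network_spread(earlier | later),
    }


if __name__ == "__main__":
    first = ["71.238.179.69", "98.255.196.56", "184.56.230.63",
             "62.42.16.78", "68.61.77.93"]
    hours_later = ["85.138.202.232", "93.103.241.36", "190.30.87.30",
                   "190.95.111.179", "41.92.44.42"]
    print(compare_snapshots(first, hours_later))
```

Because the comparison uses sets, a round-robin reordering of the same five addresses reports nothing new, while the lookup performed hours later reports five new addresses.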

Detecting Fast Flux with TTLs

Checking if a hostname has a very low TTL value and is continuously returning new IP addresses is another method you can use to detect fast flux. A TTL value of 0 results in DNS servers not caching the returned IP address, so that all subsequent attempts to contact the hostname result in a new DNS lookup. The attackers then continuously update the IP address to which the domain resolves. The Storm Worm17 and Waledac18 botnets are known for implementing this technique. When these botnets were active, you could find hundreds of botnet IP addresses in an hour by just continuously resolving domains associated with either malware family.

You can use the dig command to find a domain’s TTL.

$ dig my-traff.net

[REMOVED]

my-traff.net. 14400 IN A 85.17.139.54

The second field of the A record response (14400) is the TTL value in seconds. This means that name servers should cache the IP address for the domain for 14400 seconds (4 hours). Even if the IP address were to be updated several times in an hour, you would not likely see a change in the IP until four hours had passed since the initial DNS lookup. If you did this query on a Storm Worm or Waledac fast flux domain, you would see the value 0 instead of 14400.
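When you are checking many domains, you can pull the TTL out of each answer line automatically. The Python 3 sketch below parses a single A record line in the layout dig prints. The function names are our own, and the 300-second threshold for "short-lived" is an arbitrary assumption that you should tune to your own data.

```python
def parse_a_record(line):
    """Split 'name TTL IN A address' into (name, ttl_seconds, address)."""
    fields = line.split()
    if len(fields) != 5 or fields[2] != "IN" or fields[3] != "A":
        raise ValueError("not an A record line: %r" % line)
    return fields[0].rstrip("."), int(fields[1]), fields[4]


def looks_short_lived(ttl, threshold=300):
    """Flag TTLs at or below the threshold (300 seconds is an assumed cutoff)."""
    return ttl <= threshold


if __name__ == "__main__":
    name, ttl, address = parse_a_record("my-traff.net. 14400 IN A 85.17.139.54")
    print("%s -> %s, cached for %.1f hours" % (name, address, ttl / 3600.0))
    print("short-lived TTL:", looks_short_lived(ttl))
```

A low TTL alone is not proof of fast flux, because many legitimate services use short TTLs; combine it with the address-change checks described earlier in this recipe.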

Using Passive DNS for Detecting Fast Flux

It is likely that passive DNS search results would return dozens of IP addresses for a domain that is part of a fast flux network. You can use BFK’s passive DNS service (see Recipe 5-4) to assist in your investigation. This time, however, you will search on the domain wooobo.cn instead of entering an IP address. Figure 5-10 shows the results.

Figure 5-10: BFK passive DNS can help reveal fast flux


The search results returned over 170 different IP addresses associated with wooobo.cn. You can quickly tell from these results that you are dealing with a fast flux domain that is using dozens of hacked computers to host its activities.

17 http://www.cyber-ta.org/pubs/StormWorm/

18 http://www.honeynet.org/node/348

Recipe 5-12: Tracking Fast Flux Domains

The Australian Honeynet Project created a tool called Tracker19 that you can use to find fast flux domains and track their IP addresses. The Tracker system uses a PostgreSQL database and a set of Perl scripts that you can run in the background on your Linux system.

To get started with Tracker, follow these steps:

1. Download the most recent version of Tracker, which will contain the database schema and the following set of Perl scripts:

· add-to-test-table.pl: Loads suspect domains from a text file into the database.

· test_submission.pl: Performs an initial check on the domains to see if they are fast flux.

· flux.pl: A daemon process to monitor IPs in a fast flux network.

2. Create a database on your PostgreSQL server named fast_flux and add a user with full privileges.

$ sudo -u postgres psql

postgres=# CREATE DATABASE fast_flux;

postgres=# CREATE USER flux WITH PASSWORD 'password';

postgres=# GRANT ALL PRIVILEGES ON DATABASE fast_flux to flux;

3. Modify the following lines in each of Tracker’s Perl files to contain the appropriate credentials for the database user:

my $username = 'flux';

my $password = 'password';

4. Import the database schema from setupdb.sql into the database that you just created.

$ sudo -u postgres psql fast_flux < setupdb.sql

5. Change the scripts’ file permissions to make them executable (so that you do not need to type perl first).

$ chmod +x add-to-test-table.pl

$ chmod +x flux.pl

$ chmod +x test_submission.pl

6. Use add-to-test-table.pl to supply Tracker with a list of suspect domains to monitor. To do this, add the domains to a text file as shown in the following commands:

$ echo test.com > domains.txt

$ echo pillsshopping.com >> domains.txt

$ ./add-to-test-table.pl domains.txt

test.com Inserted

pillsshopping.com Inserted

7. Use test_submission.pl to perform a series of tests on the domains you added to the database. To pass the test, a domain must meet the fast flux criteria, which by default means returning ten or more IP addresses in a five-second period. If you want to tweak the criteria (for example, to five IP addresses in five seconds), you can modify the $passmark variable in test_submission.pl. This step is important, because Tracker only monitors domains that pass the initial test.

$ ./test_submission.pl

Looking for new work to do

Testing Host test.com

1 Distinct cnt

Removing Host test.com from the input Table

Testing Host pillsshopping.com

5 Distinct cnt

Inserting Host pillsshopping.com as its \

classified as on a fast-flux network

Removing Host pillsshopping.com from the input Table

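This example uses two domains, one of which is classified as being fast flux. In the testing period, test.com was found to have a single IP address, while pillsshopping.com was found to have five IP addresses. Five addresses would not satisfy the default threshold of ten, so this output appears to reflect a run with $passmark lowered to five, as described in step 7. The latter domain met the criteria and was moved from the input table to the hostname table.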

fast_flux=> select * from hostname;

hostname | submit_date | last_seen | live | track

-------------------+-------------+------------+------+-------

pillsshopping.com | 2010-04-26 | 2010-04-26 | t | t
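To illustrate the kind of test described in step 7, the following Python 3 sketch resolves a domain repeatedly for a fixed window and compares the number of distinct addresses to a pass mark. This is not Tracker's code (Tracker is written in Perl and we have not reproduced its logic); the function names, the pause between lookups, and the injectable resolver, clock, and sleep parameters are all assumptions made for illustration.

```python
import time


def count_distinct_ips(domain, resolver, window=5.0, pause=0.5,
                       clock=time.monotonic, sleep=time.sleep):
    """Resolve domain repeatedly for `window` seconds; return all IPs seen."""
    seen = set()
    deadline = clock() + window
    while True:
        seen.update(resolver(domain))  # resolver returns an iterable of IPs
        if clock() >= deadline:
            break
        sleep(pause)
    return seen


def passes_test(domain, resolver, passmark=10, **kwargs):
    """Return True if the domain shows at least `passmark` distinct IPs."""
    return len(count_distinct_ips(domain, resolver, **kwargs)) >= passmark
```

With a resolver such as lambda d: socket.gethostbyname_ex(d)[2] (after importing socket), passes_test("pillsshopping.com", resolver, passmark=5) would mirror the lowered threshold mentioned in step 7.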

Now you are ready to run flux.pl, which will start tracking domains in the hostname table that have the track column set to true.

$ ./flux.pl

pillsshopping.com

82.211.7.32 pillsshopping.com Inserted

94.136.61.205 pillsshopping.com Inserted

87.230.53.82 pillsshopping.com Inserted

93.89.80.117 pillsshopping.com Inserted

94.23.110.101 pillsshopping.com Inserted

Checking Domains that have been set to inactive

Getting New Work

flux.pl will continue to run and resolve the domain every few seconds to see if any new IP addresses are returned. If a new IP address is detected, it will be added to the node table along with the rest of the IP addresses. The script will also continually check the hostname table and automatically begin to track new additions.

The flux.pl script, once running, will continue to send data to STDOUT until it is closed. You may want to run the script in the background with nohup instead. This keeps the script running even if you log out of the SSH or terminal session.

$ nohup ./flux.pl > /dev/null &

If you want to discontinue tracking a domain, just change the track field to false. This keeps any historical data in the database.

fast_flux=> update hostname \

set track = false \

where hostname = 'pillsshopping.com';

After you run this command, the hostname table should look like this:

fast_flux=> select * from hostname;

hostname | submit_date | last_seen | live | track

-------------------+-------------+------------+------+-------

pillsshopping.com | 2010-04-26 | 2010-04-26 | t | f

19 http://honeynet.org.au/?q=node/10

Geo-Mapping IP Addresses

When you have a lot of suspect IP addresses, possibly from fast flux monitoring, it’s useful to see where they are all located for trending or reporting purposes. Only complete geeks can look at an IP address and tell you off the top of their heads in which country the IP is located. If you’re not one of those geeks, you can use databases to figure out the longitude and latitude. Using those coordinates, you can plot the IPs on a map to see where they exist geographically. The recipes in this section show how to generate static (i.e., PNG, JPEG, BMP) map images and dynamic/interactive maps based on a given set of IP addresses.

Recipe 5-13: Static Maps with Maxmind, matplotlib, and pygeoip


You can find supporting material for this recipe on the companion DVD.

This recipe shows how you can use the freely available GeoLite Country or GeoLite City databases from MaxMind20 to determine the approximate geographical location of an IP address. The databases are just files containing data in an organized format, not network-enabled servers like PostgreSQL and MySQL. To access the data, MaxMind provides APIs in C, Perl, PHP, Python (requires the C library), Ruby, and JavaScript. However, this recipe uses a third-party API called pygeoip.21 Pygeoip is written in pure Python and does not depend on any C libraries. Here is a list of the types of information you can find in the MaxMind databases for each IP address:

· Longitude and latitude

· Full country name and two-letter country code

· Region (i.e., state)

· Area code

· City name

· Postal (i.e., zip code)

MaxMind supplies commercial versions of the databases that have slightly more accurate information. For example, they advertise that the free GeoLite City database is 99.5 percent accurate on a country level and 79 percent accurate on a city level. The commercial version is 99.8 percent accurate on a country level and 83 percent accurate on a city level.

Installing MaxMind and Pygeoip

To get started, follow these steps:

1. Download the GeoLite City or GeoLite Country database from MaxMind. The databases are updated at the beginning of each month, so you might set a cron job to automatically download the newest databases when they become available (use -N with wget to download the database only if it has been updated since the last time you fetched it).

$ wget -N -q \

http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz

$ gzip -d GeoLiteCity.dat.gz

$ ls -alh GeoLiteCity.dat

-rw-r--r-- 1 root root 29M 2010-04-02 11:29 GeoLiteCity.dat

2. Install the pygeoip API. The tool’s website provides a few installation techniques, but you might run into issues due to some hard-coded versions in the pygeoip source code. To get around the issues, use the following commands:

$ wget http://pygeoip.googlecode.com/files/pygeoip-0.1.3.zip

$ unzip pygeoip-0.1.3.zip

$ cd pygeoip-0.1.3

$ wget \

http://svn.python.org/projects/sandbox/trunk/setuptools/ez_setup.py

$ wget \

http://pypi.python.org/packages/2.5/s/setuptools/setuptools-0.6c11-py2.5.egg

$ mv setuptools-0.6c11-py2.5.egg setuptools-0.7a1-py2.5.egg

$ python setup.py build

$ sudo python setup.py install

3. If everything worked, you should be able to query the MaxMind database from a Python shell, like this:

$ python

>>> import pygeoip

>>> gip = pygeoip.GeoIP('GeoLiteCity.dat')

>>> rec = gip.record_by_name('yahoo.com')

>>> for key,val in rec.items():

...     print("%s: %s" % (key, val))

...

city: Sunnyvale

region_name: CA

area_code: 408

longitude: -122.0074

country_code3: USA

latitude: 37.4249

postal_code: 94089

dma_code: 807

country_code: US

country_name: United States

Generating Static Images with Matplotlib

To use the API in a slightly more automated manner and actually plot the IP addresses on a map, follow these steps:

1. Install the matplotlib22 package and its dependencies. You can install it from the source by downloading the appropriate package or typing the following commands on your Ubuntu machine:

$ sudo apt-get install python-tk \

python-numpy \

python-matplotlib \

python-dev

2. Matplotlib is just the base package. To plot points on a map, you’ll also need to install the basemap module. (Note that we broke the URL into separate lines for printing.)

$ wget http://sourceforge.net/projects/matplotlib/\

files/matplotlib-toolkits/basemap-0.99.4/\

basemap-0.99.4.tar.gz/download

$ tar -xvzf basemap-0.99.4.tar.gz

$ cd basemap-0.99.4/geos-2.2.3

$ ./configure

$ make

$ sudo make install

$ cd ..

$ python setup.py build

$ sudo python setup.py install

3. Now you’re ready to start producing map images. On the book’s DVD, you’ll find a Python script named mapper.py. You can use this script in three ways:

· Pass it a comma-separated list of IP addresses on the command line.

· Pass it a file name containing a list of IP addresses.

· Import the module from your own Python scripts.

If you plan to use mapper.py on the command line, here is the syntax:

$ python mapper.py

Usage: mapper.py [options]

Options:

-h, --help show this help message and exit

-f FILENAME, --file=FILENAME

filename with CRLF-separated IPs

-a ADDR, --addr=ADDR CSV list of IPs

mapper.py: error: You must supply a list of IPs or file with IPs!

The following example shows you how to plot a few of the IP addresses from the fast flux network described in Recipe 5-11.

$ python mapper.py -a 85.138.202.232,93.103.241.36,\

190.95.111.179,41.92.44.42

Done.

By default, the script outputs a PNG image named map.png using the Miller Cylindrical Projection map (see the basemap23 website for other maps). It should appear like the image in Figure 5-11.

Figure 5-11: A static PNG map populated with various IP addresses


The following example shows you how you can import the mapper.py module into your own Python programs to generate custom maps.

#!/usr/bin/python

from mapper import Mapper

ip_list = [] # fill this list any way you want

m = Mapper(ip_list)

m.map(title="My New Map", # title for the map

output="newmap.png", # output file name

showcity=False, # do not print city name on the map

type="ortho") # use Orthographic Projection map
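One way to fill ip_list is to read the addresses from a file or from another tool's output, dropping duplicates and anything that is not a valid IP address. The helper below is a Python 3 sketch using only the standard library; clean_ip_list is our own name and is not part of mapper.py, and the commented-out lines assume the Mapper class behaves as shown in the example above.

```python
import ipaddress


def clean_ip_list(lines):
    """Return unique, valid IP address strings in their original order."""
    ips, seen = [], set()
    for line in lines:
        candidate = line.strip()
        if not candidate or candidate in seen:
            continue
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            continue  # skip comments, hostnames, and malformed entries
        seen.add(candidate)
        ips.append(candidate)
    return ips


if __name__ == "__main__":
    sample = ["85.138.202.232\n", "# comment\n", "93.103.241.36\n",
              "85.138.202.232\n", "999.1.1.1\n"]
    ip_list = clean_ip_list(sample)
    print(ip_list)
    # With mapper.py from the DVD available, you could then continue with:
    # from mapper import Mapper
    # Mapper(ip_list).map(title="My New Map", output="newmap.png")
```

Because the function accepts any iterable of strings, you can pass it an open file object directly, for example clean_ip_list(open("ip_list.txt")).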

20 http://www.maxmind.com

21 http://code.google.com/p/pygeoip/

22 http://matplotlib.sourceforge.net/

23 http://matplotlib.sourceforge.net/basemap/doc/html/users/mapsetup.html

Recipe 5-14: Interactive Maps with Google Charts API


You can find supporting material for this recipe on the companion DVD.

If you prefer interactive maps to static images, you can use Google Charts API.24 Some options available to you are:

· Plot your IP addresses on maps that look exactly like the ones on maps.google.com, with the ability to zoom and label locations.

· Plot your IP addresses on interactive, color-coded geomaps and intensity maps.

This recipe shows you how to create a geomap using MaxMind’s database and Google Charts API. On the book’s DVD, you’ll find a script named googlegeoip.py, which takes the same command-line parameters as mapper.py from Recipe 5-13. Instead of outputting a static image, it outputs HTML that you can embed into a web page. We took about 500 IP addresses that are involved in the wooobo.cn fast flux network and placed them into a text file. Then we issued the following commands (the first is just to show you the output; you’ll want to use the second command, which redirects output to an HTML file):

$ python googlegeoip.py -f ip_list.txt

<html><head>

<script type="text/javascript" src="http://www.google.com/jsapi"></script>

<script type="text/javascript">

google.load('visualization', '1', {packages: ['geomap']});

</script>

<script type="text/javascript">

function drawVisualization() {

// Create and populate the data table.

var data = new google.visualization.DataTable();

data.addColumn('string', '', 'Country');

data.addColumn('number', 'Hosts');

data.addRows(58);

data.setValue(0, 0, 'FR');

data.setValue(0, 1, 8);

data.setValue(1, 0, 'BG');

[REMOVED]

$ python googlegeoip.py -f ip_list.txt > map.html

The final step is to view the map.html file in a web browser. Make sure you’re connected to the Internet or the images and dependent JavaScript won’t be available. Figure 5-12 shows the distribution of IP addresses per geographic region for the wooobo.cn fast flux network. You can hover your cursor over any country to see the two-letter country code and exact number of IP addresses that reside in that country.

Figure 5-12: Distribution of IPs per country in the wooobo.cn fast flux network
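The data behind a geomap like this is simply a count of IP addresses per country code. The Python 3 sketch below shows one way to produce such counts and format them as the data.setValue lines seen in the HTML output above. It is not the code from googlegeoip.py, which we have not reproduced here; the function names are our own, and lookup stands in for whatever function you use to map an IP address to a two-letter country code (for example, a small wrapper that reads the country_code field shown in Recipe 5-13).

```python
from collections import Counter


def country_counts(ips, lookup):
    """Count addresses per country code; lookup(ip) returns a code or None."""
    counts = Counter()
    for ip in ips:
        code = lookup(ip)
        if code:
            counts[code] += 1
    return counts


def geomap_rows(counts):
    """Format the counts as DataTable calls, most common country first."""
    lines = ["data.addRows(%d);" % len(counts)]
    for row, (code, hosts) in enumerate(counts.most_common()):
        lines.append("data.setValue(%d, 0, '%s');" % (row, code))
        lines.append("data.setValue(%d, 1, %d);" % (row, hosts))
    return lines


if __name__ == "__main__":
    # Hypothetical lookup table standing in for a GeoIP database query.
    table = {"192.0.2.1": "FR", "192.0.2.2": "FR", "198.51.100.7": "BG"}
    for line in geomap_rows(country_counts(table, table.get)):
        print(line)
```

Addresses for which the lookup returns nothing are skipped, so the row count reflects only the countries that were actually identified.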


24 http://code.google.com/apis/charttools/