VOICE SPAM - APPLICATION ATTACKS - Praise for Hacking Exposed: Unified communications & VoIP Security Secrets & Solutions, Second Edition (2014)




I am to the point where I am going to shut off all my phones. My home phone constantly rings with SPAM and scams. I disconnected it. I am getting the same calls on my smartphone. Worst of all, my office phone is ringing all the time. I can’t turn off my smartphone and office phone. What can I do about this?

—User reaction to voice SPAM

Anyone using email on any sort of device, whether it is a PC, Mac, or smartphone, is familiar with email SPAM: the constant flood of irritating messages trying to sell you mortgages, loans, sexual enhancement products, replica watches, gambling opportunities, and so on. Even with blocking of traffic from known spammers and modern SPAM filters, many of us receive hundreds of unwanted messages a day. Even the best SPAM filters let some unwanted messages through or, worse yet, put a useful message into a junk mail box, which must be searched periodically. As SPAM filters have improved, the spammers always seem to find a way to get their messages through.

Voice SPAM or SPAM over Internet Telephony (SPIT) is a similar problem that affects voice and UC. We are going to avoid the use of the term “SPIT,” though, because it implies that voice SPAM can only be received over Internet Telephony, which is not the case. Voice SPAM is generated through the use of VoIP, Internet Telephony, and UC. However, because it involves unwanted calls, it can be received by any target victim, including a residential user using analog or cable service, a smartphone user, and an enterprise user with any mix of legacy or UC systems. Consider getting calls all day for the “products” illustrated in Figure 8-1.


Figure 8-1 Voice SPAM “product” examples


Another term often used for voice SPAM is “robocall.” This is something of a misnomer, because the term covers any automatically generated call, whatever its purpose, although it is most often associated with voice SPAM. The Federal Trade Commission (FTC), an advocate for consumers, has a lot of very good information on robocalls and voice SPAM.1

Understanding Voice SPAM

Voice SPAM, in this context, refers to bulk, automatically generated, unsolicited calls. Voice SPAM is similar to traditional telemarketing, but occurs at a much higher frequency. Traditional telemarketing is certainly annoying and is often at least partially automated. Telemarketers often employ “auto-dialers,” which dial numbers trying to find a human who will answer the phone. When a human answers and is identified, the call is transferred to another human, who begins the sales pitch. These auto-dialers are pretty good at differentiating a human voice from an answering machine or voicemail system. Some telemarketers use automated messages, but considering the traditional cost of making calls, most use humans to do the talking. Because each call traditionally cost money, telemarketers could not afford to make enormous numbers of calls. This is in contrast to sending email messages, which costs virtually nothing. Making large numbers of calls used to be expensive for the following reasons:

• You needed a PBX, sized to the number of concurrent calls you wanted to make. You needed the PBX itself, some number of T1 access cards, and auto-dialing software (it really wasn’t practical to have humans making the calls). You also needed some number of phones for the humans taking the calls when a person answered. If you wanted to make 100 concurrent calls and had 10 phones available, an estimate for the equipment was $25,000.

• You needed expensive circuit-switched infrastructure to make a lot of concurrent calls. For example, if you wanted to generate 100 concurrent calls, you needed at least five T1s (which had 23 or 24 channels each). The cost of the T1 varied, but averaged around $500 per month.

• Long distance calls averaged around 2 cents a minute. Assuming you were making 100 concurrent long distance calls, the cost per minute was $2.00. Assuming you operated eight hours a day (a very conservative estimate), that would be 480 minutes, or about $1,000 per day (again assuming 100-percent utilization). Actual utilization would be lower, because many calls would not be answered.

• The other cost to consider was that of the humans who made the calls or picked them up when auto-dialing software determined that an actual person had answered the call. In traditional telemarketing, humans were considered essential, given the cost of calls and the desire to have an acceptable “hit” rate.

Keep in mind that only a small percentage of the calls made were actually answered by a human, and many went to voicemail. Assuming a 10-percent hit ratio and 10 available telemarketers, only 10 concurrent telemarketing calls could be handled at a time. This was arguably inefficient, considering the investment in equipment, T1 access, long distance charges, and personnel.
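The arithmetic behind these estimates can be checked with a short sketch. The function and parameter names below are our own, chosen only for illustration; the default values are the rough figures quoted above (100 concurrent calls, 23 or 24 channels per T1, about $500 per T1 per month, 2 cents a minute, eight hours a day, and a 10-percent hit ratio).

```python
import math


def legacy_calling_costs(concurrent_calls=100, channels_per_t1=24,
                         t1_monthly_usd=500, rate_cents_per_minute=2,
                         hours_per_day=8, hit_percent=10):
    """Rough cost model for circuit-switched calling (illustrative only)."""
    # Each T1 carries a fixed number of voice channels, so round up.
    t1_count = math.ceil(concurrent_calls / channels_per_t1)
    minutes_per_day = hours_per_day * 60
    # Work in whole cents to avoid floating-point rounding surprises.
    cents_per_minute = concurrent_calls * rate_cents_per_minute
    return {
        "t1_count": t1_count,
        "t1_cost_per_month_usd": t1_count * t1_monthly_usd,
        "minutes_per_day": minutes_per_day,
        "long_distance_per_minute_usd": cents_per_minute / 100,
        "long_distance_per_day_usd": cents_per_minute * minutes_per_day / 100,
        # Calls answered by a human at any one moment, given the hit ratio.
        "answered_calls_at_once": concurrent_calls * hit_percent // 100,
    }


if __name__ == "__main__":
    for name, value in legacy_calling_costs().items():
        print(f"{name}: {value}")
```

With the defaults, the sketch yields five T1s, $2.00 per minute, and $960 per day in long distance charges at full utilization, which is where the "about $1,000" figure comes from.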

Voice SPAM is really telemarketing on steroids. Voice SPAM occurs with a frequency similar to that of email SPAM. Telemarketing is annoying, but the rate of calls, at least compared to email SPAM, is very low. Compare the number of telemarketing calls you get on an average day to the number of email SPAM messages you get. Figure 8-2 provides a simple network diagram illustrating voice SPAM.


Figure 8-2 Voice SPAM

With UC, call-generation costs are greatly reduced, which is why voice SPAM resembles email SPAM more than traditional telemarketing. Due to the volume possible, the hit rate percentage can be a lot lower, thus eliminating the need for humans to make the calls. Voice SPAM will include a callback number. The spammer still needs humans to answer the inbound calls from the people who respond to the voice SPAM calls, but these are more likely to result in a sale than a “cold” outbound telemarketing call. Also, voice SPAM will often offer the victim a chance to “opt out” by pressing “1” or another input. This is a trick, and the victim should never respond, because doing so will only get them put on a list that results in even more voice SPAM.

As we have discussed in previous chapters, setting up a PBX and originating calls through SIP is very easy, and the cost is much lower than with circuit-switched equipment. A commercial PBX could be used, or the attacker could use a freeware system, such as Asterisk, and be up and running for about the cost of a decent server. Because the network access is SIP, expensive circuit-switched T1 access cards are not required. Generating calls through UC is very inexpensive, or even free if you can compromise an Internet-based SIP server. Commercial calling or robocalling services can also be used for this function. Eventually, we expect the cost of all calls to approach zero.

To emphasize a key point, although UC is used to generate voice SPAM, the target of the calls can be TDM, UC, or any combination. A typical target is a home phone, which is often analog or UC and provided by the cable company. Many of us have eliminated our home phones because we rely on our smartphones, but also because a high percentage of the calls to home phones are voice SPAM. Smartphones are also a growing target, with enterprise phones a target as well. Depending on the enterprise, we see from 3 to 5 percent of the inbound calls being some sort of nuisance call, most of which are voice SPAM. We expect this percentage to continue to rise.

We analyzed the inbound harassing and nuisance call traffic for several hundred enterprises and found that the majority of calls were voice SPAM, broken into various categories, including telemarketing, scams, political advertisements, and so on. Figure 8-3 provides a pie chart that shows the types of calls seen in this analysis.


Figure 8-3 Voice SPAM calls seen in the enterprises

Although some of us rely more on email than voice, for most users, voice is still the primary means of business communication. A phone call is more urgent, more interrupting, and much harder to ignore than an email. Many wise email users check their email at intervals, rather than letting it interrupt them whenever they receive a message. When the phone rings, however, most users answer or at least check to see who is calling. Most users don’t turn off their phone or put it in a “do not disturb” mode, as you can easily do with email or instant messaging. Because of this, a voice SPAM call immediately disturbs the user when the phone rings, even if the user simply takes their attention away from the work at hand to check the calling number. Because spoofing the calling number is so easy, many of these calls will show up with a legitimate-looking number and name and often trick the user into answering the call. With voice SPAM, it is conceivable that the phone will ring as often as the average user receives an email SPAM message. This is already occurring for residential phones, and likely by the time you are reading this book it will be happening for smartphones and enterprise phones. Even now, many enterprises are receiving a large amount of voice SPAM. Figure 8-4 shows a graph from one enterprise that was receiving an average of 150,000 voice SPAM calls per month.


Figure 8-4 Sample enterprise voice SPAM volume

Imagine this occurring in cubicle farms, where phones ring constantly. Even if the voice SPAM call is not for you, it is possible that all your surrounding cube mates will be constantly getting calls, thereby disturbing everyone in the office.

One of the biggest issues with voice SPAM is that you can’t analyze the call content before the phone rings. Current email SPAM filters do a passable job of blocking SPAM, but email has no requirement for real-time delivery of a message. The message, along with all its attachments, arrives and can reside on a server before it is delivered to the user. While there, the entire message is available to be reviewed to determine if it is SPAM. This is in contrast to voice SPAM, where the call arrives and you have no idea what its content is. It might be your spouse or yet another Viagra advertisement. Odds are that the calling number will be spoofed, so you won’t know whom the call is from or what it is about until you answer it.

Of course, calls that arrive when the user is not around will also go to voicemail. Listening to voice SPAM left in voicemail is better than listening to the call in real time, but it’s still an issue. Imagine coming in and having as many voicemail messages as you do email messages. At least with email, you can see the headers and bodies quickly in an email client such as Outlook, sort by recipient, eyeball email SPAM, and then delete it. Those users who access their voicemail through a phone will have a very difficult time listening to and deleting voice SPAM. They will have to step through each message, listen to a portion of the message, and delete those that are voice SPAM.

Those calls that are saved to voicemail can be converted to text and analyzed to determine whether they are voice SPAM. Those calls determined to be voice SPAM can then be deleted or moved to a “junk” mailbox, much like SPAM email. Unfortunately, keyword recognition software is far from perfect. Vocabulary systems are available, but they only recognize words in their vocabularies (which are admittedly large) and are susceptible to variances in word pronunciations, accents, and languages. A clever restatement of “Viagra,” although easily understandable to a human, could trick a vocabulary system. Large vocabulary systems are also computationally intensive and require quite a bit of horsepower to analyze calls. Other word-recognition technologies are available, including those based on phonemes. This technology breaks words into elemental phonemes, which represent the various sounds a human can utter. This technology handles accents and languages much better than large vocabulary systems. It is also less computationally intensive. The bad news, though, is neither of these approaches is perfect and their use will result in some number of false positives and negatives.
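To make the limits of keyword matching concrete, here is a minimal sketch of a transcript filter. It assumes the voicemail has already been converted to text by some speech-to-text system, which is not shown. The keyword list, the threshold, and the function names are all hypothetical and are not taken from any product.

```python
import re

# Hypothetical keyword list; a real filter would be far larger and tuned.
SPAM_KEYWORDS = {"viagra", "mortgage", "loan", "replica"}


def tokenize(transcript):
    """Lowercase the transcript and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", transcript.lower())


def spam_score(transcript, keywords=SPAM_KEYWORDS):
    """Return the fraction of words in the transcript that are keywords."""
    words = tokenize(transcript)
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in keywords)
    return hits / len(words)


def classify_voicemail(transcript, threshold=0.1):
    """File a message under 'junk' when its keyword density is high enough."""
    return "junk" if spam_score(transcript) >= threshold else "inbox"


if __name__ == "__main__":
    samples = [
        "Lower your mortgage rate today with a new loan",    # caught
        "Buy vee eye agra now",                               # false negative
        "Hi, this is your bank calling about your mortgage",  # false positive
    ]
    for text in samples:
        print(classify_voicemail(text), "<-", text)
```

The second sample shows how a restated product name slips past a fixed vocabulary, and the third shows a legitimate message landing in the junk mailbox. Both failure modes are the false negatives and false positives described above.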

The FTC Robocall Challenge

Because consumers are receiving so many calls on their residential lines, the Federal Trade Commission (FTC) has taken millions of complaints and is currently looking for solutions. The FTC sponsored a conference with government and industry experts to talk about the issue. You can find the information at www.consumer.ftc.gov/features/feature-0025-robocalls. The conference materials are excellent and a great read. The FTC also published great infographics on how robocalls work, which we have included in Figures 8-5 and 8-6.


Figure 8-5 How a robocall works (part 1)


Figure 8-6 How a robocall works (part 2)

The FTC even sponsored a contest to find the best solution to the robocall issue, with $50,000 as the first prize. The FTC received hundreds of ideas and ended up funding three of them. The winners proposed a variety of blacklists, whitelists, Turing tests, and other countermeasures, covered toward the end of the chapter. There are a number of good ideas here that are likely to find their way into future solutions.

Other Types of UC SPAM

Any large, open communications system is going to have some amount of SPAM. For example, individuals are now receiving unwanted text messages on their smartphones (and older feature phones). Automated calling services such as Call-Em-All can be used for text messages as well as voice messages.

Really, any communications system that offers a way to generate automated messages can be targeted for SPAM. Those of us who use social and professional networking sites such as Facebook, Instagram, Twitter, and LinkedIn also see some amount of SPAM and various unwanted requests.


In Chapter 7, we used the spitter tool to generate a TDoS attack directed against our own enterprise PRI. Now, we will use spitter to generate a voice SPAM attack against the same targets. Generation of voice SPAM is why we originally designed spitter. As we discussed, spitter is run on the same system as the Asterisk installation, and the tool will require some minor modifications to work correctly. Remember, you need to make sure Asterisk is installed and running correctly, spooling is enabled on Asterisk, the trunks are configured correctly for Asterisk, and you have a dial plan that’s appropriate for your targeted network, as we discussed in Chapter 7.

As you know, spitter works by reading an input file with information about the targeted numbers and produces “.call” files based on the input file’s content. The .call files are placed in the /tmp directory and then moved into Asterisk’s outgoing spool folder, /var/spool/asterisk/outgoing/. Asterisk monitors the outgoing directory for .call files and generates outbound calls based on them. The input file for spitter must contain at least one call record or else nothing will happen, and its size is limited only by the capacity of your storage media. Each .call file generated by spitter has a name in this form:


Each of the .call files will contain attributes that define how the call will be generated. Here are the contents from a .call file used in the upcoming voice SPAM attack:


As we discussed in Chapter 7, the Channel line describes which trunk the call is routed to and what destination number will be dialed. The CallerID line shows what number will be displayed to the target. The MaxRetries value shows how many times Asterisk will retry a call if the number is busy. The RetryTime value is the time to wait between call attempts. The WaitTime value is the number of seconds the system will wait for a call to be answered. The Context value tells Asterisk which item to use in the dial plan. Finally, the Set value allows you to set channel variables; in this case, it names the SPIT file to be played when the call is answered.
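For readers who want to inspect .call files, for example when auditing an Asterisk spool directory in a lab, the attributes described above can be read with a few lines of Python. This is a minimal sketch and not part of spitter. The sample text, trunk name, numbers, and variable name are hypothetical placeholders, not the values used in our attack. The sketch also assumes that lines beginning with "#" or ";" are comments.

```python
# Hypothetical sample using only the attributes described in the text.
SAMPLE_CALL_FILE = """\
Channel: SIP/example-trunk/15555550100
CallerID: "Example" <5555550199>
MaxRetries: 0
RetryTime: 60
WaitTime: 30
Context: autodialer
Set: SPITFILE=example_message
"""


def parse_call_file(text):
    """Split 'Key: value' lines into attributes and 'Set:' channel variables."""
    attributes = {}
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and (assumed) comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        key, separator, value = line.partition(":")
        if not separator:
            continue  # not a 'Key: value' line
        key, value = key.strip(), value.strip()
        if key.lower() == "set":
            # 'Set: NAME=value' defines a channel variable.
            name, _, variable_value = value.partition("=")
            variables[name.strip()] = variable_value.strip()
        else:
            attributes[key] = value
    return attributes, variables


if __name__ == "__main__":
    attrs, chan_vars = parse_call_file(SAMPLE_CALL_FILE)
    print(attrs)
    print(chan_vars)
```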

Remember, the dial plan must be modified such that it has the maximum effect on the targeted enterprise and should be based on how the enterprise answers calls. One of the best ways to determine how to set up your dial plan is to call the enterprise and listen to the IVR prompts and time them. This allows you to build a dial plan that’s the most effective against your target. Here is the “autodialer” dial plan, which we found on the voip-info.org website and modified for the targeted phone system:


You can see that this dial plan allows for the outbound call to connect to the trunk, wait three seconds for the called party to answer, wait five seconds after answering, play the attack.wav file, play the “goodbye” .wav file, and then hang up. We placed the autodialer dial plan in the extensions.conf configuration file.

Once we have our attack environment configured, we can construct the input file that spitter will use to generate the calls. Here is where a little creativity can help you to generate the attack. The input file is the roadmap for the attack that allows you to decide which numbers you are going to call, how often you are going to call them, which trunk you will use, and what .wav file to play with each call.

For demonstration purposes, we decided to call all of the engineers on our floor and play three different .wav files. We could have easily placed 100 calls to everyone with 100 different .wav files, but we do have to work with these people and shouldn’t annoy them too much. Here is a small portion of the input file, which we will call test_calls_file_engineering, showing two different numbers each being called three times:



Although the order of the records isn’t relevant, because Asterisk will simultaneously schedule a call for each .call file, it helps to keep them in a semblance of order to make sure you are calling all the numbers of the intended targets as many times as required. This example uses three different .wav files, but it could be one message for every call or hundreds of different messages. The complete input file will have three entries for each targeted DID number, one for each of the three voice SPAM .wav files.

Spitter has several command-line options. Some examples include the -t (or “test”) mode, which doesn’t require an Asterisk installation, the -l option, which is used to limit how many calls are placed in the outgoing directory, and the -h option, which prints the help file for the tool. Spitter comes with a thorough Readme file, which provides detailed descriptions of all the tool’s options.

Once we have set up our environment and prepared our input file, we can execute the attack. Because the intent of this attack is to create voice SPAM calls as opposed to creating TDoS, we will limit the number of simultaneous calls to three. Here is an example of the command line for spitter using the test call file provided and limiting the calls to three at a time:


The length of time the attack takes will vary depending on the attacking platform’s power, the SIP trunks routing the calls, and whether the calls are answered by a person (who will probably hang up) or answered by voicemail and allowed to record the messages in full (which will take longer).

Other Tools to Produce Voice SPAM


Another easy way to produce voice SPAM is to use the commercial services, such as Call-Em-All2 and others mentioned in Chapter 7. These services are purpose-built for automatically delivering voice messages. Although we initially mentioned them as possible ways to generate TDoS, they are actually better suited for voice SPAM. These are legitimate services and are probably most often used to deliver “legal” telemarketing calls, advertisements, political ads, etc., but they can also be used for voice SPAM. The disadvantage, though, is that they cost money and are not practical for large-scale voice SPAM campaigns. Plus, we are sure that these services would object to the abuse.

SIPp is a very robust tool that we use all the time for traffic generation and load testing. Although not as flexible as Asterisk, it can also be used to generate SIP-based calls. You can find the source code at sipp.sourceforge.net.3

If you Google “robodialers” or related terms, you will find other applications that can be used for voice SPAM.

In the original book, we referenced the TeleYapper tool. It does not look like this tool has been updated recently, but it’s still available. It is integrated with a SQL database where call groups can be defined and audio messages can be stored. It recognizes when a call is not answered and can reschedule the call for later attempts. It has many other nice features. At the time of this writing, you can find information about TeleYapper at the following website: http://nerdvittles.com/?p=701.4

Voice SPAM Countermeasures

Voice SPAM is a social issue that enterprises have limited ability to affect. Some solutions are the responsibility of the larger UC and SIP community. If the UC community does not work together to address voice SPAM before it is a big issue, enterprises will be forced to adopt “traditional” mitigation strategies, which are expected to be similar to those adopted for other voice security issues and/or email SPAM. Some of the countermeasures the UC community and enterprises can take are discussed here.

Legal Measures

You can complain to organizations such as the FTC when you receive voice SPAM. Go to their Contacts page for information on how to register a complaint, send information about offending emails, and put your number on the “National Do Not Call Registry.” If you receive a voice SPAM or other harassing call, it will help to call the FTC and at least pass on the number. The FTC receives many complaints about voice SPAM and is working on solutions to the issue. In at least one case, the FTC levied a heavy fine on a debt collector using robocalls and abusive collection practices. See “FTC Fines Debt Collector $3.2 Million for Harassment.”5

Even if some voice spammers ignore it, it is still a good idea to keep your numbers on the “National Do Not Call Registry.” As noted, you can do this through the FTC website. Fines are levied on voice spammers who make calls to users who register their numbers. Of course, this only affects otherwise legitimate callers; spammers who ignore the law are not deterred.

Other Ways to Identify Voice SPAM

The 800notes (www.800notes.com) website collects and tracks complaints against voice SPAM, scams, voice phishing, etc. This site tracks the offending numbers, the number of complaints, and information entered by the victims. If you receive a voice SPAM call, this is a good site to visit to record information about the call. This site also has some good articles and general information about scams. Figure 8-7 shows the 800notes website.


Figure 8-7 800notes website

Authenticated Identity

One of the keys to addressing voice SPAM is the ability to determine the identity of a caller. The caller’s identity is presented in the “From:” SIP header. Unfortunately, as we have shown, it is trivial to spoof this value.

If the true identity of a caller can be determined, certain simple countermeasures, such as employing blacklists and whitelists, can be much more effective. For identities to be ensured, all users within a SIP domain must be authenticated. RFC 3261 requires support for digest authentication. When coupled with the use of TLS between each SIP user agent and SIP proxy, digest authentication can be used to securely authenticate the user agent. Next, when this user agent sends a call to another domain, its identity can be asserted. This approach, although it enhances authentication, only provides hop-by-hop security. The model breaks down if any participating proxy does not support TLS and/or is not trusted.
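As a rough illustration of what digest authentication computes, the sketch below derives the response value using the MD5-based scheme that SIP borrows from HTTP (RFC 2617). The function names are our own, and the SIP values in the usage example are hypothetical. A real user agent would take the realm and nonce from the proxy's challenge and send the response in an Authorization or Proxy-Authorization header.

```python
import hashlib


def md5_hex(text):
    """Return the lowercase hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    nc=None, cnonce=None, qop=None):
    """Compute an RFC 2617 digest response (MD5 algorithm, optional qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # secret side
    ha2 = md5_hex(f"{method}:{uri}")                 # request side
    if qop:
        # With quality-of-protection, the nonce count and client nonce
        # are mixed in to resist replay.
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


if __name__ == "__main__":
    # Hypothetical SIP REGISTER challenge values, for illustration only.
    print(digest_response("alice", "example.com", "secret",
                          "REGISTER", "sip:example.com", "abc123nonce"))
```

The password itself never crosses the wire, only the hash. The exchange should still run over TLS, as noted above, because the hash authenticates the user agent to its proxy and does nothing to protect the rest of the signaling path.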

The P-Asserted-Identity header field, defined in RFC 3325, can be added to SIP INVITEs to assert the identity of the originating caller.6 This is a great concept, but it has not been adopted within the industry. For authenticated identity to work, it must be broadly implemented by enterprises, as well as service providers. It may not be realistic to expect this to happen. The Secure Telephone Identity Revisited (STIR) IETF working group has formed to look at a standard way to secure and authenticate the calling number.
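Where an upstream proxy does supply an asserted identity, a receiving system could compare it with the caller-supplied From header. The following is a minimal sketch under several assumptions: the message is available as plain text, headers are not folded or written in compact form, and only the user part of the URI matters. The sample INVITE and all names are hypothetical.

```python
import re

# Matches the user part of a sip:, sips:, or tel: URI.
URI_USER = re.compile(r"(?:sips?|tel):([^@;>\s]+)")


def header_value(message, name):
    """Return the value of the first header called `name`, or None."""
    for line in message.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip().lower() == name.lower():
            return value.strip()
    return None


def user_part(value):
    """Extract the user (number) portion of the URI in a header value."""
    if value is None:
        return None
    match = URI_USER.search(value)
    return match.group(1) if match else None


def identity_check(message):
    """Compare the claimed From identity with the asserted identity."""
    claimed = user_part(header_value(message, "From"))
    asserted = user_part(header_value(message, "P-Asserted-Identity"))
    if asserted is None:
        return "unverified"  # nothing upstream vouched for the caller
    return "consistent" if claimed == asserted else "mismatch"


SAMPLE_INVITE = (
    "INVITE sip:5555550123@enterprise.example SIP/2.0\n"
    'From: "Bank" <sip:5555550100@example.com>;tag=1\n'
    "P-Asserted-Identity: <sip:5555550177@carrier.example>\n"
)

if __name__ == "__main__":
    print(identity_check(SAMPLE_INVITE))
```

A "consistent" result is only as trustworthy as the proxy that inserted the header, which is why this approach depends on broad adoption across enterprises and service providers.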

Service Providers

Service providers do have some ability to mitigate voice SPAM. However, they receive tons of voice SPAM from other service providers and differentiating it from legitimate traffic is difficult. One can also ask if the service providers, who are in the business of delivering calls, really want to keep this traffic off of their networks. See “Why Aren’t Phone Companies Doing More to Block Robocalls?”7

Enterprise SPAM Filters

Enterprises are likely to address voice SPAM in a manner similar to email SPAM—namely, by deploying voice SPAM mitigation products. Companies such as SecureLogix (www.securelogix.com) and many of the Session Border Controller (SBC) companies offer such products and services. Some of the voice SPAM countermeasures a product might employ are described here:

Blacklists/Whitelists Blacklists are collections of addresses of known attackers. A call from a source on the blacklist is immediately disallowed. Blacklists are not effective with email, but can be of some use for voice SPAM. Well-defined, managed, and vetted blacklists can be used to reject calls from known voice spammers. As an example, SecureLogix maintains a National Harassing Caller blacklist, which is a list of numbers that have a reputation for generating voice SPAM. This list comes from a variety of vetted sources, including parts of the government. The list is broken into groups of numbers, some of which belong to known voice spammers, whose calls will automatically be blocked. Others are “grey listed” and are treated less aggressively, such as being redirected to an announcement.

Whitelists are collections of addresses that are known to be good and from whom a user is willing to accept calls. Whitelists require a way for a user to indicate that they want to receive calls from a new source. Once a user elects to receive calls from the source, the address is placed on a whitelist and subsequent communications are allowed. Attackers can’t change their addresses to get around whitelists. However, if they know an address on the whitelist, they can spoof it and make calls.
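A list-based screening decision might look like the following minimal sketch. It is not how any particular product works. The action names, the ten-digit normalization (which assumes North American numbers), and the choice to consult the whitelist first are all our own assumptions.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"        # caller is on the whitelist
    BLOCK = "block"        # caller is a known voice spammer
    ANNOUNCE = "announce"  # grey-listed: redirect to an announcement
    UNKNOWN = "unknown"    # no list entry; left to other countermeasures


def normalize(number):
    """Keep digits only and compare on the last ten (assumed NANP format)."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return digits[-10:]


def screen_call(caller, whitelist, blacklist, greylist):
    """Decide how to treat an inbound call from its calling number alone."""
    number = normalize(caller)
    if number in whitelist:
        return Action.ALLOW
    if number in blacklist:
        return Action.BLOCK
    if number in greylist:
        return Action.ANNOUNCE
    return Action.UNKNOWN


if __name__ == "__main__":
    decision = screen_call("+1 (555) 555-0100",
                           whitelist={"5555550100"},
                           blacklist={"5555550111"},
                           greylist={"5555550122"})
    print(decision.value)
```

Because the decision rests entirely on the presented calling number, a spammer who spoofs a whitelisted number is allowed through. This is the weakness noted above and the reason list-based screening works best alongside authenticated identity.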

Approval Systems An approval system works along with whitelists and blacklists. When a new caller attempts to place a call to a user, the user is provided with some sort of prompt to accept the attempt. The user can either accept or reject the request, thereby placing the caller on the blacklist if denied or the whitelist if approved. This approach may help some, but could also just flood a user with approval requests.

Audio Content Filtering As discussed previously, voice SPAM call content can’t be analyzed unless it has been saved to voicemail. Once it’s saved to voicemail, speech-to-text technologies (although not perfect) can be used to convert the audio to text that can be searched for voice SPAM content. Voicemail messages with voice SPAM content can be deleted or moved to a user’s junk mailbox.

Voice CAPTCHAs/Turing Tests CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) or Turing tests are challenges or puzzles that only a human can easily answer. A common example is the text message embedded in an image with background noise—most humans can see the text easily, but it is very difficult for a computer to do so.

Voice CAPTCHAs are similar. When a call comes in, the caller will be greeted with some sort of challenge. This may be as simple as a request to type in several DTMF codes, such as “Please type in the first three letters of the person’s name,” or it could be more complex, such as “Please state the name of the person you want to talk to.” The prompts could be stated in the presence of background noise. These tests are easy for a human to respond to, but difficult for a computer.

If the caller responds correctly to the CAPTCHA, the call will be sent through to the user. If the caller cannot meet the challenge, the call could be dropped, sent to the user’s voicemail, or sent directly to a junk voicemail box. The user could receive some sort of feedback, such as a distinctive sound on the phone, alerting them to possible voice SPAM.

Voice CAPTCHAs can be effective in addressing voice SPAM, but will have the side effect of irritating legitimate callers. This could be a major problem if, for some reason, the caller has to repeat the challenge multiple times. This might occur, for example, on a poor connection from a cell phone.

Voice CAPTCHAs are best used in conjunction with a policy and/or blacklists and whitelists, where they are only used for new or suspect callers.
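The simple DTMF challenge described above ("type in the first three letters of the person’s name") can be sketched as follows. The routing labels, the two-attempt limit, and the function names are hypothetical. The keypad mapping is the standard telephone letter layout.

```python
# Standard telephone keypad letter layout.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {letter: digit
                   for digit, letters in KEYPAD.items()
                   for letter in letters}

MAX_ATTEMPTS = 2  # hypothetical limit, to avoid irritating real callers


def expected_dtmf(name, length=3):
    """Return the keypad digits for the first `length` letters of a name."""
    letters = [ch for ch in name.upper() if ch in LETTER_TO_DIGIT]
    return "".join(LETTER_TO_DIGIT[ch] for ch in letters[:length])


def route_call(caller, called_name, whitelist, dtmf_attempts):
    """Challenge only callers who are not already on the whitelist."""
    if caller in whitelist:
        return "connect"
    answer = expected_dtmf(called_name)
    for attempt in dtmf_attempts[:MAX_ATTEMPTS]:
        if attempt == answer:
            return "connect"
    # Failed or skipped the challenge: keep it away from the user's phone.
    return "junk_voicemail"


if __name__ == "__main__":
    print(expected_dtmf("Smith"))
    print(route_call("5555550111", "Smith", set(), ["123", "764"]))
```

Skipping the challenge for whitelisted callers and capping the number of attempts reflects the trade-off discussed above: the test should stop automated callers without making known, legitimate callers repeat it.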


Voice SPAM refers to bulk, unsolicited, automatically generated calls. As more and more UC is deployed and enterprises use SIP to interconnect with one another through the public network, you can expect voice SPAM to become as common as email SPAM. When voice SPAM occurs, it is more difficult to address than email SPAM due to its real-time nature and the difficulty of converting speech to text for content analysis. Voice SPAM is easy to generate, and we provided a tool and instructions for doing so. Fortunately, countermeasures are possible, but they will require action and cooperation within the UC industry, as well as deployment of voice SPAM-mitigation products within enterprises.


1. Federal Trade Commission (FTC), www.consumer.ftc.gov/features/feature-0025-robocalls.

2. Call-Em-All, www.call-em-all.com.

3. SIPp, http://sipp.sourceforge.net.

4. Ward Mundy, “Its TeleYapper 5.0: The Ultimate RoboDialer for Asterisk,” http://nerdvittles.com/?p=701.

5. Jennifer Liberto, CNN Money, “FTC Fines Debt Collector $3.2 Million for Harassment,” http://money.cnn.com/2013/07/09/pf/ftc-debt-collector-fine/.

6. RFC 3325, P-Asserted Identity, www.rfc-editor.org/rfc/rfc3325.txt.

7. Herb Weisbaum, “Why Aren’t Phone Companies Doing More to Block Robocalls?” www.today.com/money/why-arent-phone-companies-doing-more-block-robocalls-6C10641251.