Threat Modeling: Designing for Security (2014)

Part III. Managing and Addressing Threats

Chapter 8. Defensive Tactics and Technologies

So far you've learned to model your software using diagrams and learned to find threats using STRIDE, attack trees, and attack libraries. The next step in the threat modeling process is to address every threat you've found.

When it works, the fastest and easiest way to address threats is through technology-level implementations of defensive patterns or features. This chapter covers the standard tactics and technologies that you will use to mitigate threats. These are often operating system or program features that you can configure, activate, apply, or otherwise rapidly engage to defend against one or more threats. Sometimes they involve additional code that is widely available and designed to plug in quickly. (For example, tunneling connections over SSH to add security is widely supported, and some unix packages even have options to make that easier.)

Because you likely found your threats via STRIDE, the bulk of this chapter is organized according to STRIDE. The main part of the chapter addresses STRIDE and privacy threats; patterns are covered only briefly, because most pattern collections already include information about how to address the threats they describe.

Tactics and Technologies for Mitigating Threats

The mitigation tactics and technologies in this chapter are organized by STRIDE because that's most likely how you found the threats. This section therefore covers ways to mitigate each of the STRIDE threats in turn; each subsection includes a brief recap of the threat, the tactics that can be brought to bear against it, and the techniques for accomplishing that by people with various skills and responsibilities. For example, if you're a developer who wants to add cryptographic authentication to address spoofing, the techniques you use are different from those used by a systems administrator. Each subsection ends with a list of specific technologies.

Authentication: Mitigating Spoofing

Spoofing threats against code come in a number of forms: faking the program on disk, squatting a port (IP, RPC, etc.), splicing a port, spoofing a remote machine, or faking the program in memory (related problems with libraries and dependencies are covered under tampering). In general, only programs running at the same or a lower level of trust are spoofable, and you should endeavor to trust only code running at a higher level of trust, such as in the OS.

There is also spoofing of people, of course, a big, complex subject covered in Chapter 14, “Accounts and Identity.” Mitigating spoofing threats often requires unusually tight integration between layers of systems. For example, a maintenance engineer from Acme, Inc. might want remote (or even local) access to your database. Is it enough to know that the person is an employee of Acme? Is it enough to know that he or she can initiate a connection from Acme's domain? You might reasonably want to create an account on your database to allow Joe Engineer to log in to it, but how do you bind that to Acme's employee database? When Joe leaves Acme and gets a job at Evil Geniuses for a Better Tomorrow, what causes his access to your database to go away?

Note

Authentication and authorization are related concepts, and sometimes confused. Knowing that someone really is Adam Shostack should not authorize a bank to take money from my account (there are several people of that name in the U.S.). Addressing authorization is covered in the section “Authorization: Mitigating Elevation of Privilege” later in this chapter.

From here, let's dig into the specific ways in which you can ensure authentication is done well.

Tactics for Authentication

You can authenticate a remote machine either with or without cryptographic trust mechanisms. Authenticating without cryptography means verifying an IP address or “classic” DNS entry, and all such noncryptographic methods are unreliable. Before cryptographic mechanisms were widely deployed, there were attempts to make hostnames more reliable, such as the double-reverse DNS lookup. At the time, this was sometimes the best tactic for authentication. Today, you can do better, and there's rarely an excuse for doing worse. (SNMP may be an excuse, and very small devices may be another.) As mentioned earlier, authenticating a person is a complex subject, covered in Chapter 14. Authenticating on-system entities is somewhat operating system dependent.

Whatever the underlying technical mechanisms are, at some point cryptographic keys are being managed to ensure that there's a correspondence between technical names and the names people use. That validation cannot be delegated entirely to machines. You can choose to delegate it to one of the many companies that assert they validate these things. These companies often do business as “PKI” or “public key infrastructure” companies, and are often referred to as “certificate authorities” or “CAs.” You should be careful about relying on that delegation for any transaction valued at more than what the company will accept for liability. (In most cases, certificate authorities limit their liability to nothing.) Why you should assign it a higher value is a question their marketing departments hope will not be asked, but the answer roughly boils down to convenience, limited alternatives, and accepted business practice.

Developer Ways to Address Spoofing

Within an operating system, you should aim to use full and canonical path names for libraries, pipes, and so on to help mitigate spoofing. If you are relying on something being protected by the operating system, ensure that the permissions do what you expect. (In particular, unix files in /tmp are generally unreliable, and Windows historically has had similarly shared directories.) For networked systems in a single trust domain, using operating system mechanisms such as Active Directory or LDAP makes sense. If the system spans multiple trust domains, you might use persistence or a PKI. If the domains change only rarely, it may be appropriate to manually cross-validate keys, or to use a contract to specify who owns what risks.
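As an illustration of the canonical-path tactic, here is a minimal Python sketch. The trusted directory name is hypothetical, and a real loader would also verify ownership and permissions on the resolved file:

    import os

    def open_trusted(path, trusted_dir="/usr/lib/myapp"):
        # Resolve symlinks and ".." so an attacker can't substitute a
        # look-alike path; trusted_dir is a hypothetical install location.
        real = os.path.realpath(path)
        if not real.startswith(trusted_dir.rstrip("/") + "/"):
            raise ValueError("refusing %s (resolves to %s)" % (path, real))
        return open(real, "rb")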

You can also use cryptographic ways to address spoofing, and these are covered in Chapter 16, “Threats to Cryptosystems.” Essentially, you tie a key to a person, and then work to authenticate that the key is correctly associated with the person who's connecting or authenticating.

Operational Ways to Address Spoofing

Once a system is built, a systems administrator has limited options for improving spoofing defenses. To the extent that the system is internal, pressure can be brought to bear on system developers to improve authentication. It may also be possible to use DNSSEC, SSH, or SSL tunneling to add or improve authentication. Some network providers will filter outbound traffic to make spoofing harder. That's helpful, but you cannot rely on it.

Authentication Technologies

Technologies for authenticating computers (or computer accounts) include the following:

§ IPSec

§ DNSSEC

§ SSH host keys

§ Kerberos authentication

§ HTTP Digest or Basic authentication

§ “Windows authentication” (NTLM)

§ PKI systems, such as SSL or TLS with certificates

Technologies for authenticating bits (files, messages, etc.) include the following:

§ Digital signatures

§ Hashes

Methods for authenticating people can involve any of the following:

§ Something you know, such as a password

§ Something you have, such as an access card

§ Something you are, such as a biometric, including photographs

§ Someone you know who can authenticate you

Technologies for maintaining authentication across connections include the following:

§ Cookies

Maintaining authentication across connections is a common issue as you integrate systems. The cookie pattern has flaws, but generally, it has fewer flaws than re-authenticating with passwords.

Integrity: Mitigating Tampering

Tampering threats come in several flavors, including tampering with bits on disk, bits on a network, and bits in memory. Of course, no one is limited to tampering with a single bit at a time.

Tactics for Integrity

There are three main ways to address tampering threats: relying on system defenses such as permissions, using cryptographic mechanisms, and using logging technology and audit activities as a deterrent.

Permission mechanisms can protect things that are within their scope of control, such as files on disk, data in a database, or paths within a web server. Examples of such permissions include ACLs on Windows, unix file permissions, or .htaccess files on a web server.

There are two main cryptographic primitives for integrity: hashes and signatures. A hash takes an input of some arbitrary length, and produces a fixed-length digest or hash of the input. Ideally, any change to the input completely transforms the output. If you store a protected hash of a digital object, you can later detect tampering. Actually, anyone with that hash can detect tampering, so, for example, many software projects list a hash of the software on their website. Anyone who gets the bits from any source can rely on them being the bits described on the project website, to a level of security based on the security of the hash and the operation of the website.

A signature is a cryptographic operation with a private key and a hash that does much the same thing. It has the advantage that once someone has obtained the right public key, they can validate a lot of hashes. Hashes can also be used in binary trees of various forms, where large sets of hashes are collected together and signed. This can enable, for example, inserting data into a tree and noting the time in a way that's hard to alter. There are also systems for using hashes and signatures to detect changes to a file system. The first was co-invented by Gene Kim, and later commercialized by Tripwire, Inc. (Kim, 1994).
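As a concrete sketch of the hash tactic, the following Python fragment computes a file's SHA-256 digest and compares it against a published value. The filename and digest here are placeholders, not a real project's:

    import hashlib

    def sha256_file(path):
        # Hash the file in chunks so large downloads don't fill memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    # Placeholder standing in for the digest on the project website.
    published = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    if sha256_file("download.tar.gz") != published:
        raise SystemExit("tampering or corruption detected")

Anyone who has the published digest can run the same check, which is what makes this pattern useful for software distribution.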

Logging technology is a weak third in this list. If you log how files change, you may be able to recover from integrity failures.

Implementing Integrity

If you're implementing a permission system, you should ensure that there's a single permissions kernel, also called a reference monitor. That reference monitor should be the one place that checks all permissions for everything. This has two main advantages. First, you have a single monitor, so there are no bugs, synchronization failures, or other issues that vary with which code path made the check. Second, you only have to fix bugs in one place.

Creating a good reference monitor is a fairly intricate bit of work. It's hard to get right, and easy to get wrong. For example, it's easy to run checks on references (such as symlinks) that can change when the code finally opens the file. If you need to implement a reference monitor, perform a literature review first.
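To make the symlink race concrete, here is a hedged Python sketch of one common unix mitigation: open first, refusing to follow a final symlink, and then inspect the descriptor you actually obtained rather than re-checking the path:

    import os, stat

    def open_no_symlink(path):
        # stat()-then-open() is racy: the file can be swapped for a
        # symlink between the check and the open. O_NOFOLLOW (unix-only)
        # makes the open itself refuse a final-component symlink.
        fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
        st = os.fstat(fd)  # inspect what was actually opened
        if not stat.S_ISREG(st.st_mode):
            os.close(fd)
            raise ValueError("not a regular file")
        return os.fdopen(fd, "rb")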

If you're implementing a cryptographic defense, see Chapter 16. If you're implementing an auditing system, you need to ensure it is sufficiently performant that people will leave it on, that security successes and failures are both logged, and that there's a usable way to access the logs. You also need to ensure that the data is protected from attackers. Ideally, this involves moving it off the generating system to an isolated logging system.

Operational Assurance of Integrity

The most important element of assuring integrity is about process, not technology. Mechanisms for ensuring integrity only work to the extent that integrity failures generate operational exceptions or interruptions that are addressed by a person. All the cryptographic signatures in the world only help if someone investigates the failure, or if the user cannot or does not override the message about a failure. You can devote all your disk access operations to running checksums, but if no one investigates the alarms, they won't do any good. Some systems use “whitelists” of applications so only code on the whitelist runs. That reduces risk, but carries an operational cost.

It may be possible to use SSH or SSL tunneling or IPSec to address network tampering issues. Systems like Tripwire, OSSEC, or L5 can help with system integrity.

Integrity Technologies

Technologies for protecting files include:

§ ACLs or permissions

§ Digital signatures

§ Hashes

§ Windows Mandatory Integrity Control (MIC) feature

§ Unix immutable bits

Technologies for protecting network traffic include:

§ SSL

§ SSH

§ IPSec

§ Digital signatures

Non-Repudiation: Mitigating Repudiation

Repudiation is a somewhat different threat because it bridges into the business realm. There are four elements to addressing it: preventing fraudulent transactions, taking note of contested issues, investigating them, and responding to them. In an age when anyone can instantly be a publisher, it is foolish to assume you can ignore the possibility of a customer (or noncustomer) complaint or contested charge. Ensuring you can accept customer complaints and investigate them is outside the scope of this book, but the output from such a system provides a key validation that you have the right logs.

Note that repudiation is sometimes a feature. As Professor Ian Goldberg pointed out when introducing his Off-the-Record messaging protocol, signed conversations can be embarrassing, incriminating, or otherwise undesirable (Goldberg, 2008). Two features of the Off-the-Record (OTR) messaging system are that it's secure (encrypted and authenticated) and deniable. This duality of feature or threat also comes up in the LINDDUN approach to privacy threat modeling.

Tactics for Non-Repudiation

The technical elements of addressing repudiation are fraud prevention, logs, and cryptography. Fraud prevention is sometimes considered outside the scope of repudiation. It's included here because managing repudiation is easier if you have fewer contested transactions. Fraud prevention can be divided into fraud by internal actors (embezzlement and the like) and external fraud. Internal fraud prevention is a complex matter; for a full treatment see The Corporate Fraud Handbook (Wells, 2011). You should have good account management practices, including ensuring that your tools work well enough that people are not tempted or forced to share passwords as part of getting their jobs done. Be sure you keep logs, and audit the data in them.

Logs are the traditional technical core of addressing repudiation issues. What is logged depends on the transaction, but generally includes signatures or an IP address and all related information. There are also cryptographic ways to address repudiation, which are currently mostly used between larger businesses.

Tactics for Preventing Fraud by External Parties

External fraud prevention can be seen as a matter of preventing payment fraud and ensuring that your customers remain in control of their accounts. In both cases, the details of the state of the art change quickly, so talk to your peers. Even the most tight-lipped companies have been willing to have very frank discussions with peers under NDA.

In essence, stability is good. For example, someone who has been buying two romance novels a month from you for a decade and is still living at the same address is likely the person who just ordered another one. If that person suddenly moves to the other side of the world, and orders technical books in Slovakian with a new credit card with a billing address in the Philippines, you might have a problem. (Then again, they might have finally found true love, and you don't want to upset your loyal customers.)

Tools for Preventing Fraud by External Parties

In their annual report on online fraud, CyberSource includes a survey of popular fraud detection tools and their perceived effectiveness (CyberSource, 2013). Their 2013 survey includes a set of automated tools:

§ Validation services

§ Proprietary data/customer history

§ Multi-merchant data

§ Purchase device tracking

Validation services include card verification numbers (aka CVN/CVV), address verification services, postal address verification, Verified by Visa/MasterCard SecureCode, telephone number verification/reverse lookups, public records services, credit checks, and “out-of-wallet/in-wallet” verification services.

Proprietary data and customer history includes customer order history, in-house “negative lists” of problematic customers, “positive lists” of VIP or reliable customers, order velocity monitoring, company-specific fraud models (these are usually built with manual, statistical, or machine learning analyses of past fraudulent orders), and customer website behavioral analysis.

Multi-merchant data focuses on shared negative lists or multi-merchant purchase velocity analyzed by the merchant. (This analysis is nominally also performed by the card processors and clearing houses, so the additional value may be transient.)

Finally, purchase device tracking includes device “fingerprinting” and IP address geolocation. The CyberSource report also discusses the importance of tools to help manual review, and how a varied list is both very helpful and time consuming. Because manual review is one of the most expensive components of an anti-fraud approach to repudiation threats, it may be worth investing in tools to gather all the data into one (or at least fewer) places to improve analyst productivity.

Implementing Non-Repudiation

The two key tools for non-repudiation are logging and digital signatures. Digital signatures are probably most useful for business-to-business systems.

Log as much as you can keep for as long as you need to keep it. As the price of storage continues to fall, this advice becomes easier and easier to follow. For example, with a web transaction, you might log IP address, current geolocation of that address, and browser details. You might also consider services that either provide information on fraud or allow you to request decision advice. To the extent that these companies specialize, and may have broader visibility into fraud, this may be a good area of security to outsource. Some of the information you log or transfer may interact with your privacy policies, and it's important to check.

There are also cryptographic digital signatures. A digital signature should be distinguished from an electronic signature, which is a term of art under U.S. law referring to a variety of mechanisms with which to produce a signature, some as minimalistic as “press 1 to agree to these terms and conditions.” In contrast, a digital signature is a mathematical transformation that demonstrates irrefutably that someone in possession of a mathematical key took an action to cause a signature to be made. The strength of “irrefutably” here depends on the strength of the math, and the tricky bits are possession of the key and what human intent (if any) may have lain behind the signature.
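For illustration, here is a minimal signing sketch using the Python cryptography package with Ed25519 keys (an assumption; the record contents are hypothetical). Real deployments must also solve key distribution and storage, which is where the hard non-repudiation questions live:

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()

    record = b"order 1234: 2 books, ship to 10 Main St"  # hypothetical record
    signature = private_key.sign(record)  # requires possession of the key

    try:
        public_key.verify(signature, record)  # anyone can check this
    except InvalidSignature:
        print("record was altered, or signed with a different key")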

Operational Assurance of Non-Repudiation

When a customer or partner attempts to repudiate a transaction, someone needs to investigate it. If repudiation attempts are frequent, you may need dedicated people, and those people might require specialized tools.

Non-Repudiation Technologies

Technologies you can use to address repudiation include:

§ Logging

§ Log analysis tools

§ Secured log storage

§ Digital signatures

§ Secure time stamps

§ Trusted third parties

§ Hash trees

§ The anti-fraud tools mentioned in “Tools for Preventing Fraud by External Parties” earlier in this chapter

Confidentiality: Mitigating Information Disclosure

Information disclosure can happen with information at rest (in storage) or in motion (over a network). The information disclosed can range from the content of communication to the existence of an entity with which someone is communicating.

Tactics for Confidentiality

Much like with integrity, there are two main ways to prevent information disclosure: Within the confines of a system, you can use ACLs, and outside of it you must use cryptography.

If what must be protected is the content of the communication, then traditional cryptography will be sufficient. If you need to hide who is communicating with whom and how often, you'll need a system that protects that data, such as a cryptographic mix or onion network. If you must hide the fact that communication is taking place at all, steganography will be required.

Implementing Confidentiality

If your system can act as a reference monitor and control all access to the data, you can use a permissions system. Otherwise, you'll need to encrypt either the data or its “container.” The data might be a file on disk, a record in a database, or an e-mail message as it transits over the network. The container might be a file system, database, or network channel, such as all e-mail between two systems, or all packets between a web client and a web server.

In each cryptographic case, you have to consider who needs access to the keys for encrypting and decrypting the data. For file encryption, that might be as simple as asking the operating system to securely store the key for the user so that the user can get to it later. Also, note that encrypted data is not integrity controlled. The details can be complex and tricky, but consider a database of salaries, where the cells are encrypted. You don't need to know the CEO's salary to know that replacing your salary with it is likely a good thing (for you); and if there's no integrity control, replacing the encrypted value of your salary with the CEO's salary will do just fine.
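The salary example suggests authenticated encryption with each ciphertext bound to its row. Here is a hedged sketch with AES-GCM from the Python cryptography package; the employee ID (a hypothetical label) goes in as associated data, so the CEO's encrypted salary fails to decrypt in your row:

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=128)
    aead = AESGCM(key)

    def encrypt_cell(employee_id, salary):
        nonce = os.urandom(12)  # must be unique per encryption under a key
        # employee_id is authenticated but not encrypted, tying the
        # ciphertext to its row.
        return nonce, aead.encrypt(nonce, salary, employee_id)

    def decrypt_cell(employee_id, nonce, ciphertext):
        # Raises InvalidTag if the cell was altered or swapped into
        # another employee's row.
        return aead.decrypt(nonce, ciphertext, employee_id)

    nonce, ct = encrypt_cell(b"employee:42", b"65000")
    assert decrypt_cell(b"employee:42", nonce, ct) == b"65000"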

An important subset of information disclosure cases related to the storage of passwords or backup authentication mechanisms is considered in depth in Chapter 14.

Operational Assurance of Confidentiality

It may be possible to add ACLs to an already developed system, or to use chroot or similar sandboxes to restrict what it can access. On Windows, the addition of a SID to a program and an inherited deny ACL for that SID may help (or it may break things). It is usually possible to add a disk or file encryption layer to protect information at rest from disclosure. Disk crypto will work “by default” with all the usual caveats about how keys are managed. It works for adversarial custody of the machine, but not if the password is written down or otherwise stored with the machine. With regard to a network, it may be possible to use SSH or SSL tunneling or IPSec to address network information disclosure issues.

Confidentiality Technologies

Technologies for confidentiality include:

§ Protecting files:

§ ACLs/permissions

§ Encryption

§ Appropriate key management

§ Protecting network data:

§ Encryption

§ Appropriate key management

§ Protecting communication headers or the fact of communication:

§ Mix networks

§ Onion routing

§ Steganography

Note

In the preceding lists, “appropriate key management” is not quite a technology, but is so important that it's included.

Availability: Mitigating Denial of Service

Denial-of-service attacks work by exhausting some resource. Traditionally, those resources are CPU, memory (both RAM and hard drive space can be exhausted), and bandwidth. Denial-of-service attacks can also exhaust human availability. Consider trying to call the reservations line of a very exclusive restaurant—the French Laundry in Napa Valley books all its tables within 5 minutes of its phone lines opening each day (for a day 30 days in the future). The resource under contention is the phone lines, and in particular the people answering them.

Tactics for Availability

There are two forms of denial-of-service attacks: brute force and clever. Using the restaurant example, brute force involves bringing 100 people to a restaurant that can seat only 25. Clever attacks bring 20 people, each of whom makes an ever-escalating list of requests and changes, and runs the staff ragged. In the online world, brute force attacks on networks are somewhat common under the name DDoS (Distributed Denial of Service). They can also be carried out against CPU (for example, while(1) fork()) or disk. It's simple to construct a small zip file that will expand to whatever limit might be in place: the maximum size of a file or space on the file system. Recall that a zip file is structured to describe the contents of the real file as simply as possible, such as 65,535 0s. That three-byte description will expand to 64K, for a magnification effect of over 21,000—which is awfully cool if you're an attacker.
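You can see the asymmetry with a few lines of Python. The ratio here is smaller than the idealized three-byte figure, because zlib output carries format overhead, but the point stands:

    import zlib

    plain = b"\x00" * 65535            # 64K of zeroes, as in the text
    packed = zlib.compress(plain, 9)   # what an attacker would ship
    print(len(packed))                 # a few dozen bytes
    print(len(plain) // len(packed))   # magnification in the hundreds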

Clever denial-of-service attacks involve a small amount of work by an attacker that causes you to do a lot of work. For example, when connecting to an SSL v2 server, the client sends a client master key challenge, which is a random key encrypted such that the server does (relatively) expensive public key operations to decrypt it. The client does very little work compared to the server. This can be partially addressed in a variety of ways, most notably the Photuris key management protocol. The core of such protocols is proof that the client has done more work than the server, and the body of approaches is called proof of work. However, in a world of abundant bots and volunteers to run DDoS software for political causes, Ben Laurie and Richard Clayton have shown reasonably conclusively that “Proof-of-Work Proves Not to Work” (in a paper of that name [Laurie, 2004]).

A second important strategy for defending against denial-of-service attacks is to ensure your attacker can receive data from you. For example, defenses against SYN flooding attacks now take this form. In a SYN flood attack, a host receives a lot of connection attempts (TCP SYNchronize) and it needs to keep track of each one to set up new connections. By sending a slew of those, operating systems in the 1990s could be run out of memory in the fixed-size buffers allocated to track SYNs, and no new connections could be established. Modern TCP stacks calculate certain parts of their response to a SYN packet using some cryptography. They maintain no state for incoming packets, and use the cryptographic tools to validate that new connections are real (Rescorla, 2003).
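The following Python sketch shows the idea, not the real TCP algorithm (which also packs MSS bits into the cookie): the “cookie” is a keyed MAC over the connection 4-tuple and a coarse timestamp, so a returning ACK can be validated without the server having stored any state:

    import hashlib, hmac, os, time

    SECRET = os.urandom(16)  # server-side secret; rotate periodically

    def syn_cookie(src, sport, dst, dport, minute=None):
        if minute is None:
            minute = int(time.time()) // 60
        msg = ("%s:%d>%s:%d@%d" % (src, sport, dst, dport, minute)).encode()
        mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
        return int.from_bytes(mac[:4], "big")  # becomes the sequence number

    def ack_valid(src, sport, dst, dport, echoed):
        # Accept cookies minted this minute or the previous one.
        now = int(time.time()) // 60
        return any(echoed == syn_cookie(src, sport, dst, dport, m)
                   for m in (now, now - 1))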

Implementing Availability

If you're implementing a system, consider what resources an attacker might consume, and look for ways to limit those resources on a per-user basis. Understand that there are limits to what you can achieve when dealing with systems on the other side of a trust boundary, and some of the response needs to be operational. Ensure that the operators have such mechanisms.

Operational Assurance of Availability

Addressing brute force denial-of-service attacks is simple: Acquire more resources such that they don't run out, or apply limits so that one bad apple can't spoil things for others. For example, multi-user operating systems implement quota systems, and business ISPs may be able to filter traffic coming from certain sources.

Addressing clever attacks is generally in the realm of implementation, not operations.

Availability Technologies

Technologies for mitigating denial of service include:

§ ACLs

§ Filters

§ Quotas (rate limiting, thresholding, throttling)

§ High-availability design

§ Extra bandwidth (rate limiting, throttling)

§ Cloud services

Authorization: Mitigating Elevation of Privilege

Elevation of privilege threats are one category of unauthorized use, and the only one addressed in this section. The overall question of designing authorization systems fills other books.

Tactics for Authorization

As discussed in the section “Implementing Integrity,” having a reference monitor that can control access between objects is a precursor to avoiding several forms of a problem, including elevation of privilege. Limiting the attack surface makes the problem more tractable. For example, limiting the number of setuid programs limits the opportunity for a local user to become root. (Technically, programs can be setuid to something other than root, but generally those other accounts are also privileged.)

Each program should do a small number of things, and carefully manage its input, including user input, environment, and so on. Each should be sandboxed to the extent that the system supports it. Ensure that you have layers of defense, such that an anonymous Internet user can't elevate to administrator with a single bug. You can do this by having the code that listens on the network run as a limited user, as sketched below. An attacker who exploits a bug will not have complete run of the system. (If they're a normal user, they may well have easy access to many elevation paths, so lock down the account.)
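Here is a minimal Python sketch of that listener tactic, assuming a unix host and a hypothetical service account named “svc-web”:

    import os, pwd, socket

    def listen_as_limited_user(username="svc-web"):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind(("0.0.0.0", 80))  # binding port 80 requires privilege
        s.listen(16)
        # Drop privilege before parsing any attacker-supplied input.
        limited = pwd.getpwnam(username)
        os.setgid(limited.pw_gid)  # group first; setuid would block it
        os.setuid(limited.pw_uid)  # irreversible once uid is non-root
        return s  # a bug in request handling now yields a limited account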

The permission system needs to be comprehensible, both to administrators trying to check things and to people trying to set things. A permission system that's hard to use often results in people setting permissions incorrectly, technically enabling actions that policy and intent mean to forbid.

Implementing Authorization

Having limited the attack surface, you'll need to very carefully manage the input you accept at each point on the attack surface. Ensure that you know what you want to accept and how you're going to use that input. Reject anything that doesn't match, rather than trying to make a complete list of bad characters. Also, if you get a non-match, reject it, rather than try to clean it up.
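For example, a minimal allow-list validator in Python. The username rule is hypothetical; the point is that anything outside the defined pattern is rejected rather than cleaned up:

    import re

    # Accept exactly what you expect: a lowercase letter followed by
    # 2-31 lowercase letters, digits, or underscores.
    USERNAME_RE = re.compile(r"[a-z][a-z0-9_]{2,31}")

    def validate_username(raw):
        if USERNAME_RE.fullmatch(raw) is None:
            raise ValueError("invalid username")  # reject, don't repair
        return raw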

Operational Assurance of Authorization

Operational details, such as “we need to expose this to the Internet” can often lead to those deploying technology wanting to improve their defensive stance. This usually involves adding what can be referred to as defense in depth or layered defense. There are several ways to do this.

First, run as a normal or limited user, not as administrator/root. Technically that doesn't mitigate an elevation-of-privilege threat so much as limit the damage when one is exploited, and it's in line with the “principle of least privilege.” Each program should run as its own limited user. When unix made “nobody” the default account for services, the nobody account ended up with tremendous levels of authorization. Second, apply all the sandboxing you can.

Authorization Technologies

Technologies for improving authorization include:

§ ACLs

§ Group or role membership

§ Role based access control

§ Claims-based access control

§ Windows privileges (runas)

§ Unix sudo

§ Chroot, AppArmor, or other unix sandboxes

§ The “MOICE” Windows sandbox pattern

§ Input validation for a defined purpose

Note

MOICE is the “Microsoft Office Isolated Conversion Environment.” The name comes from the problem that led to the pattern being invented, but the approach can now be considered a pattern for sandboxing on Windows. For more on MOICE, see (LeBlanc, 2007).

Note

Many Windows privileges are functionally equivalent to administrator, and may not be as helpful as you desire. See (Margosis, 2006) for more details.

Tactic and Technology Traps

There are two places where it's easy to get pulled into wasting time when working through these technologies and tactics. The first distraction is risk management. The tactics and technologies in this chapter aren't the only ways to address threats, but they are the best place to start. When you can use them, they will be easier to implement and work better than more complex or nuanced risk management approaches. For example, if you can address a threat by changing a network endpoint to a local endpoint, there's no point to engaging in the more time consuming risk management approaches covered in the next chapter. The second distraction is trying to categorize threats. If you found a threat via brainstorming or just the free flow of ideas, don't let the organization of this chapter fool you into thinking you should try to categorize that threat. Instead, focus on finding the best way to address it. (Teams can spend longer in debate around categorization than it would take to implement the fix they identified—changing permissions on a file.)

Addressing Threats with Patterns

In A Pattern Language, architect Christopher Alexander and his colleagues introduced the concept of architectural patterns (Alexander, 1977). A pattern is a way for experts to capture and express solutions to recurring problems. Patterns have since been adapted to software. There are well-understood development patterns, such as the three-tier enterprise app.

Security patterns seem like a natural way to group tactics and technologies for addressing security problems, and to communicate about them as something larger. You can create and distribute patterns in a variety of ways, and this section discusses some of them. However, in practice, these patterns have not been popular. The reasons for this are not clear, and those investing in using patterns to address security problems would likely benefit from studying the factors that have limited their popularity.

Some of those factors might include engineers not knowing when to reach for such a text, or the presentation of security patterns as a distinct subset, apart from other patterns. At least one web patterns book (Van Duyne, 2007) includes a chapter on security patterns. Embedding security patterns where non-specialists are likely to find them seems like a good pattern.

Standard Deployments

In many larger organizations, an operations group will have a standard way to deploy systems, or possibly several standard ways, depending on the data's sensitivity. In these cases, the operations group can document what sorts of threats their standard deployment mitigates, and provide that document as part of their “on-boarding” process. For example, a standard data center at an organization might include defenses against DDoS, or state that “network information disclosure is an accepted risk for risk categories 1–3.”

Addressing CAPEC Threats

CAPEC (MITRE's Common Attack Pattern Enumeration and Classification) is primarily a collection of attack patterns, but most CAPEC threat patterns include defenses. This chapter has primarily organized threats according to STRIDE. If you are using CAPEC, each CAPEC pattern includes advice about how to address it in its “Solutions and Mitigations” section. The CAPEC website is the authoritative source for such data.

Mitigating Privacy Threats

There are essentially three ways to address privacy threats: Avoid collecting information (minimization), use crypto in various clever ways, and control how data is used (compliance and policy). Cryptography is a technology, while minimization and compliance are more tactics you can apply. Each requires effort to integrate into your design or implementation.

Minimization

Perhaps obviously, it is impossible to use information you don't have in a way that impacts someone's privacy. Therefore, minimizing your collection and retention of information reduces risk. Minimizing what you collect is far more reliable than attempting to use policy controls on the data. Of course, it also eliminates any utility that you can get from that data. As such, minimization is generally a business call regarding risk and reward. Over the past decade, with breach disclosure laws, the balance of factors related to decisions about the collection and retention of information has changed dramatically. Some legal scholars have gone so far as to compare personal data to toxic waste. Holly Towle, an attorney specializing in electronic commerce, offers 10 principles for handling toxic waste, or personally identifying information (PII). Each of these is addressed in depth in her article (Towle, 2009):

§ Do not touch it unless you have to.

§ If you have to touch it, learn how or whether to do so—mistakes can be fatal or at least seriously damaging.

§ Do not use normal methods to transport (transfer) it.

§ Attempt to crack the whip over contractors handling it.

§ Do not store some of it at all.

§ Store what you need but in a manner avoiding spills, and limit access.

§ Be alert for suspicious odors and other red flags.

§ Report spills to the relevant people and agencies.

§ Dispose of it only by special means.

§ Get ready to be sued or incur often unreasonable expenses no matter how much care you take.

Minimization is a conceptually simple way to address privacy. In practice, however, it can become complex and contentious. The value of collecting data is easy to see, and it's hard to know what you'll be unable to do if you don't collect it.

Cryptography

There are a variety of ways to use cryptographic techniques to address privacy concerns. The applicability of each is dependent on the threat model, in the sense of who you're worried about. Each of these techniques is the subject of a great deal of research, so rather than try to provide a full description of each technique and risk leaving out key details, the following sections explain where each is useful as a response to a threat.

Hashing or Encrypting Data

If your privacy concern is someone accidentally viewing data, or running simple database queries, it may help to encrypt the data. If you want a record that can only be accessed once someone has a specific string (such as an e-mail address or SSN), you can use a cryptographic hash of that data. For example, if you store hash(adam.shostack@example.com), then only someone who knows that e-mail address can look it up.
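A minimal sketch of that lookup pattern in Python (the normalization step is an assumption; see the warning below before using this on low-entropy data such as SSNs):

    import hashlib

    def lookup_key(email):
        # Store and index this digest instead of the address itself;
        # only someone who knows the address can recompute the key.
        normalized = email.strip().lower().encode("utf-8")
        return hashlib.sha256(normalized).hexdigest()

    record_id = lookup_key("adam.shostack@example.com")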

Warning

Simple hashing doesn't protect your data if the attacker is willing to build a “dictionary” and hash each term in the dictionary. For SSNs, that's only a billion hashes to run, which is cheap on modern hardware. Hashing is therefore not the right defense if your data is low entropy, either because the data is short strings, or because it's highly structured. In those cases, you'll likely want to encrypt, and use a unique key and initialization vector per plaintext.

Split-Key Systems

Splitting keys is useful when you're concerned about the threat of someone decrypting the data without authorization. It's possible to encrypt data with multiple keys, such that all or some fraction of the keys is needed to decrypt it. For example, if you store m = E_k1(E_k2(plaintext)), then to decrypt m, you need the party that holds k1 to decrypt m, and then send the result to whomever holds k2.

If you're worried about availability threats, there are split-key cryptographic systems in which the keys are mathematically related, and you only need k of n keys to get the plaintext out of the system. Such systems encrypt the data with n keys, related in ways that allow any k of them to decrypt the data. These are useful, for example, for backing up the master key of a system when that key may be needed a decade later. Such a scheme is used to back up the root keys for DNSSEC.
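As an illustration of k-of-n sharing, here is a toy Shamir secret-sharing sketch in Python. The field size is simplified, real systems share a key-encryption key rather than bulk data, and Python 3.8+ is assumed for the modular inverse:

    import secrets

    PRIME = 2**127 - 1  # toy field; must exceed any secret shared

    def make_shares(secret, k, n):
        # Random polynomial of degree k-1 whose constant term is the secret.
        coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
        f = lambda x: sum(c * pow(x, i, PRIME)
                          for i, c in enumerate(coeffs)) % PRIME
        return [(x, f(x)) for x in range(1, n + 1)]

    def recover(shares):
        # Lagrange interpolation at x = 0 recovers the constant term.
        total = 0
        for xi, yi in shares:
            num = den = 1
            for xj, _ in shares:
                if xj != xi:
                    num = num * -xj % PRIME
                    den = den * (xi - xj) % PRIME
            total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
        return total

    shares = make_shares(secret=123456789, k=3, n=5)
    assert recover(shares[:3]) == 123456789  # any 3 of the 5 suffice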

Private Information Retrieval

If the threat is a database owner watching a client's queries and learning from them, then a set of techniques called private information retrieval may be useful. These techniques are generally bandwidth intensive, because they retrieve far more information than is actually wanted in order to get at the data without revealing to the database owner which records are of interest.

Differential Privacy

When the threat is a database client running multiple queries to violate the database owner's privacy policies, differential privacy provides the database owner with a way to first measure how much information has been given out, and then stop answering queries that provide additional information. This does not mean that the database needs to stop answering queries. Many queries will not change, or differentiate, the amount of information that can be inferred, even after the database has reached a specified privacy limit.
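A hedged sketch of the core mechanism in Python: each count query gets Laplace noise scaled to its privacy cost, and the owner stops answering once a total budget is spent. Real systems need far more care, for example around floating-point behavior of the noise:

    import random

    class PrivateCounter:
        def __init__(self, total_epsilon=1.0):
            self.remaining = total_epsilon  # the privacy budget

        def count(self, rows, predicate, epsilon=0.1):
            if epsilon > self.remaining:
                raise RuntimeError("privacy budget exhausted")
            self.remaining -= epsilon
            true_count = sum(1 for r in rows if predicate(r))
            # A count changes by at most 1 per person (sensitivity 1), so
            # Laplace noise of scale 1/epsilon gives epsilon-DP; the
            # difference of two exponentials is Laplace-distributed.
            noise = random.expovariate(epsilon) - random.expovariate(epsilon)
            return true_count + noise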

Note

Differential privacy offers very strong protection for a very specific definition of privacy.

Mixes and Mix-Like Systems

A mix is a system for preventing traffic analysis and providing untraceability to message senders or recipients. That is, an observer should not be able to trace a message back to a person after it has been through a mix. Mixes work by maintaining a pool of messages, and now and then sending messages out. To avoid trusting a single mix, there may be a network of mixes operated by different parties.

There are two major modes in which mixes operate: interactive-time and batch. Interactive-time mixes can be used for scenarios like web browsing, but are less secure against traffic analysis. There are also interactive systems that do not mix traffic but aim to conceal its source and destination. Such interactive systems include Tor.

Blinding

Blinding helps defend against surveillance threats that use cryptographic keys as identifiers. For example, if Alice is worried that a certificate authority might track her vote, then she might want a voting registration system that can do deep checking to ensure that she is authorized to vote, providing her with an anonymous voting chit that can be used to prove her right to vote. Online, this can be done through the use of blinding.

Blinding is a cool math trick that can solve real problems. The math may look a little intimidating, but you can understand it with high school algebra. Think of signing as doing exponentiation modulo p. (Modulo is a remainder: 1 mod 12 is 1, and 14 mod 12 is 2, while 14 mod 10 is 4. The modulo math is needed for certain security properties.) Therefore, if s is the signature, a is Alice's key, and c is the CA's key, then a signature is s = a^c mod p. Normally, the CA calculates the signature, and sends s back to Alice. Now the CA knows s, and can use that knowledge. So how can the CA calculate s without knowing it? Because multiplication is commutative, the CA can calculate something related to s. Blinding works by Alice multiplying her key by some blinding factor (b) before sending the product of that multiplication (ab) to the CA. The CA then calculates s = (ab)^c mod p, and sends s to Alice. Alice then divides s by b, and s/b = a^c. So Alice now knows s/b, which appears for all the world like a signature on a, but the CA doesn't know that a is associated with Alice. The math shown here is a subset of what's needed to do this securely, with the goal of giving you an idea of how it works. Proper blinding requires that you deal with a plethora of mathematical threats, which are covered in a book such as (Ferguson, 2012).
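Here is a runnable toy of the trick in Python, using textbook RSA blinding with deliberately tiny numbers (never use such sizes in practice). Note one detail beyond the simplified story above: the blinding factor is raised to the public exponent before multiplying, which is what makes the unblinding division come out exactly:

    import math, random

    p, q = 61, 53
    n = p * q            # 3233; the public modulus
    e, d = 17, 413       # e*d = 1 mod lcm(p-1, q-1) = 780

    a = 42               # the value Alice wants signed
    while True:
        b = random.randrange(2, n)
        if math.gcd(b, n) == 1:
            break

    blinded = (a * pow(b, e, n)) % n   # Alice sends this; the CA can't see a
    s_blind = pow(blinded, d, n)       # the CA signs without learning a
    s = (s_blind * pow(b, -1, n)) % n  # Alice unblinds (Python 3.8+ inverse)

    assert s == pow(a, d, n)           # same signature as signing a directly
    assert pow(s, e, n) == a           # and anyone can verify it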

Compliance and Policy

To the extent that a business decision has been made to gather and store sensitive information about people, the organization needs to put controls around it. Those controls can either be policy or technical, and they can meet either your business need or regulatory needs, or both. These approaches are not as crisp as the tools and tactics you can apply to security problems, and to the extent that you can apply minimization or cryptography to privacy problems, it will be easier and more effective.

Policies

The first class of controls are organizational policies that specify who can do what with the information. From the technologist's perspective, these can be frustratingly vague statements, such as “only authorized people will be allowed access.” However, they are an important first step in setting requirements. From a statement like that you can derive a technical approach, such as “only a security group can access the data.” Then you'll need to ensure that the policy is enforced across a variety of information systems, and then you're at the level of tactics and technologies.

Regulatory Requirements

Personal data is subject to a long and complex list of privacy rules that differ from jurisdiction to jurisdiction. As with everything else covered in this book, this section on mitigating privacy threats is not intended to replace proper legal advice.

There's one other thing to be said about mitigating privacy threats to your organization. In many cases, organizations are required by law to collect and protect a set of information that they must treat as toxic waste, at great expense. It makes a great deal of privacy sense for organizations and their industry groups to argue against requirements to gather such data. That includes rolling back existing mandates and holding firm against new mandates to collect data that you'd prefer not to hold.

Summary

The best way to address threats is to use standard, well-tested features or products that add security against the threats you've identified. These tactics and technologies are available to address each of the STRIDE threats. There are tactics and technologies available to both developers and operations.

Authentication technologies mitigate spoofing threats. You can authenticate computers, bits, or people. Integrity technologies mitigate tampering threats. Generally, you want integrity protection for files and network connections.

Non-repudiation technologies mitigate repudiation, which can include fraud and other repudiations. Anti-fraud technologies include validation services, or use of customer history that's either local or shared by others. There are also a variety of cryptographic and operational measures you can take to increase assurance around your logs.

Information disclosure threats are addressed by confidentiality technologies. Those can be most easily applied to files or network connections; however, it can also be important to protect container data, such as filenames or the fact of communication. Preventing denial of service involves ensuring that code doesn't have arbitrary limits that prevent it from taking advantage of all the available resources. Preventing elevation of privilege generally works by first ensuring that the code is constrained by mechanisms such as ACLs, and then by more complex sandboxes.

Patterns are collections of tactics and technologies. They seem like a natural approach. For reasons which are unclear, they haven't really taken off, and those who are considering using them would be advised to understand why.

Mitigating privacy threats is best done by minimizing what you collect, and then applying cryptography; however, there are limits to the tactics and technologies available, and sometimes you must fall back to compliance tools or policy.

The issue of standard tactics and technologies not being applicable everywhere is not limited to privacy. In the next chapter, you'll learn about making structured tradeoffs between ways to address threats.