Cloud Computing Bible (2011)

Part I: Examining the Value Proposition

Chapter 2: Assessing the Value Proposition

IN THIS CHAPTER

Discovering the attributes that make cloud computing unique

Applying cloud computing when it is the best option

Measuring the costs associated with cloud computing systems

Learning about Service Level Agreements and Licensing

In this chapter, the various attributes of cloud computing that make it a unique service are described. These attributes—scalability, elasticity, low barrier to entry, and a utility type of delivery—completely change how applications are created, priced, and delivered. I describe the factors that have led to this new model of computing. Early adopters of these services are those enterprises that can best make use of these characteristics.

To get a sense of the value of cloud computing, this chapter compares it to on-premises systems. From this perspective, a number of benefits for cloud computing emerge, along with many obstacles. I describe these factors in some detail. In addition to the technological factors, I also discuss the behavioral considerations associated with cloud adoption.

Cloud computing is particularly valuable because it shifts capital expenditures into operating expenditures. This has the benefit of decoupling growth from cash on hand or from requiring access to capital. It also shifts risk away from an organization and onto the cloud provider.

This chapter describes how to begin to measure the costs of cloud computing and some of the tools that you can use to do so. The concept of optimization known as right-sizing is described, and cloud computing has some unique new capabilities in this area.

Service Level Agreements (SLAs) are an important aspect of cloud computing. They are essentially your working contract with any provider. Cloud computing is also having an impact on software licensing; although licensing practices are not entirely settled, I describe them in this chapter as well.

Measuring the Cloud's Value

Cloud computing presents new opportunities to users and developers because it is based on the paradigm of a shared multitenant utility. The ability to access pooled resources on a pay-as-you-go basis provides a number of system characteristics that completely alter the economics of information technology infrastructures and allows new types of access and business models for user applications.

Any application or process that benefits from economies of scale, commoditization of assets, and conformance to programming standards benefits from the application of cloud computing. Any application or process that requires a completely customized solution, imposes a high degree of specialization, and requires access to proprietary technology is going to expose the limits of cloud computing rather quickly. Applications that work with cloud computing are ones that I refer to as “low touch” applications; they tend to be applications that have low margins and usually low risk. The “high touch” applications that come with high margins require committed resources and pose more of a risk; those applications are best done on-premises.

A cloud is defined as the combination of the infrastructure of a datacenter with the ability to provision hardware and software. A service that concentrates on hardware follows the Infrastructure as a Service (IaaS) model, which is a good description for the Amazon Web Services described in Chapter 9. When the provider also supplies a managed software stack, such as an operating system, middleware, and a runtime on which clients deploy their own applications, the model shifts to the Platform as a Service (PaaS) model. Microsoft's Windows Azure Platform, discussed in Chapter 10, and the Google App Engine, discussed in Chapter 11, are best described as PaaS offerings. When the client consumes a complete hardware/software/application stack as a finished application, it is using the most refined and restrictive service model, called the Software as a Service (SaaS) model. The best example of a SaaS offering is probably the SalesForce.com CRM application. As the Windows Azure Platform matures, adding more access to Microsoft's server products, it is rapidly developing into a richer PaaS offering.

Cross-Ref

Chapter 4, “Understanding Services and Applications by Type,” describes a number of these XaaS service models. Cloud computing is in its wild and woolly frontier days, so it's best to take a few of the lesser-known acronyms with a grain of salt.

A cloud is an infrastructure that can be partitioned and provisioned, and resources are pooled and virtualized. If the cloud is available to the public on a pay-as-you-go basis, then the cloud is a public cloud, and the service is described as a utility. If the cloud is captive in an organization's infrastructure (network), it is referred to as a private cloud. When you mix public and private clouds together, you have a hybrid cloud. Any analysis of the potential of cloud computing must account for all these possibilities.

These are the unique characteristics of an ideal cloud computing model:

Scalability: You have access to unlimited computer resources as needed.

This feature obviates the need for advance capacity planning and provisioning. It also enables batch processing, which greatly speeds up processing-intensive applications.

Elasticity: You have the ability to right-size resources as required.

This feature allows you to optimize your system and capture all possible transactions.

Low barrier to entry: You can gain access to systems for a small investment.

This feature offers access to global resources to small ventures and provides the ability to experiment with little risk.

Utility: A pay-as-you-go model matches resources to need on an ongoing basis.

This eliminates waste and has the added benefit of shifting risk from the client to the provider.

It is the construction of large datacenters running commodity hardware that has enabled cloud computing to gain traction. These datacenters gain access to low-cost electricity, high-bandwidth network pipes, and low-cost commodity hardware and software, which, taken together, represent an economy of scale that allows cloud providers to amortize their investment and retain a profit. It has been estimated that it costs around $100 million to create a datacenter with sufficient scale to be a cloud provider. At this scale, resource costs for a large datacenter have been estimated to be 20 to 35 percent lower than the pricing offered to medium-sized datacenters.

Note

Members of the UC Berkeley Reliable Adaptive Distributed Systems (RAD) Laboratory have published a white paper summarizing the benefits of cloud computing called “Above the Clouds: A Berkeley View of Cloud Computing,” which can be found at http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf. The lab was funded by contributions from the major cloud providers and from the NSF, but it's a useful source of analytical data.

The virtualization of pooled resources—processors or compute engines, storage, and network connectivity—optimizes these investments and allows the cloud provider to pass along these economies to customers. Pooling also blurs the differences between a small deployment and a large one because scale becomes tied only to demand.

Companies become cloud computing providers for several reasons:

Profit: The economies of scale can make this a profitable business.

Optimization: The infrastructure already exists and isn't fully utilized.

This was certainly the case for Amazon Web Services.

Strategic: A cloud computing platform extends the company's products and defends their franchise.

This is the case for Microsoft's Windows Azure Platform.

Extension: A branded cloud computing platform can extend customer relationships by offering additional service options.

This is the case with IBM Global Services and the various IBM cloud services.

Presence: Establish a presence in a market before a large competitor can emerge.

Google App Engine allows a developer to scale an application immediately. For Google, its office applications can be rolled out quickly and to large audiences.

Platform: A cloud computing provider can become a hub at the center of many ISVs' (Independent Software Vendors') offerings.

The customer relationship management provider SalesForce.com has a development platform called Force.com that is a PaaS offering.

The development of cloud computing has been likened to the situation that has faced hardware companies that rely on proprietary silicon to produce their products: the AMDs, nVidias, eVGAs, and Apples of the world. Because a semiconductor fabrication facility costs several billion dollars to create, these companies were at a severe disadvantage to companies such as Intel or NEC, which could build their own fabs or fabrication facilities. (A fab is a facility that is a self-contained semiconductor assembly line.) Companies such as TSMC (Taiwan Semiconductor Manufacturing Company) have come along that provide fabrication based on customer designs, spreading their risk and optimizing their operation. Cloud computing is much the same.

Early adopters and new applications

Cloud computing is still in its infancy, but trends in adoption are already evident. In his white paper “Realizing the Value Proposition of Cloud Computing: CIO's Enterprise IT Strategy for the Cloud,” Jitendra Pal Thethi, a Principal Architect for Infosys' Microsoft Technology Group, lists the following application and project types as the Top 10 adopters of cloud computing:

1. Messaging and team collaboration applications

2. Cross-enterprise integration projects

3. Infrastructure consolidation, server, and desktop virtualization efforts

4. Web 2.0 and social strategy companies

5. Web content delivery services

6. Data analytics and computation

7. Mobility applications for the enterprise

8. CRM applications

9. Experimental deployments, test bed labs, and development efforts

10. Backup and archival storage

You can download Thethi's paper from: http://www.infosys.com/cloud-computing/white-papers/Documents/realizing-value-proposition.pdf.

As a group, early adopters are characterized by their need for ubiquity and access to large data sets.

Around 2000-2001, some companies began using the Internet to stage various types of user-facing applications such as office suites, accounting packages, games, and so forth. The first attempts by large ISPs to create utility computing date to that period. By 2005-2006, several Internet sites had become sufficiently large that they had developed extensive infrastructure for their own operations. The excess capacity at these sites began to be offered to partners and eventually to the general public. The cloud infrastructure market proved to be profitable, and by 2007-2008 many more vendors had become cloud providers.

The nature of cloud computing should provide us with new classes of applications, some of which are currently emerging. Because Wide Area Network (WAN) bandwidth is one of the current bottlenecks for distributed computing, one of the major areas of interest in cloud computing is in establishing content delivery networks (CDNs). These solutions are also called edge networks because they cache content at locations geographically close to the users who request it.

Due to its scalability, cloud computing provides a means to do high-performance parallel batch processing that wasn't available to many organizations before. If a company must perform a complex data analysis that might take a server a month to do, then with cloud computing you might launch 100 virtual machine instances and complete the analysis in around 8 hours for the same cost. Processor-intensive applications that users currently perform on their desktops such as mathematical simulations in Mathematica and Matlab, graphic rendering in Renderman, and long encoding/decoding tasks are other examples of applications that could benefit from parallel batch processing and be done directly from the desktop. The economics must work out, but this approach is a completely new one for most people and is a game changer.
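To make the arithmetic concrete, the following minimal Python sketch compares the two approaches. The hourly rate and the 720-hour server-month are illustrative assumptions, not any particular provider's prices:

# Hypothetical figures: a job that would occupy one server for about a month
# (~720 hours) is instead split across 100 short-lived cloud instances.
HOURLY_RATE = 0.10          # assumed cost per machine instance-hour (USD)
SERVER_MONTH_HOURS = 720    # one server running for roughly 30 days

def batch_cost(instances: int, hours_per_instance: float) -> float:
    """Total cost of running the given number of instances in parallel."""
    return instances * hours_per_instance * HOURLY_RATE

one_server = batch_cost(1, SERVER_MONTH_HOURS)            # 1 server for 720 hours
hundred    = batch_cost(100, SERVER_MONTH_HOURS / 100)    # 100 servers for 7.2 hours

print(one_server, hundred)   # both 72.0: the same spend, but roughly 100x faster

The total spend is the same under these assumptions; only the elapsed time changes, which is what makes on-demand parallelism so attractive.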

The relative ubiquity of cloud computing systems also enables emerging classes of interactive mobile applications. The large array of sensors, diagnostic devices, and mobile devices, all of which both generate and consume data, requires large data sets and on-demand processing, which are a good fit for the cloud computing model. Cloud computing also can provide access to multiple data sets that can support layered forms of information, the kind of information you get when you view a mashup, such as the Panoramio photo layer displayed in Google Earth.

Note

A mashup is an application or Web page that combines data from two or more sources. Ajax (Asynchronous JavaScript and XML) is often used to create mashups.

The laws of cloudonomics

Joe Weinman of AT&T Global Services has concisely stated the advantages that cloud computing offers over a private or captive system. His article appeared on Gigaom.com at: http://gigaom.com/2008/09/07/the-10-laws-of-cloudonomics/. A summary of Weinman's “10 Laws of Cloudonomics” and his interpretation of them follows:

1. Utility services cost less even though they cost more.

Utilities charge a premium for their services, but customers save money by not paying for services that they aren't using.

2. On-demand trumps forecasting.

The ability to provision and tear down resources (de-provision) captures revenue and lowers costs.

3. The peak of the sum is never greater than the sum of the peaks.

A cloud can deploy less capacity because the peaks of individual tenants in a shared system are averaged over time by the group of tenants.

4. Aggregate demand is smoother than individual.

Multi-tenancy also tends to average out the variability intrinsic in individual demand, because the coefficient of variation of aggregate demand is smaller than that of the individual demands. With more predictable demand and less variation, clouds can run at higher utilization rates than captive systems. This allows cloud systems to operate at higher efficiencies and lower costs.

5. Average unit costs are reduced by distributing fixed costs over more units of output.

Cloud vendors have a size that allows them to purchase resources at significantly reduced prices. (This feature was described in the previous section.)

6. Superiority in numbers is the most important factor in the result of a combat (Clausewitz).

Weinman argues that a large cloud's size has the ability to repel botnets and DDoS attacks better than smaller systems do.

7. Space-time is a continuum (Einstein/Minkowski).

Because tasks can be run in the cloud using parallel processing, businesses can respond to conditions in near real time and accelerate decision making, providing a measurable advantage.

8. Dispersion is the inverse square of latency.

Latency is the delay in getting a response to a request. Reducing it requires the large-scale, multi-site deployments that are characteristic of cloud providers: cutting latency in half requires four times the number of nodes in a system.

9. Don't put all your eggs in one basket.

The reliability of a system with n redundant components, each with a reliability of r, is 1 - (1 - r)^n. Therefore, when a single datacenter achieves a reliability of 99 percent, two redundant datacenters have a reliability of 99.99 percent (four nines) and three redundant datacenters achieve a reliability of 99.9999 percent (six nines); a short sketch of this calculation appears after this list. Large cloud providers with geographically dispersed sites worldwide therefore achieve reliability rates that are hard for private systems to match.

10. An object at rest tends to stay at rest (Newton).

Private datacenters tend to be located in places where the company or unit was founded or acquired. Cloud providers can site their datacenters in what are called greenfield sites. A greenfield site is an undeveloped location chosen for its favorable operating characteristics: it sits on a network backbone, has cheap access to power and cooling, land is inexpensive, and the environmental impact is low. A network backbone is a very high-capacity network connection. On the Internet, an Internet backbone consists of the high-capacity routes and routers that are typically operated by an individual service provider such as a government or commercial entity. You can access a jump page of Internet backbone maps at: http://www.nthelp.com/maps.htm.
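The redundancy calculation in law 9 is easy to verify. Here is a minimal Python sketch using the reliability figures quoted in that item:

def system_reliability(r: float, n: int) -> float:
    """Reliability of n redundant components, each with reliability r: 1 - (1 - r)^n."""
    return 1 - (1 - r) ** n

for n in (1, 2, 3):
    print(n, round(system_reliability(0.99, n), 6))
# 1 -> 0.99, 2 -> 0.9999 (four nines), 3 -> 0.999999 (six nines)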

Cloud computing obstacles

Cloud computing isn't a panacea; nor is it either practical or economically sensible for many computer applications that you encounter. In practice, cloud computing can deviate from the ideals described in the previous list in many significant ways. The illusion of scalability is bounded by the limitations cloud providers place on their clients. Resource limits are exposed at peak conditions of the utility itself. As we all know, power utilities suffer brownouts and outages when the temperature soars, and cloud computing providers are no different. You see these outages on peak computing days such as Cyber Monday, the Monday after Thanksgiving in the United States when online Christmas sales traditionally start.

The illusion of low barrier to entry may be pierced by an inconsistent pricing scheme that makes scaling more expensive than it should be. You can see this limit in the nonlinearity of pricing associated with “extra large” machine instances versus their “standard” size counterparts. Additionally, the low barrier to entry also can be accompanied by a low barrier to provisioning. If you make a provisioning error, it can lead to vast costs.

Cloud computing vendors run very reliable networks. Often, cloud data is load-balanced between virtual systems and replicated between sites. However, even cloud providers experience outages. In the cloud, it is common to have various resources, such as machine instances, fail. Except for tightly managed PaaS cloud providers, the burden of resource management is still in the hands of the user, but the user is often provided with limited or immature management tools to address these issues.

Table 2.1 summarizes the various obstacles and challenges that cloud computing faces. These issues are described in various chapters in this book.

TABLE 2.1

Challenges and Obstacles to Cloud Computing

Each row lists the subject area, its character in a captive (private) system versus in the cloud, and the associated challenge.

Accounting Management (Captive: Chargeback or Licensed; Cloud: Usage): In private systems, costs associated with operations are fixed due to licenses and must be charged back to accounts based on some formula or usage model. For cloud computing, the pay-as-you-go usage model allows costs to be applied to individual accounts directly.

Compliance (Captive: Policy-based; Cloud: Proprietary): Compliance with laws and policies varies by geographical area. This requires that the cloud accommodate multiple compliance regimes.

Data Privacy (Captive: Bounded; Cloud: Shared with cloud): To ensure data privacy in the cloud, additional security methods such as private encryption, VLANs, firewalls, and local storage of sensitive data are necessary.

Monitoring (Captive: Variable but under control; Cloud: Limited): For private systems, any monitoring system the organization wishes to deploy can be brought to bear. Cloud computing models often have limited monitoring because the monitoring is vendor-defined.

Network Bottlenecks (Captive: Low; Cloud: High): Network bottlenecks occur when large data sets must be transferred, as is the case for staging, replication, and other operations. On-premises operations use LANs that are better able to accommodate such transfers than the WAN connections used in cloud computing.

Reputation (Captive: Individual; Cloud: Shared): The reputation of a cloud service for quality is shared by all of its tenants, and an outage at the cloud provider affects every one of them. Clouds, however, often have higher reliability than private systems.

Security (Captive: Restricted; Cloud: Federated): The different trust mechanisms require that applications be structured differently and that operations be modified to account for these differences.

Service Level Agreements (SLAs) (Captive: Customized; Cloud: Cloud-specific): Cloud SLAs are standardized in order to appeal to the majority of the provider's audience. Custom SLAs that allow for multiple data sources are difficult to obtain or enforce. Cloud SLAs do not generally offer industry-standard chargeback rates, and negotiations with large cloud providers can be difficult for small users. Business risks that aren't covered by a cloud SLA must be taken into account.

Software Stack (Captive: Customized; Cloud: Commoditized): The cloud enforces standardization and lowers the ability of a system to be customized for need.

Storage (Captive: Scalable and high performance; Cloud: Scalable but low performance): Enterprise-class storage is under the control of an on-premises system and can support high-speed queries. In cloud computing, large data stores are possible, but they have low-bandwidth connections. High-speed local storage in the cloud tends to be expensive.

Vendor Lock-in (Captive: Varies by company; Cloud: Varies by platform): Vendor lock-in is a function of the particular enterprise and application in an on-premises deployment. For cloud providers, vendor lock-in generally increases as you move from IaaS toward PaaS and SaaS, where the provider's platform and application stack are proprietary; lock-in for a PaaS or SaaS solution can be very high.

Behavioral factors relating to cloud adoption

The issues described in Table 2.1 are real substantive issues that can be measured and quantified. However, a number of intrinsic properties of cloud computing create cognitive biases in people that are obstacles to cloud adoption and are worth mentioning. This goes for users as well as organizations. Duke University economist Dan Ariely, in his book Predictably Irrational: The Hidden Forces that Shape Our Decisions (Harper Collins, 2008), explores how people often make choices that are inconsistent based on expediency or human nature. Joe Weinman has expanded on these ideas and some others to formulate ten more “laws” for cloud computing adoption based on human behavior. You can read the original article at http://gigaom.com/2010/06/06/lazy-hazy-crazy-the-10-laws-of-behavioral-cloudonomics/.

The “10 Laws of Behavioral Cloudonomics” are summarized below:

1. People are risk averse and loss averse.

Ariely argues that losses are more painful than gains are pleasurable. The perceived risks of a cloud initiative may therefore be weighed more heavily than the benefits of lower total costs and greater agility.

2. People have a flat-rate bias.

Loss aversion expresses itself as a preference for flat-rate plans, where risk is psychologically minimized, over usage-based plans where costs are actually lower.

Weinman cites the work of Anja Lambrecht and Bernd Skiera, “Paying Too Much and Being Happy About It: Existence, Causes, and Consequences of Tariff-Choice Biases” (http://www.test2.marketing.wiwi.uni-frankfurt.de/fileadmin/Publikationen/Lambrecht_Skiera_Tariff-Choice-Biases-JMR.pdf) for this point.

3. People have the need to control their environment and remain anonymous.

The need for environmental control is a primal directive. Loss of control leads to “learned helplessness” and shorter life spans.

You can read about the research in this area in David Rock's Oxford Leadership Journal article “Managing with the Brain in Mind,” found at http://www.oxfordleadership.com/journal/vol1_issue1/rock.pdf. The point about shorter life spans comes from the work of Judith Rodin and Ellen Langer, “Long-Term Effects of a Control-Relevant Intervention with the Institutionalized Aged” (http://capital2.capital.edu/faculty/jfournie/documents/Rodin_Judith.pdf), which appeared in the Journal of Personality and Social Psychology in 1977.

4. People fear change.

Uncertainty leads to fear, and fear leads to inertia. This is as true for cloud computing as it is for investing in the stock market.

5. People value what they own more than what they are given.

This is called the endowment effect. It is a predilection for existing assets that is out of line with their value to others. The cognitive science behind this principle is referred to as the choice-supportive bias (http://en.wikipedia.org/wiki/Choice-supportive_bias).

6. People favor the status quo and invest accordingly.

There is a bias toward the way things have been and a willingness to invest in the status quo that is out of line with their current value. In cognitive science, the former attribute is referred to as the status quo bias (http://en.wikipedia.org/wiki/Status_quo_bias), while the latter attribute is referred to as an escalation of commitment (http://en.wikipedia.org/wiki/Escalation_of_commitment).

7. People discount future risk and favor instant gratification.

Weinman argues that because cloud computing is an on-demand service, the instant gratification factor should favor cloud computing.

8. People favor things that are free.

When offered an item that is free or another that costs money but offers a greater gain, people opt for the free item. Weinman argues that this factor also favors the cloud computing model because upfront costs are eliminated.

9. People have the need for status.

A large IT organization with substantial assets is a visual display of your status; a cloud deployment is not. This is expressed as a pride of ownership.

10. People are incapacitated by choice.

The Internet enables commerce to shift to a large inventory where profit can be maintained by many sales of a few items each, the so-called long tail. When this model is applied to cloud computing, people tend to be overwhelmed by the choice and delay adoption.

Measuring cloud computing costs

As you see, cloud computing has many advantages and disadvantages, and you can't always measure them. You can measure costs though, and that's a valuable exercise. Usually a commodity is cheaper than a specialized item, but not always. Depending upon your situation, you can pay more for public cloud computing than you would for owning and managing your private cloud, or for owning and using software as well. That's why it's important to analyze the costs and benefits of your own cloud computing scenario carefully and quantitatively. You will want to compare the costs of cloud computing to private systems.

The cost of a cloud computing deployment is roughly estimated to be

CostCLOUD = Σ(UnitCostCLOUD x (Revenue - CostCLOUD))

where the unit cost is usually defined as the cost per hour of a machine instance or of some other metered resource.

Depending upon the deployment type, other resources add additional unit costs: storage quantity consumed, number of transactions, incoming or outgoing amounts of data, and so forth. Different cloud providers charge different amounts for these resources, some resources are free with one provider but charged for by another, and there are almost always variable charges based on resource sizing. Cloud resource pricing doesn't always scale linearly with performance.

To compare your cost benefit with a private cloud, you will want to compare the value you determined in the equation above with the analogous calculation for a datacenter:

CostDATACENTER = Σ(UnitCostDATACENTER x (Revenue - (CostDATACENTER/Utilization)))

Notice the additional term for Utilization added as a divisor to the term for CostDATACENTER. This term appears because it is assumed that a private cloud has capacity that can't be captured, and it is further assumed that a private cloud doesn't employ the same level of virtualization or pooling of resources that a cloud computing provider can achieve. Indeed, no system can work at 100 percent utilization, because queuing theory states that as utilization approaches 100 percent, latency and response times go to infinity. Typical utilization rates in datacenters are between 60 and 85 percent. It is also further assumed that the datacenter is operating under averaged loads (not at peak capacity) and that the capacity of the datacenter is fixed by the assets it has.
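A rough Python reading of the two formulas shows how the Utilization divisor penalizes the private datacenter. The per-unit figures and the 70 percent utilization below are made-up numbers, used only to illustrate the mechanics of the comparison:

# For each unit of demand served, compare revenue minus resource cost.
# In the datacenter case the unit cost is divided by utilization because
# idle capacity must still be paid for.  All numbers are illustrative.

def cloud_net(units, revenue_per_unit, unit_cost):
    return units * (revenue_per_unit - unit_cost)

def datacenter_net(units, revenue_per_unit, unit_cost, utilization):
    return units * (revenue_per_unit - unit_cost / utilization)

units = 10_000   # transactions (or machine-hours) in a billing period
print(cloud_net(units, 0.50, 0.30))              # cloud: pay only for what is used
print(datacenter_net(units, 0.50, 0.30, 0.70))   # datacenter running at 70% utilization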

There is another interesting aspect to the calculated costs associated with CostCLOUD vs. CostDATACENTER: The costs associated with resources in the cloud computing model CostCLOUD can be unbundled to a greater extent than the costs associated with CostDATACENTER. The CostDATACENTER consists of the summation of the cost of each of the individual systems with all the associated resources, as follows:

CostDATACENTER = Σn=1..N (UnitCostDATACENTER x (Revenue - (CostDATACENTER/Utilization)))SYSTEM n

where the sum includes terms for System 1, System 2, System 3, and so on.

The costs of a system in a datacenter must also include the overhead associated with power, cooling, and the physical plant. Estimates of these additional overheads indicate that, over the lifetime of a system, overhead roughly doubles the cost of the system. For a server with a four-year lifetime, you would therefore need to include an overhead roughly equal to 25 percent of the system's acquisition cost for each year of service.

The overhead associated with IT staff is also a major cost, but it's highly variable from organization to organization. It is not uncommon for the burdened cost of a system in a datacenter to be 150 percent of the cost of the system itself.
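Putting those two rules of thumb together, a minimal sketch of the annualized cost of an on-premises server might look like the following. The purchase price is an assumption, and the 150 percent figure is read here as staff burden added on top of the system cost:

# Illustrative burdened cost of one in-house server over its life, using the
# rules of thumb above: facility overhead (power, cooling, floor space)
# roughly equals the acquisition cost over the lifetime, and IT staff burden
# adds about 150 percent of the system cost on top of that.
ACQUISITION = 8_000.0     # assumed purchase price (USD)
LIFETIME_YEARS = 4

facility_overhead = 1.0 * ACQUISITION    # ~100% of acquisition over the lifetime
staff_burden      = 1.5 * ACQUISITION    # ~150% of acquisition over the lifetime

annual_cost = (ACQUISITION + facility_overhead + staff_burden) / LIFETIME_YEARS
print(annual_cost)   # 7,000 per year for an 8,000 server under these assumptions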

The costs associated with the cloud model are calculated rather differently. Each resource has its own specific cost and many resources can be provisioned independently of one another. In theory, therefore, the CostCLOUD is better represented by the equation:

CostCLOUD = Σn=1..N (UnitCostCLOUD x (Revenue - CostCLOUD))INSTANCE n + Σn=1..N (UnitCostCLOUD x (Revenue - CostCLOUD))STORAGE_UNIT n + Σn=1..N (UnitCostCLOUD x (Revenue - CostCLOUD))NETWORK_UNIT n + …

In practice, cloud providers offer packages of machine instances with a fixed relationship between machine instance type, memory allocation (RAM), and network bandwidth. Storage and transactions are unbundled and variable.

Many cloud computing providers have created their own cost calculators to support their customers. Amazon lets you create a simulated billing based on the machine instances, storage, transactions, and other resources that you provision. An example is the Amazon Simple Monthly Calculator (http://calculator.s3.amazonaws.com/calc5.html) shown in Figure 2.1. You can find similar calculators elsewhere or download a spreadsheet with the calculations built into it from the various sites.
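The arithmetic these calculators perform can be sketched in a few lines of Python. Every price below is a placeholder assumption, not a quote from Amazon's or anyone else's rate card:

# Minimal sketch of an itemized monthly cloud bill, along the lines of the
# unbundled cost formula above.  Prices and quantities are placeholders.
PRICES = {
    "instance_hours": 0.10,   # per machine instance-hour
    "storage_gb":     0.10,   # per GB-month stored
    "transfer_gb":    0.15,   # per GB transferred out
    "requests_10k":   0.01,   # per 10,000 requests
}

usage = {
    "instance_hours": 2 * 720,   # two instances running all month
    "storage_gb":     500,
    "transfer_gb":    200,
    "requests_10k":   1_000,
}

monthly_bill = sum(PRICES[item] * quantity for item, quantity in usage.items())
print(round(monthly_bill, 2))   # 234.0 under these assumed rates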

FIGURE 2.1

The Amazon Web Service Simple Monthly Calculator


Avoiding Capital Expenditures

A major part of cloud computing's value proposition and its appeal is its ability to convert capital expenses (CapEx) to operating expenses (OpEx) through a usage pricing scheme that is elastic and can be right-sized. The conversion of real assets to virtual ones provides a measure of protection against too much or too little infrastructure. Essentially, moving expenses onto the OpEx side of a budget allows an organization to transfer risk to their cloud computing provider.

Capitalization may be the single largest reason that new businesses fail, and it is surely an impediment to established businesses starting new enterprises. Growth itself can be difficult when revenues don't cover the expansion and obtaining financing is difficult. A company wishing to grow would normally be faced with the following options:

• Buy the new equipment, and deploy it in-house

• Lease the equipment for a set period of time

• Outsource the operation to a managed-services organization

Capital expenditures must create the infrastructure necessary to capture the transactions that the business needs. However, if demand is variable, then it is an open question as to how much infrastructure is needed to support demand.

Cloud computing is also a good option when the cost of infrastructure and management is high. This is often the case with legacy applications and systems where maintaining the system capabilities is a significant cost.

Right-sizing

Consider an accounting firm with a variable demand load, as shown in Figure 2.2. For each of the four quarters of the tax year, clients file their quarterly taxes on the service's Web site. Demand for three of those quarters rises broadly as the quarterly filing deadline arrives. The fourth filing, the year-end tax filing due on April 15, shows a much larger and more pronounced spike for the two weeks approaching and just following that deadline. Clearly, this accounting business can't ignore the demand spike for its year-end accounting, because this is the single most important portion of the firm's business, but it needs to match demand to resources to maximize its profits.

FIGURE 2.2

Right-sizing demand to infrastructure


Buying or leasing infrastructure to accommodate the peak demand (or load) shown in the figure as DMAX means that nearly half of that infrastructure remains idle most of the time. Fitting the infrastructure to the average demand, DAVG, means that half of the transactions in the Q2 spike are not captured, and that spike is the mission-critical portion of this enterprise. Worse, using DAVG means that during maximum demand the service slows to a crawl and the system may not be responsive enough to satisfy any of the users.

These limits can be a serious constraint on profit and revenue. Outsourcing the demand may provide a solution to the problem. But outsourcing essentially shifts the burden of capital expenditures onto the service provider. A service contract that doesn't match infrastructure to demand suffers from the same inefficiencies that captive infrastructure does.

The cloud computing model addresses this problem by allowing you to right-size your infrastructure. In Figure 2.2, the demand is satisfied by an infrastructure that is labeled in terms of a CU or “Compute Unit.” The rule for this particular cloud provider is that infrastructure may be modified at the beginning of any month. For the low-demand Q1/Q4 time period, a 1 CU infrastructure is applied. On February 1st, the size is changed to a 4 CU infrastructure, which captures the entire spike of Q2 demand. Finally, on June 1st, a 2 CU size is applied to accommodate the typical demand DAVG that is experienced in the last half of Q2 through the middle of Q4. This curve-fitting exercise captures the demand nearly all the time with little idle capacity left unused.
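A small Python sketch shows why the schedule pays off. The month-by-month demand profile and the cost per Compute Unit are illustrative assumptions; the schedule follows the rule described above (1 CU in the low season, 4 CU from February 1, 2 CU from June 1):

# Compare provisioning for peak demand (DMAX) all year with the month-by-month
# right-sizing schedule from Figure 2.2.  All figures are assumptions.
COST_PER_CU_MONTH = 100.0
demand     = [1, 3, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1]   # CU actually needed, Jan-Dec
peak_plan  = [max(demand)] * 12                      # provision for DMAX year-round
sized_plan = [1, 4, 4, 4, 4, 2, 2, 2, 2, 2, 1, 1]    # resize on Feb 1 and Jun 1

def cost(plan):
    return sum(plan) * COST_PER_CU_MONTH

def idle_cu_months(plan):
    return sum(p - d for p, d in zip(plan, demand) if p > d)

for name, plan in (("peak (DMAX)", peak_plan), ("right-sized", sized_plan)):
    print(name, cost(plan), "idle CU-months:", idle_cu_months(plan))
# peak (DMAX) 4800.0 idle CU-months: 23
# right-sized 2900.0 idle CU-months: 4

Both plans capture all the demand in this assumed profile, but the right-sized schedule does so with far less money spent on idle capacity.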

Note

In reality, the major cloud providers provide machine instances in small slices that can be added within five minutes or less. A standard machine instance (virtual computer) might cost $0.10 or less an hour; a typical storage charge might be $0.10 per GB/month. It is this flexibility that has made cloud computing viable. Past efforts to push cloud computing such as the Intel Computing Services (circa 2000) approach required negotiated contracts and long commitments, which is why this service didn't gain traction.

If this deployment represented a single server, then 1 CU might represent a single dual-core processor, 2 CU might represent a quad-core processor, and 4 CU might represent a dual quad-core processor virtual system. Most cloud providers size their systems small, medium, and large in just this manner.

Right-sizing is possible when the system load is cyclical or in some cases when there are predictable bursts or spikes in the load. You encounter cyclical loads in many public facing commercial ventures with seasonal demands, when the load is affected by time zones, and at times that new products launch. Burst loads are less predictable. You can encounter bursts in systems that are gateways or hubs for traffic. In situations where demand is unpredictable and change can be rapid, right-sizing a cloud computing solution demands automated solutions. Amazon Web Services' Auto Scaling feature (http://aws.amazon.com/autoscaling/) for its EC2 service described in Chapter 9 is an example of such an automated solution. Shared systems with multiple tenants that need to scale are another example where right-sizing can be applied.
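The logic behind such automation is simple threshold-based scaling. The sketch below is not the AWS Auto Scaling API; it is a generic, hypothetical policy that illustrates the idea:

# Generic threshold-based scaling policy: add an instance when average
# utilization runs hot, remove one when it runs cold.  Thresholds and limits
# are assumptions you would tune for your own workload.
def scaling_decision(current_instances: int, avg_cpu: float,
                     scale_out_at: float = 0.75, scale_in_at: float = 0.25,
                     min_instances: int = 1, max_instances: int = 20) -> int:
    """Return the desired instance count after applying the policy."""
    if avg_cpu > scale_out_at and current_instances < max_instances:
        return current_instances + 1
    if avg_cpu < scale_in_at and current_instances > min_instances:
        return current_instances - 1
    return current_instances

print(scaling_decision(4, 0.82))   # 5: scale out under heavy load
print(scaling_decision(4, 0.10))   # 3: scale in when demand falls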

Computing the Total Cost of Ownership

The Total Cost of Ownership or TCO is a financial estimate for the costs of the use of a product or service over its lifetime, often broken out on a yearly or quarterly basis. In pitching cloud computing projects, it is common to create spreadsheets that predict the costs of using the cloud computing model versus performing the same functions in-house or on-premises.

To be really useful, a TCO analysis must account for the true costs of items, but frequently it does not. For example, the cost of a system deployed in-house is not only the cost of acquisition and maintenance. A system consumes resources such as space in a datacenter or a portion of your site, power, cooling, and management. All these resources represent overhead that is often overlooked, either in error or for political reasons. When you account for monitoring and management of systems, you must also account for the burdened cost of an IT employee, the cost of the hardware and software used for management, and other hidden costs.

The Wikipedia page on Total Cost of Ownership (http://en.wikipedia.org/wiki/Total_cost_of_ownership) contains a list of computer- and software-industry TCO elements that is a good place to start building your own worksheet.
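A toy worksheet of that kind can be sketched as follows. Every line item and dollar figure is an assumption chosen only to show the mechanics of the comparison; a real study needs your own numbers (and, as the note below suggests, probably an accountant):

# Toy TCO worksheet comparing an in-house deployment with a cloud deployment
# over a four-year period.  All line items and figures are assumptions.
YEARS = 4

in_house = {
    "hardware acquisition": 40_000,
    "software licenses":    20_000,
    "power and cooling":    10_000 * YEARS,
    "datacenter space":      5_000 * YEARS,
    "IT staff (burdened)":  30_000 * YEARS,
}

cloud = {
    "machine instances":    18_000 * YEARS,
    "storage and transfer":  6_000 * YEARS,
    "management tooling":    2_000 * YEARS,
    "IT staff (burdened)":  10_000 * YEARS,
}

for name, sheet in (("In-house", in_house), ("Cloud", cloud)):
    total = sum(sheet.values())
    print(name, total, "over", YEARS, "years;", total // YEARS, "per year")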

Note

A really thorough TCO study should probably be done by an accountant or a consultant with a financial background in this area to obtain meaningful results.

You can find many examples of TCO wizards and worksheets for cloud systems. Microsoft maintains an economics page (http://www.microsoft.com/windowsazure/economics/#tco_content) for its Windows Azure Platform. One of the features on the page is a link to a TCO calculator that is based on an engine created by Alinean. In this calculator, you describe your business, its location and industry, and the level of activity (logins and connections) in the first step of the wizard. The Azure TCO Calculator shows you its recommendations for deployment, the Azure costs, and a report of its analysis as the last step of the wizard.

Figure 2.3 shows a report for a hypothetical company with 10 servers on Azure. You can display graphs and print the report, and the factors it presents may be useful to you.

Any discussion of Total Cost of Ownership provides an operational look at infrastructure deployment. A better metric for enterprises is captured by a Return on Investment or ROI calculation. To accurately measure an ROI, you need to capture the opportunities that a business has been able to avail itself of (or not), something that is accurate only in hindsight. The flexibility and agility of cloud computing allows a company to focus on its core business and create more opportunities.

FIGURE 2.3

The Microsoft Azure Platform ROI wizard provides a quick and dirty analysis of your TCO for a cloud deployment on Windows Azure in an attractive report format.


Specifying Service Level Agreements

A Service Level Agreement (SLA) is the contract for performance negotiated between you and a service provider. In the early days of cloud computing, all SLAs were negotiated between a client and the provider. Today with the advent of large utility-like cloud computing providers, most SLAs are standardized until a client becomes a large consumer of services.

Caution

Some SLAs are enforceable as contracts, but many are really agreements that are more along the lines of an Operating Level Agreement (OLA) and may not have the force of law. It's good to have an attorney review these documents before you make a major commitment to a cloud provider.

SLAs usually specify these parameters:

• Availability of the service (uptime)

• Response times or latency

• Reliability of the service components

• Responsibilities of each party

• Warranties

If a vendor fails to meet the stated targets or minimums, it is punished by having to offer the client a credit or pay a penalty. In this regard, an SLA should be like buying insurance, and like buying insurance, getting the insurer to pay up when disaster strikes can be the devil's work.

Microsoft publishes the SLAs associated with the Windows Azure Platform components at http://www.microsoft.com/windowsazure/sla/, which is illustrative of industry practice for cloud providers. Each individual component has its own SLA. The summary versions of these SLAs from Microsoft are reproduced here:

Windows Azure SLA: “Windows Azure has separate SLA's for compute and storage. For compute, we guarantee that when you deploy two or more role instances in different fault and upgrade domains, your Internet facing roles will have external connectivity at least 99.95% of the time. Additionally, we will monitor all of your individual role instances and guarantee that 99.9% of the time we will detect when a role instance's process is not running and initiate corrective action.”

Cross-Ref

The different components of the Windows Azure Platform are discussed in detail in Chapter 10.

SQL Azure SLA: “SQL Azure customers will have connectivity between the database and our Internet gateway. SQL Azure will maintain a “Monthly Availability” of 99.9% during a calendar month. “Monthly Availability Percentage” for a specific customer database is the ratio of the time the database was available to customers to the total time in a month. Time is measured in 5-minute intervals in a 30-day monthly cycle. Availability is always calculated for a full month. An interval is marked as unavailable if the customer's attempts to connect to a database are rejected by the SQL Azure gateway.”

AppFabric SLA: “Uptime percentage commitments and SLA credits for Service Bus and Access Control are similar to those specified above in the Windows Azure SLA. Due to inherent differences between the technologies, underlying SLA definitions and terms differ for the Service Bus and Access Control services. Using the Service Bus, customers will have connectivity between a customer's service endpoint and our Internet gateway; when our service fails to establish a connection from the gateway to a customer's service endpoint, then the service is assumed to be unavailable. Using Access Control, customers will have connectivity between the Access Control endpoints and our Internet gateway. In addition, for both Service Bus and Access Control, the service will process correctly formatted requests for the handling of messages and tokens; when our service fails to process a request properly, then the service is assumed to be unavailable. SLA calculations will be based on an average over a 30-day monthly cycle, with 5-minute time intervals. Failures seen by a customer in the form of service unavailability will be counted for the purpose of availability calculations for that customer.”

You can find Google's App Engine for Business SLA at http://code.google.com/appengine/business/sla.html. The SLA for Amazon Web Service Elastic Computer Cloud (EC2) is published at http://aws.amazon.com/ec2-sla/, and the SLA for Amazon Simple Storage Service (S3) may be found at http://aws.amazon.com/s3-sla/.

Some cloud providers allow for service credits based on their ability to meet their contractual levels of uptime. For example, Amazon applies a service credit of 10 percent off the charge for Amazon S3 if the monthly uptime is equal to or greater than 99 percent but less than 99.9 percent. When the uptime drops below 99 percent, the service credit percentage rises to 25 percent, and the credit is applied to usage in the next billing period. Amazon Web Services calculates uptime from an error rate

Error Rate = Error Responses/Total Requests

as measured for each 5-minute interval during a billing period; the monthly uptime percentage is 100 percent minus the average of these 5-minute error rates. Error responses are counted from internal server counters such as “InternalError” or “ServiceUnavailable.” There are exclusions that limit Amazon's exposure.
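The following Python sketch shows the shape of this calculation as just described; the interval counts and error figures are invented for illustration:

# S3-style uptime and service-credit calculation: each 5-minute interval
# contributes an error rate (errors / requests), the monthly uptime percentage
# is 100% minus the average of those rates, and the credit tier follows the
# thresholds quoted in the text.  Sample data is invented.
def monthly_uptime(intervals):
    """intervals: list of (error_count, request_count) per 5-minute period."""
    rates = [errors / requests for errors, requests in intervals if requests > 0]
    return 100.0 * (1 - sum(rates) / len(rates)) if rates else 100.0

def service_credit(uptime_pct):
    if uptime_pct >= 99.9:
        return 0      # SLA met, no credit
    if uptime_pct >= 99.0:
        return 10     # 10 percent credit
    return 25         # 25 percent credit

sample = [(0, 1000)] * 8000 + [(50, 1000)] * 640   # 8,640 intervals in a 30-day month
uptime = monthly_uptime(sample)
print(round(uptime, 3), service_credit(uptime))    # 99.63 10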

Service Level Agreements are based on the usage model. Most cloud providers price their pay-as-you-go resources at a premium and issue standard SLAs only for that purpose. You can also purchase subscriptions at various levels that guarantee you access to a certain amount of purchased resources. The SLAs attached to a subscription often offer different terms. If your organization requires access to a certain level of resources, then you need a subscription to a service. A usage model may not provide that level of access under peak load conditions.

Defining Licensing Models

When you purchase shrink-wrapped software, you are using that software based on a licensing agreement called a EULA or End User License Agreement. The EULA may specify that the software meets the following criteria:

• It is yours to own.

• It can be installed on a single or multiple machines.

• It allows for one or more connections.

• It has whatever limit the ISV has placed on its software.

In most instances, the purchase price of the software is directly tied to the EULA.

For a long time now, the computer industry has known that the use of distributed applications over the Internet was going to impact the way in which companies license their software, and indeed it has. The problem is that there is no uniform description of how applications accessed over cloud networks will be priced. There are several different licensing models in play at the moment—and no clear winners.

It can certainly be argued that the use of free software is a successful model. The free use of software in the cloud is something that the open-source community can support, it can be supported as a line extension of a commercial product, or it can be paid for out of money obtained from other sources such as advertising. Google's advertising juggernaut has allowed it to create a portfolio of applications that users can access based on their Google accounts. Microsoft's free software in Windows Live is supported by its sales of the Windows operating system and by sales of the Microsoft Office suite.

In practice, cloud-based providers tend to license their applications or services based on user or machine accounts, but they do so in ways that are different than you might expect based on your experience with physical hardware and software. Many applications and services use a subscription or usage model (or both) and tie it to a user account. Lots of experimentation is going on in the publishing industry on how to price Internet offerings, and you can find the same to be true in all kinds of computer applications at the moment. Some services tie their licenses into a machine account when it makes sense. An example is the backup service Carbonite, where the service is for backing up a licensed computer. However, cloud computing applications rarely use machine licenses when the application is meant to be ubiquitous. If you need to access an application, service, or Web site from any location, then a machine license isn't going to be practical.

The impact of cloud computing on bulk software purchases and enterprise licensing schemes is even harder to gauge. Several analysts have remarked that the advent of cloud computing could lead to the end of enterprise licensing and could cause difficulties for software vendors going forward. It isn't clear what the impact on licensing will be in the future, but it is certainly an area to keep your eyes on.

Summary

In this chapter, you learned about the features that make cloud computing unique. It is both a new model and a new platform for computing. The idea of computing as a utility is as old as the computer industry itself, but it is the advent of low-cost datacenters that has really enabled this platform to thrive.

A cloud's unique features are scalability, elasticity, low barrier to entry, and a utility delivery of services. These features completely change the way in which applications and services are used. Cloud computing will enable the development of new types of applications, several of which were described in this chapter. Undoubtedly many new application types will arise that will be a complete surprise to us all. However, cloud computing has some limitations that were discussed as well.

To help you get a handle on cloud computing, this chapter stresses measurements of costs in comparison to private or on-premises systems. Cloud computing shifts capital expenditures into operating expenditures.

Cloud computing changes the nature of the service provider and its relationship to its client. You see this expressed in the Service Level Agreements (SLAs) and software licensing that are part of this new developing industry. There are many changes to come in these areas.

Chapter 3, “Understanding Cloud Architecture,” is an in-depth look at the superstructure and plumbing that makes cloud computing possible. In that chapter, you learn about the standards and protocols used by the cloud computing industry.