
Cloud Computing Bible (2011)

Part III: Exploring Cloud Infrastructures

IN THIS PART

Chapter 11

Managing the Cloud

Chapter 12

Understanding Cloud Security

Chapter 11: Managing the Cloud

IN THIS CHAPTER

Learning about network management software

Viewing the essential monitoring features

Using lifecycle management techniques

Discovering emerging network management interoperability standards

Cloud computing deployments must be monitored and managed in order to be optimized for best performance. To the problems associated with analyzing distributed network applications, the cloud adds the complexity of virtual infrastructure. This is one of the most active areas of product development in the entire cloud computing industry, and this chapter introduces you to the different products in this nascent area.

Cloud management software provides capabilities for managing faults, configuration, accounting, performance, and security; this is referred to as FCAPS. Many products address one or more of these areas, and through network frameworks, you can access all five areas. Framework products are being repositioned to work with cloud systems.

Your management responsibilities depend on the particular service model for your cloud deployment. Cloud management includes not only managing resources in the cloud, but managing resources on-premises. The management of resources in the cloud requires new technology, but management of resources on-premises allows vendors to use well-established network management technologies.

The lifecycle of a cloud application includes six defined parts, and each must be managed. In this chapter, the tasks associated with each stage are described.

Efforts are underway to develop cloud management interoperability standards. One effort you learn about in this chapter is the DMTF's (Distributed Management Task Force) Open Cloud Standards Incubator. The goal of these efforts is to develop management tools that work with any cloud type. Another group called the Cloud Commons is developing a technology called the Service Measurement Index (SMI). SMI aims to deploy methods for measuring various aspects of cloud performance in a standard way.

Administrating the Clouds

The explosive growth in cloud computing services has led many vendors to rename their products and reposition them to get in on the gold rush in the clouds. What was once a network management product is now a cloud management product. Nevertheless, this is one area of technology that is very actively funded, comes replete with interesting startups, has been the focus of several recent strategic acquisitions, and has resulted in some interesting product alliances. Let's join the party and see what all the fuss is about.

These fundamental features are offered by traditional network management systems:

• Administration of resources

• Configuring resources

• Enforcing security

• Monitoring operations

• Optimizing performance

• Policy management

• Performing maintenance

• Provisioning of resources

Network management systems are often described in terms of the acronym FCAPS, which stands for these features:

Fault

Configuration

Accounting

Performance

Security

Most network management packages have one or more of these characteristics; no single package provides all five elements of FCAPS.
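As a rough illustration of how you might check a toolset against FCAPS, the following Python sketch models the five areas as flags and reports which areas no tool covers. The tool names and their coverage are hypothetical and chosen only for the example; they do not describe any real product.

```python
from enum import Flag, auto


class Fcaps(Flag):
    """The five FCAPS management areas."""
    FAULT = auto()
    CONFIGURATION = auto()
    ACCOUNTING = auto()
    PERFORMANCE = auto()
    SECURITY = auto()


ALL_AREAS = (Fcaps.FAULT | Fcaps.CONFIGURATION | Fcaps.ACCOUNTING
             | Fcaps.PERFORMANCE | Fcaps.SECURITY)


def missing_areas(tools):
    """Return the FCAPS areas that no tool in the mapping covers."""
    covered = Fcaps(0)
    for areas in tools.values():
        covered |= areas
    return ALL_AREAS & ~covered


# Hypothetical tool names and coverage, for illustration only.
toolset = {
    "uptime-checker": Fcaps.FAULT | Fcaps.PERFORMANCE,
    "config-pusher": Fcaps.CONFIGURATION,
    "usage-meter": Fcaps.ACCOUNTING,
}
gaps = missing_areas(toolset)
print(gaps)
```

In this made-up toolset, only the security area is left uncovered, which is the kind of gap a framework product is meant to close.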

To get the complete set of all five of these management areas from a single vendor, you would need to adopt a network management framework. These large network management frameworks were industry leaders several years back: BMC PATROL, CA Unicenter, IBM Tivoli, HP OpenView, and Microsoft System Center. Network framework products have been sliced and diced in many different ways over the years, and they are rebranded from time to time. For example, BMC PATROL is now part of BMC ProactiveNet Performance Management (http://www.bmc.com/products/product-listing/ProactiveNet-Performance-Management.html), and HP OpenView has been split (https://h10078.www1.hp.com/cda/hpms/display/main/hpms_content.jsp?zn=bto&cp=1-10^36657_4000_100) into a set of HP Manager products.

The impact that cloud computing is having on network frameworks is profound. These five vendors have (or soon will have) products for cloud management. Computer Associates, for example, has completely repositioned its network management portfolio as an IT Management Software as a Service. Find the cloud products for these five large framework vendors at the following URLs:

• BMC Cloud Computing (http://www.bmc.com/solutions/esm-initiative/cloud-computing.html)

• Computer Associates Cloud Solutions (http://www.ca.com/us/cloud-computing.aspx)

• HP Cloud Computing (http://h20338.www2.hp.com/enterprise/w1/en/technologies/cloud-computing-overview.html)

• IBM Cloud Computing (http://www.ibm.com/ibm/cloud/)

• Microsoft Cloud Services (http://www.microsoft.com/cloud/)

Figure 11.1 shows IBM Tivoli Service Automation Manager, a framework tool for managing cloud infrastructure.

FIGURE 11.1

Tivoli Service Automation Manager lets you create and stage cloud-based servers.


Management responsibilities

What separates a network management package from a cloud computing management package is the set of “cloudly” characteristics that a cloud management service must have:

• Billing is on a pay-as-you-go basis.

• The management service is extremely scalable.

• The management service is ubiquitous.

• Communication between the cloud and other systems uses cloud networking standards.

To monitor an entire cloud computing deployment stack, you monitor six different categories:

1. End-user services such as HTTP, TCP, POP3/SMTP, and others

2. Browser performance on the client

3. Application monitoring in the cloud, such as Apache, MySQL, and so on

4. Cloud infrastructure monitoring of services such as Amazon Web Services, GoGrid, Rackspace, and others

5. Machine instance monitoring where the service measures processor utilization, memory usage, disk consumption, queue lengths, and other important parameters

6. Network monitoring and discovery using standard protocols and technologies such as the Simple Network Management Protocol (SNMP), a Configuration Management Database (CMDB), Windows Management Instrumentation (WMI), and the like
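To make category 5 in the list above more concrete, here is a minimal Python sketch that compares one sample of machine instance metrics against alert limits. The field names and threshold values are assumptions made up for the example; a real monitoring service would define its own.

```python
from dataclasses import dataclass


@dataclass
class InstanceSample:
    """One reading of the machine instance metrics named in category 5."""
    cpu_percent: float
    memory_percent: float
    disk_percent: float
    queue_length: int


# Hypothetical alert limits; tune these for your own deployment.
THRESHOLDS = {
    "cpu_percent": 85.0,
    "memory_percent": 90.0,
    "disk_percent": 80.0,
    "queue_length": 100,
}


def breached(sample, thresholds=THRESHOLDS):
    """Return the sorted names of metrics that exceed their limit."""
    return sorted(
        name for name, limit in thresholds.items()
        if getattr(sample, name) > limit
    )


sample = InstanceSample(cpu_percent=92.5, memory_percent=40.0,
                        disk_percent=81.0, queue_length=3)
alerts = breached(sample)
print(alerts)
```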

It's important to note that there are really two aspects to cloud management:

Managing resources in the cloud

Using the cloud to manage resources on-premises

When you move to a cloud computing architecture from a traditional networked model such as client/server or a three-tier architecture, many of the old management tasks become irrelevant or nearly impossible to perform for processes running in the cloud, because the tools needed to manage those resources fall outside your own purview. In the cloud, the particular service model you are using directly affects the type of monitoring you are responsible for.

Consider the case of an Infrastructure as a Service vendor such as Amazon Web Services or Rackspace. You can monitor your usage of resources either through their native monitoring tools like Amazon CloudWatch or Rackspace Control Panel or through the numerous third-party tools that work with these sites' APIs. In IaaS, you can alter aspects of your deployment, such as the number of machine instances you are running or the amount of storage you have, but you have very limited control over many important aspects of the operation. For example, your network bandwidth is locked into the type of instance you deploy. Even if you can provision more bandwidth, you likely have no control over how network traffic flows into and out of the system, whether there is packet prioritization, how routing is done, and other important characteristics.
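As one possible illustration of reading usage data through a provider API, the sketch below averages CPU utilization for a single instance using the CloudWatch GetMetricStatistics operation. It assumes a client object shaped like the one returned by the boto3 library's cloudwatch client, which is newer than the tools described in this chapter; the function name average_cpu and the five-minute period are choices made for the example.

```python
from datetime import datetime, timedelta, timezone


def average_cpu(cloudwatch, instance_id, minutes=60):
    """Return mean CPUUtilization (percent) over the last `minutes`.

    `cloudwatch` is assumed to be a client exposing get_metric_statistics,
    such as boto3.client("cloudwatch"). Returns None when no datapoints
    come back for the requested window.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=start,
        EndTime=end,
        Period=300,  # five-minute buckets (an assumption for this sketch)
        Statistics=["Average"],
    )
    points = response.get("Datapoints", [])
    if not points:
        return None
    return sum(p["Average"] for p in points) / len(points)
```

With boto3 installed and credentials configured, you could call average_cpu(boto3.client("cloudwatch"), instance_id) with the ID of one of your own instances. Notice that the call only reads metrics; it gives you no control over bandwidth, routing, or packet prioritization, which is the limitation the paragraph above describes.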

The situation becomes even more restrictive as you move first to Platform as a Service (PaaS), such as Windows Azure or Google App Engine, and then on to Software as a Service (SaaS), for which Salesforce.com is a prime example. When you deploy an application on Google's PaaS App Engine cloud service, the Administration Console provides you with the following monitoring capabilities:

• Create a new application, and set it up in your domain.

• Invite other people to be part of developing your application.

• View data and error logs.

• Analyze your network traffic.

• Browse the application datastore, and manage its indexes.

• View the application's scheduled tasks.

• Test the application, and swap out versions.

However, you have almost no operational control. Essentially, Google App Engine lets you deploy the application and monitor it, and that's about it. Devices, networks, and other aspects of the platform are all managed by Google. You have even less control when you are selling software in the cloud, as you would with Salesforce.com.

Figure 11.2 graphically summarizes the management responsibilities by service model type.

The second aspect of cloud management is the role that cloud-based services can play in managing on-premises resources. From the standpoint of the client, a cloud service provider is no different than any other networked service. The full range of network management capabilities may be brought to bear to solve mobile, desktop, and local server issues, and the same sets of tools can be used for measurement.

Microsoft System Center is an example of how management products are being adapted for the cloud. System Center provides tools for managing Windows servers and desktops. The management services include an Operations Manager, Windows Server Update Services (WSUS), a Configuration Manager for asset management, a Data Protection Manager, and a Virtual Machine Manager, among other components.

One of these service sets was called the System Center Online Desktop Manager (SCODM). Microsoft has taken SCODM and repositioned it as a cloud-based service for managing updates, monitoring PCs for license compliance and health, enforcing security policies, and using Forefront to protect systems from malware; the company has branded it as Windows Intune (http://www.microsoft.com/windows/windowsintune/default.aspx). From the client's standpoint, it makes little difference whether the service is in the cloud or on a set of servers in a datacenter. The benefit of a cloud management service accrues to the organization responsible for managing the desktops or mobile devices. Figure 11.3 shows an Overview screen from the beta version of Windows Intune. The product is due to be released in the first or second quarter of 2011.

FIGURE 11.2

Management responsibilities by service model type


FIGURE 11.3

Intune is Microsoft's cloud-based management service for Windows systems.


Lifecycle management

Cloud services have a defined lifecycle, just like any other system deployment. A management program has to touch on each of the six different stages in that lifecycle:

1. The definition of the service as a template for creating instances

Tasks performed in Phase 1 include the creation, updating, and deletion of service templates.

2. Client interactions with the service, usually through an SLA (Service Level Agreement) contract

This phase manages client relationships and creates and manages service contracts.

3. The deployment of an instance to the cloud and the runtime management of instances

Tasks performed in Phase 3 include the creation, updating, and deletion of service offerings.

4. The definition of the attributes of the service while in operation and performance of modifications of its properties

The chief task during this management phase is to perform service optimization and customization.

5. Management of the operation of instances and routine maintenance

During Phase 5, you must monitor resources, track and respond to events, and perform reporting and billing functions.

6. Retirement of the service

End of life tasks include data protection and system migration, archiving, and service contract termination.
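The six stages above can be sketched as a simple ordered enumeration with the tasks the text assigns to each one. This is only an illustration: the stage names are shorthand invented for the example, and the strictly linear progression is an assumption, because a real service may revisit earlier stages.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Shorthand names (invented here) for the six lifecycle stages."""
    DEFINE_TEMPLATE = 1
    CLIENT_CONTRACT = 2
    DEPLOY = 3
    CUSTOMIZE = 4
    OPERATE = 5
    RETIRE = 6


# Management tasks per stage, paraphrased from the list above.
TASKS = {
    Stage.DEFINE_TEMPLATE: "create, update, and delete service templates",
    Stage.CLIENT_CONTRACT: "manage client relationships and service contracts",
    Stage.DEPLOY: "create, update, and delete service offerings",
    Stage.CUSTOMIZE: "optimize and customize the service",
    Stage.OPERATE: "monitor resources, respond to events, report, and bill",
    Stage.RETIRE: "protect and migrate data, archive, terminate contracts",
}


def next_stage(stage):
    """Return the stage that follows, or None once the service is retired."""
    return Stage(stage + 1) if stage < Stage.RETIRE else None


for stage in Stage:
    print(f"{stage.value}. {stage.name}: {TASKS[stage]}")
```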

Cloud Management Products

Cloud management software and services form a very young industry, and a large number of companies compete in it, some with new products and others with older, repositioned ones. Table 11.1 shows some of the current players in this market, along with the products they either are offering or are promising in the very near future. As in all new areas of technology, expect considerable churn as companies grow, get acquired, or fail along the way. It is entirely possible that if you return to this list a year or two after this book is published, half of these products or services will no longer exist as listed.

TABLE 11.1

Cloud and Web Monitoring Solutions

| Product | URL | Description |
|---|---|---|
| AbiCloud | http://www.abiquo.com/ | Virtual machine conversion and management |
| Amazon CloudWatch | http://aws.amazon.com/cloudwatch/ | AWS dashboard |
| BMC Cloud Computing Initiative | http://www.bmc.com/solutions/esm-initiative/cloud-computing.html | Cloud planning, lifecycle management, optimization, and guidance |
| CA Cloud Connected Management Suite | http://www.ca.com/us/cloud-solutions.aspx | CA Cloud Insight, CA Cloud Compose, CA Cloud Optimize, and CA Cloud Orchestrate (described later in this chapter) |
| Cacti | http://www.cacti.net/ | Network performance graphing solution |
| Cloudkick | https://www.cloudkick.com/ | Cloud server monitoring |
| Dell Scalent | http://www.scalent.com/index.php | Virtualization provisioning system that will be rolled into Dell's Advanced Infrastructure Manager (AIM) |
| Elastra | http://www.elastra.com/ | Federated hybrid cloud management software |
| Ganglia | http://ganglia.info/ | Distributed network monitoring software |
| Gomez | http://www.gomez.com/ | Web site monitoring and analytics |
| HP Cloud Computing | http://h20338.www2.hp.com/enterprise/w1/en/technologies/cloud-computing-overview.html | A variety of management products and services, both released and under development |
| Hyperic | http://www.hyperic.com/ | Performance management for virtualized Java apps with VMware integration |
| IBM Service Management and Cloud Computing | http://www-01.ibm.com/software/tivoli/solutions/cloudcomputing/ | Various IBM Tivoli managers and monitors |
| Internetseer | http://www.internetseer.com/home/index.xtp | Web site monitoring service |
| Intune | http://www.microsoft.com/windows/windowsintune/default.aspx | Cloud-based Windows system management |
| Keynote | http://www.keynote.com/ | Web, mobile, streaming, and customer test and measurement products |
| ManageEngine OpManager | http://www.manageengine.com/network-performance-management.html | Network and server monitoring, service desk, event and security management |
| ManageIQ | http://www.manageiq.com/ | Enterprise Virtualization Management Suite (EVM) that provides monitoring, provisioning, and cloud integration services |
| Managed Methods JaxView | http://managedmethods.com/ | SOA management tool |
| Monit | http://mmonit.com/monit/ | Unix system monitoring and management |
| Monitis | http://portal.monitis.com/index.php/home | Cloud-based monitoring service |
| Morph | http://mor.ph/ | Infrastructure management, provisioning, deployment, and monitoring tools |
| Nagios | http://www.nagios.org/ | Network monitoring system |
| NetIQ | http://www.netiq.com/ | Network management, monitoring, deployment, and security software |
| New Relic RPM | http://www.newrelic.com/ | Java and Ruby application monitoring and troubleshooting |
| Nimsoft | http://www.ca.com/us/products/detail/CA-Nimsoft-Monitoring-Solution.aspx | Cloud monitoring software |
| OpenQRM | http://www.openqrm.com/ | Data center management platform |
| Pareto Networks | http://www.paretonetworks.com/ | Cloud provisioning and deployment |
| Pingdom | http://www.pingdom.com/ | Web site and server uptime and performance monitoring |
| RightScale | http://www.rightscale.com/ | Automated virtual server scaling |
| ScienceLogic | http://www.sciencelogic.com/ | Datacenter and cloud management solutions and appliances |
| Scout | http://scoutapp.com/ | Hosted server management service |
| ServiceUptime | http://www.serviceuptime.com/ | Web site monitoring service |
| Site24X7 | http://site24x7.com/ | Web site monitoring service |
| Solarwinds | http://www.solarwinds.com/ | Network monitoring and management software |
| Tapinsystems | http://www.tapinsystems.com/home | Provisioning and management service |
| Univa UD | http://univaud.com/index.php | Application and infrastructure management software for hybrid multi-clouds |
| VMware Hyperic | http://www.springsource.com/ | Performance management for VMware deployed Java applications |
| Webmetrics | http://www.webmetrics.com/ | Web performance management, load testing, and application monitor for cloud services |
| WebSitePulse | http://www.websitepulse.com/ | Server, Web site, and application monitoring service |
| Whatsup Gold | http://www.whatsupgold.com/ | Network monitoring and management software |
| Zenoss | http://www.zenoss.com/ | IT operations monitoring |
| Zeus | http://www.zeus.com/ | Web-based application traffic manager |

The core management features offered by most cloud management service products include the following:

• Support of different cloud types

• Creation and provisioning of different types of cloud resources, such as machine instances, storage, or staged applications

• Performance reporting including availability and uptime, response time, resource quota usage, and other characteristics

• The creation of dashboards that can be customized for a particular client's needs
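The availability and uptime figures mentioned in the list above come down to simple arithmetic. The following Python sketch shows one common way to compute them from a series of pass/fail checks; the function names and the 30-day reporting period are assumptions for the example, not part of any particular product.

```python
def availability_percent(results):
    """Percentage of successful checks in a list of True/False results."""
    if not results:
        raise ValueError("at least one check result is required")
    return 100.0 * sum(results) / len(results)


def allowed_downtime_minutes(target_percent, period_minutes=30 * 24 * 60):
    """Downtime budget implied by an availability target.

    The default period is a 30-day month expressed in minutes.
    """
    return period_minutes * (1 - target_percent / 100.0)


# 1,000 checks with a single failure.
checks = [True] * 999 + [False]
print(availability_percent(checks))
print(allowed_downtime_minutes(99.9))
```

A target of 99.9 percent over a 30-day month leaves a budget of roughly 43 minutes of downtime, which is why a single extended outage can consume an entire month's allowance.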

Automated deployment on IaaS systems represents one class of cloud management services. One of the more interesting and successful vendors in this area is RightScale (http://www.rightscale.com/), whose software allows clients to stage and manage applications on AWS (Amazon Web Services), Eucalyptus, Rackspace, and the Chef Multicloud framework, or on a combination of these cloud types. RightScale creates cloud-ready server templates and provides the automation and orchestration necessary to deploy them. Eucalyptus exposes interfaces that are compatible with the Amazon EC2 and S3 services, and it is open source and portable. RightScale server templates and the RightScript technology are highly configurable and can be run under batch control. The RightScale user interface also provides real-time measurements of individual server instances.

Cloudkick (https://www.cloudkick.com/) is another infrastructure monitoring solution that is well regarded. Its service is noted for being agnostic and working with multiple vendor cloud platforms. The Cloudkick user interface is designed for rapid deployment assessment, and its at-a-glance-monitoring Insight module is particularly easy to use. Figure 11.4 shows the Insight module, and Figure 11.5 shows Cloudkick's real-time server visualization tool, which is one of the more interesting presentation tools we've seen. In Figure 11.5, the circles are servers with their location on the different axes based on observed metrics. Powerful servers are larger circles, and the colors indicate the current state of the server.

Users have commented that Cloudkick's instant launching is difficult, and both Cloudkick and RightScale are known to be easier to use with Linux virtual servers than with Windows instances.

FIGURE 11.4

Cloudkick's Insight module (https://www.cloudkick.com/site_media/images/graphs2.png) is powerful and particularly easy to use.


All of the service models support monitoring solutions, most often through interaction with the service API. Tapping into a service API allows management software to perform command actions that a user would normally perform. Some of these APIs are themselves scriptable, while in some cases, scripting is supported in the management software.

One key differentiator in monitoring and management software is whether the service needs to install an agent or works without one. The monitoring function normally can be performed through direct interaction with a cloud service or client, using an HTTP GET request or a network command like PING. For management functions, an agent is helpful in that it can provide the hooks needed to manipulate a cloud resource. Agents also, as a general rule, are useful in helping to solve problems associated with firewall and NAT traversal.
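An agentless check of the kind described above can be as small as a single timed HTTP GET. The sketch below uses only the Python standard library; the function name http_probe and its return shape are choices made for this example, and the probe deliberately does not follow redirects or retry.

```python
import http.client
import time
from urllib.parse import urlsplit


def http_probe(url, timeout=5.0):
    """Issue one HTTP GET and return (reachable, status_code, seconds).

    reachable is False when no HTTP response arrives at all, for example
    because the connection is refused or the request times out.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port,
                                           timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port,
                                          timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    started = time.monotonic()
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return False, None, time.monotonic() - started
    finally:
        conn.close()
    return True, status, time.monotonic() - started
```

Because nothing is installed on the monitored system, a probe like this can report only what is visible from outside: whether the service answered, with what status, and how quickly. Anything that requires changing the resource still calls for an agent or a provider API.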

ManageIQ (http://www.manageiq.com/) and Service-now.com offer an integrated cloud stack that combines the ManageIQ Enterprise Virtualization Management Suite with Service-now.com's ITSM SaaS service. The system offers management, discovery, CMDB synchronization, and automated provisioning services. You can integrate these services into your Web applications using an open API that these companies offer.

FIGURE 11.5

The Cloudkick visualization demo (https://www.cloudkick.com/viz/demo/) provides a real-time graphical illustration of the state of monitored servers.


Distributed network applications often benefit from the deployment of a management appliance. Because cloud services tend to distribute applications across multiple sites, physical appliances need to be deployed in different locations—something that only cloud service providers can do. However, there has been a tendency to create virtual appliances, and those can be deployed as server instances wherever an application is deployed. Pareto Networks (http://www.paretonetworks.com/) has a cloud computing service that can monitor and manage distributed network services using a physical or virtual appliance. The system can be used to control and provision network services. Pareto Networks plans to add an API to this service.

Emerging Cloud Management Standards

As it stands now, different cloud service providers use different technologies for creating and managing cloud resources. As the area matures, cloud providers are going to be under considerable pressure from large cloud users like the federal government to conform to standards and make their systems interoperable with one another. No entity is likely to want to make a major investment in a service that is a silo or from which data is difficult to stage or to extract. To this end, a number of large industry players such as VMware, IBM, Microsoft, Citrix, and HP have gotten together to create standards that can be used to promote cloud interoperability. In the section that follows, you learn about the work of the DMTF in this area.

Another effort just getting underway is the Cloud Commons, started by CA (the company formerly known as Computer Associates) in association with Carnegie Mellon. This effort is aimed at creating an industry community and working group, and at promoting a set of monitoring standards that were part of CA's cloud technology portfolio but are now open sourced.

DMTF cloud management standards

The Distributed Management Task Force (DMTF; see http://www.dmtf.org/) is an industry organization that develops industry system management standards for platform interoperability. Its membership is a “who's who” in computing, and since its founding in 1992, the group has been responsible for several industry standards, most notably the Common Information Model (CIM). The DMTF organizes itself into a set of working groups that are tasked with specifying standards for different areas of technology.

A recent standard called the Virtualization Management Initiative (VMAN) was developed to extend CIM to virtual computer system management. VMAN has resulted in the creation of the Open Virtualization Format (OVF), which describes a standard method for creating, packaging, and provisioning virtual appliances. OVF is essentially a container and a file format that is open and both hypervisor- and processor-architecture-agnostic. Since OVF was announced in 2009, vendors such as VirtualBox, AbiCloud, IBM, Red Hat, and VMware have announced or introduced products that use OVF.

It was, therefore, a natural extension of the work that DMTF does in virtualization to solve management issues in cloud computing. DMTF has created a working group called the Open Cloud Standards Incubator (OCSI) to help develop interoperability standards for managing interactions between and in public, private, and hybrid cloud systems. The group is focused on describing resource management and security protocols, packaging methods, and network management technologies. The Web site of the Cloud Management group (http://dmtf.org/standards/cloud) is shown in Figure 11.6.

DMTF's cloud management efforts are really in their initial stages, but the group has broad industry support. Part of the group's task is to provide industry education, so you can find a number of white papers and technology briefs published on this site. It's an effort that's worth checking back with over time. Although the OCSI's work has not yet been joined by Amazon or Salesforce.com, a set of open standards that extends industry standard protocols such as the Common Information Model (CIM), the Open Virtualization Format (OVF), and WBEM to the cloud is going to be hard for vendors to resist.

FIGURE 11.6

DMTF (http://dmtf.org/standards/cloud) has a large and important effort underway for developing cloud interoperability management standards.


Cloud Commons and SMI

CA Technologies (http://www.ca.com), the company once known as Computer Associates, has taken some of its technologies in measuring distributed network performance metrics and repositioned its products as the following:

• CA Cloud Insight, a cloud metrics measurement service

• CA Cloud Compose, a deployment service

• CA Cloud Optimize, a cloud optimization service

• CA Cloud Orchestrate, a workflow control and policy based automation service

Taken together, these products form the basis for CA's Cloud Connected Management Suite (http://www.ca.com/us/cloud-solutions.aspx).

CA has lots of experience in this area through its Unicenter management suite and the products that were spawned from it. The company also has invested in cloud vendors such as 3Tera, Oblicore, and Cassatt to create its cloud services. CA acquired Nimsoft in March 2010. Nimsoft has a monitoring and management package called Nimsoft Unified Monitoring that creates a monitoring portal with customizable dashboards. The system can gather information from up to 100 types of data points and can work with both Google and Rackspace cloud deployments. Among the data points that can be monitored are resource usage and UPS status.

At the heart of CA Cloud Insight is a method for measuring different cloud metrics that creates what CA calls a Service Measurement Index or SMI. The SMI measures things like SLA compliance, cost, and other values and rolls them up into a score. To help SMI gain traction in the industry, CA has donated the core technology to the Software Engineering Institute at Carnegie Mellon as part of what is called the SMI Consortium. This same group is responsible for the Capability Maturity Model Integration (CMMI) process optimization technology and other efforts. The second CA initiative is the funding of an industry online community called the Cloud Commons (http://www.cloudcommons.com/), the home page of which is shown in Figure 11.7.

FIGURE 11.7

The Cloud Commons (http://www.cloudcommons.com/web/guest) is a new online community founded by CA to promote information exchange on cloud services and the SMI standard.


Because the Cloud Commons is brand new, it is hard to tell whether this group will have impact in the cloud community, but it is an interesting effort. The hope is that not only will this site establish CA's performance metrics, but that community users will eventually provide detailed information and ratings on particular services.

To demonstrate the potential of cloud-based metrics, the Cloud Commons has built a dashboard called the CloudSensor that monitors the performance of the major cloud-based services in real time.

This tool measures the performance of the following:

• Rackspace file creation and deletion

• E-mail availability (system uptime) based on Google Gmail, Windows Live Hotmail, and Yahoo! Mail

• Amazon Web Services server creation/destruction times at four AWS sites

• Dashboard response times for the consoles of AWS.Amazon, Google App Status, Rackspace Cloud, and Salesforce

• Windows Azure storage benchmarks

• Windows Azure SQL benchmarks

It is meant to demonstrate the value of cloud performance measurements. These metrics are based on real-time data derived from real transactions. Each chart shows the last two hours of activity. Figure 11.8 shows the CloudSensor performance dashboard.
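A chart that always shows the most recent two hours of activity can be backed by a rolling window of timestamped samples. The Python sketch below shows one way to do that; the class name RollingWindow is invented for the example, it assumes samples arrive in time order, and it is not a description of how CloudSensor itself is built.

```python
from collections import deque


class RollingWindow:
    """Keep only the samples that fall within the most recent time span."""

    def __init__(self, span_seconds=2 * 60 * 60):
        self.span = span_seconds
        self.samples = deque()  # (timestamp_seconds, value) pairs, oldest first

    def add(self, timestamp, value):
        """Record a sample and discard any older than the window."""
        self.samples.append((timestamp, value))
        cutoff = timestamp - self.span
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()

    def values(self):
        """Return the retained values, oldest first."""
        return [value for _, value in self.samples]


window = RollingWindow()
window.add(0, 1.0)      # start of the window
window.add(3600, 2.0)   # one hour later
window.add(7200, 3.0)   # exactly two hours later; the first sample is kept
window.add(7201, 4.0)   # one second more; the first sample now falls out
print(window.values())
```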

The Service Measurement Index (SMI) is based on a set of measurement technologies forming the SMI Framework that CA donated to the SMI Consortium. It measures cloud-based services in six areas:

• Agility

• Capability

• Cost

• Quality

• Risk

• Security

These form a set of Key Performance Indicators (KPIs) that can be used to compare one service to another. Figure 11.9 shows the different characteristics that make up each of the KPIs of the Service Measurement Index.
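To give a feel for how ratings in the six areas might be rolled up into a single score for comparing services, here is a minimal Python sketch that uses a weighted average. The chapter does not describe the actual SMI scoring method, so the weighted average, the 0 to 100 rating scale, and the sample ratings are all assumptions made for illustration.

```python
SMI_AREAS = ("agility", "capability", "cost", "quality", "risk", "security")


def smi_score(ratings, weights=None):
    """Combine per-area ratings (assumed 0 to 100) into one weighted average.

    This is an illustrative roll-up, not the SMI Consortium's method.
    Areas are weighted equally unless a weights mapping is supplied.
    """
    if weights is None:
        weights = {area: 1.0 for area in SMI_AREAS}
    missing = [area for area in SMI_AREAS if area not in ratings]
    if missing:
        raise ValueError("missing ratings for: " + ", ".join(missing))
    total_weight = sum(weights[area] for area in SMI_AREAS)
    weighted_sum = sum(ratings[area] * weights[area] for area in SMI_AREAS)
    return weighted_sum / total_weight


# Hypothetical ratings for one service.
service_a = {"agility": 70, "capability": 80, "cost": 60,
             "quality": 90, "risk": 50, "security": 100}
print(smi_score(service_a))
```

Passing a weights mapping lets an organization emphasize the areas it cares about most, for example by counting security twice as heavily as the others.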

FIGURE 11.8

The CloudSensor (http://dashboard.atgcloud.info:5001/cloudsensor/cloud-sensor.html) dashboard displays real-time cloud service performance metrics.


FIGURE 11.9

SMI defined characteristics (Source: “The Details behind the Service Measurement Index” by Keith Allen, 2010)


It's too early to determine whether SMI will gain traction, but the positioning of the technology as an open industry working group makes the project very interesting and worthy of note.

Summary

Cloud management is an important and growing area of technology. In this chapter, you learned about some of the products being offered or developed to address common management problems. Among the management tasks are deployment, monitoring, configuration, optimization, and often security. Nearly all network management software vendors are repositioning their products to work with cloud systems. Some network management is available from within the cloud service providers' platforms. Many of the software systems utilize the service provider's API to manage, monitor, and control resources. The use of virtualization has spawned many new products in this area.

In Chapter 12, you learn about security used in cloud computing. This is an extension of our discussion on network management, because security is another means of controlling access through network policies.