Cloud Management - The Enterprise Cloud: Best Practices for Transforming Legacy IT (2015)

Chapter 7. Cloud Management

Key topics in this chapter:

§ Architecture of a cloud management platform

§ Orchestration and automated provisioning

§ Systems management

§ Multitenant self-service control panels

§ Software applications and packaging

§ System extensibility and APIs

§ Build versus buy decision for cloud management platforms

§ Open source cloud platforms and industry standards

§ Cloud management best practices

The cloud management system is one of the most important components to consider when planning, deploying, and operating (or consuming) a cloud service. In a public cloud environment, the customer might only utilize a fraction of the overall cloud management platform, usually just the ordering and self-service portal hosted by the public cloud. In an enterprise private cloud, the management system makes it possible for organizations to provision services, track billing and utilization, and manage the entire cloud infrastructure. Lessons learned from the first generation of cloud providers and private cloud deployments have clearly shown that the importance of the cloud management platform was underappreciated. The cloud management platform is the true core for automation, orchestration, workflow, resource tracking, billing, and operations.

Key Take-Away

Public and private clouds both use an underlying cloud management platform. Private clouds have significantly more features and customization capabilities, but you have to evaluate and choose your cloud management platform and software vendor carefully.

In an enterprise private cloud deployment, cloud management tools are the most underestimated or overlooked component. Using just a hypervisor platform for server virtualization is not the same as a full cloud management system that provides multitenant online ordering, approval workflows, customized automated provisioning, resource utilization and financial tracking, self-service application administration, and reporting.

Throughout this book, I have stressed the importance of cloud characteristics. To achieve on-demand ordering, automated provisioning, and pay-as-you-use billing, a cloud management system is an absolute necessity. Anyone can build a server farm, install some virtualization software, and then declare that they have a cloud; however, without a system to manage it there is no multitenant customer interface, no automated provisioning, and no metering of resources for billing. And you can forget about saving money on personnel: you’ll have to manually configure networks, servers, virtual machine templates, applications, and everything else. This would drive your costs too high to be competitive as a public cloud provider, or make it impossible to control operational costs in a private cloud.

Understanding the Cloud Management System Architecture

Cloud management systems vary greatly in their features, ease of use, flexibility, and cost. A cloud provider (or private cloud operator) can develop its own cloud management system or purchase an existing system from cloud software management vendors (see the “Cloud Management Platforms: The Build Versus Buy Decision” analysis later in this chapter).

Key Take-Away

A well-designed, modular cloud management system provides a cloud portal, orchestration, workflow, automated provisioning, and integrated billing/resource metering capabilities.

Figure 7-1 shows a vendor-agnostic example of the primary functions of a cloud management system. These functions are presented in three functional layers. Each layer integrates with the layer directly above and below it. For the purposes of illustration, the top layer represents the client-facing web portal on which consumers can place orders, manage, and track their cloud service subscriptions. The middle layer represents the automation, orchestration, workflow, and resource management functions. The bottom layer is the network management layer. This is where systems monitoring, security, and capacity management functions monitor the cloud infrastructure and integrate with existing datacenter operational management tools.

It is very important to note one function that is not included in the cloud management system (and Figure 7-1): the hypervisor. There can be several of these including those hosted at other cloud providers.

Figure 7-1. High-level functional cloud service management layers

Figure 7-2 depicts a detailed functional architecture of an ideal cloud management system. There are dozens of ways to show a detailed functional architecture and they will vary depending on cloud management software vendor — none are right or wrong, but pay attention to the individual elements shown in this figure that represent functionality any cloud management system should have.

Similar to Figure 7-1, the functional architecture presented in Figure 7-2 does not include the hypervisors or actual cloud service provider(s); it shows only the command and control functions for the entire cloud ecosystem.

Figure 7-2. A detailed cloud management functional architecture

In this example, the orchestration levels are both above and below the automation system. This is an attempt to show that orchestration activities occur both pre- and post-initial provisioning. This could also be represented as a circle surrounding the boxes in the middle of the architecture diagram. The orchestration system makes the connections, integration, and data interchange between other layers of the architecture, which allows software from various companies to be integrated when necessary. Workflow and business process logic is normally part of the orchestration layer. There can be multiple instances of the provisioning systems shown in those same middle boxes. As new cloud providers or technologies are added, these additional provisioning systems would integrate with the orchestration system, facilitating modular additional functionality to your cloud without changing the other layers that have been integrated and are in production operations for your business.

The network management layer at the bottom represents the operations, security, asset, configuration, and software licensing functions that the cloud provider uses to manage the entire infrastructure, including all legacy IT systems, private cloud, and any hybrid integration to third-party cloud services.

The National Institute of Standards and Technology (NIST) has also published a high-level diagram showing the functional capabilities for cloud service management. Figure 7-3 demonstrates how several of the elements in the NIST model are very similar to those in the more detailed depiction in Figure 7-2.

Figure 7-3. NIST model for cloud service management (Source: NIST, Special Publication 500-291 version 2, July 2013)

Orchestrating Automated Actions

An orchestrator is a software system programmed with workflow rules and business logic that facilitates automated actions and integrated connectors to external software systems. Many IT organizations create scripts to automate manual tasks; however, these are now considered a legacy technique. Scripts are also difficult to maintain and reuse, and their sequential processing limits their flexibility. An orchestration system goes well beyond scripting with parallel tasking, branching workflows, situational-awareness logic, and the ability to back out from or resume workflows that fail or sense an error. You can integrate scripts and other automated software installation packaging tools into an orchestration workflow; however, the orchestration should always be the primary logic engine at the core of all cloud provisioning and automation workflows.

USE-CASE SCENARIO: ORCHESTRATION

To understand the importance of the orchestrator, let’s walk through a case scenario.

1. The customer logs on to the cloud service catalog portal and orders one virtual machine (VM) with Linux as the OS, 4 processors, and 16 GB of memory.

2. The customer processes his order through a shopping cart checkout process and submits it.

3. The orchestrator detects this new customer order and sends an email to the designated approver(s) within that customer’s organization.

4. The person who approves orders receives the email and clicks the URL within it to log on to the cloud management portal. A list of pending orders awaiting approval appears, and he approves the order.

5. The orchestrator detects that the order is now approved and begins the automated provisioning process. The orchestrator connects to the VMware server farm and instructs VMware to create one VM based on the Linux VM template.

6. VMware creates the VM as instructed. The orchestrator then changes the configuration of the VM so that it has four processors and 16 GB of memory. The orchestrator then instructs the VMware software to boot the VM for the first time.

7. The orchestrator logs on to the VM, knowing to use Linux commands, to confirm that the system is functioning correctly. The orchestrator completes any additional steps the cloud provider has configured, such as installing additional software updates or patches.

8. The orchestrator connects to the cloud provider’s change control system and enters a record in the database indicating that a new server — VM in this case — has been brought onto the network. The orchestrator populates the IP address and other configuration information into the change control system so that the cloud provider’s support staff now knows that this new server exists.

9. The orchestrator is aware that the VM has been successfully created, so it sends an email to the customer indicating that the VM service is available. This email notification contains the new VM server name, IP address, and logon information.

10. The orchestrator sends out error notices (or uses an API call to an existing service ticketing system) to the cloud provider’s support staff if any step in this process failed to complete or if any of the downstream software systems generated an error. You can configure the orchestration workflow logic to pause if an error or failure is detected (to give cloud support personnel time to resolve the problem rather than generate an error to the customer). Upon fixing the error, the cloud support personnel can continue the workflow, and in some cases the orchestration system can automatically detect the change in status and resume the workflow automatically.
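The pause-and-resume behavior in the scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Order`, `Orchestrator`, and `build_steps` names are hypothetical, the steps are empty placeholders rather than real hypervisor or change control calls, and the `change_control` dictionary simply simulates a downstream system that is temporarily unreachable.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Status(Enum):
    PENDING_APPROVAL = "pending approval"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETE = "complete"


@dataclass
class Order:
    customer: str
    os: str
    cpus: int
    memory_gb: int
    status: Status = Status.PENDING_APPROVAL
    next_step: int = 0  # index of the next workflow step to run
    log: List[str] = field(default_factory=list)


Step = Callable[[Order], None]


class Orchestrator:
    """Runs workflow steps in order and pauses at the first failure."""

    def __init__(self, steps: List[Step]):
        self.steps = steps

    def approve(self, order: Order) -> None:
        # Approval is what releases the order into automated provisioning.
        if order.status is Status.PENDING_APPROVAL:
            order.status = Status.RUNNING
            self._run(order)

    def resume(self, order: Order) -> None:
        # Called after support staff have fixed the underlying problem.
        if order.status is Status.PAUSED:
            order.status = Status.RUNNING
            self._run(order)

    def _run(self, order: Order) -> None:
        while order.next_step < len(self.steps):
            step = self.steps[order.next_step]
            try:
                step(order)
            except Exception as exc:
                # Pause rather than surface the error to the customer.
                order.status = Status.PAUSED
                order.log.append(f"paused at {step.__name__}: {exc}")
                return
            order.log.append(f"done: {step.__name__}")
            order.next_step += 1
        order.status = Status.COMPLETE


def build_steps(change_control: Dict[str, bool]) -> List[Step]:
    """Hypothetical steps; real ones would call hypervisor and ticketing APIs."""

    def create_vm_from_template(order: Order) -> None:
        pass  # placeholder for the hypervisor call

    def set_cpu_and_memory(order: Order) -> None:
        pass  # placeholder for resizing the VM

    def verify_guest_os(order: Order) -> None:
        pass  # placeholder for the post-boot health check

    def record_in_change_control(order: Order) -> None:
        if not change_control["online"]:
            raise RuntimeError("change control system unreachable")

    def email_customer(order: Order) -> None:
        pass  # placeholder for the notification

    return [create_vm_from_template, set_cpu_and_memory, verify_guest_os,
            record_in_change_control, email_customer]
```

If `change_control["online"]` is false when an order is approved, the workflow stops at the fourth step with the order marked as paused; once the flag is set to true, calling `resume` continues from that same step instead of re-creating the VM. Keeping the step index on the order is the design choice that makes resumption possible.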

In the preceding example, VMware was the hypervisor technology. There are numerous hypervisor technologies in the industry that would perform similarly; however, a key point to highlight is that the hypervisor software itself is not the same as the cloud management platform. In this example, hypervisors perform only the creation and management of VMs. Note that some hypervisor software platforms can perform some higher-level cloud management functions but are usually not as complete and all-encompassing as a full cloud management platform. It is the cloud management system that performs everything: taking the customer’s order from a service catalog, running the approval process, triggering the hypervisor to provision services, updating the network management systems with the new VM configuration and status, starting utilization and invoice tracking, and finally, sending email to the end user and cloud support staff to confirm success.

Key Take-Away

The earliest cloud systems often relied solely on the hypervisor software’s portal or configuration tools. These hypervisor configuration portals are good for technical personnel to manage basic VM services for a single tenant organization. Multitenant clouds with more advanced Platform as a Service (PaaS) and Software as a Service (SaaS) applications utilize full cloud management platforms that automate and orchestrate the entire infrastructure and customer portals; the hypervisor software is now just an underlying component of the cloud management system.

Figure 7-4 shows a diagram published by NIST that maps service orchestration to the NIST Cloud Reference Model (see Figure 8-2). Note that the major functionality shown in this diagram closely resembles the more detailed orchestration processes shown in Figure 7-2 and the functional management layers shown in Figure 7-1.

Figure 7-4. NIST model for cloud service orchestration (Source: NIST, Special Publication 500-291 version 2, July 2013)

Although I legitimately focus on automation throughout this book, there are actually three common methods for provisioning resources in a cloud infrastructure. Clearly, automatic is the preferred method, but there are scenarios in which you would use the other methods, as well.

Provisioning method: Manual, using cloud provider’s web portal

Benefits:

§ Simple and easy for customer to understand and self-configure

§ Immediate initiation of provisioning activity and visual confirmation

Risks:

§ Less efficient use of labor

§ Susceptible to configuration mistakes and human error

§ Potential for inconsistent configurations and inconsistent adherence to standards

§ Cloud provider’s management and billing systems might not be aware of manually configured services or changes to subscriptions

Provisioning method: Programmatic, using API calls from scripts

Benefits:

§ Easily integrates into existing software installation tools and scripts

§ Often preferred method for software developers, especially for non-production

§ Low cost to implement

Risks:

§ Concerns and conflicts using hypervisor APIs versus provider’s cloud management platform APIs

§ Sequential scripts are not as dynamic, upgradeable, and flexible as full automation and orchestration

§ Susceptible to configuration mistakes at a faster scripted pace

§ All provisioning activities must be synchronized with the cloud provider’s management platform and billing

Provisioning method: Automatic, using cloud management platform orchestration

Benefits:

§ Fastest provisioning with least amount of labor/support required

§ Real-time provisioning status awareness, billing, and operational readiness of new cloud services

§ Consistent configurations resulting in better quality, improved security, and compliance with standards/procedures

Risks:

§ Higher effort to implement and configure full orchestration

§ Use caution if automation is configured to allow unlimited elasticity

§ Potentially steep learning curve for legacy IT staff to implement/use new orchestration/automation tools

Table 7-1. Cloud deployment model definitions

Resource Allocation

One key feature of orchestration and provisioning is resource allocation. In this process, the cloud system takes a new order from a customer and determines which servers, storage systems, and subnetworks have available capacity to host this new customer’s request for a compute or software instance. As each server and server farm fills up, the system knows to automatically move to the next pool of servers to provision new services (e.g., virtual machines). If there are no available resources, a warning message is generated to the cloud provider’s support staff. When additional space is available, the orchestrator will again attempt to provision the new service. All of this is done without human intervention, and potentially hundreds of times each day as new customers order services and existing customers begin to utilize more disk space and VMs.
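As a rough illustration of the allocation behavior described above, the following Python sketch walks an ordered list of resource pools, places each request in the first pool with room, and queues the request with a warning when nothing fits. The `Pool` and `Allocator` classes, the pool names, and the choice of CPU and memory as the only capacity dimensions are assumptions made for this example; a real platform would also weigh storage, network, and placement policy.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Pool:
    name: str
    free_cpus: int
    free_memory_gb: int

    def can_host(self, cpus: int, memory_gb: int) -> bool:
        return self.free_cpus >= cpus and self.free_memory_gb >= memory_gb

    def reserve(self, cpus: int, memory_gb: int) -> None:
        self.free_cpus -= cpus
        self.free_memory_gb -= memory_gb


class Allocator:
    def __init__(self, pools: List[Pool]):
        self.pools = pools
        self.pending: List[Tuple[int, int]] = []  # requests waiting for capacity
        self.alerts: List[str] = []               # warnings for support staff

    def allocate(self, cpus: int, memory_gb: int) -> Optional[str]:
        """Place the request in the first pool with room; return its name."""
        for pool in self.pools:
            if pool.can_host(cpus, memory_gb):
                pool.reserve(cpus, memory_gb)
                return pool.name
        # No pool can host the request: warn support staff and queue it.
        self.alerts.append(f"no capacity for {cpus} CPUs / {memory_gb} GB")
        self.pending.append((cpus, memory_gb))
        return None

    def retry_pending(self) -> List[str]:
        """Try queued requests again, for example after capacity is added."""
        waiting, self.pending = self.pending, []
        placed = []
        for cpus, memory_gb in waiting:
            name = self.allocate(cpus, memory_gb)
            if name is not None:
                placed.append(name)
        return placed
```

Requests that still do not fit during `retry_pending` go back on the queue and raise a fresh alert, which mirrors the idea that the orchestrator keeps trying until space becomes available.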

Resource Reclamation

The opposite of automated provisioning is the de-provisioning of services for customers that cancel, or that the support personnel instruct the cloud management system to destroy. The management system knows how to stop VMs, delete user accounts, reclaim disk storage and application licenses, and cease billing for the terminated services. When any customer’s subscription is terminated, all of the server, VM, processor, memory, storage, and other resources are cleared, reclaimed, and made available for the next customer. There are some hypervisors and storage systems that do not automatically reclaim the now-unused disk space because of technical limitations in the manufacturer’s software. This is similar to moving files to your deleted items or trash can on a desktop computer: the disk space is still occupied by the deleted data until you clear the trash can, so it is not truly reclaimed or made available for the next customer.

Key Take-Away

The reclamation process, both within a hypervisor and especially within storage systems, is not always automatic because of limitations in some manufacturers’ software. This requires the creation of batch processes scheduled to run daily, weekly, or monthly during nonpeak hours to actually clear services or data that are no longer active.
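The two-stage nature of reclamation (billing stops immediately, but space returns only after a cleanup pass) can be modeled with a small Python sketch. Everything here is hypothetical: `StoragePool`, `Volume`, and `purge_released` are illustrative names, and the "purge" simply drops a dictionary entry where a real system would call the storage vendor's tooling from a scheduled job.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Volume:
    volume_id: str
    size_gb: int
    owner: str
    released: bool = False  # subscription ended, data not yet purged


class StoragePool:
    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.volumes: Dict[str, Volume] = {}

    def free_gb(self) -> int:
        # Released volumes still occupy disk until the batch job purges them.
        return self.capacity_gb - sum(v.size_gb for v in self.volumes.values())

    def create(self, volume_id: str, size_gb: int, owner: str) -> None:
        if size_gb > self.free_gb():
            raise ValueError("not enough free space")
        self.volumes[volume_id] = Volume(volume_id, size_gb, owner)

    def deprovision(self, owner: str) -> None:
        """Mark a cancelled customer's volumes as released (billing stops)."""
        for volume in self.volumes.values():
            if volume.owner == owner:
                volume.released = True

    def billable_gb(self, owner: str) -> int:
        return sum(v.size_gb for v in self.volumes.values()
                   if v.owner == owner and not v.released)

    def purge_released(self) -> int:
        """Batch job for nonpeak hours: delete released volumes, return GB reclaimed."""
        doomed = [v for v in self.volumes.values() if v.released]
        for volume in doomed:
            del self.volumes[volume.volume_id]
        return sum(v.size_gb for v in doomed)
```

Between `deprovision` and `purge_released`, the pool reports no billable storage for the cancelled customer yet still shows the space as occupied, which is exactly the "trash can" state described earlier.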

Setting Up Orchestrator Workflows

Because the orchestration system is essentially the brains of the cloud management system, it is also the place where any custom business logic and customized workflows are created. The cloud service provider can use the orchestrator to provision simple or multitiered cloud applications, send messages, send out customer invoices, and automatically trigger an alert for any event within the cloud. As future “as a service” products are deployed, the orchestrator is updated with new workflows, scripts, processes, and rules that facilitate automated provisioning, utilization tracking, billing and metering, and operational management. Many cloud management platforms include a service designer tool (through programming and/or a GUI) with which the cloud provider or technically skilled customer can create new workflows, single or multiple VM platform applications, network segmentation, and so on.

Often, in a private cloud deployment model, the orchestrator can be used to handle highly customized customer needs. These custom tasks could be to perform multiple-level order approval, approver reminders after a period of time has passed, or customer notification of certain events. You can also program the orchestrator to do non-cloud-specific tasks such as opening a support ticket, gathering statistical data and sending a monthly report, or warning the cloud provider’s support personnel well in advance before they run out of available disk storage.
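To make the multiple-level approval and reminder idea concrete, here is a minimal Python sketch. The `ApprovalRequest` class, the `reminders_due` function, and the 24-hour default interval are all assumptions chosen for illustration; an actual orchestrator would express the same rules in its own workflow designer and send real notifications.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class ApprovalRequest:
    order_id: str
    approvers: List[str]                 # ordered approval chain
    submitted_at: datetime
    approved_by: List[str] = field(default_factory=list)
    last_notified_at: Optional[datetime] = None

    def current_approver(self) -> Optional[str]:
        """Return who must approve next, or None when the chain is complete."""
        if len(self.approved_by) < len(self.approvers):
            return self.approvers[len(self.approved_by)]
        return None

    def approve(self, who: str, now: datetime) -> bool:
        """Record one approval; return True once the whole chain has approved."""
        if who != self.current_approver():
            raise PermissionError(f"{who} is not the current approver")
        self.approved_by.append(who)
        # The next approver (if any) is notified now, so restart the clock.
        self.last_notified_at = now
        return self.current_approver() is None


def reminders_due(requests: List[ApprovalRequest], now: datetime,
                  interval: timedelta = timedelta(hours=24)
                  ) -> List[Tuple[str, str]]:
    """Return (order_id, approver) pairs that have waited at least `interval`."""
    due = []
    for request in requests:
        approver = request.current_approver()
        if approver is None:
            continue  # fully approved, nothing to chase
        waiting_since = request.last_notified_at or request.submitted_at
        if now - waiting_since >= interval:
            due.append((request.order_id, approver))
            request.last_notified_at = now  # avoid repeating the reminder at once
    return due
```

A scheduler would call `reminders_due` periodically and send one message per returned pair. Passing `now` in explicitly, rather than reading the clock inside the function, keeps the logic easy to test.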

Key Take-Away

Some orchestration systems include a library of preintegrated service designs and connectors to external software systems, hypervisors, network management tools, and additional cloud providers in a hybrid cloud environment.

Creating Reports and Dashboards

A cloud management system provides numerous reports for finance, service availability, performance, and service-level agreement (SLA) adherence. Reports and a dashboard displaying metrics can show customers a real-time view of the service status and utilization, which is a tremendously valuable feature for those customers that need more than a monthly written report to monitor statistics and performance metrics. Most public cloud providers give only limited reporting and dashboard views into the cloud service. You can customize private cloud systems to a significantly higher degree to meet customer needs and integrate with internal service management or business intelligence systems.
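One of the simplest calculations behind an SLA adherence report is availability over a reporting period: the share of minutes in the period during which the service was up. The Python sketch below shows that arithmetic; the `Outage` record, the service names, and the target percentages are hypothetical, and real SLA terms often exclude planned maintenance or define downtime differently.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Outage:
    service: str
    minutes: float


def availability_pct(period_minutes: float, outages: List[Outage],
                     service: str) -> float:
    """Percentage of the period during which the service was available."""
    down = sum(o.minutes for o in outages if o.service == service)
    return 100.0 * (period_minutes - down) / period_minutes


def sla_report(period_minutes: float, outages: List[Outage],
               targets: Dict[str, float]) -> Dict[str, dict]:
    """Compare measured availability with each service's SLA target."""
    report = {}
    for service, target in targets.items():
        pct = availability_pct(period_minutes, outages, service)
        report[service] = {
            "availability_pct": round(pct, 3),
            "target_pct": target,
            "met": pct >= target,
        }
    return report
```

For a 30-day month (43,200 minutes), 21.6 minutes of downtime works out to 99.95 percent availability, which meets a 99.9 percent target, whereas 90 minutes of downtime works out to roughly 99.79 percent and misses it.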

When an organization goes beyond basic IaaS and SaaS offerings, integrating legacy applications into a single cloud management system will provide huge benefits. You can truly have an “executive dashboard” of all your services and applications with metrics, statuses, and financial information in a single web portal.

These reports and dashboards provide cloud customers with visibility into their services, utilization, performance, and costs. The first generation of clouds and providers rarely provided this visibility, which still causes concern and impedes new customer adoption of cloud computing.

Managing Systems and Services

As the cloud management system handles resource allocation and provisioning of services, it is fully aware of all servers, networks, storage systems, and applications that are available and already deployed. The cloud provider or your internal enterprise support staff will use this information to detect system problems, identify systems with low remaining capacity, or take individual servers offline for maintenance. When one system is turned off, the cloud management system has the ability to move any active customer resources (e.g., virtual machines, storage volumes) to other infrastructure devices in the datacenter so that, in a perfect scenario, nobody notices the maintenance outage.
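The maintenance scenario above, in which workloads are moved off a system before it is taken offline, can be sketched as follows. This is a simplified model under stated assumptions: `Host` and `drain` are hypothetical names, memory is the only capacity considered, and the "move" is a dictionary update rather than a live migration performed by a hypervisor.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Host:
    name: str
    capacity_gb: int
    vms: Dict[str, int] = field(default_factory=dict)  # VM name -> memory in GB
    in_maintenance: bool = False

    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.vms.values())


def drain(host: Host, others: List[Host]) -> Dict[str, str]:
    """Move every VM off `host` and return a VM -> destination mapping.

    Raises RuntimeError, leaving all hosts unchanged, if any VM cannot be placed.
    """
    free = {h.name: h.free_gb() for h in others if not h.in_maintenance}
    plan: Dict[str, str] = {}
    # Plan first, placing the largest VMs while the most room is available.
    for vm, mem in sorted(host.vms.items(), key=lambda kv: kv[1], reverse=True):
        target = next((name for name, gb in free.items() if gb >= mem), None)
        if target is None:
            raise RuntimeError(f"no capacity to relocate {vm}")
        free[target] -= mem
        plan[vm] = target
    # Only apply the moves once every VM has a destination.
    by_name = {h.name: h for h in others}
    for vm, target in plan.items():
        by_name[target].vms[vm] = host.vms.pop(vm)
    host.in_maintenance = True
    return plan
```

Planning all placements before moving anything means a host is never left half-evacuated when capacity runs short, which is the kind of safeguard that keeps a maintenance window invisible to customers.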

The systems management tools utilized in large datacenters are commonly used for capacity management, security monitoring and alerting, asset and configuration management, and software distribution and automated updating of applications. In a cloud environment, these traditional tools use much of the same technology but the emphasis is on forecasting, real-time updating of statistics, and automated response to events. Clouds work because of automation, so elimination of human-caused delays and eventual inaccuracies is the new paradigm for systems management.

Chapter 2 discussed cloud operations management lessons learned and best practices. There are many aspects of traditional IT or datacenter operations that change in an on-demand, automated, and elastic cloud environment. Capacity management and continuous, careful monitoring of the automated provisioning are significantly more important in a cloud environment than ever before. For full details on system-management tools, techniques, and best practices in a cloud ecosystem, refer to Chapter 2.

Providing Self-Service Control Panels

The cloud management system not only handles customer orders, orchestration, and provisioning functions of the cloud, it also provides a (usually web-based) portal for customers to configure their cloud services. The first generation of cloud providers and private cloud management platforms gave customers only extremely basic service administration capabilities. Almost all cloud management portals make it possible for the customer to order new services, upgrade subscriptions, and terminate services. Depending on privileges as defined in role-based access controls, some IaaS providers give the customer the ability to restart or reboot IaaS VMs; however, this is significantly less capability than a customer would have if it ran its own on-premises virtualization hypervisor. These limitations are further exacerbated when a cloud provider offers PaaS and SaaS that have more sophisticated software applications, configuration settings, and daily user administrative tasks.

Key Take-Away

There are significant gaps in self-service application administration capabilities in first-generation cloud providers and cloud management systems. Many SaaS-focused providers have long understood this problem, and their customer-facing cloud portals offer excellent self-service control panels for each application and cloud service. Public cloud providers and private cloud management systems are slowly understanding and catching up to this level of application self-service administration capability.

Given that the initial priority for most public cloud providers has been to offer IaaS applications, the lack of fully featured self-service application administration control panels has not been a significant problem. The entire industry and customers agree that platforms (PaaS) and applications (SaaS) will be much more of a focus in the coming years; therefore, the lack of self-service application-specific control panels must be solved. In the meantime, early adopters of SaaS cloud services are forced to rely on submitting service tickets to the cloud provider or internal cloud operations team for routine application administrative tasks. This is a waste of time and money for everyone.

For more details on the future of self-service control panels and next-generation priorities and challenges, refer to Chapter 9.

Software Applications and Packaging

Cloud management systems have varied capabilities to provision and administer industry-leading commercial off-the-shelf (COTS) and custom software products. It is this ability to quickly add applications to the cloud management system — and thus make them available to customers — that differentiates a powerful, robust cloud management system from an average one.

Software Applications

There are two primary types of applications cloud management systems support: COTS and customer “homegrown” applications.

COTS are widely sold and distributed products that cloud providers might install into their datacenters, using the cloud management platform to manage them all. When customers order one of these SaaS offerings, the management system has usually been preprogrammed with a “module” that knows how to provision, automatically configure, and integrate using APIs with the COTS application. As each software vendor releases new versions in the future, the cloud management platform vendor must also update its integration module or API to keep up. The cloud customer is not aware of these continuous upgrades unless new user interfaces or features appear in the self-service control panels.

When a provider wants to integrate a custom homegrown application, or some legacy customer-owned application, into the cloud, you can use or develop an integration module so that the environments can communicate with each other. Maybe the customer has a finance application that it would like deployed to the cloud; in this scenario, the cloud provider will migrate the application and data, but it will also have to utilize a module so that automated provisioning, billing, reporting, and the self-service control panels are functional. The customer simply sees an additional tab or icon on its cloud control panel called, for example, “finance application.” For more details on legacy application migration, porting, and redesign options, refer to Chapter 4.

By integrating all the custom and COTS applications, the cloud management system gives both the provider and the customer support staff a single management console for all cloud and application configuration and administration. This is very important: without this single unified cloud management console integrating everything, the cloud provider’s support personnel would have dozens of different software tools, one for each application, that they would need to use every time they wanted to create a new user or perform routine administrative tasks.

Extending the System via APIs

You can use cloud management platforms to access and control the cloud by integrating with third-party applications and software systems through an application programming interface (API). APIs are the means by which a cloud management platform can access and initiate commands or tasks in other software applications. Just as with the orchestrator, the cloud provider can create its own programs to do mass imports, exports, or other tasks using the API features of the cloud management system. The purpose of an API is to allow integration and exchange of automated commands between multiple applications and, in the case of cloud, the cloud management platform.

In a hybrid cloud environment, APIs are also utilized to integrate with external XaaS providers. The cloud management platform will normally have a library of premade modules that use APIs with multiple cloud providers, applications, and future as-yet-unknown cloud providers and XaaS offerings. Cloud brokering, which is covered in Chapter 8, also relies heavily on APIs between cloud providers.
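The "library of premade modules" idea maps naturally onto a connector interface with one implementation per provider. The Python sketch below illustrates that structure only; `ProviderConnector`, `InMemoryConnector`, and `ConnectorRegistry` are hypothetical names, and the in-memory connector records calls instead of contacting any real provider API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class ProviderConnector(ABC):
    """Common interface that every provider module implements."""

    @abstractmethod
    def create_vm(self, name: str, cpus: int, memory_gb: int) -> str:
        """Provision a VM and return the provider's identifier for it."""


class InMemoryConnector(ProviderConnector):
    """Stand-in for a real provider module; it records calls instead of
    making API requests."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self.created: List[Tuple[str, str, int, int]] = []

    def create_vm(self, name: str, cpus: int, memory_gb: int) -> str:
        vm_id = f"{self.prefix}-{len(self.created) + 1}"
        self.created.append((vm_id, name, cpus, memory_gb))
        return vm_id


class ConnectorRegistry:
    """Lets the orchestrator address any provider through one call."""

    def __init__(self) -> None:
        self._connectors: Dict[str, ProviderConnector] = {}

    def register(self, provider: str, connector: ProviderConnector) -> None:
        self._connectors[provider] = connector

    def provision(self, provider: str, name: str,
                  cpus: int, memory_gb: int) -> str:
        try:
            connector = self._connectors[provider]
        except KeyError:
            raise LookupError(f"no connector registered for {provider!r}") from None
        return connector.create_vm(name, cpus, memory_gb)
```

Because workflows depend only on the `ProviderConnector` interface, supporting a new provider means registering one more connector rather than changing the workflows that call `provision`.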

Software Packaging/Configuration Management Tools

Cloud management platforms control the overall processes and workflows within a cloud through their automation and orchestration features. However, there are numerous other software packaging and software configuration management tools that you can use, as well. Puppet and Chef, for example, are popular open source software configuration management tools that are commonly used in modern datacenters. Both of these tools have server-based applications with distributed agents on target computers throughout the chosen network. Systems administrators specify the desired software state (e.g., software applications and patches) for each target server/host machine and the Puppet/Chef software will update each target system automatically based on configurable rules and parameters. Both tools support multiple programming languages and techniques for even more advanced software installations.
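The desired-state approach can be illustrated with a small Python model: compare what should be installed with what is installed, then apply only the differences. This is a conceptual sketch, not how Puppet or Chef are implemented or configured; the function names and the package and version values are made up for the example.

```python
from typing import Dict, List, Tuple

Change = Tuple[str, str, str]  # (action, package, version)


def plan_changes(desired: Dict[str, str], actual: Dict[str, str]) -> List[Change]:
    """List the steps needed to bring `actual` in line with `desired`."""
    steps: List[Change] = []
    for package, version in sorted(desired.items()):
        if package not in actual:
            steps.append(("install", package, version))
        elif actual[package] != version:
            steps.append(("update", package, version))
    return steps


def converge(desired: Dict[str, str], actual: Dict[str, str]) -> List[Change]:
    """Apply the planned steps to `actual` and return what was changed.

    Packages that are not mentioned in `desired` are left untouched.
    """
    steps = plan_changes(desired, actual)
    for _action, package, version in steps:
        actual[package] = version
    return steps
```

Running `converge` a second time against the same target returns an empty list because the host already matches the desired state. That idempotence is what makes it safe for an agent to re-check every target on a schedule.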

Another type of software packaging tool that has gained significant popularity is the open source product called Docker (you can read more about this product in Chapter 9). Docker is an application containerization technique with which you can package and run software programs in a virtualized memory compartment within any server and OS that runs the Docker [Application] Engine. These “Dockerized” applications are in theory portable and would run on any server or operating system without the need to be recompiled.

There are other tools with which you can package applications into self-installing modules that can also be integrated into the orchestration system of a cloud management platform. These tools vary by software vendor, but they can use a combination of sequential scripts, state/image capturing, or programming tools to intelligently install software onto target servers/hosts in the cloud or datacenter. By using these tools, you can greatly simplify and automate the distribution of software and updates across any number of systems — a level of automation that is critical for a cloud or any modern datacenter environment.

It is important to note that although a cloud management platform has its own automation and orchestration system, with workflow, scripting, and service design capabilities, you can easily integrate third-party software tools such as Puppet, Chef, and Docker. There is no need to re-create software installation scripts from these third-party tools into the orchestration; simply make calls out to the appropriate external tool from the orchestrator in the workflow engine.

Cloud Management Platforms: The Build Versus Buy Decision

Depending on the number and complexity of services, cloud providers might decide that developing their own cloud management system is the right way to go. Because this is custom developed, the cloud provider has complete freedom in what features, functionality, and customer interfaces to include. If the cloud provider is only offering a simple cloud service or to a small customer market, developing a cloud management system might not be cost effective.

Building Your Own Cloud Management System

There are two sides to this build-versus-buy decision. Even the simplest cloud management system still needs to have some sort of shopping cart, billing system, reporting system, resource metering system, workflow orchestration, and automated service provisioning. Cloud providers have a significant task ahead of them to create and constantly enhance their own cloud management systems to be competitive. Those who do take on the build task themselves often end up spending two to three times the time and money they originally planned, and find they are still years behind the competition. The cloud provider will then want to add new applications or services and keep up with future revisions of every software application, resulting in never-ending development costs.

When customers begin to use the cloud management system, the need for additional features, more reporting, and more visibility on status and usage will become apparent. Customers are not shy about sending cloud providers requests for new features and enhancements. In fact, feature and scope creep is so significant that an entire section of Chapter 3 covers this topic. Multiply this across dozens, hundreds, or thousands of customers, and cloud providers quickly realize that their homegrown cloud management systems are insufficient. The system might not be as modular as desired for easy upgrade; it might not scale up as intended; the customer interface may not be as intuitive as expected; or performance may be dragging down the user experience. Adding new service offerings to the homegrown cloud management platform now takes longer than the deployment of the new servers and applications that customers actually use.

Ultimately, most cloud providers end up redeveloping some or all of their cloud management systems within the first two years of becoming a provider. The amount of money spent developing a homegrown system is only surpassed by the amount of time it takes to maintain and continuously improve it. Lessons learned have proven that a homegrown cloud management system is a long and painful road. Cloud providers should seriously consider using a commercial platform that already has the needed basic functionality and a library of premade modules for integrating with server farms, VM hypervisors, SANs, software-defined networking, and COTS applications. Most important, the software vendor that creates and sells the cloud management platform also maintains its code, provides all future upgrades and new application integrations, and supports its system. You get to run your cloud and service your customers instead of dealing with the distraction of software development and maintenance.

Key Take-Away

The few companies that make money developing their own cloud management platform actually resell their system to other providers — everyone else who develops their own cloud management system wastes millions of dollars trying to build and continuously improve their homegrown cloud management systems.

Buying a Cloud Management System

The only company that makes money and is successful in the cloud management software tool industry is one that creates and resells it; you don’t make money building your own system just for your own cloud. This lesson has been learned again and again, particularly in large datacenters and government organizations that, for various reasons, believed they would “do it better themselves.” History has shown that most of these attempts fail, cost exponentially more than expected, and result in a complete redevelopment or eventual purchase of a commercial cloud management product after years of frustration.

The exception to this “don’t build your own” rule is extremely large public cloud providers such as Amazon, Microsoft, and Google, all of whom have the investment capability, internal developers, and massive growth plans to justify using their own cloud management platform. That said, even these large providers end up starting with a very basic, unintuitive customer experience that initially limits customer adoption and satisfaction. These large providers spend millions of dollars continuously improving their cloud platforms, eventually maturing to a level of competitive features and hopefully keeping up with competitors who are continuously improving as well.

As a private cloud provider, you should focus on your cloud services and your customers rather than spending all your money, time, and focus building and managing your own cloud management tool. Just as cloud providers sell the cloud by telling customers to focus on their customers’ needs and not commodity IT, the cloud providers should heed their own advice and focus on being a cloud provider, not a software development shop.

There are many cloud management software systems available for purchase. Of the major vendors and open source options, there is a considerable variance in features and maturity. Later in this chapter, “Commercial Cloud Management Platforms” presents a summary of available cloud management platforms and software providers.

Purchasing and Upgrading a Cloud Management System

When evaluating and purchasing a cloud management system, some software vendors will sell you a license for their product with an initial up-front price, and then charge a percentage of that cost each year for support, maintenance, and free upgrades.

These upgrades aren’t really free given that you are paying the annual maintenance fees; however, the huge benefit to this approach is that you don’t need to worry about continuously developing updates for your cloud management system. As the software manufacturers release new versions of your hosted COTS applications, the cloud portal vendor will provide you with its latest version. There is usually a two- to four-month gap between the update of a major COTS product and the cloud management vendor’s release of its corresponding new version. You aren’t in the business of constantly keeping your cloud control panels updated with every new COTS release. As a provider, you will need to upgrade your COTS application servers in your server farm, but at least the cloud management system upgrade will be done for you.

Key Take-Away

The first time you go through a major software application upgrade, you will truly realize the benefits of having purchased your cloud management system and support from a third party.

When you purchase your cloud management system, some software vendors offer an alternative licensing model in which they charge a smaller up-front fee and then pay-as-you-go charges for each user, VM, or other unit of measure, depending on what services you have enabled. This makes it possible for the provider to start small with just a few customers and pay only the low licensing fees each month. As you grow in users and revenue, so do the fees you pay to the cloud management software vendor; this is essentially a shared-growth model wherein both parties are incentivized to maintain a stable system and increase customer count. I highly recommend this model for small but growing cloud providers to reduce up-front capital expenses.

For private cloud deployments within an organization that has a relatively fixed number of users, you might get a better “bang for the buck” by purchasing the cloud management system outright and only paying ongoing maintenance fees.
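The trade-off between the two licensing models comes down to simple arithmetic. The following is a minimal sketch of that comparison; every price, the 20 percent maintenance rate, the user counts, and the function names are hypothetical values chosen for illustration, not figures from any vendor.

```python
"""Compare two hypothetical cloud management licensing models.

All prices, rates, and user counts are made-up illustrations.
"""


def perpetual_cost(upfront, maintenance_rate, years):
    """Up-front license plus a yearly maintenance fee.

    maintenance_rate is a fraction of the up-front license price.
    """
    return upfront + upfront * maintenance_rate * years


def usage_cost(upfront, fee_per_user_month, users_by_year):
    """Smaller up-front fee plus a monthly charge per user.

    users_by_year lists the average user count for each year.
    """
    return upfront + sum(users * fee_per_user_month * 12 for users in users_by_year)


if __name__ == "__main__":
    growing = [200, 500, 1500, 4000, 8000]   # small provider adding customers
    fixed = [6000] * 5                       # organization with a stable user base
    for label, users in (("growing provider", growing), ("fixed user base", fixed)):
        buy = perpetual_cost(500_000, 0.20, len(users))
        pay_go = usage_cost(50_000, 3.0, users)
        print(f"{label}: perpetual=${buy:,.0f} pay-as-you-go=${pay_go:,.0f}")
```

With these invented numbers, the growing provider pays less over five years under the pay-as-you-go model, whereas the organization with a fixed user base pays less by purchasing outright. Real quotes will shift the break-even point, so rerun the comparison with actual vendor pricing.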

Commercial Cloud Management Platforms

With the industry’s newfound understanding that the cloud management platform is the core of any public or private cloud, the software companies providing these platforms are becoming better known. For most public cloud providers, the underlying cloud management platform is more likely a self-developed tool with possibly some COTS tools that providers often do not announce publicly. The private cloud management tools are where the competition is fierce and where the management platform that you choose affects just about every aspect of the customer experience, services offered, billing, operating procedures, flexibility, and cost of the service.

There are several industry-leading private cloud management platforms available. There have been numerous acquisitions in this industry, many by systems integrators, which further illustrate the importance and competitiveness of these cloud management platforms.

In the following subsections, I describe the industry-leading private cloud management platforms and providers (note that many of these companies also provide public or other cloud-related services that are described in Chapter 9); the following list is strictly for cloud management software platforms:

VMware

VMware’s cloud management platform is called vRealize Suite. This suite includes multiple software components for private and hybrid cloud management. The vRealize Suite integrates with VMware’s well-known hypervisor vSphere and the vCloud Suite for internal datacenter and infrastructure management. Given VMware’s popularity and long history in the server virtualization industry, it is a common choice for private clouds as the VM hypervisor.

Microsoft

Microsoft’s private cloud management platform is called Microsoft Azure Pack. This system integrates with on-premises Microsoft System Center and Windows Servers in your datacenter. Microsoft also has its own hypervisor server virtualization technology (Hyper-V) through which it can provide a complete end-to-end cloud software platform, from hypervisor to cloud management system to operating systems and applications. Microsoft also offers a public cloud service named Azure (described in Chapter 5), which can integrate with the private cloud Azure Pack to provide hybrid capabilities.

Hewlett-Packard

Hewlett-Packard’s suite of cloud management software components includes Cloud Service Automation, Operations Orchestration, and Server Automation. These software components are also available as an integrated platform called HP Helion CloudSystem that includes OpenStack, supports multiple hypervisors, and manages any combination of private, public, and hybrid cloud providers. Hewlett-Packard also offers a suite of cloud application development, database automation, security, and datacenter service operations tools.

BMC Software

BMC’s cloud management platform is called Cloud Lifecycle Management. This system was originally based upon an industry-leading IT service-request management system called Remedy. BMC has its own automation, orchestration, and library of integration modules to connect to its own suite of datacenter operations tools and many third-party applications, hypervisors, and cloud providers.

Citrix

Citrix’s cloud management platform is called CloudPlatform. This system is powered by Apache CloudStack, the cloud platform that Citrix open sourced after acquiring Cloud.com. CloudPlatform manages private, public, and hybrid cloud environments and supports multiple hypervisors. Citrix also has its own hypervisor, called Citrix XenServer.

Computer Sciences Corporation (CSC)

CSC’s cloud management platform is called ServiceMesh. This system provides workflow, policy, and governance–focused IT service-management deployed in a private cloud. The system is designed for flexibility in automated business processes of a wide range of IT services including integration to cloud-centric hypervisors, applications, and third-party PaaS and SaaS providers.

Parallels

Parallels’ cloud management platform is called Parallels Automation. This system was originally a SaaS-focused automation and self-service control panel system, but it now manages IaaS applications with multiple hypervisor support. Parallels manages private, public, and hybrid clouds and is used by many Internet, telecommunications, and cloud service providers. Parallels also has a hypervisor platform called Parallels Cloud Server.

Red Hat

Red Hat’s cloud management platform is called CloudForms. This system is based on its acquisition of ManageIQ’s Enterprise Virtualization Manager and is focused on management of internal infrastructure servers and virtual machines with a mature self-service administration portal. Red Hat has also committed to using OpenStack so that it can continue to mature and evolve its private and hybrid cloud platforms and integration to its industry-leading Linux platform.

RightScale

RightScale’s cloud management platform is called Cloud Portfolio Management (CPM). This system provides a single “pane of glass” to manage and govern cloud services across private or multiple public cloud service providers.

There are many cloud management platform vendors and software systems that are not included in this book. There are simply too many mid-level and up-and-coming platforms to include so I have attempted to cover the most significant and industry-leading vendors. It is safe to say that I’ve more than covered all the “industry leader” cloud management platforms that appear in the top two industry analyst publications.

Open Source Cloud Platforms and Industry Standards

Several open source cloud management platforms are emerging and continuously improving, with OpenStack and CloudStack being the two most significant. Some of these cloud management platforms have broad industry support but might lag behind commercial cloud management platforms in features and functionality. Organizations that want to evaluate or deploy this class of cloud management system should consider the pros and cons of using open source versus commercially supported software platforms. Open source’s largest benefit is avoiding software vendor lock-in; however, the reality is that some of the best cloud management platforms in the industry are either proprietary or a combination of open source and proprietary, so evaluate and choose carefully.

Following are two widely used open source cloud management platforms:

OpenStack

OpenStack is the open source cloud project with the largest community, the most code contributors, and the greatest cloud provider and systems integrator involvement. OpenStack’s main goal is to support interoperability between cloud services while enabling enterprises to create Amazon-like cloud services. OpenStack is a combination of modules that you can use to build, host, and operate your own cloud. OpenStack modules are available to provide IaaS VMs, object and block storage, networking, identity, and many other services. There is an OpenStack module for cloud management automation and orchestration called Heat and another module called Horizon that is a customer-facing self-service configuration portal. Because OpenStack is an open source project, all developed source code is submitted to the OpenStack committee, and the combined code is released to the public free of charge. Notable founders and adopters of OpenStack include Rackspace, NASA, Hewlett-Packard, and IBM, each adding to and customizing OpenStack for their customer deployments and integration into other cloud management platforms and providers. Beyond its role as a cloud management platform, OpenStack is also seen as an industry standard for application interfaces and interoperability between cloud providers, and it is expected to at least influence software-defined networking and datacenters in the future. Given the number of well-known companies that have committed to OpenStack, it is expected to dominate as the industry’s open source standard for cloud platforms and API integration between clouds.

CloudStack

CloudStack is largely considered the primary competitor to OpenStack. CloudStack was originally developed by Cloud.com, which was purchased by Citrix, and later released to the Apache Incubator program. Citrix is still involved in the open source platform, but the Apache Software Foundation now governs CloudStack. Key features of CloudStack include an easier, streamlined deployment, scalability, and multiple hypervisor support (Citrix Xen, Oracle VM, VMware, and KVM). CloudStack has fewer sponsors and corporations providing code and support compared to OpenStack.

The Organization for the Advancement of Structured Information Standards (OASIS) is an industry-standards organization, and its Topology and Orchestration Specification for Cloud Applications (TOSCA) is notable as both a cloud portability standard and an orchestration standard. TOSCA is not a full cloud platform like OpenStack or CloudStack but is instead a standard for cloud management platforms. The TOSCA standard provides an interoperable description of application and infrastructure cloud services and the operations of these services. As of this writing, TOSCA is continuously evolving, with version 2.0 in development. Several cloud management platform software vendors as well as open source cloud platforms are already planning to include TOSCA support in their software systems.
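To illustrate what a portable description of a cloud service might capture, here is a minimal sketch of a service topology with components and their dependencies, from which a platform can derive a deployment order. This is not TOSCA syntax; the `NodeTemplate` class, the type labels, and the example component names are all hypothetical, and only the Python standard library is used.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class NodeTemplate:
    """One component of the service, such as a VM, database, or application."""
    name: str
    node_type: str                      # hypothetical label, not a TOSCA type name
    depends_on: list = field(default_factory=list)


def deployment_order(nodes):
    """Return node names so that each node follows the nodes it depends on."""
    known = {n.name for n in nodes}
    for n in nodes:
        missing = [d for d in n.depends_on if d not in known]
        if missing:
            raise ValueError(f"{n.name} depends on undefined node(s): {missing}")
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {n.name: set(n.depends_on) for n in nodes}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical three-tier service used only for illustration.
topology = [
    NodeTemplate("web_app", "application", depends_on=["app_server", "database"]),
    NodeTemplate("app_server", "vm"),
    NodeTemplate("database", "dbms", depends_on=["db_server"]),
    NodeTemplate("db_server", "vm"),
]

if __name__ == "__main__":
    print(deployment_order(topology))
```

Because the description is plain data rather than a provider-specific script, any management platform that understands the format could, in principle, deploy the same service; that is the portability idea behind a standard such as TOSCA.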

Configuration Automation Tools

There is a class of cloud automation and configuration platforms that focus on the creation of provisioning packages (known as recipes) and automated software configuration. This class of configuration automation tool does not normally have the service catalogs, unified billing, resource utilization tracking, and overall cloud management features of the products just described. These tools often specialize in scripted provisioning of VMs, OSs, and software, with a full cloud management platform used to kick off or initiate these automated configuration tools.

There are too many similar automation tools available as open source or commercial software to cover in this book. Notable open source configuration automation tools in the industry include the following:

Puppet

This tool provides both a command-line and web user interface for managing automated configuration scripts. Puppet uses its own programming language, which is based on Ruby (native Ruby is also supported) and requires programmatic expertise. Puppet uses a modeling approach to configure automated provisioning recipes and a robust push capability to update existing servers.

Chef

This tool is similar to Puppet in that it uses programmed scripts or recipes to automate the deployment and updating of servers, OSs, and software. Chef recipes are written in the Ruby programming language, which is sometimes preferred by IT teams that have significant programming experience and which affords extremely detailed and customized scripts.

Salt

Salt is similar to Puppet and Chef in its ability to configure automated scripts or recipes for deploying software. You can create customized scripts or modules by using the Python or PyDSL programming languages or you can download premade modules. Salt’s biggest advantage is in scalability and resiliency through the optional use of multiple master servers, distribution servers, and minion remote agents.

Git

Git is a popular distributed version control system for software developers, with programmable workflows. Git is commonly used to automate the deployment of applications or websites to the cloud via Linux commands. GitHub is a web-based service for Git users and developers. Whereas Git is strictly a command-line tool, GitHub provides a graphical interface for desktop or mobile integration, code sharing, collaboration, task management, and bug tracking.

Docker

Docker is an open source software system and development community that is focused on the automated deployment of applications by using software containers. Using Docker’s application containerization technique, you can package and run software programs in a virtualized memory compartment within any server and OS that runs the Docker [Application] Engine. You can use Docker containers within most private clouds as well as many of the industry’s most popular public cloud environments.

Cloudify

This is an open source cloud orchestration tool for automated deployment and updating of servers, OSs, and software. Each Cloudify recipe describes an application blueprint that includes the detailed instructions for installing and managing applications, including subcomponents and dependencies. Cloudify follows the TOSCA standard for cloud automation.
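A common idea behind recipe-based tools of this kind is that a recipe declares the desired state of a server, and the tool makes only the changes needed to reach it. The following is a minimal sketch of that idea, not the behavior of any specific tool named above; the `converge` function, the package names, and the version numbers are hypothetical.

```python
def converge(current, desired):
    """Bring `current` (package -> version) in line with `desired`.

    Returns the list of actions taken. A second run against the same
    server returns an empty list, because nothing is left to change.
    """
    actions = []
    for package, version in desired.items():
        installed = current.get(package)
        if installed is None:
            actions.append(f"install {package} {version}")
        elif installed != version:
            actions.append(f"upgrade {package} {installed} -> {version}")
        else:
            continue                    # already in the desired state
        current[package] = version      # stand-in for the real install step
    return actions


server = {"nginx": "1.6"}                       # hypothetical starting state
recipe = {"nginx": "1.8", "postgresql": "9.4"}  # hypothetical desired state

if __name__ == "__main__":
    print(converge(server, recipe))   # first run changes the server
    print(converge(server, recipe))   # second run finds nothing to do
```

Running the same recipe repeatedly without side effects is what makes it safe for a cloud management platform to trigger these tools both at initial provisioning and later for updates.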

Cloud Management Best Practices

Based on lessons learned and experience from across the cloud industry, you should consider the following best practices for your organization’s planning.

Cloud Management Platform

All cloud providers and operators of a private cloud utilize a cloud management system that provides a customer ordering portal, subscription management, automated provisioning, billing and resource utilization tracking, and management of the cloud. Here are some recommendations:

§ Use the full capabilities of the cloud management platform’s orchestration system. Do not rely on legacy sequential scripts for automation; instead, use orchestration-based workflows and service designs.

§ Utilize third-party automated software installation packaging tools along with the orchestration system for maximum flexibility, but ensure that the orchestrator is the primary logic engine/workflow tool.

§ When selecting or building a private cloud, evaluating and selecting the cloud management platform is one of the most critical decisions. The features, functionality, and vendor support (including future updates and new features) of the cloud management system will have an impact on every aspect of how a private cloud is managed, the user experience, and the time and cost of releasing initial and new service catalog items.

§ If your organization is planning to use a private cloud and one or more public cloud providers, consider implementing a hybrid cloud management platform rather than just a private management system. For full details on hybrid cloud and management platforms, refer to Chapter 8.

§ The following are the features to look for in a cloud management platform:

§ Usability and customization capabilities of the consumer-facing portal

§ Ability to create, manage, and display multiple categories and types of cloud services through a service catalog

§ Ability to see pricing, optional services and pricing, terms and conditions, and renewal settings for all cloud services through a shopping cart or order checkout process.

§ Customizable order-approval workflow process

§ Consumer-visible status dashboards and reports on utilization, billing, and service status

§ Ability to manage subscribed cloud services (i.e., perform service restart, stop, pause, and add new capacity) via the cloud portal

§ Preintegrated support for multiple hypervisors and external cloud service providers, plus integration with internal service-ticketing systems and with security and network/systems monitoring software

§ Most cloud management platforms have some form of on-demand reporting and statistics dashboard available for cloud consumers. Cloud operators and consumers will quickly want more and more real-time dashboards, usage statistics, trending/forecasting, and so on. Expect customer requests to outpace development efforts initially, so manage expectations and plan enhancements for future portal releases.

§ There is a trend in the industry in which service-ticketing software vendors create cloud XaaS items in their service-ticketing systems and claim that the result is a cloud management platform. Most of the backend architectures of these products are not conducive to the cloud and are just IT service-desk ticketing and workflow systems. These systems can make API calls into hypervisors and other cloud services, but the architecture, functionality, and integrations are nowhere near the level of a true cloud management platform. Be very detailed and careful in evaluating service-desk, IT service delivery, and other platforms that were originally designed and intended as IT service management and support ticketing systems.
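The first recommendation in this list, to prefer orchestration-based workflows over legacy sequential scripts, can be sketched as follows. An orchestrated workflow adds an approval gate, records the status of each step, and rolls back completed steps when a later step fails. This is a simplified illustration under assumed names; the `Step` class, the `run_workflow` function, the step names, and the approval rule are hypothetical and do not reflect any vendor’s orchestration engine.

```python
class Step:
    """A workflow step with an action and a compensating (undo) action."""

    def __init__(self, name, action, undo):
        self.name = name
        self.action = action
        self.undo = undo


def run_workflow(order, steps, approver, log):
    """Run steps in sequence after approval; roll back on failure."""
    if not approver(order):
        log.append("rejected")
        return False
    done = []
    for step in steps:
        try:
            step.action(order)
            log.append(f"done:{step.name}")
            done.append(step)
        except Exception as exc:
            log.append(f"failed:{step.name}:{exc}")
            # Undo completed steps in reverse order so nothing is left half-built.
            for finished in reversed(done):
                finished.undo(order)
                log.append(f"undone:{finished.name}")
            return False
    return True


def noop(order):
    """Placeholder for a real provisioning or cleanup call."""


def fail_install(order):
    raise RuntimeError("package repository unreachable")


# Hypothetical service design: the last step fails to show the rollback path.
steps = [
    Step("reserve_ip", noop, noop),
    Step("create_vm", noop, noop),
    Step("install_app", fail_install, noop),
]

if __name__ == "__main__":
    log = []
    ok = run_workflow({"item": "small-linux-vm", "cost": 40}, steps,
                      approver=lambda o: o["cost"] <= 100, log=log)
    print(ok, log)
```

A plain sequential script would simply stop at the failed step and leave the reserved address and VM behind; keeping the approval, status tracking, and rollback logic in the orchestrator is the reason to make it the primary workflow engine.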

Building Versus Buying a Cloud Management Platform

Cloud management platforms are very complex and expensive to design, develop, support, and then continuously upgrade with new features and integration with the latest cloud technologies. Here are some considerations:

§ With the exception of the very largest public cloud providers, even large-scale cloud providers rarely build their own cloud management platforms. When building your own private cloud, don’t build your own cloud management platform, because the cost and effort are extensive and will likely draw your focus away from your core mission.

§ Although you can certainly customize your private cloud management platform to suit your cloud operations and your customers, remember not to go too far with the customizations to the point where your system is now so unique that the original software manufacturer can no longer apply new features and upgrades to your cloud platform.

Hybrid Cloud

A theme and trend mentioned throughout this book is that most private clouds will eventually use one or more third-party public cloud services, so hybrid clouds will soon be the norm. Here are some considerations:

§ When starting with a public cloud provider, you must use its cloud management platform and that platform’s inherent capabilities. If you later deploy a private cloud, you will need to deploy your own private cloud management platform and manage your two clouds individually. If you begin with a hybrid cloud management platform, you can use the hybrid platform’s capability to integrate with and effectively hide the public cloud management systems, using your private/hybrid cloud platform for a single management and customer experience.

§ When evaluating and selecting a private or hybrid cloud management platform, look for preintegrated features and integrations with legacy datacenter hypervisors, networks, storage systems, and support for third-party cloud providers. This means that the external cloud provider’s capability is readily supported when you are ready to use those features.

§ Remember that hybrid clouds can involve any combination of private, public, community, or legacy enterprise IT, regardless of where the data, applications, and server farms are hosted (e.g., on premises, at a provider, or at a leased colocation facility).

§ If you use a public cloud provider’s management platform and self-service portal, the provider might offer some hybrid capabilities to integrate back to internal enterprise IT applications, directory services, and data; however, you might be very limited in flexibility and management. In most scenarios, you are better off starting with a private/hybrid cloud management system that extends outward to one or more public cloud provider(s) — giving you the most flexibility, customization, and choice of providers and XaaS.
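The idea of a hybrid platform that hides each provider’s own management system behind a single ordering and management experience can be sketched with a common interface and one adapter per cloud. This is a minimal illustration only; the class names, the `create_vm` method, and the placement policy are hypothetical, and a real adapter would call the provider’s actual API where the comments indicate.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Common interface the hybrid platform expects from every cloud."""

    @abstractmethod
    def create_vm(self, name, size):
        """Provision a VM and return a provider-specific identifier."""


class PrivateCloud(CloudProvider):
    def create_vm(self, name, size):
        # A real adapter would call the on-premises hypervisor manager here.
        return f"private:{name}:{size}"


class PublicCloud(CloudProvider):
    def create_vm(self, name, size):
        # A real adapter would call the public provider's API here.
        return f"public:{name}:{size}"


class HybridPlatform:
    """Single ordering point that routes each request to a registered cloud."""

    def __init__(self):
        self.providers = {}
        self.inventory = []          # one combined view for reporting and billing

    def register(self, label, provider):
        self.providers[label] = provider

    def order_vm(self, name, size, sensitive):
        # Hypothetical placement policy: sensitive workloads stay on premises.
        label = "private" if sensitive else "public"
        vm_id = self.providers[label].create_vm(name, size)
        self.inventory.append(vm_id)
        return vm_id


if __name__ == "__main__":
    platform = HybridPlatform()
    platform.register("private", PrivateCloud())
    platform.register("public", PublicCloud())
    platform.order_vm("hr-db", "large", sensitive=True)
    platform.order_vm("web-01", "small", sensitive=False)
    print(platform.inventory)
```

Adding another public provider then means writing one more adapter and registering it, without changing the portal that customers use. This is the flexibility the preceding considerations attribute to starting with a private/hybrid management platform.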

Open Source

As has happened across the IT industry over the past 30 years, proprietary platforms and services often lead the market until the need for standards and interoperability becomes so paramount that open source platforms become the norm. Cloud services and management platforms are now at this stage of evolution: nonproprietary open source cloud services and platforms are more desirable and cheaper, and they offer more integration capabilities with other applications, cloud providers, and legacy enterprise IT. Here’s more information:

§ Using an open source cloud management platform has some unique implications:

§ Open source platforms do not always have as many features, or mature as quickly, as a COTS system, especially during the initial project startup years. This is where some software vendors offer their own versions or distributions of the open source platform, adding capabilities or enhancements above and beyond the open source application.

§ Open source has the benefit of reducing software vendor lock-in

§ Some organizations contribute a significant amount of code and suggested standards, but do not always get their ideas adopted as part of the new software release. Any enhancements a software vendor wants to add to the open source system capability will need to be retested and modified every time a new software revision is released to ensure compatibility; this can be a costly and never-ending endeavor.

§ Open source often becomes (officially or de facto) the industry standard for APIs and cross-vendor integration

§ Systems integrators and cloud service providers that use open source systems might find it difficult to differentiate their capabilities and features from other competitors who use the same open standard (or even end customers with their own deployment of the open source platform).

§ Customers often use open source to avoid lock-in to any single vendor’s cloud management platform and API standards. Open source platforms are also easier to evaluate for security purposes, given that the source development code is available to the public. Of course, some could also argue that this open source code also helps hackers identify potential weaknesses in security, but this is usually mitigated by a large community of open source testers.