Information Security Management Handbook, Sixth Edition (2012)
DOMAIN 3: INFORMATION SECURITY AND RISK MANAGEMENT
Security Management Planning
Chapter 11. CERT Resilience Management Model: An Overview
Bonnie A. Goins Pilewski and Christopher Pilewski
The CERT® Resilience Management Model (CERT-RMM) is a process model that seeks to improve the management of risk and maintain operational resilience for an organization. It does this by aligning the business continuity management and IT operations and security management disciplines. It also brings the concept of quality and process management into the organization. CERT defines quality as “the extent to which an organization controls its ability to operate in a mission-driven, complex risk environment [CMMI Product Team 2006].”
RMM presents the disciplines above through a process approach, which allows the organization to apply process improvement mechanisms and to develop a basis for metrics and measurement. As most security professionals have experienced during their careers, it is difficult at best to craft meaningful metrics for security implementation, so any tool that assists in this capacity is very welcome. RMM also provides a unified framework for organizing the resilience work that is performed within the organization. As is true of process maturity models such as Capability Maturity Model Integration (CMMI), RMM provides a base for process institutionalization and organizational process maturity.
CERT-RMM v1.0 contains 26 process areas that cover four areas of operational resilience management: enterprise management, engineering, operations, and process management. The practices focus on the activities that an organization performs to actively direct, control, and manage operational resilience. The model does not prescribe specifically how an organization should secure information. Instead, it focuses on identifying critical information assets, making decisions about the activities and controls required to protect and sustain these assets, implementing strategies to achieve asset control, and maintaining control throughout the life of the assets.
The process areas and their tags are listed in Table 11.1.
The model is managed much the same as the CMMI and includes the following levels for measurements:
Level 0: Incomplete
Level 1: Performed
Level 2: Managed
Level 3: Defined
Levels 4 and 5: Quantitatively Managed and Optimizing
Table 11.1 Process Area Tags
Process Area | Tag |
Asset Definition and Management | ADM |
Access Management | AM |
Communications | COMM |
Compliance | COMP |
Controls Management | CTRL |
Environmental Control | EC |
Enterprise Focus | EF |
External Dependencies Management | EXD |
Financial Resource Management | FRM |
Human Resource Management | HRM |
Identity Management | ID |
Incident Management and Control | IMC |
Knowledge and Information Management | KIM |
Measurement and Analysis | MA |
Monitoring | MON |
Organizational Process Definition | OPD |
Organizational Process Focus | OPF |
Organizational Training and Awareness | OTA |
People Management | PM |
Risk Management | RISK |
Resilience Requirements Development | RRD |
Resilience Requirements Management | RRM |
Resilient Technical Solution Engineering | RTSE |
Service Continuity | SC |
Technology Management | TM |
Vulnerability Analysis and Resolution | VAR |
As stated in the CERT-RMM Report, RMM includes a process definition, expressed in capability areas across the four RMM framework competencies (enterprise management, engineering, operations management, and process management); a focus on the resiliency of four essential operational assets (people, information, technology, and facilities); processes and practices that define a scale of five capability levels for each capability area (incomplete, performed, managed, directed, and continuously improved); and easy alignment with, and reference to, common codes of practice such as ISO 27000, ITIL, COBIT, BS 25999, and ISO 24762.
RMM also includes quantitative process metrics and measurements that can be used to ensure that operational resiliency processes perform as intended.
Key Components of the RMM
RMM capability areas define the resiliency engineering process. Each capability area has a set of goals. Goals are required elements of the capability area. An example of a goal from the Service Continuity (SC) capability area is “SC-1 Prepare for Service Continuity.” These goals are broken down into specific practices. Specific practices are considered to be the “base practices” of the capability. An example of a specific practice from the SC capability area is “SC-1.1 Plan for Service Continuity,” which is a practice aimed at completing the goal “SC-1 Prepare for Service Continuity.”
These practices are also broken down into subpractices. Subpractices are neither specific nor detailed, but help the user determine how specific practices are implemented and how this helps achieve the goals of the capability area. Each organization will have its own subpractices either organically developed by the organization or acquired from a code of practice.
Subpractices can be linked to common codes of practice. Subpractices are typically generic in nature, while codes of practice can be very specific. For example, a subpractice may suggest “set password standards and guidelines” while a specific code of practice may state that “passwords should be changed in no longer than 90 day intervals.”
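The hierarchy described above (capability area, goal, specific practice, subpractice, and optional links to codes of practice) can be pictured as a small data structure. The following Python sketch is illustrative only: the class and field names are assumptions made for this example and are not defined by the RMM. The source does not say which practice the password subpractice belongs to, so it is shown on its own.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subpractice:
    description: str
    # Specific statements from codes of practice linked to this subpractice
    # (hypothetical field name).
    codes_of_practice: List[str] = field(default_factory=list)


@dataclass
class SpecificPractice:
    identifier: str
    title: str
    subpractices: List[Subpractice] = field(default_factory=list)


@dataclass
class Goal:
    identifier: str
    title: str
    practices: List[SpecificPractice] = field(default_factory=list)


@dataclass
class CapabilityArea:
    tag: str
    name: str
    goals: List[Goal] = field(default_factory=list)


def practice_ids(area: CapabilityArea) -> List[str]:
    """List the identifiers of every specific practice in a capability area."""
    return [p.identifier for g in area.goals for p in g.practices]


# The Service Continuity example from the text.
sc = CapabilityArea(
    tag="SC",
    name="Service Continuity",
    goals=[
        Goal(
            "SC-1",
            "Prepare for Service Continuity",
            practices=[SpecificPractice("SC-1.1", "Plan for Service Continuity")],
        )
    ],
)

# The generic subpractice and the specific code-of-practice statement
# quoted in the text, not attached to any particular practice here.
password_subpractice = Subpractice(
    "Set password standards and guidelines",
    codes_of_practice=[
        "Passwords should be changed in no longer than 90 day intervals"
    ],
)
```

Keeping the code-of-practice statements as a list on the subpractice reflects the point that one generic subpractice may be satisfied by several specific statements drawn from different codes.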
Examples of common codes of practice, as described in the RMM Report, are summarized next.
BS 25999
BS 25999 is the British Standards Institution’s (BSI’s) code of practice and specification for business continuity management. The purpose of the standard is to provide a basis for understanding, developing, and implementing business continuity within an organization and to provide confidence in the organization’s dealings with the customers and other organizations.
There are two BS 25999 documents: the Code of Practice, BS 25999-1:2006 [BSI 2006] and the specification, BS 25999-2: 2007 [BSI 2007].
COBIT
COBIT is the Control Objectives for Information and Related Technology [ITGI 2007]. It was developed by the Information Systems Audit and Control Association (ISACA) and the IT Governance Institute (ITGI) to provide managers, auditors, and IT users with generally accepted information technology control objectives to maximize IT benefits and ensure appropriate IT governance, security, and control.
References are also made to Val IT [ITGI 2006] in this document. Val IT is a reference framework that addresses the governance of IT-enabled business investments.
COSO Enterprise Risk Management
In 2004, the Committee of Sponsoring Organizations of the Treadway Commission (COSO) issued an enterprise risk management framework to help organizations enhance their corporate governance and risk management activities [COSO 2004]. The ERM integrated framework provides a broader risk management view that encompasses COSO’s original focus on internal controls.
CMMI
CMMI® is a process improvement maturity model for the development of products and services. It has several constellations or areas of interest that provide application-specific models that share common content.
The CMMI for Development (CMMI-DEV) represents the systems and software development domain [CMMI 2006]. In addition, the CMMI for Services (CMMI-SVC) constellation is represented by a draft CMMI model designed to cover the activities required to manage, establish, and deliver services [SEI 2007].
DRJ/DRII GAP
The DRJ/DRII GAP (Generally Accepted Practices) is put forth jointly by the Disaster Recovery Journal (DRJ) and the Disaster Recovery Institute International [DRJ 2007]. GAP is a set of identified and documented standards and guidelines that aim to create a depository of knowledge by and for the business continuity profession. The practices are aligned with DRII’s 10 areas of professional practice, as detailed in the DRII Professional Practice Guidelines.
FFIEC
The Federal Financial Institutions Examination Council (FFIEC) publishes a series of booklets that comprise the FFIEC Information Technology Examination Handbook. These booklets are published to help bank examiners evaluate the risk management processes of financial institutions and service providers, with the goal of ensuring the availability of critical financial services.
ISO/IEC 20000-2: 2005 (E)
ISO/IEC 20000 is a standard and code of practice for IT service management published by the International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC). It is based on (and supersedes) the earlier British Standard BS 15000. It reflects the best practice guidance for IT service management as provided in the ITIL (Information Technology Infrastructure Library) framework, but also broadly covers other service management standards.
ISO/IEC 24762: 2008
ISO/IEC 24762, “Guidelines for information and communications technology disaster recovery services” [ISO/IEC 2008], is part of the business continuity management standards published by ISO/IEC. It can be applied in-house or to outsourced providers of disaster recovery physical facilities and services.
ISO/IEC 27002: 2005
ISO/IEC 27002, “Code of Practice for Information Security Management” [ISO/IEC 2005b], is also published by ISO/IEC. It is part of a growing “27000 series” that evolved from the original British Standard BS 7799, which was translated to ISO standard ISO 17799.
NFPA 1600
NFPA 1600 is the National Fire Protection Association Standard on Disaster/Emergency Management and Business Continuity Programs [NFPA 2007]. It is primarily focused on the development, implementation, and operation of disaster, emergency, and business continuity programs, including the development of various types of related plans. The 2007 edition of this standard was used for reference and is an update of the 2004 standard.
Crossmapping
Materials are available that demonstrate the relationship among existing standards, their constituent relationships, and the RMM framework.
Figure 11.1 illustrates the relationship of RMM to these bodies of knowledge.
Figure 11.1 Relationship of CERT-RMM to CMMI process areas and bodies of knowledge.
Process Areas
Table 11.2 represents the process areas of the RMM, by category.
These process areas also have equivalents in other process models, such as CMMI. This allows the user to align resiliency processes with ongoing work in the integration activities of the organization.
Table 11.2 Process Areas by Category
Category | Process Area |
Engineering | Asset Definition and Management |
Engineering | Controls Management |
Engineering | Resilience Requirements Development |
Engineering | Resilience Requirements Management |
Engineering | Resilient Technical Solution Engineering |
Engineering | Service Continuity |
Enterprise Management | Communications |
Enterprise Management | Compliance |
Enterprise Management | Enterprise Focus |
Enterprise Management | Financial Resource Management |
Enterprise Management | Human Resource Management |
Enterprise Management | Organizational Training and Awareness |
Enterprise Management | Risk Management |
Operations | Access Management |
Operations | Environmental Control |
Operations | External Dependencies Management |
Operations | Identity Management |
Operations | Incident Management and Control |
Operations | Knowledge and Information Management |
Operations | People Management |
Operations | Technology Management |
Operations | Vulnerability Analysis and Resolution |
Process Management | Measurement and Analysis |
Process Management | Monitoring |
Process Management | Organizational Process Definition |
Process Management | Organizational Process Focus |
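For readers who want to work with the process areas programmatically, for example in an assessment spreadsheet or tracking tool, Table 11.2 can be captured as a simple lookup keyed by category, using the tags from Table 11.1. This is a minimal Python sketch; the names `PROCESS_AREAS_BY_CATEGORY` and `category_of` are assumptions made for illustration and do not come from the RMM.

```python
from typing import Optional

# Table 11.2 expressed with the tags from Table 11.1.
PROCESS_AREAS_BY_CATEGORY = {
    "Engineering": ["ADM", "CTRL", "RRD", "RRM", "RTSE", "SC"],
    "Enterprise Management": ["COMM", "COMP", "EF", "FRM", "HRM", "OTA", "RISK"],
    "Operations": ["AM", "EC", "EXD", "ID", "IMC", "KIM", "PM", "TM", "VAR"],
    "Process Management": ["MA", "MON", "OPD", "OPF"],
}


def category_of(tag: str) -> Optional[str]:
    """Return the category a process area tag belongs to, or None if unknown."""
    for category, tags in PROCESS_AREAS_BY_CATEGORY.items():
        if tag in tags:
            return category
    return None


# Sanity check: the model contains 26 process areas in total.
total_process_areas = sum(len(tags) for tags in PROCESS_AREAS_BY_CATEGORY.values())
```

Summing the four lists gives a quick consistency check against the 26 process areas that CERT-RMM v1.0 is stated to contain.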
The alignment between CMMI and RMM is represented in Table 11.3.
RMM also contains the generic goals and practices that the organization implements to improve its organizational processes and capability to manage its environment toward resiliency of its operations. These practices exhibit the organization’s commitment and ability to perform resilience management processes, as well as its ability to measure performance and verify implementation. Generic processes are detailed in the RMM for use where noted.
Table 11.3 CMMI to RMM Alignment
CMMI Models Process Areas | Equivalent CERT-RMM Process Areas |
CAM – Capacity and Availability Management (CMMI-SVC only) | TM – Technology Management. CERT-RMM addresses capacity management from the perspective of technology assets. It does not address the capacity of services. Availability management is a central theme of CERT-RMM, significantly expanded from CMMI-SVC. Service availability is addressed in CERT-RMM by managing the availability requirement for people, information, technology, and facilities. Thus, the process areas that drive availability management include RRD – Resilience Requirements Development (where availability requirements are established); RRM – Resilience Requirements Management (where the life cycle of availability requirements is managed); EC – Environmental Control (where the availability requirements for facilities are implemented and managed); and KIM – Knowledge and Information Management (where the availability requirements for information are implemented and managed) |
IRP – Incident Resolution and Prevention (CMMI-SVC only) | IMC – Incident Management and Control. In CERT-RMM, IMC expands IRP to address a broader incident management system and incident life cycle at the asset level |
MA – Measurement and Analysis | MA – Measurement and Analysis is carried over intact from CMMI. In CERT-RMM, MA is directly connected to MON |
OPD – Organizational Process Definition | OPD – Organizational Process Definition is carried over from CMMI |
OPF – Organizational Process Focus | OPF – Organizational Process Focus is carried over intact from CMMI |
OT – Organizational Training | OTA – Organizational Training and Awareness. OT is expanded to include awareness activities in OTA |
REQM – Requirements Management | RRM – Resilience Requirements Management. Basic elements of REQM are included in RRM, but the focus is on managing the resilience requirements for assets and services |
RD – Requirements Development | RRD – Resilience Requirements Development. Basic elements of RD are included in RRD |
RSKM – Risk Management | RISK – Risk Management. Basic elements of RSKM are reflected in RISK |
SAM – Supplier Agreement Management | EXD – External Dependencies Management. In CERT-RMM, SAM is expanded to address all external dependencies |
SCON – Service Continuity (CMMI-SVC only) | SC – Service Continuity. In CERT-RMM, SC is positioned as an operational risk management activity that addresses what is required to sustain assets and services balanced |
TS – Technical Solution | RTSE – Resilient Technical Solution Engineering |
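An organization that already tracks work against CMMI process areas may find it convenient to translate those tags into their CERT-RMM counterparts. The Python sketch below encodes only the primary equivalent listed for each row of Table 11.3; the names `CMMI_TO_RMM`, `CMMI_SVC_ONLY`, and `rmm_equivalent` are assumptions made for this example, and the explanatory notes in the table are not modeled.

```python
from typing import Optional

# Primary CERT-RMM equivalent for each CMMI process area in Table 11.3.
CMMI_TO_RMM = {
    "CAM": "TM",
    "IRP": "IMC",
    "MA": "MA",
    "OPD": "OPD",
    "OPF": "OPF",
    "OT": "OTA",
    "REQM": "RRM",
    "RD": "RRD",
    "RSKM": "RISK",
    "SAM": "EXD",
    "SCON": "SC",
    "TS": "RTSE",
}

# CMMI process areas that Table 11.3 marks as CMMI-SVC only.
CMMI_SVC_ONLY = {"CAM", "IRP", "SCON"}


def rmm_equivalent(cmmi_tag: str) -> Optional[str]:
    """Return the CERT-RMM tag aligned with a CMMI tag, or None if not listed."""
    return CMMI_TO_RMM.get(cmmi_tag)
```

A lookup of this kind captures only the alignment, not the differences in scope; for instance, EXD covers all external dependencies, whereas SAM covers supplier agreements.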
The alignment of generic practice to process area and its subsequent implementation in the organization is given in Table 11.4.
Engineering Process
Given that aspects of operational resilience management are requirements-driven, process areas in the Engineering category represent those that are focused on establishing and implementing resilience for organizational assets and business processes. These processes establish the basic building blocks for resilience and create the foundation to protect and sustain the assets.
Engineering process areas fall into three broad categories:
Requirements Management addresses the development and management of the security (protect) and resilience (sustain) objectives for assets and services.
Asset Management establishes the important people, information, technology, and facilities as assets present in the organization.
Establishing and Managing Resilience addresses the selection, implementation, and management of preventive controls. In addition, it addresses the development and implementation of SC and impact management plans and programs. It also recommends the consideration of resilience for software and systems early in the development life cycle.
Table 11.4 Generic Process Mapped to Related Process Area
Generic Practice | Related Process Area | How the Process Area Helps Implement the Generic Practice |
GG2.GP1 Establish process governance | Enterprise Focus | Enterprise Focus addresses the governance aspect of managing operational resilience. Mastery of the Enterprise Focus process area can help achieve GG2.GP1 in other process areas |
GG2.GP3 Provide resources | Human Resource Management; Financial Resource Management | Human Resource Management ensures that resources have the proper skill sets and their performance is consistent over time. Financial Resource Management addresses the provision of other resources to the process, such as financial capital |
GG2.GP4 Train people | Organizational Training and Awareness | Organizational Training and Awareness ensures that resources are properly trained |
GG2.GP8 Monitor and control the process | Monitoring; Measurement and Analysis | Monitoring provides the structure and process for identifying and collecting relevant information for controlling processes. Measurement and Analysis provide general guidance about measuring, analyzing, and recording information that can be used in establishing measures for monitoring actual performance of the process [CMMI 2007] |
GG2.GP10 Review status with higher level managers | Enterprise Focus | As part of the governance process, Enterprise Focus requires oversight of the resilience process including identifying corrective actions |
GG3.GP1 Establish a defined process | Organizational Process Definition | Organizational Process Definition establishes the organizational process assets necessary to implement the generic practice [CMMI 2007] |
GG3.GP2 Collect improvement information | Organizational Process Definition; Organizational Process Focus | Organizational Process Definition establishes the organizational process assets. Organizational Process Focus addresses the incorporation of experiences into the organizational process assets [CMMI 2007] |
The Engineering process areas include:
Requirements Management: Resilience Requirements Development (RRD) and Resilience Requirements Management (RRM)
Asset Management: Asset Definition and Management (ADM)
Establishing and Managing Resilience: Controls Management (CTRL), Resilient Technical Solution Engineering (RTSE), and Service Continuity (SC)
Operations
The Operations process areas represent the core activities for managing the operational resilience of assets and services during the operations phase of their life cycle. These process areas are focused on maintaining an acceptable level of operational resilience as determined by the organization. They represent core security, business continuity, and IT operations/service delivery management activities. Areas of focus include the resilience of people, information, technology, and facilities assets.
Operations process areas fall into three broad categories:
Supplier Management addresses the management of external dependencies and the potential impact on the organization’s operational resilience.
Threat, Vulnerability, and Incident Management addresses the organization’s continuous cycle of identifying and managing threats, vulnerabilities, and incidents to minimize organizational disruption.
Asset Resilience Management addresses the asset-level activities that the organization performs to manage operational resilience of people, information, technology, and facilities to ensure that business processes and services are sustained.
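The Operations process areas are:
Supplier Management: External Dependencies Management (EXD)
Threat, Vulnerability, and Incident Management: Access Management (AM), Identity Management (ID), Incident Management and Control (IMC), and Vulnerability Analysis and Resolution (VAR)
Asset Resilience Management: Environmental Control (EC), Knowledge and Information Management (KIM), People Management (PM), and Technology Management (TM)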
Model Relationships
To understand how the elements of the process model translate to relationships in the environment, a number of maps have been created. These maps are depicted below.
Figure 11.2 Enterprise-level relationships.
People
Figure 11.3 shows the CERT-RMM process areas that participate in managing the operational resilience of people. They establish people as an important asset in service delivery and ensure that people meet job requirements and standards, have appropriate skills, are appropriately trained, and have access to other assets as needed to do their jobs.
Information
Figure 11.4 shows the CERT-RMM process areas that drive the operational resilience management of information. Information is established as a key element in service delivery. Requirements for protecting and sustaining information are established and used by processes such as risk management, controls management, and SC planning.
Figure 11.3 Relationships that drive the resilience of people.
Technology
Figure 11.5 shows the CERT-RMM process areas that drive the operational resilience management of technology. These relationships address the complexities of software and systems resilience, as well as the resilience of architectures where the technology assets live, their development and acquisition processes, and other processes such as configuration management and capacity planning and management.
Facilities
Figure 11.6 shows the CERT-RMM process areas that drive the operational resilience management of facilities. As with information and technology assets, relationships that drive the resilience of facilities have special considerations, such as protecting facilities from disruption, ensuring that facilities are sustained, managing the environmental conditions of facilities, determining the dependencies of facilities on their geographical region, and planning for the decommissioning of a facility. Because facilities are often owned and managed by an external party, consideration must also be given to how external parties implement and manage the resilience of facilities at the direction of the organization.
Figure 11.4 Relationships that drive information resilience.
Understanding Capability Levels
Like the standards that were described earlier in this chapter, CERT-RMM is not a prescriptive model. Process improvement is unique to each organization and, as such, CERT-RMM provides the basic structure to allow organizations to chart their own specific improvement path using the model as the basis.
Improvement paths are defined in RMM by capability levels. Levels characterize improvement from a poorly defined state to a state where processes are characterized and used consistently across the organization.
Figure 11.5 Relationships that drive technology resilience.
To reach a particular level, an organization must satisfy all of the relevant goals of the process area (or a set of process areas), as well as the generic goals that apply to the specific capability level. The structure of the continuous representation for CERT-RMM is provided in Table 11.5.
Connecting Capabilities to Process Maturity
Capability levels describe the degree to which a process has been institutionalized. Likewise, the degree to which a process is institutionalized is defined by the generic goals and practices. Table 11.5 links capability levels to the progression of processes and generic goals.
The progression of capability levels and the degree of process adoption are characterized by the following descriptions.
Capability Level 0: Incomplete
An incomplete process is a process that either is not performed or is partially performed. This leads to one or more of the specific goals of the process area not being satisfied [CMMI 2007].
Figure 11.6 Relationships that drive facilities.
Capability Level 1: Performed
Capability Level 1 characterizes a performed process. A performed process is a process that satisfies all of the specific goals of the process area. It also supports work needed to perform RMM practices as defined by the specific goals. Although achieving Capability Level 1 results in important improvements, those improvements can be lost over time if they are not adopted [CMMI 2007].
Table 11.5 Capability Levels Related to Goals and Process Progression
Capability Level | Generic Goal | Process Progression |
0 | N/A | Incomplete: no process or partially performed process |
1 | GG1 | Performed process |
2 | GG2 | Managed process |
3 | GG3 | Defined process |
Capability Level 2: Managed
As stated in the RMM Manual, a Capability Level 2 process is characterized as a managed process. A managed process is a performed process that has the basic infrastructure in place to support the process. At this level, the process is planned and executed in accordance with policy.
Corrective actions are taken when the actual results and performance deviate significantly from the plan. A managed process achieves the objectives of the plan and is adopted for consistent performance [CMMI 2007].
Organizations operating at this capability level can begin to trust that they will achieve and sustain their resilience goals, even as conditions change or new threats emerge. Instead of shifting planning and practices for security and business continuity to address the next threat, the organization defines and refines its processes to address any risk that comes its way.
Capability Level 3: Defined
A Capability Level 3 process is characterized as a defined process. A defined process is a managed process that is tailored from the organization’s set of standard processes. The process also contributes work products, measures, and other process improvement information for use by all organizational units [CMMI 2007].
What does this ultimately mean to the organization? When business units operate with different goals, assumptions, and practices, it is difficult to ensure that the organization’s collective goals and objectives can be reached. This is particularly true with risk management. If the organization’s risk assumptions are not reflected consistently in security, continuity, and IT operations, the organization’s risk management process will not be effective and may actually deter operational resilience. At Capability Level 3, there is more consistency across units and improvements made by each organizational unit can be accessed and used by the organization as a whole. Another significant distinction at Capability Level 3 is that the processes are typically described more rigorously and managed more proactively than at Capability Level 2 [CMMI 2007].
To summarize, an organization that reaches higher capability levels in each process area arguably exhibits a higher degree of organizational maturity with regard to security, continuity, and IT resilience.
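The relationship between generic goals and capability levels in Table 11.5 can be expressed as a short rule: a process area sits at the highest level for which that level's generic goal, and every generic goal below it, has been satisfied. The Python sketch below illustrates this under two assumptions: that GG1 stands for satisfying all specific goals of the process area (consistent with the description of a performed process), and that levels are cumulative (consistent with a managed process being a performed process and a defined process being a managed process). The function and constant names are hypothetical.

```python
# Generic goal required at each capability level, per Table 11.5.
GENERIC_GOAL_FOR_LEVEL = {1: "GG1", 2: "GG2", 3: "GG3"}


def capability_level(satisfied_generic_goals):
    """Return the capability level (0-3) for one process area.

    Levels are treated as cumulative: the first missing generic goal
    stops the progression, even if higher goals are marked satisfied.
    """
    level = 0
    for candidate in sorted(GENERIC_GOAL_FOR_LEVEL):
        if GENERIC_GOAL_FOR_LEVEL[candidate] not in satisfied_generic_goals:
            break
        level = candidate
    return level


# A process area whose specific goals are met and that is managed,
# but that is not yet tailored from organizational standard processes.
example_level = capability_level({"GG1", "GG2"})
```

Under this rule, a process area that claims GG2 and GG3 but has not satisfied its specific goals remains at Capability Level 0, which mirrors the definition of an incomplete process.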
Informal Diagnosis
Examples of informal diagnostic methods for CERT-RMM include: meetings or exercises in which the people who are responsible for the practices in a given process area meet, review the guidance, and discuss the extent to which the organization’s practices achieve intent; reviews or analyses performed by a single person or a small group to compare the organization’s practices to the guidance; informal collection and review of evidence that demonstrates appropriate performance. These activities can be useful to guide informal process improvement activities or to provide information for scoping or setting capability level targets for a more formal process improvement project.
Analyzing Gaps
Diagnostic activities typically reveal gaps between the current and the required performance. Before plans are established to close any gaps, it is important to consider the identified gaps in the context of the overall improvement objectives. If it is determined that one or more of the identified gaps are acceptable to the organization, it is recommended that the organization revisit and update the objectives for improvement. This iterative approach is valuable to ensure that the organization spends improvement resources in the most productive manner.
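The gap analysis described above amounts to comparing current capability levels against target levels for each process area, then either planning to close a gap or accepting it and revising the objective. The following Python sketch illustrates the idea; the function names and the sample levels are hypothetical and are not taken from the RMM or from any real assessment.

```python
def capability_gaps(current, target):
    """Return {tag: shortfall} for process areas below their target level."""
    gaps = {}
    for tag, wanted in target.items():
        have = current.get(tag, 0)  # an unassessed area is treated as Level 0
        if have < wanted:
            gaps[tag] = wanted - have
    return gaps


def accept_gap(current, target, tag):
    """Return updated targets in which the gap for `tag` is accepted.

    Accepting a gap means lowering the improvement objective to the
    level the organization has already reached.
    """
    revised = dict(target)
    revised[tag] = current.get(tag, 0)
    return revised


# Hypothetical diagnosis results and improvement objectives.
current_levels = {"SC": 1, "IMC": 2, "RISK": 3}
target_levels = {"SC": 3, "IMC": 2, "RISK": 2}

gaps_before = capability_gaps(current_levels, target_levels)
revised_targets = accept_gap(current_levels, target_levels, "SC")
gaps_after = capability_gaps(current_levels, revised_targets)
```

Recomputing the gaps after revising the targets reflects the iterative approach: once a gap is accepted and the objective is updated, it no longer competes for improvement resources.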
Reference
Caralli, R. A., Allen, J. H., Curtis, P. D., White, D. W., and Young, L. R. CERT® Resilience Management Model, Version 1.0: Improving Operational Resilience Processes. Technical Report CMU/SEI-2010-TR-012, ESC-TR-2010-012, May 2010.