
Chapter 2. Cluster Management

Vertica provides an elastic cluster, which can be scaled both up (adding new nodes) and down (removing nodes) without affecting the running database. The most important task after altering a cluster is rebalancing the data across the new as well as the old nodes. This is done to ensure that the data remains K-Safe. Please refer to Chapter 5, Performance Improvement, for more information on K-safety.

Projections are divided into segments, which are small portions of data. After a new node is added, some segments are moved to it, while others are exchanged among the nodes to maintain proper K-safety. When a node is removed from a cluster, all of the storage containers residing on that node are moved to the other nodes in the cluster. This method of partitioning data into movable segments is what makes a Vertica cluster elastic.

Comprehending the elastic cluster scaling factor

Each node in the cluster stores local segments of data. The number of local segments on a node is known as the scaling factor. As discussed earlier, when nodes are added or removed, local segments are redistributed among the nodes of the cluster in order to maintain an even data distribution.

The MAXIMUM_SKEW_PERCENT parameter plays a crucial role when the number of segments cannot be evenly divided by the number of nodes in the new cluster. For example, if the scaling factor is 4 and there are initially 4 nodes, there will be 16 (4 x 4) segments in the whole cluster. Suppose one additional node is added to the cluster; it is then not possible to distribute 16 segments evenly among 5 nodes, so Vertica will assign more segments to some nodes than to others. One possible combination is that 4 nodes get 3 segments each and 1 node gets 4 segments; the most heavily loaded node then holds one segment more than the others, a skew of around 33.33 percent. Vertica will make sure that the skew remains below the configured MAXIMUM_SKEW_PERCENT value. If Vertica is not able to redistribute segments because of the MAXIMUM_SKEW_PERCENT limit, the data-rebalancing process will not fail. Instead, the segmentation space will be evenly divided among the 5 nodes, and new segments will be created on each node, making the total 20, that is, 4 segments on each node.

Enabling and disabling an elastic cluster

We can query the ELASTIC_CLUSTER system table to determine whether an elastic cluster is enabled on the database, that is, whether segmentation is on or off:

=> select is_enabled from ELASTIC_CLUSTER;

is_enabled

------------

t

(1 row)

To enable an elastic cluster, we can run the following command:

=> SELECT ENABLE_ELASTIC_CLUSTER();

ENABLE_ELASTIC_CLUSTER

-------------------------

ENABLED

(1 row)

To disable an elastic cluster, we can run the following command:

=> SELECT DISABLE_ELASTIC_CLUSTER();

DISABLE_ELASTIC_CLUSTER

-------------------------

DISABLED

(1 row)

Viewing and setting the scaling factor settings

To view the scaling factor (4 is the default), we can query the ELASTIC_CLUSTER table as follows:

=> SELECT scaling_factor FROM ELASTIC_CLUSTER;

scaling_factor

---------------

4

(1 row)

We can use the SET_SCALING_FACTOR function to change a database's scaling factor. The scaling factor can be any integer between 1 and 32. Keep in mind that a very high scaling factor may lead to the nodes creating too many small container files, eventually causing too many ROS container errors, also known as ROS pushback (refer to Chapter 5, Performance Improvement, to learn more about ROS). The following command is an example of using SET_SCALING_FACTOR:

=> SELECT SET_SCALING_FACTOR(5);

SET_SCALING_FACTOR

--------------------

SET

(1 row)

=> SELECT scaling_factor FROM ELASTIC_CLUSTER;

scaling_factor

---------------

5

(1 row)

Enabling and disabling local segmentation

Just setting the scaling factor and enabling the elastic cluster will make Vertica create local segments only during the rebalancing stage. For production deployments, it is advisable to keep local segments ready beforehand. For this, we can enable local segmentation, which tells Vertica to always segment its existing as well as new data.

To enable local segmentation, we can use the ENABLE_LOCAL_SEGMENTS function as follows:

=> SELECT ENABLE_LOCAL_SEGMENTS();

ENABLE_LOCAL_SEGMENTS

-----------------------

ENABLED

(1 row)

To check the status of local segmentation, we can query the ELASTIC_CLUSTER system table in the following fashion:

=> SELECT is_local_segment_enabled FROM elastic_cluster;

is_local_segment_enabled

--------------------------

t

(1 row)

To disable local segmentation, we can use the DISABLE_LOCAL_SEGMENTS function as follows:

=> SELECT DISABLE_LOCAL_SEGMENTS();

DISABLE_LOCAL_SEGMENTS

------------------------

DISABLED

(1 row)

To check the status of local segmentation, we can query the elastic_cluster system table in the following fashion:

=> SELECT is_local_segment_enabled FROM elastic_cluster;

is_local_segment_enabled

--------------------------

f

(1 row)

Understanding the best practices in cluster management

The following are some of the best practices for local segmentation:

· It only makes sense to keep local segments when we are actually planning to scale the cluster up or down

· It is highly recommended that backups of the database(s) are created before the start of the scaling process (a backup sketch follows this list)

· It is not advisable to create segments if our database tables contain a high number of partitions (more than 75)
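For reference, a full database backup is taken with Vertica's vbr.py utility. The following is a minimal sketch, assuming a backup configuration file has already been prepared and run as the database administrator from one of the nodes; the file path shown is illustrative:

$ /opt/vertica/bin/vbr.py --task backup --config-file /home/dbadmin/full_backup.ini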

Monitoring elastic cluster rebalancing

From Vertica 6.0 onwards, system tables can be used to monitor the rebalance status of an elastic cluster. There are two tables that can be employed. They are as follows:

REBALANCE_TABLE_STATUS

REBALANCE_PROJECTION_STATUS

In both tables, the separated_percent and transferred_percent columns can be used to determine the overall progress.
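For example, the following queries sketch how the progress can be watched; the two percentage columns are the ones mentioned above, while the name columns are assumed from the usual layout of these system tables:

=> SELECT table_name, separated_percent, transferred_percent FROM rebalance_table_status;

=> SELECT projection_name, separated_percent, transferred_percent FROM rebalance_projection_status;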

Adding nodes in Vertica

Before adding nodes, it is very important to create a full backup of the database, as adding nodes in Vertica is a sensitive process. To add a new node, we will be using the update_vertica script. However, before adding a node to an existing cluster, the following restrictions and prerequisites have to be kept in mind:

· Make sure that the database is running.

· Newly added nodes should be reachable by all the existing nodes in the cluster.

· If we have a single-node cluster that was deployed without specifying an IP address or hostname, or with the hostname specified as localhost, it is not possible to expand the cluster. We must reinstall Vertica and specify an IP address or hostname.

· Generally, it is not necessary to shut down the Vertica database for expansion, but a shutdown is necessary if we are expanding from a single-node cluster.

Method

From any node of the cluster, we can run the update_vertica script as follows:

# /opt/vertica/sbin/update_vertica -A <hostname1, hostname2…> -r <rpm_package>

Here, -A is used for providing IP(s) or hostname(s), and -r is used for providing the location of the rpm/debian package.
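For example, to add a single new host, the call might look like the following; the IP address here is illustrative, and the package name is the one used elsewhere in this book:

# /opt/vertica/sbin/update_vertica -A 192.168.56.104 -r vertica-ce-6.0.0-1.x86_64.RHEL5.rpm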

The update_vertica script will install Vertica, verify the installation, and add the node to the cluster. Once we have added one or more hosts to the cluster, we need to add them to a database as data-storing nodes through either of the following:

· The Management Console interface (Enterprise Edition only)

· The administration tools interface

Using the Management Console to add nodes

We can add or remove nodes from a database by going to the Manage page. Here, just click on the target node and then click on the Add node or Remove node button in the node list. When we add a node, the color of the node icon changes from gray (empty) to green.

Adding nodes using administration tools

The following is the process of adding nodes in a Vertica cluster using administration tools:

1. Navigate to Main Menu | Advanced Menu | Cluster Management | Add Host(s).

2. We should now select the database to which we want to add one or more hosts. A list of unused hosts is displayed.

3. Select the host. You will then be prompted for the password. Provide the password for the database.

4. The user will be prompted about the success or failure of the addition of the node.

5. If it is successful, Vertica starts rebalancing the cluster. During database rebalancing, Vertica will ask the user to provide a path to a temporary directory that the database designer will use.

6. Before starting the rebalancing of the cluster, Vertica will prompt the user to either provide a new, higher K-safety value or continue with the existing value.

7. As a final step, we should select whether Vertica should immediately start rebalancing or whether Vertica should do it at a later time. If we choose to do the rebalancing later, then a script is created and is kept for later execution. It is always advised that we should select the option to automatically start rebalancing. If we choose to automatically rebalance the database, the script will still be created and will be saved for later use and review.

Removing nodes in Vertica

Removing nodes or scaling down the Vertica cluster is a fairly simple process. The procedure of removing nodes comprises the following broad steps:

1. Back up the database.

2. Remember that it is mandatory to lower the K-safety if the cluster is not able to sustain the current level of K-safety after the cluster is scaled down.

3. Remove the host from the database.

Lowering the K-safety level

A database with a K-safety level of 1 requires at least three nodes to operate, and a database with a K-safety level of 2 requires at least five nodes. Vertica doesn't support a K-safety level of 3 or above. To lower the K-safety level, we will use the MARK_DESIGN_KSAFE function in the Vsql console, as shown in the following example:

km=> SELECT MARK_DESIGN_KSAFE(1);

MARK_DESIGN_KSAFE

----------------------

Marked design 1-safe

(1 row)

Removing nodes using administration tools

The following are the steps to remove nodes using administration tools:

1. Before starting, we must make sure that the number of nodes that will remain after the removal will still comply with K-safety.

2. As a precautionary step, create a backup of the database.

3. Make sure that the database is running.

4. Navigate to Main Menu | Advanced Menu | Cluster Management | Remove Host(s).

5. Select the database from which we wish to remove the node and click on OK.

6. Select the node that we wish to remove.

7. We will be asked if we are sure about removing the node. If we are, then click on OK; otherwise, click on Cancel.

8. We will be warned that we must redesign our database and create projections that exclude the hosts we are going to drop; click on Yes. The following screenshot shows the warning prompt:


Warning issued during the removal of a node

9. Vertica begins the process of rebalancing the database and removing the node(s). When you are informed that the hosts were successfully removed, click on OK.

Removing nodes using the Management Console

We can remove nodes from a database through the Manage page. To remove a node, select the node we want to act upon and then click on the Remove node button in the node list. We can only remove nodes that are part of the database, are down (represented in red), and are not critical for K-safety. When we remove a node, its color changes from red to clear, and the Management Console updates its state to standby.

Removing hosts from a cluster

Removing a node from a database doesn't remove it from the cluster. We can completely remove it from the cluster using the update_vertica script, but we must ensure that the host is not used by any database. We do not need to shut down the database for this process.

From one of the hosts in the cluster, we need to run update_vertica with the -R switch, where -R specifies a comma-separated list of hosts to be removed from an existing Vertica cluster. Do not confuse -R with -r, as both have different functionalities. A host can be specified by the hostname or the IP address of the system, as shown in the following example:

# /opt/vertica/sbin/update_vertica -R 192.168.56.103,host04

Replacing nodes

If we have a K-Safe database, we can replace nodes, as a copy of their data is maintained on other nodes in the cluster. We do not need to shut down the database to replace a node.

Replacing a node using the same name and IP address

Sometimes, you will be required to upgrade one or more nodes in the cluster. If the new node has the same hostname and IP address as the original node, use the following method for replacement:

1. From a working node in the cluster, run the install_vertica script with the -s (used for providing the hostname or IP of the replacement node) and -r (the path of the rpm/deb package) parameters:

# /opt/vertica/sbin/install_vertica -s host -r rpm_package

2. The installation script will verify the system configuration and the installation of various important components of Vertica, such as Vertica itself, Spread, and the administration tool's metadata.

3. Now, create catalog and data directories on the new node. Make sure that the paths of these directories are the same as those on the original node.

4. Once ready, restart the newly added host, and you will find that it has been added to the cluster.

The new node automatically joins the database cluster and recovers data by querying the other nodes within the database cluster. Data recovery may take some time as huge chunks of data will be moved across the nodes.
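For reference, a filled-in version of the command from step 1 might look like the following; the IP address and package name here are illustrative:

# /opt/vertica/sbin/install_vertica -s 192.168.56.102 -r vertica-ce-6.0.0-1.x86_64.RHEL5.rpm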

Replacing a failed node using a different name and IP address

There may be a time when you will be required to replace a failed node with a node that has a different IP address and hostname. In such cases, proceed with the following steps:

1. As a preventive step, create a backup of the database.

2. After the backup is created, run update_vertica with the -A, -R, -E, and -r parameters, as shown in the following code, to replace the failed host:

# /opt/vertica/sbin/update_vertica -A NewHostName -R OldHostName -E -r rpm_package

Where:

· NewHostName is the hostname or IP address of the new node

· OldHostName is the hostname or IP address of the node that is being replaced from the cluster

· The -E parameter makes sure that the failed node is dropped from the cluster

· rpm_package is the name of the rpm package; for example, -r vertica_6.0.x.x86_64.RHEL5.rpm

3. Using administration tools, we can replace the original host with the new host. If we are using more than one database, make sure that the old node is replaced by a new node for all databases.

4. Now, distribute configuration files to the new host. The process is discussed in the next section.

5. Then, we have to run update_vertica again. However, this time, we will run it with the -R parameter, as shown in the following code, to clear all information about the failed node from the administration tool's metadata:

# /opt/vertica/sbin/update_vertica -R OldHostName

OldHostName is the hostname or IP address of the system that we removed from the cluster. Do not confuse -R with -r, as both have different functions.

6. Now, start the new node and verify that everything is running properly.

Once you have completed the process, the new node automatically recovers the data that was stored in the original node by querying other nodes in the database cluster.
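To confirm that the replacement node has finished recovering, one simple check is to query the NODES system table and wait for the node to report an UP state; the following is a minimal sketch using the example database from this book:

km=> SELECT node_name, node_state FROM nodes;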

Redistributing configuration files to nodes

The processes of adding and removing nodes automatically redistribute the Vertica configuration files. To distribute configuration files to a host manually, log on to a host that already contains these files and use administration tools, as follows:

1. Navigate to Main Menu | Configuration Menu | Distribute Config Files.

2. Click on Database Configuration as shown in the following screenshot:


Selecting the redistribution configuration category

3. Then, select the database in which we wish to distribute the files and click on OK.

The vertica.conf file will be distributed to all the other hosts in the database. If it existed earlier on a host, it is overwritten.

4. We need to follow the same process for Secure Sockets Layer (SSL) keys as well as administration tool's metadata. For SSL keys, Vertica may prompt for the location of the server certificate file, server key file, and root certificate file. Provide an appropriate path and click on OK.

Using administration tools to replace nodes with different names and IP addresses

Using administration tools, you can easily replace a node with a node of a different hostname and IP address. Alternatively, you can also use the Management Console to replace a node.

To replace the original host with a new host using administration tools, proceed with the following steps:

1. As a preventive step, create a backup of the database.

2. You first need to make sure that the database is running.

3. Add the replacement hosts to the cluster using the standard procedure of adding a node.

4. Now, shut down the node that is intended to be replaced.

5. Navigate to Main Menu | Advanced Menu.

6. In the Advanced Menu option, we need to select Stop Vertica on Host.

7. Select the host we wish to replace and then click on OK to stop the node.

8. After stopping the node, navigate to Advanced Menu | Cluster Management | Replace Host.

9. Select the database that contains the host we wish to replace and then click on OK.

10. A list of all the hosts with their internal names and IP addresses will be displayed. We will select the host we wish to replace and then click on OK, as shown in the following screenshot:


Selecting nodes to replace

11. Select the host we want to use as the replacement and then click on OK.

12. When prompted that the host was successfully replaced, click on OK.

13. Restart all the nodes (new nodes may take some time to start as they will be in the recovering state, thus moving data).

14. Verify if the database is running properly.

Changing the IP addresses of a Vertica cluster

Sometimes, during network-related maintenance, the IP addresses of one or more servers running Vertica change. In such cases, the IP changes need to be made in Vertica as well. However, before making changes in Vertica, we must make sure that changes are made in /etc/hosts on all nodes, where the hostname is mapped to the IP address (an illustrative example follows). After making the system-level changes, proceed with the following steps to change the IP address of one or more nodes in a cluster:
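For instance, after the change, the relevant entries in /etc/hosts on every node might look like the following; the new IP addresses and hostnames here are purely illustrative:

192.168.57.101   host01
192.168.57.102   host02
192.168.57.103   host03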

1. Back up the following three files:

· /opt/vertica/config/admintools.conf

· /opt/vertica/config/vspread.conf

· /etc/sysconfig/spreadd

It is assumed that Vertica is installed on /opt/vertica. If Vertica is installed on some other location, then we should take backups from that location.

2. We should stop Vertica on all nodes.

3. As a root user, we need to stop Spread manually on each node. Spread is the messaging system for distributed systems such as Vertica. We can stop Spread using the following command:

# /etc/init.d/spreadd stop

4. On each node, edit /opt/vertica/config/admintools.conf and change the IPs as required. The following is the text from admintools.conf for one of the nodes (replace the old IP with the new one wherever required):

[Configuration]
install_opts = ['-s', '192.168.56.101,192.168.56.102,192.168.56.103', '-r', 'vertica-ce-6.0.0-1.x86_64.RHEL5.rpm', '-u', 'dba']
default_base = /home/dbadmin
show_hostnames = False
format = 3

[Cluster]
hosts = 192.168.56.101,192.168.56.102,192.168.56.103
spread_hosts =

[Nodes]
node0001 = 192.168.56.101,/home/dba,/home/dba
node0002 = 192.168.56.102,/home/dba,/home/dba
node0003 = 192.168.56.103,/home/dba,/home/dba
v_km_node0001 = 192.168.56.101,/ilabs/data/vertica/catalog,/ilabs/data/vertica/data
v_km_node0002 = 192.168.56.102,/ilabs/data/vertica/catalog,/ilabs/data/vertica/data
v_km_node0003 = 192.168.56.103,/ilabs/data/vertica/catalog,/ilabs/data/vertica/data

[Database:km]
host = 192.168.56.101
restartpolicy = ksafe
port = 5433
path = /ilabs/data/vertica/catalog/km/v_km_node0001_catalog
nodes = v_km_node0001,v_km_node0002,v_km_node0003

5. On each node, edit /opt/vertica/config/vspread.conf. We need to replace the old IP with the new one wherever required. We also need to change the node identifiers of the form N<IP>, where the IP address is written without periods (.). The following is the text from vspread.conf for one of the nodes:

Spread_Segment 192.168.56.255:4803 {
    N192168056101 192.168.56.101 {
        192.168.56.101
        127.0.0.1
    }
    N192168056102 192.168.56.102 {
        192.168.56.102
        127.0.0.1
    }
    N192168056103 192.168.56.103 {
        192.168.56.103
        127.0.0.1
    }
}
EventLogFile = /dev/null
EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"
DaemonUser = spread
DaemonGroup = verticadba
DebugFlags = { EXIT }

6. Just like the changes we made in vspread.conf, we also need to make changes in /etc/sysconfig/spreadd.

7. After the IP-related changes are made in all three configuration files, we should start Spread on each node as a root user:

# /etc/init.d/spreadd start

8. Then, we should start a single Vertica node, connect to our database, and run Vsql.

9. In Vsql, issue the following query to verify that the new IP has been updated:

select host_name from host_resources;

10. As a final step, we need to modify the database to use the new IPs.

11. In Vsql, we have to run the following query to display the current node names that are configured:

km=> \x
Expanded display is on.
km=> select node_name, node_address from nodes;
-[ RECORD 1 ]+---------------
node_name    | v_km_node0001
node_address | 192.168.56.101
-[ RECORD 2 ]+---------------
node_name    | v_km_node0002
node_address | 192.168.56.102
-[ RECORD 3 ]+---------------
node_name    | v_km_node0003
node_address | 192.168.56.103

12. For each result, we need to update the IP address by issuing the following command:

alter node NODE_NAME is hostname <new IP address>;

In the preceding command, NODE_NAME is the internal name of the node, and <new IP address> is the new IP of this node. Before altering this data, the node whose IP has changed needs to be shut down.
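For example, substituting the values from the listing above, and assuming the first node's new IP address is 192.168.57.101 (an illustrative address; whether it needs to be quoted as a string literal may depend on the Vertica version), the command would look like this:

km=> alter node v_km_node0001 is hostname '192.168.57.101';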

13. Bring up all the nodes in the cluster and test whether everything is running properly.

Summary

In this chapter, we saw how easily a Vertica cluster can be scaled up or down. It is worth noting that I have repeatedly emphasized backing up the database, because it is always better to be safe than sorry. In the next chapter, you will learn how to effectively monitor a Vertica cluster.