MySQL High Availability (2014)
Part I. High Availability and Scalability
Chapter 2. MySQL Replicant Library
Joel opened his handy text file full of common commands and tasks and copied them into another editor, changing the values for his current need. It was a series of commands involving a number of tools and utilities. “Ah, this is for the birds!” he thought. “There has got to be a better way.”
Frustrated, he flipped open his handy MySQL High Availability tome and examined the table of contents. “Aha! A chapter on a library of replication procedures. Now, this is what I need!”
Automating administrative procedures is critical to handling large deployments, so you might be asking, “Wouldn’t it be neat if we could automate the procedures in this book?” In many cases, you’ll be happy to hear that you can. This chapter introduces the MySQL Replicant library, a simple library written by the authors for managing replication. We describe the basic principles and classes, and will extend the library with new functionality in the coming chapters.
The code is available at Launchpad, where you can find more information and download the source code and documentation.
The Replicant library is based around the idea of creating a model of the connections between servers on a computer (any computer, such as your laptop), like the model in Figure 2-1. The library is designed so you can manage the connections by changing the model. For example, to reconnect a slave to another master, just reconnect the slave in the model, and the library will send the appropriate commands for doing the job.
Figure 2-1. A replication topology reflected in a model
Besides the simple replication topology shown in Figure 2-1, two other basic topologies include tree topologies and dual masters (used for providing high availability). Topologies will be covered in more depth in Chapter 6.
To make the library useful on a wide variety of platforms and for a wide variety of deployments, it has been constructed with the following in mind:
§ The servers are likely to run on a variety of operating systems, such as Windows, Linux, and flavors of Unix such as Solaris or Mac OS X. Procedures for starting and stopping servers, as well as the names of configuration files, differ depending on the operating system. The library should therefore support different operating systems and it should be possible to extend it with new operating systems that are not in the library.
§ The deployment is likely to consist of servers running different versions of MySQL. For example, while you are upgrading a deployment to use new versions of the server, it will consist of a mixture of old and new versions. The library should be able to handle such a deployment.
§ A deployment consists of servers with many different roles, so it should be possible to specify different roles for the servers. In addition, it should be possible to create new roles that weren’t anticipated at the beginning. Also, servers should be able to change roles.
§ It is necessary to be able to execute SQL queries on each server. This functionality is needed for configuration as well as for extracting information necessary to manage the deployment. This support is also used by other parts of the system to implement their jobs—for example, to implement a slave promotion.
§ It is necessary to be able to execute shell commands on each machine. This is needed to perform some administrative tasks that cannot be done using the SQL interface. This support is also used by, for example, the operating system part of the library to manage the servers.
§ It should be possible to add and remove options from the server’s configuration file.
§ The library should support a deployment with multiple servers on a machine. This requires the ability to recognize different configuration files and database files used by different MySQL servers on a single machine.
§ There should be a set of utilities for performing common tasks such as setting up replication, but it should also be possible to extend the library with new utility functions that were not anticipated at the beginning.
The interface hides these complexities as much as possible and presents a simple interface in Python. Python was chosen by the authors because it is concise, easy to read, available on all operating systems that run MySQL, and increasingly popular for general-purpose scripting. You can see an example of how to define a topology in Example 2-1.
Example 2-1. Using the library to construct a topology
frommysql.replicant.serverimport Server, User
frommysql.replicant.rolesimport Master, Final
# The master that we use
MASTER = Server('master', server_id=1,
# Slaves that we keep available
SLAVES = [
# Create the roles for these servers
master_role = Master(User("repl_user", "xyzzy"))
slave_role = Final(MASTER)
# Imbue the servers with the roles
for slave inSLAVES:
# Convenience variable of all servers
SERVERS = [MASTER] + SLAVES
The first step is to create a server object containing all the information about how to access the server. This server will be used as master, but this statement does nothing specific to configure it as a master. That is done later when imbuing the server with a role.
When configuring the server, you need to include information on how to connect to the server to send SQL commands to it. For this example, we have a dedicated replicant user that is used to access the server. In this case, there is no password, but one could be set up when constructing the User instance.
You will also need access to the machine where the server is running, to do such things as shut it down or access the configuration file. This line grants access to the user who will connect to the machine.
Because servers are started and stopped in different ways on different kinds of operating systems, you must indicate the operating system the server is running on. In this case, Linux is used for all servers.
This is where information about the host the server is running on goes. host and port are used when connecting to a remote machine, and socket is used when connecting on the same machine. If you will connect only remotely, you can omit socket.
This constructs a list of servers that will be slaves. As for the master, it gives basic connection information but has nothing specific about slaves.
To configure servers, the servers are imbued with roles. This statement constructs a Master role containing the replication user that will be used by slaves to connect to the server.
This specifies the Final slave role, which does not run a binary log on the server, so cannot be promoted to a master later.
These statements imbue all the servers with their roles. The effect is to update the configuration of each server so that it can be used in that role. If necessary (e.g., if the configuration file has to be changed), the statements will also restart the servers.
The previous example imbued all the server with their roles, and after you have started all the servers, Example 2-2 shows how you can use the library to redirect all slaves to use a new master.
Example 2-2. Using the library to redirect slaves
for slave inmy_deployment.slaves:
We have deliberately kept this example simple, and therefore have omitted some important steps. As the code stands, it stops replication in its tracks and is likely to lose transactions if executed on an active server. You will see how to change masters properly in Chapter 5.
The following sections show the code that makes such applications possible. To avoid cluttering the code more than necessary, we have removed some error checking and other defensive measures needed to have a stable and safe library. You will find the complete code for the library atLaunchpad.
Basic Classes and Functions
The first things you need in order to use the library are some basic definitions for frequently used concepts. We need exceptions to be able to report errors, and we need some simple objects for representing positions and user information.
The complete list of exceptions can be found in the library. All the exceptions inherit from a common base class Error defined in the library, as is customary. The exceptions that you will see in later chapters include the following:
This exception is thrown when an attempt is made to select a field from a query that did not return any rows.
This exception is raised when ConfigManager does not find the option.
This exception is raised when the slave is not running but was expected to run.
This exception is raised when the server is not a master and the operation is therefore illegal.
This exception is raised when the server is not a slave and the operation is therefore illegal.
There is also a set of classes for representing some common concepts that will be used later in the book:
Position and GTID
These classes represent a binlog position consisting of a filename and a byte offset within the file, or a global transaction identifier (introduced in MySQL 5.6). A representation method prints out a parsable representation of the binlog positions so that they can be put in secondary storage or if you just want to look at them.
To compare and order the positions, the class defines a comparison operator that allows the positions to be ordered.
Note that when global transaction identifiers are not used, positions can be different on different servers, so it is not useful to compare positions from different servers. For that reason, an exception will be thrown if an attempt is made to compare different kinds of positions.
This class represents a user with a name and a password. It is used for many types of accounts: a MySQL user account, a shell user account, and the replication user (which we will introduce later).
Supporting Different Operating Systems
To work with different operating systems, you can use a set of classes that abstract away the differences. The idea is to give each class methods for each of the required tasks that are implemented differently by different operating systems. At this time, all we need are methods to stop and start the server:
This class is the base class for a machine and holds all the information that is common to this kind of machine. It is expected that a machine instance has at least the following members:
The default location of the my.cnf file on this machine
Machine.start_server( server )
Method to start the server
Machine.stop_server( server )
Method to stop the server
This class handles a server running on a Linux machine. It uses the init(8) scripts stored under /etc/init.d to start and stop the server.
This class handles servers running on a Solaris machine and uses the svadm(1M) command to start and stop the server.
The Server class defines all the primitive functions that implement the higher-level functions we want to expose in the interface:
The Server class represents a server in the system; there is one object for each running server in the entire system. Here are the most important parameters (for a full list, consult the project page on Launchpad):
This is the name of the server, and is used to create values for the pid-file, log-bin, and log-bin-index options. If no name parameter is provided, it will be deduced from the pid-file option, the log-bin option, the log-bin-index option, or as a last resort, using the default.
host, port, and socket
The host where the server resides, the port for connecting to the server as a MySQL client, and the socket through which to connect if on the same host.
ssh_user and sql_user
A combination of user and password that can be used for connecting to the machine or the server. These users are used to execute administrative commands, such as starting and stopping the server and reading and writing the configuration file, or for executing SQL commands on the server.
An object that holds operating system–specific primitives. We chose the name “machine” instead of “os” to avoid a name conflict with the Python standard library os module. This parameter lets you use different techniques for starting and stopping the server as well as other tasks and operating system–specific parameters. The parameters will be covered later.
An optional parameter to hold the server’s identifier, as defined in each server’s configuration file. If this option is omitted, the server identifier will be read from the configuration file of the server. If there is no server identifier in the configuration file either, the server is a vagabond and does not participate in replication as master or slave.
An optional parameter to hold a reference to a configuration manager that can be queried for information about the configuration for the server.
Server.connect() and Server.disconnect()
Use the connect and disconnect methods to establish a connection to the server before executing commands in a session and disconnect from the server after finishing the session, respectively.
These methods are useful because in some situations it is critical to keep the connection to the server open even after an SQL command has been executed. Otherwise, for example, when doing a FLUSH TABLES WITH READ LOCK, the lock will automatically be released when the connection is dropped.
Server.ssh(command, args...) and Server.sql(command, args...)
Use these to execute a shell command or an SQL command on the server.
The ssh and sql methods both return an iterable. ssh returns a list of the lines of output from the executed command, whereas sql returns a list of objects of an internal class named Row. The Row class defines the __iter__ and next methods so that you iterate over the returned lines or rows, for example:
for row inserver.sql("SHOW DATABASES"):
To handle statements that return a single row, the class also defines a __getitem__ method, which will fetch a field from the single row or raise an exception if there is no row. This means that when you know your return value has only one row (which is guaranteed for many SQL statements), you can avoid the loop shown in the previous example and write something like:
print server.sql("SHOW MASTER STATUS")["Position"]
Server.fetch_config() and Server.replace_config()
The methods fetch_config and replace_config fetch the configuration file into memory from the remote server to allow the user to add or remove options as well as change the values of some options. For example, to add a value to the log-bin and log-bin-index options, you can use the module as follows:
config = master.fetch_config()
Server.start() and Server.stop()
The methods start and stop forward information to the machine object to do their jobs, which depend on the operating system the server is using. The methods will either start the server or shut down the server, respectively.
Servers work slightly differently depending on their roles. For example, masters require a replication user for slaves to use when connecting, but slaves don’t require that user account unless they act as a master and have other slaves connecting. To capture the configuration of the servers in a flexible manner, classes are introduced for representing different roles.
When you use the imbue method on a server, the appropriate commands are sent to the server to configure it correctly for that role. Note that a server might change roles in the lifetime of a deployment, so the roles given here just serve to configure the initial deployment. However, a server always has a designated role in the deployment and therefore also has an associated role.
When a server changes roles, it might be necessary to remove some of the configuration information from the server, so therefore an unimbue method is also defined for a role and used when switching roles for a server.
In this example, only three roles are defined. Later in the book, you will see more roles defined. For example, you will later see how to create nonfinal slaves that can be used as secondaries or relay servers. The following three roles can be found in the MySQL Replicant library:
This is the base class of all the roles. Each derived class needs to define the methods imbue and (optionally) unimbue to accept a single server to imbue with the role. To aid derived classes with some common tasks, the Role class defines a number of helper functions, including the following:
This method imbues the server with the new role by executing the appropriate code.
This method allows a role to perform cleanup actions before another role is imbued.
If there is no server identifier in the configuration, this method sets it to server.server_id. If the configuration has a server identifier, it will be used to set the value of server.server_id.
This method creates a replication user on the server and grants it the necessary rights to act as a replication slave.
This method enables the binary log on the server by setting the log-bin and log-bin-index options to appropriate values. If the server already has a value for log-bin, this method does nothing.
This method disables the binary log by clearing the log-bin and log-bin-index options in the configuration file.
This is the default role assigned to any server that does not participate in the replication deployment. As such, the server is a “vagabond” and does not have any responsibilities whatsoever.
This role is for a server that acts as a master. The role will set the server identifier, enable the binary log, and create a replication user for the slaves. The name and password of the replication user will be stored in the server so that when slaves are connected, the class can look up the replication username.
This is the role for a (final) slave (i.e., a slave that does not have a binary log of its own). When a server is imbued with this role, it will be given a server identifier, the binary log will be disabled, and a CHANGE MASTER command will be issued to connect the slave to a master.
Note that we stop the server before we write the configuration file back to it, and restart the server after we have written the configuration file. The configuration file is read only when starting the server and closed after the reading is done, but we play it safe and stop the server before modifying the file.
One of the critical design decisions here is not to store any state information about the servers that roles apply to. It might be tempting to keep a list of all the masters by adding them to the role object, but because roles of the servers change over the lifetime of the deployment, the roles are used only to set up the system. Because we allow a role to contain parameters, you can use them to configure several servers with the same information.
slave_role = Final(master=my_deployment.master)
for slave inmy_deployment.slaves:
In this chapter you have seen how to build a library for making administration of your servers easy. You have also seen the beginning of the MySQL Replicant library that we will be developing throughout this book.
Joel finished testing his script. He was pretty confident he had all of the parts in place and that the resulting command would save him a lot of time in the future. He clicked Enter.
A few moments later, his script returned the data he expected. He checked his servers thinking this was too easy, but he found everything he wanted to do had been done. “Cool, that was easy!” he said, and locked his screen before heading to lunch.