Chapter 8. Implementing a Highly Available Distributed File Server

This chapter, with a more advanced focus, deals with high availability for the file server role in Samba 4 deployments. The objective of this chapter is to show how to implement a highly available file server using Samba 4 and software such as Clustered Trivial Database (CTDB) and the GlusterFS distributed filesystem.

The user will be instructed to implement this solution through step-by-step command-line examples, touching some of the many concepts involved and covering the following topics:

· Preparing the Debian GNU/Linux environment

· Configuring GlusterFS

· Integrating CTDB [43], GlusterFS, and the Samba 4 Server

· Executing some tests and validations on the highly available distributed filesystem server

The chapter will follow the objectives of the book in being practical, hence it should not be an exhaustive source for the technologies referenced here (for example, GlusterFS), and will not cover all the options and possible scenarios. That said, it will provide a great foundation and a solid work example to implement a complete solution to provide highly available file services.

Preparing the Debian GNU/Linux environment

Before we start the configuration of the software involved in the solution, we need to have the proper operating system environment so that we can install the software required for our highly available file server. If you have installed Debian GNU/Linux 7.2 (Wheezy) or above, the CTDB version available in that release is ready to use for our requirement. In this case, just make sure you execute the following commands before you proceed to the software installation:

root@gluster1:~# apt-get update

Hit http://ftp.br.debian.org wheezy Release.gpg

Hit http://ftp.br.debian.org wheezy-updates Release.gpg

Hit http://ftp.br.debian.org wheezy Release

Hit http://ftp.br.debian.org wheezy-updates Release

Hit http://ftp.br.debian.org wheezy/main Sources

Hit http://ftp.br.debian.org wheezy/main amd64 Packages

… SKIPPED FOR BREVITY …

Reading package lists... Done

root@gluster1:~# apt-get upgrade

Reading package lists... Done

Building dependency tree

Reading state information... Done

0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

root@gluster1:~#

The two preceding commands will update the package list for our distribution and apply any upgrades needed to bring our Debian GNU/Linux installation to the most up-to-date version. For older versions of the Debian GNU/Linux distribution (for example, 6.x Squeeze), the user will need to upgrade the whole distribution with the preceding commands, followed by the apt-get dist-upgrade command. The references have a link for the official Debian documentation explaining the distribution upgrade process in detail [50].
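For reference, the following is a rough sketch of that distribution upgrade; treat it only as an outline, read the official upgrade documentation [50] first, and note that the sources.list edit assumes the stock Debian mirror entries:

root@gluster1:~# cp -pRf /etc/apt/sources.list /etc/apt/sources.list-`date '+%s%m%d%Y'`

root@gluster1:~# sed -i 's/squeeze/wheezy/g' /etc/apt/sources.list

root@gluster1:~# apt-get update && apt-get upgrade && apt-get dist-upgrade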

After the operating system environment is configured and ready (for example, Debian Wheezy 7.2+), we can proceed with the installation of the CTDB software. We will need two packages for our configuration: ctdb and libctdb-dev. Just issue the following command in a terminal window as the root user or using sudo [73]:

root@gluster1:~# apt-get install ctdb libctdb-dev

Reading package lists... Done

Building dependency tree

Reading state information... Done

The following extra packages will be installed:

ctdb libctdb-dev

… SKIPPED FOR BREVITY …

root@gluster1:~#

With the operating system environment configured and ready, we can proceed to the installation of the GlusterFS and XFS filesystem tools. Just log in and use the su or sudo commands to install the packages as follows:

root@gluster1:~# apt-get install glusterfs-server xfsprogs

Reading package lists... Done

Building dependency tree

Reading state information... Done

The following extra packages will be installed:

fuse fuse-utils glusterfs-client glusterfs-common libibverbs1 libreadline5

Suggested packages:

glusterfs-examples xfsdump quota

The following NEW packages will be installed:

fuse fuse-utils glusterfs-client glusterfs-common glusterfs-server

libibverbs1 libreadline5 xfsprogs

0 upgraded, 8 newly installed, 0 to remove and 0 not upgraded.

Need to get 13.6 MB of archives.

… SKIPPED FOR BREVITY …

Setting up xfsprogs (3.1.7+b1) ...

Processing triggers for initramfs-tools ...

update-initramfs: Generating /boot/initrd.img-3.2.0-4-amd64

Setting up fuse-utils (2.9.0-2+deb7u1) ...

Setting up glusterfs-client (3.2.7-3+deb7u1) ...

Setting up glusterfs-server (3.2.7-3+deb7u1) ...

[ ok ] Starting glusterd service: glusterd.

root@gluster1:~#

In the preceding procedure, we can see the command needed to install the GlusterFS server package, XFS tools, and all their dependencies on the Debian GNU/Linux distribution. Just remember that we need the preceding software on all nodes, and in our specific cluster, we will have two servers (for example, gluster1 and gluster2), so we need to execute the previous procedures on both nodes.

The last step in our preparation phase is to install the Samba 4 software. The reader can refer to the previous chapters, where we executed the Samba 4 installation instructions step by step. The only point to note is to add the essential option, --with-cluster-support, to the configure command so that Samba 4 is compiled with the needed cluster features.
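As a quick reminder, the following is a minimal sketch of that build step; the /usr/src/samba-4.x path is just an assumption, and the prefix matches the /usr/local/samba paths used throughout this book:

root@gluster1:/usr/src/samba-4.x# ./configure --prefix=/usr/local/samba --with-cluster-support

root@gluster1:/usr/src/samba-4.x# make && make install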

Configuring GlusterFS for high availability and scalability

In this topic, our first goal is to explain an important distinction when we discuss cluster filesystems. In IT, many words are used out of context so often that they end up meaning many different things. We will not try to establish any formal definition here, but just explain why we have decided to use the GlusterFS filesystem for our highly available file server and the important difference between GlusterFS and other cluster filesystem approaches (for example, Oracle Cluster File System (OCFS) [41]).

If there is a word in IT that has the most different meanings for different people, it is the word "cluster". We have many filesystems that are called cluster filesystems, and if we go to the homepage of the GlusterFS project (http://www.gluster.org/), we will see that the word cluster is used many times. But the most important word for our use case, and the reason we have selected the GlusterFS project for our solution, is "distributed". Most cluster filesystems, and the mental picture many users have of them, are more in line with the OCFS [41] approach.

OCFS is a type of filesystem where different machines share access to the same storage device (for example, disks). OCFS has a mechanism to control simultaneous access to the underlying storage between the different servers, while still providing a consistent filesystem view for the clients, independent of which server a client is connected to. This is a legitimate solution, and OCFS is a very robust production filesystem. The issue with this design for our use case is that it requires access to a shared-disk infrastructure [for example, a Storage Area Network (SAN)] and relies on the high availability of the disk arrays underneath. Besides this, the scalability of our solution would depend on the scalability of the backend disks for performance and space requirements.

Enter the distributed filesystem approach (of which GlusterFS [42] is an example), which does not rely on a shared-disk infrastructure but combines storage from different machines/servers as building blocks to present a single global namespace to the clients. In our experience, this model is much better suited for scalable, highly available, high-performance file server solutions, as there is no single point of failure, and both performance and capacity scale horizontally.

It's important to note that OCFS, used here as an example of a shared-disk cluster filesystem, has its requirements and use cases very well defined on the Oracle OCFS page [41]. OCFS has great features and a specific use case for Oracle Real Application Clusters [41]. It is used in many other solutions and can serve as the base for a Samba 4 highly available file server too. We have just explained the reasons behind choosing a distributed filesystem (GlusterFS), so feel free to use whatever is more adequate for your environment or whatever you feel more comfortable with (the procedure presented here should help you in a big way even if you decide to use a different filesystem).

Tip

Our file server is for a small/medium network with moderate performance requirements. The scale-out nature of GlusterFS provides us the ability to add more nodes and enhance performance as we need, but every filesystem and solution has its own trade-offs, and you need to evaluate what better fits your use case (for example, small files, big files, random access, and so on).

One last but important point to note is that GlusterFS is not a "full" filesystem per se; it concatenates existing filesystems into one big single namespace, so the data on those filesystems gets distributed, replicated, or both across the GlusterFS nodes [44]. We will use XFS as the underlying filesystem for the GlusterFS bricks, as it is a very robust filesystem and offers good general performance across different use cases.

Now let's begin the real work and start the configuration of the GlusterFS part of our solution. Our highly available distributed file server will be composed of two file servers, three distinct network interfaces, and two Virtual IPs (VIPs).

One network will be used for our CIFS/SMB clients (192.168.1.0), and the VIPs will be on this network too, as follows:

· Node 0 (gluster1/eth0) = 192.168.1.1

· Node 1 (gluster2/eth0) = 192.168.1.2

· VIPs: 192.168.1.20 and 192.168.1.30

The second network will be used for the GlusterFS heartbeat (10.10.10.0/24) as follows:

· Node 0 (gluster1/eth1) = 10.10.10.21

· Node 1 (gluster2/eth1) = 10.10.10.22

The third one will be used for the CTDB heartbeat (10.11.11.0/24) as follows:

· Node 0 (gluster1/eth2) = 10.11.11.21

· Node 1 (gluster2/eth2) = 10.11.11.22

Every cluster solution that I have worked with needed specific interconnection interfaces for exclusive communication between the machines integrating the cluster. Some clusters needed as much as three different exclusive interfaces for this role (for redundancy) or required the use of a quorum network component (for example, a server), a quorum device (for example, a shared disk), or even both. This is because clusters need to handle different failure scenarios, and specifically two-node clusters have even more specific issues (for example, a communication failure between nodes is a total partition/split-brain).

So, we will use one dedicated interface for each heartbeat communication instance [46] (GlusterFS and CTDB) and the data network (for example, client-facing interface). Setting the latter to be a dedicated interface is important for performance and isolation/troubleshooting purposes (for example, NAS Network). Besides that, each server (for example, a cluster node) will have one dedicated disk to be used in the GlusterFS configuration as a data disk (for example, "brick" in GlusterFS terms).
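For reference, the following is a minimal /etc/network/interfaces sketch for the first node (gluster1) using the addressing plan above; the device names and addresses are assumptions that you should adapt to your own environment:

# /etc/network/interfaces excerpt for gluster1 (example only)

auto eth0

iface eth0 inet static

    address 192.168.1.1

    netmask 255.255.255.0

auto eth1

iface eth1 inet static

    address 10.10.10.21

    netmask 255.255.255.0

auto eth2

iface eth2 inet static

    address 10.11.11.21

    netmask 255.255.255.0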

Tip

It's very important to have a dedicated network for the client's data.

The first thing we need to make sure is configured and working properly are our network connections, so that we can start configuring all the interfaces on both nodes. The following commands are an example of the three interfaces for the first node:

root@gluster1:~# ifconfig -a | grep -A2 "eth0\|eth1\|eth2"

eth0 Link encap:Ethernet HWaddr fa:ce:27:ed:bf:39

inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0

inet6 addr: fe80::a00:27ff:feed:bf39/64 Scope:Link

--

eth1 Link encap:Ethernet HWaddr fa:ca:27:f9:cc:a4

inet addr:10.10.10.21 Bcast:10.10.10.255 Mask:255.255.255.0

inet6 addr: fe80::a00:27ff:fef9:cca4/64 Scope:Link

--

eth2 Link encap:Ethernet HWaddr ce:fa:27:3e:72:d4

inet addr:10.11.11.21 Bcast:10.11.11.255 Mask:255.255.255.0

inet6 addr: fe80::a00:27ff:fe3e:72d4/64 Scope:Link

root@gluster1:~#

Now let's take a look at the disks that we have available on our nodes (for example, node 1) as follows:

root@gluster2:~# fdisk -l | grep sd

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sda: 8589 MB, 8589934592 bytes

/dev/sda1 * 2048 16254975 8126464 83 Linux

/dev/sda2 16257022 16775167 259073 5 Extended

/dev/sda5 16257024 16775167 259072 82 Linux swap / Solaris

Disk /dev/sdb: 5368 MB, 5368709120 bytes

root@gluster2:~#

Tip

As we start working on disks, it's very important that we make sure on which disks we are working and validate all procedures in a test environment before moving to production. Be aware that some of the commands we will cover hereafter, when executed on improper devices (for example, disks), may lead to data loss.

The preceding output shows us that we have a disk (/dev/sda) that contains the partitions for our OS installation and another (/dev/sdb) that is available and without a partition table (first line in the output above). That is our brand new disk, where our Samba 4 clients' data will reside, and thus we can create an XFS filesystem on it (make sure all nodes you are preparing for the cluster have a disk available for our XFS filesystem).

Now we can execute the following script on both servers (for example, gluster1 and gluster2) to create one partition on the /dev/sdb disk and list it afterwards to make sure it was created properly:

root@gluster1:~# echo -e "n\np\n1\n\n\nw" | fdisk /dev/sdb > /dev/null 2>&1 && fdisk -l /dev/sdb

Disk /dev/sdb: 5368 MB, 5368709120 bytes

181 heads, 40 sectors/track, 1448 cylinders, total 10485760 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0xad90d89b

Device Boot Start End Blocks Id System

/dev/sdb1 2048 10485759 5241856 83 Linux

root@gluster1:~#

So far so good, let's create the XFS filesystem (execute on both nodes):

root@gluster1:~# mkfs.xfs /dev/sdb1 && echo OK

meta-data=/dev/sdb1 isize=256 agcount=4, agsize=327616 blks

= sectsz=512 attr=2, projid32bit=0

data = bsize=4096 blocks=1310464, imaxpct=25

= sunit=0 swidth=0 blks

naming =version 2 bsize=4096 ascii-ci=0

log =internal log bsize=4096 blocks=2560, version=2

= sectsz=512 sunit=0 blks, lazy-count=1

realtime =none extsz=4096 blocks=0, rtextents=0

OK

root@gluster1:~#

The next step is to create the mount point for our brand new XFS filesystem, persist it on our operating system configuration (for example, /etc/fstab), and mount it (the procedure to be executed on all cluster nodes) as follows:

root@gluster1:~# mkdir -p /var/storage/disk1

root@gluster1:~# cp -pRf /etc/fstab /etc/fstab-`date '+%s%m%d%Y'` &&

echo "/dev/sdb1 /var/storage/disk1 xfs defaults 0 0" >> /etc/fstab && echo OK

OK

root@gluster1:~# mount -a && df -h | grep sdb1

/dev/sdb1 5.0G 33M 5.0G 1% /var/storage/disk1

The preceding commands first create a backup of our /etc/fstab file before editing it. The last command mounts the XFS filesystem on the mount point we have created (reading it from the /etc/fstab file) and uses the df utility to show us the mounted partition information. For the bootstrap of our configuration, let's add the hostname/IP pair of each server to the /etc/hosts file for simplicity (execute the same procedure on both nodes) as follows:

root@gluster1:~# cp -pRf /etc/hosts /etc/hosts-`date '+%s%m%d%Y'` &&

echo "10.10.10.21 gluster1-gc" >> /etc/hosts && \

echo "10.10.10.22 gluster2-gc" >> /etc/hosts && \

echo "10.11.11.21 gluster1-cc" >> /etc/hosts && \

echo "10.11.11.22 gluster2-cc" >> /etc/hosts && echo OK

OK

root@gluster1:~#

After we have executed these basic procedures successfully (for example, received OK, as in the preceding output), we can start the configuration of the GlusterFS distributed filesystem. Make sure you have executed all the previous procedures on both nodes until this point before moving on to the next steps.

Again, we need to execute the following command on both nodes, but just slightly differently. Here, we have one example of running it on node 1 (gluster2) as follows:

root@gluster2:~# gluster peer probe gluster1-gc

Probe successful

root@gluster2:~#

Tip

The systems should not have previous Samba software versions (for example, Version 3) or old configurations and binaries that can conflict with Samba 4. Remember to remove all configurations and software previously installed (for example, using apt-get purge). Test everything in a lab environment and create backups of everything you remove or reconfigure.

The important point in the previous commands is that we need to probe node 0 (gluster1) from node 1 (gluster2) and vice versa. Note that we are using the Gluster interconnect interface (gluster1-gc) because that is the heartbeat interface we have created for the GlusterFS inter-node communication.

Before we create our volume, let's just clarify two important and distinct configuration options for Gluster: replicate and distribute. We will use the first option, as it replicates data between the nodes of the cluster, which gives us redundancy and availability at the Gluster filesystem level (very important, as we are using just one disk on each server without RAID [47]). The second mode (distribute) places the files across the nodes (for example, volumes) of the Gluster volume [45], which is good for performance and space, but we would need to add a layer of redundancy on our disks, or any node failure would result in unavailability for our clients. We can even combine the two modes, as sketched next, but that is out of the scope of this book.
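For illustration only, a combined distributed-replicated layout is requested by listing more bricks than the replica count; the following hedged sketch assumes two extra, hypothetical nodes (gluster3-gc and gluster4-gc) and a second brick directory (/var/storage/disk2) that are not part of our two-node setup:

root@gluster1:~# gluster volume create smb02 replica 2 \

gluster1-gc:/var/storage/disk2 gluster2-gc:/var/storage/disk2 \

gluster3-gc:/var/storage/disk2 gluster4-gc:/var/storage/disk2

In such a layout, GlusterFS pairs the bricks in the order given (the first two form one replica set, the last two another) and distributes the files across the two sets.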

Execute the following commands on just one node. The following is an example of the execution on the gluster2 node:

root@gluster2:~# gluster volume create smb01 replica 2 \

gluster1-gc:/var/storage/disk1 gluster2-gc:/var/storage/disk1

Creation of volume smb01 has been successful. Please start the volume to access data.

root@gluster2:~# gluster volume start smb01

Starting volume smb01 has been successful

root@gluster2:~#

If the two preceding commands were executed successfully, we should have our brand new volume ready to go! As we need to see it to believe it, the following command will show the state of the glusterfs volume (executed from any node):

root@gluster2:~# gluster volume info

Volume Name: smb01

Type: Replicate

Status: Started

Number of Bricks: 2

Transport-type: tcp

Bricks:

Brick1: gluster1-gc:/var/storage/disk1

Brick2: gluster2-gc:/var/storage/disk1

root@gluster2:~#

The preceding command is very informative, as we can see the volume name, the volume type (in our case, replicate), the number and nodes/path of the individual bricks, and most importantly, the status (in our execution example, Started).

Integrating CTDB, GlusterFS, and the Samba 4 Server

CTDB is the software that implements the clusterization of the Trivial Database (TDB) [49] used by Samba [43]. As CTDB provides functionalities that are similar to those provided by TDB (for example, the same type of functions), Samba or any other project/software that already uses the Trivial Database can migrate to a clustered version (CTDB) with minimal effort.
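Later on, once CTDB is up and running, you can see which TDB databases it is clustering; a hedged check for that (run from any node) is simply the following command, which lists the clustered databases along with their identifiers and on-disk paths:

root@gluster1:~# ctdb getdbmap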

CTDB needs a special lock file that is accessible from any node, so we will create it on our glusterfs volume (for example, smb01) in a specific directory called ctdb. The following procedure needs to be run in just one node:

root@gluster1:~# mkdir -p /var/lib/samba/glusterfs && echo OK

OK

root@gluster1:~# cp -pRf /etc/fstab /etc/fstab-`date '+%s%m%d%Y'` &&

echo "localhost:/smb01 /var/lib/samba/glusterfs/ glusterfs defaults,_netdev 0 0" >> /etc/fstab && mount -a && echo OK

OK

root@gluster1:~# mkdir /var/lib/samba/glusterfs/ctdb && echo OK

OK

root@gluster1:~#

Now we need to create some files inside that directory, and the first one is the ctdb file. Let's copy the default ctdb file from the Debian GNU/Linux distribution to use it as a template. For that, just issue the following command in one node (the node you have executed the previous procedure in, as it has the glusterfs filesystem already mounted):

root@gluster1:~# cp -pRf /etc/default/ctdb /etc/default/ctdb-`date '+%s%m%d%Y'` && echo OK

OK

root@gluster1:~# rm /etc/default/ctdb && echo OK

OK

root@gluster1:~#

In the other nodes of the cluster, just execute the following command:

root@gluster2:~# rm /etc/default/ctdb && echo OK

OK

root@gluster2:~#

The /var/lib/samba/glusterfs/ctdb/ctdb file is the main configuration file for the CTDB software (for example, on the Debian GNU/Linux distribution), and we will edit this file in just one node (the one we have the glusterfs filesystem mounted in, that is, gluster1) and leave it with the following content:

root@gluster1:~# cat /var/lib/samba/glusterfs/ctdb/ctdb

CTDB_RECOVERY_LOCK="/var/lib/samba/glusterfs/ctdb/lock"

CTDB_PUBLIC_INTERFACE=eth0

CTDB_PUBLIC_ADDRESSES="/var/lib/samba/glusterfs/ctdb/public_addresses"

CTDB_NODES="/var/lib/samba/glusterfs/ctdb/nodes"

CTDB_MANAGES_SAMBA=yes

CTDB_INIT_STYLE=debian

CTDB_SERVICE_SMB=smb4

CTDB_DBDIR=/var/lib/ctdb

CTDB_DBDIR_PERSISTENT=/var/lib/ctdb/persistent

The first line configures the CTDB recovery lock file, CTDB_PUBLIC_INTERFACE defines the interface that will carry the public IP addresses (VIPs), and CTDB_PUBLIC_ADDRESSES points to the file listing those addresses. The file referenced by CTDB_NODES should contain the list of the nodes' IP addresses (that is, the nodes' interconnect interfaces). The last lines allow CTDB to manage the initialization of the Samba 4 daemon and set the locations of the CTDB databases.

The following is the content of the nodes file; execute the following script on just one node to create the file (in the same node we have executed the previous procedures, as we have the glusterfs filesystem already mounted on it):

root@gluster1:~# echo -e "10.11.11.21\n10.11.11.22" > /var/lib/samba/glusterfs/ctdb/nodes

root@gluster1:~# cat !$

cat /var/lib/samba/glusterfs/ctdb/nodes

10.11.11.21

10.11.11.22

root@gluster1:~#

Tip

In the official CTDB documentation at samba.org, you will find the description of each tunable option [52].

We need virtual IPs (VIPs) so that, in case of node failures, the cluster is able to transfer the service addresses to the surviving nodes. These VIPs need to be on the public interface because they are the point of connection for the Samba 4 clients.

Here is the script to be executed just on one node (the same as we executed in the previous steps) to add the content of the public_addresses file and show it:

root@gluster1:~# echo -e "192.168.1.20/24\n192.168.1.30/24" > /var/lib/samba/glusterfs/ctdb/public_addresses

root@gluster1:~# cat !$

cat /var/lib/samba/glusterfs/ctdb/public_addresses

192.168.1.20/24

192.168.1.30/24

root@gluster1:~#

The last configuration step is to edit our smb.conf file and add the clustering directive and specify our share volume, as highlighted in the following excerpt:

root@gluster1:~# cat /usr/local/samba/etc/smb.conf

[global]

clustering = yes

workgroup = POA

netbios name = hafs

security = ads

[share]

path = /var/lib/samba/glusterfs/data

comment = Highly Available Share

read only = No

valid users = @"POA\Domain Admins"

browseable = Yes

After that, we create the data directory and move the smb.conf configuration file for our highly available share to the clustered filesystem (you just need to do the editing on one node, the same way we executed the previous editing), as shown in the following commands:

root@gluster1:~# mkdir /var/lib/samba/glusterfs/data

root@gluster1:~# mv /usr/local/samba/etc/smb.conf \

/var/lib/samba/glusterfs/ctdb/ && echo OK

OK

root@gluster1:~#

Tip

You may consider adding the reset on zero vc = yes option to your smb.conf file. You can find more details about this and other file locking options and considerations in the references [55]. Locking is an important issue in file servers, so the reader needs to understand all the implications of the different options.
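If you decide to use it, the option goes in the [global] section of the clustered smb.conf; a hedged excerpt would look like the following:

[global]

clustering = yes

reset on zero vc = yes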

Now we need to create symbolic links for the ctdb and smb.conf files on every node of the cluster (for example, the gluster1 and gluster2 nodes). So, execute the following procedure on all nodes (including gluster1):

root@gluster1:~# ln -s /var/lib/samba/glusterfs/ctdb/ctdb /etc/default/ctdb && echo OK

OK

root@gluster1:~# ln -s /var/lib/samba/glusterfs/ctdb/smb.conf /usr/local/samba/etc/smb.conf && echo OK

OK

root@gluster1:~#

Some utilities will search for files in the default location using default names (for example, /etc/ctdb/nodes), so let's create these links as well (on both nodes) as shown in the following commands:

root@gluster1:~# ln -s /var/lib/samba/glusterfs/ctdb/nodes /etc/ctdb/nodes && echo OK

OK

root@gluster1:~# ln -s /var/lib/samba/glusterfs/ctdb/public_addresses /etc/ctdb/public_addresses && echo OK

OK

root@gluster1:~#

Now we are going to start the services, so let's do it! As we executed the majority of the configuration procedure on the gluster1 node, we should already have the glusterfs volume mounted on that node. Now, we just need to mount it on the second node as shown in the following commands:

root@gluster2:~# mkdir -p /var/lib/samba/glusterfs && echo OK

OK

root@gluster2:~# cp -pRf /etc/fstab /etc/fstab-`date '+%s%m%d%Y'` &&

echo "localhost:/smb01 /var/lib/samba/glusterfs/ glusterfs defaults,_netdev 0 0" >> /etc/fstab && mount -a && echo OK

OK

root@gluster2:~#

As we have installed the Samba 4 software by compiling it from the source code directly, we are not using a binary package from the distribution and so we do not have a Samba 4 initialization script. We have made some minor adjustments to a publicly available Samba 4 initialization script [48], which was based on the script from the Debian GNU/Linux distribution for the Samba 3 package, and it is provided as an example initialization script in the book's repository. So we just need to download it, put it in the /etc/init.d/ directory, and name it smb4 (on both nodes), as follows:

root@gluster1:~# cd /etc/init.d/ && echo OK

OK

root@gluster1:~# wget --quiet https://raw.github.com/packt/bookrepository/master/smb4-fs && mv smb4-fs smb4 && echo OK

OK

root@gluster1:~# chmod 755 smb4 && cd && echo OK

OK

root@gluster1:~#

Execute the following commands to adjust the CTDB script path and create a link for the default Samba configuration file (in all nodes):

root@gluster1:~# cp -pRf /etc/ctdb/functions /etc/ctdb/functions-`date '+%s%m%d%Y'` && \

sed -e 's/PATH=\/bin:.*$/PATH=\/usr\/local\/samba\/bin:\/usr\/local\/samba\/sbin:\/bin:\/usr\/bin:\/usr\/sbin:\/sbin:\$PATH/g' \

/etc/ctdb/functions > /etc/ctdb/functions-new && mv \

/etc/ctdb/functions-new /etc/ctdb/functions && echo OK

OK

root@gluster1:~# mkdir /etc/samba && ln -s /var/lib/samba/glusterfs/ctdb/smb.conf /etc/samba/ && echo OK

OK

root@gluster1:~#

Ready to start the CTDB software and bring our Samba 4 highly available distributed file server online? Good! Just execute the following commands on both nodes:

root@gluster1:~# service ctdb start

[ok]

root@gluster1:~#

The CTDB cluster will not be up and running (healthy) right away, as we still need to add our HA file servers to the domain. The winbindd daemon will not start until we do that, and CTDB will identify that error and prevent the system from being fully functional. We can take a look at the ctdb logs, as follows, to see the typical error at this configuration stage:

root@gluster1:~# tail /var/log/ctdb/log.ctdb

2013/12/12 18:35:35.732508 [18195]: Release freeze handler for prio 3

2013/12/12 18:35:36.031226 [recoverd:18454]: Resetting ban count to 0 for all nodes

2013/12/12 18:35:49.465193 [18195]: 50.samba: ERROR: winbind - wbinfo -p returned error

2013/12/12 18:35:52.046421 [recoverd:18454]: Trigger takeoverrun

2013/12/12 18:35:54.644519 [18195]: 50.samba: ERROR: winbind - wbinfo -p returned error

2013/12/12 18:35:59.863546 [18195]: 50.samba: ERROR: winbind - wbinfo -p returned error

...

root@gluster1:~#

In order to fix this error, we need to join the nodes to our domain, but we need to execute the join procedure on just one node. The CTDB cluster will handle the synchronization of the secrets.tdb file so that credentials are consistent between our nodes (the join is made for our cluster's NetBIOS name and is valid for all cluster members). To our AD DS server, no matter how many GlusterFS, CTDB, and Samba file servers we have in the cluster, they will all appear as that single host:

root@gluster1:~# export PATH=/usr/local/samba/sbin:\

/usr/local/samba/bin:$PATH

root@gluster1:~# net ads join -U administrator

Enter administrator's password:

Using short domain name -- POA

Joined 'HAFS' to dns domain 'poa.msdcbrz.eall.com.br'

Not doing automatic DNS update in a clustered setup.

root@gluster1:~#
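Before restarting the services, a hedged way to confirm the machine account from any node (using the same PATH export as above) is to run net ads testjoin, which should report that the join is OK:

root@gluster2:~# net ads testjoin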

Now we restart the CTDB services (on both nodes) as follows:

root@gluster1:~# service ctdb stop && service ctdb start

[....] Stopping Clustered TDB: ctdb

[ ok ] Starting Clustered TDB : ctdb.

root@gluster1:~#

And after a few minutes, we should be able to confirm that the status of the services will be OK, as follows:

root@gluster1:~# ctdb status

Number of nodes:2

pnn:0 10.11.11.21 OK (THIS NODE)

pnn:1 10.11.11.22 OK

Generation:225613703

Size:2

hash:0 lmaster:0

hash:1 lmaster:1

Recovery mode:NORMAL (0)

Recovery master:0

root@gluster1:~#

The systems need some time to start and verify the services until we get to the OK statuses, so just wait a few minutes (for example, watch ctdb status). If you face any problems or the system does not come online, the logfile /var/log/ctdb/log.ctdb has information about what the problem could be. When debugging problems with CTDB, we can raise the default debug level from ERR to a number such as 3 or 5 by setting, for example, CTDB_DEBUGLEVEL=5 in the /etc/default/ctdb file; I have tested with 5 and it is really verbose.
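A hedged sketch of raising that level through our shared configuration file (remember that /etc/default/ctdb is a link to it on every node) and then restarting CTDB on both nodes is:

root@gluster1:~# echo 'CTDB_DEBUGLEVEL=5' >> /var/lib/samba/glusterfs/ctdb/ctdb

root@gluster1:~# service ctdb stop && service ctdb start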

The last point to confirm before going on to more specific tests is to check if the VIPs are online on our nodes, as follows:

root@gluster1:~# ctdb ip -v

Public IPs on node 0

192.168.1.20 node[0] active[eth0] available[eth0] configured[eth0]

192.168.1.30 node[1] active[] available[eth0] configured[eth0]

root@gluster1:~#

root@gluster2:~# ctdb ip -v

Public IPs on node 1

192.168.1.20 node[0] active[] available[eth0] configured[eth0]

192.168.1.30 node[1] active[eth0] available[eth0] configured[eth0]

root@gluster2:~#

The preceding command is very handy, as we can see information about the IP allocation on the cluster; another important piece of data is the active column. From the preceding output, we can see that each of our servers has one active IP, so the cluster automatically balances the distribution of IPs evenly across the nodes, which is good for performance and availability. Let's show the network configuration of each server, so we can confirm that each node holds the IP shown in the preceding output:

root@gluster1:~# ip addr show eth0

3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether fa:ce:27:ed:bf:39 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.2/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.20/24 brd 192.168.1.255 scope global secondary eth0

inet6 fe80::a00:27ff:feac:f0c9/64 scope link

valid_lft forever preferred_lft forever

root@gluster1:~#

Now let's look at the same command execution on node 1, as follows:

root@gluster2:~# ip addr show eth0

3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether fa:ce:27:f3:8d:50 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.3/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.30/24 brd 192.168.1.255 scope global secondary eth0

inet6 fe80::a00:27ff:fef3:8d50/64 scope link

valid_lft forever preferred_lft forever

root@gluster2:~#

Everything is up and running and we can go ahead and start our tests on the file server and validate our configuration!

Executing tests and validations on the highly available file server

The first test we need to execute on our highly available file server is to try to access the share directly on the server localhost interface. We can perform this test using the smbclient tool as follows:

root@gluster1:~# smbclient -L localhost -U 'zisala%w1ndow$$'

Domain=[POA] OS=[Unix] Server=[Samba 4.0.9]

Sharename Type Comment

--------- ---- -------

share Disk Highly Available Share

root@gluster1:~#

Tip

Note that if you have used the interfaces option to bind the Samba 4 services to specific interfaces and have removed the loopback interface from that list, the preceding tests against localhost will hang or fail, so keep the loopback interface in your configuration.
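A hedged excerpt of such an interface-bound configuration, keeping the loopback in the list, could look like the following:

[global]

clustering = yes

interfaces = lo eth0

bind interfaces only = yes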

Second, let's try to access using the VIPs as follows:

root@gluster1:~# smbclient -L 192.168.1.20 -U 'zisala%w1ndow$$'

Domain=[POA] OS=[Unix] Server=[Samba 4.0.9]

Sharename Type Comment

--------- ---- -------

share Disk Highly Available Share

root@gluster1:~#

And…

root@gluster1:~# smbclient -L 192.168.1.30 -U 'zisala%w1ndow$$'

Domain=[POA] OS=[Unix] Server=[Samba 4.0.9]

Sharename Type Comment

--------- ---- -------

share Disk Highly Available Share

root@gluster1:~#

Perfect! It's time to test the access and the high availability of our file server directly from a Microsoft Windows client machine. Let's first create DNS entries for our VIPs so that we can use a name instead of the IP addresses and let our cluster handle any failures for us, as it fails over the VIPs transparently between the nodes. Follow these steps:

1. Navigate to Administrative Tools | DNS, connect to the addc AD DS, and expand the DNS server to select our domain as shown in the following screenshot:

2. In the top menu, select View and check the option Advanced as shown in the following screenshot:

3. The preceding selection is needed to be able to edit the TTL value for our glusterfs records [51]. We need to edit the TTL to provide a faster recovery time for our file server cluster and some kind of load balancing. Just for our test, we will create two A records and set the TTL to 1. Just right-click on the poa.msdcbrz.eall.com.br zone and select New Host (A or AAAA)... as shown in the following screenshot:

4. In the New Host screen next, fill the first box with the name glusterfs, fill the IP address box with 192.168.1.20, and at the bottom of the screen set the TTL to 1. The following screenshot has an example for this procedure:

5. After we have clicked on the Add Host button in the previous screen, we repeat the process, just changing the IP from 192.168.1.20 to 192.168.1.30, as shown in the following screenshot. So, at the end, we should have two new glusterfs records, each pointing to different IPs and both with TTL 1. Just click on Done.

6. As stated in [51], we need to configure the Microsoft Windows client cache time to actually have a consistent configuration and make our short TTL effective in our test environment. So, let's open the registry and find the key HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Dnscache\Parameters, as shown in the following screenshot:

7. We need to add the entry DWORD (32-bit) Value with value name MaxCacheEntryTtlLimit [51] as shown in the following screenshot:

8. The value is in seconds, so let's double-click on it and set it to 1 as shown in the following screenshot:

9. In the end, our DNS cache parameters should read like those in the following screenshot:

10. Log in to a Microsoft Windows Server 2008 R2 machine as a user that has permission to access the share we have created (for example, a member of the Domain Admins group), click on Start, enter \\glusterfs in the Search programs and files input box, as shown in the following screenshot, and press Enter:

11. We should be presented with an explorer window showing our share. On double-clicking on it, we should see that our share is empty, as shown in the following screenshot:

12. Now we can copy some data to our share, and while the copying process is running, we will stop the file server services on the node that holds the Microsoft Windows connection. To identify our connection, let's execute the ipconfig command in a command prompt on the Microsoft Windows Server 2008 R2 machine, as shown in the following screenshot:

13. From the preceding output, we can see that our client has the IP 192.168.1.12. Now we can execute the following command in all nodes of our GlusterFS cluster and identify the node that the client is connected to. We can do this using more generic tools from the GNU/Linux world (for example, netstat and grep) together with the onnode utility as a first approach, as follows:

root@gluster1:~# onnode all "netstat -anl | grep -w 192.168.1.12"

>> NODE: 10.11.11.21 <<

tcp 0 0 192.168.1.20:445 192.168.1.12:57694 ESTABLISHED

>> NODE: 10.11.11.22 <<

root@gluster1:~#

The preceding utility (onnode) receives as arguments the nodes where we want to execute commands (for example, all) and the command we want to execute (in this case, listing the host connections and searching for one specific IP).
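For instance, a hedged one-liner to check the brick mount on every node would be:

root@gluster1:~# onnode all "df -h /var/storage/disk1"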

14. An easier option is to use the smbstatus tool to present us with the answer in a well-formatted and very clean manner (we will use the -p option to show us the processes and -n for the numeric data, as seen in the netstat command earlier):

root@gluster1:~# smbstatus -np

Samba version 4.0.13

PID Username Group Machine

-------------------------------------------------------------------

0:12624 1500 1513 192.168.1.12 (ipv4:192.168.1.12:57694)

root@gluster1:~#

We were lucky because the Microsoft Windows client is connected to the same host where our SSH session is open (node 0 = gluster1), so we can just stop the services on this node, and CTDB will migrate the IP to another node, which should be transparent for clients.

Let's copy something to our share. We have the matrix's logs on our desktop, so we can copy them to the glusterfs share so that other architects can inspect them and look for failures in the system. In tests such as these, it's important to have the MD5 of the file so that, at the end of the test, we are able to make sure the file was not corrupted and that the cluster really did a consistent job on the IP transfer, with glusterfs and everything else working as expected for the stability of our file server (we will use xcopy [53] to copy the file; this utility has a /v option for verification of the file after the copy, which you may use if you need to). The MD5 for our file is eeaa7d4c17d5c6c86aa7aac57c3594db. Follow these steps:

1. First let's see the status of the CTDB services as follows:

root@gluster1:~# ctdb status

Number of nodes:2

pnn:0 10.11.11.21 OK (THIS NODE)

pnn:1 10.11.11.22 OK

Generation:985025383

Size:2

hash:0 lmaster:0

hash:1 lmaster:1

Recovery mode:NORMAL (0)

Recovery master:0

root@gluster1:~#

Tip

For the services startup (for example, at operating system boot), one important step is to make sure that the mount of the system partitions/disks (for example, the glusterfs volumes) happens only after the gluster service is online. Another point is to organize the dependencies of each service according to the GNU/Linux distribution's initialization framework (for example, ctdb needs to be the last service to come online, as it will start Samba, and all the underlying infrastructure needs to be up and running before that).

2. Just open a command prompt window and issue the command demonstrated in the following screenshot:

3. While our copying process is running, we will use the smbstatus command one more time to make sure we will disable only the node where our Microsoft Windows client is connected, as shown in the following command:

root@gluster1:~# smbstatus -np

Samba version 4.0.13

PID Username Group Machine

-------------------------------------------------------------------

0:12624 1500 1513 192.168.1.12 (ipv4:192.168.1.12:57694)

root@gluster1:~#

4. The Microsoft Windows client is connected to the gluster1 host (node 0), so we will disable this node, which will make the CTDB software migrate the VIP to gluster2 (node 1), as follows:

root@gluster1:~# ctdb disable -n 0 && echo OK

OK

root@gluster1:~#

5. For this example, we have mapped the \\glusterfs\share share to Z: on our Microsoft Windows client just to facilitate the tests (Z: is the first drive letter offered when mapping).

6. Now, we can look at the CTDB status to confirm that node 0 is disabled, as follows:

root@gluster2:~# ctdb status

Number of nodes:2

pnn:0 10.11.11.21 DISABLED

pnn:1 10.11.11.22 OK (THIS NODE)

Generation:2123322465

Size:2

hash:0 lmaster:0

hash:1 lmaster:1

Recovery mode:NORMAL (0)

Recovery master:1

root@gluster2:~#

7. The preceding output is exactly what we were expecting, as it shows node 0 (pnn: 0) disabled, and so we can assume that both VIPs are on node 1 (gluster2). Let's take a look at the following commands:

root@gluster2:~# ip addr show eth0

3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether fa:ce:27:f3:8d:50 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.3/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.30/24 brd 192.168.1.255 scope global secondary eth0

inet 192.168.1.20/24 brd 192.168.1.255 scope global secondary eth0

inet6 fe80::a00:27ff:fef3:8d50/64 scope link

valid_lft forever preferred_lft forever

root@gluster2:~#

8. Now, looking at smbstatus one more time, we should see our Microsoft Windows client connected to gluster2 (node 1) as follows:

root@gluster2:~# smbstatus -np

Samba version 4.0.13

PID Username Group Machine

-------------------------------------------------------------------

1:26044 1500 1513 192.168.1.12 (ipv4:192.168.1.12:50116)

root@gluster2:~#

9. We need to wait until the copying process finishes; the following screenshot shows the end of our example copy:

10. Based on our previously calculated MD5 hash, we can validate whether our failover had any impact on the consistency of the copied file. So, we can execute the md5sum utility directly on one of our nodes to validate it, as follows:

root@gluster2:~# md5sum /var/lib/samba/glusterfs/data/matrix.log

eeaa7d4c17d5c6c86aa7aac57c3594db /var/lib/samba/glusterfs/data/matrix.log

root@gluster2:~#

Tip

Browse to samba.org to watch some cool screencasts. There's even one on a procedure similar to the one presented here. There are some on CTDB and failover scenarios too [54].

11. Exactly the same! We have a robust solution that can also scale, thanks to the distributed nature of glusterfs. This kind of solution provides much better administration flexibility and control, as well as availability and resilience for the services, all of which leads to happier customers. To bring our cluster back to normal operation with all nodes in a healthy state, let's re-enable node 0 (the gluster1 host) as follows:

root@gluster2:~# ctdb enable -n 0 && echo OK

OK

root@gluster2:~# ctdb status

Number of nodes:2

pnn:0 10.11.11.21 OK

pnn:1 10.11.11.22 OK (THIS NODE)

Generation:2123322465

Size:2

hash:0 lmaster:0

hash:1 lmaster:1

Recovery mode:NORMAL (0)

Recovery master:1

root@gluster2:~#

Summary

In this chapter, we learned how to create a highly available distributed file server using the Debian GNU/Linux distribution, Samba 4, and other very important components, such as CTDB, XFS, and GlusterFS. We have learned step-by-step procedures from the preparation of the operating system environment, installation of the needed software and dependencies (for example, libraries), to a custom compilation of the Samba 4 application itself.

We have presented the relationship between the different components and how to integrate and implement a proper solution that is robust, highly available, and scalable—all very important characteristics for the highly demanding IT environment. Finally, we executed tests and validations on the high availability and distributed nature of our file server. We even simulated an issue on a node to actually see the behavior of our environment and testify to the resilience of our network file services.

In the next chapter, we will take a look at the Samba's Python binding scripting interface and how we can become familiar with the language and internals of the Samba 4 code. We will also take our first steps into the journey of participation and collaboration in the Samba community.