OpenNode High Availability Cluster solutions are based on Red Hat Cluster Suite (RHCS) software. In this howto we describe one of the simplest ways to set up a basic HA cluster with a shared block device (SAN or iSCSI). The result is a simple 2-node active/passive failover setup which is meant to handle physical server failure.
For more feature-rich cluster configurations - like VM level HA (for both OpenVZ and KVM), DRBD replicated storage, GFS2 shared storage and Load-Balanced Clusters - please contact us for commercial support: firstname.lastname@example.org.
More detailed overview about RHCS can be obtained here: https://access.redhat.com/knowledge/docs/en-US/RedHatEnterpriseLinux/6/html/HighAvailabilityAdd-OnOverview/index.html
Complete RHCS management documentation can be found here: https://access.redhat.com/knowledge/docs/en-US/RedHatEnterpriseLinux/6/html/ClusterAdministration/index.html
- 2 or more physical cluster nodes required
- network equipment should allow UDP multicasting in local LAN for cluster heartbeat
- shared block device between cluster nodes for simple cluster storage required (SAN or iSCSI)
- separate and dedicated network link between nodes for cluster heartbeat is recommended
- in case of a 2-node cluster, also using a quorum disk is highly recommended
Simple 2-node failover cluster design for Highly Available OpenVZ service
This basic failover cluster is meant to handle server hardware failures, i.e. if one of the cluster nodes fails, all VMs are automatically restarted on another cluster node. For simplicity we use a shared block device between cluster nodes, but not a shared filesystem. The filesystem residing on the shared block device is active on ONLY a single cluster node at a time, i.e. it gets mounted only on the node where all VMs are currently running. The HA service created and monitored by the Cluster Manager is the OpenVZ (vz) service.
Installing RHCS software packages
# Execute these tasks on ALL cluster nodes
yum groupinstall "High Availability" "Resilient Storage" -y
# Enable and start ricci service
chkconfig ricci on && service ricci start
# Generate ricci password - needed later for cluster node addition
passwd ricci
# Enable cluster services
chkconfig cman on && chkconfig rgmanager on && chkconfig modclusterd on
Dedicated network connection for cluster heartbeat (multicast)
# If your server has more than one network interface it is
# strongly suggested that you set up a separate directly cabled
# link between server nodes for cluster heartbeat

# Configure dedicated network interfaces on both nodes
# with some unused private LAN IP addresses
nano -w /etc/sysconfig/network-scripts/ifcfg-ethX
--- MODIFY ---
NM_CONTROLLED="no"
ONBOOT="yes"
IPADDR=192.168.50.x
NETMASK=255.255.255.0
--- MODIFY ---

# Add additional hostnames (host-p attached) to both nodes /etc/hosts
# in order to route cluster heartbeat into dedicated link
nano -w /etc/hosts
--- ADD ---
192.168.50.1 node1-p.example.com node1-p
192.168.50.2 node2-p.example.com node2-p
--- ADD ---
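The /etc/hosts entries above follow a regular pattern, so they can be generated instead of typed by hand. The following is only a minimal sketch: the function name hb_hosts_entry and its optional prefix/domain parameters are hypothetical, with defaults taken from the example addresses used in this howto.

```shell
#!/bin/sh
# Print the /etc/hosts line for heartbeat node number N.
# hb_hosts_entry is a hypothetical helper, not part of RHCS.
# $1 = node number, $2 = subnet prefix (assumed default 192.168.50),
# $3 = domain (assumed default example.com)
hb_hosts_entry() {
    n=$1
    prefix=${2:-192.168.50}
    domain=${3:-example.com}
    printf '%s.%s node%s-p.%s node%s-p\n' "$prefix" "$n" "$n" "$domain" "$n"
}

# Possible usage as root on every node (left commented out on purpose):
# for n in 1 2; do hb_hosts_entry "$n"; done >> /etc/hosts
```

Printing the lines first and appending them only after a visual check keeps the step reversible.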
Creating 2-node HA Cluster
# Execute the following commands on one of cluster nodes
ccs -h localhost --createcluster VZHA
ccs -h localhost --addnode node1-p.example.com
ccs -h localhost --addnode node2-p.example.com
ccs -h localhost --setfencedaemon post_join_delay=30
# For 2-node cluster ONLY
ccs -h localhost --setcman expected_votes="1" two_node="1"
# Validate cluster conf
ccs_config_validate
ccs -h localhost --getconf
# NB! Don't forget to sync and activate cluster.conf!!!
ccs -h localhost --sync --activate
# Start cman services on ALL cluster nodes
service cman start
service rgmanager start
service modclusterd start
# Verify cluster status
cman_tool status
clustat
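When verifying cluster status from a script, it can help to pull single values out of the cman_tool status report. The sketch below assumes that the report consists of "Label: value" lines (for example "Nodes: 2"); the helper name status_field is hypothetical, and the exact set of labels may differ between versions.

```shell
#!/bin/sh
# Read `cman_tool status` style output on stdin and print the value
# that belongs to the given label. Assumes "Label: value" lines.
# status_field is a hypothetical helper name.
status_field() {
    awk -F': *' -v key="$1" '
        $1 == key {
            value = $2
            sub(/[[:space:]]+$/, "", value)   # drop trailing whitespace
            print value
            exit
        }'
}

# Possible usage on a cluster node (left commented out on purpose):
# cman_tool status | status_field "Nodes"
# cman_tool status | status_field "Expected votes"
```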
Attaching shared iSCSI block device to ALL cluster nodes (skip if using SAN block device)
# You need to perform these commands on ALL cluster nodes
# Install iSCSI utils
yum install iscsi-initiator-utils -y
# Enable and start iscsid service
service iscsid start && chkconfig iscsid on
# Test discovery
iscsiadm --mode discovery --type sendtargets --portal portal_ip_hostname
# Log into the iSCSI target - to make it persistent across reboots
iscsiadm -m node -T iqn.xxx.com.example.iscsi:diskname -p portal_ip_hostname -l
# Verify that disk is attached
cat /proc/partitions
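The final check against /proc/partitions can be automated on each node. This is a small sketch under the assumption that the device name sits in the fourth column, as in the usual /proc/partitions layout; has_block_device is a hypothetical helper name and sdb is only the example device used later in this howto.

```shell
#!/bin/sh
# Succeed (exit status 0) if the named device is listed in
# /proc/partitions style input on stdin, fail otherwise.
# has_block_device is a hypothetical helper name.
has_block_device() {
    awk -v dev="$1" '$4 == dev { found = 1 } END { exit !found }'
}

# Possible usage on each node (left commented out on purpose):
# has_block_device sdb < /proc/partitions && echo "shared disk visible"
```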
Enabling Clustered LVM (CLVM)
# You need to execute these commands on ALL cluster nodes!
# Changing LVM locking on all nodes to clustered type
nano -w /etc/lvm/lvm.conf
--- MODIFY ---
locking_type = 3
--- MODIFY ---
# Enabling and starting clvmd service
service clvmd start && chkconfig clvmd on
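If you prefer not to edit lvm.conf by hand on every node, the locking_type change can be scripted. The sketch below assumes GNU sed (for the -i option) and a single uncommented locking_type line; set_lvm_locking_type is a hypothetical helper name. Commented-out lines are left untouched.

```shell
#!/bin/sh
# Set locking_type in an lvm.conf style file, preserving indentation.
# set_lvm_locking_type is a hypothetical helper; assumes GNU sed.
# $1 = locking type (3 = clustered), $2 = config file path
set_lvm_locking_type() {
    type=$1
    conf=${2:-/etc/lvm/lvm.conf}
    sed -i "s/^\([[:space:]]*\)locking_type[[:space:]]*=.*/\1locking_type = $type/" "$conf"
}

# Possible usage as root on every node (left commented out on purpose):
# cp /etc/lvm/lvm.conf /etc/lvm/lvm.conf.bak
# set_lvm_locking_type 3
# grep locking_type /etc/lvm/lvm.conf
```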
Creating clustered LVM data volume for VM storage
# Applies to both shared iSCSI or SAN block device
# Creating LVM Physical Volume (PV) and clustered Volume Group (VG)
# on top of shared block device
# Execute on one of the nodes
pvcreate /dev/sdb
vgcreate -c y sanvg1 /dev/sdb
# Creating clustered LVM data volume for /vz partition
lvcreate -L 50G -n storage sanvg1
# Creating filesystem
mkfs.ext4 /dev/sanvg1/storage
# Verify that sanvg1 is clustered
service clvmd status
Creating clustered directory structure and migrating OpenVZ config files
### ABOUT ARCHITECTURE ###
# /etc/init.d/vz will be cluster service which depends on /storage/local/vz filesystem subresource
# VE config files will be put into /storage/local/vz filesystem subresource - which gets mounted before vz service starts
# VEs are started/stopped by vz service

# Execute on ALL cluster nodes
chkconfig vz off && service vz stop

# Execute only on master node
mount /dev/sanvg1/storage /mnt
rsync -av /storage/* /mnt/
mkdir -p /mnt/local/vz
rsync -av /vz/* /mnt/local/vz/
rmdir /mnt/local/vz/lost+found
umount /mnt

# Execute on ALL cluster nodes
# Move vz folders as original locations will be symlinked
mv /etc/vz /etc/vz.orig
mv /etc/sysconfig/vz-scripts /etc/sysconfig/vz-scripts.orig
mv /var/vzquota /var/vzquota.orig
# Unmount old /vz and /storage partitions
umount /vz
umount /storage
# Remove old /storage /vz mounts from /etc/fstab
sed -i '/\/storage/d' /etc/fstab
sed -i '/\/vz/d' /etc/fstab
# Fake symlink targets on all nodes for vz-lib updates - otherwise they will destroy symlinks
mkdir -p /storage/local/vz/etc/sysconfig
mkdir -p /storage/local/vz/etc/vz
mkdir -p /storage/local/vz/var/vzquota
# Copy original vz folders into "fake" locations
rsync -av /etc/vz.orig/ /storage/local/vz/etc/vz/
rsync -av /etc/sysconfig/vz-scripts.orig/ /storage/local/vz/etc/sysconfig/vz-scripts/
rsync -av /var/vzquota.orig/ /storage/local/vz/var/vzquota/
# Relocate /vz to /storage/local/vz
rmdir /vz
cd / && ln -s /storage/local/vz

# Execute only on master node
# Mount clustered /storage volume
mount /dev/sanvg1/storage /storage
# Create OpenVZ dirs
mkdir -p /storage/local/vz/etc/sysconfig
mkdir -p /storage/local/vz/etc/vz
mkdir -p /storage/local/vz/var/vzquota
# Copy original vz dirs contents
rsync -av /etc/vz.orig/ /storage/local/vz/etc/vz/
rsync -av /etc/sysconfig/vz-scripts.orig/ /storage/local/vz/etc/sysconfig/vz-scripts/
rsync -av /var/vzquota.orig/ /storage/local/vz/var/vzquota/
# Unmount clustered /storage volume
umount /storage

# Execute on ALL cluster nodes
# Re-link vz dirs
ln -s /storage/local/vz/etc/vz /etc/vz
ln -s /storage/local/vz/etc/sysconfig/vz-scripts /etc/sysconfig/vz-scripts
ln -s /storage/local/vz/var/vzquota /var/vzquota
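After the relocation it is easy to end up with a missing or misdirected symlink on one of the nodes. The following sketch checks the four links created above; link_points_to and check_vz_links are hypothetical helper names, and the link/target pairs are taken directly from the commands in this section.

```shell
#!/bin/sh
# Succeed if $1 is a symlink whose stored target is exactly $2.
# link_points_to and check_vz_links are hypothetical helper names.
link_points_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Report the state of every vz related symlink from this howto.
# Returns non-zero if at least one link is missing or wrong.
check_vz_links() {
    rc=0
    while read -r link target; do
        if link_points_to "$link" "$target"; then
            echo "OK   $link -> $target"
        else
            echo "FAIL $link (expected -> $target)"
            rc=1
        fi
    done <<EOF
/vz /storage/local/vz
/etc/vz /storage/local/vz/etc/vz
/etc/sysconfig/vz-scripts /storage/local/vz/etc/sysconfig/vz-scripts
/var/vzquota /storage/local/vz/var/vzquota
EOF
    return $rc
}

# Possible usage on every node (left commented out on purpose):
# check_vz_links
```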
Setting up failover domain
ccs -h localhost --addfailoverdomain VZHA ordered nofailback
ccs -h localhost --addfailoverdomainnode VZHA node1-p.example.com 1
ccs -h localhost --addfailoverdomainnode VZHA node2-p.example.com 2
ccs -h localhost --lsfailoverdomain
ccs -h localhost --sync --activate
Setting up cluster service and resources
# Add cluster service named vz
ccs -h localhost --addservice vz domain=VZHA recovery=relocate autostart=0
# Add shared block device subresource
ccs -h localhost --addsubservice vz fs name=storage device=/dev/sanvg1/storage mountpoint=/storage fstype=ext4 options=noatime,nodiratime
# Add vz init script subresource
ccs -h localhost --addsubservice vz fs:script file=/etc/init.d/vz name=vzctl
# Populate and activate cluster config
ccs -h localhost --sync --activate
# Display cluster status
clustat
# Enable clustered vz service
clusvcadm -e vz
# Do relocation test for vz service - see it migrating between nodes
clusvcadm -r vz
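To confirm the relocation test from a script, the owner and state of the vz service can be read from the clustat report. This sketch assumes the service table has the columns "service:NAME", owner and state in that order, as in the sample inside the comments; the layout may vary between versions, and service_owner_state is a hypothetical helper name.

```shell
#!/bin/sh
# Read clustat style output on stdin and print "OWNER STATE" for the
# named service. Assumed row layout (illustrative, not captured output):
#   service:vz    node1-p.example.com    started
# service_owner_state is a hypothetical helper name.
service_owner_state() {
    awk -v svc="service:$1" '$1 == svc { print $2, $3; exit }'
}

# Possible usage before and after `clusvcadm -r vz` (commented out on purpose):
# clustat | service_owner_state vz
```

Comparing the owner before and after the relocation shows whether the service actually moved to the other node.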
Setup IPMI devices on cluster nodes
# While there might be other methods for cluster fencing -
# most common is to use the servers' IPMI interfaces for power control.
# Here we install OpenIPMI kernel driver and ipmitool -
# together with ipmi configuration

# Install OpenIPMI and load kernel module
yum install OpenIPMI OpenIPMI-tools -y
service ipmi start
# We don't recommend leaving IPMI service running
# as ipmi kernel driver has been unstable in longer run
#chkconfig ipmi on

# How to configure BMC IPMI LAN device from OS
ipmitool -I open lan set 1 ipsrc static
ipmitool -I open lan set 1 ipaddr 192.168.1.10
ipmitool -I open lan set 1 netmask 255.255.255.0
ipmitool -I open lan set 1 access on
ipmitool -I open lan set 1 defgw ipaddr 192.168.1.1

# Setup IPMI user
ipmitool -I open user set name 2 admin
ipmitool -I open user set password 2 passwd
ipmitool -I open user enable 2

# Read IPMI configuration
ipmitool -I open lan print 1
ipmitool -I open user list 1

# Testing IPMI power control state
ipmitool -I lan -H ipmi_ip -U ipmi_user chassis power status
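The power status test can be wrapped so that a script gets a plain "on" or "off" answer. This sketch assumes the command replies with a line of the form "Chassis Power is on"; power_state is a hypothetical helper name.

```shell
#!/bin/sh
# Read `ipmitool ... chassis power status` output on stdin and print
# the reported state (for example "on" or "off").
# Fails with exit status 1 if no status line is found.
# power_state is a hypothetical helper name.
power_state() {
    awk '/^Chassis Power is / { print $4; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Possible usage, run from the OTHER node (commented out on purpose):
# ipmitool -I lan -H ipmi_ip -U ipmi_user chassis power status | power_state
```

Running the check from the peer node is the useful direction, because that is the path fencing will use.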
NB! Setup cluster fencing!
# Most important is to set up cluster fencing correctly -
# otherwise HA failover won't work

# List available fencing options
ccs -h localhost --lsfenceopts

# We are going to set up server power control through
# a standard IPMI compliant baseboard management controller -
# which modern servers all tend to have nowadays
# NB! Replace ipmi_ip, ipmi_user and ipmi_passwd with yours
# (each fence device needs the IPMI address of its own node)
ccs -h localhost --addfencedev node1fence agent=fence_ipmilan ipaddr=ipmi_ip login=ipmi_user passwd=ipmi_passwd auth=password action=off
ccs -h localhost --addfencedev node2fence agent=fence_ipmilan ipaddr=ipmi_ip login=ipmi_user passwd=ipmi_passwd auth=password action=off
ccs -h localhost --addmethod IPMI node1-p.example.com
ccs -h localhost --addmethod IPMI node2-p.example.com
ccs -h localhost --addfenceinst node1fence node1-p.example.com IPMI
ccs -h localhost --addfenceinst node2fence node2-p.example.com IPMI
ccs -h localhost --sync --activate
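Because the fencing commands differ between nodes only in node name and IPMI address, they can be generated per node and reviewed before being run. The sketch below only prints the commands; fence_cmds is a hypothetical helper name, and the device naming (node name without the "-p" suffix plus "fence") simply mirrors the node1fence/node2fence names used above.

```shell
#!/bin/sh
# Print (do not execute) the ccs fencing commands for one node.
# fence_cmds is a hypothetical helper name.
# $1 = cluster node name, $2 = IPMI address of that node,
# $3 = IPMI user, $4 = IPMI password
fence_cmds() {
    node=$1
    ipmi_ip=$2
    ipmi_user=$3
    ipmi_passwd=$4
    short=${node%%-p.*}          # node1-p.example.com -> node1
    dev="${short}fence"
    echo "ccs -h localhost --addfencedev $dev agent=fence_ipmilan ipaddr=$ipmi_ip login=$ipmi_user passwd=$ipmi_passwd auth=password action=off"
    echo "ccs -h localhost --addmethod IPMI $node"
    echo "ccs -h localhost --addfenceinst $dev $node IPMI"
}

# Possible usage with placeholder values (commented out on purpose):
# fence_cmds node1-p.example.com ipmi_ip_of_node1 ipmi_user ipmi_passwd
# fence_cmds node2-p.example.com ipmi_ip_of_node2 ipmi_user ipmi_passwd
```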
Setup Quorum Disk for avoiding split-brain situations with 2-node cluster
# While not strictly needed - it is highly recommended to set up
# a quorum disk - especially with 2-node clusters

# Create qdisk on a shared block device (iSCSI or SAN) -
# 64MB disk size is sufficient.
# NB! Clustered LVM LVs won't work - it has to be a plain shared block device!
# Exec once on single cluster node
mkqdisk -c /dev/mapper/mpathX -l VZHAQ
# Check on both nodes that you see the quorum device
mkqdisk -L -d
ccs -h localhost --setquorumd interval=2 label=VZHAQ tko=5 votes=1
# Token interval should be longer than (tko+1)*interval(qdisk)
ccs -h localhost --settotem token=33000
ccs -h localhost --sync --activate
# Start up qdiskd
ccs -h localhost --stopall
ccs -h localhost --startall
ccs -h localhost --setcman expected_votes="3" two_node="0"
ccs -h localhost --sync --activate
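The token rule in the comment above is simple arithmetic, and checking it before changing interval or tko avoids a misconfigured totem timeout. This sketch assumes interval is given in seconds and token in milliseconds, as in the values used here; min_token_ms and token_ok are hypothetical helper names.

```shell
#!/bin/sh
# Lower bound for the totem token, in milliseconds:
# (tko + 1) * interval, with interval given in seconds.
# min_token_ms and token_ok are hypothetical helper names.
# $1 = qdisk interval (s), $2 = qdisk tko
min_token_ms() {
    echo $(( ($2 + 1) * $1 * 1000 ))
}

# Succeed if the token value (ms) is strictly above the lower bound.
# $1 = token (ms), $2 = qdisk interval (s), $3 = qdisk tko
token_ok() {
    [ "$1" -gt "$(min_token_ms "$2" "$3")" ]
}

# With the values from this howto: interval=2, tko=5, token=33000
# min_token_ms 2 5          # prints 12000
# token_ok 33000 2 5 && echo "token is long enough"
```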
Adding simple KVM HA into mix!
Please give us feedback about your OpenNode installation and register it by dropping us a note to the following email address: email@example.com
In return we will provide you with instructions and code for adding a simple KVM HA service into the mix!