Cluster 21. Making an installed VM (Virtual Machine) a cluster resource.
Related to libvirtd
In order to start a VM as an HA resource, libvirtd must be up and running (on both nodes): create the resource, make it a clone, and order it to start after sharedfs, because libvirtd uses storage pools (virsh pool-list, virsh pool-info) and the "files" pool is /shared/files:
pcs resource create libvirtd systemd:libvirtd
pcs resource clone libvirtd clone-max=2 clone-node-max=1 interleave=true
pcs constraint order start sharedfs-clone then start libvirtd-clone
pcs constraint colocation add libvirtd-clone with sharedfs-clone
Some of these clone options are described in Cluster 17.
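As a quick sanity check (a minimal sketch; output formatting varies by pcs/libvirt version), confirm the clone runs on both nodes and that libvirtd sees the shared pool:
pcs status resources # libvirtd-clone should be Started on both nodes
virsh pool-list --all # the files pool backed by /shared/files should be listed and active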
Firewall setup to support KVM Live Migration
Set up firewall ports for KVM live migration (on both nodes).
On node1:
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.10.53.2/32" port protocol="tcp" port="49152-49216" accept'
firewall-cmd --reload
firewall-cmd --list-all
On node2:
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.10.53.1/32" port protocol="tcp" port="49152-49216" accept'
firewall-cmd --reload
firewall-cmd --list-all
49152-49216 is a pool of TCP ports from which virsh picks randomly to perform live migration.
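If you ever need to adjust this range, libvirt's qemu driver reads it from /etc/libvirt/qemu.conf (the exact defaults may vary between libvirt versions, so treat this as a pointer rather than gospel):
grep -E '^#?migration_port_(min|max)' /etc/libvirt/qemu.conf # shows the (commented default) migration port range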
Related to VM itself
In order for the cluster to manage a server, it must know where to find the "definition" file that describes the virtual machine and its hardware. When the server was created with virt-install, this definition file was saved in /etc/libvirt/qemu/.
Normal libvirtd tools are not cluster-aware, so we don't want them to see our server except when it is running. We will achieve this by "undefining" our VM.
First we'll share definition:
virsh list --all # list running and powered-off VMs
virsh dumpxml vm02-www # view VM definition xml dump
mkdir /shared/definitions
virsh shutdown vm02-www
virsh dumpxml vm02-www > /shared/definitions/vm02-www.xml # save dump to the shared directory, this file will be used to start, stop, recover and migrate the VM
Verify that the XML was saved properly # because the next step will destroy the VM
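One way to verify it (virt-xml-validate ships with libvirt; this is an optional check, a plain cat or a diff against virsh dumpxml output also works):
virt-xml-validate /shared/definitions/vm02-www.xml # validates the dump against libvirt's domain schema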
Stop and destroy VM:
virsh destroy vm02-www
virsh undefine vm02-www
virsh list --all # make sure the VM is now undefined
Set up the VM cluster resource (this command is executed on the VM's primary node; vm02-www's primary node is agrp-c01n01):
pcs resource create vm02-www ocf:heartbeat:VirtualDomain hypervisor="qemu:///system" config="/shared/definitions/vm02-www.xml" migration_transport=ssh meta allow-migrate=true op monitor interval="30" timeout="30s" op start interval="0" timeout="240s" op stop interval="0" timeout="120s"
Options described (for all options see pcs resource describe VirtualDomain or man ocf_heartbeat_VirtualDomain):
- hypervisor="qemu:///system" - you can find this URI by executing virsh --quiet uri
- migration_transport=ssh - use ssh while migrating VM
- meta allow-migrate=true - Resources have two types of options: meta-attributes and instance attributes. Meta-attributes apply to any type of resource, while instance attributes are specific to each resource agent. Visit clusterlabs.org/meta
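After creating the resource you can review its instance attributes, meta-attributes and operations (syntax for the pcs 0.9 series used throughout this setup; newer pcs releases call this pcs resource config):
pcs resource show vm02-www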
pcs constraint order start libvirtd-clone then vm02-www
pcs constraint colocation add vm02-www with libvirtd-clone
The constraint below is needed because without it, after a node returns (after a failure or a manual cluster stop/start), Pacemaker will try to migrate the VM back to its primary node without waiting for DRBD promotion:
pcs constraint colocation add vm02-www with master ms_drbd_r0
Scores are calculated per resource and node. Any node with a negative score for a resource can't run that resource. The cluster places a resource on the node with the highest score for it. Positive values indicate a preference for running the affected resource(s) on this node; the higher the value, the stronger the preference. Negative values indicate the resource(s) should avoid this node (a value of -INFINITY changes "should" to "must"):
pcs constraint location add lc_vm02_n01 vm02-www agrp-c01n01 1
pcs constraint location add lc_vm02_n02 vm02-www agrp-c01n02 0
The above location constraints are needed to automatically live-migrate the VM back to the node which is primary for it (vm02-www's primary node is agrp-c01n01).
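To review everything configured so far, including constraint ids and scores (same pcs series assumed as above):
pcs constraint list --full # shows order, colocation and location constraints with their ids and scores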
To view current score for the resource:
crm_simulate -sL | grep " vm[0-9]"
vm02-www (ocf::heartbeat:VirtualDomain): Started agrp-c01n02
native_color: vm02-www allocation score on agrp-c01n01: -INFINITY
native_color: vm02-www allocation score on agrp-c01n02: 0
-INFINITY for agrp-c01n01 is actually not because of the constraint but because agrp-c01n01 is offline.
SELinux related issues
SELinux is preventing /usr/bin/virsh from read access on the file vm01-nagios.xml
semodule -DB # enable complete logging of SELinux messages to audit.log (disables dontaudit rules)
Open one more SSH session to the node:
tail -f -n0 /var/log/audit/audit.log
pcs resource cleanup # clear the failure so the cluster retries, re-triggering the denial
The following message appeared:
type=AVC msg=audit(1524726228.964:514): avc: denied { read } for pid=8711 comm="virsh" name="vm01-nagios.xml" dev="dm-5" ino=3477882 scontext=system_u:system_r:virsh_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
It's complaining about read access to the vm01-nagios.xml file, which is on device dm-5 with inode 3477882. Let's find out which device that is:
ls -lah /dev/mapper | grep dm-5 # It's agrp--c01n01_vg0-shared
Let's find what inode 3477882 is:
find /shared -inum 3477882 # It's /shared/definitions/vm01-nagios.xml
Let's view the SELinux context for /shared (we could also view the context of just that file, but we know we may keep many definitions in /shared):
ls -laZ /shared # the context of "." (the current directory, i.e. /shared) is system_u:object_r:unlabeled_t:s0; this context is not permissive enough, so we'll change it (on one node only, but verify on the other):
semanage fcontext -a -t virt_etc_t '/shared(/.*)?'
restorecon -r /shared
ls -laZ /shared
semodule -B # re-enable dontaudit rules (turns the verbose logging from semodule -DB back off)
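To confirm the fix, clear the resource failure so the cluster retries, then check that no new virsh denials are logged (ausearch comes with the audit package; the "recent" keyword means the last 10 minutes):
pcs resource cleanup
ausearch -m avc -ts recent # should no longer show read denials for the definition files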
These tutorials were used to understand and set up clustering:
AN!Cluster
unixarena
clusterlabs.org
rarforge.com