Thursday, March 29, 2018

Cluster 18. GFS2 (Global File System 2).

GFS2 overview

With DRBD providing the cluster's raw storage space and clustered LVM providing the logical volumes, we can now look at the clustered file system. This is the role of GFS2.

It works much like a standard filesystem, with user-land tools like mkfs.gfs2, fsck.gfs2 and so on. The major difference is that it, like clvmd, uses the cluster's DLM for locking. Once formatted, the GFS2 partition can be mounted and used by any node in the cluster's closed process group (CPG). All nodes can then safely read from and write to the data on the partition simultaneously.
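For example, once the LV has been formatted (see the setup section below), an offline check can be run with fsck.gfs2. A minimal sketch, assuming the filesystem is unmounted on every node first:
fsck.gfs2 -y /dev/agrp-c01n01_vg0/shared # -y answers "yes" to all repair questions; never run this while the filesystem is mounted anywhere in the cluster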

The Red Hat Global File System (GFS) is Red Hat's implementation of a concurrent-access shared-storage file system. Like any such filesystem, GFS allows multiple nodes to access the same storage device, in read/write fashion, simultaneously without risking data corruption. It does so by using a Distributed Lock Manager (DLM), which manages concurrent access from cluster members.
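To see which DLM lockspaces are active on a node (for example, after the GFS2 filesystem below is mounted), the dlm_tool utility from the dlm package can list them. A quick check; output will vary with your setup:
dlm_tool ls # lists active lockspaces; expect entries such as clvmd and shared once everything is running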

By default, no-quorum-policy is set to stop, meaning that once quorum is lost, all resources in the remaining partition are immediately stopped. This default is usually the safest option, but unlike most resources, GFS2 requires quorum to function. When quorum is lost, neither the applications using the GFS2 mounts nor the GFS2 mounts themselves can be stopped correctly. Any attempt to stop these resources without quorum will fail, which ultimately results in the entire cluster being fenced every time quorum is lost.
To address this situation, set no-quorum-policy=freeze when GFS2 is in use. This means that when quorum is lost, the remaining partition will do nothing until quorum is regained:
pcs property set no-quorum-policy=freeze

  1. no-quorum-policy=freeze:
    1. If quorum is lost, the cluster partition freezes. Resource management is continued: running resources are not stopped (but possibly restarted in response to monitor events), but no further resources are started within the affected partition. This setting is recommended for clusters where certain resources depend on communication with other nodes (for example, OCFS2 mounts). In this case, the default setting no-quorum-policy=stop is not useful, as it would lead to the following scenario: Stopping those resources would not be possible while the peer nodes are unreachable. Instead, an attempt to stop them would eventually time out and cause a stop failure, triggering escalated recovery and fencing.
  2. no-quorum-policy=stop (default):
    1. If quorum is lost, all resources in the affected cluster partition are stopped in an orderly fashion.
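To confirm that the property change took effect, list the currently set cluster properties (a quick check; the exact output format may differ between pcs versions):
pcs property list | grep no-quorum-policy # should report no-quorum-policy: freeze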

GFS2 setup


From both nodes:
yum install gfs2-utils -y
From one node:
Format the /dev/agrp-c01n01_vg0/shared LV:
mkfs.gfs2 -j 2 -p lock_dlm -t agrp-c01:shared /dev/agrp-c01n01_vg0/shared # answer "yes" to all questions (the cluster must be running at the moment the LV is formatted)

The following switches are used with our mkfs.gfs2 call:
  • -j 2 # This tells GFS2 to create two journals. This must match the number of nodes that will try to mount this partition at any one time (journals can be added later; see the note after this list).
  • -p lock_dlm # This tells GFS2 to use DLM for its clustered locking.
  • -t agrp-c01:shared # This is the lock space name, which must be in the format <cluster_name>:<file-system_name>. The cluster_name must match the one shown by pcs config | grep Name. The <file-system_name> must be unique within the cluster, which is easy for us because we'll have only the one GFS2 file system.
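If a third node is ever added to the cluster, a journal can be added to the existing filesystem instead of reformatting. A sketch, assuming the filesystem is already mounted at /shared as configured below:
gfs2_jadd -j 1 /shared # adds one more journal to the mounted GFS2 filesystem so a third node can mount it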
From both nodes:
tunegfs2 -l /dev/agrp-c01n01_vg0/shared # both nodes must see this LV; in older versions of GFS, the equivalent command was: gfs2_tool sb /dev/an-a05n01_vg0/shared all

From one node:
Configure the mount point for the GFS2 resource (see pcs resource describe Filesystem for all available options):
pcs resource create sharedfs ocf:heartbeat:Filesystem device="/dev/agrp-c01n01_vg0/shared" directory="/shared" fstype="gfs2" 
pcs resource clone sharedfs clone-max=2 clone-node-max=1 interleave=true ordered=true
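To verify that the clone is running on both nodes, check the resource status (output formatting varies slightly between pcs versions):
pcs status resources # sharedfs-clone should show as Started on agrp-c01n01 and agrp-c01n02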

From both nodes:
df -h /shared # both nodes must see the /shared directory as mounted, and 259M must be shown as used space. 259M is the space consumed by the journals (number of journals * journal size) on disk
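As a rough sanity check, assuming the mkfs.gfs2 default journal size of 128MiB: 2 journals * 128MiB = 256MiB, plus a few megabytes of internal metadata, which matches the ~259M that df reports.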

From one node:
pcs constraint order start clvmd-clone then sharedfs-clone
pcs constraint colocation add sharedfs-clone with clvmd-clone
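The ordering and colocation constraints can be reviewed afterwards:
pcs constraint show # should list the "start clvmd-clone then start sharedfs-clone" order and the sharedfs-clone with clvmd-clone colocation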

Test /shared from any of the nodes (we'll use agrp-c01n02):
cd /shared
touch test{1..10}
ssh agrp-c01n01 ls -lh /shared
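Optionally, repeat the test in the other direction and clean up afterwards (a quick sanity check; file names are arbitrary):
ssh agrp-c01n01 touch /shared/test11
ls -lh /shared # test11 created on agrp-c01n01 should immediately be visible here on agrp-c01n02
rm -f /shared/test* # clean up the test files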

These tutorials were used to understand and set up clustering:
AN!Cluster
unixarena
clusterlabs.org
redhat.com

