Thursday, June 9, 2011

Clustered LVM on DRBD resource in Fedora Linux

As Florian Haas pointed out in a comment on my previous post, our shared storage configuration requires special precautions to avoid data corruption when two hosts connected via DRBD try to manage LVM volumes simultaneously. Specifically, LVM metadata operations must be locked while DRBD runs in 'dual-primary' mode.

Let's examine this in detail. The LVM locking mechanism is configured in the 'global' section of /etc/lvm/lvm.conf. The most important parameter here is 'locking_type', which defines the locking mechanism LVM uses when changing metadata. It can be set to:

  • '0': disables locking completely - it's dangerous to use;
  • '1': default, local file-based locking. It knows nothing about the cluster and possible conflicting metadata changes;
  • '2': uses an external shared locking library, specified by the 'locking_library' parameter;
  • '3': uses built-in LVM clustered locking;
  • '4': read-only locking which forbids any changes of metadata.
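
To see which of these values a host currently uses, we can read the setting straight from the configuration file. Here is a minimal sketch; the lvm_locking_type helper is a hypothetical name of my own, not part of LVM, and it simply prints the last uncommented locking_type assignment it finds:

```shell
# Hypothetical helper: print the locking_type value from an lvm.conf-style file.
# Commented-out assignments are skipped; the last uncommented one is printed.
lvm_locking_type() {
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Report the value for this host, if the file is readable.
if [ -r /etc/lvm/lvm.conf ]; then
    lvm_locking_type /etc/lvm/lvm.conf
fi
```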

The simplest approach is to use local locking on one of the DRBD peers and to disable metadata operations on the other. This has a serious drawback, though: Volume Groups and Logical Volumes will not be activated automatically on the other, 'passive' peer when they are created. That makes this approach unsuitable for a production environment and hard to automate.

But there is another, more sophisticated way. We can use Linux-HA (Heartbeat) coupled with the LVM Resource Agent. This automates activation of newly created LVM resources on the shared storage, but it still provides no locking mechanism suitable for 'dual-primary' DRBD operation.

Full support for clustered LVM locking is provided by the lvm2-cluster RPM package, available in the Fedora repository. It contains the clvmd service, which runs on every host in the cluster and controls LVM locking on shared storage. In our case, the cluster consists of just the 2 DRBD peers.

clvmd requires a cluster engine in order to function. It's provided by the cman service, which is installed as a dependency of lvm2-cluster (other dependencies may vary from installation to installation):
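
The installation step could look roughly like this. It's only a sketch: the run helper is my own hypothetical wrapper that prints the command instead of executing it unless APPLY=1 is set, and I assume yum is the package manager in use:

```shell
# Hypothetical dry-run wrapper: prints the command unless APPLY=1 is set.
# Run with APPLY=1 as root on both DRBD peers to actually install.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "would run: $*"; fi; }

# yum resolves cman and the other dependencies of lvm2-cluster on its own.
run yum install -y lvm2-cluster
```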

We need the cluster only for clvmd, so the configuration of the cluster itself is pretty basic. Since we don't need advanced features like automated fencing yet, we specify manual handling. And as we have only 2 nodes in the cluster, we can tell cman about it. The configuration for cman resides in the /etc/cluster/cluster.conf file:
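
A minimal sketch of such a file follows. The cluster name, the node names (node1.example.com and node2.example.com), and the fence_manual agent are placeholders and assumptions of mine, so check which manual fencing agent your cluster packages actually ship. The snippet writes the file to ./cluster.conf.example so that it can be reviewed before it is copied to /etc/cluster/cluster.conf on both nodes:

```shell
# Write an example two-node configuration to a scratch file for review.
out=${CLUSTER_CONF:-./cluster.conf.example}

cat > "$out" <<'EOF'
<?xml version="1.0"?>
<cluster name="drbd-cluster" config_version="1">
  <clusternodes>
    <!-- Node names are placeholders; use resolvable FQDNs -->
    <clusternode name="node1.example.com" nodeid="1" votes="1">
      <fence>
        <method name="manual">
          <device name="manual" nodename="node1.example.com"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2" votes="1">
      <fence>
        <method name="manual">
          <device name="manual" nodename="node2.example.com"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <!-- Two-node cluster: a single vote is enough for quorum -->
  <cman two_node="1" expected_votes="1"/>
  <fencedevices>
    <!-- Assumed agent name for manual fencing -->
    <fencedevice name="manual" agent="fence_manual"/>
  </fencedevices>
</cluster>
EOF

echo "wrote $out"
```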

The clusternode name should be a fully qualified domain name that either resolves via DNS or is present in /etc/hosts. The number of votes is used to determine the quorum of the cluster. In this case, we have two nodes with one vote each, and we expect a single vote to be enough for the cluster to run (to have a quorum), as configured by the expected_votes attribute of the cman element.

The second thing we need to configure is the cluster engine (corosync). Its configuration goes into /etc/corosync/corosync.conf:
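
As an illustration, a bare-bones configuration might look like the one below. The network address 192.168.1.0, the multicast address 226.94.1.1, and the port 5405 are placeholder values I picked for the example, not values from our setup. The snippet writes the file to ./corosync.conf.example for review before it is copied into place:

```shell
# Write an example corosync configuration to a scratch file for review.
out=${COROSYNC_CONF:-./corosync.conf.example}

cat > "$out" <<'EOF'
totem {
    version: 2
    secauth: off
    interface {
        ringnumber: 0
        # Network address of the eth1 subnet (placeholder value)
        bindnetaddr: 192.168.1.0
        # Multicast group and port (placeholder values)
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}

logging {
    to_syslog: yes
    timestamp: on
}
EOF

echo "wrote $out"
```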

The bindnetaddr parameter must contain a network address. We configure corosync to work on the eth1 interfaces, which connect our nodes back-to-back over a 1 Gbps network. We should also configure iptables on both hosts to accept multicast traffic.
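
A firewall rule for this could be sketched as follows. I assume here that corosync's mcastport is 5405, in which case it also uses the port just below it (5404). The run helper is my own hypothetical wrapper that only prints the command unless APPLY=1 is set:

```shell
# Hypothetical dry-run wrapper: prints the command unless APPLY=1 is set.
# Run with APPLY=1 as root on both hosts to actually change the firewall.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "would run: $*"; fi; }

# Accept corosync UDP traffic arriving on the back-to-back eth1 link.
# Ports 5404 and 5405 follow from the assumed mcastport of 5405.
run iptables -I INPUT -i eth1 -p udp -m multiport --dports 5404,5405 -j ACCEPT
```

Rules inserted this way do not survive a reboot unless they are saved by whatever mechanism the host uses to persist its firewall.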

Note that these configurations must be identical on both cluster nodes.

After the cluster has been prepared, we can change the LVM locking type in /etc/lvm/lvm.conf on both drbd-connected nodes:
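
The change boils down to setting locking_type to 3. Below is a small sketch of how it could be scripted; set_locking_type is a hypothetical helper of mine that rewrites only the uncommented assignment and keeps a .bak copy of the file. If your lvm2-cluster package ships the lvmconf helper, running lvmconf --enable-cluster should achieve the same result:

```shell
# Hypothetical helper: set locking_type in an lvm.conf-style file.
#   $1 = path to the file, $2 = new locking type
# Only the uncommented assignment is changed; a backup is kept as <file>.bak.
set_locking_type() {
    sed -i.bak 's/^\([[:space:]]*locking_type[[:space:]]*=[[:space:]]*\)[0-9][0-9]*/\1'"$2"'/' "$1"
}

# On a real node (as root, on both peers):
#   set_locking_type /etc/lvm/lvm.conf 3
```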

Start the cman and clvmd services on both DRBD peers to get our cluster ready for action:
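
In command form this could look like the sketch below. It assumes the services are managed through the classic service command, and the run helper is again my own hypothetical wrapper that only prints the commands unless APPLY=1 is set:

```shell
# Hypothetical dry-run wrapper: prints the command unless APPLY=1 is set.
# Run with APPLY=1 as root on both DRBD peers to actually start the services.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "would run: $*"; fi; }

# cman has to be running before clvmd, since clvmd relies on the cluster engine.
run service cman start
run service clvmd start
```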

Now, as we already have a Volume Group on the shared storage, we can easily make it cluster-aware:
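
The clustered flag is set with vgchange. A short sketch, using the vg_shared volume group from this post and my own hypothetical run wrapper that only prints the command unless APPLY=1 is set:

```shell
# Hypothetical dry-run wrapper: prints the command unless APPLY=1 is set.
# Run with APPLY=1 as root on one of the nodes to actually change the VG.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "would run: $*"; fi; }

# Mark the existing volume group as clustered.
run vgchange -c y vg_shared
```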

Now we can see the 'c' flag in the VG attributes:
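
The check can also be scripted. In the attribute string that vgs prints (for example "wz--nc"), the sixth character is 'c' when the Volume Group is clustered. The vg_attr_is_clustered helper below is a hypothetical name of mine that tests exactly that character:

```shell
# Hypothetical helper: succeed if a vg_attr string (e.g. "wz--nc")
# has "c" as its sixth character, which marks a clustered Volume Group.
vg_attr_is_clustered() {
    case "$1" in
        ?????c*) return 0 ;;
        *)       return 1 ;;
    esac
}

# On a real node (as root), feed it the attribute string reported by vgs.
if command -v vgs >/dev/null 2>&1; then
    attr=$(vgs --noheadings -o vg_attr vg_shared 2>/dev/null | tr -d ' ')
    if vg_attr_is_clustered "$attr"; then
        echo "vg_shared is clustered"
    fi
fi
```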

As a result, Logical Volumes created in the vg_shared volume group will be active on both nodes, and clustered locking is enabled for operations with volumes in this group. LVM commands can be issued on both hosts and clvmd takes care of possible concurrent metadata changes.