Cluster 15. Pacemaker resources & resource constraints.
Pacemaker resources
A resource is a service made highly available by a cluster. The simplest type of resource, a primitive
resource, is described in this section. More complex forms, such as groups and clones, are described
in later sections.
Every primitive resource has a resource agent. A resource agent is an external program that abstracts the service it provides and presents a consistent view to the cluster. Typically, resource agents come in the form of shell scripts. However, they can be written using any technology (such as C, Python or Perl) that the author is comfortable with.
Pacemaker supports several classes of agents (note: make sure the nodes are not configured to start any of these services at boot time — starting and stopping them should be controlled by the cluster):
- LSB - LSB resource agents are those found in /etc/init.d (SysV initialization style). Many distributions claim LSB compliance but ship with broken init scripts. Common problematic violations of the LSB standard include:
- Not implementing the status operation at all
- Not observing the correct exit status codes for start/stop/status actions
- Starting a started resource returns an error
- Stopping a stopped resource returns an error
- Systemd - Some newer distributions have replaced the old "SysV" style of initialization daemons and scripts with an alternative called systemd. Pacemaker is able to manage these services if they are present. Instead of init scripts, systemd has unit files. Generally, the services (unit files) are provided by the OS distribution, but there are online guides for converting from init scripts.
- Upstart - Some newer distributions have replaced the old "SysV" style of initialization daemons (and scripts) with an alternative called Upstart. Instead of init scripts, upstart has jobs. Generally, the services (jobs) are provided by the OS distribution.
- Service - Since there are various types of system services (systemd, upstart, and lsb), Pacemaker supports a special service alias which intelligently figures out which one applies to a given cluster node. This is particularly useful when the cluster contains a mix of systemd, upstart, and lsb. In order, Pacemaker will try to find the named service as:
- an LSB init script
- a Systemd unit file
- an Upstart job
- OCF - the OCF standard is basically an extension of the LSB conventions for init scripts to support parameters, make them self-describing, and make them extensible. OCF specs have strict definitions of the exit codes that actions must return. The cluster follows these specifications exactly, and giving the wrong exit code will cause the cluster to behave in ways you will likely find puzzling and annoying. In particular, the cluster needs to distinguish a completely stopped resource from one which is in some erroneous and indeterminate state. Parameters are passed to the resource agent as environment variables, with the special prefix OCF_RESKEY_. So, a parameter which the user thinks of as ip will be passed to the resource agent as OCF_RESKEY_ip. The number and purpose of the parameters is left to the resource agent; however, the resource agent should use the meta-data command to advertise any that it supports. The OCF class is the most preferred as it is an industry standard, highly flexible (allowing parameters to be passed to agents in a non-positional manner) and self-describing. The pcs resource providers command shows which OCF providers are present on the cluster:
- heartbeat
- linbit
- openstack
- pacemaker
- Fencing - this class is used exclusively for fencing-related resources (unlike all other resource classes, which are managed with the pcs resource create command, this class is managed with the pcs stonith command)
- Nagios Plugins - Nagios Plugins allow us to monitor services on remote hosts. Pacemaker is able to do remote monitoring with the plugins if they are present. A common use case is to configure them as resources belonging to a resource container (usually a virtual machine), and the container will be restarted if any of them has failed. Another use is to configure them as ordinary resources to be used for monitoring hosts or services via the network.
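To illustrate the OCF parameter convention described above, here is a minimal sketch (not a real agent) of how an OCF resource agent receives its parameters: the cluster exports each configured parameter as an OCF_RESKEY_-prefixed environment variable, which the agent script then reads.

```shell
#!/bin/sh
# Sketch only: a real agent also implements the start/stop/monitor/
# meta-data actions and returns the exit codes defined by the OCF spec.

agent_start() {
    # required parameter: abort with an error message if unset
    ip="${OCF_RESKEY_ip:?required parameter ip is not set}"
    # optional parameter with a default value
    netmask="${OCF_RESKEY_cidr_netmask:-24}"
    echo "would configure address ${ip}/${netmask}"
}

# the cluster would set this from the resource's ip= parameter
OCF_RESKEY_ip="192.168.0.99"
export OCF_RESKEY_ip
agent_start   # prints: would configure address 192.168.0.99/24
```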
To find which Pacemaker resource classes/standards are supported:
pcs resource standards # List available resource agent standards supported by this installation
lsb
ocf
service
systemd
To view all resource agents:
pcs resource list # this will list all available resource agents with their standard/class names and with provider name (if one is available).
To view usage help for any agent:
pcs resource describe [<standard>:[<provider>:]]<type> [--full]
You can specify only the name of the agent; the standard and provider are optional as long as the agent's name is unique. Without "--full", advanced options are not shown.
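As a concrete example, the commands below inspect the well-known ocf:heartbeat:IPaddr2 agent and then create a primitive resource from it (the resource name, address and netmask are placeholder values for illustration):

```shell
# show the parameters the IPaddr2 agent advertises via its meta-data
pcs resource describe ocf:heartbeat:IPaddr2

# create a primitive resource from it; ip and cidr_netmask become
# OCF_RESKEY_ip and OCF_RESKEY_cidr_netmask inside the agent
pcs resource create VirtualIP ocf:heartbeat:IPaddr2 \
    ip=192.168.0.99 cidr_netmask=24 op monitor interval=30s
```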
Pacemaker resource constraints
Pacemaker supports the following resource constraints:
- Location constraints - tell the cluster which nodes a resource can run on.
- Ordering Constraints - tell the cluster the order in which resources should start or stop.
- Colocation Constraints - tell the cluster that the location of one resource depends on the location of another one.
- Ticket Constraints - tell the cluster how to coordinate multi-site (geographically distributed/dispersed) clusters. Apart from local clusters, Pacemaker also supports multi-site clusters. That means you can have multiple, geographically dispersed sites, each with a local cluster. Fail-over between these clusters can be coordinated manually by the administrator, or automatically by a higher-level entity called a Cluster Ticket Registry (CTR). A ticket grants the right to run certain resources on a specific cluster site.
What these all mean:
- If we want to start Apache on node1, we use a location constraint;
- if we can start Apache on any node, but it must run only on the node where MySQL is currently running, we use a colocation constraint;
- if we want Apache to be started only after MySQL is started, we use an ordering constraint;
- if we have several resources (not only two) that must have such rules applied to them, we use a resource set.
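A sketch of how the Apache/MySQL scenarios above map to pcs commands (the resource names WebServer and Database are hypothetical):

```shell
# location: prefer node1 for WebServer (score 50; INFINITY would force it)
pcs constraint location WebServer prefers node1=50

# ordering: start Database before WebServer
pcs constraint order Database then WebServer

# colocation: run WebServer on the same node as Database
pcs constraint colocation add WebServer with Database INFINITY

# resource set: one ordered set instead of pairwise constraints
pcs constraint order set Database WebServer
```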
Pacemaker non-primitive resources: groups, clones, multi-state, bundles
- groups - One of the most common elements of a cluster is a set of resources that need to be located together, start sequentially, and stop in the reverse order. To simplify this configuration, we support the concept of groups. Resources are started in the order they appear in a group and are stopped in the reverse order to which they appear in the group. So a group is syntactic sugar that packages primitive resources, colocation constraints, and ordering constraints under one name (configured by pcs resource group)
- clones - Resources That Get Active on Multiple Hosts (A clone is basically a shortcut: instead of defining n identical, yet differently named resources, a single cloned resource suffices). Three types of cloned resources exist:
- Anonymous - Anonymous clones are the simplest. These behave completely identically everywhere they are running. Because of this, there can be only one copy of an anonymous clone active per machine.
- Globally unique - Globally unique clones are distinct entities. A copy of the clone running on one machine is not equivalent to another instance on another node, nor would any two copies on the same node be equivalent.
- Stateful - stateful clones; these are the multi-state resources described below.
Clones are configured by pcs resource clone.
- multi-state - Resources That Have Multiple Modes - are a specialization of clone resources. Multi-state resources allow the instances to be in one of two operating modes (called roles). The roles are called master and slave, but can mean whatever you wish them to mean. The only limitation is that when an instance is started, it must come up in the slave role (configured by pcs resource master).
- bundles - Pacemaker supports a special syntax for launching a container with any infrastructure it requires: the bundle (configured by pcs resource bundle).
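A sketch of the corresponding pcs commands for groups, clones, and multi-state resources (all resource names are hypothetical):

```shell
# group: VirtualIP starts first, WebServer second; stop order is reversed
pcs resource group add WebGroup VirtualIP WebServer

# clone: run an instance of WebServer on every node
pcs resource clone WebServer

# multi-state: a master/slave clone (e.g. wrapping a DRBD resource)
pcs resource master DrbdDataClone DrbdData
```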