HP ServiceGuard

HP MC SERVICEGUARD
HP Serviceguard is high-availability clustering software from HP. It lets you configure HA packages across clustered nodes. This document describes the configuration of an HP Serviceguard cluster on Linux machines.
CONFIGURATION OF THE CLUSTER
The system administrator sets up cluster configuration parameters and does an initial cluster startup; thereafter, the cluster regulates itself without manual intervention in normal operation. Configuration parameters for the cluster include the cluster name and nodes,
networking parameters for the cluster heartbeat, cluster lock information, and timing parameters. Cluster parameters are entered by editing the
cluster ASCII configuration file. The parameters you enter are used to build a binary configuration file which is propagated to all nodes in the cluster. This binary cluster configuration file must be the same on all the nodes in the cluster.
SSH CONFIGURATION
HP recommends that after installing Linux, you enable ssh, and use it for remote login (instead of rlogin, remsh, or telnet), and the related command scp for remote copy. ssh and scp encrypt passwords before transmitting them, whereas rlogin, rcp, etc., do not.
By default, root login is disabled in the ssh configuration file; allow the root user to log in over ssh by editing the configuration file as follows.
lxdbsqm0002c01:~ # vi /etc/ssh/sshd_config
PermitRootLogin yes
Then create a DSA/RSA key pair and copy the public key to the remote server for key-based authentication, which allows the user to access the remote system without entering a login password.
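A minimal sketch of the key setup (the remote host name here is only an example; substitute your own node) looks like this:
# ssh-keygen -t rsa
# ssh-copy-id root@lxdbsqm0002c02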
Setting up the Quorum Server
If you are using a quorum server for tie-breaking, the quorum server software has to be running during cluster configuration on a system other than the nodes on which your cluster will be running.
It is recommended that the node on which the QS is running be in the same subnet as the clusters for which it is providing services. This will help prevent any network delays which could affect quorum server operation. If you use a different subnet, you may experience network delays which may cause quorum server timeouts. To prevent these timeouts, you can use the QS_TIMEOUT_EXTENSION parameter in the cluster ASCII file to increase the quorum server timeout interval.
The quorum server executable file, qs, is installed in /opt/qs/bin on SuSE. When the installation is complete, you need to create an authorization file on the server where the QS will be running to allow specific host systems to obtain quorum services. The required pathname for this file is /opt/qs/conf/qs_authfile on SuSE. Add to the file the names of all cluster nodes that will access cluster services from this quorum server. Enter one node per line, or enter “+” (followed by a carriage return) to allow access by all nodes, as in the following example:
lxdbsqm0001c01:~ # cat /opt/qs/conf/qs_authfile
lxdbsqm0002c01
lxdbsqm0002c02
The quorum server reads the authorization file at startup. Whenever you modify the file qs_authfile, run the following command to force a re-read of the file:
# /opt/qs/bin/qs -update
Running the Quorum Server
The quorum server must be running during the following cluster
operations:
• when the cmquerycl command is issued.
• when the cmapplyconf command is issued.
• when there is a cluster re-formation.
By default, quorum server run-time messages go to stdout and stderr.
It is suggested that you capture these messages by redirecting stdout
and stderr to the file /var/log/qs/qs.log.
You must have root permission to execute the quorum server.
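One simple way to satisfy both points, assuming the SuSE paths used in this document, is to start the daemon as root with its output appended to the log file:
# /opt/qs/bin/qs >> /var/log/qs/qs.log 2>&1 &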
Creating a Package for the Quorum Server
You can run the Quorum Server as a package in another cluster. In fact,
a QS package running on one cluster can provide quorum services for
any number of other clusters. You can add the Quorum Server to an
existing cluster by creating a package with QS as the monitored service.
Use the following procedure:
1. Install the Quorum Server software on all nodes, as described above.
2. In the configuration directory ($SGCONF), create a subdirectory for
the QS package, then change into it:
# mkdir qs-pkg
# cd qs-pkg
3. Create a package ASCII file by using the cmmakepkg command:
# cmmakepkg -P qs-pkg.config
4. Edit the file using the parameters in the following table.
Parameter Value
PACKAGE_NAME qs-pkg
PACKAGE_TYPE FAILOVER
FAILOVER_POLICY CONFIGURED_NODE
FAILBACK_POLICY MANUAL
NODE_NAME *
AUTO_RUN YES
LOCAL_LAN_FAILOVER_ALLOWED YES
NODE_FAIL_FAST_ENABLED NO
RUN_SCRIPT $SGCONF/qs-pkg/qs-pkg.ctl
RUN_SCRIPT_TIMEOUT NO_TIMEOUT
HALT_SCRIPT $SGCONF/qs-pkg/qs-pkg.ctl
HALT_SCRIPT_TIMEOUT NO_TIMEOUT
SERVICE_NAME qs
SERVICE_FAIL_FAST_ENABLED NO
SERVICE_HALT_TIMEOUT 10
SUBNET Specify your subnet here.
5. Create a control script in the same directory:
# cmmakepkg -s qs-pkg.ctl
6. Edit the file using the parameters in the following table.
Parameter Value
IP IP address to be used when accessing the Quorum Server
SUBNET Specify your subnet here
SERVICE_NAME “qs”
SERVICE_CMD SuSE: “/opt/qs/bin/qs >> /var/log/qs/qs.log 2>&1”
SERVICE_RESTART “-R”
7. Run the cluster and start the Quorum Server package.
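As a rough sketch, assuming the file names used above, applying and starting the package might look like this:
# cmapplyconf -v -P qs-pkg.config
# cmruncl -v
With AUTO_RUN set to YES the package starts automatically when the cluster starts; otherwise start it manually with cmrunpkg qs-pkg and check the result with cmviewcl -v.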
SERVICEGUARD DAEMONS
The following daemon processes are associated with Serviceguard for Linux:
• cmclconfd—Configuration Daemon
• cmcld—Cluster Daemon
• cmlogd—Cluster System Log Daemon
• cmlockdiskd—Cluster Lock LUN Daemon
• cmomd—Cluster Object Manager Daemon
• cmsrvassistd—Service Assistant Daemon
• cmresmond—Resource Monitor Daemon
• qs—Quorum Server Daemon
Each of these daemons logs to the Linux system logging files. The quorum server daemon logs to a user-specified log file, such as /usr/local/qs/log/qs.log on Red Hat or /var/log/qs/qs.log on SuSE, and cmomd logs to /usr/local/cmom/log/cmomd.log on Red Hat or /var/log/cmom/log/cmomd.log on SuSE.
PORT REQUIREMENTS
Serviceguard uses the ports listed below. Before installing, check /etc/services and be sure no other program has reserved these ports (a quick check with grep is shown after the list).
• discard 9/udp Discard
• hacl-qs 1238/tcp HA Quorum Server
• hacl-hb 5300/tcp High Availability (HA) Cluster heartbeat
• hacl-hb 5300/udp High Availability (HA) Cluster heartbeat
• hacl-gs 5301/tcp HA Cluster General Services
• hacl-cfg 5302/tcp HA Cluster TCP configuration
• hacl-cfg 5302/udp HA Cluster UDP configuration
• hacl-probe 5303/tcp HA Cluster TCP probe
• hacl-probe 5303/udp HA Cluster UDP probe
• hacl-local 5304/tcp HA Cluster commands
• hacl-test 5305/tcp HA Cluster test
The ports reserved for authentication are also used by Serviceguard:
• auth 113/tcp authentication
• auth 113/udp authentication
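As a quick check of these reservations, using the service names listed above:
# grep -E 'hacl|^auth' /etc/services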
In addition, Serviceguard also uses dynamic ports (typically in the range 49152-65535) for some cluster services. If you have adjusted the dynamic port range using kernel tunable parameters, alter your firewall rules accordingly.
NTP CONFIGURATION
Before configuring your cluster, ensure that all cluster nodes possess the appropriate security files, kernel configuration and NTP (network time protocol) configuration.
Edit the /etc/ntp.conf file and add the NTP server details there.
server <servername>
Use the ntpq command to see the servers with which you are synchronized. It provides you with a list of configured time servers and the delay, offset, and jitter that your server is experiencing with them. For correct synchronization, the delay and offset values should be non-zero and the jitter value should be under 100.
[root@lxdbsqm0002c01 tmp]# ntpq -p
CONFIGURING THE CLUSTER
Use the cmquerycl command to specify a set of nodes to be included in the cluster and to generate a template for the cluster configuration file.
If you will be using a lock LUN, be sure to specify the -L lock_lun_device option with the cmquerycl command. If the name of the device is the same on all nodes, enter the option before the node names, as in the following example:
cmquerycl -L /dev/dsk/c0t0d0s0 -v -C /etc/cmcluster/cluster.conf -n nodea -n nodeb
Edit the filled-in cluster characteristics as needed to define the desired cluster. It is strongly recommended that you edit the file to send heartbeat over all possible networks.
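If the cluster will use the quorum server configured earlier rather than a lock LUN, the query names the quorum server host instead of a lock device. The following is a sketch with placeholder node names, assuming the -q option of cmquerycl:
cmquerycl -q lxdbsqm0001c01 -v -C /etc/cmcluster/cluster.conf -n nodea -n nodeb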
VERIFYING THE CLUSTER CONFIGURATION
If you have edited an ASCII cluster configuration file, use the following command to verify the content of the file:
# cmcheckconf -v -C $SGCONF/sg_cluster.config
This command checks the following:
• Network addresses and connections.
• All lock LUN device names on all nodes refer to the same physical disk area.
• One and only one lock LUN device is specified per node.
• A quorum server or lock LUN is configured, but not both.
• Uniqueness of names.
• Existence and permission of scripts specified in the command line.
• If all nodes specified are in the same heartbeat subnet.
• If you specify the wrong configuration filename.
• If all nodes can be accessed.
• No more than one CLUSTER_NAME, HEARTBEAT_INTERVAL, and AUTO_START_TIMEOUT are specified.
• The value for package run and halt script timeouts is less than 4294 seconds.
• The value for HEARTBEAT_INTERVAL is at least one second.
• The value for NODE_TIMEOUT is at least twice the value of HEARTBEAT_INTERVAL.
• The value for AUTO_START_TIMEOUT variables is >=0.
• Heartbeat network minimum requirement. The cluster must have one heartbeat LAN that is configured as a link aggregate of at least two interfaces.
• At least one NODE_NAME is specified.
• Each node is connected to each heartbeat network.
• All heartbeat networks are of the same type of LAN. The network interface device files specified are valid LAN device files.
• Other configuration parameters for the cluster and packages are valid.
If the cluster is online, the cmcheckconf command also verifies that all the conditions for the specific change in configuration have been met.
DISTRIBUTING THE BINARY CONFIGURATION FILE
After specifying all cluster parameters, you use the cmapplyconf command to apply the configuration. This action distributes the binary configuration file to all the nodes in the cluster. We recommend doing this separately before you configure packages. In this way, you can verify the quorum server, heartbeat networks, and other cluster-level operations by using the cmviewcl command on the running cluster. Before distributing the configuration, ensure that your security files permit copying among the cluster nodes.
The following command distributes the binary configuration file:
# cmapplyconf -v -C /etc/cmcluster/cluster.conf
CHECKING CLUSTER OPERATION WITH SERVICEGUARD COMMANDS
Serviceguard also provides several commands for control of the cluster:
• cmviewcl checks status of the cluster and many of its components. A non-root user with the role of Monitor can run this command from a cluster node or see status information in Serviceguard Manager.
• cmrunnode is used to start a node. A non-root user with the role of Full Admin can run this command from a cluster node or through Serviceguard Manager.
• cmhaltnode is used to manually stop a running node. A non-root user with the role of Full Admin can run this command from a cluster node or through Serviceguard Manager.
• cmruncl is used to manually start a stopped cluster. A non-root user with Full Admin access can run this command from a cluster node or through Serviceguard Manager.
• cmhaltcl is used to manually stop a cluster. A non-root user with Full Admin access can run this command from a cluster node or through Serviceguard Manager.
Creating the Package Configuration File
Use the following procedure to create packages by editing and processing a package configuration file.
1. First, create a subdirectory for each package you are configuring in the $SGCONF directory:
# mkdir $SGCONF/samba
You can use any directory names you wish.
2. Next, generate a package configuration template for the package:
# cmmakepkg -p $SGCONF/samba/samba.config
You can use any file names you wish for the ASCII templates.
3. Edit these template files to specify package name, prioritized list of nodes, the location of the control script, and failover parameters for each package. Include the data recorded on the Package Configuration Worksheet.
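For the sample samba package, the edited configuration file might contain entries along these lines (node names and script path are illustrative only):
PACKAGE_NAME samba
PACKAGE_TYPE FAILOVER
FAILOVER_POLICY CONFIGURED_NODE
FAILBACK_POLICY MANUAL
NODE_NAME nodea
NODE_NAME nodeb
RUN_SCRIPT $SGCONF/samba/samba.sh
HALT_SCRIPT $SGCONF/samba/samba.sh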
Configuring in Stages
It is recommended to configure packages on the cluster in stages, as follows:
1. Configure volume groups and mount points only.
2. Apply the configuration.
3. Distribute the control script to all nodes.
4. Run the package and ensure that it can be moved from node to node (a sketch of the commands follows this list).
5. Halt the package.
6. Configure package IP addresses and application services in the control script.
7. Distribute the control script to all nodes.
8. Run the package and ensure that applications run as expected and that the package fails over correctly when services are disrupted.
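For steps 4 and 5, a sketch of the commands (package and node names are illustrative only):
# cmrunpkg -n nodea samba
# cmhaltpkg samba
# cmrunpkg -n nodeb samba
# cmhaltpkg samba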
Writing the Package Control Script
The package control script contains all the information necessary to run all the services in the package, monitor them during operation, react to a failure, and halt the package when necessary. Each package must have a separate control script, which must be executable. The control script is placed in the package directory and is given the same name as specified in the RUN_SCRIPT and HALT_SCRIPT parameters in the package ASCII configuration file. For security reasons, the control script must reside in a directory with the string cmcluster in the path.
The package control script template contains both the run instructions and the halt instructions for the package. You can use a single script for both run and halt operations, or, if you wish, you can create separate scripts. After completing the script, you must propagate it to all the nodes.
Use the following procedure to create a control script for the sample package samba.
First, generate a control script template:
# cmmakepkg -s $SGCONF/samba/samba.sh
You may then customize the script as needed for your package.
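Typical customizations in the generated template are the volume group, file system, relocatable IP address, and service definitions. The entries below are placeholders only, and the exact variable names depend on the template version shipped with your release:
VG[0]="vg_samba"
LV[0]="/dev/vg_samba/lv_samba"
FS[0]="/samba"
IP[0]="10.0.0.50"
SUBNET[0]="10.0.0.0"
SERVICE_NAME[0]="samba"
SERVICE_CMD[0]="/usr/sbin/smbd -F"
SERVICE_RESTART[0]="-R"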
Verifying the Package Configuration
If you have edited an ASCII package configuration file, use the following command to verify the content of the file:
# cmcheckconf -v -P $SGCONF/samba/samba.config
APPLYING AND DISTRIBUTING THE CONFIGURATION
Use the cmapplyconf command to apply and distribute a binary cluster configuration file containing the package configuration among the nodes of the cluster. Example:
# cmapplyconf -v -C $SGCONF/sg_cluster.config -P $SGCONF/samba/samba.config
The cmapplyconf command creates a binary cluster configuration database file and distributes it to all nodes in the cluster. This action ensures that the contents of the file are consistent across all nodes.
The cmrunpkg command starts the package on the specified node.
Now the cluster with its package is ready. Try pinging the package IP address; it will move to the secondary node when the node currently running the package goes down.
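To verify failover, one approach (the node name is a placeholder) is to halt the node currently running the package and watch the package and its IP address move:
# cmviewcl -v
# cmhaltnode -f nodea
# cmviewcl -v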
 