
Scalla: Extended Features Supplement
Migrating to the Cluster Management Service (olbd to cmsd)
Andrew Hanushevsky
Stanford Linear
8-January-2008
Scalla: Structured Cluster Architecture for Low Latency Access
©2003-2008 by the Board of Trustees of the Leland Stanford, Jr., University
All Rights Reserved
Produced under contract DE-AC02-76-SFO0515 with the Department of Energy
This code is available under a BSD-style license allowing minimally restricted use.
This document describes how to migrate to the Cluster Management Service (cms) for installations currently running the Open Load Balancing system (olb). The cms, which includes cmsd (the daemon) and an integrated cmsd client in the Open File System (ofs) layer of xrootd, is a functional replacement for the olbd and its ofs counterpart, the Open Distributed Clustering (odc) component.
The cmsd is the next-generation version of the olbd and provides enhanced capability along with much lower latency and increased throughput. Some of the features provided by the cmsd that are not present in the olbd are:
· Complete support for opaque information, allowing for consistent file handling across a cluster,
· Improved fault detection algorithms to avoid false error notification and speed true error recovery,
· Superior request specificity so that requests to locate and prepare files are more timely,
· Effective use of low latency objects to further reduce overhead and increase throughput,
· Enhanced tracing so that xrootd and cmsd events can be easily correlated, and
· A solid extensible platform to effectively incorporate new features.
Current versions of xrootd are compatible with cmsd and olbd, but not both in the same cluster. Hence, you must run either olbd on every node or cmsd on every node. Mixed configurations are not supported.
In order to ease migration, the cmsd recognizes all non-deprecated olbd configuration file directives. Generally, you need not change the configuration file to run either system, as long as it does not contain any of the following deprecated directives: olb.apath, odc.apath, olb.path, or olb.port.
The olb.apath and odc.apath directives must be replaced by the all.adminpath directive. The olb.path directive must be replaced using a combination of the oss.defaults and all.export directives. The olb.port directive no longer applies, as that information is contained in the all.manager directive. Current versions of the olbd accept these replacements; hence, the configuration file will still be compatible with the olbd after the deprecated directives are removed.
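As an illustration of the check described above, the following is a minimal sketch of a script that scans a configuration file for the four deprecated directives and reports the replacement advice given in this document. It is not part of the xrootd distribution. It assumes that each directive is the first whitespace-separated word on its line and that lines beginning with # are comments; the function name, the sample paths, and the sample port value are hypothetical.

```python
# Deprecated directives named in this document, mapped to the advice
# the document gives for each one.
DEPRECATED = {
    "olb.apath": "replace with all.adminpath",
    "odc.apath": "replace with all.adminpath",
    "olb.path": "replace with oss.defaults and all.export",
    "olb.port": "remove (the information is contained in all.manager)",
}


def find_deprecated(config_text):
    """Return (line_number, directive, advice) for each deprecated directive.

    Assumption: the directive name is the first word on the line, and
    lines starting with '#' are comments.
    """
    findings = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        directive = stripped.split()[0]
        if directive in DEPRECATED:
            findings.append((lineno, directive, DEPRECATED[directive]))
    return findings


# Hypothetical configuration fragment, used only to exercise the function.
SAMPLE = """\
# hypothetical configuration fragment
olb.apath /var/adm/olb
olb.port 3121
all.export /data
"""

for lineno, directive, advice in find_deprecated(SAMPLE):
    print(f"line {lineno}: {directive}: {advice}")
```

Run against a real configuration file by reading its contents into a string first. An empty result suggests that the file can be used unchanged by either the olbd or the cmsd.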
1) Review the configuration file and replace deprecated options:
a. replace olb.apath and odc.apath with all.adminpath
b. replace olb.path with oss.defaults and all.export
c. remove the olb.port directive
2) Review the configuration file for the StartCMS script (i.e., StartCMS.cf). Most likely, the same options you used for the StartOLB.cf file can be used for the StartCMS.cf file.
3) Restart the olbd using the changed configuration file to make sure that it still comes up in the same way as before. This will give you a back-out strategy.
4) Start a test cmsd with the configuration file to make sure you have not made any mistakes. This can be done on any machine. Since it is likely that roles are tied to machine names, you can easily test the configuration file for each possible role using command line options to override the configuration file:
a. Manager role test: cmsd -m -c config_file
b. Supervisor role test: cmsd -m -s -c config_file
c. Server role test: cmsd -s -c config_file
5) Install the latest version of xrootd and cmsd on all of your machines. Only versions of xrootd that include cmsd are compatible with cmsd.
6) You may now start killing the old servers and starting the new servers. It is important that you switch over in the order described below.
a. Restart your redirector nodes. Kill both the xrootd and the olbd, then start the new xrootd and cmsd (the order in which you start these two is not important).
b. Restart your supervisor nodes, if any, in the same manner as above.
c. Restart your server nodes in the same manner as above.
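The role tests in step 4 and the switchover order in step 6 can be sketched as a small planning helper. This is only an illustration under stated assumptions: the host names, the host-to-role mapping, and the function names are hypothetical, and the option letters are those listed in step 4, written with plain ASCII hyphens. The helper only builds command lines and an ordered host list; it does not start or stop any daemon.

```python
# Command line options for each role test, as listed in step 4.
ROLE_TEST_FLAGS = {
    "manager": ["-m"],
    "supervisor": ["-m", "-s"],
    "server": ["-s"],
}

# Switchover order from step 6: redirectors, then supervisors, then servers.
RESTART_ORDER = ["redirector", "supervisor", "server"]


def role_test_command(role, config_file):
    """Build the cmsd test command line for one role."""
    return ["cmsd", *ROLE_TEST_FLAGS[role], "-c", config_file]


def switchover_plan(nodes):
    """Order hosts for restart.

    nodes maps a host name to its node type. Hosts are sorted by the
    step 6 order first and by name second, so the plan is repeatable.
    """
    rank = {node_type: i for i, node_type in enumerate(RESTART_ORDER)}
    return sorted(nodes, key=lambda host: (rank[nodes[host]], host))


# Hypothetical cluster layout, used only to exercise the helpers.
NODES = {
    "data02": "server",
    "rdr01": "redirector",
    "sup01": "supervisor",
    "data01": "server",
}

for role in ROLE_TEST_FLAGS:
    print(role, "test:", " ".join(role_test_command(role, "config_file")))
print("restart order:", ", ".join(switchover_plan(NODES)))
```

Keeping the order in a single list makes it harder to restart a server node before its redirector by accident when the plan is generated for a large cluster.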
Should you run into problems, you may revert to using the olbd. The new xrootd will recognize the olbd and revert to using the old protocol. However, once an xrootd has reverted, it cannot recognize a cmsd. This means that you will need to restart the xrootd should you wish to switch back to running cmsd.