Scalla: Extended Features Supplement

Migrating to the

Cluster Management Service

(olbd to cmsd)

 

 

Andrew Hanushevsky

Stanford Linear Accelerator Center

8-January-2008

Scalla: Structured Cluster Architecture for Low Latency Access

©2003-2008 by the Board of Trustees of the Leland Stanford, Jr., University

All Rights Reserved

Produced under contract DE-AC02-76-SFO0515 with the Department of Energy

This code is available under a BSD-style license allowing minimally restricted use.

 



1         Introduction

 

This document describes how to migrate to the Cluster Management Service (cms) for installations currently running the Open Load Balancing system (olb). The cms, which includes cmsd (the daemon) and an integrated cmsd client in the Open File System (ofs) layer of xrootd, is a functional replacement for the olbd and its ofs counterpart, the Open Distributed Clustering (odc) component.

 

The cmsd is the next-generation version of the olbd and provides enhanced capability along with much lower latency and increased throughput. Some of the features provided by the cmsd that are not present in the olbd are:

·        Complete support for opaque information allowing for consistent file handling across a cluster,

·        Improved fault detection algorithms to avoid false error notification and speed true error recovery,

·        Superior request specificity so that requests to locate and prepare files are more timely,

·        Effective use of low latency objects to further reduce overhead and increase throughput,

·        Enhanced tracing so that xrootd and cmsd events can be easily correlated, and

·        A solid extensible platform to effectively incorporate new features.

 

Current versions of xrootd are compatible with either cmsd or olbd, but not with both in the same cluster. Hence, you must run either olbd on every node or cmsd on every node. Mixed configurations are not supported.
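
The all-or-nothing rule above can be checked mechanically before a switch-over. The following is a minimal sketch, assuming you keep your own inventory that maps each node name to the daemon it runs; the function name check_uniform and the inventory format are hypothetical and are not part of xrootd.

```python
def check_uniform(daemon_by_node):
    """Return the one daemon used cluster-wide, or raise ValueError.

    daemon_by_node is a hypothetical inventory: node name -> "olbd" or "cmsd".
    """
    daemons = set(daemon_by_node.values())
    unknown = daemons - {"olbd", "cmsd"}
    if unknown:
        raise ValueError("unknown daemon name(s): " + ", ".join(sorted(unknown)))
    if len(daemons) != 1:
        # Either the inventory is empty or olbd and cmsd are mixed.
        raise ValueError("cluster must run olbd everywhere or cmsd everywhere")
    return daemons.pop()
```

During the switch-over itself the cluster is necessarily mixed for a short time, so a check like this is only meaningful before the migration starts and after it completes.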

 

To ease migration, the cmsd recognizes all non-deprecated olbd configuration file directives. Generally, you need not change the configuration file to run either system as long as it does not contain any of the following directives: olb.apath, odc.apath, olb.path, or olb.port.

 

The olb.apath and odc.apath directives must be replaced by the all.adminpath directive. The olb.path directive must be replaced using a combination of the oss.defaults and all.export directives. The olb.port directive no longer applies, as that information is contained in the all.manager directive. Current versions of the olbd accept these replacements; hence, the configuration file will still be compatible after the deprecated directives are removed.
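
The review for deprecated directives can be partly automated. The sketch below scans configuration text for the four directives named above and reports the replacement advice given in this section. It assumes that a directive is the first whitespace-delimited token on a line and that lines beginning with "#" are comments; the function name find_deprecated is hypothetical and is not part of xrootd.

```python
# Deprecated directives named in this document, with the advice given for each.
DEPRECATED = {
    "olb.apath": "replace with all.adminpath",
    "odc.apath": "replace with all.adminpath",
    "olb.path": "replace with oss.defaults and all.export",
    "olb.port": "remove; the information is contained in all.manager",
}


def find_deprecated(config_text):
    """Return a list of (line_number, directive, advice) tuples."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # Assumption: blank lines and '#' comment lines carry no directive.
        if not stripped or stripped.startswith("#"):
            continue
        directive = stripped.split()[0]
        if directive in DEPRECATED:
            findings.append((number, directive, DEPRECATED[directive]))
    return findings
```

An empty result suggests that the same file can be used unchanged by either the olbd or the cmsd. The scan does not rewrite anything; the replacement directives still need to be written by hand.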

 


1.1       Recommended Migration Sequence

 

1)      Review the configuration file and replace deprecated options:

a.      replace olb.apath and odc.apath with all.adminpath

b.      replace olb.path with oss.defaults and all.export

c.       remove the olb.port directive

2)      Review the configuration file for the StartCMS script (i.e., StartCMS.cf). Most likely, the same options you used in the StartOLB.cf file can be used in the StartCMS.cf file.

3)      Restart the olbd using the changed configuration file to make sure that it still comes up in the same way as before. This will give you a back-out strategy.

4)      Start a test cmsd with the configuration file to make sure you have not made any mistakes. This can be done on any machine. Since it is likely that roles are tied to machine names, you can easily test the configuration file for each possible role by using command line options to override the configuration file:

a.      Manager role test:            cmsd -m -c config_file

b.      Supervisor role test:         cmsd -m -s -c config_file

c.       Server role test:                cmsd -s -c config_file

5)      Install the latest version of xrootd and cmsd on all of your machines. Only versions of xrootd that include cmsd are compatible with cmsd.

6)      You may now start killing the old servers and starting the new servers. It is important that you switch over using the order described below.

a.      Restart your redirector nodes. Kill both the xrootd and the olbd, then start the new xrootd and cmsd (the order is not important).

b.      Restart your supervisor nodes, if any, in the same manner as above.

c.       Restart your server nodes in the same manner as above.
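
The role tests in step 4 and the switch-over order in step 6 can be sketched as two small helpers. This is an illustration only: the function names, the node inventory format, and the node names in the usage note are hypothetical, and the sketch assumes that the redirector nodes of step 6 are the nodes running the manager role of step 4. The helpers only build command lines and an ordering; they do not start or stop any daemon.

```python
# Command line options for each role, as listed in step 4.
ROLE_FLAGS = {
    "manager": ["-m"],
    "supervisor": ["-m", "-s"],
    "server": ["-s"],
}

# Switch-over order from step 6 (assumption: redirector == manager role).
RESTART_ORDER = ["manager", "supervisor", "server"]


def role_test_command(role, config_file):
    """Return the argument list for a test cmsd run in the given role."""
    return ["cmsd", *ROLE_FLAGS[role], "-c", config_file]


def restart_plan(role_by_node):
    """Return node names sorted into the switch-over order.

    role_by_node is a hypothetical inventory: node name -> role name.
    Nodes with the same role are ordered by name for a stable result.
    """
    rank = {role: index for index, role in enumerate(RESTART_ORDER)}
    return sorted(role_by_node, key=lambda node: (rank[role_by_node[node]], node))
```

For example, restart_plan({"data1": "server", "rdr1": "manager", "sup1": "supervisor"}) places rdr1 first, sup1 second, and data1 last, which matches the order a, b, c above.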

 

1.2       Recommended Back-Out Strategy

 

Should you run into problems, you may revert to using the olbd. The new xrootd will recognize the olbd and revert to using the old protocol. However, once an xrootd has reverted, it cannot recognize a cmsd. This means that you will need to restart the xrootd should you wish to switch back to running cmsd.