Example xrootd setup and configuration - SLAC Xrootd Output Buffer





Simple Explanation of Skimming

The xrootd output buffer is employed in the skim production at SLAC. Skim production classifies events according to their physics properties and groups events with the same properties together, which lets users access events of interest efficiently. For BaBar, events are grouped into collections, and each collection consists of one or more root files that store the objects of the events in the collection. In this context, transferring a collection means transferring the corresponding root files.

The following figure shows a simplified view of the skim production.

[Figure: simplified view of the skim production]

Steps involved in the production are:
  1. Events are read from the read-only analysis xrootd cluster.
  2. Events are analyzed, classified and stored in local root files on the client machines.
  3. The root files are transferred to the output buffer using the xrdcp application.
  4. A new client reads collections of the same skim type from the output buffer and merges them into a single collection. The merged files are stored on a local disk.
  5. The merged collection is transferred to the output buffer using xrdcp.
  6. After the collections have been checked and all of them are confirmed to exist, they are renamed from the /prod name space to the /store name space, e.g. /prod/PRskims/R18/18.1.2c/DsToMuNu/80/DsToMuNu_8080.01.root to /store/PRskims/R18/18.1.2c/DsToMuNu/80/DsToMuNu_8080.01.root. The rename is initiated by the client.
  7. Files in the /store name space on the output buffer are automatically migrated into HPSS. From HPSS these collections are transferred to the analysis Xrootd cluster upon a client request.
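
The rename in step 6 amounts to replacing the leading name-space component of the path. The following Python fragment is a minimal sketch of that mapping only. The helper name prod_to_store is hypothetical and chosen for illustration; the actual rename is issued by the client against xrootd and is not shown here.

```python
from pathlib import PurePosixPath

PROD_ROOT = "/prod"    # production name space (from the text above)
STORE_ROOT = "/store"  # permanent name space (from the text above)


def prod_to_store(path: str) -> str:
    """Return the /store path for a file in the /prod name space.

    Hypothetical helper for illustration: it only computes the target
    name and does not contact any xrootd server.
    """
    source = PurePosixPath(path)
    try:
        relative = source.relative_to(PROD_ROOT)
    except ValueError:
        raise ValueError(f"not in the {PROD_ROOT} name space: {path}")
    return str(PurePosixPath(STORE_ROOT) / relative)
```

Applied to the example path from step 6, the helper returns the /store name quoted there. Paths outside /prod are rejected rather than silently passed through.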


Xrootd Output Buffer

As shown in the previous section, an xrootd system is needed that allows both reading and writing of data. The next figure shows the setup of the Xrootd Output Buffer. The output buffer exports two name spaces. /prod is used for most operations and only for production: all files that exist only during production are stored in the /prod name space. The final collections are moved into the /store name space, which is the name space for all permanent BaBar collections available to users.

[Figure: setup of the Xrootd Output Buffer]

(Note: all of this is inside an Internet Free Zone (IFZ) and hence not visible to the outside world.)

The Redirector(s)

The redirector is a critical element because all clients first connect to a redirector. Two redirectors are therefore used, which provides some redundancy, with DNS-style round-robin load balancing between them. All client jobs are configured to open files via "bbr-rdr-p":
   ~> nslookup  bbr-rdr-p
      Name:    bbr-rdr-p.slac.stanford.edu
      Addresses:  134.79.85.25, 134.79.85.26
The two addresses correspond to the redirector machines bbr-rdr05 and bbr-rdr06.
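
To illustrate how a single alias such as "bbr-rdr-p" can spread clients over several hosts, here is a minimal Python sketch. It is an illustration only, not the mechanism used at SLAC: in practice the rotation is done by the DNS service itself. The function names are hypothetical, and the port 1094 is an assumption (the usual xrootd default), since the text does not state which port the redirectors use.

```python
import socket
from itertools import cycle


def resolve_all(host, port=1094):
    """Return the sorted, de-duplicated IPv4 addresses behind a DNS name.

    The port is only needed by getaddrinfo; 1094 is an assumed default.
    """
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def round_robin(addresses):
    """Return an endless iterator that hands out addresses in rotation."""
    return cycle(addresses)
```

With the two addresses from the nslookup output above, successive calls to next() on round_robin(...) alternate between 134.79.85.25 and 134.79.85.26, so consecutive clients land on different redirectors.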

Data Server

Currently eleven data servers are used. Each data server has four file systems, except for two servers that have eight (example for one server):
Filesystem            kbytes    used       avail capacity  Mounted on
/dev/dsk/c2t1d0s6    845881102 552257435 285164856    66%    /kanga
/dev/dsk/c2t1d1s6    845881102 470388145 367034146    57%    /kanga/cache1
/dev/dsk/c3t1d1s6    610449718 519489608 84855613     86%    /kanga/cache3
/dev/dsk/c3t1d0s6    610449718 514313627 90031594     86%    /kanga/cache2

Xrootd Configuration

The configuration files for the redirectors and data servers are:
Redirector config
Data server config

The main features of the xrootd configuration are:

/prod and /store exported and writable

The /prod name space has to be writable because files are written to it. /store also has to be writable so that the final files can be renamed from /prod to /store.

Migratable cache system

A cache system is used on the data servers. The cache is marked as migratable. Files transferred into xrootd are put into the cache system and a lock file is created. If the time stamp of the file is more recent than that of its lock file, the file is eligible for migration, e.g.:

    ~> ls -lL /kanga/prod/store/PRskims/R18/18.0.4/Kll/65/Kll_6524.01.root*
       -rw-rw-r--   1 bbdatsrv bfactory 357081984 Jun  7 12:07 /kanga/prod/store/PRskims/R18/18.0.4/Kll/65/Kll_6524.01.root
       -rw-------   1 bbdatsrv bfactory       0 Jun  7 12:05 /kanga/prod/store/PRskims/R18/18.0.4/Kll/65/Kll_6524.01.root.lock
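
The eligibility rule above can be written as a comparison of modification times. The following Python fragment is a minimal sketch under the assumption that the lock file is always named by appending ".lock" to the data file name, as in the listing above. The function name is hypothetical, and the real migration tools may apply further checks that are not described here.

```python
import os


def eligible_for_migration(data_path, lock_suffix=".lock"):
    """Return True if the data file is newer than its lock file.

    os.stat follows symbolic links, matching the -L option of the
    ls command used in the listing above.
    """
    lock_path = data_path + lock_suffix
    if not os.path.exists(lock_path):
        # Without a lock file there is nothing to compare against.
        return False
    return os.stat(data_path).st_mtime > os.stat(lock_path).st_mtime
```

For the file in the listing, the data file was modified at 12:07 and the lock file at 12:05, so the comparison would mark it as eligible.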

No mass storage system

The open storage system (oss) section of xrootd is configured not to use a mass storage system. This means that xrootd cannot stage files from HPSS to disk (there is no need for this), and it also does not check whether a file is already in HPSS.

Checksum enabled

The checksum of a file can be obtained via xrootd. The xrootd.chksum directive is used to configure the checksum. The script that calculates the checksum must exist before the xrootd server is started; otherwise the server will not start up.
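
As an illustration of what such a checksum script might compute, here is a minimal Python sketch. It assumes Adler-32 as the algorithm, which the text above does not state, and it does not reproduce the output format that xrootd expects from the script. Both function names are hypothetical. The second function mirrors the requirement that the script must exist before the server is started.

```python
import os
import zlib


def adler32_of_file(path, chunk_size=1 << 20):
    """Return the Adler-32 checksum of a file as 8 hex digits.

    The file is read in chunks so that large root files do not have
    to fit into memory. Adler-32 is an assumed choice of algorithm.
    """
    checksum = 1  # defined starting value of Adler-32
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")


def script_is_usable(script_path):
    """Return True if the checksum script exists and is executable."""
    return os.path.isfile(script_path) and os.access(script_path, os.X_OK)
```

Running script_is_usable on the configured script path before launching the server is one way to catch a missing script early.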



wilko
Last modified: Fri Feb 17 07:32:04 PST 2006