The xrootd Protocol

Version 2.7.0

Andrew Hanushevsky

Stanford Linear Accelerator Center

25-Jan-2007

©2004-2007 by the Board of Trustees of the Leland Stanford, Jr., University

All Rights Reserved

Produced under contract DE-AC02-76-SFO0515 with the Department of Energy

This code is available under a BSD-style license allowing minimally restricted use.

 


1         Contents

1  Contents
2  Request/Response Protocol
    2.1  Format of Client-Server Initial Handshake
    2.2  Data Serialization
    2.3  Client Request Format
        2.3.1  Valid Client Requests
        2.3.2  Valid Client Paths
        2.3.3  Client Recovery From Server Failures
        2.3.4  Client Recovery From File Location Failures
    2.4  Server Response Format
        2.4.1  Valid Server Response Status Codes
        2.4.2  Server kXR_attn Response Format
            2.4.2.1  Server kXR_attn Response for kXR_asyncab Client Action
            2.4.2.2  Server kXR_attn Response for kXR_asyncdi Client Action
            2.4.2.3  Server kXR_attn Response for kXR_asyncgo Client Action
            2.4.2.4  Server kXR_attn Response for kXR_asyncms Client Action
            2.4.2.5  Server kXR_attn Response for kXR_asyncrd Client Action
            2.4.2.6  Server kXR_attn Response for kXR_asynresp Client Action
            2.4.2.7  Server kXR_attn Response for kXR_asyncwt Client Action
        2.4.3  Server kXR_authmore Response Format
        2.4.4  Server kXR_error Response Format
            2.4.4.1  Server kXR_error Sub-Codes
        2.4.5  Server kXR_ok Response Format
        2.4.6  Server kXR_oksofar Response Format
        2.4.7  Server kXR_redirect Response Format
        2.4.8  Server kXR_wait Response Format
        2.4.9  Server kXR_waitresp Response Format
3  Detailed Protocol Specifications
    3.1  kXR_admin Request
    3.2  kXR_auth Request
    3.3  kXR_bind Request
    3.4  kXR_chmod Request
    3.5  kXR_close Request
    3.6  kXR_dirlist Request
    3.7  kXR_endsess Request
    3.8  kXR_getfile Request
        3.8.1  Multi-Stream File Retrieval
    3.9  kXR_login Request
    3.10  kXR_mkdir Request
    3.11  kXR_mv Request
    3.12  kXR_open Request
        3.12.1  Passing Opaque Information
    3.13  kXR_ping Request
    3.14  kXR_prepare Request
    3.15  kXR_protocol Request
    3.16  kXR_putfile Request
        3.16.1  Multi-Stream File Storage
    3.17  kXR_query Request
        3.17.1  kXR_query Checksum Cancellation Request
        3.17.2  kXR_query Checksum Request
        3.17.3  kXR_query Configuration Request
        3.17.4  kXR_query Statistics Request
    3.18  kXR_read Request
    3.19  kXR_readv Request
    3.20  kXR_rm Request
    3.21  kXR_rmdir Request
    3.22  kXR_set Request
        3.22.1  Valid kXR_set Values
    3.23  kXR_stat Request
    3.24  kXR_statx Request
    3.25  kXR_sync Request
    3.26  kXR_unbind Request
    3.27  kXR_write Request
4  Local Socket Administrative Protocol
    4.1  Initiating an Administrative Session
    4.2  General Request Format
        4.2.1  Request Target Format
            4.2.1.1  Connection name format
    4.3  General Response Format
        4.3.1  Error Response Format
    4.4  Abort request for kXR_asyncab Client Action
    4.5  Close request
    4.6  cj request
    4.7  Cont request for kXR_asyncgo Client Action
    4.8  Disc request for kXR_asyncdi Client Action
    4.9  Login request (mandatory)
    4.10  Lsc request
    4.11  Lsd request
    4.12  Lsj request
    4.13  Msg request for kXR_asyncms Client Action
    4.14  Pause request for kXR_asyncwt Client Action
    4.15  Redirect request for kXR_asyncrd Client Action
5  Document Change History

 


2         Request/Response Protocol

2.1       Format of Client-Server Initial Handshake

 

When a client first connects to the XRootd server, it must perform a special handshake. This handshake will determine whether the client is communicating with an XRootd server or a rootd server.

 

The handshake consists of the client sending 20 bytes, as follows:

 

 

     kXR_int32    0
     kXR_int32    0
     kXR_int32    0
     kXR_int32    4    (network byte order)
     kXR_int32    2012 (network byte order)

 

 

The first twelve bytes are zero. The next eight bytes correspond to a standard rootd server protocol request (i.e., kROOTD_PROTOCOL). Both rootd and XRootd servers respond as follows:

 

 

     rootd Response                      XRootd Response

                                         streamid: kXR_char  smid[2]
                                         status:   kXR_unt16 0
     msglen:  kXR_int32 8                msglen:   kXR_int32 rlen
     msgtype: kXR_int32 2012             msgval1:  kXR_int32 pval
     msgval:  kXR_int32 pval             msgval2:  kXR_int32 flag

 

 

Where:

 

smid     is the initial streamid. The smid for the initial response is always two null characters (i.e., ‘\0’).

 

rlen      is the binary response length (e.g., 8 for the indicated response).

 

pval     is the binary protocol version number.

 

flag      is additional bit-encoded information about the server; as follows:

            kXR_DataServer - This is a data server.

            kXR_LBalServer - This is a load-balancing server.

 

Notes

1)      All binary fields are transmitted in network byte order using an explicit length. The kXR_char and kXR_unt16 data types are treated as unsigned values. All reserved fields must be initialized to binary zero.

2)      The first four bytes of the reply, read as a 32-bit integer, identify the server: a value of 8 indicates rootd and a value of 0 indicates XRootd.

3)      All twenty bytes must be received by the server at one time. All known TCP implementations will guarantee that the first message is sent intact if all twenty bytes are sent in a single system call. Using multiple system calls for the first message may cause unpredictable results.
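
The handshake and the reply check described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names build_handshake, classify_reply, HANDSHAKE_LEN, and the peer_kind values are hypothetical and do not come from the protocol. Socket setup and error handling are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl(), ntohl() */

enum { HANDSHAKE_LEN = 20 };

/* Hypothetical result names; they are not part of the protocol. */
enum peer_kind { PEER_XROOTD, PEER_ROOTD, PEER_UNKNOWN };

/* Fill buf with the 20-byte handshake: three zero words, then 4 and 2012. */
static void build_handshake(unsigned char buf[HANDSHAKE_LEN])
{
    const uint32_t words[5] = { 0, 0, 0, 4, 2012 };
    assert(buf != NULL);
    for (int i = 0; i < 5; i++) {
        uint32_t be = htonl(words[i]);        /* network byte order */
        memcpy(buf + 4 * i, &be, sizeof be);  /* unaligned, no padding */
    }
}

/* Classify the peer from the first four bytes of its reply. */
static enum peer_kind classify_reply(const unsigned char first4[4])
{
    uint32_t be;
    assert(first4 != NULL);
    memcpy(&be, first4, sizeof be);
    switch (ntohl(be)) {
    case 0:  return PEER_XROOTD;  /* streamid {0,0} followed by status 0 */
    case 8:  return PEER_ROOTD;   /* rootd msglen of 8 */
    default: return PEER_UNKNOWN;
    }
}
```

In line with note 3, a client would pass the whole 20-byte buffer to a single send() or write() call rather than sending the five words separately.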

2.2       Data Serialization

 

All data sent and received is serialized (i.e., marshaled) in three ways:

1.      Bytes are sent unaligned without any padding,

2.      Data type characteristics are predefined (see table below), and

3.      All integer quantities are sent in network byte order (i.e., big endian).

 

XRootd Type   Sign       Bit Length   Bit Alignment   Typical Host Type
kXR_char8     unsigned    8            8              unsigned char
kXR_unt16     unsigned   16           16              unsigned short
kXR_int32     signed     32           32              long[1]
kXR_int64     signed     64           64              long long

Table 1: XRootd Protocol Data Types

Network byte order is defined by the Unix htons() and htonl() macros for host to network short and host to network long, respectively. The reverse is defined by the ntohs() and ntohl() macros. Many systems do not define the long long versions of these macros. The XRootd protocol requires that the POSIX version of long long serialization be used, as defined in the following figures. The OS-dependent isLittleEndian() function returns true if the underlying hardware uses little endian integer representation.


 

 

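#include <stdint.h>      /* uint32_t */
#include <arpa/inet.h>   /* htonl() */

extern int isLittleEndian(void);   /* OS-dependent, supplied by the host */

unsigned long long htonll(unsigned long long x)
{
    unsigned long long ret_val;

    /* Use an exact 32-bit type: unsigned long is 64 bits on LP64 hosts. */
    if (isLittleEndian()) {
        /* Swap the two 32-bit halves and byte-swap each of them. */
        *((uint32_t *)(&ret_val) + 1) = htonl(*((uint32_t *)(&x)));
        *((uint32_t *)(&ret_val))     = htonl(*((uint32_t *)(&x) + 1));
    } else {
        /* Big endian host: htonl() is the identity, halves stay in place. */
        *((uint32_t *)(&ret_val))     = htonl(*((uint32_t *)(&x)));
        *((uint32_t *)(&ret_val) + 1) = htonl(*((uint32_t *)(&x) + 1));
    }
    return ret_val;
}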

 

Figure 1: POSIX Host to Network Byte Order Serialization

 
 

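#include <stdint.h>      /* uint32_t */
#include <arpa/inet.h>   /* ntohl() */

extern int isLittleEndian(void);   /* OS-dependent, supplied by the host */

unsigned long long ntohll(unsigned long long x)
{
    unsigned long long ret_val;

    /* Use an exact 32-bit type: unsigned long is 64 bits on LP64 hosts. */
    if (isLittleEndian()) {
        /* Swap the two 32-bit halves and byte-swap each of them. */
        *((uint32_t *)(&ret_val) + 1) = ntohl(*((uint32_t *)(&x)));
        *((uint32_t *)(&ret_val))     = ntohl(*((uint32_t *)(&x) + 1));
    } else {
        /* Big endian host: ntohl() is the identity, halves stay in place. */
        *((uint32_t *)(&ret_val))     = ntohl(*((uint32_t *)(&x)));
        *((uint32_t *)(&ret_val) + 1) = ntohl(*((uint32_t *)(&x) + 1));
    }
    return ret_val;
}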

 

Figure 2: POSIX Network to Host Byte Order Serialization

 


More compact and efficient, though OS restricted (i.e., Solaris and Linux), versions of 64-bit network byte ordering routines are given in the following figure.

 

 
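/* On Linux, __BYTE_ORDER and __BIG_ENDIAN come from <endian.h> and
   __bswap_64() from <byteswap.h>; include them before this block. */
#if defined(__sparc) || __BYTE_ORDER==__BIG_ENDIAN
#ifndef htonll
#define htonll(x) (x)
#endif
#ifndef ntohll
#define ntohll(x) (x)
#endif
#else
#ifndef htonll
#define htonll(x) __bswap_64(x)
#endif
#ifndef ntohll
#define ntohll(x) __bswap_64(x)
#endif
#endif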

 

Figure 3: Network and Host Byte Ordering Macros
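
Where neither the pointer-based routines nor the OS-specific macros are convenient, the same big-endian wire layout can be produced with shifts, which does not depend on host byte order. The following is a minimal sketch, not part of the specification; the names put_int64_be and get_int64_be are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value into buf in network (big endian) byte order.
   buf needs no particular alignment. */
static void put_int64_be(unsigned char *buf, int64_t value)
{
    uint64_t v = (uint64_t)value;
    assert(buf != NULL);
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (56 - 8 * i));  /* most significant first */
}

/* Read a 64-bit value stored in network byte order. */
static int64_t get_int64_be(const unsigned char *buf)
{
    uint64_t v = 0;
    assert(buf != NULL);
    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    /* Assumes the usual two's complement conversion back to signed. */
    return (int64_t)v;
}
```

Because the helpers work byte by byte, they also satisfy the rule that data is sent unaligned without padding.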


2.3       Client Request Format

 

Requests sent to the server are a mixture of ASCII and binary. All requests, other than the initial handshake request, have the same format, as follows:

 

 

     kXR_char  streamid[2]
     kXR_unt16 requestid
     kXR_char  parms[16]
     kXR_int32 dlen
     kXR_char  data[dlen]

 

 

Where:

 

streamid

            is the binary identifier that is associated with this request stream. This identifier will be echoed along with any response to the request.

 

requestid

            is the binary identifier of the operation to be performed by the server.

 

parms  are parameters specific to the requestid.

 

dlen     is the binary length of the data portion of the message. If no data is present, then the value is zero.

 

data     are data specific to the requestid. Not all requests have associated data. If the request does have data, the length of this field is recorded in the dlen field.

 

Notes

1)      All binary fields are transmitted in network byte order using an explicit length. The kXR_char and kXR_unt16 data types are treated as unsigned values. All reserved fields must be initialized to binary zero.

2)      All XRootd client requests begin with a standard 24-byte fixed-length header. The header may optionally be followed by request-specific data.

3)      Stream ids are arbitrary and are assigned by the client. Typically these ids correspond to logical connections multiplexed over a physical connection established to a particular server.


4)      The client may send any number of requests to the same server. The order in which requests are performed is undefined. Therefore, each request should have a different streamid so that returned results may be paired up with associated requests.

5)      Requests sent by a client over a single physical connection may be processed in an arbitrary order. Therefore the client is responsible for serializing requests, as needed.
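
As an illustration of the layout above, the 24-byte header could be packed as shown below. This is a sketch under stated assumptions: pack_request_header and REQ_HDR_LEN are hypothetical names, and the caller is assumed to supply a requestid value and any request-specific parms, neither of which is defined in this section.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons(), htonl() */

enum { REQ_HDR_LEN = 24 };  /* 2 + 2 + 16 + 4 bytes */

/* Pack the fixed request header into out. A NULL parms pointer means
   "all zero", matching the rule that reserved fields are binary zero. */
static void pack_request_header(unsigned char out[REQ_HDR_LEN],
                                const unsigned char streamid[2],
                                uint16_t requestid,
                                const unsigned char parms[16],
                                int32_t dlen)
{
    uint16_t rid_be  = htons(requestid);        /* network byte order */
    uint32_t dlen_be = htonl((uint32_t)dlen);

    assert(out != NULL && streamid != NULL && dlen >= 0);
    memcpy(out, streamid, 2);                   /* echoed in the response */
    memcpy(out + 2, &rid_be, 2);
    if (parms != NULL)
        memcpy(out + 4, parms, 16);
    else
        memset(out + 4, 0, 16);
    memcpy(out + 20, &dlen_be, 4);              /* length of trailing data */
}
```

The dlen bytes of request data, if any, would be written immediately after these 24 bytes. Using a distinct streamid per outstanding request lets the client pair each response with its request, as note 4 recommends.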

 

2.3.1        Valid Client Requests

 

The following table lists all possible requests and their arguments. Grayed rows represent requests that are not currently supported.

 

Requestid      Login?   Auth?   Redirect?   Arguments
kXR_admin      y        y       n           args
kXR_auth       y        n       n           authtype, authinfo
kXR_bind       n        n       n           sessid
kXR_chmod      y        y       y           mode, path
kXR_close      y        y       n           fdnum
kXR_dirlist    y        y       y           path
kXR_endsess    y        y       n           sessid
kXR_getfile    y        y       y*          path
kXR_login      n        n       n           userid, token
kXR_ls         y        y       y           options, path
kXR_mkdir      y        y       y           mode, path
kXR_mv         y        y       y           old_name, new_name
kXR_open       y        y       y*          mode, flags, path
kXR_ping       y        n       n
kXR_prepare    y        y       n           paths
kXR_protocol   n        n       n
kXR_putfile    y        y       y*          mode, flags, path
kXR_query      y        y       n           args
kXR_read       y        y       y           fdnum, length, offset
kXR_rm         y        y       y           path
kXR_rmdir      y        y       y           path
kXR_set        y        y       n           info
kXR_stat       y        y       y           path
kXR_statx      y        y       n           pathlist
kXR_write      y        y       y           fdnum, length, offset, data

Table 2: Valid Client Requests

*


2.3.2        Valid Client Paths

 

The XRootd server accepts only absolute paths where a path may be specified. Relative paths must be resolved by the client interface prior to sending them to XRootd. This means that the interface must handle a virtual “current working directory” to resolve relative paths should they arise.

 

Path names are restricted to the following set of characters:

 

In general, paths may not contain shell meta-characters or embedded spaces.

 

2.3.3        Client Recovery From Server Failures

 

A server failure should be recognized when the server unexpectedly closes its TCP/IP connection. Should this happen, the client may recover all operations by treating the termination of the connection as a redirection request (see section 2.4.7, Server kXR_redirect Response Format) to the initial XRootd server for all streams associated with the closed TCP/IP connection.

 

Because many clients are likely to be affected by a server failure, it is important that clients pace their reconnection to the initial XRootd server. One effective way to do this is to use the last three bits of the client’s IP address as the number of seconds to wait before attempting a reconnection. It is up to the client to determine either the number of times or the time window in which reconnections should be attempted before failure is declared. Typical values are 16 attempts or 3 minutes, whichever is longer.
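
The pacing rule can be sketched as follows. The function and constant names are hypothetical, the address is assumed to be IPv4 in host byte order, and the stopping rule is one reading of "whichever is longer": the client gives up only after both the attempt count and the time window have been exhausted.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_ATTEMPTS = 16, MAX_WINDOW_SECONDS = 180 };  /* typical values */

/* Seconds to wait before reconnecting: the last three bits of the
   client's IPv4 address (host byte order), giving a delay of 0..7. */
static unsigned reconnect_delay_seconds(uint32_t ipv4_host_order)
{
    return (unsigned)(ipv4_host_order & 0x7u);
}

/* Returns nonzero while another reconnection attempt is allowed. */
static int should_retry(unsigned attempts, double elapsed_seconds)
{
    assert(elapsed_seconds >= 0.0);
    return attempts < MAX_ATTEMPTS || elapsed_seconds < MAX_WINDOW_SECONDS;
}
```

For example, a client at 192.168.1.13 has a last octet of 13 (binary 1101), so its low three bits give a 5 second delay.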

 

Note that it may not be possible to recover in this way for files that were opened in update mode. Clients who do not provide proper transactional support generally cannot recover via redirection for any read/write resources.

 


2.3.4        Client Recovery From File Location Failures

 

A file location failure should be recognized when a server returns a kXR_error status code with a kXR_NotFound error code and the server is either a load-balancing server or the target of a redirect. The recovery steps are:

  1. The client should contact the last load balancer used.
  2. For kXR_open requests, the request should be re-issued with the kXR_refresh option set. For other requests, no recovery is currently possible.
  3. If the same result is encountered again, the client should consider the file missing and not attempt any further recovery actions.
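
The recovery steps above amount to a small decision procedure, sketched below. The struct, enum, and function names are hypothetical, and the caller is assumed to have already decoded the server response into the four flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a failed request, filled in by the caller. */
struct failure_info {
    bool not_found;            /* kXR_error carrying kXR_NotFound */
    bool from_lb_or_redirect;  /* load balancer or redirect target replied */
    bool is_open_request;      /* the failed request was kXR_open */
    bool already_retried;      /* a refresh retry gave the same result */
};

enum recovery_action {
    ACTION_NOT_A_LOCATION_FAILURE,  /* rule does not apply */
    ACTION_NO_RECOVERY,             /* non-open request: nothing to retry */
    ACTION_REOPEN_WITH_REFRESH,     /* re-issue via last load balancer */
    ACTION_FILE_MISSING             /* same result twice: stop recovering */
};

static enum recovery_action next_action(const struct failure_info *f)
{
    assert(f != NULL);
    if (!f->not_found || !f->from_lb_or_redirect)
        return ACTION_NOT_A_LOCATION_FAILURE;
    if (!f->is_open_request)
        return ACTION_NO_RECOVERY;
    if (f->already_retried)
        return ACTION_FILE_MISSING;
    return ACTION_REOPEN_WITH_REFRESH;
}
```

On ACTION_REOPEN_WITH_REFRESH the client would contact the last load balancer used and re-issue the kXR_open request with the kXR_refresh option set.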

 


2.4       Server Response Format

 

All responses, including the initial handshake response, have the same format, as follows:

 

 

     kXR_char  streamid[2]<