
Lustre 1.6 Operations Manual


Information on the Socket LND (socklnd) protocol

Lustre layers the socket LND (socklnd) protocol above TCP/IP. The first message sent on the TCP/IP bytestream is HELLO, which is used to negotiate connection attributes. The protocol version is determined from the first 4+4 bytes of the HELLO message, which contain a magic number and the protocol version.
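As a rough illustration of the version check described above, the sketch below splits the first eight bytes of a HELLO into two 32-bit fields. The function name, the placeholder magic value, and the little-endian decoding are assumptions made for this example; they are not taken from the Lustre source.

```python
import struct
from typing import Tuple

# Hypothetical placeholder, for illustration only (not the real socklnd magic).
EXAMPLE_MAGIC = 0x12345678


def split_hello_prefix(data: bytes) -> Tuple[int, int]:
    """Return (magic, version) read from the first 4+4 bytes of a HELLO.

    Assumes both fields are unsigned 32-bit integers in little-endian order.
    """
    if len(data) < 8:
        raise ValueError("need at least 8 bytes to read magic and version")
    magic, version = struct.unpack_from("<II", data, 0)
    return magic, version


# Usage: build a fake prefix and read it back.
prefix = struct.pack("<II", EXAMPLE_MAGIC, 2) + b"remaining hello bytes"
magic, version = split_hello_prefix(prefix)
print(hex(magic), version)
```

A receiver would compare the magic against the values it recognizes before trusting the version field.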

In KSOCK_PROTO_V1, the HELLO message is an lnet_hdr_t of type LNET_MSG_HELLO, with the dest_nid field (destination server/machine) replaced by an lnet_magicversion_t. This is followed by 'payload_length' bytes of IP addresses (4 bytes each), which list the interfaces that the sending socklnd owns. The whole message is sent in little-endian (LE) byte order. There is no socklnd-level V1 protocol after the initial HELLO; everything that follows is unencapsulated LNET messages.
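The interface list in a V1 HELLO can be decoded along the following lines. This is a minimal sketch, assuming each address is an unsigned 32-bit little-endian integer whose most significant byte is the first octet of the dotted-quad form; the function name is hypothetical.

```python
import ipaddress
import struct
from typing import List


def decode_interface_list(payload: bytes) -> List[str]:
    """Decode a V1 HELLO payload into dotted-quad IPv4 address strings.

    Assumption: the payload is a sequence of 4-byte little-endian integers,
    one per interface owned by the sending socklnd.
    """
    if len(payload) % 4 != 0:
        raise ValueError("payload length must be a multiple of 4 bytes")
    count = len(payload) // 4
    values = struct.unpack("<%dI" % count, payload)
    return [str(ipaddress.IPv4Address(v)) for v in values]


# Usage: two interfaces, 192.168.0.1 and 10.0.0.2, packed little-endian.
payload = struct.pack("<2I", 0xC0A80001, 0x0A000002)
print(decode_interface_list(payload))
```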

In KSOCK_PROTO_V2, the HELLO message is a ksock_hello_msg_t. The whole message is sent in the sender's byte order, and the byte order (bytesex) of 'kshm_magic' is used on arrival to determine whether the receiver needs to flip. From then on, every message is a ksock_msg_t, also sent in the sender's byte order. This either encapsulates an LNET message (ksm_type == KSOCK_MSG_LNET) or is a NOOP. Every message includes zero-copy request and ACK cookies so that a zero-copy sender can determine when the source buffer can be released without resorting to a kernel patch. The NOOP is provided for delivering a zero-copy ACK when there is no LNET message to piggyback it on.
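The flip decision based on the magic field can be sketched as follows. The magic value, the function names, and the 4-byte field width are assumptions for illustration; the real constants and structure layouts live in the Lustre source.

```python
import struct

# Hypothetical placeholder, for illustration only (not the real socklnd magic).
EXAMPLE_MAGIC = 0x12345678


def needs_flip(raw_magic: bytes) -> bool:
    """Return True if the sender's byte order differs from the receiver's.

    The 4 raw magic bytes are read in native order first; if that does not
    match, they are byte-swapped and checked again.
    """
    if len(raw_magic) != 4:
        raise ValueError("magic field must be exactly 4 bytes")
    (native,) = struct.unpack("=I", raw_magic)
    if native == EXAMPLE_MAGIC:
        return False
    (swapped,) = struct.unpack("=I", raw_magic[::-1])
    if swapped == EXAMPLE_MAGIC:
        return True
    raise ValueError("unrecognized magic")


def read_u32(raw: bytes, flip: bool) -> int:
    """Decode a 4-byte field, swapping bytes when the sender's order differs."""
    (value,) = struct.unpack("=I", raw[::-1] if flip else raw)
    return value


# Usage: a peer with the opposite byte order sends the magic and a version.
same_order = struct.pack("=I", EXAMPLE_MAGIC)
other_order = same_order[::-1]
flip = needs_flip(other_order)
print(flip, read_u32(struct.pack("=I", 2)[::-1], flip))
```

Because the sender never converts, two peers with the same byte order pay no swapping cost; only a receiver that sees the magic reversed has to flip subsequent fields.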

Note that socklnd may connect to its peers via a "bundle" of sockets: one for bidirectional "ping-pong" data and two others for unidirectional bulk data. However, the message protocol on every socket is as described above.

D-28 Lustre 1.6 Operations Manual • September 2008
