From: minima
Date: Thu, 19 Aug 2004 16:32:27 +0000 (+0000)
Subject: add the techdoc section, starting with the protocol document for the new
X-Git-Tag: R_1_51B~14
X-Git-Url: http://www.dxcluster.org/gitweb/gitweb.cgi?p=spider.git;a=commitdiff_plain;h=1a3106f748d8123a2b88572227f18147019c61c5

add the techdoc section, starting with the protocol document for the new
protocol.
---

diff --git a/techdoc/protocol.pod b/techdoc/protocol.pod
new file mode 100644
index 00000000..de523c9d
--- /dev/null
+++ b/techdoc/protocol.pod
@@ -0,0 +1,277 @@
+=head1 NAME
+
+DXSpiderWeb Orthogonal Communications Protocol
+
+=head1 SYNOPSIS
+
+ ,,,,,|,...
+
+=head1 ABSTRACT
+
+For many years DX Clusters have used a protocol which was designed
+for a non-looped tree of nodes. This has probably never, reliably,
+been achieved in practice; certainly not recently. This document
+describes a complete replacement for that protocol. It allows a
+fully looped network, is inherently extensible and should be simple
+to implement (especially in Perl).
+
+All implementations of this protocol shall B use this protocol
+for inter-node communications.
+
+=head1 DESCRIPTION
+
+This protocol is encoded in UTF-8 with HTTP style escaping. It is
+designed to be an extensible basis for any type of one to many
+"instant" line-based communications task.
+
+This protocol is designed to be flood routed in a meshed network in
+as efficient a manner as possible.
+
+The protocol consists of a L<Routing Section> and a L<Command Section>.
+The two sections are separated with the '|' character.
+
+Most of this document is concerned with the L<Routing Section>; however,
+some L<Command Tag>s which all implementations should issue and
+must accept are described.
+
+=head2 Routing Section
+
+The application that implements this protocol is essentially a line
+oriented message router. One line equals one message. Each line is
+effectively a datagram.
+
+It is assumed that nodes are connected to
+each other using a "reliable" streaming protocol such as TCP/IP or
+AX25.
+Having said that: in context, elements of the protocol could be
+multicast or broadcast, either "as is" or wrapped in some other framing
+protocol.
+
+Because this is an unreliable, best effort, "please route my packets
+through your node" protocol, there is no guarantee that a message
+will get to the other side of a mesh of nodes. There may be a
+discontinuity caused either by an outage or by deliberate filtering.
+
+However, as it is envisaged that most messages will be flood routed or,
+in the case of directed messages (those that have a E<lt>tonodeE<gt> or
+E<lt>touserE<gt>), sent down all interfaces showing a route for that
+direction, it is unlikely that messages will be lost in practice.
+
+=head3 Field Description
+
+Only the first three fields in the L<Routing Section> are compulsory
+and indicate that this is a broadcast to be sent to all nodes coming
+from the L<Origin>. If the message needs to be identified as coming
+from a user on a node, then the L<FrmUser> field is added.
+
+Adding a L<To> and/or L<ToUser> field will restrict the destinations
+or recipients that receive this message.
+
+The L<Hop> field is incremented on receipt of a message on a node.
+
+Fields are separated by the comma ',' character, with the last field
+used followed by the vertical bar '|' character.
+
+If trailing fields are omitted then superfluous commas can also
+be left out. If intervening fields are missing then no space needs
+to be left for the separating comma.
+
+The characters allowed in the routing section are restricted. Any
+invalid characters in any field will cause the whole message to be
+silently dropped.
+
+More detailed descriptions of the fields follow:
+
+=over
+
+=item Origin
+
+This is a compulsory field. It is the name of the originating node.
+The field can contain up to 12 characters in the set [-A-Z0-9_] in
+any order. Higher layers may restrict this further.
+
+The field must not be changed by any other node.
+
+=item TimeSeq
+
+This is a compulsory field.
+It is a 10 hexadecimal digit string which
+consists of a day number (1-31) combined with the seconds within that
+day (0-86399) [6 hex digits], concatenated with a sequence number
+(0-65535) [4 hex digits], making a total of 10.
+
+The date portion is constructed as:
+
+ my $date = ((gmtime)[3] << 18) | (time % 86400);
+
+The sequence number is simply an unsigned short (or 16 bit) number
+starting at 0.
+
+Each message originated at this node will increment the sequence
+number.
+
+=item Hop
+
+This is a compulsory field. It is the number of hops from the
+originating node. It is incremented immediately on receipt,
+before its value is examined.
+
+So the originating node sends a message with a L<Hop> of 0; the
+neighbouring nodes must increment this field before passing
+it on to higher layers for onward processing.
+
+Implementations may have an upper limit to this field and may
+silently drop incoming messages with a L<Hop> count greater than the
+limit.
+
+=item FrmUser
+
+This field is optional. It is the identifier of the originating
+user. If it is missing then the message is
+assumed to come from the originating node itself.
+
+It can consist of up to 12 characters in the set [-A-Z0-9_]
+in any order. Higher layers may restrict this further.
+
+=item To
+
+This field is optional. It is a string of up to 12 characters
+in the set [-A-Z0-9_] in any order.
+
+This field is used either to indicate a particular node destination
+or to differentiate this broadcast in some way by marking this
+message as a member of a L<Channel>. Any message can be sent
+down any L<Channel>. The names of L<Channel>s and their usage
+are entirely up to the implementor.
+
+It is assumed that node names can be differentiated from user
+names and L<Channel> names.
+
+If the field is set to a particular node destination, it will
+be routed (rather than broadcast) to that node.
+However, any intervening nodes are free to duplicate the message and
+send it down more than one, likely looking, interface - depending on
+any network policies that may pertain.
+
+=item ToUser
+
+This field is optional. It is a string of up to 12 characters
+in the set [-A-Z0-9_] in any order. Higher layers may restrict
+this further.
+
+Conventionally this field is used to indicate the user to whom
+this message is directed. In an ideal world the L<To> field
+will be set, by the originating node, to the identifier of the node
+on which this user resides.
+
+If the L<To> field is not set then this message will be
+broadcast. However, should a node become apparent (en route)
+then nodes are free to fill in the L<To> field and proceed
+with a more directed approach.
+
+If it becomes apparent (en route) that there may be more than
+one possible L<To> destination for a L<ToUser> then a node
+may duplicate the message (keeping the same L<TimeSeq>) and
+route it onwards. Because of the L<DeDuplication> inherent in
+the system, it is indeterminate as to which destination will
+receive the message. It is possible for all or just some
+destinations to receive the message. The tuple (L<Origin>,
+L<TimeSeq>) will determine uniqueness.
+
+This field can, in the case where L<To>
+is set to the name of a node, be set to a L<Channel>. If this
+is the case then this will cause this message to be sent to
+a L<Channel> on the L<To> node only.
+
+=back
+
+=head3 Channel
+
+Channels are a concept very similar to that on IRC. They are a
+way of segregating data flows in a network. In principle, subject
+to local policy or application requirements, any data (or
+L<Command Tag>) can be sent down any channel.
+
+It is up to the implementation whether to use this feature or not.
+
+=head3 Routing
+
+It is assumed that nodes will be connected in a looped network with
+more than one route available (in many cases) to another node.
+
+In any case, most traffic is not directed, but broadcast to all users
+on all nodes.
+
+Each message is uniquely identified by the (L<Origin>,L<TimeSeq>)
+tuple.
+The basic system will learn which interfaces can see what nodes
+by looking at the tuple and merging that with the L<Hop> count.
+Each interface remembers the latest L<TimeSeq> with the lowest L<Hop>
+for each L<Origin> that arrives on that interface. It also remembers
+the number of messages for that L<Origin> that have been received on
+that interface.
+
+Any message for onward broadcast is duplicated and sent out on all
+interfaces that it did not come in on.
+
+Any message that is directed to a particular node will be sent out on
+the "best" interface based on routing information gathered so far. If there
+is more than one possible route then, depending on network or local
+policy, the message may be duplicated and sent on other interfaces
+as well.
+
+=head3 DeDuplication
+
+On receipt of a message, its unique tuple (L<Origin>,L<TimeSeq>) is
+checked against a hash table. If it exists, the message is silently
+dropped. If it does not exist in the hash table then the tuple is
+added.
+
+The hash table is periodically cleaned, removing tuples that
+have expired. The length of time a tuple remains in the hash table
+is implementation dependent but could easily be several days, if
+required.
+
+This mechanism only ensures that a message broadcast around the network
+travels the least distance and through the fewest nodes possible. It
+is up to higher layers to make sure that data carried is not, itself,
+duplicated!
+
+=head2 Command Section
+
+The Command Section of the message contains the actual data being
+passed. It is called the Command Section because all commands
+are identified with a L<Command Tag> which is implemented by
+the software using this protocol.
+
+=head3 Command Tag
+
+The Command Tag consists of a string of uppercase letters and digits,
+starting with an uppercase letter. Tags should be as short as is meaningful.
+
+Valid tags would be:
+
+ DX
+ PC23
+ ANN
+
+Invalid tags include:
+
+ 1AAA
+ dx
+ Ann
+
+There are a number of standard commands which must be accepted by
+all implementations.
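The Command Tag rule above (uppercase letters and digits, starting with an uppercase letter) can be checked mechanically. The following is only a minimal sketch in Perl and is not taken from the DXSpider sources: the helper name is_valid_tag is hypothetical, and it assumes that a single uppercase letter on its own counts as a valid tag, which the text does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# Returns 1 if $tag is an uppercase letter followed by zero or more
# uppercase letters or digits, and 0 otherwise.
# Assumption: a one-letter tag is acceptable.
sub is_valid_tag {
    my ($tag) = @_;
    my $ok = defined($tag) && $tag =~ /\A[A-Z][A-Z0-9]*\z/;
    return $ok ? 1 : 0;
}

# Try the example tags listed in the document.
for my $tag (qw(DX PC23 ANN 1AAA dx Ann)) {
    printf "%-5s %s\n", $tag, is_valid_tag($tag) ? 'valid' : 'invalid';
}
```

Run as a script, this reports the first three tags as valid and the last three as invalid, matching the two lists above. The anchors \A and \z are used rather than ^ and $ so that a trailing newline does not slip through.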
+
+=head1 AUTHOR
+
+Dirk Koopman, G1TLH, E<lt>djk@tobit.co.ukE<gt>
+
+=head1 COPYRIGHT AND LICENSE
+
+Copyright 2004 by Dirk Koopman, G1TLH
+
+This library is free software; you can redistribute it and/or modify
+it under the same terms as Perl itself.
+
+=cut
+