Membership Layer

The purpose of the membership layer is to provide client programs with a list of computers on the local network that are running PFS daemons. Internally, it uses a leader-client architecture: one computer becomes the leader, and that leader maintains the canonical list of participating computers.

When a PFS daemon starts up, it begins sending "I'm alive" packets over UDP. When a leader sees one of these packets, it attempts to connect to the client that sent it. Once this connection (referred to hereafter as a "tether") is established, the leader sends an update down the tether to inform the new client of all computers that the leader is aware of, and also sends an update to each of the other clients to inform them of the new participant.

The tether is a TCP connection. If sending an update would cause the connection to block, the leader assumes that the client is not responding, drops the tether immediately, and informs the other machines of the dropped client.
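The drop-on-block rule described above could be sketched as follows. This is only an illustration, not the daemon's actual code: the name TrySendUpdate is hypothetical, the MSG_DONTWAIT flag is assumed to be available (it is on Linux and the BSDs), and treating a partially accepted update as "would block" is an assumption made here to keep the sketch short.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: try to push one update down a tether without blocking.
// Returns true only if the kernel accepted the whole update. Returns false if
// the send would block, was only partly accepted, or failed outright; the
// caller is then expected to drop the tether and notify the other clients.
bool TrySendUpdate(int tether_fd, const void *buf, size_t len)
{
    assert(tether_fd >= 0);

    ssize_t n;
    do {
        // MSG_DONTWAIT makes this one call non-blocking (assumed available).
        n = send(tether_fd, buf, len, MSG_DONTWAIT);
    } while (n < 0 && errno == EINTR);  // retry if interrupted by a signal

    if (n < 0) {
        // EAGAIN / EWOULDBLOCK mean the socket buffer is full; any other
        // error is treated the same way in this sketch.
        return false;
    }
    // ASSUMPTION: a short write counts as "would block".
    return static_cast<size_t>(n) == len;
}
```

A leader loop would call this once per tether for each update and drop any tether for which it returns false. The sketch does not deal with SIGPIPE on a closed peer.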

It is possible for two clients on the network to believe that they are the leader.... (pradip elaborate)

Client programs use the daemon through a library that communicates with the daemon over a socket, which serves as an RPC mechanism. This protocol uses the same update system that the leader-to-client protocol uses.

Protocol details and APIs

Updates

Adding client(s)
    4 bytes       4 bytes       arbitrary
    0xFFFFFFFF    numclients    list of clients' IP addresses in network byte order
Removing client(s)
    4 bytes       4 bytes       arbitrary
    0x00000000    numclients    list of clients' IP addresses in network byte order
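
As an illustration of the update layout above, here is a minimal encode/decode sketch. The function and constant names are hypothetical. Two things are assumed because the layout does not state them: that numclients is sent in network byte order, and that each address is a 4-byte IPv4 address.

```cpp
#include <arpa/inet.h>  // htonl, ntohl
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Leading 4-byte markers taken from the layout above.
const uint32_t kUpdateAdd = 0xFFFFFFFFu;
const uint32_t kUpdateRemove = 0x00000000u;

// Build one update. `addrs` holds IPv4 addresses already in network byte order.
std::vector<unsigned char> EncodeUpdate(bool adding,
                                        const std::vector<uint32_t> &addrs)
{
    std::vector<unsigned char> out(8 + 4 * addrs.size());
    // Both markers read the same in either byte order, so no swap is needed.
    uint32_t op = adding ? kUpdateAdd : kUpdateRemove;
    // ASSUMPTION: numclients travels in network byte order.
    uint32_t count = htonl(static_cast<uint32_t>(addrs.size()));
    std::memcpy(&out[0], &op, 4);
    std::memcpy(&out[4], &count, 4);
    if (!addrs.empty()) {
        std::memcpy(&out[8], addrs.data(), 4 * addrs.size());
    }
    return out;
}

// Parse one complete update. Returns false if the buffer is malformed.
bool DecodeUpdate(const unsigned char *buf, size_t len,
                  bool *adding, std::vector<uint32_t> *addrs)
{
    assert(buf != nullptr && adding != nullptr && addrs != nullptr);
    if (len < 8) return false;

    uint32_t op, count;
    std::memcpy(&op, buf, 4);
    std::memcpy(&count, buf + 4, 4);
    count = ntohl(count);

    if (op != kUpdateAdd && op != kUpdateRemove) return false;
    // The payload must hold exactly `count` 4-byte addresses.
    if ((len - 8) % 4 != 0 || (len - 8) / 4 != count) return false;

    *adding = (op == kUpdateAdd);
    addrs->resize(count);
    if (count != 0) {
        std::memcpy(addrs->data(), buf + 8, 4 * static_cast<size_t>(count));
    }
    return true;
}
```

The decoder expects one whole update in the buffer. On a TCP tether, a reader would first read the 8-byte header and then read 4 * numclients more bytes before decoding.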

Leader to leader negotiation

...to be filled out...

APIs

All clients using the membership layer make use of MembershipClient objects.
class MembershipClient
Instantiate this class to start using the membership layer.
method GetPeers
error_t GetPeers(unsigned long **list, size_t *listsz)
Returns, through list and listsz, the peers known at the time of the call.
method AddWatch
error_t AddWatch(PeerWatchProc proc)
Adds a watch. The watch is called with every update, and it also receives an initial update that provides the current list.
method RemoveWatch
error_t RemoveWatch(PeerWatchProc proc)
Removes a watch from the list of watches to be called.
typedef PeerWatchProc
typedef void (*PeerWatchProc)(bool adding, const unsigned long *addrs, size_t numaddrs)
This is the prototype of all watch functions. When called, it is given a flag that denotes whether the list is being added to or removed from, a list of addresses, and the length of that list.

Watch functions should never block for any significant amount of time; only one thread is used internally for all watches for each MembershipClient object.
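
As an illustration, a watch function that keeps its own copy of the peer list might look like the sketch below. The names TrackPeers, PeerCount, g_peers and g_peers_mutex are hypothetical. The PeerWatchProc typedef is repeated from above so the sketch compiles on its own. Because the prototype has no user-data argument, the sketch keeps its state in globals, and it guards them with a mutex on the assumption that the application reads them from a thread other than the one that runs the watches.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>

// Prototype repeated from the API description above.
typedef void (*PeerWatchProc)(bool adding, const unsigned long *addrs,
                              size_t numaddrs);

// Hypothetical application state: the set of currently known peers.
static std::mutex g_peers_mutex;
static std::set<unsigned long> g_peers;

// Watch function: applies one update to the local set and returns at once.
// It does no I/O and holds the lock only briefly, so it does not stall the
// single thread that runs the watches.
void TrackPeers(bool adding, const unsigned long *addrs, size_t numaddrs)
{
    assert(addrs != nullptr || numaddrs == 0);
    std::lock_guard<std::mutex> lock(g_peers_mutex);
    for (size_t i = 0; i < numaddrs; ++i) {
        if (adding) {
            g_peers.insert(addrs[i]);
        } else {
            g_peers.erase(addrs[i]);
        }
    }
}

// Called from application threads to read the current peer count.
size_t PeerCount()
{
    std::lock_guard<std::mutex> lock(g_peers_mutex);
    return g_peers.size();
}
```

With a MembershipClient object named client, the watch would be registered with client.AddWatch(TrackPeers) and later removed with client.RemoveWatch(TrackPeers). Since AddWatch delivers an initial update with the current list, the set is populated without a separate GetPeers call.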