15-440, Fall 2012, Class 07, Sept. 18, 2012
Randal E. Bryant

All code available in:
    /afs/cs.cmu.edu/academic/class/15440-f12/code/class07
Remote Procedure Calls (Tanenbaum 4.1-4.2)

General vocabulary

Middleware: Protocol / software that lives just below Application
providing higher-level services.

Examples:
	HTTP (although now often used as transport protocol)
	LSP from project 1

Persistent vs. Transient communication

Persistent: Protocol holds onto all information until operation
completed.
	E.g., TCP, LSP

Transient: Protocol discards information if delivery fails
	E.g., UDP

Synchronous vs. Asynchronous

Synchronous: Sender blocks until operation completes

Asynchronous: Sender returns from operation immediately
	      E.g., Everything we've seen so far	   


Remote Procedure call

One way to implement the client/server model.

Idea: On client, appear to make procedure call, but operation actually
performed on server.

Natural way to express interaction:

client-->server: Do this for me (call)

Server does some work

server-->client: Here's my response (return)

How this works:

1. Client application calls function
2. Function is really a "stub" that packages function name & arguments
as message ("marshaling")
3. Send message to server
4. Server unpacks message, determines what function is being requested
and executes it.
5. Server marshals results back into message and sends it back to
client
6. Client stub unmarshals results and returns back to caller
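
The six steps above can be sketched in a few lines of Go.  This is
only an illustration, not the Go RPC package: the names request,
response, serverHandle, and addStub are made up for the sketch, JSON
is used as the marshaling format, and a direct function call stands
in for the network.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is the message built by the client stub: function name plus arguments.
type request struct {
	Func string
	Args []int
}

// response is the message the server sends back.
type response struct {
	Result int
	Err    string
}

// serverHandle plays the server (steps 4-5): unmarshal the request,
// work out which function is wanted, run it, and marshal the result.
func serverHandle(msg []byte) []byte {
	var req request
	var resp response
	if err := json.Unmarshal(msg, &req); err != nil {
		resp.Err = err.Error()
	} else {
		switch req.Func {
		case "add":
			for _, a := range req.Args {
				resp.Result += a
			}
		default:
			resp.Err = "unknown function " + req.Func
		}
	}
	out, _ := json.Marshal(resp)
	return out
}

// addStub looks like an ordinary function to the caller (steps 1-3 and 6).
func addStub(a, b int) (int, error) {
	msg, err := json.Marshal(request{Func: "add", Args: []int{a, b}})
	if err != nil {
		return 0, err
	}
	reply := serverHandle(msg) // stands in for "send message, wait for reply"
	var resp response
	if err := json.Unmarshal(reply, &resp); err != nil {
		return 0, err
	}
	if resp.Err != "" {
		return 0, fmt.Errorf("remote error: %s", resp.Err)
	}
	return resp.Result, nil
}

func main() {
	sum, err := addStub(2, 3)
	fmt.Println(sum, err) // prints: 5 <nil>
}
```

The caller of addStub never sees the message format.  That is the
whole point of the stub.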

Two versions:

Synchronous: Client must wait for all steps to complete.

Asynchronous: Stub returns after step 3.  Some other mechanism
provided to pick up result later.

History: Developed by Bruce Nelson & Andrew Birrell, ca. 1980.  At the
time, Nelson was PhD student at CMU, although much of the work done at
Xerox PARC.  Nelson went on to be very successful, including as CTO
at Cisco.  Died in 1999.  Recognized via Bruce Nelson chair (held by
Manuel Blum).


Let's think about an application of RPC.

How about a distributed password cracker?  Similar to that of Project
1, but designed to use RPC.

Three classes of agents:

1. Request client.  Submits cracking request to server.  Waits until
   server responds.

2. Worker.  Initially a client.  Sends join request to server.  Now it
   should reverse role & become a server.  Then it can receive
   requests from main server to attempt cracking over limited range.

3. Server.  Orchestrates whole thing.  Maintains collection of
   workers.  When receive request from client, split into smaller jobs
   over limited ranges.  Farm these out to workers.  When finds
   password, or exhausts complete range, respond to request client.

So, here's the RPC version:

Request-->Server-->Request: Classic synchronous RPC

Worker-->Server.  Synch RPC, but no return value.

"I'm a worker and I'm listening for you on host XXX, port YYY."

Server-->Worker.  Synch RPC?  No that would be a bad idea.  Better be Asynch.

Otherwise, it would have to block while worker does its work, which
misses the whole point of having many workers.
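
A rough sketch of the server's side of this, assuming the "password"
is just an integer in a range.  The names job, splitRange, tryRange,
and crack are invented here.  tryRange stands in for the remote call
to a worker.  Each call is issued in its own goroutine so the server
does not block on any single worker.

```go
package main

import "fmt"

// job is one slice [Lo, Hi) of the search range handed to a worker.
type job struct{ Lo, Hi int }

// splitRange divides [lo, hi) into at most n contiguous jobs (assumes n >= 1).
func splitRange(lo, hi, n int) []job {
	var jobs []job
	size := (hi - lo + n - 1) / n // ceiling division
	for start := lo; start < hi; start += size {
		end := start + size
		if end > hi {
			end = hi
		}
		jobs = append(jobs, job{start, end})
	}
	return jobs
}

// tryRange stands in for the RPC to a worker: search one job for the secret.
func tryRange(j job, secret int) (int, bool) {
	for g := j.Lo; g < j.Hi; g++ {
		if g == secret {
			return g, true
		}
	}
	return 0, false
}

type result struct {
	val int
	ok  bool
}

// crack issues every job without waiting, then collects the replies.
func crack(lo, hi, workers, secret int) (int, bool) {
	jobs := splitRange(lo, hi, workers)
	results := make(chan result, len(jobs))
	for _, j := range jobs {
		go func(j job) { // asynchronous: do not block on this worker
			v, ok := tryRange(j, secret)
			results <- result{v, ok}
		}(j)
	}
	found, ok := 0, false
	for range jobs {
		if r := <-results; r.ok {
			found, ok = r.val, true
		}
	}
	return found, ok
}

func main() {
	fmt.Println(crack(0, 1000, 4, 777)) // prints: 777 true
}
```

With a synchronous call in the loop, the four jobs would run one
after another and the extra workers would sit idle.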


Why is RPC different from a regular procedure call?

1. Running on different machines:
   A. Don't have shared memory space.
      Must ship argument data from client to server, and results back to client
      If need any data structures, must send them over, too.
   B. Possibly, two machines differ (e.g., byte ordering), and client & server
      may be written in different languages, or running different OS's

2. Need to communicate with each other
   A. May need to locate server.  Simple: Have IP address.  More elaborate: locate
      server based on service being requested.
   B. Need some kind of network connection between them.
   C. Need conventions for how to encode ("marshal") data, send,
      and decode (unmarshal) it at other end.

3. Things don't always  work right.
   A. Packets get dropped, duplicated, or mangled.
   B. Client or server may die, before or during call

4. Might be worried about security
   A. Need way to have client & server authenticate to each other
   B. Need way to keep communications secret
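
Small illustration of point 1B (byte ordering).  A common convention
is to fix one byte order on the wire, here big-endian, so both ends
agree regardless of their native order.  The helper names
encodeUint32 and decodeUint32 are made up for this sketch.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeUint32 writes v in big-endian ("network") byte order.
func encodeUint32(v uint32) []byte {
	buf := make([]byte, 4)
	binary.BigEndian.PutUint32(buf, v)
	return buf
}

// decodeUint32 reads a big-endian value back, whatever the host byte order.
func decodeUint32(buf []byte) uint32 {
	return binary.BigEndian.Uint32(buf)
}

func main() {
	wire := encodeUint32(0x0A0B0C0D)
	fmt.Printf("% x\n", wire)                      // prints: 0a 0b 0c 0d
	fmt.Println(decodeUint32(wire) == 0x0A0B0C0D) // prints: true
}
```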


Let's look at some details:

Marshaling:

Need convention for how to send objects.

Example = JSON.  Very general form.  Converts struct to named fields.
Applied recursively

Example applying it to our sequential buffer:

Data structures declared as:

// Linked list element
type BufEle struct {
	Val interface{}
	Next *BufEle
}

type Buf struct {
	Head *BufEle         // Oldest element
	tail *BufEle         // Most recently inserted element
	cnt int              // Number of elements in list
}

(Note that only exported fields, i.e., those with upper case names,
get marshaled.)

Add method to bufi:

func (bp *Buf) String() string {
	b, e := json.MarshalIndent(*bp, "", "  ")
	if e != nil {
		return e.Error()
	}
	return string(b)
}

Here are examples when inserting strings into the buffer:

Empty buffer
{
  "Head": null
}

After inserting "pig", "cat", "dog":
{
  "Head": {
    "Val": "pig",
    "Next": {
      "Val": "cat",
      "Next": {
        "Val": "dog",
        "Next": null
      }
    }
  }
}

Main point: There are standard ways to convert objects into byte
sequences.  These are "deep" encodings, meaning that they go all the
way into a structure.  Beware of trying to do this with circular
data structures!
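
To see the deep encoding (and what gets lost), here is a
self-contained round trip through JSON.  The type declarations match
the ones above.  Insert, roundTrip, and values are written just for
this sketch and are not necessarily how bufi implements them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Linked list element
type BufEle struct {
	Val  interface{}
	Next *BufEle
}

type Buf struct {
	Head *BufEle // Oldest element
	tail *BufEle // Most recently inserted element
	cnt  int     // Number of elements in list
}

// Insert appends v at the tail (hypothetical helper for this sketch).
func (bp *Buf) Insert(v interface{}) {
	e := &BufEle{Val: v}
	if bp.Head == nil {
		bp.Head = e
	} else {
		bp.tail.Next = e
	}
	bp.tail = e
	bp.cnt++
}

// roundTrip marshals the buffer to JSON and unmarshals it into a fresh Buf.
func roundTrip(bp *Buf) (*Buf, error) {
	data, err := json.Marshal(bp)
	if err != nil {
		return nil, err
	}
	var out Buf
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

// values walks the list from Head and collects the stored values.
func values(bp *Buf) []interface{} {
	var out []interface{}
	for e := bp.Head; e != nil; e = e.Next {
		out = append(out, e.Val)
	}
	return out
}

func main() {
	b := &Buf{}
	for _, s := range []string{"pig", "cat", "dog"} {
		b.Insert(s)
	}
	c, err := roundTrip(b)
	fmt.Println(values(c), c.cnt, err) // prints: [pig cat dog] 0 <nil>
}
```

The whole chain of elements survives, but the unexported fields tail
and cnt come back as zero values.  The receiver would have to rebuild
them.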

Other encoding methods:
gob: Used by Go RPC.
XML:


RPC Example.

Using Go RPC package.

In general see two styles of RPC implementation:

* Shallow integration.  Must use lots of library calls to set things
up:
	- How to format data
	- Registering which functions are available and how they are
	  invoked.

* Deep integration.
	- Data formatting done based on type declarations
	- (Almost) all public methods of object are registered.
 
Go is the latter.


Server side, write each operation as a function

func (s *servertype) Operate(args *argtype, reply *replytype) error

Function must decode arguments, perform operation, encode reply.

Returns nil if no error.

Then must register servertype.  All exported (uppercase names)
operations available.

Client side:

Synchronous call:

Invoke Call, with operation name (as string), and pointers for
arguments and reply. 

When Call returns, get result from reply.

Asynchronous call:

Invoke Go, with operation name and pointers for arguments and reply,
and channel for responding.

Function returns immediately.

If want to get result, then receive from channel.
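
A minimal, self-contained sketch of both call styles using net/rpc.
Arith, Args, and newClient are names invented for the sketch.  To
keep it runnable in one process, an in-memory net.Pipe stands in for
the TCP connection, and a private rpc.NewServer is used instead of
the default server.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// Args and Arith are hypothetical types used only in this sketch.
type Args struct{ A, B int }

type Arith struct{}

// Multiply has the shape net/rpc expects: exported method, two
// arguments (the second a pointer for the reply), error result.
func (t *Arith) Multiply(args *Args, reply *int) error {
	*reply = args.A * args.B
	return nil
}

// newClient connects a client to a fresh server over an in-memory pipe.
func newClient() (*rpc.Client, error) {
	srv := rpc.NewServer()
	if err := srv.Register(new(Arith)); err != nil {
		return nil, err
	}
	cliConn, srvConn := net.Pipe()
	go srv.ServeConn(srvConn) // serve this one connection until it closes
	return rpc.NewClient(cliConn), nil
}

func main() {
	client, err := newClient()
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// Synchronous: Call blocks until the reply has been filled in.
	var p1 int
	if err := client.Call("Arith.Multiply", &Args{6, 7}, &p1); err != nil {
		panic(err)
	}
	fmt.Println(p1) // prints: 42

	// Asynchronous: Go returns at once; receive from Done to get the result.
	var p2 int
	call := client.Go("Arith.Multiply", &Args{3, 4}, &p2, nil)
	// ... client is free to do other work here ...
	<-call.Done
	fmt.Println(p2, call.Error) // prints: 12 <nil>
}
```

Passing nil as the last argument to Go asks the package to allocate
the Done channel.  A caller can pass its own buffered channel
instead, e.g., to collect replies from many calls in one place.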

RPC Example: An RPC version of an asynchronous buffer

// For passing arbitrary values 
type Val struct {
	X interface{}		// Embed in struct.  Gob wants it this way.
}

// Server implementation
type SrvBuf struct {
	abuf *dserver.Buf	// Use one of our asynchronous buffers
				// since needs concurrent access
}

func NewSrvBuf() *SrvBuf {
	return &SrvBuf{dserver.NewBuf()}
}

// Example methods for server

// Note signature.  Pass in arguments + reply location
func (srv *SrvBuf) Insert(arg *Val, reply *Val) error {
	srv.abuf.Insert(arg.X)	    // Insert object of type interface{}
	*reply = nullVal()	    // Wrapper around nil
	return nil
}

func (srv *SrvBuf) Front(arg *Val, reply *Val) error {
	*reply = Val{srv.abuf.Front()}
	return nil   // This means it's OK
}

...

Here's the main incantation

func Serve(port int) {
	srv := NewSrvBuf()
	// Register takes object and makes its exported methods available
	rpc.Register(srv)
	// Use HTTP as communication protocol
	rpc.HandleHTTP()
	addr := fmt.Sprintf(":%d", port)
	l, e := net.Listen("tcp", addr)
	Checkfatal(e)
	// Set up HTTP server
	http.Serve(l, nil)
}

Client side

// Really don't need more than provided by RPC package
type SClient struct {
	client *rpc.Client
}

// Wrapper to access Call function
func (cli *SClient) Call(serviceMethod string, args interface{},
	reply interface{}) error {
	return cli.client.Call(serviceMethod, args, reply)
}

// Set up TCP client
func NewSClient(host string, port int) *SClient {
	hostport := fmt.Sprintf("%s:%d", host, port)
	client, e := rpc.DialHTTP("tcp", hostport)
	Checkfatal(e)
	return &SClient{client}
}

// Making RPC calls

func (cli *SClient) Insert(val interface{}) {
	v := Val{val}
	var rv Val
	e := cli.Call("SrvBuf.Insert", &v, &rv)
	if Checkreport(1, e) {
		fmt.Printf("Insert failure\n")
	}
}

func (cli *SClient) Remove() interface{} {
	av := nullVal()
	var rv Val
	e := cli.Call("SrvBuf.Remove", &av, &rv)
	if Checkreport(1, e) {
		fmt.Printf("Remove failure\n")
		return nil	// Return plain nil, not a Val wrapper
	}
	return rv.X
}

What about bigger data structures?

Suppose we want to return entire buffer contents.

Add to bufi:

// Return slice containing entire buffer contents
func (bp *Buf) Contents() []interface{} {
	result := make([]interface{}, bp.cnt)
	e := bp.Head
	for i := 0; i < bp.cnt; i++ {
		result[i] = e.Val
		e = e.Next
	}
	return result
}

(Also added field cnt to bufi.Buf, to keep count of number of elements)

Added to dserver:
func (bp *Buf) Contents() []interface{}

Add to RPC code:

1. Let's name this data type: 

type Islice []interface{}
var islice Islice

 
2. Let's let the server & client know about this type:

func NewSrvBuf() *SrvBuf {
	gob.Register(islice)
	return &SrvBuf{dserver.NewBuf()}
}

func NewSClient(host string, port int) *SClient {
	gob.Register(islice)
	hostport := fmt.Sprintf("%s:%d", host, port)
	client, e := rpc.DialHTTP("tcp", hostport)
	Checkfatal(e)
	return &SClient{client}
}

3. Let's implement the server function:

func (srv *SrvBuf) Contents(arg *Val, reply *Val) error {
	c := Islice(srv.abuf.Contents())
	*reply = Val{c}
	Vlogf(2, "Generated contents: %v\n", c)
	return nil
}

func (cli *SClient) Contents() Islice {
	av := nullVal()
	var rv Val
	e := cli.Call("SrvBuf.Contents", &av, &rv)
	if Checkreport(1, e) {
		fmt.Printf("Contents failure: %s\n", e.Error())
		return nil	// rv.X is unset; the type assertion below would panic
	}
	return rv.X.(Islice)
}
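
Why the gob.Register calls matter: gob sends an interface value
together with the name of its concrete type, so both ends must have
that type registered.  A self-contained sketch, with the RPC layer
left out.  roundTrip and the type Unregistered are made up for the
illustration.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Val and Islice follow the declarations used above.
type Val struct {
	X interface{}
}

type Islice []interface{}

// Unregistered is a hypothetical type we deliberately never register.
type Unregistered []interface{}

func init() {
	gob.Register(Islice{}) // make the concrete type known to gob
}

// roundTrip encodes v with gob into a buffer and decodes it again.
func roundTrip(v Val) (Val, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		return Val{}, err
	}
	var out Val
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(Val{Islice{"pig", "cat", 3}})
	fmt.Println(out.X, err) // prints: [pig cat 3] <nil>

	_, err = roundTrip(Val{Unregistered{"dog"}})
	fmt.Println(err != nil) // prints: true (type not registered)
}
```

Basic types such as string and int are known to gob already, which
is why the elements inside the slice need no registration of their
own.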

Other issues:

Dealing with failures:

* Network dropped/duplicated/mangled packets
* Client or server dies before or during operation

Typically, want operation required by RPC call to take place EXACTLY
once.

This is hard to guarantee.

Variants:

"At most once": Client sends request to server.  Hopefully gets response.

Fails if:
1. Request message doesn't get to server
2. Server fails
3. Response message doesn't get to client.

Note that with #1 & #2, call not executed.  With #3 call executed,
e.g., could cause state change by server. ("Withdraw $100 from my bank account")

"At least once": Client executes loop: { send request; wait for response }
until either get response or give up.

Same failure modes.  But overcomes cases where these failures are not persistent.

Danger: Server gets multiple requests and doesn't realize they are duplicates.
   (Think of the account withdrawal example)
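
The danger can be made concrete with a small simulation.  Everything
here is invented for the sketch: account, lossyLink, and
withdrawAtLeastOnce.  The link always delivers the request to the
server but loses the first reply, so the client's retry loop causes
a second withdrawal.

```go
package main

import (
	"errors"
	"fmt"
)

var errTimeout = errors.New("timed out waiting for response")

// account is the server-side state.
type account struct{ balance int }

func (a *account) withdraw(amount int) int {
	a.balance -= amount
	return a.balance
}

// lossyLink delivers every request but drops the first `drops` replies.
type lossyLink struct {
	acct  *account
	drops int
}

func (l *lossyLink) call(amount int) (int, error) {
	bal := l.acct.withdraw(amount) // server executes the request
	if l.drops > 0 {
		l.drops--
		return 0, errTimeout // reply lost on the way back
	}
	return bal, nil
}

// withdrawAtLeastOnce retries until it gets a response or gives up.
func withdrawAtLeastOnce(l *lossyLink, amount, maxTries int) (int, error) {
	for i := 0; i < maxTries; i++ {
		if bal, err := l.call(amount); err == nil {
			return bal, nil
		}
	}
	return 0, errTimeout
}

func main() {
	acct := &account{balance: 1000}
	link := &lossyLink{acct: acct, drops: 1}
	bal, err := withdrawAtLeastOnce(link, 100, 3)
	fmt.Println(bal, err) // prints: 800 <nil>  ($200 gone, not $100)
}
```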

Solution: Want to make operations "idempotent": Doing same operation
multiple times has same effect as doing it once.

Example mechanism: Use sequence numbers.

Requires maintaining per-client state at server.  (Imagine having 1M clients.)
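
A sketch of the sequence-number mechanism, under simplifying
assumptions: each client has at most one request outstanding, and its
sequence numbers start at 1 and only increase.  The types request,
session, and server are hypothetical.  The server remembers the last
sequence number and reply per client and replays the reply when it
sees a duplicate.

```go
package main

import "fmt"

// request carries the client's identity and a per-client sequence number.
type request struct {
	ClientID string
	Seq      int
	Amount   int
}

// session is the per-client state the server must keep.
type session struct {
	lastSeq   int
	lastReply int
}

type server struct {
	balance  int
	sessions map[string]*session
}

func newServer(balance int) *server {
	return &server{balance: balance, sessions: make(map[string]*session)}
}

// withdraw executes each (client, seq) at most once.
func (s *server) withdraw(req request) int {
	sess, ok := s.sessions[req.ClientID]
	if !ok {
		sess = &session{}
		s.sessions[req.ClientID] = sess
	}
	if req.Seq <= sess.lastSeq {
		return sess.lastReply // duplicate or stale: replay, do not re-execute
	}
	s.balance -= req.Amount
	sess.lastSeq, sess.lastReply = req.Seq, s.balance
	return s.balance
}

func main() {
	s := newServer(1000)
	r := request{ClientID: "alice", Seq: 1, Amount: 100}
	fmt.Println(s.withdraw(r)) // prints: 900
	fmt.Println(s.withdraw(r)) // prints: 900 (retry of same request)
	fmt.Println(s.withdraw(request{ClientID: "alice", Seq: 2, Amount: 50})) // prints: 850
}
```

The sessions map is the per-client state mentioned above.  With 1M
clients it holds 1M entries, and it must survive for as long as a
client might still retry.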


See example in Project 1 protocol.

LSP is like RPC, in that it serves as middleware between client and
server applications.  Deals with failed messages, clients, and
servers.  But provides message passing model between clients &
servers, rather than RPC.

* Each data message includes sequence number.
  - Can detect duplicate messages (either from network or from resending)

* Each data message acknowledged.
  - Sender knows that it's been received.

* Sender cannot send new message until previous one acknowledged
  - Prevents lost message in middle of data stream

* Periodic resend of most recent data + acknowledgment.
  - Compensate for dropped messages
  - Indication that machine at other end of connection still alive.
    Same effect as "heartbeat" messages 

* Detect failure at other end if no messages for K epochs
  - Independent of application-level activity.
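
The last bullet can be sketched as a counter of silent epochs.  This
is not the Project 1 code.  The type liveness and its methods are
invented here, and the sketch assumes an epoch timer calls onEpoch
while the receive path calls onMessage for any incoming message
(data, acknowledgment, or resend).

```go
package main

import "fmt"

// liveness tracks how many consecutive epochs have passed in silence.
type liveness struct {
	limit  int // K: epochs of silence before declaring the other end dead
	silent int // epochs since the last message arrived
}

// onMessage is called whenever anything arrives from the other end.
func (l *liveness) onMessage() {
	l.silent = 0
}

// onEpoch is called on each epoch tick.  It reports true once the
// other end has been silent for K epochs in a row.
func (l *liveness) onEpoch() bool {
	l.silent++
	return l.silent >= l.limit
}

func main() {
	l := &liveness{limit: 3}
	fmt.Println(l.onEpoch(), l.onEpoch()) // prints: false false
	l.onMessage()                         // heartbeat-like resend arrives
	fmt.Println(l.onEpoch(), l.onEpoch(), l.onEpoch()) // prints: false false true
}
```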


Where these mechanisms show up in RPC.

* Typically use TCP or HTTP.  Provides reliable transport level that
  eliminates most network problems.

* Typically have sequence numbers to avoid acting on duplicate requests
  (May need to persist across multiple TCP sessions.)

* Don't (by default) do a very good job detecting failed clients or servers.


Other RPC systems:

ONC RPC (a.k.a. Sun RPC).  Fairly basic.  Includes encoding standard
XDR + language for describing data formats.

Java RMI (remote method invocation).  Very elaborate.  Tries to make it look like
you can perform arbitrary methods on remote objects.

Thrift.  Developed at Facebook.  Now part of Apache Open Source.
Supports multiple data encodings & transport mechanisms.  Works across
multiple languages.

Avro.  Also Apache standard.  Created as part of Hadoop project.  Uses
JSON to define schemas.  Not as elaborate as Thrift.