Remote Procedure Calls (RPC)

Why RPC

Distributed client-server applications (file transfer, image processing, etc.) share common communication patterns: creating sockets, allocating buffers, copying data, managing protocol headers. These steps are repeatedly reimplemented across applications.

RPC captures these common steps into a system-level solution, providing a procedure call interface for inter-process communication. The client calls what looks like a local procedure; the RPC runtime handles all network communication transparently.

Benefits:

  • Higher-level interface — handles connection setup, requests, responses, acknowledgements, and error handling automatically

  • Hides cross-machine complexity — different machine types, network failures, and machine failures are transparent to the developer

  • Synchronous call semantics — the calling thread blocks until the procedure completes and returns results, identical to local procedure calls

RPC Requirements

  • Client/server model — server provides a complex service; clients issue requests without needing the same capabilities

  • Procedure call semantics — designed for procedural languages (C, Pascal, Fortran); calling thread blocks until result is returned

  • Type checking — wrong argument types produce errors; type information also helps the RPC runtime interpret byte streams (are they integers? arrays? images?)

  • Cross-machine data conversion — machines may differ in endianness, floating-point representation, or negative number encoding. The RPC system handles all necessary conversions, typically by agreeing on a single network representation

  • Transport protocol independence — RPC works over TCP, UDP, or shared memory IPC; also incorporates higher-level mechanisms like access control, authentication, and fault tolerance (e.g., retry on failure, contact server replicas)
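The data conversion requirement can be sketched in a few lines of C. This is a minimal illustration, assuming big endian as the agreed network representation; the helper names to_network_u32, from_network_u32, and host_is_little_endian are hypothetical and not part of any RPC library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value in buf in big-endian order,
 * whatever the host's native byte order is. */
void to_network_u32(uint32_t value, unsigned char *buf)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)(value >> 24);
    buf[1] = (unsigned char)(value >> 16);
    buf[2] = (unsigned char)(value >> 8);
    buf[3] = (unsigned char)value;
}

/* Hypothetical helper: rebuild the host value from big-endian bytes. */
uint32_t from_network_u32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Returns 1 on a little-endian host and 0 on a big-endian host. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    return *(unsigned char *)&probe == 1;
}
```

Because both helpers work on individual bytes, the same code runs unchanged on little-endian and big-endian hosts; neither side needs to know the other's native format.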

Structure of RPC

Example: a client calls k = add(i, j) where only the server has the add implementation.

  1. Client calls add(i, j) — execution jumps to the client stub (not the real implementation)

  2. The client stub marshals the procedure descriptor (add) and arguments (i, j) into a contiguous buffer

  3. The RPC runtime sends the buffer to the server (using the server’s IP/port obtained during binding)

  4. The server stub receives and parses the buffer — identifies the procedure, extracts and allocates arguments as local variables

  5. The server stub calls the actual implementation of add; the result is computed

  6. The result takes the reverse path: server stub → buffer → network → client RPC runtime → client stub extracts result → returns to client

  7. The calling client thread blocks for the entire duration, just as it would during a local procedure call
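The add(i, j) walkthrough above can be sketched as a single C file in which a direct function call stands in for the network. Everything here is an assumption made for illustration: the ADD_ID value, the 12-byte request layout, and the names server_stub and runtime_send are hypothetical, and byte-order conversion is skipped because both "sides" run on the same host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { ADD_ID = 1 };   /* hypothetical procedure descriptor */

/* The real implementation exists only on the "server" side. */
static int32_t add_impl(int32_t i, int32_t j) { return i + j; }

/* Server stub: parse the request, call the implementation, build the reply.
 * Returns the reply length, or 0 for a malformed or unknown request. */
static size_t server_stub(const unsigned char *req, size_t req_len,
                          unsigned char *reply, size_t reply_cap)
{
    int32_t proc, i, j, result;
    if (req_len != 3 * sizeof(int32_t) || reply_cap < sizeof(int32_t))
        return 0;
    memcpy(&proc, req, sizeof proc);
    if (proc != ADD_ID)
        return 0;
    memcpy(&i, req + 4, sizeof i);
    memcpy(&j, req + 8, sizeof j);
    result = add_impl(i, j);
    memcpy(reply, &result, sizeof result);
    return sizeof result;
}

/* Stand-in for the RPC runtime: a direct call instead of a network send. */
static size_t runtime_send(const unsigned char *req, size_t req_len,
                           unsigned char *reply, size_t reply_cap)
{
    return server_stub(req, req_len, reply, reply_cap);
}

/* Client stub: this is what the client reaches when it calls add(i, j). */
int32_t add(int32_t i, int32_t j)
{
    unsigned char req[12], reply[4];
    int32_t proc = ADD_ID, result;
    size_t n;

    memcpy(req, &proc, 4);        /* marshal [add_id | i | j] */
    memcpy(req + 4, &i, 4);
    memcpy(req + 8, &j, 4);
    n = runtime_send(req, sizeof req, reply, sizeof reply);  /* caller blocks here */
    assert(n == sizeof result);
    memcpy(&result, reply, sizeof result);                   /* unmarshal result */
    return result;
}
```

From the caller's point of view, k = add(2, 3) is an ordinary call; only the stub knows that the work happens elsewhere.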

Steps in RPC

  1. Registration — server announces its procedures, argument types, location (IP, port) so it can be discovered

  2. Binding — client discovers and connects to the appropriate server

  3. Client call — calls the RPC procedure; execution enters the client stub; client code blocks

  4. Marshalling — client stub serializes arguments (which may be at non-contiguous memory locations) into a contiguous buffer

  5. Send — RPC runtime transmits the buffer using the agreed-upon protocol (TCP, UDP, shared memory)

  6. Receive — server RPC runtime receives data, routes to the correct server stub, optionally performs access control

  7. Unmarshalling — server stub deserializes the byte stream, extracts arguments, creates local data structures

  8. Server execution — actual procedure is called with the extracted arguments; result (or error) is computed

  9. Return — result follows the reverse path back to the client
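The registration and receive steps can be sketched as a small dispatch table, assuming a simplified world in which every procedure takes one int and returns one int. The names register_proc and dispatch, the table size, and the two sample procedures are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROCS 8

typedef int (*proc_handler)(int arg);   /* simplified: one int in, one int out */

struct proc_entry { int proc_id; proc_handler handler; };

static struct proc_entry table[MAX_PROCS];
static size_t table_len = 0;

/* Registration: the server announces a procedure ID and its handler.
 * Returns 0 on success, -1 if the table is full. */
int register_proc(int proc_id, proc_handler handler)
{
    assert(handler != NULL);
    if (table_len == MAX_PROCS)
        return -1;
    table[table_len].proc_id = proc_id;
    table[table_len].handler = handler;
    table_len++;
    return 0;
}

/* Receive and execute: route the request to the matching handler.
 * Returns 0 and stores the result, or -1 if the ID was never registered. */
int dispatch(int proc_id, int arg, int *result)
{
    size_t k;
    assert(result != NULL);
    for (k = 0; k < table_len; k++) {
        if (table[k].proc_id == proc_id) {
            *result = table[k].handler(arg);
            return 0;
        }
    }
    return -1;
}

/* Two sample procedures a server might export. */
int square(int x) { return x * x; }
int negate(int x) { return -x; }
```

A request for an unregistered procedure ID yields an error instead of a call, which is one place where a runtime could also apply access control.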

Interface Definition Language (IDL)

Clients and servers may be independent processes written in different languages. An IDL standardizes how servers describe their exported interfaces (procedure names, argument types, result types, version numbers).

The IDL specification is used to:

  • Automate stub generation (marshalling/unmarshalling routines)

  • Enable service discovery — clients find compatible servers

  • Support incremental upgrades — version numbers let clients identify compatible server implementations without requiring simultaneous upgrades

Two approaches:

  • Language-agnostic — e.g., XDR (External Data Representation) used by SunRPC. Independent of any programming language.

  • Language-specific — e.g., Java RMI uses Java itself as the IDL. Convenient for Java developers but irrelevant for non-Java clients.

The IDL is used only for interface specification, not for implementing the service.

Marshalling and Unmarshalling

Marshalling serializes procedure arguments from arbitrary memory locations into a contiguous buffer for transmission. The buffer includes a procedure identifier followed by encoded arguments.

Example: for add(i, j), the buffer contains [add_id | i | j].

Example: for array_add(i, j[]), where j is an array, the buffer contains [array_add_id | i | len(j) | j[0] | j[1] | ...]. An array can be encoded either with a length prefix, as here, or with a terminator (as with null-terminated strings).

Unmarshalling is the reverse: parse the byte stream, extract the correct number of bytes per argument type, and allocate/initialize local data structures on the receiving side.

Marshalling/unmarshalling routines are auto-generated by an IDL compiler from the interface specification. They also handle encoding (e.g., endianness conversion). The developer simply links the generated code with client/server executables.
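To make the array_add layout concrete, here is a hand-written sketch of what generated marshalling and unmarshalling code might do. It assumes 32-bit fields sent in network byte order via the POSIX htonl/ntohl calls; the ARRAY_ADD_ID value and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */

enum { ARRAY_ADD_ID = 2 };   /* hypothetical procedure identifier */

static size_t put_u32(unsigned char *buf, size_t off, uint32_t v)
{
    uint32_t net = htonl(v);             /* host order -> network order */
    memcpy(buf + off, &net, sizeof net);
    return off + sizeof net;
}

static uint32_t get_u32(const unsigned char *buf, size_t off)
{
    uint32_t net;
    memcpy(&net, buf + off, sizeof net);
    return ntohl(net);                   /* network order -> host order */
}

/* Marshal as [array_add_id | i | len(j) | j[0] | j[1] | ...].
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t marshal_array_add(unsigned char *buf, size_t cap,
                         int32_t i, const int32_t *j, uint32_t len)
{
    size_t need = (3 + (size_t)len) * 4, off = 0;
    uint32_t k;
    assert(buf != NULL);
    if (cap < need)
        return 0;
    off = put_u32(buf, off, (uint32_t)ARRAY_ADD_ID);
    off = put_u32(buf, off, (uint32_t)i);
    off = put_u32(buf, off, len);
    for (k = 0; k < len; k++)
        off = put_u32(buf, off, (uint32_t)j[k]);
    return off;
}

/* Unmarshal: returns the element count stored in j_out, or -1 if the
 * buffer is malformed or the array does not fit into j_out. */
int unmarshal_array_add(const unsigned char *buf, size_t n,
                        int32_t *i_out, int32_t *j_out, uint32_t j_cap)
{
    uint32_t len, k;
    if (n < 12 || get_u32(buf, 0) != (uint32_t)ARRAY_ADD_ID)
        return -1;
    len = get_u32(buf, 8);
    if (len > j_cap || n != (3 + (size_t)len) * 4)
        return -1;
    *i_out = (int32_t)get_u32(buf, 4);
    for (k = 0; k < len; k++)
        j_out[k] = (int32_t)get_u32(buf, 12 + 4 * (size_t)k);
    return (int)len;
}
```

The receiver reads the length field before touching any element, which is what lets it validate the buffer size and allocate or bound its own storage.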

Binding and Registry

Binding is how a client discovers which server to connect to and obtains connection details (IP, port, protocol).

A registry (analogous to Yellow Pages) stores available services. Clients look up by service name and find matches based on version, protocol, and proximity.

Registry models:

  • Distributed: a global online registry (e.g., a hypothetical rpcregistry.com) with which any server can register

  • Per-machine: a local registry daemon on each server machine; clients must already know the machine address, and the registry provides the port number and protocol details
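A per-machine registry can be pictured as a small lookup table keyed by service name, version, and protocol. The sketch below is an in-memory illustration only; the struct layout, the function names, and the port numbers used in the usage note are hypothetical, and a real registry would be a separate daemon reached over the network.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct service_entry {
    const char *name;       /* service name that clients look up */
    int version;
    const char *protocol;   /* e.g. "tcp" or "udp" */
    int port;
};

#define MAX_SERVICES 16
static struct service_entry registry[MAX_SERVICES];
static size_t registry_len = 0;

/* Registration: called by a server at startup. The strings are not copied,
 * so they must outlive the registry. Returns 0, or -1 if the table is full. */
int registry_register(const char *name, int version,
                      const char *protocol, int port)
{
    assert(name != NULL && protocol != NULL);
    if (registry_len == MAX_SERVICES)
        return -1;
    registry[registry_len].name = name;
    registry[registry_len].version = version;
    registry[registry_len].protocol = protocol;
    registry[registry_len].port = port;
    registry_len++;
    return 0;
}

/* Binding lookup: returns the port of the matching service, or -1 when
 * no entry matches the requested name, version, and protocol. */
int registry_lookup(const char *name, int version, const char *protocol)
{
    size_t k;
    for (k = 0; k < registry_len; k++) {
        if (strcmp(registry[k].name, name) == 0 &&
            registry[k].version == version &&
            strcmp(registry[k].protocol, protocol) == 0)
            return registry[k].port;
    }
    return -1;
}
```

For example, after registry_register("square", 1, "tcp", 40001), a client asking for version 1 over TCP gets port 40001, while a request for version 3 gets -1 and the client knows no compatible server is registered.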

Handling Pointers

Pointers as RPC arguments are problematic since they reference addresses in the caller’s address space, meaningless to the remote server.

Two solutions:

  • Disallow pointers in RPC procedure arguments entirely

  • Serialize the pointed-to data — the marshalling code dereferences the pointer and copies the data structure into the buffer. On the server side, the data is reconstructed, and a local pointer to it is used as the argument
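The second solution can be sketched as follows, assuming an argument type that holds a pointer to an int array. The struct vec type and both function names are hypothetical, and byte-order conversion is left out to keep the focus on the pointer handling.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* data points into the caller's address space and means nothing remotely. */
struct vec { uint32_t len; int32_t *data; };

/* Marshal by value: copy the length and the pointed-to elements into the
 * buffer, never the pointer itself. Returns bytes written, 0 if too small. */
size_t marshal_vec(const struct vec *v, unsigned char *buf, size_t cap)
{
    size_t payload;
    assert(v != NULL && buf != NULL);
    payload = (size_t)v->len * sizeof(int32_t);
    if (cap < sizeof v->len + payload)
        return 0;
    memcpy(buf, &v->len, sizeof v->len);
    if (payload > 0)
        memcpy(buf + sizeof v->len, v->data, payload);
    return sizeof v->len + payload;
}

/* Unmarshal on the receiving side: allocate fresh storage and point
 * out->data at it, so the pointer is valid in the receiver's address space.
 * Returns 0 on success, -1 on error. The caller frees out->data. */
int unmarshal_vec(const unsigned char *buf, size_t n, struct vec *out)
{
    uint32_t len;
    size_t payload;
    if (n < sizeof len)
        return -1;
    memcpy(&len, buf, sizeof len);
    payload = (size_t)len * sizeof(int32_t);
    if (n != sizeof len + payload)
        return -1;
    out->data = malloc(payload > 0 ? payload : 1);
    if (out->data == NULL)
        return -1;
    if (payload > 0)
        memcpy(out->data, buf + sizeof len, payload);
    out->len = len;
    return 0;
}
```

After unmarshalling, the receiver's data pointer differs from the sender's, yet both refer to the same values, which is all the remote procedure needs.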

Partial Failures

When an RPC call hangs, the cause is ambiguous: the server may be overloaded, the request or the response may have been lost, the server may have crashed, or a network element may be down. Even with timeout/retry mechanisms, the RPC runtime cannot definitively determine the cause.

RPC systems therefore introduce a catch-all error/exception for RPC failures. Such an error may indicate a partial failure: the client cannot tell what succeeded and what failed, including whether the procedure ran on the server at all. A timeout on its own does not narrow the cause down; any of the failures listed above may apply.
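The ambiguity can be demonstrated with a simulated server that always executes the request but sometimes loses the reply. This is a sketch under stated assumptions: struct fake_server, fake_send, call_with_retry, and the CALL_OK/CALL_FAILED codes are all hypothetical, and a lost reply is modelled as an immediate error instead of a real timeout.

```c
#include <assert.h>
#include <stddef.h>

enum call_status { CALL_OK = 0, CALL_FAILED = 1 };  /* CALL_FAILED is the catch-all */

/* Simulated unreliable server: every request that arrives is executed,
 * but the reply is lost for the first replies_to_drop attempts. */
struct fake_server { int replies_to_drop; int executions; };

static int fake_send(struct fake_server *s, int arg, int *reply)
{
    s->executions++;                  /* the request arrived and ran */
    if (s->replies_to_drop > 0) {     /* ...but the reply never came back */
        s->replies_to_drop--;
        return -1;                    /* the client only sees a "timeout" */
    }
    *reply = arg * arg;
    return 0;
}

/* Client-side retry loop: gives up with the catch-all error once
 * max_attempts "timeouts" have been observed. */
enum call_status call_with_retry(struct fake_server *s, int arg,
                                 int max_attempts, int *result)
{
    int attempt;
    assert(s != NULL && result != NULL && max_attempts > 0);
    for (attempt = 0; attempt < max_attempts; attempt++) {
        if (fake_send(s, arg, result) == 0)
            return CALL_OK;
    }
    return CALL_FAILED;
}
```

With five dropped replies and three attempts, the client receives CALL_FAILED even though the server executed the procedure three times. From the client's side this is indistinguishable from a server that never received anything.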

SunRPC

Originally developed by Sun Microsystems (since acquired by Oracle) for NFS on UNIX; now widely available on other platforms.

Overview

Design choices:

  • Per-machine registry — server machine address assumed known; client contacts the local registry to find service details

  • Language-agnostic IDL — uses XDR for both interface specification and data encoding

  • Pointers allowed — pointed-to data structures are serialized

  • Error handling — internal retry on timeout (configurable count); returns meaningful errors where possible (server unavailable, version mismatch, protocol mismatch, generic timeout)

Client-server interaction uses procedure call semantics. Clients and servers may be on different machines or the same machine (RPC as higher-level IPC). SunRPC documentation is maintained by Oracle; Linux man pages available via man rpc.

TI-RPC (Transport Independent RPC) extends SunRPC to allow the transport protocol to be specified at runtime rather than compile time.

XDR Interface Definition

A .x file specifies:

  • Data types for arguments and results (e.g., square_in, square_out — structs containing an int)

  • Service name (e.g., SQUARE_PROG) — used by clients for binding

  • Procedures with ID numbers (used internally by the RPC runtime, not by programmers)

  • Version numbers — support multiple versions simultaneously for incremental upgrades

  • Service ID — numeric identifier; user-defined values must be in the allowed range (predefined values exist for services like NFS)
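As an illustration of how these pieces map to C, the block below shows an assumed square.x specification in a comment, followed by roughly the declarations a generated header would contain for it. The field names arg1 and res1, the procedure name SQUAREPROC, and the program number 0x31230000 are assumptions chosen for the example; is_user_defined_prog and check_square_ids are hypothetical helpers, not generated code.

```c
#include <assert.h>

/* Assumed square.x (field names and numbers are illustrative only):
 *
 *   struct square_in  { int arg1; };
 *   struct square_out { int res1; };
 *
 *   program SQUARE_PROG {
 *       version SQUARE_VERS {
 *           square_out SQUAREPROC(square_in) = 1;    procedure ID
 *       } = 1;                                       version number
 *   } = 0x31230000;                                  service ID
 */

/* Roughly what the generated header declares for the spec above. */
struct square_in { int arg1; };
typedef struct square_in square_in;

struct square_out { int res1; };
typedef struct square_out square_out;

#define SQUARE_PROG 0x31230000
#define SQUARE_VERS 1
#define SQUAREPROC  1

/* Program numbers 0x20000000 to 0x3fffffff are set aside for
 * user-defined services; predefined services use lower numbers. */
int is_user_defined_prog(unsigned long prog)
{
    return prog >= 0x20000000UL && prog <= 0x3fffffffUL;
}

/* Sanity check a developer might run once at startup. */
void check_square_ids(void)
{
    assert(is_user_defined_prog(SQUARE_PROG));
}
```

The range check returns 0 for NFS's predefined program number 100003, which is why a user-defined service must pick its ID from the user range.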

Compiling XDR

Compile with rpcgen -C square.x to generate:

  • square.h: header with language-specific (C) type definitions and function prototypes

  • square_clnt.c: client stub with wrapper function (e.g., squareproc_1)

  • square_svc.c: server stub with main() (registration, housekeeping) and per-service request parsing; includes prototype for the actual procedure (e.g., squareproc_1_svc) that the developer must implement

  • square_xdr.c: common marshalling/unmarshalling routines

The developer writes:

  • Server: implements the service procedure (e.g., squareproc_1_svc)

  • Client: calls the wrapper function (e.g., y = squareproc_1(&x, clnt_handle)), which looks like a regular procedure call; the generated wrapper takes a pointer to the argument plus the client handle and returns a pointer to the result

Thread safety: rpcgen -C generates code with statically allocated result structures (not thread-safe). Use rpcgen -C -M for multithreading-safe code (dynamically allocated results, different function signatures). Note: -M does not create a multithreaded server — on Linux, multithreaded servers must be built manually using the thread-safe routines.
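The difference between the two calling conventions can be sketched without the RPC headers. The types below are stand-ins: struct svc_req is left opaque, bool_t is approximated as int, and the argument/result structs assume single int fields named arg1 and res1. The server procedure is named squareproc_1_svc to match the client wrapper squareproc_1; the _mt suffix on the second function is hypothetical and exists only so that both shapes fit in one file.

```c
#include <assert.h>
#include <stddef.h>

struct svc_req;                 /* opaque stand-in for the real request type */
typedef int bool_t;             /* stand-in for the RPC library's bool_t */

typedef struct { int arg1; } square_in;    /* assumed field names */
typedef struct { int res1; } square_out;

/* Shape used without -M: the result lives in static storage, so two
 * threads running this at once would overwrite each other's result. */
square_out *squareproc_1_svc(square_in *argp, struct svc_req *rqstp)
{
    static square_out result;
    (void)rqstp;
    assert(argp != NULL);
    result.res1 = argp->arg1 * argp->arg1;
    return &result;
}

/* Shape used with -M: the caller supplies the result storage and the
 * function reports success through its return value. */
bool_t squareproc_1_svc_mt(square_in *argp, square_out *result,
                           struct svc_req *rqstp)
{
    (void)rqstp;
    assert(argp != NULL && result != NULL);
    result->res1 = argp->arg1 * argp->arg1;
    return 1;
}
```

Calling the first version twice returns the same pointer both times, so the second call silently replaces the first result. The second version avoids that by writing into caller-owned memory.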

Registry

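The registry daemon is called portmapper. It runs on every machine that hosts RPC servers and requires root/sudo to start (on recent Linux distributions the same role is typically filled by rpcbind):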

/sbin/portmap

Servers register with portmapper on startup (auto-generated in main()). Clients contact portmapper to get port numbers and protocol support.

Query registered services:

rpcinfo -p

Returns program ID, service name, version, protocol (TCP/UDP), and port number. The portmapper itself registers on port 111 for both TCP and UDP.

Binding

The client initiates binding with:

CLIENT *clnt_handle = clnt_create(hostname, SQUARE_PROG, SQUARE_VERS, "tcp");

  • hostname: server machine

  • SQUARE_PROG, SQUARE_VERS: auto-generated #define values from .h file

  • "tcp": transport protocol

The returned CLIENT handle is included in every subsequent RPC call and tracks per-client state (RPC status, errors, authentication).

XDR Data Types and Encoding

Data Types

Default types: char, byte, int, float (like C). Additional types:

  • const → compiles to #define

  • hyper → 64-bit integer

  • quadruple → 128-bit float

  • opaque → uninterpreted binary data (e.g., images as opaque arrays)

Fixed-length arrays: int data[12] — exact size known; RPC runtime allocates accordingly.

Variable-length arrays: int data<12>. The angle brackets denote the maximum length. Compiles to a struct with a length field and a pointer field named after the declaration: here data_len (actual size) and data_val (pointer to the data). Sender sets the length and the pointer; receiver reads the length first, allocates memory, then reads the elements.

Strings: string name<128> compiles to char * (null-terminated in memory). For transmission, encoded as length + data (like other variable-length types), without the null terminator.

Memory example: a variable-length array of max 5 ints, when full, requires 28 bytes in memory on a 32-bit machine: 4 (len) + 4 (val pointer) + 20 (5 × 4-byte ints).
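The memory arithmetic can be checked with a short sketch. The struct mirrors the length-plus-pointer shape of a variable-length array declared as int data<5>; the helper names int_array_set and varlen_array_bytes are hypothetical, and the byte count deliberately ignores any padding a compiler may add inside the struct.

```c
#include <assert.h>
#include <stddef.h>

#define DATA_MAX 5u   /* the bound from the declaration int data<5> */

/* C shape of a variable-length int array: actual length plus pointer. */
struct int_array {
    unsigned int data_len;   /* number of elements actually present */
    int *data_val;           /* points at the elements */
};

/* Sender side: set the length and the pointer. The IDL bound is a
 * maximum, so any n up to DATA_MAX is acceptable. */
void int_array_set(struct int_array *a, int *storage, unsigned int n)
{
    assert(a != NULL && n <= DATA_MAX);
    a->data_len = n;
    a->data_val = storage;
}

/* Bytes needed in memory: length field + pointer + the elements. */
size_t varlen_array_bytes(size_t len_field, size_t ptr_size,
                          size_t elem_size, size_t count)
{
    return len_field + ptr_size + elem_size * count;
}
```

With 4-byte pointers, varlen_array_bytes(4, 4, 4, 5) gives the 28 bytes from the example; with 8-byte pointers, as on a typical 64-bit machine, the same array needs 32 bytes before padding.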

Encoding Rules

XDR defines both syntax (IDL) and binary encoding (wire format):

  • All data types encoded in multiples of 4 bytes (padding added as needed) for alignment

  • Big endian is the transmission standard — both endpoints convert to/from big endian regardless of native format

  • Two’s complement for integers; IEEE format for floating point

Encoding example: string "hello" — 6 bytes in memory (5 chars + null). On the wire: 12 bytes — 4 bytes for length (5), 5 bytes for characters, 3 bytes padding.

Encoding example: variable-length array of 5 ints — on the wire: 24 bytes — 4 bytes for length (5), then 5 × 4-byte integers. No additional padding needed.

What goes on the wire: RPC header (procedure ID, version, request ID for dedup) + encoded arguments/results + transport header (protocol, destination address).
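The two encoding examples can be reproduced with a hand-written encoder that follows the rules above: big-endian 4-byte length, then the data, then zero padding up to a multiple of 4. This is a sketch and not the XDR library itself; the function names pad_to_4, wire_encode_string, and wire_encode_int_array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write v at buf+off as 4 big-endian bytes; returns the new offset. */
static size_t put_be32(unsigned char *buf, size_t off, uint32_t v)
{
    buf[off]     = (unsigned char)(v >> 24);
    buf[off + 1] = (unsigned char)(v >> 16);
    buf[off + 2] = (unsigned char)(v >> 8);
    buf[off + 3] = (unsigned char)v;
    return off + 4;
}

/* Round n up to the next multiple of 4. */
size_t pad_to_4(size_t n) { return (n + 3) & ~(size_t)3; }

/* String: 4-byte length, the characters without the null terminator,
 * then zero padding. Returns bytes written, or 0 if buf is too small. */
size_t wire_encode_string(unsigned char *buf, size_t cap, const char *s)
{
    size_t len, total, off;
    assert(buf != NULL && s != NULL);
    len = strlen(s);
    total = 4 + pad_to_4(len);
    if (cap < total)
        return 0;
    off = put_be32(buf, 0, (uint32_t)len);
    memcpy(buf + off, s, len);
    memset(buf + off + len, 0, total - off - len);
    return total;
}

/* Variable-length int array: 4-byte count, then each int in 4 bytes.
 * Returns bytes written, or 0 if buf is too small. */
size_t wire_encode_int_array(unsigned char *buf, size_t cap,
                             const int32_t *v, uint32_t n)
{
    size_t total = 4 + 4 * (size_t)n, off;
    uint32_t k;
    assert(buf != NULL);
    if (cap < total)
        return 0;
    off = put_be32(buf, 0, n);
    for (k = 0; k < n; k++)
        off = put_be32(buf, off, (uint32_t)v[k]);
    return off;
}
```

Encoding "hello" yields 12 bytes (4 + 5 + 3 padding) and encoding five ints yields 24 bytes (4 + 20), matching the figures above.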

XDR Routines

The XDR compiler generates:

  • Marshalling/unmarshalling routines for all data types (in square_xdr.c)

  • Cleanup routines used with xdr_free (a library call that takes the generated routine for the type) to deallocate argument/result memory after servicing an RPC request

  • A user-defined _freeresult procedure (e.g., square_prog_1_freeresult) specifying which data structures to free; called automatically by the RPC runtime after returning results

Java RMI

Java Remote Method Invocation (RMI) is Sun’s RPC equivalent for the Java ecosystem. Since Java is object-oriented, inter-process communication uses remote method invocations on objects rather than procedure calls.

Architecture mirrors RPC: client stubs and server skeletons. Since all processes run in a JVM, the IDL is Java itself (language-specific).

The runtime separates into:

  • Transport layer — TCP, UDP, or shared memory

  • Remote reference layer — captures reference semantics: unicast (single server), broadcast (multiple servers, return on first/all responses), or custom semantics

Developers specify desired reference semantics; the system handles the rest.