Internals of gRPC architecture

sudan
17 min read · Apr 16, 2023


In this article, we will dive deep into the internals of gRPC architecture. We will discuss how gRPC and Protocol Buffers work together internally to provide a framework for Remote Procedure Calls (RPC).

gRPC

Before understanding the internals of gRPC, let’s discuss the OSI model, the TCP 3-way handshake, and the HTTP/2 protocol, which serve as the building blocks for the subsequent discussions.

OSI Model


OSI stands for Open Systems Interconnection. The OSI model is a conceptual model that describes how two hosts in a network communicate with each other using standard protocols.

The model includes 7 layers with each layer catering to its own core functionality.

  • Application Layer (L7): This layer is closest to the user. It receives user input in the form of data and prepares it for transmission. HTTP is the most popular protocol in the Application Layer.
  • Presentation Layer (L6): This layer is responsible for protocol conversions, data formatting, translation, and compression of data.
  • Session Layer (L5): This layer is responsible for establishing, maintaining, and terminating sessions between applications.
  • Transport Layer (L4): This layer receives data from L5 and breaks it into smaller chunks called Segments. Each Segment is enriched with a source and destination port. This layer provides End-to-End delivery: it delivers Segments from a source process running on a particular port to a destination process running on a particular port. TCP and UDP are the prominent protocols in this layer.
  • Network Layer (L3): This layer receives Segments from L4 and creates Packets. Each Packet is enriched with a source and destination IP address. This layer provides Host-to-Host delivery: it delivers Packets from the source host to the destination host.
  • Data Link Layer (L2): This layer receives Packets from L3 and creates Frames. Each Frame is enriched with a source and destination MAC address. This layer provides Hop-to-Hop delivery: it delivers Frames from one device to another within the same network.
  • Physical Layer (L1): This layer receives Frames in the form of bits and converts them into analog or digital signals, propagating them through channels such as coaxial cable, fiber optic cable, DSL cable, etc.

On the sender side, the data moves from the L7 layer to the L1 layer, and the resultant signals are propagated through one of the channels to the destination. On the receiver side, the exact reverse happens.

Transmission Control Protocol (TCP)

TCP is a transport layer protocol that provides an abstraction of a reliable network over an unreliable channel. A TCP connection starts with a 3-way handshake before any data exchange between the client and the server can happen.

TCP 3-way handshake
  • The client picks a random initial sequence number x and sends a SYN packet carrying it, along with additional metadata.
  • The server receives the SYN packet, acknowledges it with x + 1, picks its own random sequence number y, and sends a SYN ACK packet back to the client with additional metadata.
  • The client receives the SYN ACK packet and completes the handshake by sending an ACK packet with sequence number x + 1 and acknowledgment number y + 1.
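
The three steps above can be sketched as plain bookkeeping on sequence numbers. This is only an illustration of the numbers exchanged, not a real TCP implementation; the function name and the tuple layout (flags, sequence number, acknowledgment number) are assumptions made for this example.

```python
import random

SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(x=None, y=None):
    """Return the (flags, seq, ack) triples exchanged during the handshake."""
    # x and y are the initial sequence numbers picked by client and server.
    x = random.randrange(SEQ_SPACE) if x is None else x
    y = random.randrange(SEQ_SPACE) if y is None else y
    syn = ("SYN", x, None)                                   # client -> server
    syn_ack = ("SYN-ACK", y, (x + 1) % SEQ_SPACE)            # server -> client
    ack = ("ACK", (x + 1) % SEQ_SPACE, (y + 1) % SEQ_SPACE)  # client -> server
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
```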

Since the 3-way handshake costs a full network round trip before any data can be exchanged, the client and server establish the connection once and reuse it for subsequent data transfer requests.

This is the reason why it is recommended to implement connection pooling for remote network calls.

HTTP/2

HTTP/1.1 made persistent connections the default, so a single TCP connection can be reused across multiple HTTP requests.

HTTP/1.1 also introduced request pipelining, where a client can send multiple requests on the same TCP connection without waiting for each response.

However, it did not allow multiple responses to be multiplexed on the same connection, forcing responses to be returned sequentially in request order. A slow response therefore delays every response queued behind it. This phenomenon is called Head-of-Line (HoL) blocking.

HTTP/2 is the major revision of the HTTP protocol after HTTP/1.1. It provides features such as the Binary Framing Layer, Multiplexing, Header Compression, and Server Push.

Binary framing layer

HTTP/2 retained the semantics of HTTP/1.1 on HTTP methods, status codes, and headers but introduced a new Binary Framing Layer that dictates how the message is encapsulated and transmitted between the client and the server.

The following are the major components of this layer:

  • Stream: A Stream is a virtual channel within a TCP connection that allows the bi-directional flow of bytes between the client and the server. Each stream has a unique identifier, which the receiver uses to group the frames that belong to the same stream.
  • Frame: A Frame is the smallest communication unit in HTTP/2. Each frame carries a header with its length, type, flags, and stream identifier, followed by an optional payload. There are different types of frames, which are covered in the next section.
  • Message: A message maps to a logical Request or Response. Each message is broken down into multiple frames.
TCP Connection in an HTTP/2 Protocol

Following are the different types of supported frames:

  • DATA: Frame that carries the actual data.
  • HEADERS: Frame that carries headers containing metadata.
  • PRIORITY: Frame that carries priority information of the resource.
  • SETTINGS: Frame that configures how two endpoints should communicate.
  • RST_STREAM: Frame that signals abnormal termination of the stream.
  • PING: Frame that is used to measure Round Trip Time (RTT) and acts as a health check.
  • GOAWAY: Frame that informs the peer to stop creating streams for the connection.
  • WINDOW_UPDATE: Frame that implements flow control per stream and connection.
  • PUSH_PROMISE: Frame that notifies the peer in advance of the stream the sender intends to initiate.
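
To make the frame layout concrete, here is a minimal sketch of packing and unpacking the 9-byte HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, one reserved bit, and a 31-bit stream identifier). The helper names `pack_frame` and `unpack_frame` are made up for this example; real applications would rely on an HTTP/2 library rather than hand-rolled framing.

```python
import struct

# Frame type codes as assigned by the HTTP/2 specification.
FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE",
}


def pack_frame(frame_type, flags, stream_id, payload=b""):
    """Prefix the payload with the 9-byte frame header."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16, length & 0xFFFF,  # 24-bit payload length
        frame_type,                     # 8-bit type
        flags,                          # 8-bit flags
        stream_id & 0x7FFFFFFF,         # reserved bit cleared + 31-bit stream id
    )
    return header + payload


def unpack_frame(data):
    """Parse one frame from the start of data."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    length = (hi << 16) | lo
    return {
        "type": FRAME_TYPES.get(frame_type, "UNKNOWN"),
        "flags": flags,
        "stream_id": stream_id & 0x7FFFFFFF,
        "payload": data[9:9 + length],
    }


frame = pack_frame(0x0, 0x1, 1, b"hello")  # DATA frame on stream 1, END_STREAM set
print(unpack_frame(frame))
```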

Multiplexing

HTTP/2 enabled full request and response multiplexing in a single TCP connection through its Binary Framing Layer.

Within a single TCP connection, multiple logical channels called Streams are created, one for each HTTP request and response pair. Streams facilitate the bi-directional flow of messages between the client and the server.

Once the Stream is created, every HTTP request or response (a Message) is broken down into multiple Frames and transported within that Stream. Frames from different Streams can be interleaved on the wire and regrouped by stream identifier on the other side, which is how multiple HTTP requests and responses are multiplexed within a single TCP connection.
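
The idea can be illustrated with a toy model in which a frame is just a (stream id, chunk) pair. The function names and the round-robin interleaving are assumptions for this sketch; a real HTTP/2 implementation schedules frames according to flow control and priorities.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, message, frame_size):
    """Break one message into (stream_id, chunk) frames."""
    return [(stream_id, message[i:i + frame_size])
            for i in range(0, len(message), frame_size)]


def interleave(*streams):
    """Send frames of several streams round-robin over one connection."""
    return [frame for group in zip_longest(*streams)
            for frame in group if frame is not None]


def reassemble(frames):
    """Regroup frames by stream id on the receiving side."""
    messages = defaultdict(bytes)
    for stream_id, chunk in frames:
        messages[stream_id] += chunk
    return dict(messages)


# Client-initiated streams use odd identifiers.
wire = interleave(
    split_into_frames(1, b"GET /users", 4),
    split_into_frames(3, b"GET /orders", 4),
)
print(wire)
print(reassemble(wire))
```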

Header Compression

HTTP/1.1 transfers headers between the client and the server in plain-text format. Additionally, it transfers a lot of commonly used headers in every request and response resulting in wasted bandwidth.

HTTP/2 addressed this problem by using a header compression algorithm called HPACK.

HPACK uses a dictionary data structure at both the client and the server to avoid transmission of redundant headers and compress the headers sent over the wire.

HPACK replaces entire key-value pairs with lightweight encoded indexes from the dictionary data structure and transmits the encoded index instead of complete key-value pairs, thus saving the network bandwidth. For a detailed explanation of HPACK, refer to this Link.
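
The following toy table shows the indexing idea only. It is a deliberate simplification and not real HPACK: actual HPACK also Huffman-codes literals, bounds the dynamic table size, and inserts new dynamic entries at the front of the table starting at index 62. Only the first eight entries of the static table are included, and the class name `ToyHeaderTable` is invented for this example.

```python
# First eight entries of the HPACK static table (indices 1 to 8).
STATIC_TABLE = [
    (":authority", ""), (":method", "GET"), (":method", "POST"),
    (":path", "/"), (":path", "/index.html"),
    (":scheme", "http"), (":scheme", "https"), (":status", "200"),
]


class ToyHeaderTable:
    """Simplified shared dictionary; both peers keep one in sync."""

    def __init__(self):
        self.entries = list(STATIC_TABLE)

    def encode(self, headers):
        encoded = []
        for header in headers:
            if header in self.entries:
                # Known header: send only its 1-based index.
                encoded.append(self.entries.index(header) + 1)
            else:
                # Unknown header: send it literally and remember it.
                encoded.append(header)
                self.entries.append(header)
        return encoded

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item - 1])
            else:
                headers.append(item)
                self.entries.append(item)
        return headers


client, server = ToyHeaderTable(), ToyHeaderTable()
request = [(":method", "POST"), (":scheme", "https"), ("user-agent", "demo/1.0")]
first = client.encode(request)   # literal user-agent on first use
second = client.encode(request)  # indexes only on the second request
print(first, second)
print(server.decode(first) == request, server.decode(second) == request)
```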

Server Push

HTTP/2 allows the server to push content to the client before the client requests it.

The server sends a PUSH_PROMISE frame, which contains only the HTTP headers of the promised request and signals the server’s intent to push a resource. The client can decline an individual pushed stream by resetting it with RST_STREAM, or disable push entirely through SETTINGS. If the push is not declined, the server sends the resource to the client in DATA frames on the promised stream. Thus, HTTP/2 lets the server send data that the client has not yet asked for.

gRPC Introduction

RPC (Remote Procedure Call) is a communication protocol wherein a process can request information from another process, typically residing on a different host, without dealing with the underlying network details.

The invocation of remote methods on the server resembles a local function call.

gRPC is an open-source, high-performance, inter-process communication RPC framework built on top of HTTP/2 by Google. It uses Protocol Buffers as the message interchange format which is a platform-neutral, extensible framework for serializing and deserializing structured data.

In the following section, we will discuss the important terminologies and components of gRPC.

Interface Definition Language

Building a gRPC application starts with defining a Service Interface.

The Service Interface Definition dictates the remote methods exposed by the server along with the Request and Response contracts for each method.

The language that is specified in the Service Interface Definition is called Interface Definition Language(IDL). gRPC uses Protocol Buffers as the IDL.

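syntax = "proto3";

// A service is a collection of remote methods.
service HealthCheckService {
  // Unary RPC: one PingRequest in, one PingResponse out.
  rpc Ping (PingRequest) returns (PingResponse) {}
}

message PingRequest {
  string name = 1;  // field number 1
}

message PingResponse {
  string message = 1;  // field number 1
}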

Above is an example of an IDL that defines a Service named HealthCheckService, which is a collection of remote methods.

The service defines a single method Ping which takes PingRequest as Request and returns PingResponse as Response.

PingRequest and PingResponse are called Messages. They are the data structures exchanged between the client and the server, and each contains a series of name-value pairs called fields.

Each field is associated with a unique number that identifies it in the binary format. In the above example,

  • PingRequest includes a field name of type string with a unique number 1 .
  • PingResponse includes a field message of type string with a unique number 1 .

Code Generator

Once the IDL is defined, the next step is to use a language-specific Protocol Buffer compiler to generate appropriate data structures for Messages in a specific programming language.

It provides a mechanism for serializing, deserializing, populating, and retrieving those data structures in that specific programming language.

For example, if we use the Protocol Buffer compiler for Java, PingRequest and PingResponse are converted into Java classes with suitable getters and setters, along with mechanisms for serializing and deserializing them in Java.

Since Service Definition Interface is an extension of Protocol Buffer, a special gRPC plugin generates client and server-side code.

The server-side code is called the server skeleton which provides abstract functions for the remote methods defined in IDL which can be overridden with our custom business implementation.

The client-side code is called the client stub which provides abstractions for the same remote methods defined in IDL simplifying the invocation of remote methods implemented in the server.

Communication Patterns

gRPC provides four communication patterns:

  • Unary RPC, where the client sends a single Request to the remote server and waits for a single Response from the server.
  • Server Side Streaming RPC, where the client sends a single Message to the server and the server returns a sequence of Messages.
  • Client Side Streaming RPC, where the client sends a sequence of Messages and the server returns a single Response.
  • Bidirectional Streaming RPC, where both the client and the server can send a sequence of Messages in either direction, but the client has to initiate the call.
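
As an in-process analogy, the four shapes can be mimicked with plain Python functions and generators, where an iterator stands in for a stream of Messages. This is not gRPC code; the function names and the string payloads are invented purely to show which side sends one Message and which side sends many.

```python
def unary(request):
    """One request in, one response out."""
    return f"pong: {request}"


def server_streaming(request, count=3):
    """One request in, a stream of responses out."""
    for i in range(count):
        yield f"{request} #{i}"


def client_streaming(requests):
    """A stream of requests in, one response out."""
    return f"received {sum(1 for _ in requests)} messages"


def bidirectional_streaming(requests):
    """A stream in and a stream out; here each request is answered as it arrives."""
    for request in requests:
        yield request.upper()


print(unary("hi"))
print(list(server_streaming("tick")))
print(client_streaming(iter(["a", "b"])))
print(list(bidirectional_streaming(["a", "b"])))
```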

Interceptors

gRPC lets developers intercept the execution of remote methods by injecting common logic such as logging, authentication, etc… using Interceptors.

These interceptors can be applied on either the client side or the server side. Additionally, these interceptors can be applied before the invocation or after the invocation.

Metadata is auxiliary information about a particular gRPC call, structured as a list of key-value pairs. Interceptors rely on it heavily.
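
Conceptually, an interceptor wraps a handler and runs logic before and after it. The sketch below shows that wrapping pattern with plain Python closures; it does not use the gRPC library's interceptor API, and the handler signature `(request, metadata)`, the `authorization` key, and the function names are assumptions for illustration.

```python
def logging_interceptor(handler, log):
    """Record the call before and after the wrapped handler runs."""
    def wrapped(request, metadata):
        log.append(f"before: {request}")
        response = handler(request, metadata)
        log.append(f"after: {response}")
        return response
    return wrapped


def auth_interceptor(handler, expected_token):
    """Reject the call unless the metadata carries the expected token."""
    def wrapped(request, metadata):
        if metadata.get("authorization") != expected_token:
            raise PermissionError("unauthenticated")
        return handler(request, metadata)
    return wrapped


def ping(request, metadata):
    """The business logic being intercepted."""
    return f"pong: {request}"


log = []
handler = logging_interceptor(auth_interceptor(ping, "secret"), log)
print(handler("hi", {"authorization": "secret"}))
print(log)
```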

gRPC Internals

The previous section discussed the basics of gRPC and commonly used terminologies. This section covers the various stages involved in remote communication between a client and a server.

Stages of a gRPC communication

Client Stub Invocation

The gRPC plugin bundled with the language-specific Protocol Buffer compiler takes an IDL and generates language-specific client abstractions for the remote methods defined in the IDL. These abstractions are called client stubs and simplify remote method invocations on the server.

The client application invokes the remote method through the client stub, and the invocation resembles a local function call.

Message Encoding

Once the client hands over the control to the client stub, the client stub is responsible for converting the Message into a byte array. This conversion is called encoding/marshaling.

Each remote method in the service definition involves two data structures: a Request Message sent from the client to the server and a Response Message sent from the server to the client.

Each Message contains a series of name-value pairs called fields, with each field having a unique number that identifies it in the binary format.

Composition of Message

Message Composition

A Message is composed of a series of tag and value pairs, one for each field that is set. The wire format has no end-of-message marker: the parser learns where the Message ends from the length supplied by the enclosing frame or parent message.

Each tag is composed of two fields: Field Index and Wire Type.

  • The Field Index is the unique number assigned to each field of the message in the IDL.
  • The Wire Type is an integer that tells the parser how the value is laid out on the wire and therefore how many bytes to read. Every scalar field data type in the IDL maps to exactly one Wire Type, but several field data types share the same Wire Type.

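The following list shows how field data types map to wire types:

Field Data Type and Wire Type mapping
  • Wire Type 0 (varint): int32, int64, uint32, uint64, sint32, sint64, bool, enum.
  • Wire Type 1 (64-bit): fixed64, sfixed64, double.
  • Wire Type 2 (length-delimited): string, bytes, embedded messages, packed repeated fields.
  • Wire Type 5 (32-bit): fixed32, sfixed32, float.

Wire Types 3 and 4 marked the start and end of groups and are deprecated.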

Once the Field Index and Wire Type are known, the tag value is derived using the following formula:

TagValue = (FieldIndex << 3) | WireType

The formula places the Wire Type in the lowest 3 bits of the tag and the Field Index in the remaining higher bits. The resulting tag is itself written as a varint.
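
As a quick worked example of the tag formula, the sketch below computes the tag for the `name` field of PingRequest (Field Index 1, string, so Wire Type 2) and splits it back apart. The helper names are invented for this example.

```python
WIRE_VARINT, WIRE_I64, WIRE_LEN, WIRE_I32 = 0, 1, 2, 5


def make_tag(field_index, wire_type):
    """Combine Field Index and Wire Type into a tag value."""
    return (field_index << 3) | wire_type


def split_tag(tag):
    """Recover (Field Index, Wire Type) from a tag value."""
    return tag >> 3, tag & 0b111


tag = make_tag(1, WIRE_LEN)
print(hex(tag))        # 0xa
print(split_tag(tag))  # (1, 2)
```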

Encoding algorithms

Once the tag value is determined for every field in the Message, the next step is to encode the value for each field in the Message.

Protocol Buffers support different encoding mechanisms for different field data types. The field types are categorized into different groups and each group uses a different encoding mechanism.

Integer data types

Varints

Varints, short for variable-length integers, encode integers in a variable number of bytes. The number of bytes is not fixed and depends on the magnitude of the value: small values take fewer bytes.

In the varint encoding scheme, the Most Significant Bit (MSB) of each byte is a continuation flag: if it is set, more bytes follow. The remaining 7 bits of each byte carry the value, with the least significant group of 7 bits sent first. Negative int32 and int64 values are sign-extended to a 64-bit 2’s complement number, so they always occupy 10 bytes.

The data types which use Varints are int32, int64, uint32, uint64, bool, and enum.
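
A minimal varint encoder and decoder, written here as an illustration rather than taken from any Protocol Buffers library, could look like the following. Negative inputs are treated as 64-bit 2's complement values, which is why they grow to 10 bytes.

```python
def encode_varint(value):
    """Encode an integer as a varint, 7 bits per byte, low group first."""
    if value < 0:
        value += 1 << 64  # reinterpret as 64-bit 2's complement
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # MSB set: more bytes follow
        else:
            out.append(byte)         # MSB clear: last byte
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode one varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


print(encode_varint(150).hex())      # 9601
print(decode_varint(b"\x96\x01"))    # (150, 2)
print(len(encode_varint(-1)))        # 10
```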

ZigZag encoding

The signed integer types sint32 and sint64, which represent positive and negative integers, use the ZigZag encoding mechanism.

In the ZigZag encoding scheme, signed integers are mapped to unsigned integers by zigzagging back and forth between non-negative and negative values.

ZigZag encoding

Negative original values are mapped to odd positive values and non-negative original values are mapped to even values, as shown in the above diagram: 0 becomes 0, -1 becomes 1, 1 becomes 2, -2 becomes 3, and so on.

After this mapping, the result is a non-negative number irrespective of the sign of the original value, and that number is then varint encoded.

sint32 and sint64 are recommended for fields that often hold negative numbers, because plain varint encoding of a negative number needs 10 bytes regardless of its magnitude.

ZigZag encoding maps small negative values to small positive values, thereby reducing the number of bytes on the wire.
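
The ZigZag mapping itself is a one-line bit trick. The sketch below is an illustrative Python version; the function names and the `bits` parameter (64 for sint64, 32 for sint32) are choices made for this example.

```python
def zigzag_encode(n, bits=64):
    """Map a signed integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    # n >> (bits - 1) is 0 for non-negative n and -1 (all ones) for negative n.
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)


def zigzag_decode(z):
    """Invert the mapping."""
    return (z >> 1) ^ -(z & 1)


for n in (0, -1, 1, -2, 2):
    print(n, "->", zigzag_encode(n))
```

With this mapping, -1 becomes 1, which fits in a single varint byte instead of ten.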

Non-varint encoding

The non-varint encoding mechanism allocates a fixed number of bytes for the value, written in little-endian byte order.

There are two groups of data types that fall into non-varint encoding:

  • 64-bit data types such as fixed64, sfixed64, and double.
  • 32-bit data types such as fixed32, sfixed32, and float.
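
Fixed-width values can be reproduced with Python's `struct` module using little-endian format codes. The wrapper function names below are invented for this sketch; only `struct.pack` is a real library call.

```python
import struct


def encode_fixed32(value):
    return struct.pack("<I", value)   # unsigned 32-bit, little-endian


def encode_sfixed32(value):
    return struct.pack("<i", value)   # signed 32-bit, little-endian


def encode_fixed64(value):
    return struct.pack("<Q", value)   # unsigned 64-bit, little-endian


def encode_float(value):
    return struct.pack("<f", value)   # IEEE 754 single precision


def encode_double(value):
    return struct.pack("<d", value)   # IEEE 754 double precision


print(encode_fixed32(1).hex())    # 01000000
print(encode_double(1.0).hex())   # 000000000000f03f
```

Unlike varints, the size never depends on the value: fixed32 always takes 4 bytes and fixed64 always takes 8.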

UTF-8 encoding

String data types are encoded using UTF-8 character encoding and use wire type 2. The encoded value of the string includes two components:

  • The length of the UTF-8 encoded string in bytes, written as a varint.
  • The UTF-8 encoded string value.
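
Putting tag, length, and value together, the `name` field of PingRequest can be encoded by hand as follows. This is a sketch with invented helper names; the sample value "gRPC" is arbitrary.

```python
def encode_varint(value):
    """Encode a non-negative integer as a varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_string_field(field_index, text):
    """Tag (wire type 2) + byte length as varint + UTF-8 bytes."""
    payload = text.encode("utf-8")
    tag = (field_index << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


print(encode_string_field(1, "gRPC").hex())  # 0a0467525043
```

Note that the length counts bytes, not characters: a single character such as "é" occupies two bytes in UTF-8.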

Sub-message

Sub-messages are messages nested inside another message. They fall into wire type 2, like strings, and consist of two components: the length of the encoded sub-message in bytes followed by the encoding of its fields.

Repeated fields

Repeated fields of scalar numeric types are packed by default in proto3: they also fall into wire type 2, with the total byte length of the list followed by the encoded elements back to back. Repeated strings and sub-messages are not packed; each element is written as its own tag and value pair with the same Field Index.

Similarly, Maps are encoded like a repeated field of sub-messages, where each entry is a message with two fields: the key (Field Index 1) and the value (Field Index 2).
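
All three cases reduce to the same length-delimited building block. The sketch below uses invented helper names and hypothetical field numbers (3, 4, and 5) that do not come from the IDL in this article; it shows a nested message, a packed list of varints (the common encoding for repeated numeric fields), and a single map entry with string key and value.

```python
def encode_varint(value):
    """Encode a non-negative integer as a varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_len_field(field_index, payload):
    """Wire type 2: tag, byte length of payload, then the payload itself."""
    return encode_varint((field_index << 3) | 2) + encode_varint(len(payload)) + payload


def encode_packed_varints(field_index, values):
    """Packed repeated field: all elements concatenated inside one payload."""
    return encode_len_field(field_index, b"".join(encode_varint(v) for v in values))


def encode_map_entry(field_index, key, value):
    """One map entry: a sub-message with key in field 1 and value in field 2."""
    entry = encode_len_field(1, key.encode("utf-8")) + encode_len_field(2, value.encode("utf-8"))
    return encode_len_field(field_index, entry)


inner = encode_len_field(1, b"gRPC")            # a message with one string field
print(encode_len_field(3, inner).hex())         # sub-message in field 3
print(encode_packed_varints(4, [3, 270, 86942]).hex())
print(encode_map_entry(5, "k", "v").hex())
```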

Packaging Message

Once the field tags are derived and values for each field are encoded, the tag and value for each field are concatenated to form the byte stream for the entire Message.

Once the encoded message is ready, the client stub computes the size of the Message and adds it before the Message. This technique is called Length-prefix framing.

Packed Message
  • The first byte indicates whether the message is compressed. A value of 1 indicates that the message is compressed with the algorithm named in the Message-Encoding header (grpc-encoding).
  • The next 4 bytes specify the size of the message as a big-endian unsigned integer.
  • The last part is the encoded message.
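
The 5-byte prefix described above can be sketched with `struct`. The function name `frame_message` is invented for this example, and gzip is used only because it is the compression scheme mentioned later in this article.

```python
import gzip
import struct


def frame_message(message, compress=False):
    """Prefix an encoded Message with the compressed flag and its length."""
    body = gzip.compress(message) if compress else message
    # ">BI": 1-byte flag followed by a 4-byte big-endian unsigned length.
    return struct.pack(">BI", 1 if compress else 0, len(body)) + body


encoded_ping_request = b"\x0a\x04gRPC"  # PingRequest with name = "gRPC"
print(frame_message(encoded_ping_request).hex())  # 00000000060a0467525043
```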

HTTP/2 Request Builder

Once the encoded Message is ready, the next step is to construct an HTTP/2 Request.

The client stub creates an HTTP POST request with the encoded Message. All requests in gRPC are HTTP POST requests with a Content-Type that starts with application/grpc .

A Request message consists of three major components: Request Headers, Length-prefixed Messages, and End of Stream.

Request Message
  • The client stub initiates the remote call by sending the Request Headers to the server.
  • Once the headers are sent, the length-prefixed messages are sent.
  • Once all the messages are sent, the End of Stream flag is set to notify the server that the request is complete.
Request Headers

The above diagram depicts the header composition of a gRPC Request

  • All gRPC methods are sent as HTTP POST requests to the server.
  • The scheme can be http or https depending on whether TLS is enabled.
  • The path header is the ServiceName and the MethodName, each preceded by / . In this case, it is /HealthCheckService/Ping . If the IDL declares a package, the ServiceName is prefixed with the package name.
  • Authority indicates the hostname of the remote server, optionally with its port.
  • grpc-timeout defines the request timeout sent by the client, which the server uses to set the execution timeout on its side. If this header is not set, the timeout is assumed to be infinite.
  • Content-Type for gRPC requests is application/grpc . The grpc-encoding header names the compression scheme applied to the messages, which is gzip in this example.

Header names beginning with : are HTTP/2 pseudo-headers, and HTTP/2 mandates that they appear before all other headers.

There are two broad categories of headers in a gRPC request:

  • Call definition headers: The headers that describe the call itself. They include the HTTP/2 pseudo-headers, which begin with : , as well as headers such as content-type , grpc-timeout , and grpc-encoding .
  • Custom metadata: An arbitrary set of key-value pairs defined by the application (L7) layer. They must appear after the call definition headers.

This is how the request headers are constructed for a gRPC request.
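
As a sketch, the header list for the Ping call could be assembled like this. The function name, the `localhost:50051` authority, the `1S` timeout (one second), and the `x-request-id` metadata key are assumptions chosen for illustration; the `te: trailers` header is included because the gRPC over HTTP/2 specification requires it.

```python
def build_request_headers(authority, service, method,
                          timeout=None, metadata=None, tls=True):
    """Return request headers in order: pseudo-headers, call definition, metadata."""
    headers = [
        (":method", "POST"),
        (":scheme", "https" if tls else "http"),
        (":path", f"/{service}/{method}"),
        (":authority", authority),
        ("te", "trailers"),
    ]
    if timeout is not None:
        headers.append(("grpc-timeout", timeout))
    headers.append(("content-type", "application/grpc"))
    # Custom metadata always goes after the call definition headers.
    headers.extend((metadata or {}).items())
    return headers


for name, value in build_request_headers(
        "localhost:50051", "HealthCheckService", "Ping",
        timeout="1S", metadata={"x-request-id": "abc"}):
    print(f"{name}: {value}")
```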

Data Transport

Once the request headers are constructed, the client initiates the remote communication by sending the headers to the server. The request headers are sent in a HEADERS frame of HTTP/2.

Once the request headers are sent, the client stub sends the length-prefixed encoded message in DATA frames. If the length-prefixed message does not fit into one DATA frame, it spans multiple DATA frames; the length prefix appears only once, at the start of the message.

When all the DATA frames have been sent to the server, the client sets the END_STREAM flag on the last DATA frame to indicate the end of the request.
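
The splitting step can be sketched as follows, modelling each DATA frame as a small dictionary. The function name and the dictionary layout are invented for this example; 16384 bytes is the default maximum frame payload size in HTTP/2, and 0x1 is the END_STREAM flag bit.

```python
END_STREAM = 0x1


def to_data_frames(stream_id, framed_message, max_frame_size=16384):
    """Split one length-prefixed message into DATA frames for a stream."""
    chunks = [framed_message[i:i + max_frame_size]
              for i in range(0, len(framed_message), max_frame_size)] or [b""]
    frames = []
    for i, chunk in enumerate(chunks):
        last = i == len(chunks) - 1
        frames.append({
            "type": "DATA",
            "stream_id": stream_id,
            "flags": END_STREAM if last else 0,  # only the last frame ends the stream
            "payload": chunk,
        })
    return frames


for frame in to_data_frames(1, b"x" * 10, max_frame_size=4):
    print(frame)
```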

The underlying TCP connection must already exist: the client performs the TCP 3-way handshake before exchanging any frames and reuses the connection across calls. The frames themselves move down from L4 to L1 of the OSI model before being transmitted over the wire.

Server Stub Invocation

The gRPC plugin bundled with the language-specific Protocol Buffer compiler takes an IDL and generates a language-specific server skeleton for the remote methods defined in the IDL.

The server skeleton includes abstract method implementations which are overridden by the custom business logic.

Once the request message is received on the server, the server parses the path header to determine the Service and Method that should be invoked.

In this case, the Service is HealthCheckService and the remote Method is Ping. The server hands over the request to the appropriate Service and remote Method.
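
The routing step amounts to a lookup keyed by the path. The registry dictionary and the `dispatch` function below are hypothetical stand-ins for what a generated server skeleton and the gRPC runtime do; they only illustrate the idea.

```python
class HealthCheckService:
    """Hand-written stand-in for an implementation of the server skeleton."""

    def Ping(self, request):
        return f"pong: {request}"


# Registry of services known to this server, keyed by ServiceName.
SERVICES = {"HealthCheckService": HealthCheckService()}


def dispatch(path, request):
    """Route a request to the method named by a /ServiceName/MethodName path."""
    _, service_name, method_name = path.split("/")
    service = SERVICES.get(service_name)
    handler = getattr(service, method_name, None)
    if handler is None:
        raise LookupError(f"unimplemented: {path}")
    return handler(request)


print(dispatch("/HealthCheckService/Ping", "hi"))
```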

Message Decoding

Once the Service and remote Method are identified by the server, the server converts the request from its byte array format into a Message for execution. This process is called decoding/unmarshalling.

In our example, the byte array has to be converted into a PingRequest for executing the Ping method.

The first step in decoding is examining the compressed flag in the first byte of the length-prefixed message. If the flag is set to 1, the Message-Encoding header (grpc-encoding) determines the algorithm used to decompress the message. Otherwise, the message is treated as uncompressed.

Next, the length of the message is used to parse the actual content of the message. The actual message is a concatenated sequence of tags and values for each field in the message defined in the IDL.
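
These two steps, checking the compressed flag and then reading exactly the announced number of bytes, can be sketched as below. The function name and the `encoding` parameter are invented for this example, and only gzip is handled.

```python
import gzip
import struct


def unframe_message(data, encoding="identity"):
    """Strip the 5-byte prefix and return the encoded Message bytes."""
    compressed, length = struct.unpack(">BI", data[:5])
    body = data[5:5 + length]
    if len(body) != length:
        raise ValueError("truncated message")
    if compressed:
        if encoding != "gzip":
            raise ValueError(f"unsupported encoding: {encoding}")
        body = gzip.decompress(body)
    return body


print(unframe_message(b"\x00\x00\x00\x00\x06\x0a\x04gRPC"))
```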

The server skeleton includes a parser that reads the sequence of tags and values for each field and constructs the actual Message.

Following is the sequence of actions that are performed to construct the actual Message:

Tag Parsing

The first step in parsing the message is to parse the tag to determine the wire type and the unique field identifier.

  • Since each tag is a varint, the parser checks whether the MSB of the first byte is set to 0 or 1.
  • If the bit is set to 0, the tag fits in a single byte, and the parser reads the lowest 3 bits to determine the wire type.
  • If the bit is set to 1, the parser reads subsequent bytes until it finds a byte whose MSB is set to 0, which is the last byte of the tag. It then combines the 7-bit groups into the tag value and reads the lowest 3 bits of that value to determine the wire type.
  • Once the wire type is determined, the remaining higher bits of the tag give the unique field identifier, using the formula described above.

Value parsing

Once the wire type is determined, the parser knows how the value is laid out: the wire type specifies how many bytes to read past the tag and which decoding algorithm applies.

The unique field identifier is used to map the value in the Request to the field defined in the IDL, which supplies the exact data type.

The parser repeats this process of parsing the tag and its corresponding value until it reaches the end of the message.

Once it reaches the end of the message, the parsed values are converted into an appropriate data structure in that programming language. In our example, PingRequest will be constructed from the byte array.
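
The parsing loop can be sketched in a few lines. This is a hand-written illustration, not the parser that the Protocol Buffer compiler generates: it returns raw values keyed by field identifier, keeps only the last occurrence of each field, and models PingRequest as a plain dictionary.

```python
def decode_varint(data, pos):
    """Decode one varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_message(data):
    """Read tag/value pairs until the end of the byte array."""
    fields, pos = {}, 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        field_id, wire_type = tag >> 3, tag & 0b111
        if wire_type == 0:                       # varint
            value, pos = decode_varint(data, pos)
        elif wire_type == 1:                     # fixed 64-bit
            value, pos = data[pos:pos + 8], pos + 8
        elif wire_type == 2:                     # length-delimited
            length, pos = decode_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        elif wire_type == 5:                     # fixed 32-bit
            value, pos = data[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields[field_id] = value
    return fields


def decode_ping_request(data):
    """Map field 1 to the name field of PingRequest, as declared in the IDL."""
    fields = parse_message(data)
    return {"name": fields.get(1, b"").decode("utf-8")}


print(decode_ping_request(b"\x0a\x04gRPC"))  # {'name': 'gRPC'}
```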

Remote Method Invocation

Once the request message is decoded to an appropriate data structure in the programming language, the remote method performs business logic and generates a response message.

The response message is encoded using the same technique described above for the request message converting the response message into a byte array.

Response Message

The response message includes three components: Response Headers, Length Prefixed Message, and Trailers.

Response Headers

Response Headers are sent as a HEADERS frame, which includes the HTTP status code, Content-Type, and grpc-encoding .

Once the server sends Response Headers, length-prefixed messages are sent as HTTP/2 DATA frames. If a length-prefixed message does not fit into one DATA frame, it spans multiple DATA frames.

Trailers

Finally, trailers are sent in a HEADERS frame to indicate the end of the response. The end of the response stream is indicated by setting the END_STREAM flag on that frame. The trailers include grpc-status and, optionally, grpc-message , which describes the error. The list of grpc-status can be found in Link.
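
On the client side, the trailers decide whether the call succeeded. The sketch below uses invented function names and lists only a subset of the standard status codes; it shows how a non-zero grpc-status could be surfaced as an error.

```python
# A subset of the standard gRPC status codes.
GRPC_STATUS = {
    0: "OK", 1: "CANCELLED", 2: "UNKNOWN", 3: "INVALID_ARGUMENT",
    4: "DEADLINE_EXCEEDED", 5: "NOT_FOUND", 12: "UNIMPLEMENTED",
    13: "INTERNAL", 14: "UNAVAILABLE", 16: "UNAUTHENTICATED",
}


def build_trailers(status=0, message=""):
    """Server side: trailers sent in the final HEADERS frame."""
    trailers = [("grpc-status", str(status))]
    if message:
        trailers.append(("grpc-message", message))
    return trailers


def check_trailers(trailers):
    """Client side: raise if the call did not finish with status OK."""
    fields = dict(trailers)
    code = int(fields["grpc-status"])
    if code != 0:
        name = GRPC_STATUS.get(code, str(code))
        raise RuntimeError(f"{name}: {fields.get('grpc-message', '')}")


check_trailers(build_trailers())  # status 0, no error raised
print(build_trailers(12, "no such method"))
```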

This completes a simple request-response cycle in gRPC communication.

Appendix
