Deep dive into the evolution of the HTTP protocol

sudan
17 min read · Sep 29, 2022

In this article, we will discuss the evolution of the HTTP protocol, from HTTP/0.9 all the way up to HTTP/3, covering the specific features and implications of each version.

Before discussing the evolution, we need to understand the OSI model of networking.

OSI Model

The OSI (Open Systems Interconnection) model is a framework that defines how hosts communicate over a network. The model includes seven layers, each with its own core responsibility.


Application (L7) Layer

The application layer receives user input, i.e., data, and converts it into binary format. All client applications that need to communicate remotely interact with the application layer. HTTP is the most popular application layer protocol.

Presentation(L6) Layer

The presentation layer plays a significant role in protocol conversions, data formatting, translation, and compression.

Session(L5) Layer

The session layer is responsible for connection establishment, name resolution, and connection termination.

Transport(L4) Layer

The transport layer receives data from the L5 layer and breaks it into smaller chunks, adding source and destination ports to each chunk to create a segment.

The transport layer is also called the End-to-End delivery layer. It is responsible for delivering segments from a source process on a particular port to the destination process on a different port. TCP and UDP are the prominent protocols in the transport layer.

Network(L3) Layer

The network layer receives segments from the L4 layer and adds the source IP address and destination IP address to create a packet.

The network layer is called the Host-to-Host delivery layer and is responsible for delivering packets from the source host to the destination host.

Data Link(L2) Layer

The data link layer receives packets from the L3 layer and adds a source MAC address and destination MAC address to create a frame.

The data link layer is called the Hop-to-Hop delivery layer and is responsible for the delivery of frames from one device to another within the same network.

Physical (L1) Layer

The physical layer receives frames from the L2 layer as bits, converts them into analog/digital signals, and propagates them through channels such as coaxial cable, fiber optic cable, DSL, etc.

On the sender side, the data moves from the L7 layer to the L1 layer, and the resultant signals are propagated through one of the channels to the destination. On the receiver side, the exact reverse happens.

For an in-depth understanding of how data flows through the internet, refer here.

TCP

Transmission Control Protocol (TCP) is a transport layer protocol that provides an abstraction of a reliable network over an unreliable channel.

A TCP connection begins with a 3-way handshake before any data exchange between the client and the server.

3-way handshake
  • The client picks a random sequence number x and sends a SYN packet with additional metadata.
  • The server receives the SYN packet, increments x by 1, picks its own random sequence number y, and sends a SYN-ACK packet back to the client with additional metadata.
  • The client receives the SYN-ACK packet, increments both x and y by 1, and completes the handshake by sending the final ACK packet to the server.

HTTP/0.9

HTTP stands for HyperText Transfer Protocol and is an application layer protocol that powers the World Wide Web. It was invented by Tim Berners-Lee at CERN.

HTTP/0.9 was the first version of the HTTP protocol. It adopted a request-response model wherein the client initiates a connection through a TCP handshake and sends a request, to which the server responds.

Following are some of the important features of HTTP/0.9:

  1. Every request from the client involved creating a new connection with a 3-way TCP handshake with the server. The interface was primarily based on telnet.
  2. The client request was a single line of ASCII characters terminated by a carriage return, and only GET requests were supported, in the format GET /path (the host and port were used only to open the TCP connection), as sketched below.
  3. The response consisted of only HTML documents. The connection was terminated immediately after each response.
  4. Request and response headers were not supported, which made it impossible to differentiate between a success and a failure response.

HTTP/1.0

HTTP/1.0 is an advancement to HTTP/0.9. HTTP/1.0 also operated on the request-response model like HTTP/0.9.

Following are some of the important features of HTTP/1.0:

  1. Every request from the client involved creating a new connection with a 3-way TCP handshake with the server. The interface evolved from telnet-based to browser-based.
  2. Inclusion of headers in both request and response, with support for HTTP versioning and status codes; support for the POST, GET, and HEAD HTTP methods.
  3. Support for a rich Content-Type response header, allowing non-HTML responses to be returned, as illustrated below. The connection was still terminated after every response, like HTTP/0.9.

HTTP/0.9 and HTTP/1.0 required a 3-way TCP handshake for every request since the connection was terminated after the response. This introduced a significant latency on every request.

HTTP/1.1

HTTP/1.1 is the most important advancement in the HTTP protocol that introduced several optimizations and features.

HTTP/1.1 introduced support for the POST, GET, PUT, DELETE, TRACE, and OPTIONS methods.

Persistent connections

One of the core improvements in HTTP/1.1 was the introduction of persistent connections or keep-alive connections.

Earlier versions of HTTP required a 3-way handshake for every HTTP request, so the total execution time of each request was the sum of the TCP handshake time and the actual request processing time.

With the introduction of keep-alive connections, a TCP handshake is performed only once, and the same connection is reused across multiple HTTP requests. The total execution time saved for N requests is (N-1) * RTT (Round Trip Time).


Chunked transfer encoding

Chunked transfer encoding is another important feature of HTTP/1.1. In chunked transfer encoding, the request and response are broken down into non-overlapping smaller chunks, and each chunk can be sent and received independently by the client and server.

Each chunk is preceded by a size header indicating its length (in hexadecimal), and the transmission stops when a zero-length chunk is received. Chunked transfer encoding is useful when the size of the request or response is unknown beforehand, e.g., file uploads and downloads.

Without this feature, the client or server would have to buffer the entire content before transmission, which increases memory usage on the client/server and the latency of transmission.

Request pipelining

With persistent connections, an application can execute multiple HTTP requests through a single connection but sequentially. The first request is dispatched to the server, and the subsequent request is dispatched only after the server returns the response.

Request pipelining is a feature introduced in HTTP/1.1 that allows clients to fire multiple requests to the server before getting their corresponding response. Since the server is idle during request and response dispatching, it can process additional requests. Additionally, the server can use multiple threads to process requests in parallel.

But HTTP/1.1 does not allow multiple response objects to be multiplexed on the same connection. This forces each response to be returned sequentially before the next response can be transferred.

Later responses must be buffered until the earlier ones are returned to the client. A single slow response blocks all responses behind it. This phenomenon is called Head-of-Line (HOL) blocking.

Support for request pipelining was eventually abandoned (most browsers disabled it by default) due to the lack of multiplexing.

Range-based resource request

HTTP/1.1 introduced range-based requests for resources, wherein the client asks the server to send back only a portion of an HTTP message. It is very useful for clients like media players that require random access.

If the HTTP response includes the Accept-Ranges header and its value is anything other than none, the server supports range-based access to the resource.

  1. Single-part range: Specifies which part of the resource is required from the server by sending the Range header in the format bytes=0-50.
  2. Multi-part range: Allows multiple ranges of the resource to be accessed in a single request by sending the Range header in the format bytes=0-50, 60-100.

Caching

HTTP/1.1 introduced caching support through multiple caching directives.

Cache-Control is an HTTP header that specifies browser caching policies on client requests and server responses. This header is broken down into multiple directives.

  1. Max-Age: Specifies the time after which a cached copy expires and the data must be refreshed.
  2. No-Cache: The cache must revalidate with the origin server before serving a cached copy of the response.
  3. No-Store: Browsers are not allowed to cache the response.
  4. Public: Indicates that the response can be cached by any cache.
  5. Private: Should not be cached by a shared cache since it is intended for user-specific data.

Expires is another cache header that specifies a fixed date/time for the expiration of the cached resource.

ETag is yet another cache header, but on the response side, that carries an identifier assigned by the server to a specific version of the resource. If the resource ever changes, a new ETag is assigned. If the version remains unchanged, the browser uses the locally cached version.

SPDY

SPDY (pronounced "speedy") was an experimental protocol developed at Google in 2009 whose primary goal was to reduce the load time of web pages and address the limitations of HTTP/1.1.

SPDY introduced a new binary framing layer to enable request and response multiplexing and prioritization. SPDY acted as a catalyst and paved the way for HTTP 2.0 standards.

HTTP/2

HTTP/2 is a significant release in HTTP protocol after HTTP/1.1, inspired by the SPDY protocol.

Binary framing layer

HTTP/2 did not modify the semantics of HTTP/1.1. All the contracts and semantics of HTTP methods, status codes, headers, etc. remained the same as in HTTP/1.1. However, HTTP/2 introduced a new binary framing layer that dictates how messages are encapsulated and transported between the client and the server.

The introduction of a new binary framing layer is backward incompatible, which implies both clients and servers should be upgraded to HTTP/2 to leverage its features.

Following are the components of the binary framing layer:

Stream

A stream is a virtual channel within a TCP connection that allows a bi-directional flow of bytes between the client and the server. Each stream has a unique identifier that is used to reassemble its frames in order on the other end.

Frame

The frame is the smallest communication unit in HTTP/2. Each frame includes a type, a stream identifier, and an optional payload. Each frame type carries different information and has different functionality.

Following are the different types of frames:

  1. DATA: Frame that carries actual data.
  2. HEADERS: Frame that carries headers containing metadata.
  3. PRIORITY: Frame that carries priority information of the resource.
  4. SETTINGS: Frame that configures how two endpoints should communicate.
  5. RST_STREAM: Frame that signals abnormal termination of the stream.
  6. PING: Frame that is used to measure RTT and acts as a health check.
  7. GOAWAY: Frame that informs the peer to stop creating streams for the current connection.
  8. WINDOW_UPDATE: Frame that implements flow control per stream and connection.
  9. PUSH_PROMISE: Frame that is used to notify the peer in advance of the stream the sender intends to initiate.

Message

A message maps to a logical HTTP request and response. Each message is composed of one or more frames.

Multiplexing

HTTP/1.1 addressed the connection-per-request limitation of HTTP/1.0 by introducing persistent connections. But requests are still sequenced within the same connection, leading to Head-of-Line blocking.


HTTP/2 enabled full request and response multiplexing in a single TCP connection.

  1. For every HTTP request, a logical pipeline stream is created within the same single TCP connection. Each stream allows a bi-directional flow of messages between the client and the server.
  2. Once the stream is created, the request and response are broken down into smaller chunks called frames and transported between the client and the server within the stream.

Thus, streams enable multiple requests and responses to be interleaved within a single TCP connection, improving throughput and reducing TCP handshakes.

Header compression

HTTP/1.1 transfers headers between the client and server in plain text format. It also transfers a lot of commonly used headers in every request leading to an overhead of a few kilobytes on every request.

HTTP/2 addresses this problem by using a header compression algorithm, HPACK.

HPACK uses a dictionary data structure at both the client and the server to avoid transmission of redundant headers and compress the headers sent over the wire. Each entry in the dictionary has an index, a header key, and a header value. Each dictionary includes two components:

  • Static Table: A predefined table of 61 common header fields, some with predefined values.
  • Dynamic Table: An additional table of custom header key-value fields encountered previously in the same connection.

Header keys and values that cannot be replaced by an index are encoded as string literals, either as raw ASCII or using a static Huffman code.

When HPACK needs to compress a key-value pair, it looks up in the static and dynamic tables.

  • If the key and value match the entry in the dictionary, the key-value pair is replaced with the index in the frame.
  • If only the key matches the entry in the dictionary, the key is replaced with the index, and the value remains plain text in the frame.
  • Otherwise, the key-value pair is passed as-is in the frame in plain text format.

Over time, clients and servers learn about incoming headers and update dynamic tables accordingly. Subsequent requests and responses will leverage the encoded index if the same header key-value fields are passed.

This way, by replacing key-value with lightweight encoded indexes, HPACK reduces the overhead of header fields in the HTTP request and response.

Prioritization

HTTP/2 introduced the concept of stream priorities, allowing clients to hint at the relative importance of a particular stream. The server can use these priorities to prioritize processing the corresponding frames from that stream. These priority values act as a hint to the server, and the server can decide to ignore the priority and process the frames in the order it thinks is most appropriate.

There are two mechanisms of implementing prioritization:

  1. Dependency: One stream can depend on another stream so that the former can send frames only if the latter does not need to use the connection.
  2. Weight: Each stream can have a relative weight used to resolve the priorities of two streams having the same parent stream, as sketched below.

Flow control

Multiplexing multiple requests and response streams over a single TCP connection causes bandwidth to be shared and introduces contention.

Flow control is a technique to regulate the data transmission between the sender and receiver to prevent a faster sender from overwhelming a slower receiver.

HTTP/2 introduced flow control at both the stream level and the connection level. When an HTTP/2 connection is established, the client and the server exchange SETTINGS frames to set the flow control windows at both levels and regulate the traffic.

Server push

Server push is the most interesting feature of HTTP/2, which allows the server to push content to the client before the client requests them.

The server cannot initiate push requests arbitrarily. Instead, the server follows a request-response cycle and sends push resources only in response to a request.

The server sends a PUSH_PROMISE frame, which signals the server's intent to push resources to the client and contains only the HTTP headers. The client can decide to decline push streams completely. After the client receives the PUSH_PROMISE frame, the server pushes the resources to the client in DATA frames.

Since both the client and server can create streams, the stream IDs are partitioned: client-initiated streams have odd-numbered stream IDs, and server-initiated streams have even-numbered stream IDs.

Problems with TCP

TCP is a transport layer protocol that provides an abstraction of the reliable network over an unreliable channel. In addition, TCP offers the following guarantees to the layers above it.

  1. Retransmission of lost packets: If the packets are lost in the network, TCP ensures retransmission of lost packets.
  2. Ordered delivery of packets: TCP protocol ensures that the packets are sequenced on the destination based on the sequence number before passing it to the application layer.
  3. Acknowledgment of packets: TCP protocol ensures that it receives an acknowledgment of every packet sent, ensuring data transmission reliability. If packets are not acknowledged within a stipulated time frame, it initiates retransmission of packets.
  4. Flow Control: Flow control dictates the number of packets a sender can send to a receiver without receiving acknowledgments. TCP uses flow control to prevent a fast sender from overwhelming a slow receiver through sender and receiver buffers.
  5. Congestion Control: TCP uses congestion control mechanisms, such as slow start (exponential increase) and congestion avoidance, to regulate the flow of data packets in the network.

Head of Line Blocking at Transport Layer

HOL at the Transport layer

HTTP/2 addresses the HOL problem at the application layer by multiplexing requests and responses.

But at the transport layer, TCP combines the requests and responses from multiple streams into a single byte stream and assigns incremental sequence numbers, as shown in the above diagram.

A packet drop or delay for any of the requests or responses will delay all other packets behind it even though they belong to different requests and responses.

Since multiplexing of requests and responses is not natively supported in the transport layer, the HOL problem still exists in the transport layer, which can lead to degraded performance in certain scenarios.

Connection handshake

For a given client and server, every HTTP connection requires a 3-way TCP handshake, which costs 1-RTT (Round Trip Time). Every HTTPS connection requires the 3-way TCP handshake plus a TLS handshake, which together cost at least 2-RTT.

Even if the client and server endpoints are the same, every new connection requires 1-RTT or 2-RTT, depending on whether it is an HTTP or HTTPS connection.

TCP Fast Open (TFO)

TCP Fast Open is an extension of TCP protocol that aims to reduce the number of RTTs during connection establishment.

Once the initial 3-way handshake between the client and server is complete, the server issues a cryptographic cookie (TFO cookie) to identify the client.

When the client later reconnects with a new connection with the same server, it sends the initial SYN packet along with the cookie and request data to set up the connection. If it is successful, the server can start sending data to the client, skipping the 3-way handshake.

TCP Fast Open could not be deployed widely because of middleboxes. Middleboxes are intermediate components that sit between the client and server and route traffic, such as firewalls, proxies, load balancers, NAT gateways, etc.

These middleboxes intercepted the requests and dropped SYN packets carrying data, considering them malformed TCP. The fix was to upgrade all middleboxes in the connection path, which was not a feasible solution.

This phenomenon is called Ossification: a progressive reduction in the flexibility of network protocol design caused by the presence of middleboxes in the network, which cannot easily be removed or upgraded to allow protocol changes.

SCTP

Stream Control Transmission Protocol (SCTP) is a transport layer protocol that provides multiplexing support, message-oriented features similar to UDP, and reliability guarantees similar to TCP.

SCTP was not widely adopted because of the same middlebox ossification problem: components such as NAT gateways need to be SCTP-aware.

HTTP/3

gQUIC is a transport layer protocol initially designed by Jim Roskind at Google and deployed in 2012. It was eventually adopted by the IETF and paved the way for the QUIC protocol and the HTTP/3 standard.

HTTP/2 vs HTTP/3

QUIC (pronounced "quick") is a transport layer protocol that addresses the shortcomings of HTTP/2 and TCP.

QUIC is implemented in user space instead of kernel space and sits on top of the UDP protocol. QUIC enables full-fledged request and response multiplexing at the transport layer and enables encryption by default.

The above diagram illustrates the layering differences between HTTP/2 and HTTP/3 protocols.


The following section describes some of the core constructs of QUIC and how they address some of the limitations of HTTP/2 and TCP protocol.

Encryption by default

QUIC is an encrypted-by-default transport layer protocol. It provides security by default and encompasses functionalities like authentication, encryption, and decryption that are generally handled by a higher layer protocol like TLS. QUIC leverages TLS 1.3 for its security features and replaces the TLS record layer with its own framing format.

By providing end-to-end encryption by default, QUIC prevents middleboxes from sniffing the message format and avoids potential Ossification. This also enables the QUIC protocol to evolve without worrying about potential upgrades and replacement of middleboxes.

Multiplexing at the transport layer

QUIC enables full request and response multiplexing at the transport layer, addressing the Head-of-Line blocking problem that HTTP/2 inherits from TCP. The semantics of messages, streams, and frames from HTTP/2 remain the same in HTTP/3.

Request/Response multiplexing in HTTP/3

A delay in the delivery of frames or frame loss in one stream will only affect that stream and not affect other streams in the QUIC connection. This enables complete utilization of available bandwidth without throttling other streams in the same connection.

Connection establishment

An HTTP connection requires a 3-way TCP handshake, which costs 1-RTT (Round Trip Time). Similarly, every HTTPS connection requires a 3-way TCP handshake plus a TLS handshake, commonly regarded as 2-RTT.

The QUIC protocol combines the 3-way handshake and the cryptographic negotiation into a single 1-RTT exchange. Additionally, during the first handshake, the client receives cryptographic keys from the server that it can use on a subsequent connection to the same server to send data without waiting for the handshake, providing 0-RTT.

In a nutshell, QUIC enables 1-RTT for fresh connections with a server and 0-RTT for repeat connections to the same server.

Connection migration

In HTTP/2, when a mobile client switches between a cellular network and Wi-Fi, all the existing TCP connections time out, and new TCP connections must be re-established. This is because the endpoints are identified by IP address and port, which change on a network switch.

One of the important features of QUIC is connection migration, which allows endpoints to re-establish the connection with the server on a network switch without lengthy handshakes.

QUIC achieves connection migration with the help of a connection ID that is exchanged during the initial handshake and stored on the client. The connection ID uniquely identifies the connection to the server, independent of the client's IP address and network.

On a network switch, the client sends a packet carrying the connection ID to the server, and the connection resumes seamlessly without a new handshake.

QUIC and UDP

UDP (User Datagram Protocol) is a connectionless and unreliable protocol. Unlike TCP, UDP does not offer reliability guarantees, ordering guarantees, flow control, and congestion control.

QUIC leverages UDP because virtually every host kernel and middlebox already understands UDP.

QUIC sits on top of UDP and provides all the guarantees of TCP in user space. Furthermore, by implementing QUIC in user space instead of the host's kernel, the protocol and its algorithms can be rapidly prototyped and evolved, since changes need to be deployed only on the client and server, not on the middleboxes.

QPACK

QPACK is the header compression algorithm of HTTP/3, an extension of HPACK from HTTP/2.

HPACK relies on TCP's ordered delivery of header fields; UDP provides no such guarantee. QPACK is designed to operate over QUIC and tolerate out-of-order delivery of header fields.

Flexible congestion control

QUIC uses CUBIC, the same congestion control algorithm as TCP, but it provides flexible enhancements to congestion control over TCP.

  1. Since QUIC is implemented in user space, different congestion control algorithms can be plugged in without needing a kernel upgrade.
  2. QUIC doesn't reuse sequence numbers for acknowledgments and retransmissions. Instead, it uses a packet number that is incremented on every transmission, even for retransmissions. This way, endpoints can easily identify whether an acknowledgment is for the original packet or a retransmission.
  3. The QUIC acknowledgment also carries the delay introduced by the recipient. Combined with unique packet numbers, this allows the RTT to be calculated accurately, as sketched below.

This completes the feature description and implications of the various HTTP protocol versions. For a detailed understanding of the protocols in depth, refer to the appendix.

Appendix

  1. HTTP/1.1 protocol: https://www.ietf.org/rfc/rfc2616.txt
  2. HTTP/2 protocol: https://datatracker.ietf.org/doc/html/rfc7540
  3. QUIC protocol: https://datatracker.ietf.org/doc/html/rfc9000
