HTTP/2: Multiplexing and Streams
The Protocol That Made Your Bundler Less Important
HTTP/2 was published in 2015 as RFC 7540 (since revised as RFC 9113), and it changed how browsers load resources. Where HTTP/1.1 forced sequential request-response cycles on each connection, HTTP/2 lets you fire off dozens of requests simultaneously over a single connection and get responses back in any order.
The result? Domain sharding became an anti-pattern. Giant bundles became unnecessary. The "fewer requests is better" mantra of HTTP/1.1 flipped to "many small requests are fine."
But HTTP/2 didn't solve everything. It has its own head-of-line blocking problem, just at a different layer. Understanding where HTTP/2 wins and where it still struggles is critical for making informed performance decisions.
The Mental Model
HTTP/1.1 is a single-lane road: one car at a time. HTTP/2 is a multi-lane highway built on the same road surface (TCP). Many cars (requests) travel simultaneously, each in their own lane (stream). But there's a catch: the road surface is still a single TCP connection. If a pothole (packet loss) appears in the road, ALL lanes stop until it's repaired. HTTP/2 fixed the traffic jam at the car level, but the road itself can still block everything.
Binary Framing: The Foundation
HTTP/1.1 messages are plain text. HTTP/2 messages are binary frames. This isn't just a format change — it's what enables everything else.
A single HTTP/2 connection carries multiple streams. Each stream is an independent sequence of frames. Frames from different streams can be interleaved on the wire:
HTTP/1.1 (text, sequential):
GET /style.css HTTP/1.1\r\n...
[wait for full response]
GET /app.js HTTP/1.1\r\n...
[wait for full response]
HTTP/2 (binary, multiplexed):
[HEADERS frame, stream 1: GET /style.css]
[HEADERS frame, stream 3: GET /app.js]
[HEADERS frame, stream 5: GET /image.png]
[DATA frame, stream 1: first chunk of style.css]
[DATA frame, stream 3: first chunk of app.js]
[DATA frame, stream 1: second chunk of style.css]
[DATA frame, stream 5: first chunk of image.png]
...frames arrive interleaved
Each frame has a stream ID, so the browser knows which response each piece belongs to. Streams are reassembled independently.
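Each frame begins with a fixed 9-byte header defined in RFC 7540 §4.1: a 24-bit payload length, an 8-bit type, 8-bit flags, and a 31-bit stream ID. A minimal Python sketch of parsing that header (the byte layout comes from the spec; the sample frame bytes are made up for illustration):

```python
# Frame type codes from RFC 7540 §6 (a subset)
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x4: "SETTINGS", 0x8: "WINDOW_UPDATE"}

def parse_frame_header(raw: bytes) -> dict:
    """Parse the fixed 9-byte HTTP/2 frame header."""
    if len(raw) < 9:
        raise ValueError("need at least 9 bytes")
    length = int.from_bytes(raw[0:3], "big")   # 24-bit payload length
    ftype = raw[3]                             # 8-bit frame type
    flags = raw[4]                             # 8-bit flags
    # The top bit of the stream field is reserved; mask it off
    stream_id = int.from_bytes(raw[5:9], "big") & 0x7FFFFFFF
    return {"length": length, "type": FRAME_TYPES.get(ftype, hex(ftype)),
            "flags": flags, "stream_id": stream_id}

# A HEADERS frame on stream 1 with a 13-byte payload and END_HEADERS (0x4) set
header = (13).to_bytes(3, "big") + bytes([0x1, 0x4]) + (1).to_bytes(4, "big")
print(parse_frame_header(header))
# {'length': 13, 'type': 'HEADERS', 'flags': 4, 'stream_id': 1}
```

Because the length and stream ID sit at fixed offsets, a receiver can route each frame to the right stream without parsing the payload at all.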
Multiplexing: Many Requests, One Connection
The marquee feature of HTTP/2 is multiplexing: many requests and responses in flight simultaneously over a single TCP connection.
How it solves HTTP/1.1's problems:
No more head-of-line blocking at the HTTP layer. If stream 1 (a large image) is slow, streams 3 and 5 (CSS and JS) can still receive data. The server sends whichever data is ready.
No more 6-connection limit bottleneck. With HTTP/1.1, you needed 6 connections to get 6 parallel requests. With HTTP/2, a single connection supports many concurrent streams; the browser's 6-connection-per-origin limit applies to HTTP/1.1 only. The stream limit is set by the server via SETTINGS_MAX_CONCURRENT_STREAMS, and 100 or more is typical.
No more domain sharding. In fact, domain sharding hurts HTTP/2. Each additional domain means a separate TCP connection that can't share multiplexing. HTTP/2 works best with everything on a single origin.
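The interleaving that makes this work can be modeled in a few lines of Python. This is a toy model (tuples standing in for frames, with invented payloads), not real wire bytes, but it shows how per-stream buffers let responses complete in any order:

```python
from collections import defaultdict

# Interleaved DATA frames as they might arrive on one connection:
# (stream_id, payload, end_stream)
frames = [
    (1, b"body{", False),
    (3, b"console.", False),
    (1, b"color:red}", True),
    (5, b"\x89PNG", False),
    (3, b"log('hi')", True),
    (5, b"...", True),
]

def reassemble(frames):
    """Group interleaved frames back into complete per-stream responses."""
    streams = defaultdict(bytearray)
    done = set()
    for stream_id, payload, end_stream in frames:
        streams[stream_id] += payload          # append to that stream's buffer
        if end_stream:
            done.add(stream_id)                # stream finished
    return {sid: bytes(buf) for sid, buf in streams.items() if sid in done}

for sid, body in sorted(reassemble(frames).items()):
    print(sid, body)
```

No stream ever waits on another: each buffer fills independently as its frames arrive.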
Real-World Comparison
HTTP/1.1 loading 60 resources from one origin:
- 6 parallel connections
- 10 rounds of 6 requests each
- Each round waits for the slowest response
- Total: many sequential rounds
HTTP/2 loading 60 resources from one origin:
- 1 connection
- All 60 requests sent immediately
- Responses arrive as data becomes ready
- Total: limited mainly by bandwidth, not protocol overhead
| Feature | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Connections per origin | 6 (browser limit) | 1 (multiplexed) |
| Request format | Plain text | Binary frames |
| Header compression | None | HPACK |
| Parallel requests | 6 per origin | 100+ streams per connection |
| Server push | Not possible | In spec (browsers have removed it) |
| Head-of-line blocking | HTTP layer | TCP layer only |
| Domain sharding | Helpful workaround | Harmful anti-pattern |
HPACK: Header Compression
Remember HTTP/1.1's redundant headers problem — 500-800 bytes of uncompressed headers with every single request? HTTP/2 fixes this with HPACK, a header compression algorithm designed specifically for HTTP.
HPACK uses two techniques:
1. Static table: 61 common header fields pre-indexed. Instead of sending :method: GET as text, the encoder sends its static-table index (a single byte). Name-only entries like content-type are also indexed, so only the value needs to be transmitted.
2. Dynamic table: headers from previous requests are remembered. The first request sends authorization: Bearer eyJ... in full; subsequent requests send just an index reference meaning "same as before." For requests to the same origin, this means cookies, auth tokens, and other repeated headers are sent once, then referenced by index.
The savings are dramatic. Google measured 85-88% compression on typical header sets.
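The mechanism can be sketched with a toy encoder. This ignores Huffman coding and the fact that real HPACK dynamic-table indices shift as entries are added and evicted, but it shows why repeated headers shrink to tiny index references:

```python
# First few entries of the real static table (RFC 7541, Appendix A)
STATIC = {(":method", "GET"): 2, (":path", "/"): 4, (":status", "200"): 8}

class Encoder:
    def __init__(self):
        self.dynamic = {}        # (name, value) -> index (simplified: fixed indices)
        self.next_index = 62     # dynamic entries start after the 61 static ones

    def encode(self, headers):
        out = []
        for pair in headers:
            if pair in STATIC:
                out.append(("index", STATIC[pair]))        # ~1 byte on the wire
            elif pair in self.dynamic:
                out.append(("index", self.dynamic[pair]))  # ~1-2 bytes on the wire
            else:
                self.dynamic[pair] = self.next_index       # remember for next time
                self.next_index += 1
                out.append(("literal", pair))              # full name+value, once
        return out

enc = Encoder()
req = [(":method", "GET"), ("cookie", "session=abc123")]
print(enc.encode(req))  # first request: cookie sent as a literal
print(enc.encode(req))  # second request: cookie collapses to an index
```

The cookie, often the largest header, crosses the wire exactly once per connection; every later request references it by number.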
Stream Prioritization
Not all resources are equally important. Your CSS file matters more than a below-the-fold image. HTTP/2 includes a stream prioritization system that lets the browser hint at which streams matter most.
In the original spec, prioritization used a dependency tree — streams could declare parent streams and weighted priorities. In practice, this was complex and browsers implemented it inconsistently.
The newer approach (RFC 9218, Extensible Prioritization) simplifies this to two parameters:
- Urgency (0-7): how important is this resource? (0 = most urgent)
- Incremental (boolean): can this be delivered progressively?
CSS: urgency=0, incremental=false (critical, need all of it)
JS: urgency=1, incremental=false (important, need all of it)
Image: urgency=4, incremental=true (can render progressively)
Font: urgency=2, incremental=false (important for first paint)
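On the wire, these two parameters travel in the Priority request header defined by RFC 9218, serialized as a Structured Fields dictionary. A small helper showing the serialization rules (default values are omitted, per the spec):

```python
def priority_header(urgency: int = 3, incremental: bool = False) -> str:
    """Serialize RFC 9218 priority parameters as a Priority header value.
    The defaults (u=3, non-incremental) are omitted from the output."""
    parts = []
    if urgency != 3:
        parts.append(f"u={urgency}")
    if incremental:
        parts.append("i")
    return ", ".join(parts)

print(priority_header(0))        # CSS   -> "u=0"
print(priority_header(4, True))  # image -> "u=4, i"
print(priority_header())         # defaults -> "" (header can be omitted entirely)
```

A browser might send Priority: u=0 for render-blocking CSS and Priority: u=4, i for a progressive JPEG, and a server that understands the scheme schedules its response bytes accordingly.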
You can influence stream priority from your HTML using the fetchpriority attribute. This tells the browser how important a resource is, and the browser translates that into HTTP/2 stream priority.
<!-- High priority: above-the-fold hero image -->
<img src="hero.jpg" fetchpriority="high" alt="Hero">
<!-- Low priority: below-the-fold carousel images -->
<img src="slide-3.jpg" fetchpriority="low" alt="Slide" loading="lazy">
Server Push (and Why It Mostly Failed)
HTTP/2 introduced server push: the server can send resources the client hasn't asked for yet. The idea was that when a browser requests index.html, the server already knows it'll need style.css and app.js, so it pushes them proactively.
Browser: GET /index.html
Server: Here's index.html
...also, here's style.css (you'll need it)
...also, here's app.js (you'll need it too)
In theory, this eliminates the round trip where the browser parses HTML, discovers CSS/JS references, and then requests them.
In practice, server push was problematic:
- Cache duplication — the server doesn't know if the client already has the resource cached. Pushing cached resources wastes bandwidth.
- Priority conflicts — pushed resources compete with resources the browser is actively requesting, sometimes delaying critical resources.
- Complexity — getting push timing and resource selection right was hard. Most implementations pushed too much or at the wrong time.
Chrome removed server push support in Chrome 106 (2022), and most CDNs have deprecated it. Server push is effectively dead. The industry consensus is that the 103 Early Hints status code is the better solution: the server sends hints about which resources to preload, the browser decides whether to fetch them, and the cache duplication and priority problems of push disappear. If you're reading older material that recommends server push, reach for Early Hints instead.
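For illustration, an Early Hints exchange looks roughly like this (the resource paths are placeholders):

```
Browser: GET /index.html

Server:  HTTP/1.1 103 Early Hints
         Link: </style.css>; rel=preload; as=style
         Link: </app.js>; rel=preload; as=script

Server:  HTTP/1.1 200 OK
         Content-Type: text/html
         ...
```

The interim 103 response goes out as soon as the server knows what the page will need, often while it is still generating the HTML; the browser can begin fetching the hinted resources during that gap, but it skips anything already in its cache.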
HTTP/2's Remaining Problem: TCP Head-of-Line Blocking
Here's the thing most articles miss. HTTP/2 solved head-of-line blocking at the HTTP layer — streams are independent, so a slow response doesn't block other streams. But HTTP/2 runs on TCP, and TCP has its own head-of-line blocking.
TCP guarantees in-order delivery. If a TCP packet is lost, the kernel buffers all subsequent packets until the lost one is retransmitted and received. This blocks ALL streams, not just the one the lost packet belonged to.
HTTP/2 streams on TCP:
Stream 1 data ┐
Stream 3 data ├─ All on one TCP connection
Stream 5 data ┘
TCP packet 47 (carries Stream 3 data) is lost:
→ TCP buffers packets 48, 49, 50... (all streams)
→ Waits for packet 47 retransmission
→ Streams 1 and 5 are blocked even though their data arrived fine
On reliable networks (wired, low-latency), this rarely matters. On lossy networks (mobile, WiFi), TCP packet loss can stall all HTTP/2 streams simultaneously. In those conditions HTTP/2 can actually be worse than HTTP/1.1, where 6 independent TCP connections meant a packet loss on one connection only affected that connection's requests.
This is exactly the problem HTTP/3 and QUIC were designed to solve.
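The effect is easy to model. In this toy simulation (sequence numbers and stream IDs invented for illustration), losing one packet strands every later packet in the kernel's buffer, regardless of which stream its data belongs to:

```python
def tcp_deliver(packets, lost_seqs):
    """Toy model of TCP in-order delivery. Packets are (seq, stream_id).
    Once a sequence number is missing, every later packet is buffered by
    the kernel, no matter which HTTP/2 stream it carries data for."""
    delivered, buffered = [], []
    stalled = False
    for seq, stream in sorted(packets):
        if seq in lost_seqs:
            stalled = True                    # gap in the byte stream
            continue                          # this packet never arrived
        if stalled:
            buffered.append((seq, stream))    # held until retransmission fills the gap
        else:
            delivered.append((seq, stream))   # handed up to the HTTP/2 layer
    return delivered, buffered

packets = [(46, 1), (47, 3), (48, 1), (49, 5), (50, 3)]
delivered, buffered = tcp_deliver(packets, lost_seqs={47})
print("delivered:", delivered)   # only (46, 1) reaches HTTP/2
print("buffered: ", buffered)    # streams 1 and 5 stuck behind stream 3's loss
```

Packet 47 carried stream 3's data, yet streams 1 and 5 stall too: TCP has no concept of streams, only one ordered byte sequence.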
Common Mistakes
| What developers do | What they should do |
|---|---|
| **Using domain sharding with HTTP/2.** Domain sharding creates multiple TCP connections that can't share multiplexing. HTTP/2 works best with a single connection per origin. Each additional origin means separate handshakes, separate slow start, and no stream prioritization across origins. | Consolidate to a single origin for maximum HTTP/2 multiplexing benefit |
| **Building massive bundles 'to reduce requests' with HTTP/2.** With HTTP/2, the cost of an additional request is roughly one HEADERS frame (a few bytes after HPACK compression). Many small files improve caching granularity: changing one module only invalidates that module's cache, not a giant bundle. | Split into many granular files. HTTP/2 multiplexing makes per-request overhead negligible. |
| **Implementing server push for critical resources.** Server push is effectively deprecated; Chrome removed support in 2022. Push couldn't handle cache state (wasting bandwidth on already-cached resources) and caused priority issues. Early Hints (103 status code) and preload hints let the browser decide whether to fetch. | Use 103 Early Hints or preload link headers instead |
Key Takeaways
1. HTTP/2 multiplexes all requests over a single TCP connection using binary frames tagged with stream IDs. This eliminates HTTP-level head-of-line blocking.
2. HPACK header compression sends repeated headers by index reference, reducing header overhead by 85-88%. Cookies and auth tokens benefit most.
3. HTTP/1.1 workarounds (domain sharding, bundling, spriting) are anti-patterns in HTTP/2. Undo them when migrating.
4. Server push is dead (Chrome removed it in 2022). Use 103 Early Hints for server-driven preloading.
5. HTTP/2 still suffers from TCP-level head-of-line blocking. A single lost packet blocks ALL streams because TCP guarantees in-order delivery. HTTP/3 (QUIC) solves this.