HTTP/2: Multiplexing and Streams
The Protocol That Made Your Bundler Less Important
HTTP/2 was published in 2015 as RFC 7540 (since revised as RFC 9113), and it changed how browsers load resources. Where HTTP/1.1 forced sequential request-response cycles on each connection, HTTP/2 lets you fire off dozens of requests simultaneously over a single connection and get responses back in any order.
The result? Domain sharding became an anti-pattern. Giant bundles became unnecessary. The "fewer requests is better" mantra of HTTP/1.1 flipped to "many small requests are fine."
But HTTP/2 didn't solve everything. It has its own head-of-line blocking problem, just at a different layer. Understanding where HTTP/2 wins and where it still struggles is critical for making informed performance decisions.
The Mental Model
HTTP/1.1 is a single-lane road: one car at a time. HTTP/2 is a multi-lane highway built on the same road surface (TCP). Many cars (requests) travel simultaneously, each in their own lane (stream). But there's a catch: the road surface is still a single TCP connection. If a pothole (packet loss) appears in the road, ALL lanes stop until it's repaired. HTTP/2 fixed the traffic jam at the car level, but the road itself can still block everything.
Binary Framing: The Foundation
HTTP/1.1 messages are plain text. HTTP/2 messages are binary frames. This isn't just a format change — it's what enables everything else.
A single HTTP/2 connection carries multiple streams. Each stream is an independent sequence of frames. Frames from different streams can be interleaved on the wire:
HTTP/1.1 (text, sequential):
GET /style.css HTTP/1.1\r\n...
[wait for full response]
GET /app.js HTTP/1.1\r\n...
[wait for full response]
HTTP/2 (binary, multiplexed):
[HEADERS frame, stream 1: GET /style.css]
[HEADERS frame, stream 3: GET /app.js]
[HEADERS frame, stream 5: GET /image.png]
[DATA frame, stream 1: first chunk of style.css]
[DATA frame, stream 3: first chunk of app.js]
[DATA frame, stream 1: second chunk of style.css]
[DATA frame, stream 5: first chunk of image.png]
...frames arrive interleaved
Each frame has a stream ID, so the browser knows which response each piece belongs to. Streams are reassembled independently.
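Each frame begins with a fixed 9-byte header defined in RFC 7540 §4.1: a 24-bit payload length, an 8-bit type, 8-bit flags, and a 31-bit stream ID. A minimal Python sketch of parsing that header (the byte layout comes from the spec; the sample frame bytes are made up for illustration):

```python
# Frame type codes from RFC 7540 §6 (a subset)
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x4: "SETTINGS", 0x8: "WINDOW_UPDATE"}

def parse_frame_header(raw: bytes) -> dict:
    """Parse the fixed 9-byte HTTP/2 frame header."""
    if len(raw) < 9:
        raise ValueError("need at least 9 bytes")
    length = int.from_bytes(raw[0:3], "big")   # 24-bit payload length
    ftype = raw[3]                             # 8-bit frame type
    flags = raw[4]                             # 8-bit flags
    # The top bit of the stream field is reserved; mask it off
    stream_id = int.from_bytes(raw[5:9], "big") & 0x7FFFFFFF
    return {"length": length, "type": FRAME_TYPES.get(ftype, hex(ftype)),
            "flags": flags, "stream_id": stream_id}

# A HEADERS frame on stream 1 with a 13-byte payload and END_HEADERS (0x4) set
header = (13).to_bytes(3, "big") + bytes([0x1, 0x4]) + (1).to_bytes(4, "big")
print(parse_frame_header(header))
# {'length': 13, 'type': 'HEADERS', 'flags': 4, 'stream_id': 1}
```

Because the length and stream ID sit at fixed offsets, a receiver can route each frame to the right stream without parsing the payload at all.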
Multiplexing: Many Requests, One Connection
The marquee feature of HTTP/2 is multiplexing: many requests and responses in flight simultaneously over a single TCP connection.
How it solves HTTP/1.1's problems:
No more head-of-line blocking at the HTTP layer. If stream 1 (a large image) is slow, streams 3 and 5 (CSS and JS) can still receive data. The server sends whichever data is ready.
No more 6-connection limit bottleneck. With HTTP/1.1, you needed 6 connections to get 6 parallel requests. With HTTP/2, a single connection supports many concurrent streams; the browser's 6-connection-per-origin limit applies to HTTP/1.1 only. The stream limit is set by the server via SETTINGS_MAX_CONCURRENT_STREAMS, and 100 or more is typical.
No more domain sharding. In fact, domain sharding hurts HTTP/2. Each additional domain means a separate TCP connection that can't share multiplexing. HTTP/2 works best with everything on a single origin.
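The interleaving that makes this work can be modeled in a few lines of Python. This is a toy model (tuples standing in for frames, with invented payloads), not real wire bytes, but it shows how per-stream buffers let responses complete in any order:

```python
from collections import defaultdict

# Interleaved DATA frames as they might arrive on one connection:
# (stream_id, payload, end_stream)
frames = [
    (1, b"body{", False),
    (3, b"console.", False),
    (1, b"color:red}", True),
    (5, b"\x89PNG", False),
    (3, b"log('hi')", True),
    (5, b"...", True),
]

def reassemble(frames):
    """Group interleaved frames back into complete per-stream responses."""
    streams = defaultdict(bytearray)
    done = set()
    for stream_id, payload, end_stream in frames:
        streams[stream_id] += payload          # append to that stream's buffer
        if end_stream:
            done.add(stream_id)                # stream finished
    return {sid: bytes(buf) for sid, buf in streams.items() if sid in done}

for sid, body in sorted(reassemble(frames).items()):
    print(sid, body)
```

No stream ever waits on another: each buffer fills independently as its frames arrive.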
Real-World Comparison
HTTP/1.1 loading 60 resources from one origin:
- 6 parallel connections
- 10 rounds of 6 requests each
- Each round waits for the slowest response
- Total: many sequential rounds
HTTP/2 loading 60 resources from one origin:
- 1 connection
- All 60 requests sent immediately
- Responses arrive as data becomes ready
- Total: limited mainly by bandwidth, not protocol overhead
| Feature | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Connections per origin | 6 (browser limit) | 1 (multiplexed) |
| Request format | Plain text | Binary frames |
| Header compression | None | HPACK |
| Parallel requests | 6 per origin | 100+ streams per connection |
| Server push | Not possible | In spec (browsers have removed it) |
| Head-of-line blocking | HTTP layer | TCP layer only |
| Domain sharding | Helpful workaround | Harmful anti-pattern |
HPACK: Header Compression
Remember HTTP/1.1's redundant headers problem — 500-800 bytes of uncompressed headers with every single request? HTTP/2 fixes this with HPACK, a header compression algorithm designed specifically for HTTP.
HPACK uses two techniques:
1. Static table: 61 common header fields pre-indexed. Instead of sending :method: GET as text, the encoder sends its static-table index (a single byte). Name-only entries like content-type are also indexed, so only the value needs to be transmitted.
2. Dynamic table: headers from previous requests are remembered. The first request sends authorization: Bearer eyJ... in full; subsequent requests send just an index reference meaning "same as before." For requests to the same origin, this means cookies, auth tokens, and other repeated headers are sent once, then referenced by index.
The savings are dramatic. Google measured 85-88% compression on typical header sets.
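The mechanism can be sketched with a toy encoder. This ignores Huffman coding and the fact that real HPACK dynamic-table indices shift as entries are added and evicted, but it shows why repeated headers shrink to tiny index references:

```python
# First few entries of the real static table (RFC 7541, Appendix A)
STATIC = {(":method", "GET"): 2, (":path", "/"): 4, (":status", "200"): 8}

class Encoder:
    def __init__(self):
        self.dynamic = {}        # (name, value) -> index (simplified: fixed indices)
        self.next_index = 62     # dynamic entries start after the 61 static ones

    def encode(self, headers):
        out = []
        for pair in headers:
            if pair in STATIC:
                out.append(("index", STATIC[pair]))        # ~1 byte on the wire
            elif pair in self.dynamic:
                out.append(("index", self.dynamic[pair]))  # ~1-2 bytes on the wire
            else:
                self.dynamic[pair] = self.next_index       # remember for next time
                self.next_index += 1
                out.append(("literal", pair))              # full name+value, once
        return out

enc = Encoder()
req = [(":method", "GET"), ("cookie", "session=abc123")]
print(enc.encode(req))  # first request: cookie sent as a literal
print(enc.encode(req))  # second request: cookie collapses to an index
```

The cookie, often the largest header, crosses the wire exactly once per connection; every later request references it by number.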
Stream Prioritization
Not all resources are equally important. Your CSS file matters more than a below-the-fold image. HTTP/2 includes a stream prioritization system that lets the browser hint at which streams matter most.
In the original spec, prioritization used a dependency tree — streams could declare parent streams and weighted priorities. In practice, this was complex and browsers implemented it inconsistently.
The newer approach (RFC 9218, Extensible Prioritization) simplifies this to two parameters:
- Urgency (0-7): how important is this resource? (0 = most urgent)
- Incremental (boolean): can this be delivered progressively?
CSS: urgency=0, incremental=false (critical, need all of it)
JS: urgency=1, incremental=false (important, need all of it)
Image: urgency=4, incremental=true (can render progressively)
Font: urgency=2, incremental=false (important for first paint)
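On the wire, these two parameters travel in the Priority request header defined by RFC 9218, serialized as a Structured Fields dictionary. A small helper showing the serialization rules (default values are omitted, per the spec):

```python
def priority_header(urgency: int = 3, incremental: bool = False) -> str:
    """Serialize RFC 9218 priority parameters as a Priority header value.
    The defaults (u=3, non-incremental) are omitted from the output."""
    parts = []
    if urgency != 3:
        parts.append(f"u={urgency}")
    if incremental:
        parts.append("i")
    return ", ".join(parts)

print(priority_header(0))        # CSS   -> "u=0"
print(priority_header(4, True))  # image -> "u=4, i"
print(priority_header())         # defaults -> "" (header can be omitted entirely)
```

A browser might send Priority: u=0 for render-blocking CSS and Priority: u=4, i for a progressive JPEG, and a server that understands the scheme schedules its response bytes accordingly.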
You can influence stream priority from your HTML using the fetchpriority attribute. This tells the browser how important a resource is, and the browser translates that into HTTP/2 stream priority.
<!-- High priority: above-the-fold hero image -->
<img src="hero.jpg" fetchpriority="high" alt="Hero">
<!-- Low priority: below-the-fold carousel images -->
<img src="slide-3.jpg" fetchpriority="low" alt="Slide" loading="lazy">
Server Push (and Why It Mostly Failed)
HTTP/2 introduced server push: the server can send resources the client hasn't asked for yet. The idea was that when a browser requests index.html, the server already knows it'll need style.css and app.js, so it pushes them proactively.
Browser: GET /index.html
Server: Here's index.html
...also, here's style.css (you'll need it)
...also, here's app.js (you'll need it too)
In theory, this eliminates the round trip where the browser parses HTML, discovers CSS/JS references, and then requests them.
In practice, server push was problematic:
- Cache duplication — the server doesn't know if the client already has the resource cached. Pushing cached resources wastes bandwidth.
- Priority conflicts — pushed resources compete with resources the browser is actively requesting, sometimes delaying critical resources.
- Complexity — getting push timing and resource selection right was hard. Most implementations pushed too much or at the wrong time.
Chrome removed server push support in Chrome 106 (2022), and most CDNs have deprecated it. Server push is effectively dead. The industry consensus is that the 103 Early Hints status code is the better solution: the server sends hints about which resources to preload, the browser decides whether to fetch them, and the cache duplication and priority problems of push disappear. If you're reading older material that recommends server push, reach for Early Hints instead.
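For illustration, an Early Hints exchange looks roughly like this (the resource paths are placeholders):

```
Browser: GET /index.html

Server:  HTTP/1.1 103 Early Hints
         Link: </style.css>; rel=preload; as=style
         Link: </app.js>; rel=preload; as=script

Server:  HTTP/1.1 200 OK
         Content-Type: text/html
         ...
```

The interim 103 response goes out as soon as the server knows what the page will need, often while it is still generating the HTML; the browser can begin fetching the hinted resources during that gap, but it skips anything already in its cache.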
HTTP/2's Remaining Problem: TCP Head-of-Line Blocking
Here's the thing most articles miss. HTTP/2 solved head-of-line blocking at the HTTP layer — streams are independent, so a slow response doesn't block other streams. But HTTP/2 runs on TCP, and TCP has its own head-of-line blocking.
TCP guarantees in-order delivery. If a TCP packet is lost, the kernel buffers all subsequent packets until the lost one is retransmitted and received. This blocks ALL streams, not just the one the lost packet belonged to.
HTTP/2 streams on TCP:
Stream 1 data ┐
Stream 3 data ├─ All on one TCP connection
Stream 5 data ┘
TCP packet 47 (carries Stream 3 data) is lost:
→ TCP buffers packets 48, 49, 50... (all streams)
→ Waits for packet 47 retransmission
→ Streams 1 and 5 are blocked even though their data arrived fine
On reliable networks (wired, low-latency), this rarely matters. On lossy networks (mobile, WiFi), TCP packet loss can stall all HTTP/2 streams simultaneously. In those conditions HTTP/2 can actually be worse than HTTP/1.1, where 6 independent TCP connections meant a packet loss on one connection only affected that connection's requests.
This is exactly the problem HTTP/3 and QUIC were designed to solve.
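The effect is easy to model. In this toy simulation (sequence numbers and stream IDs invented for illustration), losing one packet strands every later packet in the kernel's buffer, regardless of which stream its data belongs to:

```python
def tcp_deliver(packets, lost_seqs):
    """Toy model of TCP in-order delivery. Packets are (seq, stream_id).
    Once a sequence number is missing, every later packet is buffered by
    the kernel, no matter which HTTP/2 stream it carries data for."""
    delivered, buffered = [], []
    stalled = False
    for seq, stream in sorted(packets):
        if seq in lost_seqs:
            stalled = True                    # gap in the byte stream
            continue                          # this packet never arrived
        if stalled:
            buffered.append((seq, stream))    # held until retransmission fills the gap
        else:
            delivered.append((seq, stream))   # handed up to the HTTP/2 layer
    return delivered, buffered

packets = [(46, 1), (47, 3), (48, 1), (49, 5), (50, 3)]
delivered, buffered = tcp_deliver(packets, lost_seqs={47})
print("delivered:", delivered)   # only (46, 1) reaches HTTP/2
print("buffered: ", buffered)    # streams 1 and 5 stuck behind stream 3's loss
```

Packet 47 carried stream 3's data, yet streams 1 and 5 stall too: TCP has no concept of streams, only one ordered byte sequence.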
Common Mistakes
| What developers do | What they should do |
|---|---|
| **Using domain sharding with HTTP/2.** Domain sharding creates multiple TCP connections that can't share multiplexing. HTTP/2 works best with a single connection per origin. Each additional origin means separate handshakes, separate slow start, and no stream prioritization across origins. | Consolidate to a single origin for maximum HTTP/2 multiplexing benefit |
| **Building massive bundles 'to reduce requests' with HTTP/2.** With HTTP/2, the cost of an additional request is roughly one HEADERS frame (a few bytes after HPACK compression). Many small files improve caching granularity: changing one module only invalidates that module's cache, not a giant bundle. | Split into many granular files. HTTP/2 multiplexing makes per-request overhead negligible. |
| **Implementing server push for critical resources.** Server push is effectively deprecated; Chrome removed support in 2022. Push couldn't handle cache state (wasting bandwidth on already-cached resources) and caused priority issues. Early Hints (103 status code) and preload hints let the browser decide whether to fetch. | Use 103 Early Hints or preload link headers instead |
Key Takeaways
1. HTTP/2 multiplexes all requests over a single TCP connection using binary frames tagged with stream IDs. This eliminates HTTP-level head-of-line blocking.
2. HPACK header compression sends repeated headers by index reference, reducing header overhead by 85-88%. Cookies and auth tokens benefit most.
3. HTTP/1.1 workarounds (domain sharding, bundling, spriting) are anti-patterns in HTTP/2. Undo them when migrating.
4. Server push is dead (Chrome removed it in 2022). Use 103 Early Hints for server-driven preloading.
5. HTTP/2 still suffers from TCP-level head-of-line blocking. A single lost packet blocks ALL streams because TCP guarantees in-order delivery. HTTP/3 (QUIC) solves this.