HTTP/2

Binary framing, multiplexing, and the head-of-line blocking nobody fixed

HTTP/1.1 sends ASCII messages over TCP, one request at a time per connection. To load 80 sub-resources for a page, browsers opened 6 connections per origin and pipelined requests — until pipelining broke on broken proxies and everyone gave up. HTTP/2 fixed this by reframing HTTP as a binary protocol of multiplexed streams over one TCP connection. 80 requests fly in parallel over a single socket, with HPACK compressing the otherwise-redundant headers.

It works beautifully — until a single TCP packet is lost. Then every multiplexed stream stalls waiting for the kernel to retransmit, even streams whose data already arrived. This is "head-of-line blocking at the transport layer," the pathology HTTP/3 over QUIC was designed to fix by moving streams below the loss-recovery boundary.

Stream State Machine

Every stream transitions through a defined lifecycle. Knowing this prevents subtle bugs: you can't send HEADERS on a closed stream, and a RST_STREAM mid-flight has different semantics than END_STREAM.

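[Figure: HTTP/2 stream state machine. States: idle, open, half-closed (local), half-closed (remote), closed. Transitions: the client sending HEADERS moves idle to open; sending END_STREAM moves open to half-closed (local); the remote sending END_STREAM moves open to half-closed (remote); the remaining END_STREAM, or a RST_STREAM, moves the stream to closed. A closed stream's ID is never reused.]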

Key invariants: HEADERS opens a stream from the idle state (a later HEADERS frame on an open stream can only carry trailers). Once either side sends END_STREAM, the stream is half-closed in that direction. RST_STREAM closes the stream immediately regardless of direction, and any DATA not yet consumed is discarded. Stream IDs increase monotonically and are never reused within a connection; odd IDs are client-initiated, even IDs are server-initiated.
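The lifecycle can be sketched as a small transition table. This is a simplified illustration, not a full implementation: the event names (`send_headers`, `recv_end_stream`, and so on) are made up for this example, and the reserved states used by server push are left out.

```python
# Simplified transition table: (current state, event) -> next state.
# Event names are hypothetical; reserved (push) states are omitted.
TRANSITIONS = {
    ("idle", "send_headers"): "open",
    ("idle", "recv_headers"): "open",
    ("open", "send_end_stream"): "half-closed (local)",
    ("open", "recv_end_stream"): "half-closed (remote)",
    ("half-closed (local)", "recv_end_stream"): "closed",
    ("half-closed (remote)", "send_end_stream"): "closed",
}
RESET_EVENTS = {"send_rst_stream", "recv_rst_stream"}


def next_state(state, event):
    """Return the state a stream moves to, or raise if the event is illegal."""
    # RST_STREAM closes any stream that has left the idle state.
    if event in RESET_EVENTS and state != "idle":
        return "closed"
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state!r}") from None


state = "idle"
for event in ("send_headers", "send_end_stream", "recv_end_stream"):
    state = next_state(state, event)
print(state)
```

Encoding the rules as data keeps illegal transitions (such as HEADERS on a closed stream) in one place where they fail loudly.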

HTTP/2 Framing

Every HTTP/2 message is split into binary frames sharing one TCP connection. The 9-byte header tells the receiver which stream the frame belongs to and how to parse the payload.

[Figure: HTTP/2 frame layout (9-byte header + payload). Fields: Length (24 bit, payload size, max 16 MB), Type (8 bit), Flags (8 bit), Stream ID (31 bit; odd = client, even = server), then Payload (Length bytes, type-specific encoding). Frame types: DATA, HEADERS, PRIORITY, RST_STREAM, SETTINGS, PUSH_PROMISE, PING, GOAWAY, WINDOW_UPDATE, CONTINUATION (10 standardized types).]
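To make the layout concrete, here is a minimal sketch that packs and parses the 9-byte header with Python's `struct` module. The function names are illustrative; the field widths and type codes follow the frame layout described above, with the top bit of the last 32-bit word treated as reserved.

```python
import struct

FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def pack_frame_header(length, frame_type, flags, stream_id):
    """Encode length (24 bit), type (8), flags (8), stream ID (31) into 9 bytes."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream ID must fit in 31 bits")
    # struct has no 24-bit field, so the length is split into 1 + 2 bytes.
    return struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                       frame_type, flags, stream_id)


def parse_frame_header(header):
    """Decode a 9-byte frame header into a dict."""
    if len(header) != 9:
        raise ValueError("frame header must be exactly 9 bytes")
    high, low, frame_type, flags, raw_id = struct.unpack(">BHBBI", header)
    return {
        "length": (high << 16) | low,
        "type": FRAME_TYPES.get(frame_type, "UNKNOWN"),
        "flags": flags,
        "stream_id": raw_id & 0x7FFFFFFF,  # mask off the reserved top bit
    }


header = pack_frame_header(16384, 0x0, 0x1, 5)  # DATA, END_STREAM, stream 5
print(header.hex(), parse_frame_header(header))
```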

Key Numbers

9 bytes
fixed frame header size
2^31-1
max stream ID (~2 billion streams per connection)
16 MB
max payload per frame (24-bit length; the default cap is 16 KB until raised via SETTINGS_MAX_FRAME_SIZE)
64 KB-1
default initial flow-control window
100
recommended minimum for SETTINGS_MAX_CONCURRENT_STREAMS (the spec default is unlimited)
4 KB
default HPACK dynamic table size
RFC 7540
HTTP/2 spec; RFC 9113 is the bis (revised) version

Flow Control

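HTTP/2 has two independent flow-control windows: per-stream and per-connection. Both start at 65,535 bytes (64 KB minus 1). Only DATA frames consume window, and each WINDOW_UPDATE credits exactly one window: the stream named in the frame, or the connection when sent on stream 0.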

{`# Initial state
stream 1 window:        65,535 bytes
stream 3 window:        65,535 bytes
connection window:      65,535 bytes

# Server sends 65,535 bytes on stream 1 (split into DATA frames of at
# most 16,384 bytes, the default SETTINGS_MAX_FRAME_SIZE)
DATA stream=1 total=65535                # connection window now: 0
                                          # stream window also 0

# Client sends WINDOW_UPDATE to refill both windows
WINDOW_UPDATE stream=1 increment=65535   # stream window: 65,535 again
WINDOW_UPDATE stream=0 increment=65535   # connection window: 65,535 again

# Server sends the rest (at most 65,535 bytes before the next update)
DATA stream=1 total=65535 flags=END_STREAM  # END_STREAM on the last frame`}

A WINDOW_UPDATE on stream 0 refills the connection window only; each stream's individual window also needs explicit updates. Tune with:

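  • SETTINGS_INITIAL_WINDOW_SIZE (default 65,535 bytes, max 2^31-1): the initial window for every stream; changing it mid-connection also adjusts the windows of streams that are already open
  • WINDOW_UPDATE on stream 0: the only way to grow the connection window, which SETTINGS cannot change (ceiling 2^31-1)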
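The interaction between the two windows is easier to see in code. The sketch below is a toy sender-side accountant, assuming the default 65,535-byte windows; the class and method names are invented for illustration and do not come from any library.

```python
class FlowControl:
    """Toy sender-side accounting for HTTP/2 flow-control windows."""

    DEFAULT_WINDOW = 65_535

    def __init__(self):
        self.connection = self.DEFAULT_WINDOW
        self.streams = {}

    def open_stream(self, stream_id):
        self.streams[stream_id] = self.DEFAULT_WINDOW

    def sendable(self, stream_id):
        # A sender is limited by whichever window is smaller.
        return min(self.connection, self.streams[stream_id])

    def send_data(self, stream_id, nbytes):
        if nbytes > self.sendable(stream_id):
            raise ValueError("DATA would exceed the flow-control window")
        self.connection -= nbytes
        self.streams[stream_id] -= nbytes

    def window_update(self, stream_id, increment):
        # Stream 0 credits the connection window; any other ID credits one stream.
        if stream_id == 0:
            self.connection += increment
        else:
            self.streams[stream_id] += increment


fc = FlowControl()
fc.open_stream(1)
fc.open_stream(3)
fc.send_data(1, 65_535)
print(fc.sendable(3))          # stream 3 is blocked by the connection window
fc.window_update(1, 65_535)
print(fc.sendable(1))          # still blocked until stream 0 is credited
fc.window_update(0, 65_535)
print(fc.sendable(1))
```

Note how stream 3, which has sent nothing, is still blocked once stream 1 drains the shared connection window.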

At 500 ms RTT, the default 64 KB window limits throughput to ~1 Mbps (64 KB / 0.5s = 128 KB/s). Bump to 16 MB and you get ~256 Mbps — dramatic for satellite links or bulk transfers.
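The arithmetic above generalizes to any window and RTT: a sender can have at most one window of data in flight per round trip. A small sketch, using decimal megabits (so 16 MiB works out to about 268 Mbps, the same ballpark as the ~256 figure computed with binary units):

```python
def max_throughput_mbps(window_bytes, rtt_seconds):
    """Upper bound on throughput when one window is delivered per round trip."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


default_window = 65_535            # default initial window in bytes
large_window = 16 * 1024 * 1024    # 16 MiB

print(round(max_throughput_mbps(default_window, 0.5), 2))  # 1.05
print(round(max_throughput_mbps(large_window, 0.5), 2))    # 268.44
```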

Connection Management

HTTP/2 connections follow a precise handshake: TLS ALPN negotiation, connection preface, and initial SETTINGS exchange. The protocol is strict about the order — deviating from it is grounds for immediate connection closure.

{`# 1. TLS handshake with ALPN
# Client offers in ClientHello:
ALPN: ["h2", "http/1.1"]
# Server picks in ServerHello:
ALPN: h2

# 2. HTTP/2 connection preface (client sends immediately after TLS)
PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n   # 24-byte magic string
# Followed by SETTINGS frame (empty, stream=0)

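# 3. Server sends its own SETTINGS frame (possibly empty) as its preface
# 4. Each side acknowledges the peer's SETTINGS with an empty SETTINGS
#    frame carrying the ACK flag
# 5. Normal request/response begins

# Why the magic string?
# "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n" looks like an HTTP/1.x request with an
# unknown method and version, so an HTTP/1.1 server or proxy rejects it
# instead of misreading the binary frames that follow.
# HTTP/2-aware servers recognize it and continue in HTTP/2 mode.

# Alternative for cleartext (h2c): the HTTP/1.1 Upgrade header from RFC 7540
GET / HTTP/1.1\r\nHost: example.com\r\nConnection: Upgrade, HTTP2-Settings\r\nUpgrade: h2c\r\nHTTP2-Settings: (base64url-encoded SETTINGS payload)\r\n\r\n
# Server responds with 101 Switching Protocols, then speaks HTTP/2.
# This costs an extra RTT and applies only to cleartext http:// (e.g. port 80).
# RFC 9113 deprecates the Upgrade mechanism.`}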
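As a rough illustration of what the client's first bytes look like after TLS, the sketch below builds the 24-byte preface followed by a SETTINGS frame. The helper name is hypothetical, and the two settings shown (0x3 for max concurrent streams, 0x4 for initial window size) are example values, not recommendations.

```python
import struct

CLIENT_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
SETTINGS_TYPE = 0x4
ACK_FLAG = 0x1


def settings_frame(settings=None, ack=False):
    """Build a SETTINGS frame on stream 0.

    Each setting is encoded as a 16-bit identifier plus a 32-bit value.
    """
    settings = settings or {}
    if ack and settings:
        raise ValueError("a SETTINGS ACK must have an empty payload")
    payload = b"".join(struct.pack(">HI", ident, value)
                       for ident, value in settings.items())
    flags = ACK_FLAG if ack else 0
    header = (len(payload).to_bytes(3, "big")
              + bytes([SETTINGS_TYPE, flags])
              + (0).to_bytes(4, "big"))  # stream ID 0: connection-level
    return header + payload


# Example values only: allow 100 concurrent streams, 1 MiB initial window.
first_flight = CLIENT_PREFACE + settings_frame({0x3: 100, 0x4: 1_048_576})
print(len(CLIENT_PREFACE), len(first_flight))
print(settings_frame(ack=True).hex())
```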

PING — Keepalive and RTT Measurement

{`PING
  length=8
  type=PING
  flags=0          # a PING request carries no flags; the reply sets ACK
  stream=0         # stream 0 only: connection-level
  data=0x0102030405060708   # opaque 8-byte payload, echoed back

# The receiver must answer with a PING frame that has the ACK flag set
# and the identical 8-byte payload. Either endpoint may send a PING at
# any time, for example after the connection has been idle. Useful for:

# 1. Measuring RTT: send PING, measure time until ACK arrives
# 2. Keepalive through NAT/firewall timeouts (typically >60s)
# 3. Detecting dead connections faster than TCP keepalive
#    (which can take minutes on some platforms)

# Liveness policy is up to the implementation, for example:
# - close after N consecutive PINGs go unanswered
# - close when no PING ACK arrives within a configured timeout

# HTTP/2 itself defines no idle-timeout setting and does not require
# periodic frames; idle timeouts (often around 60s) are configured per
# server, proxy, or client library.`}

GOAWAY — Graceful Shutdown

{`# Server initiates graceful shutdown
GOAWAY
  length=8
  stream=0
  error=NO_ERROR
  last-stream-id=14
# "I may have acted on streams up to 14. Streams above 14 were not
# processed and can be retried elsewhere. Don't open new streams on
# this connection; I'll finish the in-flight ones up to 14."

# GOAWAY carries one error code (NO_ERROR for a graceful shutdown):
# PROTOCOL_ERROR, INTERNAL_ERROR, FLOW_CONTROL_ERROR, SETTINGS_TIMEOUT,
# STREAM_CLOSED, FRAME_SIZE_ERROR, REFUSED_STREAM,
# CANCEL, COMPRESSION_ERROR, CONNECT_ERROR,
# ENHANCE_YOUR_CALM (peer is generating excessive load), INADEQUATE_SECURITY,
# HTTP_1_1_REQUIRED (the request must be retried over HTTP/1.1)

# A client receiving GOAWAY should:
# - Stop opening new streams on this connection
# - Retry streams with IDs above last-stream-id on a new connection
# - Let streams at or below last-stream-id finish, then close

# GOAWAY with NO_ERROR is not a hard error: the connection stays open
# until the in-flight streams complete. It's a graceful drain.`}
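On the client side, handling GOAWAY mostly comes down to partitioning open streams around last-stream-id. A minimal sketch, with a hypothetical function name and made-up stream IDs:

```python
def split_on_goaway(open_stream_ids, last_stream_id):
    """Partition a client's open streams after receiving GOAWAY.

    Streams at or below last_stream_id may still complete on this
    connection; streams above it were not processed and are safe to
    retry on a new connection.
    """
    may_complete = sorted(s for s in open_stream_ids if s <= last_stream_id)
    safe_to_retry = sorted(s for s in open_stream_ids if s > last_stream_id)
    return may_complete, safe_to_retry


may_complete, safe_to_retry = split_on_goaway({11, 13, 15, 17}, last_stream_id=14)
print(may_complete)   # [11, 13]
print(safe_to_retry)  # [15, 17]
```

Retrying only the streams above last-stream-id matters for non-idempotent requests: anything at or below it might already have had side effects on the server.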

The Frame Types

Ten frame types do all the work. HEADERS and DATA carry the request and response; everything else is control plane.

Frame | Purpose
HEADERS | HTTP request/response headers, HPACK-compressed
DATA | Request/response body bytes
CONTINUATION | Continuation of a HEADERS or PUSH_PROMISE header block too big for one frame
SETTINGS | Connection-level config: max streams, initial window size, header table size
WINDOW_UPDATE | Flow-control credit; tells the sender it may send N more bytes on a stream or on the connection
RST_STREAM | Cancel a single stream; like a TCP RST, but scoped to one stream rather than the whole connection
GOAWAY | Connection shutdown; "don't open new streams, here's the last stream I'll process"
PING | Heartbeat / RTT measurement; payload echoed by peer
PRIORITY | Stream priority hint (largely ignored, deprecated in RFC 9113)
PUSH_PROMISE | Server push (deprecated; Chrome removed support in 2022)

HPACK Header Compression

HTTP requests carry a lot of repeated header text: User-Agent, Accept, Cookie. HPACK compresses headers with a static table (61 predefined common header fields), a dynamic table (a per-connection FIFO of recently sent fields, oldest evicted first), and Huffman encoding of string literals.

{`# Static table - first 61 entries are predefined, both sides know them
1 :authority
2 :method GET
3 :method POST
4 :path /
5 :path /index.html
6 :scheme http
7 :scheme https
8 :status 200
...
61 www-authenticate

# Dynamic table - filled at runtime as headers are sent
# Once "user-agent: Mozilla/5.0..." has been sent on this connection,
# the next request just sends the index (e.g., 62) referring to it.

# Encoding example - first request
:method      GET                    -> static idx 2 (1 byte)
:path        /api/users             -> literal-with-incremental indexing (adds to dynamic)
:scheme      https                  -> static idx 7 (1 byte)
:authority   api.example.com        -> literal-with-incremental indexing
user-agent   Mozilla/5.0 ...        -> literal-with-incremental indexing
cookie       session=abc...         -> literal-with-incremental indexing

# Second request - same path, same UA, same cookie: every field is now a
# table index of about 1 byte, so the header block shrinks from hundreds
# of bytes to a handful`}

The dynamic table is per-connection and per-direction (client→server has its own table, server→client another). Encoder and decoder must apply every insertion in the same order, so header blocks are decoded in exactly the order they were sent on the connection; this is why HEADERS and CONTINUATION frames cannot be interleaved with other frames.
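A toy model of the dynamic table shows how indexing and eviction behave. This is a sketch, not an HPACK implementation: it ignores Huffman coding and wire encoding, the class name is invented, and it assumes the entry-size rule of name length plus value length plus 32 bytes of overhead, with dynamic indices starting right after the 61 static entries.

```python
from collections import deque

STATIC_TABLE_SIZE = 61   # dynamic entries are numbered from 62
ENTRY_OVERHEAD = 32      # per-entry accounting overhead in bytes


def entry_size(name, value):
    return len(name.encode()) + len(value.encode()) + ENTRY_OVERHEAD


class DynamicTable:
    """Toy HPACK dynamic table: newest entry first, oldest evicted first."""

    def __init__(self, max_size=4096):
        self.max_size = max_size
        self.size = 0
        self.entries = deque()

    def add(self, name, value):
        needed = entry_size(name, value)
        # Evict from the old end until the new entry fits.
        while self.entries and self.size + needed > self.max_size:
            old_name, old_value = self.entries.pop()
            self.size -= entry_size(old_name, old_value)
        # An entry larger than the whole table just leaves it empty.
        if needed <= self.max_size:
            self.entries.appendleft((name, value))
            self.size += needed

    def index_of(self, name, value):
        for offset, entry in enumerate(self.entries):
            if entry == (name, value):
                return STATIC_TABLE_SIZE + 1 + offset
        return None


table = DynamicTable()
table.add("user-agent", "Mozilla/5.0")
table.add("cookie", "session=abc")
print(table.index_of("cookie", "session=abc"))       # 62 (newest)
print(table.index_of("user-agent", "Mozilla/5.0"))   # 63 (shifted by the insert)
```

Because every insertion shifts the indices of older entries, the encoder and decoder only stay in sync if they process insertions in the same order.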

HPACK is deliberately limited to resist compression side-channel attacks. CRIME (2012) and BREACH (2013) exploited differences in compressed size under DEFLATE/gzip to guess secret values byte by byte. HPACK mitigates this by matching header fields only as a whole: an entry in the dynamic table is referenced atomically by index, never by substring, so a guess reveals nothing unless the entire value is correct. Senders can additionally encode sensitive fields such as Cookie or Authorization as never-indexed literals, which keeps those values out of the dynamic table altogether.

QPACK, HTTP/3's header compression, extends HPACK with unidirectional streams so the dynamic table can be built in parallel without head-of-line blocking on the table reference itself.

Multiplexing on One TCP Connection

Streams interleave on a single TCP connection. Stream IDs are odd for client-initiated and even for server-initiated. A request and its response use the same stream ID.

{`# Wire-level interleaving (each frame tagged with its stream ID)
HEADERS    stream=1  GET /a
HEADERS    stream=3  GET /b
HEADERS    stream=5  GET /c
DATA       stream=3  
DATA       stream=1  
DATA       stream=5  
DATA       stream=3             END_STREAM
DATA       stream=1             END_STREAM
DATA       stream=5             END_STREAM

# All three responses come back interleaved, no head-of-line blocking
# at the application level (but TCP is still serial - see below).`}

Stream Priority (Deprecated)

RFC 7540 included a stream priority tree: parent stream, weight (1-256), exclusive flag. The server was supposed to allocate bandwidth proportionally. In practice, implementations were inconsistent and frequently wrong, and HTTP/3 dropped the scheme entirely.

{`# RFC 7540 PRIORITY frame - largely ignored in production
PRIORITY stream=5
  exclusive=false
  parent=3
  weight=128

# RFC 9113 (HTTP/2 bis) deprecates this in favor of the
# extensible priority scheme (RFC 9218):
priority: u=2, i

# u = urgency (0-7, lower = more urgent), i = incremental delivery
# Carried as a regular HTTP header, simpler, supported by Chrome/Firefox`}

Server Push (Deprecated)

The server could volunteer responses for resources the client hadn't yet asked for via PUSH_PROMISE. Aimed to reduce round trips for predictable sub-resources. In practice, browsers cached imperfectly, push consumed bandwidth on already-cached resources, and Chrome removed support in M106 (2022).

{`# Server sends PUSH_PROMISE before the client asked for the resource
PUSH_PROMISE stream=1  (promises response on stream 2)
  :method    GET
  :path      /fonts/example.woff2
  :scheme    https
  :authority api.example.com

# Server then sends the promised response
HEADERS    stream=2  200 OK
DATA       stream=2  

# Client can cancel with RST_STREAM if it already has the resource cached
RST_STREAM stream=2  error=CANCEL
# CANCEL (or REFUSED_STREAM) tells the server the push is not wanted`}

PUSH_PROMISE reserves a stream ID and tells the client what response is coming, so the client can deduplicate. But the semantics are subtle: the PUSH_PROMISE travels on the stream of the request that triggered it, while the pushed response arrives on the separately reserved (even-numbered) stream, so the client has to track which pushed response belongs to which request.

The replacement is 103 Early Hints: the server sends an interim response with Link: rel=preload headers before the real response, letting the client start fetching sub-resources during server processing time. Early Hints is client-driven: the client decides whether to act on the hints, avoiding the push bandwidth-wasting problem entirely.

Connection Coalescing

HTTP/2 allows a client to reuse a single TCP connection for requests to multiple origins, provided those origins resolve to the same IP address and port and the server's TLS certificate is valid for all of them (typically via Subject Alternative Names). This is connection coalescing, and it avoids the per-origin connection overhead that HTTP/1.1 suffered from.

{`# Coalescing requires both conditions:
# 1. The origins resolve to the same IP address and port
# 2. The TLS certificate presented on the connection covers every
#    coalesced origin (SAN match)

# Example: api.example.com and www.example.com share one IP
# DNS resolves both to 93.184.216.34 (port 443)
# TLS cert has SAN for both: same connection, two origins

# Clients apply these checks themselves; no request header opts in.

# Coalescing breaks when:
# - The server returns 421 (Misdirected Request) for an origin
# - The origins require different client certificates
# 421 means "I can't serve this origin on this connection": the client
# retries the request on a new connection for that origin`}

HTTP/3's connection migration is a separate mechanism: a QUIC connection is identified by Connection IDs, not by the IP/port 4-tuple. A client can move the connection to a different network path (for example, a new source address or port when switching from Wi-Fi to cellular) without reconnecting. Migration does not replace coalescing; HTTP/3 clients can still reuse one connection for several origins under similar certificate rules.

Dependency Trees and Prioritization

RFC 7540 included stream dependencies as a way to tell the server which request matters most. The system was a weighted tree: each stream optionally depends on another stream, with a weight (1-256). RFC 9113 deprecated this in favor of the RFC 9218 extensible priority scheme, but understanding the original helps explain what went wrong.

[Figure: HTTP/2 stream dependency tree (RFC 7540). Stream 1 is the parent of stream 3 (weight 100) and stream 5 (weight 50); stream 3 is the parent of stream 7 (weight 32) and stream 9 (weight 32). Stream 3 gets 100/(100+50) = 67% of stream 1's bandwidth; streams 7 and 9 share stream 3's 67% equally, about 33% each.]
{`# RFC 7540 PRIORITY frame (deprecated)
PRIORITY stream=3 exclusive=true parent=1 weight=100
# stream 3 depends exclusively on stream 1 (no other children)
# gets 100/(100+50) = 67% of available bandwidth

PRIORITY stream=5 exclusive=false parent=1 weight=50
# stream 5 depends on stream 1, shares with stream 3
# gets 50/(100+50) = 33%

PRIORITY stream=7 exclusive=false parent=3 weight=32
PRIORITY stream=9 exclusive=false parent=3 weight=32
# streams 7 and 9 share stream 3's share equally

# The exclusive flag: if true, all existing children of parent
# become children of the new stream. A stream "takes over" its siblings.
PRIORITY stream=11 exclusive=true parent=1 weight=256
# streams 3 and 5 now depend on stream 11 (reparented)
# stream 11 gets all of stream 1's capacity`}
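The percentages in the example follow from splitting each parent's share among its children in proportion to weight. The sketch below reproduces that arithmetic for the tree above. It models only the proportional split; in RFC 7540, dependents receive capacity only when their parent cannot use it, which this simplification ignores. The function and variable names are illustrative.

```python
def bandwidth_shares(children, weights, root, share=1.0):
    """Split a parent's share among its children in proportion to weight."""
    result = {}
    kids = children.get(root, [])
    total = sum(weights[kid] for kid in kids)
    for kid in kids:
        kid_share = share * weights[kid] / total
        result[kid] = kid_share
        # Recurse: grandchildren divide the child's share the same way.
        result.update(bandwidth_shares(children, weights, kid, kid_share))
    return result


children = {1: [3, 5], 3: [7, 9]}
weights = {3: 100, 5: 50, 7: 32, 9: 32}

for stream, share in sorted(bandwidth_shares(children, weights, root=1).items()):
    print(f"stream {stream}: {share:.0%}")
```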

The problem was that servers disagreed on the algorithm. Some did strict weighted fair queuing, some ignored weights, and some handled the exclusive flag incorrectly. The result was unpredictable and often worse than no prioritization at all. Browsers diverged too, each building a different dependency structure, so prioritization worked partially but never reliably.

{`# RFC 9218 — Extensible Priority Scheme (current)
# Carried as an HTTP header, not a frame:
:path   /api/users
priority: u=3, i
# u = urgency (0-7, lower = more urgent)
# i = incremental (boolean)

# Urgency and incremental are independent: any urgency can be
# incremental or not. The default is u=3, non-incremental.
# One illustrative mapping (an example, not mandated by the RFC):
# urgency=0, non-incremental: blocking JS, critical CSS
# urgency=1, non-incremental: async JS, preloaded assets
# urgency=3: images, fonts, lazy-loaded content
# urgency=7, incremental: server-sent events, streaming

# Chrome/Firefox send this header instead of PRIORITY frames.
# Server support varies, so check whether yours honors it.`}
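Reading the header on the server side is straightforward. The following is a deliberately small sketch that handles only the `u` and `i` parameters and falls back to the defaults (urgency 3, non-incremental) for anything missing or out of range; it is not a full structured-fields parser, and the function name is made up.

```python
def parse_priority(value):
    """Parse a priority header value such as "u=2, i" into (urgency, incremental)."""
    urgency, incremental = 3, False   # defaults when a parameter is absent
    for part in value.split(","):
        part = part.strip()
        if part in ("i", "i=?1"):     # a bare key means boolean true
            incremental = True
        elif part == "i=?0":
            incremental = False
        elif part.startswith("u="):
            try:
                candidate = int(part[2:])
            except ValueError:
                continue              # ignore malformed urgency
            if 0 <= candidate <= 7:
                urgency = candidate
    return urgency, incremental


print(parse_priority("u=2, i"))   # (2, True)
print(parse_priority(""))         # (3, False)
```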

CONTINUATION Frames

HEADERS frames have a maximum size set by SETTINGS_MAX_FRAME_SIZE (default 16 KB). When the HPACK-encoded headers exceed that, the rest goes in CONTINUATION frames. CONTINUATION must immediately follow HEADERS on the same stream with no other frames interleaved — an exception to the otherwise-free interleaving.

{`HEADERS      stream=1  flags=                       
CONTINUATION stream=1  flags=                       
CONTINUATION stream=1  flags=END_HEADERS            
DATA         stream=1  flags=END_STREAM             

# Once a header block starts, no other frame of any type, on any stream,
# may appear until the frame carrying END_HEADERS. This is the one place
# where HTTP/2 forbids interleaving. Breaking it (for example, sending
# DATA between CONTINUATIONs) is a connection error of type
# PROTOCOL_ERROR and triggers GOAWAY.`}

CONTINUATION Flood (CVE-2024-27316)

In early 2024, a vulnerability was disclosed: an attacker sends a HEADERS frame followed by an unbounded number of CONTINUATION frames without ever setting END_HEADERS. The server accumulates header bytes in a per-stream buffer, growing memory until OOM. CVE-2024-27316 is the identifier assigned to Apache httpd; other implementations received their own. Affected implementations included nghttp2 (CVE-2024-28182), h2o, Go's net/http (http2, CVE-2023-45288), and others.

The fix: a server-side cap on total header bytes per header block (e.g., 256 KB), enforced while fragments arrive rather than only once END_HEADERS is seen. A peer exceeding the limit gets a connection error (commonly PROTOCOL_ERROR or ENHANCE_YOUR_CALM) and GOAWAY. SETTINGS_MAX_HEADER_LIST_SIZE only advertises the limit to the peer and is advisory, so a frontend proxy or edge layer must enforce the cap itself to keep unconstrained header accumulation from reaching your application.
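The essence of the mitigation is to count bytes as fragments arrive and give up early. A minimal sketch, assuming a 256 KB cap as in the example above; the class and exception names are hypothetical, and a real server would turn the exception into a connection error and GOAWAY.

```python
class HeaderBlockTooLarge(Exception):
    """Raised when a header block exceeds the configured cap."""


class HeaderBlockAssembler:
    """Accumulates HEADERS/CONTINUATION fragments under a hard byte cap."""

    def __init__(self, max_bytes=256 * 1024):
        self.max_bytes = max_bytes
        self.buffer = bytearray()

    def on_fragment(self, fragment, end_headers):
        # Check the cap before buffering, whether or not END_HEADERS is set.
        if len(self.buffer) + len(fragment) > self.max_bytes:
            self.buffer.clear()
            raise HeaderBlockTooLarge("header block exceeds cap")
        self.buffer += fragment
        if end_headers:
            block = bytes(self.buffer)
            self.buffer.clear()
            return block          # complete block, ready for HPACK decoding
        return None               # still waiting for more fragments


assembler = HeaderBlockAssembler(max_bytes=10)
print(assembler.on_fragment(b"abcd", end_headers=False))
print(assembler.on_fragment(b"efgh", end_headers=True))
```

Checking before appending means memory use never exceeds the cap, no matter how many CONTINUATION frames an attacker sends.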

The fundamental flaw HTTP/2 couldn't fix: it's still TCP underneath. If a packet on stream 5 is lost, TCP's retransmit logic stalls the receive buffer for all streams sharing that connection — even streams whose bytes already arrived.

[Figure: TCP HOL blocking on a multiplexed HTTP/2 connection. Three streams share one connection over time: stream 1 (P1.1, P1.2, P1.3), stream 5 (P5.1 lost, P5.2, P5.3), stream 9 (P9.1, P9.2, P9.3). Because P5.1 is lost, TCP holds all packets after it, including those of streams 1 and 9, until the retransmit arrives. Streams 1 and 9 are completely independent but stall anyway.]

Tools and Commands

Practical commands for inspecting, benchmarking, and debugging HTTP/2 in the wild.

{`# Check if a site supports HTTP/2 (ALPN negotiation)
openssl s_client -connect example.com:443 -alpn h2 < /dev/null 2>/dev/null | grep -i alpn
# Expect a line such as: ALPN protocol: h2

# Speak HTTP/2 with prior knowledge over cleartext (h2c, gRPC-style);
# this skips the Upgrade request and only makes sense for http:// URLs
curl -v --http2-prior-knowledge http://api.example.com/health
# The verbose log should show HTTP/2 in use from the first request

# Fetch over HTTP/2 (negotiated via ALPN) with verbose output
curl -v --http2 -o /dev/null https://example.com/api/endpoint 2>&1 | grep -iE 'alpn|HTTP/2'
# Look for the ALPN lines and an HTTP/2 status line

# Wireshark capture filter for HTTP/2 traffic
tcp port 443 or tcp port 80
# TLS traffic must be decrypted first (e.g. via a key log file) before
# the HTTP/2 dissector can show frames. Display filters by frame type:
http2.type == 0x0    # DATA frames
http2.type == 0x1    # HEADERS frames
http2.type == 0x4    # SETTINGS frames
http2.type == 0x6    # PING frames
http2.type == 0x7    # GOAWAY frames
http2.type == 0x8    # WINDOW_UPDATE frames

# nghttp2 tools — benchmarking with h2load
brew install nghttp2  # macOS
h2load -n 1000 -c 100 -m 10 https://example.com/
# -n 1000  : 1000 total requests
# -c 100   : 100 concurrent clients
# -m 10    : 10 concurrent streams per connection
# Output shows RPS, latency percentiles, transfer speeds

# nghttp2 — make a single HTTP/2 request and show frames
nghttp -v https://example.com/
# Shows frame-by-frame: SETTINGS, HEADERS, DATA, WINDOW_UPDATE, etc.

# k6 script for load testing (JavaScript). k6 negotiates HTTP/2
# automatically when the server offers it. Save as script.js, then:
#   k6 run script.js
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    example: {
      executor: 'constant-vus',
      vus: 50,
      duration: '30s',
    },
  },
};

export default function () {
  const res = http.get('https://example.com/api/endpoint');
  check(res, {
    'status is 200': (r) => r.status === 200,
    // res.proto reports the negotiated protocol
    'protocol is HTTP/2': (r) => r.proto === 'HTTP/2.0',
  });
}`}

When debugging, remember that curl -v shows HTTP/2 negotiation but not the frames themselves. nghttp -v gives frame-level detail. For production traffic, Wireshark with its HTTP/2 dissector (given the TLS session keys) or mitmproxy gives the full picture. nghttp2's h2load is a widely used tool for benchmarking HTTP/2 against HTTP/1.1 on the same endpoint, and against HTTP/3 when built with QUIC support.

The Road to HTTP/3 over QUIC

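QUIC moves stream multiplexing into the transport, below the loss-recovery boundary. Packets are still acknowledged and retransmitted per connection, but each stream's bytes are delivered independently, so a lost packet stalls only the streams whose data it carried, not the others. QUIC also integrates TLS 1.3 into its handshake (1-RTT, or 0-RTT on resumption) and runs over UDP.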

Aspect | HTTP/1.1 | HTTP/2 | HTTP/3
Wire format | ASCII text | Binary frames | QUIC frames
Transport | TCP | TCP | UDP (QUIC)
Multiplexing | None (per-conn) | App-level streams | Transport-level streams
Header compression | None | HPACK | QPACK (HPACK-like, async)
HOL blocking | Per-conn | TCP layer | Per-stream only
Connection setup | 1-RTT TCP + 1-2 RTT TLS | same | 1-RTT or 0-RTT total
Connection migration | No | No | Yes (Connection ID)

Tradeoffs

HTTP/2 vs 1.1

Wins on header overhead, parallel requests, server-initiated traffic. Loses on debugging (binary), middlebox compatibility (some old proxies break), and the HOL blocking nobody fully fixed.

One connection or many

Browsers default to one HTTP/2 connection per origin. For high-throughput backends (gRPC), pooling multiple connections amortizes HOL blocking risk — at the cost of HPACK efficiency lost across pool members.

Server push was a mistake

It guesses at what the client wants. The client knows what it has cached. The server knows neither. Early Hints + client-driven preload is strictly better.

HTTP/2 in the data center

gRPC runs on HTTP/2 and benefits from multiplexed streams. Latency-sensitive RPC over HTTP/2 still suffers from TCP head-of-line blocking; some shops are migrating to HTTP/3 internally.

FAQ

Why is HTTP/2 binary?

Parsing speed and unambiguity. Binary frames have a fixed header layout with explicit lengths that is cheap to parse and far harder to misinterpret. Text protocols accumulate edge cases (CRLF in header values, whitespace handling) that length-prefixed binary framing avoids.

Can I use HTTP/2 without TLS?

The spec allows h2c (cleartext HTTP/2), but no major browser supports it. In practice HTTP/2 is HTTPS-only on the public internet, with h2c reserved for internal RPC traffic and gRPC.

What's ALPN?

Application-Layer Protocol Negotiation, a TLS extension. During the handshake the client offers ["h2", "http/1.1"] and the server picks. Without ALPN the client wouldn't know if the server speaks HTTP/2.

Why is the dynamic HPACK table only 4 KB?

Memory cost on the server. Each connection has two dynamic tables (one per direction). With 100K concurrent connections and 4 KB per direction, that's 800 MB. Larger tables compress better but bloat memory at scale.

Is HTTP/2 obsolete now that HTTP/3 exists?

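No. HTTP/2 is supported almost everywhere; HTTP/3 deployment is partial (CDN edges, major browsers, fewer origin servers). Both will coexist for the foreseeable future. Clients choose between HTTP/2 and HTTP/1.1 through ALPN and discover HTTP/3 separately, typically through an Alt-Svc header or an HTTPS DNS record, falling back to HTTP/2 when QUIC is unavailable.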

What's the deal with the H2 prior knowledge upgrade?

If client and server both know they speak HTTP/2 (e.g., gRPC over h2c), they can skip the HTTP/1.1 Upgrade dance and start with the HTTP/2 connection preface immediately. Saves an RTT but only works when both sides are pre-configured.

Why did HTTP/2 prioritization get replaced?

RFC 7540's priority tree required servers to maintain a weighted dependency tree and schedule streams from it. In practice, servers couldn't agree on the algorithm (some ignored priorities entirely, others implemented weighted fair queuing incorrectly). The result: prioritization either didn't work or made delivery order worse than no prioritization at all. RFC 9113 deprecated the scheme, and RFC 9218 defines its replacement, the extensible scheme (urgency 0-7, incremental flag) carried in a priority header, which Chrome and Firefox send.

What is CONTINUATION flood?

A denial-of-service attack tracked as CVE-2024-27316 for Apache httpd, with separate CVEs for other implementations: an attacker sends a HEADERS frame followed by endless CONTINUATION frames without setting END_HEADERS. The server keeps accumulating header bytes into a per-stream buffer, eventually exhausting memory and crashing. The fix is a server-side cap on total header bytes per header block, enforced whether or not END_HEADERS has arrived. Affected: nghttp2, h2o, Go's net/http (http2), and others. Always run behind a frontend that limits header size.