JIT Compilation: Ignition to TurboFan
Why Your Code Gets Faster the Longer It Runs
Run a function once and it takes 50ms. Run it 10,000 times and each call takes 0.02ms. Wait, what? Same function, same inputs, 2500x faster. This isn't caching — it's V8 literally rewriting your JavaScript into increasingly optimized machine code while your program runs.
This is Just-In-Time (JIT) compilation: the engine watches your code's runtime behavior, makes bets about types and control flow, and generates specialized machine code based on those bets. The more it watches, the better it gets. And understanding this pipeline is how you stop being surprised by performance cliffs.
V8's compilation pipeline has four tiers, each trading compilation speed for execution speed:
Tier 0: Parsing and the AST
Before any of the fun stuff happens, V8 has to parse your source code into an Abstract Syntax Tree (AST). But even here, V8 is already making strategic decisions. It uses two parsing strategies:
- Eager parsing: the full function body is parsed immediately. Used for top-level code and functions that V8 expects will run soon.
- Lazy parsing (pre-parsing): only function signatures are parsed. The body is parsed later if the function is actually called. This saves startup time for large codebases where many functions are never invoked.
function hotPath() { // Eagerly parsed if called at module level
  return computeStuff();
}

function rarePath() { // Lazy parsed: body skipped until first call
  return handleEdgeCase();
}
Lazy-parsed functions pay a cost: when finally called, V8 must re-parse the function body from scratch. If a function is always called during startup, lazy parsing wastes time. V8's heuristics for choosing eager vs. lazy parsing are good but not perfect: IIFEs and functions called at the top level are parsed eagerly.
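One practical lever here: V8 treats a function expression that starts with an open parenthesis (a "PIFE") as likely to be called immediately and parses it eagerly. A minimal sketch of the two cases (coldHelper and initConfig are made-up names for illustration):

```javascript
// Plain declaration: a candidate for lazy parsing; only the signature
// is pre-parsed, and a full parse happens on the first call.
function coldHelper() {
  return 'rarely used';
}

// A parenthesized function expression is a hint V8 recognizes: it
// expects an imminent call and parses the body eagerly, so the first
// call pays no re-parse cost.
const initConfig = (function () {
  return { ready: true };
});

console.log(initConfig().ready); // true
console.log(coldHelper());       // 'rarely used'
```

Build tools such as optimize-js exploited exactly this heuristic by wrapping hot functions in parentheses.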
Tier 1: Ignition — The Interpreter
This is where every function starts its life. Ignition compiles the AST into bytecode — a compact, platform-independent instruction set. Bytecode is not machine code; it's interpreted by Ignition's bytecode handler loop.
function add(a, b) {
  return a + b;
}
V8 compiles this into bytecode roughly like:
Ldar a1 // Load argument 'b' into accumulator
Add a0, [0] // Add argument 'a' to accumulator, feedback slot [0]
Return // Return accumulator
Key properties of Ignition:
- Fast compilation: bytecode generation is linear-time — no optimization passes
- Low memory: bytecode is 25-50% the size of equivalent machine code
- Feedback collection: every operation records type information in feedback vectors
Most tutorials skip this part, but it's the key to the whole thing. The [0] in the Add instruction is a feedback vector slot. Every time add() runs, V8 records what types a and b were. After enough calls, the feedback vector might say: "this Add operation always received two SMIs (Small Integers)."
This feedback is the foundation of every optimization that follows.
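The mechanics can be sketched with a toy feedback slot. This is a conceptual model, not V8's actual data structure, but it captures the key property: feedback only ever widens, from uninitialized to a concrete kind to "any" (megamorphic):

```javascript
// Classify a value the way type feedback would, coarsely.
function kindOf(v) {
  if (Number.isInteger(v)) return 'smi';
  if (typeof v === 'number') return 'double';
  return 'any';
}

function makeSlot() {
  return { state: 'uninitialized' };
}

// Record one observation of a binary op's operands. The lattice only
// widens: once a slot says 'any', no later call narrows it again.
function recordFeedback(slot, a, b) {
  const kind = kindOf(a) === kindOf(b) ? kindOf(a) : 'any';
  if (slot.state === 'uninitialized') slot.state = kind;
  else if (slot.state !== kind) slot.state = 'any';
}

const slot = makeSlot();
recordFeedback(slot, 1, 2);
console.log(slot.state); // 'smi': a later tier could emit integer adds
recordFeedback(slot, 1.5, 2);
console.log(slot.state); // 'any': speculation is no longer profitable
```

This is why passing a mix of types through a hot function is costly: one stray observation permanently widens the feedback the optimizing tiers rely on.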
Tier 2: Sparkplug — The Baseline Compiler
Sparkplug is a non-optimizing baseline compiler introduced in V8 9.1. It compiles Ignition bytecode directly to machine code without any optimization passes.
Sounds pointless, right? Why bother with unoptimized machine code? Because even unoptimized native code eliminates interpreter overhead: no bytecode dispatch loop, no indirect jumps through handler tables. Sparkplug code runs ~2x faster than Ignition for typical functions. That's a free 2x just by removing the interpreter middleman.
Key properties:
- Extremely fast compilation: Sparkplug walks the bytecode linearly. No IR, no register allocation, no optimization. Compilation takes microseconds per function
- No new feedback: Sparkplug doesn't collect additional type info — it relies on feedback vectors already populated by Ignition
- Stack frame compatible: Sparkplug frames are identical to Ignition frames, so the two tiers can share stack frames and deoptimization can seamlessly fall back to Ignition
Think of Sparkplug as a speed reader converting handwritten notes (bytecode) into typed text (machine code). It doesn't edit for clarity, fix grammar, or reorganize paragraphs — it just transcribes faster than reading handwriting. The content is the same; the delivery is faster.
Tier 3: Maglev — The Mid-Tier Optimizer
Here's where things start getting really interesting. Maglev (enabled by default in Chrome 117) fills the gap between Sparkplug's zero optimization and TurboFan's heavy optimization. It performs lightweight optimizations using feedback vectors:
- Type specialization: if feedback says a is always a SMI, Maglev emits SMI-specific machine code
- Inline caching: property accesses compiled with type guards based on observed shapes
- Basic inlining: small, hot callees inlined into callers
- No complex graph optimizations: no escape analysis, no loop-invariant code motion, no advanced scheduling
Maglev compiles 10-100x faster than TurboFan while producing code that runs at roughly 50-80% of TurboFan's speed. For many functions, Maglev is "good enough" and TurboFan never needs to engage.
Compilation speed: Sparkplug >> Maglev >> TurboFan
Execution speed: TurboFan > Maglev > Sparkplug > Ignition
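The shape-based inline caching mentioned above can be sketched as a monomorphic fast path. This is a conceptual model only: real ICs compare hidden-class pointers, not property-name strings, and live in generated code rather than a closure:

```javascript
// Toy monomorphic inline cache for reading obj.x: remember the one
// shape seen so far and take a fast path while it keeps matching.
function makeLoadX() {
  let cachedShape = null;
  return function loadX(obj) {
    const shape = Object.keys(obj).join(','); // stand-in for a hidden class
    if (shape === cachedShape) {
      return obj.x; // fast path: cheap shape check passed
    }
    cachedShape = shape; // miss: cache the new shape (monomorphic IC)
    return obj.x;        // real code would do a full property lookup here
  };
}

const loadX = makeLoadX();
console.log(loadX({ x: 1, y: 2 })); // 1 (miss, caches shape "x,y")
console.log(loadX({ x: 7, y: 8 })); // 7 (hit: same shape, fast path)
```

Feeding one call site objects of many different shapes defeats this cache, which is why polymorphic property access is slower than monomorphic access.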
Tier 4: TurboFan — The Peak Optimizer
This is the big one. TurboFan is V8's heavy-duty optimizing compiler. It kicks in for very hot functions — code that runs thousands of times with stable type feedback.
TurboFan builds a sea-of-nodes IR (Intermediate Representation): a graph where nodes represent operations and edges represent data flow and control flow. This IR enables powerful optimizations:
Speculative Optimization
TurboFan reads feedback vectors and speculates that observed types will continue:
function multiply(a, b) {
  return a * b;
}

// Called 10,000 times with integers
// TurboFan speculates: a and b are always SMI
// Emits: integer multiply instruction + type guard
If the speculation is wrong (someone passes a string), a type guard fails and V8 deoptimizes: it bails out of the optimized code and resumes execution in Ignition at the equivalent bytecode position, with the feedback updated so the failed speculation isn't repeated.
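The guard-plus-fallback structure can be sketched in plain JavaScript. The helper names are hypothetical, and a real deopt is a VM-level frame transition, not a function call, but the control flow is the same idea:

```javascript
// Generic path: what the interpreter's bytecode handles, any types.
function multiplyGeneric(a, b) {
  return a * b;
}

// "Optimized" path a compiler might emit after observing only small
// integers: a cheap guard, then integer-specialized work. A failing
// guard hands control back to the generic path, which is what
// deoptimization does at the machine-code level.
function multiplyOptimized(a, b) {
  if (!Number.isInteger(a) || !Number.isInteger(b)) {
    return multiplyGeneric(a, b); // "deopt": bail to unspecialized code
  }
  return (a * b) | 0; // integer multiply (valid while results fit 32 bits)
}

console.log(multiplyOptimized(6, 7));   // 42 (fast integer path)
console.log(multiplyOptimized('3', 4)); // 12 (guard failed, generic path)
```

The cost asymmetry matters: the guard is a couple of instructions when it passes, but a failure abandons the optimized frame entirely.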
Key TurboFan Optimizations
- Inlining: hot callees are copied into the caller, eliminating call overhead and enabling cross-function optimization
- Escape analysis: objects that don't escape a function are decomposed into local variables — no heap allocation
- Loop-invariant code motion: expressions that don't change across loop iterations are hoisted out
- Dead code elimination: unreachable branches removed based on type feedback
- Bounds check elimination: array access guards removed when the index is provably in range
- Constant folding: compile-time evaluation of constant expressions
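Escape analysis in particular rewards a common pattern: a temporary object used only inside a function can be decomposed into plain locals, so the heap allocation disappears. Conceptually, at least; whether V8 scalar-replaces a given object depends on inlining succeeding first. A sketch of the pattern (makePoint and distance are illustrative names):

```javascript
function makePoint(x, y) {
  return { x, y };
}

function distance(x1, y1, x2, y2) {
  // Neither object escapes distance(): after makePoint is inlined,
  // escape analysis can replace each {x, y} with two locals and skip
  // the allocation entirely.
  const a = makePoint(x1, y1);
  const b = makePoint(x2, y2);
  const dx = a.x - b.x;
  const dy = a.y - b.y;
  return Math.sqrt(dx * dx + dy * dy);
}

console.log(distance(0, 0, 3, 4)); // 5
```

Storing `a` into an outer array or passing it to an un-inlined function would make it escape, forcing a real allocation.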
Sea-of-Nodes IR: Why V8 Uses It
Unlike a traditional CFG (Control Flow Graph) where instructions are ordered within basic blocks, TurboFan's sea-of-nodes IR lets operations float freely — they're only constrained by their data dependencies. This allows optimizations that would be difficult with fixed instruction ordering.
For example, two independent computations can be reordered for better CPU pipeline utilization without explicit scheduling passes. The graph naturally expresses "this operation depends on that value" without implying "this operation must happen before that one."
The downsides: sea-of-nodes is harder to debug, harder to implement correctly, and compilation is slower due to the graph algorithms involved. This is why TurboFan is reserved for the hottest code paths.
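A toy model of the idea (not TurboFan's actual node classes): operations become graph nodes whose only ordering constraints are their input edges, and a scheduler later linearizes them:

```javascript
// Minimal sea-of-nodes sketch: a node knows its inputs and nothing
// else. Any topological order of the graph is a valid schedule.
function node(op, ...inputs) {
  return { op, inputs };
}

const p0 = node('Parameter:a');
const p1 = node('Parameter:b');
const mul = node('Mul', p0, p0);   // a * a
const add = node('Add', p1, p1);   // b + b: independent of mul, so a
                                   // scheduler may order them freely
const sum = node('Add', mul, add); // the only real ordering constraint

// Naive scheduler: a post-order walk yields one valid linearization.
function schedule(root, out = [], seen = new Set()) {
  if (seen.has(root)) return out;
  seen.add(root);
  root.inputs.forEach((i) => schedule(i, out, seen));
  out.push(root.op);
  return out;
}

console.log(schedule(sum));
// [ 'Parameter:a', 'Mul', 'Parameter:b', 'Add', 'Add' ]
```

Because `mul` and `add` have no edge between them, the scheduler could equally emit them in the opposite order, which is exactly the freedom a fixed CFG would not express.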
On-Stack Replacement (OSR)
What about long-running loops? Think about it — if a function enters a loop that runs millions of iterations, V8 can't wait until the function returns to optimize it. The function may never return in a reasonable time.
OSR solves this: V8 compiles the optimized version while the unoptimized code is still running, then replaces the running frame mid-execution:
function processAllData(data) {
  let total = 0;
  for (let i = 0; i < data.length; i++) { // Runs millions of iterations
    total += transform(data[i]); // Hot loop body
  }
  return total;
}
During the loop, V8 detects that processAllData is hot, compiles an optimized version with TurboFan, and swaps the currently executing frame to the optimized code — right in the middle of the loop. The loop counter, local variables, and stack state are mapped from the unoptimized frame to the optimized frame.
OSR has a constraint: the optimized code must be able to pick up exactly where the unoptimized code left off. This limits some optimizations — for example, TurboFan may not be able to fully hoist loop-invariant code if the OSR entry point is inside the loop. Functions that are called frequently (not stuck in one long loop) produce better optimized code because TurboFan can optimize the entire function body without OSR constraints.
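One refactoring this constraint suggests: extract the loop into its own function. Its first run may still be OSR'd, but every later call enters optimized code at the function entry, free of OSR entry-point limits. A sketch, with transform as a stand-in for real per-element work:

```javascript
function transform(x) {
  return x * 2 + 1; // stand-in for real per-element work
}

// The hot loop lives in its own small function now: it can be
// optimized whole, and transform is an easy inlining target.
function hotLoop(data) {
  let total = 0;
  for (let i = 0; i < data.length; i++) {
    total += transform(data[i]);
  }
  return total;
}

function processAllData(data) {
  // setup, validation, logging etc. stay out of the loop function
  return hotLoop(data);
}

console.log(processAllData([1, 2, 3])); // 3 + 5 + 7 = 15
```

Whether this wins in practice depends on call patterns; measure before restructuring hot code on this basis alone.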
Hot Function Detection: How V8 Decides to Optimize
V8 uses invocation counters and loop back-edge counters to decide when to tier up:
- Every function has an invocation counter (incremented on each call)
- Every loop back-edge increments a separate counter
- When counters exceed a threshold, V8 marks the function for optimization
The exact thresholds vary and are tuned per V8 version, but roughly:
- ~100 calls or loop iterations → Sparkplug
- ~1,000 → Maglev
- ~10,000 → TurboFan
In practice, V8 implements these counters as an interrupt budget rather than a simple call count. Each bytecode operation decrements the budget; when it hits zero, V8 checks whether the function should be tiered up. This ensures that both frequently-called small functions and infrequently-called functions with hot loops are detected.
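The budget mechanism can be sketched in a few lines. The constant is illustrative only; V8's real budgets are per-tier and change between versions:

```javascript
const INITIAL_BUDGET = 1000; // illustrative, not V8's real value

function makeProfile() {
  return { budget: INITIAL_BUDGET, tierUpChecks: 0 };
}

// Simulate executing opCount bytecode operations: each op costs
// budget, and exhausting the budget triggers a tier-up check.
function executeOps(profile, opCount) {
  profile.budget -= opCount;
  while (profile.budget <= 0) {
    profile.tierUpChecks += 1;        // V8 would decide here whether to
    profile.budget += INITIAL_BUDGET; // recompile at a higher tier
  }
}

const p = makeProfile();
executeOps(p, 2500); // one long hot loop, no extra calls needed
console.log(p.tierUpChecks); // 2
```

Note that a single long-running loop triggers checks on its own, which is exactly what a pure invocation counter would miss.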
The Full Pipeline in Action
Let's put it all together. Here's the complete lifecycle of a hot function from cold start to peak performance:
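A commented walkthrough, assuming the rough thresholds above (actual tier-up points vary by V8 version and flags):

```javascript
function hot(n) {
  return n * n + 1;
}

// Call 1:        parsed (lazily, then fully on this first call),
//                compiled to bytecode, run by Ignition. Feedback
//                vectors start recording: n is always a small integer.
// Calls ~100+:   Sparkplug compiles the bytecode to baseline machine
//                code. Same semantics, no interpreter dispatch loop.
// Calls ~1,000+: Maglev emits integer-specialized code with guards,
//                using the feedback Ignition collected.
// Calls ~10,000+: TurboFan builds its sea-of-nodes graph, inlines,
//                folds constants, and emits peak machine code.
// Any call like hot('oops'): a type guard fails, the frame deopts
//                back to Ignition, and the feedback widens.
let sum = 0;
for (let i = 0; i < 20000; i++) {
  sum += hot(3);
}
console.log(sum); // 20000 * 10 = 200000
```

You can watch this happen for real by running a script under `node --trace-opt --trace-deopt`.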
Key Rules
1. V8 has four compilation tiers: Ignition (interpreter) → Sparkplug (baseline) → Maglev (mid-tier) → TurboFan (peak optimizer). Each trades compilation speed for execution speed.
2. Feedback vectors are the bridge between tiers. Ignition collects type/shape information that higher tiers use for specialization.
3. TurboFan uses speculative optimization: it generates code assuming observed types persist. Wrong assumptions trigger deoptimization.
4. On-Stack Replacement (OSR) lets V8 switch from unoptimized to optimized code mid-execution, even inside a running loop.
5. Hot function detection uses interrupt budgets, not simple call counts. Both frequent calls and hot loops trigger tier-up.
6. Lazy parsing saves startup time but causes a re-parse penalty when the function is first called. Eagerly-called functions should not be lazy-parsed.
7. Escape analysis and loop-invariant code motion are TurboFan-only. Maglev performs basic inlining and type specialization, but leaves the heavy graph optimizations to TurboFan.