
Custom Performance Metrics

Intermediate · 18 min read

Core Web Vitals Are Not Enough

Core Web Vitals tell you how fast the page loads (LCP), how responsive it is to input (INP), and how stable the layout is (CLS). These are great generic metrics. But they do not tell you anything specific to your application.

Does your search return results in under 300ms? How long does the course editor take to render 200 lessons? Is the checkout flow completing in under 5 seconds end-to-end? Core Web Vitals cannot answer these questions. You need custom performance metrics — measurements tailored to the specific actions and flows that matter to your users and your business.

Mental Model

Core Web Vitals are like a car's speedometer and fuel gauge — every car has them, and they are universally useful. Custom metrics are like the specialized gauges in a race car: tire temperature, brake pressure, turbo boost. The speedometer tells you how fast you are going. The custom gauges tell you whether you are going to win the race. Every application has critical user journeys that need their own gauges.

Measuring Component Render Time

React does not give you a built-in way to measure how long a component takes to render. But the Performance API does:

import { useEffect, useRef } from 'react';

function useMeasureRender(componentName: string) {
  const startRef = useRef<number>(0);

  // Capture the timestamp at the top of the render, before the JSX below
  // is evaluated.
  startRef.current = performance.now();

  useEffect(() => {
    // The effect runs after the commit, so this marks the end of the render.
    const duration = performance.now() - startRef.current;
    performance.measure(`render:${componentName}`, {
      start: startRef.current,
      duration,
    });

    // Anything over one frame at 60fps (16ms) is worth investigating.
    if (duration > 16) {
      sendMetric({
        name: 'slow_render',
        value: duration,
        tags: { component: componentName },
      });
    }
  });
}

function CourseEditor({ lessons }: { lessons: Lesson[] }) {
  useMeasureRender('CourseEditor');

  return (
    <div>
      {lessons.map((lesson) => (
        <LessonCard key={lesson.id} lesson={lesson} />
      ))}
    </div>
  );
}

The hook captures performance.now() at render start (before the component's JSX is evaluated). useEffect fires after the component mounts or updates, so performance.now() in the effect gives you the end timestamp. Any render over 16ms (one frame at 60fps) is logged for investigation.

Quiz
Your useMeasureRender hook reports that CourseEditor takes 45ms to render. But the user does not notice any jank. Why might the metric be misleading?

API Response Time from the Client Perspective

Server-side latency metrics (measured at your API server) miss the full picture. The user experiences the entire round trip: DNS resolution, TCP handshake, TLS negotiation, request transfer, server processing, response transfer, and client-side parsing. Measuring from the client captures the complete latency.

async function fetchWithTiming<T>(
  url: string,
  options?: RequestInit,
): Promise<{ data: T; timing: { total: number; serverTime: number } }> {
  const start = performance.now();
  const response = await fetch(url, options);
  const data: T = await response.json();
  const total = performance.now() - start;

  // Server-Timing durations may be fractional (e.g. dur=142.5), so match
  // decimals and parse with parseFloat rather than parseInt.
  const serverTime = parseFloat(
    response.headers.get('Server-Timing')?.match(/total;dur=([\d.]+)/)?.[1] ??
      '0',
  );

  sendMetric({
    name: 'api_latency',
    value: total,
    tags: {
      endpoint: new URL(url, window.location.origin).pathname,
      method: options?.method ?? 'GET',
      status: String(response.status),
      serverTime: String(serverTime),
      networkTime: String(total - serverTime),
    },
  });

  return { data, timing: { total, serverTime } };
}

By comparing total (client-measured) with serverTime (from the Server-Timing header), you can separate network latency from server processing time. If total is 800ms and server time is 100ms, you have a 700ms network problem, not a server problem.

The Server-Timing header

Ask your backend team to include a Server-Timing header in API responses. It looks like Server-Timing: total;dur=142, db;dur=89. This header is designed specifically for performance diagnostics — it lets the frontend see exactly how long the server spent processing, and even which sub-operations (database queries, cache lookups) took the most time, without exposing implementation details.
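The fetchWithTiming snippet extracts only the total entry. As a sketch, parsing every named entry out of a Server-Timing value takes a few lines (the helper name parseServerTiming is illustrative, not a platform API):

```typescript
// Illustrative helper: parse a Server-Timing header value like
// "total;dur=142, db;dur=89" into a map of named durations (in ms).
function parseServerTiming(header: string): Record<string, number> {
  const timings: Record<string, number> = {};
  for (const entry of header.split(',')) {
    const trimmed = entry.trim();
    const name = trimmed.split(';')[0];
    const dur = trimmed.match(/dur=([\d.]+)/);
    if (name && dur) timings[name] = parseFloat(dur[1]);
  }
  return timings;
}

parseServerTiming('total;dur=142, db;dur=89');
// → { total: 142, db: 89 }
```

With the full map, you can tag metrics with sub-operation timings (database, cache) and not just the server total.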

Search-to-Result Latency

Search is one of the most latency-sensitive features in any application. Users expect results as they type, and anything over 200ms feels sluggish. Measuring the complete search experience requires multiple data points:

import { useRef } from 'react';

function useSearchMetrics() {
  const keystrokeTimestamp = useRef(0);
  const searchStart = useRef(0);

  function onKeystroke() {
    keystrokeTimestamp.current = performance.now();
  }

  function onSearchStart() {
    searchStart.current = performance.now();
  }

  function onResultsRendered(resultCount: number) {
    const now = performance.now();
    const keystrokeToResults = now - keystrokeTimestamp.current;
    const fetchToResults = now - searchStart.current;

    sendMetric({
      name: 'search_keystroke_to_results',
      value: keystrokeToResults,
      tags: { resultCount: String(resultCount) },
    });

    sendMetric({
      name: 'search_fetch_to_results',
      value: fetchToResults,
      tags: { resultCount: String(resultCount) },
    });
  }

  return { onKeystroke, onSearchStart, onResultsRendered };
}

Two metrics matter here:

  • Keystroke-to-results — The full user experience. Includes debounce time, API call, response parsing, and rendering. This is what the user feels.
  • Fetch-to-results — Just the API call and render. Isolates backend performance from frontend debounce/processing overhead.

If keystroke-to-results is 450ms but fetch-to-results is 100ms, the problem is your 300ms debounce timer, not the search API.

Quiz
You measure search-to-result latency and find the p75 is 320ms. You reduce the debounce from 300ms to 150ms. The p75 drops to 210ms, but your search API starts returning 429 (Too Many Requests) errors. What went wrong?

Feature-Specific SLIs

A Service Level Indicator (SLI) is a quantitative measure of a specific aspect of your service. While SLIs are traditionally a backend concept, frontend SLIs are increasingly important:

const featureSLIs = {
  courseEnrollment: {
    metric: 'enrollment_completion_time',
    target: 3000,
    description: 'Time from click Enroll to confirmation screen',
  },
  lessonNavigation: {
    metric: 'lesson_navigation_time',
    target: 500,
    description: 'Time from clicking next lesson to content visible',
  },
  codeEditorReady: {
    metric: 'editor_interactive_time',
    target: 2000,
    description: 'Time from page load to editor accepting input',
  },
  quizSubmission: {
    metric: 'quiz_submit_to_feedback',
    target: 200,
    description: 'Time from submitting answer to showing result',
  },
};

Each SLI defines what you are measuring, what "good" looks like (the target), and a human-readable description. You then track the percentage of interactions that meet the target:

function trackSLI(sliName: string, duration: number) {
  const sli = featureSLIs[sliName as keyof typeof featureSLIs];
  if (!sli) return;

  sendMetric({
    name: sli.metric,
    value: duration,
    tags: {
      withinBudget: String(duration <= sli.target),
      sli: sliName,
    },
  });
}

Over time, you can report "98.5% of lesson navigations complete within 500ms" — a concrete, meaningful statement about your application's performance that Core Web Vitals alone cannot provide.
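That percentage is a straightforward aggregation over the collected durations. A minimal sketch (the function name sliSuccessRate is illustrative):

```typescript
// Illustrative aggregation: percentage of interactions that met the SLI target.
function sliSuccessRate(durations: number[], target: number): number {
  if (durations.length === 0) return 100;
  const within = durations.filter((d) => d <= target).length;
  return (within / durations.length) * 100;
}

sliSuccessRate([420, 380, 510, 450], 500); // → 75
```

In practice this computation usually happens in your time-series database or dashboard, grouped by the withinBudget tag sent with each metric.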

Key Rules
  1. Define SLIs for every critical user journey — enrollment, navigation, search, submission, editor loading
  2. Set targets based on user research and competitive analysis, not arbitrary round numbers
  3. Track the percentage of interactions meeting the target (the SLI), not just the average duration
  4. Review SLIs weekly and investigate any drop in the success percentage

Building a Performance Dashboard

A performance dashboard aggregates your custom metrics and Core Web Vitals into a single view. The key is organizing by what matters, not by what is easy to display.

Dashboard Structure
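Whatever charting stack you use, the panels should report percentiles (p75, p95) rather than averages, because a handful of slow outliers can hide behind a healthy mean. A minimal nearest-rank percentile helper, as a sketch (illustrative, not from any particular stats library):

```typescript
// Illustrative nearest-rank percentile: e.g. the p75 of a set of latencies.
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

percentile([120, 80, 200, 150, 90, 300, 110, 95], 75); // → 150
```

Most time-series databases compute percentiles for you; the helper is mainly useful for client-side summaries or tests.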

Sending Metrics to Datadog, Grafana, or a Custom Backend

The analytics endpoint receives metrics and stores them in a time-series database:

// /api/vitals route handler
export async function POST(request: Request) {
  const metrics = await request.json();

  await timeseriesDB.write({
    measurement: 'web_vitals',
    tags: {
      name: metrics.name,
      page: metrics.page,
      connection: metrics.connection,
      device: metrics.device,
    },
    fields: {
      value: metrics.value,
    },
    timestamp: new Date(),
  });

  return new Response(null, { status: 204 });
}

For small-to-medium teams, InfluxDB or Prometheus + Grafana is a solid stack. For larger teams, Datadog's RUM product collects Web Vitals automatically and provides pre-built dashboards.

Performance Budgets in CI

Custom metrics become truly powerful when you enforce them in CI. Performance budgets prevent regressions by failing the build when a metric exceeds its threshold.

// performance.config.ts
export const budgets = [
  { metric: 'bundle-size:total', limit: '350 KB', type: 'static' },
  { metric: 'bundle-size:main-chunk', limit: '120 KB', type: 'static' },
  { metric: 'lcp:homepage', limit: 2500, type: 'runtime' },
  { metric: 'tti:course-page', limit: 3500, type: 'runtime' },
  { metric: 'component-render:CourseEditor', limit: 32, type: 'runtime' },
];

Static Budgets (Bundle Size)

Static budgets check things that can be measured at build time:

- name: Check bundle size budget
  run: |
    pnpm build
    node scripts/check-bundle-size.js

// scripts/check-bundle-size.js
import { readdirSync, statSync } from 'node:fs';
import { join } from 'node:path';

const BUDGET_KB = 350;
const chunksDir = '.next/static/chunks';

let totalSize = 0;
for (const file of readdirSync(chunksDir)) {
  if (file.endsWith('.js')) {
    totalSize += statSync(join(chunksDir, file)).size;
  }
}

const totalKB = totalSize / 1024;
if (totalKB > BUDGET_KB) {
  console.error(
    `Bundle size ${totalKB.toFixed(1)} KB exceeds budget of ${BUDGET_KB} KB`,
  );
  process.exit(1);
}

console.log(`Bundle size: ${totalKB.toFixed(1)} KB (budget: ${BUDGET_KB} KB)`);

Runtime Budgets (Lighthouse + Playwright)

Runtime budgets check metrics that require running the application:

// playwright/performance.spec.ts
import { test, expect } from '@playwright/test';

test('course page LCP under budget', async ({ page }) => {
  await page.goto('/courses/frontend-engineering');

  const lcp = await page.evaluate(() => {
    return new Promise<number>((resolve) => {
      new PerformanceObserver((list) => {
        const entries = list.getEntries();
        resolve(entries[entries.length - 1].startTime);
      }).observe({ type: 'largest-contentful-paint', buffered: true });
    });
  });

  expect(lcp).toBeLessThan(2500);
});

test('search results render under 300ms', async ({ page }) => {
  await page.goto('/courses');
  const searchInput = page.getByRole('searchbox');

  const start = Date.now();
  await searchInput.fill('javascript');
  await page.getByTestId('search-results').waitFor({ state: 'visible' });
  const duration = Date.now() - start;

  expect(duration).toBeLessThan(300);
});

Quiz
Your CI performance budget checks bundle size on every PR. A developer opens a PR that adds 2KB of JavaScript but replaces a 15KB dependency with a 3KB alternative. The total bundle size decreases by 10KB. Should the performance budget check pass?

Common Trap

Performance budgets that are too strict become ignored. If your budget fails on every PR because it was set 5% above current values and normal development pushes past it, developers will learn to skip the check or raise the budget repeatedly. Set budgets at 10-20% above your current baseline, review and tighten them quarterly, and always allow an escape hatch (a PR label that skips the check with a required justification comment) for intentional increases.
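The escape hatch can be a conditional in the CI workflow itself. This GitHub Actions sketch assumes a PR label named perf-budget-override; the label name and step layout are illustrative:

```yaml
- name: Check bundle size budget
  # Skip when the PR carries the override label. Pair this with a PR
  # template that requires a justification comment for the label.
  if: ${{ !contains(github.event.pull_request.labels.*.name, 'perf-budget-override') }}
  run: |
    pnpm build
    node scripts/check-bundle-size.js
```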

What developers do vs. what they should do

  • Mistake: Only measuring Core Web Vitals and nothing application-specific.
    Why it hurts: Core Web Vitals measure page-level performance. Your users care about feature-level performance: how fast search works, how fast lessons load, how responsive the editor is.
    Instead: Define custom SLIs for every critical user journey.

  • Mistake: Measuring render time with Date.now() instead of performance.now().
    Why it hurts: Date.now() has millisecond resolution and can be affected by system clock adjustments. performance.now() provides sub-millisecond resolution from a monotonic clock that is not affected by clock skew.
    Instead: Always use performance.now() for high-precision timing.

  • Mistake: Setting performance budgets based on aspirational targets instead of current baselines.
    Why it hurts: Aspirational budgets fail on every PR and get ignored. Budgets set slightly above current values catch regressions while remaining achievable.
    Instead: Set budgets 10-20% above current values and tighten gradually.

  • Mistake: Collecting metrics but never acting on them.
    Why it hurts: A dashboard nobody looks at is just a hosting cost. Metrics only matter if they drive decisions.
    Instead: Review metrics weekly, investigate regressions immediately, and tie metrics to team OKRs.

The Metrics That Matter

The best performance teams do not measure everything — they measure the right things. Start with Core Web Vitals for baseline health. Add SLIs for your 3-5 most critical user journeys. Enforce the most important budgets in CI. Build a dashboard that answers "is our app fast for our users?" in 10 seconds.

Then iterate. When a user reports slowness, add a metric for that flow. When you ship a new feature, add a budget for its impact. When you notice a pattern in your debug data, create an alert. Your custom metrics should evolve with your application, not be set once and forgotten.