Experiments

See how lightweight A/B tests and guardrail metrics let you steer roadmaps with evidence instead of opinions, and become the engineer leadership listens to.

Roadmap debates are often won by whoever talks the loudest. Staff+ engineers change the game by providing evidence. Running experiments shifts decision-making from opinion-driven to evidence-driven. The goal is to establish credible, repeatable signals that leadership can rely on.

When you bring numbers into the room, the tone of the discussion changes. Suddenly, you’re not saying “I think this works” but “Here’s proof it does.” That’s influence.

Of course, not every experiment is powerful. John ran one to see which out-of-office message got him fewer Slack pings while on vacation. You’re about to learn how to run experiments that inform product decisions, not just optimize beach time.

The basics of experiments

At its simplest, an experiment tests what happens when your feature is present vs. when it is absent.

  • Control vs. variant: One group receives the old behavior, while the other group receives the new. Compare outcomes.

  • Guardrails: Watch for unintended side effects, such as churn, latency, or support tickets, even if your main metric improves.

  • Simple stats: You don’t need advanced math. Even “+3 percentage points in activation with 95% confidence” can sway leadership.

The goal is to demonstrate, with sufficient evidence, that your work is meaningful.
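As a rough illustration of the "simple stats" above, the sketch below compares activation rates between control and variant with a two-proportion z-test and a confidence interval for the lift. The function name `compare_rates` and the user counts are hypothetical, chosen only to mirror the "+3 percentage points" example; it assumes samples large enough for the normal approximation and uses only the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def compare_rates(control_hits, control_n, variant_hits, variant_n, confidence=0.95):
    """Two-proportion z-test plus a confidence interval for the lift."""
    p_c = control_hits / control_n
    p_v = variant_hits / variant_n
    lift = p_v - p_c

    # Pooled standard error: assumes no real difference (the null hypothesis).
    pooled = (control_hits + variant_hits) / (control_n + variant_n)
    se_null = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    z = lift / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the interval around the observed lift.
    se = sqrt(p_c * (1 - p_c) / control_n + p_v * (1 - p_v) / variant_n)
    margin = NormalDist().inv_cdf(1 - (1 - confidence) / 2) * se
    return {"lift": lift, "p_value": p_value, "ci": (lift - margin, lift + margin)}


# Hypothetical counts: 2,000 users per arm, activation 30% vs. 33%.
result = compare_rates(600, 2000, 660, 2000)
print(f"lift: {result['lift'] * 100:+.1f} points")
print(f"p-value: {result['p_value']:.3f}")
print(f"95% CI: {result['ci'][0] * 100:+.1f} to {result['ci'][1] * 100:+.1f} points")
```

If the interval excludes zero, the lift is unlikely to be noise at the chosen confidence level. Reporting the interval alongside the point estimate tends to be more persuasive than a bare p-value.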

Doing experiments well

Experiments build credibility, but only if they are conducted with discipline.

Here are best practices to apply:

  • Define event schemas: Decide which events you will log and what counts as “success” before you start.

  • Keep them lightweight: Don’t over-engineer. A simple control vs. variant often answers the question.

  • Ensure visibility: Make logs and results accessible to the team, rather than hiding them in private notes.

  • Use A/B tests wisely: They’re the workhorse of experimentation, balancing rigor with speed.
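The practices above can be sketched in a few lines. This is a minimal illustration, not a prescribed design: the `ExperimentEvent` fields, the `assign_arm` helper, and the experiment name `new_onboarding` are all hypothetical. It shows an event schema fixed before launch and a lightweight, deterministic way to split users into control and variant by hashing their ID.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentEvent:
    """One logged observation; the field names here are illustrative."""
    experiment: str
    user_id: str
    arm: str          # "control" or "variant"
    event: str        # e.g. "activated", agreed on before launch
    timestamp: float  # Unix seconds


def assign_arm(experiment: str, user_id: str, variant_share: float = 0.5) -> str:
    """Deterministically map a user to an arm by hashing experiment + user ID."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # First 8 hex digits -> integer in [0, 2**32), scaled to [0, 1).
    bucket = int(digest[:8], 16) / 2**32
    return "variant" if bucket < variant_share else "control"


# The same user always lands in the same arm, with no assignment table to store.
arm = assign_arm("new_onboarding", "user-42")
event = ExperimentEvent("new_onboarding", "user-42", arm, "activated", 1_700_000_000.0)
print(event)

# Across many users the split is close to the requested share, not exactly equal.
arms = [assign_arm("new_onboarding", f"user-{i}") for i in range(10_000)]
print(arms.count("variant") / len(arms))
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user who is in the variant of one test is not systematically in the variant of the next.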

When to experiment

Not every question deserves an experiment. Sometimes telemetry is enough; sometimes intuition and speed matter more. Staff+ engineers know when the rigor is worth it.

A general guideline:

  • Run an experiment when:

    • The stakes are high (new onboarding, payments, pricing changes).

    • Leadership is debating direction, and evidence will decide.

    • There’s measurable user behavior tied to the feature.

  • Skip the experiment when:

    • The feature is low-impact or reversible.

    • You already have telemetry that answers the question.

    • Speed matters more than statistical confidence.
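One way to judge whether the rigor is affordable is to estimate how many users each arm would need before the test can detect the lift you care about. The sketch below uses the standard normal-approximation formula for comparing two proportions. The function name `users_per_arm`, the rates passed to it, and the 80% power default are assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def users_per_arm(baseline, target, confidence=0.95, power=0.80):
    """Approximate users needed in each arm to detect baseline -> target."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    mean_rate = (baseline + target) / 2
    numerator = (
        z_alpha * sqrt(2 * mean_rate * (1 - mean_rate))
        + z_power * sqrt(baseline * (1 - baseline) + target * (1 - target))
    ) ** 2
    return ceil(numerator / (target - baseline) ** 2)


# Illustrative rates: a large lift needs far fewer users than a small one.
print(users_per_arm(0.20, 0.25))  # 5-point lift
print(users_per_arm(0.20, 0.21))  # 1-point lift
```

With these inputs, a 5-point lift needs on the order of a thousand users per arm, while a 1-point lift needs tens of thousands. If your traffic cannot reach the required sample in a reasonable time, that is a signal to rely on telemetry or judgment instead.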

Example: Onboarding flow

Your team wants to test a new onboarding flow.

  • Control: Users go through the existing onboarding.

  • Variant: Users see the new flow.

  • Metrics and guardrails: Track activation rate as the primary success metric, and monitor churn, load times, and support tickets as guardrails.

  • Results: After two weeks, activation increases from 20% to 25% (+5 percentage points), and the result is significant at the 95% confidence level, so the improvement is unlikely to be random noise.
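To make the readout concrete, here is a minimal sketch of how the primary result and the guardrails might be combined into a single recommendation. Only the activation outcome comes from the example. The `Guardrail` class, the `readout` function, the guardrail readings, and the tolerance thresholds are all hypothetical, and every guardrail is assumed to be a "lower is better" metric.

```python
from dataclasses import dataclass


@dataclass
class Guardrail:
    name: str
    control: float
    variant: float
    max_relative_increase: float  # e.g. 0.10 tolerates up to +10%

    def breached(self) -> bool:
        # Assumes "lower is better": only increases count as regressions.
        if self.control == 0:
            return self.variant > 0
        change = (self.variant - self.control) / self.control
        return change > self.max_relative_increase


def readout(primary_significant: bool, guardrails: list[Guardrail]) -> str:
    """Summarize the experiment; guardrail breaches take priority over wins."""
    breaches = [g.name for g in guardrails if g.breached()]
    if breaches:
        return "hold: guardrail breached (" + ", ".join(breaches) + ")"
    if not primary_significant:
        return "hold: primary metric lift not significant"
    return "ship: primary metric up, guardrails within limits"


# Hypothetical guardrail readings and thresholds for the onboarding test.
guardrails = [
    Guardrail("churn rate", control=0.080, variant=0.082, max_relative_increase=0.05),
    Guardrail("p95 load time (s)", control=1.80, variant=1.85, max_relative_increase=0.10),
    Guardrail("support tickets per 1k users", control=12.0, variant=12.5, max_relative_increase=0.10),
]
print(readout(primary_significant=True, guardrails=guardrails))
```

Checking guardrails before celebrating the primary metric reflects their purpose: an activation win that comes with a churn or latency regression is not yet a win. Agreeing on the thresholds before launch keeps the decision from being argued after the fact.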

You’ve shown leadership which feature deserves investment. That’s how you transition from builder to roadmap shaper.
