Top 7 Headless Browser Automations That Actually Work

Mar 5, 2026

San Francisco

Nikola Balic

If you have tried scraping a React app with a plain HTTP client such as requests, you have seen the failure mode: an empty shell, a JavaScript bundle, and no usable data.

Headless browsers solve that part. They run a real browser engine, execute the page, and let you interact with it like a user would.

The harder part is keeping that automation alive in production.

Chrome processes hang. Sessions leak memory. Prices differ by region. Form submissions duplicate when retries are sloppy. Synthetic checks pass while the real UI is broken.

Instead of treating browser automation like simple HTTP scripting, treat it like production infrastructure with state, retries, and observability.

This guide covers 7 patterns that show up constantly, what breaks first, and how to run them with fewer surprises.

Steel is used here as one managed-session example. It gives you cloud browser sessions you control through your existing frameworks, including Playwright, Puppeteer, and Selenium (Sessions API overview, Playwright guide, Puppeteer guide, Selenium guide).

What You Need No Matter Which Vendor You Use

Vendor choice does not remove the fundamentals.

Explicit timeouts: Defaults are how pipelines stall forever.
Retries with idempotency: Retrying a submit step blindly is how you create duplicate writes.
Selectors as contracts: They will drift. Version them and expect maintenance.
Artifacts on failure: If you cannot see what happened, you cannot fix it.
Region-aware execution: Wrong geography often means wrong data.

None of this is Steel-specific. It is the job.
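
The first two fundamentals can be sketched without any vendor SDK. The helpers below, withTimeout and retryStep, are hypothetical names for illustration and are not part of Playwright or Steel. They assume the step being retried is safe to repeat, such as a read or a navigation.

```typescript
// Hypothetical helpers for illustration; not part of Playwright or Steel.

/** Rejects if `work` does not settle within `ms` milliseconds. */
async function withTimeout<T>(work: Promise<T>, ms: number, label = "step"): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  try {
    // Whichever settles first wins. The losing work is not cancelled, only abandoned.
    return await Promise.race([work, deadline]);
  } finally {
    clearTimeout(timer);
  }
}

interface RetryOptions {
  attempts: number; // total tries, including the first
  baseDelayMs: number; // delay before the second try; doubles after each failure
  timeoutMs: number; // deadline applied to every attempt
}

/** Retries a repeatable step with exponential backoff and a per-attempt deadline. */
async function retryStep<T>(step: () => Promise<T>, options: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.attempts; attempt++) {
    try {
      return await withTimeout(step(), options.timeoutMs, `attempt ${attempt}`);
    } catch (error) {
      lastError = error;
      if (attempt < options.attempts) {
        const delay = options.baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Wrap reads and navigations in retryStep. Keep submit steps out of it unless they carry their own duplicate protection.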

Connecting to Steel with Playwright

The connection pattern is simple: create a session, connect over CDP, run Playwright.

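import { chromium } from "playwright";
import { Steel } from "steel-sdk";

const steel = new Steel({ apiKey: process.env.STEEL_API_KEY });

// Create a managed session with proxy routing and CAPTCHA handling enabled.
const session = await steel.sessions.create({
  useProxy: true,
  solveCaptcha: true,
});

// Attach Playwright to the session's browser over CDP.
const browser = await chromium.connectOverCDP(
  `wss://connect.steel.dev?apiKey=${process.env.STEEL_API_KEY}&sessionId=${session.id}`
);

try {
  // Reuse the context and page the session already has open.
  const page = browser.contexts()[0].pages()[0];
  await page.goto("https://example.com", { timeout: 30000 });
} finally {
  // Disconnect and release the session even when the run throws.
  await browser.close();
  await steel.sessions.release(session.id);
}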

Steel also supports state reuse for longer-lived authenticated flows through sessions and profiles (session lifecycle, reusing auth context, Profiles API overview).

1. Price Monitoring and Product Tracking

Price tracking is conceptually simple: open page, wait for the price, parse it, compare it, alert on the change.

The hard part is doing that every day for months without quietly storing garbage data.

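await page.goto("https://shop.example.com/products/sku-123");

// Wait for the price element itself rather than sleeping for a fixed time.
const priceLocator = page.locator("[data-test='price']");
await priceLocator.waitFor({ state: "visible" });

// Strip currency symbols and thousands separators; assumes a "1,299.00"-style format.
const priceText = await priceLocator.innerText();
const price = Number(priceText.replace(/[^0-9.]/g, ""));

// Text without digits parses to 0 and malformed text to NaN; never alert on either.
if (Number.isFinite(price) && price > 0 && price < TARGET_PRICE) {
  await sendAlert({ sku: "sku-123", price });
}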

What keeps it reliable:

Wait for state, not time: Prefer selector or network-state waits over static sleeps.
Track meaningful deltas: Alert on a real price move, not a small fluctuation.
Handle null outcomes: Out of stock, delisted, and regional variants should not crash the job.
Use geo-aligned routing: Region mismatch can return a technically valid price that is still the wrong price.
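
Two of these points, null outcomes and meaningful deltas, fit in a pair of pure functions. This is a sketch under assumptions: parsePrice and isMeaningfulDelta are hypothetical names, the text is expected to contain a single price, and the 1% default threshold is a placeholder to tune per catalog.

```typescript
// Hypothetical helpers; formats and thresholds are assumptions to adapt per retailer.

type PriceReading =
  | { kind: "price"; amount: number }
  | { kind: "unavailable"; reason: string };

/** Parses "$1,299.00" or "1.299,00 €" by treating a 1-2 digit tail as the decimal part. */
function parsePrice(text: string): PriceReading {
  // Keep digits and separators only, then drop trailing punctuation such as "19.99."
  const digits = text.replace(/[^0-9.,]/g, "").replace(/[.,]+$/, "");
  if (!/[0-9]/.test(digits)) {
    return { kind: "unavailable", reason: `no digits in "${text.trim()}"` };
  }
  const lastSeparator = Math.max(digits.lastIndexOf("."), digits.lastIndexOf(","));
  const tailLength = lastSeparator === -1 ? 0 : digits.length - lastSeparator - 1;
  // "1,299" has a three-digit tail, so its comma reads as a thousands separator.
  const hasDecimalPart = tailLength === 1 || tailLength === 2;
  const whole = hasDecimalPart ? digits.slice(0, lastSeparator) : digits;
  const fraction = hasDecimalPart ? digits.slice(lastSeparator + 1) : "0";
  const amount = Number(`${whole.replace(/[.,]/g, "")}.${fraction}`);
  return Number.isFinite(amount)
    ? { kind: "price", amount }
    : { kind: "unavailable", reason: `unparseable "${text.trim()}"` };
}

/** True when the relative change from `previous` meets the threshold. */
function isMeaningfulDelta(previous: number, current: number, minRelativeChange = 0.01): boolean {
  if (previous <= 0) return current !== previous;
  return Math.abs(current - previous) / previous >= minRelativeChange;
}
```

Store the unavailable readings too. A run of them for one SKU is a signal in its own right.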

Common gotcha: some retailers soft-block instead of hard-block. You still get HTML, just not the data you thought you were collecting.
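
One way to catch a soft block is to check the shape of the page before trusting anything extracted from it. The sketch below is illustrative only: PageSnapshot, checkProductPage, and the marker phrases are assumptions, and real block pages differ by retailer.

```typescript
// Hypothetical shape check; collect real marker phrases from your own failed runs.
interface PageSnapshot {
  title: string;
  bodyText: string;
  hasPriceElement: boolean;
}

type ShapeCheck = { ok: true } | { ok: false; reason: string };

// Assumed phrases that often appear on interstitial or challenge pages.
const BLOCK_MARKERS = ["verify you are human", "unusual traffic", "access denied"];

function checkProductPage(snapshot: PageSnapshot, expectedTitlePart: string): ShapeCheck {
  const text = `${snapshot.title} ${snapshot.bodyText}`.toLowerCase();
  for (const marker of BLOCK_MARKERS) {
    if (text.indexOf(marker) !== -1) {
      return { ok: false, reason: `block marker: "${marker}"` };
    }
  }
  // A valid 200 response for the wrong page is still the wrong page.
  if (snapshot.title.toLowerCase().indexOf(expectedTitlePart.toLowerCase()) === -1) {
    return { ok: false, reason: "title does not match the expected product" };
  }
  if (!snapshot.hasPriceElement) {
    return { ok: false, reason: "price element missing" };
  }
  return { ok: true };
}
```

Record the reason alongside the failed run so soft blocks show up as their own category rather than as missing prices.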

Typical use case: nightly monitoring across competitor catalogs with threshold-based alerts to Slack or email.

2. Form Autofill and Submission Workflows

Multi-step forms are where pure HTTP approaches usually fall apart.

State moves through cookies, CSRF tokens, client-side transitions, and hidden fields that only exist after the last interaction. A browser handles that naturally. Your job is to keep the run coherent.

await page.goto("https://portal.example.com/apply");
await page.fill("#email", applicant.email);
await page.fill("#password", applicant.password);
await page.click("#sign-in");

await page.waitForSelector(".step-2", { state: "visible" });
await page.fill("#first-name", applicant.firstName);

await page.click("#submit");
await page.waitForSelector(".confirmation-number", { state: "visible" });
const confirmation = await page.locator(".confirmation-number").innerText();

What keeps it reliable:

Use one session per submission: Keep auth, cookies, and page state together.
Retry steps, not whole flows: Whole-flow retries are expensive and can trigger duplicate-submission checks.
Persist proof of completion: Save the confirmation number before tearing down the session.
Use idempotency keys when money or accounts are involved: Retries need guardrails.
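
The idempotency guardrail can be as small as a ledger keyed by submission. The class below is a hypothetical in-memory sketch; a production version would need a durable store, and the key format is whatever uniquely identifies one logical submission in your system.

```typescript
// Hypothetical in-memory ledger; swap the Map for a durable store in production.
type SubmissionRecord =
  | { status: "pending" }
  | { status: "done"; confirmation: string };

class SubmissionLedger {
  private readonly records = new Map<string, SubmissionRecord>();

  /** Runs `submit` at most once per key and returns the stored confirmation afterwards. */
  async submitOnce(key: string, submit: () => Promise<string>): Promise<string> {
    const existing = this.records.get(key);
    if (existing && existing.status === "done") {
      return existing.confirmation;
    }
    if (existing && existing.status === "pending") {
      // An earlier attempt may have reached the server, so do not resubmit blindly.
      throw new Error(`submission ${key} is in an unknown state; verify before retrying`);
    }
    this.records.set(key, { status: "pending" });
    const confirmation = await submit();
    // Persist proof of completion before the caller tears the session down.
    this.records.set(key, { status: "done", confirmation });
    return confirmation;
  }
}
```

If submit throws, the key stays pending on purpose. The next run has to confirm whether the first attempt landed before it is allowed to try again.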

Common gotcha: file uploads. Use framework upload APIs like setInputFiles() instead of trying to drive a native file picker.

Typical use case: onboarding portals, job applications, partner dashboards, or internal back-office workflows.

3. Dynamic Scraping for SPA Sites

For React, Vue, and Angular pages, DOM extraction often beats reverse-engineering private APIs when maintainability matters more than raw speed.

Internal APIs can be faster. They can also disappear, add signatures, or change shape without warning. Rendered DOM is slower, but it usually survives redesigns better.

await page.goto("https://listings.example/search?city=split");
await page.waitForSelector(".listing-card", { state: "visible" });

const cards = await page.locator(".listing-card").all();
const rows = await Promise.all(
  cards.map(async (card) => ({
    title: await card.locator(".title").innerText(),
    price: await card.locator(".price").innerText(),
    // Ratings are optional; a short timeout keeps a missing one from waiting out the default.
    rating: await card.locator(".rating").innerText({ timeout: 1000 }).catch(() => null),
  }))
);

What keeps it reliable:

Wait for the right signal: A domain-specific selector is usually better than global networkidle.
Paginate the way the app paginates: Infinite scroll and "Load more" patterns need deterministic loops.
Extract only what you need: Full-page dumps waste memory and slow downstream processing.
Version selectors over time: DOM contracts drift.

Common gotcha: lazy-loaded cards. Scroll, wait, confirm count increase, then continue.
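
The scroll, wait, confirm loop is easier to keep deterministic when it has explicit exits. Here is a minimal sketch, assuming a hypothetical ScrollableList adapter that you would back with Playwright calls; the interface, function name, and limits are illustrative.

```typescript
// Hypothetical adapter over the three page operations the loop needs.
interface ScrollableList {
  count(): Promise<number>; // e.g. page.locator(".listing-card").count()
  scrollToEnd(): Promise<void>; // e.g. scroll the last card into view
  settle(): Promise<void>; // e.g. wait for a loading spinner to detach
}

interface ScrollResult {
  total: number;
  rounds: number;
  reason: "stable" | "max-rounds" | "max-items";
}

/** Scrolls until the card count stops growing or a hard limit is reached. */
async function loadAllCards(
  list: ScrollableList,
  { maxRounds = 20, maxItems = 500 } = {},
): Promise<ScrollResult> {
  let total = await list.count();
  for (let rounds = 1; rounds <= maxRounds; rounds++) {
    if (total >= maxItems) {
      return { total, rounds: rounds - 1, reason: "max-items" };
    }
    await list.scrollToEnd();
    await list.settle();
    const next = await list.count();
    // No new cards after a scroll means the list is exhausted.
    if (next <= total) {
      return { total, rounds, reason: "stable" };
    }
    total = next;
  }
  return { total, rounds: maxRounds, reason: "max-rounds" };
}
```

Log the reason with each run. A job that starts ending on max-rounds usually means the page changed how it paginates.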

Typical use case: marketplace aggregation, listing intelligence, or competitor catalog tracking.

4. AI Web Agents

Agent loops are useful when hardcoding every selector is overkill, but they still need hard boundaries.

The better pattern is not "give the model a browser and hope." It is: keep the action space small, keep untrusted page content out of the control loop when possible, and require approval before anything expensive or irreversible.

const ALLOWED_ACTIONS = [
  "goto_pricing",
  "open_login",
  "click_cta",
  "extract_pricing_table",
  "request_human_approval",
];

const plan = await planner({
  task: "Find plan names and monthly prices",
  allowedActions: ALLOWED_ACTIONS,
  maxSteps: 8,
});

for (const step of plan) {
  assertAllowed(step.action, ALLOWED_ACTIONS);

  if (step.action === "request_human_approval") {
    await waitForApproval(step.reason);
    continue;
  }

  const result = await executor.run(step);
  auditLog.append({ step, result });
}

const pricing = await extractor.readPricingTable(page);
return pricing;

What keeps it reliable:

Constrain task scope: "Extract plan names and prices" is much better than "Research this company."
Constrain the action set: Give the agent a small set of allowed moves instead of open-ended browser control.
Plan before execution when you can: Freeze the action list early, then run it under orchestration.
Set hard step limits: Avoid infinite loops on ambiguous UI states.
Separate navigation from extraction: Let the agent navigate, but pull facts with a deterministic extractor and verify them.
Treat page content as untrusted: Do not let arbitrary page text rewrite system instructions, approvals, or tool choices.
Minimize tainted context: Once you have the structured signal you need, drop raw page text from the decision loop.
Add approval gates for risky actions: Payment, account changes, and anything irreversible should pause for human review.
Log intermediate actions: You need rollback and debugging context.
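
The loop earlier in this section calls assertAllowed without defining it. One possible shape, together with an up-front plan check, is sketched below. Everything here is an assumption for illustration: the PlanStep shape, the risky action names, and the rule that a risky step must directly follow an approval request.

```typescript
// Hypothetical shapes; action names are illustrative and not tied to any agent framework.
interface PlanStep {
  action: string;
  reason?: string;
}

// Assumed names for actions that must never run without a human sign-off.
const RISKY_ACTIONS = new Set(["submit_payment", "change_account"]);

function assertAllowed(action: string, allowed: readonly string[]): void {
  if (allowed.indexOf(action) === -1) {
    throw new Error(`action "${action}" is not in the allowed set`);
  }
}

/** Validates the whole plan first, so nothing executes if any step is out of bounds. */
function validatePlan(
  plan: readonly PlanStep[],
  allowed: readonly string[],
  maxSteps: number,
): PlanStep[] {
  if (plan.length > maxSteps) {
    throw new Error(`plan has ${plan.length} steps; limit is ${maxSteps}`);
  }
  return plan.map((step, index) => {
    assertAllowed(step.action, allowed);
    const previous = index > 0 ? plan[index - 1] : undefined;
    const approved = previous !== undefined && previous.action === "request_human_approval";
    if (RISKY_ACTIONS.has(step.action) && !approved) {
      throw new Error(`risky action "${step.action}" must follow request_human_approval`);
    }
    // Copy only known fields so extra model output never reaches the executor.
    return { action: step.action, reason: step.reason };
  });
}
```

Validating before execution means a single bad step rejects the plan, instead of being discovered halfway through a run.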

Common gotcha: broad goals plus broad permissions. That is how agents wander, loop, or take actions you did not mean to authorize.

Typical use case: exploratory navigation across many sites before a deterministic parser turns the result into structured data.

5. Screenshot and PDF Generation

A lot of PDF generation infrastructure is just headless Chrome with extra steps.

For invoices, statements, reports, and visual records, rendering the real page is often more stable than maintaining a separate HTML-to-PDF system.

await page.goto("https://app.example.com/invoice/1042");
await page.waitForLoadState("networkidle");

const screenshot = await page.screenshot({ fullPage: true, type: "png" });
await s3.upload({ Body: screenshot, Key: "invoices/1042.png" });

const pdf = await page.pdf({ format: "A4", printBackground: true });
await s3.upload({ Body: pdf, Key: "invoices/1042.pdf" });

What keeps it reliable:

Wait for assets: Fonts and late-loading images will ruin otherwise valid captures.
Set viewport deliberately: Stable dimensions produce stable artifacts.
Stream to object storage: Avoid local temp-file cleanup jobs.
Handle auth before navigation: Protected pages need preloaded session state.
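
Waiting for assets can be made explicit instead of relying on network idle alone. The sketch below assumes a page object with an evaluate method modeled on Playwright's page.evaluate; the EvaluatingPage interface and the waitForAssets name are hypothetical, and the 5 second cap is an arbitrary default.

```typescript
// Minimal structural type modeled on Playwright's page.evaluate (an assumption,
// declared here so the sketch stays self-contained).
interface EvaluatingPage {
  evaluate<T, A>(pageFunction: (arg: A) => T | Promise<T>, arg: A): Promise<T>;
}

/**
 * Waits for web fonts and images, capped at `maxWaitMs`.
 * Returns how many images were still unfinished when it gave up (0 means all loaded).
 */
async function waitForAssets(page: EvaluatingPage, maxWaitMs = 5000): Promise<number> {
  // The callback runs inside the browser page, not in Node.
  return page.evaluate<number, number>(async (capMs) => {
    await document.fonts.ready;
    const pending = Array.from(document.images).filter((img) => !img.complete);
    const loaded = Promise.all(
      pending.map(
        (img) =>
          new Promise<void>((resolve) => {
            img.addEventListener("load", () => resolve(), { once: true });
            img.addEventListener("error", () => resolve(), { once: true });
          }),
      ),
    );
    // Offscreen lazy images may never load, so never wait without a cap.
    const cap = new Promise<void>((resolve) => setTimeout(resolve, capMs));
    await Promise.race([loaded, cap]);
    return pending.filter((img) => !img.complete).length;
  }, maxWaitMs);
}
```

Call it after navigation and before page.screenshot() or page.pdf(), and treat a non-zero result as a reason to flag the artifact rather than ship it silently.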

Common gotcha: forgetting printBackground: true. It defaults to false, so background colors and images are dropped from the exported PDF and branding breaks.

Typical use case: invoice generation, compliance archiving, and visual regression snapshots.

6. Mobile-Mode Automation

Mobile automation is not just a user-agent string.

Input model, viewport, touch behavior, and browser characteristics all affect the result. If the mobile checkout flow differs from desktop, test the mobile experience directly.

const session = await steel.sessions.create({
  deviceConfig: { device: "mobile" },
});
const browser = await chromium.connectOverCDP(
  `wss://connect.steel.dev?apiKey=${process.env.STEEL_API_KEY}&sessionId=${session.id}`
);
const context = await browser.newContext({
  hasTouch: true,
  isMobile: true,
  viewport: { width: 390, height: 844 },
});
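const page = await context.newPage();

// Hypothetical URL: a new page starts blank, so load the flow before interacting.
await page.goto("https://store.example.com/cart");

await page.tap("#checkout-btn");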
await page.waitForSelector(".payment-sheet", { state: "visible" });

What keeps it reliable:

Prefer touch actions when relevant: Some controls are touch-specific.
Test mobile and desktop separately: They drift at both UI and API layers.
Watch viewport-triggered lazy loading: Smaller screens often defer more content.

Common gotcha: page.tap() needs a touch-capable context. If your connected page does not support touch, create a new context with hasTouch: true.

Typical use case: mobile-first checkout flows, mobile-only signup variants, and responsive regression checks.

7. Synthetic Monitoring and Incident Alerting

Ping checks tell you the host responded. Synthetic browser checks tell you whether the path still works.

A page can return 200 OK and still be broken: blank shell, missing CTA, JavaScript exception, or incomplete render.

const session = await steel.sessions.create();
const browser = await chromium.connectOverCDP(
  `wss://connect.steel.dev?apiKey=${process.env.STEEL_API_KEY}&sessionId=${session.id}`
);

try {
  const page = browser.contexts()[0].pages()[0];
  await page.goto("https://store.example.com", { timeout: 30000 });

  // isVisible() returns immediately, so wait for each element with a bounded timeout.
  const priceVisible = await page.locator(".product-price")
    .waitFor({ state: "visible", timeout: 10000 })
    .then(() => true, () => false);
  const ctaVisible = await page.locator(".add-to-cart")
    .waitFor({ state: "visible", timeout: 10000 })
    .then(() => true, () => false);

  if (!priceVisible || !ctaVisible) {
    const snap = await page.screenshot({ fullPage: true });
    await s3.upload({ Body: snap, Key: `incidents/${Date.now()}.png` });
    await slack.alert("#incidents", "Homepage check failed: missing critical elements");
  }
} finally {
  // Release the session even when the check itself throws.
  await browser.close();
  await steel.sessions.release(session.id);
}

What keeps alerts useful:

Assert business-critical UI: Status code alone is not enough.
Alert on consecutive failures: Reduce noise from transient issues.
Capture artifacts automatically: Faster incident triage.
Run checks outside the failure domain you are testing: Otherwise your monitor dies with the app.
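
Alerting on consecutive failures needs only a little state between runs. The class below is a hypothetical sketch; it keeps that state in memory, while a scheduled check would persist it between invocations.

```typescript
type GateSignal = "alert" | "recovered" | "quiet";

/** Signals an alert after `threshold` failures in a row, once per incident. */
class ConsecutiveFailureGate {
  private readonly threshold: number;
  private failures = 0;
  private alerted = false;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  /** Feed one check result; returns whether a notification should go out. */
  record(passed: boolean): GateSignal {
    if (passed) {
      const wasAlerted = this.alerted;
      this.failures = 0;
      this.alerted = false;
      // Only announce recovery if an alert was actually sent for this incident.
      return wasAlerted ? "recovered" : "quiet";
    }
    this.failures += 1;
    if (this.failures >= this.threshold && !this.alerted) {
      this.alerted = true;
      return "alert";
    }
    return "quiet";
  }
}
```

With a threshold of 3, a single transient failure stays quiet, the third failure in a row alerts once, and the next passing check reports recovery.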

Common gotcha: monitoring from the same infrastructure stack as the product under test.

Typical use case: recurring checkout-path validation with screenshot evidence on failure.

Limitations: What Works and What Does Not Yet

Works well for JS-rendered pages, multi-step forms, visual capture, and synthetic user-path checks.
It is not a substitute for a stable official API when one already exists and latency matters most.
CAPTCHA support should be treated as a reliability tool, not a bypass guarantee (CAPTCHAs API overview, CAPTCHA solving guide).
Mobile-mode emulation is useful for many flows, but it is not a perfect stand-in for every real-device signal (mobile mode).

How to Evaluate a Managed Browser Platform

Use this checklist for Steel or any alternative:

Capability to verify	Why it matters	Evidence link (Steel)
Session creation and lifecycle APIs	Prevent leaked browsers and orphaned jobs	Sessions API overview, session lifecycle
CDP connectivity for your framework	Lets you keep existing automation code	Playwright, Puppeteer, Selenium
Proxy controls	Needed for region and anti-block reliability	Proxies
CAPTCHA tooling	Reduces manual interruptions in long flows	CAPTCHAs API overview, CAPTCHA solving guide
State reuse between runs	Helps with authenticated and repeat workflows	reusing auth context, Profiles API overview
Mobile-mode support	Needed when mobile and desktop paths diverge	mobile mode

Why Steel Fits This Category

You do not need Steel to write these scripts. You need something like Steel when you want to run them reliably without turning Chrome infrastructure into your real product.

Steel handles the repetitive browser-ops layer: session lifecycle, routing options, framework compatibility, and the hooks you need when flows hit CAPTCHA or auth state. You still own selectors, retries, and business logic. That is the right split.

Start Small

The best rollout pattern is still the boring one:

Start with one flow and one clear success metric.
Add failure artifacts before scaling concurrency.
Expand only after retry and teardown behavior are stable.

If you want a concrete implementation baseline, start with the Steel quickstart, then adapt the patterns above to your own stack. You can also use the Steel Cookbook for working examples.

Full code with snippets from this article is available in a separate repo here.