
Level 7: Browser Automator — Browser Control, Scraping, and PDFs

Claude Code cannot control a browser on its own, but it can do so through an MCP (Model Context Protocol) server. Once Puppeteer MCP is configured, you can ask Claude in natural language to take screenshots, collect data from multiple pages, and convert pages to PDF.

Target audience: Anyone who understands headless automation and wants to try automating browser operations.

Estimated learning time: 20 min reading + 40 min practice


An MCP server mediates between Claude Code and the browser. The flow is as follows.

Claude Code
    ↕ MCP protocol
Puppeteer MCP Server
    ↕
Browser (Chromium)
    ↕
Web

Claude Code sends browser operation commands through MCP tools, and the MCP server executes the actual browser operations via Chromium.
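
To make this flow concrete, here is a minimal sketch of the kind of message an MCP client sends when it calls a tool. MCP is built on JSON-RPC 2.0, and tool invocations use the `tools/call` method with `name` and `arguments` parameters. The tool name `puppeteer_navigate` and its `url` argument are assumptions made for illustration; check the tool list shown by /mcp for the names your server actually exposes.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Assumed tool name and argument, used only as an example.
request = build_tool_call(1, "puppeteer_navigate", {"url": "https://example.com"})
print(request)
```

In normal use Claude Code builds and sends these messages for you; the sketch only shows what travels over the MCP link in the diagram.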

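Add a puppeteer entry under mcpServers in your project's MCP configuration file. Recent Claude Code versions read project-scoped servers from .mcp.json in the project root rather than from .claude/settings.json, so check which file your version uses.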

{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"]
    }
  }
}

Restart Claude Code to enable the Puppeteer browser operation tools. You can verify the configuration with the /mcp command.
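
A malformed configuration file is a common reason a server fails to appear. As a rough pre-check before restarting, the sketch below inspects configuration text for the shape used in the example above (an `mcpServers` object containing an entry with `command` and `args`). The function name `find_config_problems` is hypothetical, and the checks cover only this shape, not everything Claude Code itself validates.

```python
import json


def find_config_problems(config_text: str, server_name: str) -> list[str]:
    """Return a list of problems found in an MCP configuration string."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict):
        return ["missing 'mcpServers' object"]
    entry = servers.get(server_name)
    if not isinstance(entry, dict):
        return [f"missing '{server_name}' entry"]
    problems = []
    if not isinstance(entry.get("command"), str):
        problems.append("'command' must be a string")
    if not isinstance(entry.get("args", []), list):
        problems.append("'args' must be a list")
    return problems


example = """
{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"]
    }
  }
}
"""
print(find_config_problems(example, "puppeteer"))
```

An empty list means the entry has the expected shape; the /mcp command remains the real check that the server started.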

> Please open https://example.com
> and save a screenshot of the top page to screenshots/top-2026-03-29.png.

This is useful for verifying differences before and after a deployment, or for recording UI changes over time.
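
One simple way to check whether two screenshots differ is to compare their hashes. The sketch below is an assumption about how you might do this, not part of Puppeteer MCP. It is a byte-for-byte comparison, so it flags any change at all, including harmless rendering noise; it does not show where the images differ.

```python
import hashlib
from pathlib import Path


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of raw image bytes."""
    return hashlib.sha256(data).hexdigest()


def images_differ(before: bytes, after: bytes) -> bool:
    """True when the two images are not byte-for-byte identical."""
    return digest(before) != digest(after)


def screenshots_differ(before_path: str, after_path: str) -> bool:
    """Compare two screenshot files saved on disk."""
    return images_differ(Path(before_path).read_bytes(), Path(after_path).read_bytes())
```

For example, `screenshots_differ("screenshots/before.png", "screenshots/after.png")` would compare two captures taken around a deployment; both file names are hypothetical.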

> Please open https://news.example.com/tech
> and retrieve 10 articles published today with their titles and URLs.
> Return the results in JSON format {"articles": [{"title": "...", "url": "..."}]}.

The retrieved JSON can be processed with jq and used in subsequent scripts.
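
If you prefer a script to jq, the same post-processing can be sketched in Python. The function below assumes the response follows the `{"articles": [{"title": ..., "url": ...}]}` shape requested in the prompt, drops entries with a missing title or URL, and removes duplicate URLs. The sample data is made up for illustration.

```python
import json


def parse_articles(raw: str) -> list[dict]:
    """Extract unique, complete article records from the JSON response."""
    data = json.loads(raw)
    seen_urls = set()
    articles = []
    for item in data.get("articles", []):
        if not isinstance(item, dict):
            continue  # skip anything that is not an article object
        title = str(item.get("title", "")).strip()
        url = str(item.get("url", "")).strip()
        if not title or not url or url in seen_urls:
            continue  # skip incomplete or duplicate entries
        seen_urls.add(url)
        articles.append({"title": title, "url": url})
    return articles


# Made-up sample response for illustration only.
sample = json.dumps({
    "articles": [
        {"title": "First post", "url": "https://news.example.com/tech/1"},
        {"title": "First post (repeat)", "url": "https://news.example.com/tech/1"},
        {"title": "", "url": "https://news.example.com/tech/2"},
    ]
})
for article in parse_articles(sample):
    print(article["title"], article["url"])
```

Validating the shape before use matters because the model's output is free-form text and may not always match the requested format exactly.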

> Please open https://docs.example.com/api-reference
> and convert the entire page to a PDF saved at output/api-reference.pdf.

This is useful for archiving documentation or for generating print materials automatically.

Besides Puppeteer, Playwright is another option for a browser automation MCP server.

| Item | Puppeteer MCP | Playwright MCP |
|---|---|---|
| Supported browsers | Chromium only | Chromium / Firefox / WebKit |
| Setup | Simple | Slightly more complex |
| Cross-browser testing | Not possible | Possible |
| Use case | Data collection, screenshots | E2E testing, cross-browser verification |

If your main goal is data collection or taking screenshots, Puppeteer MCP is appropriate. If you need cross-browser behavior verification or E2E testing, Playwright MCP is the better fit.

Playwright MCP configuration example:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@executeautomation/playwright-mcp-server"]
    }
  }
}

Competitive research: Periodically collect page structure, pricing tables, and feature lists from competing services and report on differences.

Price monitoring: Periodically scrape the pricing pages of e-commerce sites or cloud services and send a notification when changes occur.

Document sync: Convert specific pages of an external documentation site to PDF and save them to an internal shared folder.

Form testing: Enter values into a web form, submit it, and verify the response is as expected.
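
For the price monitoring use case above, the change detection step can be as small as comparing two snapshots. The sketch below assumes each scrape has already been reduced to a mapping from plan name to price; the function name `diff_prices` and the plan names are hypothetical, and sending the notification is left out.

```python
def diff_prices(previous: dict[str, float], current: dict[str, float]) -> list[str]:
    """Describe every plan whose price was added, removed, or changed."""
    changes = []
    for plan in sorted(previous.keys() | current.keys()):
        old = previous.get(plan)
        new = current.get(plan)
        if old == new:
            continue  # unchanged plan, nothing to report
        if old is None:
            changes.append(f"{plan}: added at {new}")
        elif new is None:
            changes.append(f"{plan}: removed (was {old})")
        else:
            changes.append(f"{plan}: {old} -> {new}")
    return changes


# Made-up snapshots for illustration only.
last_week = {"basic": 10.0, "pro": 30.0}
today = {"basic": 12.0, "team": 50.0}
for line in diff_prices(last_week, today):
    print(line)
```

A notification would be sent only when the returned list is non-empty.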

Always check the following when using browser automation.

  • Check robots.txt: Do not access paths where scraping is prohibited
  • Check terms of service: Verify the service’s terms do not prohibit scraping
  • Respect rate limits: Sending a large number of requests in a short time can cause IP blocks or account suspension
  • Handle personal data: If collected data contains personal information, store and handle it in line with applicable privacy rules and your organization's policies
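
The first and third checks can be partly automated. The sketch below uses Python's standard `urllib.robotparser` to filter out disallowed URLs and to pick a delay between requests. The user agent name `my-bot`, the one-second default delay, and the sample robots.txt content are assumptions for illustration; terms of service still have to be read by a person.

```python
import time
from urllib.robotparser import RobotFileParser


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def wait_between_requests(parser, user_agent, default_seconds=1.0, sleep=time.sleep):
    """Pause for the site's Crawl-delay, or a default if none is declared."""
    delay = parser.crawl_delay(user_agent) or default_seconds
    sleep(delay)
    return delay


# Made-up robots.txt content for illustration only.
rules = load_rules("\n".join([
    "User-agent: *",
    "Disallow: /private/",
    "Crawl-delay: 2",
]))
candidates = ["https://example.com/pricing", "https://example.com/private/report"]
print(allowed_urls(rules, "my-bot", candidates))
```

Calling `wait_between_requests(rules, "my-bot")` between page visits keeps the request rate at or below what the site asks for.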

Q. What should I do if Puppeteer MCP is configured but not working?

Try running npx @modelcontextprotocol/server-puppeteer on its own to check whether it starts. If your Node.js version is outdated, updating it may resolve the problem.

Q. Can I retrieve content from pages that are dynamically generated with JavaScript?

Yes. Puppeteer actually launches a browser and operates it, so it can retrieve content after JavaScript rendering.



You now understand the mechanics and usage of browser automation via MCP. Next, you will learn the orchestrator pattern for running multiple agents in parallel.

Let’s move on to Level 8: Orchestrator — Directing Parallel Agents.