Level 7: Browser Automator — Browser Control, Scraping, and PDFs

Claude Code cannot drive a browser on its own, but it can through an MCP (Model Context Protocol) server. With Puppeteer MCP configured, you can ask Claude in natural language to take screenshots, collect data from multiple pages, and convert pages to PDF.

Target audience: Anyone who understands headless automation and wants to try automating browser operations.

Estimated learning time: 20 min reading + 40 min practice

How Browser Automation Works

An MCP server mediates between Claude Code and the browser. The flow is as follows.

Claude Code
    ↕ MCP protocol
Puppeteer MCP Server
    ↕
Browser (Chromium)
    ↕
Web

Claude Code sends browser operation commands through MCP tools, and the MCP server executes the actual browser operations via Chromium.
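To make the flow above concrete, here is a minimal sketch of the kind of message that crosses the MCP boundary. MCP is built on JSON-RPC 2.0, and tool invocations use the tools/call method. The tool name puppeteer_navigate and its url argument are assumptions about what the Puppeteer server exposes, and build_tool_call is a hypothetical helper written for illustration. Claude Code builds these messages for you, so you never write them by hand.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC expects one id per request.
_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Assumed tool name and argument; the real names come from the server's tool list.
    request = build_tool_call("puppeteer_navigate", {"url": "https://example.com"})
    print(json.dumps(request, indent=2))
```

The MCP server receives a request shaped like this, performs the operation in Chromium, and sends the result back in a JSON-RPC response.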

How to Configure Puppeteer MCP

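Add a puppeteer entry under mcpServers in your Claude Code MCP configuration. For a project-scoped setup this is the .mcp.json file at the project root; the claude mcp add command can also register the server for you.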

{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"]
    }
  }
}

Restart Claude Code to load the Puppeteer tools, then run the /mcp command to confirm that the server is connected.
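If the configuration file already lists other MCP servers, editing it by hand risks overwriting them. The following is a minimal sketch of merging the puppeteer entry into an existing file. The helper add_mcp_server is hypothetical, and the file path is a parameter, so point it at whichever file your Claude Code setup reads.

```python
import json
from pathlib import Path

# Same entry as in the configuration shown above.
PUPPETEER_ENTRY = {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-puppeteer"],
}


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Insert or overwrite one server entry, keeping all other settings intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the mcpServers object only if it is missing.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_mcp_server(path, "puppeteer", PUPPETEER_ENTRY) leaves every other key in the file untouched.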

General Usage Examples

Taking Screenshots

> Please open https://example.com
> and save a screenshot of the top page to screenshots/top-2026-03-29.png.

This is useful for comparing a page before and after a deployment, or for recording UI changes over time.
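Once two screenshots exist on disk, the before/after check can be automated. The sketch below compares files by SHA-256 hash; the function names are hypothetical. The comparison is strict: any byte-level change, including a clock or an animation frame on the page, counts as a difference, so a pixel-level diff tool is a better fit when you need tolerance.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def screenshots_differ(before: Path, after: Path) -> bool:
    """True when the two image files are not byte-for-byte identical."""
    return file_digest(before) != file_digest(after)
```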

Collecting Data from Multiple Pages

> Please open https://news.example.com/tech
> and retrieve 10 articles published today with their titles and URLs.
> Return the results in JSON format {"articles": [{"title": "...", "url": "..."}]}.

The retrieved JSON can be processed with jq and used in subsequent scripts.
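As an alternative to jq, the same post-processing can be sketched in Python. The example assumes the response has exactly the {"articles": [...]} shape requested in the prompt above; parse_articles is a hypothetical helper. Model output is not guaranteed to be clean, so the sketch drops entries with a missing title, a non-HTTP URL, or a duplicate URL.

```python
import json


def parse_articles(raw: str, limit: int = 10) -> list[dict]:
    """Validate and de-duplicate the {"articles": [...]} payload."""
    data = json.loads(raw)
    seen_urls = set()
    articles = []
    for item in data.get("articles", []):
        if not isinstance(item, dict):
            continue  # skip malformed entries
        title = str(item.get("title", "")).strip()
        url = str(item.get("url", "")).strip()
        if not title or not url.startswith(("http://", "https://")):
            continue  # skip entries without a usable title or URL
        if url in seen_urls:
            continue  # keep only the first occurrence of each URL
        seen_urls.add(url)
        articles.append({"title": title, "url": url})
    return articles[:limit]
```

If Claude wraps the JSON in explanatory text, json.loads raises an error; asking for "JSON only" in the prompt reduces that risk.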

PDF Conversion

> Please open https://docs.example.com/api-reference
> and convert the entire page to a PDF saved at output/api-reference.pdf.

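This is useful for archiving documentation or generating print materials automatically. Whether PDF export is available depends on the tools your MCP server exposes, so check the tool list shown by /mcp first.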
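In an automated pipeline it is worth confirming that the saved file really is a PDF before archiving it. A minimal sketch: PDF files begin with the bytes %PDF-, so checking that header catches empty files and saved error pages. The function name and the min_bytes threshold are hypothetical choices for illustration.

```python
from pathlib import Path


def looks_like_pdf(path: Path, min_bytes: int = 100) -> bool:
    """Cheap sanity check: file exists, is not tiny, and starts with the PDF header."""
    if not path.is_file() or path.stat().st_size < min_bytes:
        return False
    with path.open("rb") as fh:
        return fh.read(5) == b"%PDF-"
```

This does not prove that the PDF renders correctly; it only rules out the most common failure modes.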

Playwright MCP as an Alternative

Besides Puppeteer MCP, Playwright MCP is another MCP server option for browser automation.

Item	Puppeteer MCP	Playwright MCP
Supported browsers	Chromium only	Chromium / Firefox / WebKit
Setup	Simple	Slightly more complex
Cross-browser testing	Not possible	Possible
Use case	Data collection, screenshots	E2E testing, cross-browser verification

If your main goal is data collection or taking screenshots, Puppeteer MCP is appropriate. If you need cross-browser behavior verification or E2E testing, Playwright MCP is the better fit.

Playwright MCP configuration example:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@executeautomation/playwright-mcp-server"]
    }
  }
}

General Use Case Examples

Competitive research: Periodically collect page structure, pricing tables, and feature lists from competing services and report on differences.

Price monitoring: Periodically scrape the pricing pages of e-commerce sites or cloud services and send a notification when changes occur.

Document sync: Convert specific pages of an external documentation site to PDF and save them to an internal shared folder.

Form testing: Enter values into a web form, submit it, and verify the response is as expected.
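For the price monitoring use case above, the change-detection step can be sketched as a comparison of two snapshots. The sketch assumes each scrape has already been reduced to a mapping from plan name to price; diff_prices and the plan names in the usage note are hypothetical.

```python
from typing import Optional


def diff_prices(
    previous: dict[str, float], current: dict[str, float]
) -> dict[str, tuple[Optional[float], Optional[float]]]:
    """Return {plan: (old_price, new_price)} for every plan whose price changed.

    A plan that was added or removed appears with None on the missing side.
    """
    changes = {}
    for plan in sorted(previous.keys() | current.keys()):
        old, new = previous.get(plan), current.get(plan)
        if old != new:
            changes[plan] = (old, new)
    return changes
```

For example, comparing {"basic": 10.0, "pro": 30.0} with {"basic": 10.0, "pro": 35.0, "team": 99.0} reports a change for pro and an added team plan. Send a notification only when the returned dictionary is non-empty.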

Important Notes

Always check the following when using browser automation.

Check robots.txt: Do not access paths that the site disallows for crawlers.
Check terms of service: Confirm that the service’s terms do not prohibit scraping.
Respect rate limits: Sending many requests in a short time can lead to IP blocks or account suspension.
Handle personal data: If collected data contains personal information, store and process it in line with applicable privacy rules.
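Two of the checks above can be scripted. The sketch below uses Python's standard urllib.robotparser to test a URL against robots.txt rules, and a small Throttle class to enforce a minimum interval between requests. The helper names, the user agent string, and the interval are hypothetical; the robots.txt content is passed in as text so the example runs without network access.

```python
import time
from typing import Optional
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class Throttle:
    """Enforce a minimum interval (in seconds) between consecutive requests."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        self._last = time.monotonic()
        return slept


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    checker = build_robots_checker(rules)
    throttle = Throttle(min_interval=0.1)
    for url in ["https://example.com/pricing", "https://example.com/private/data"]:
        throttle.wait()
        print(url, "allowed" if checker.can_fetch("my-bot", url) else "disallowed")
```

Call throttle.wait() before each page you ask Claude to open, and skip any URL for which can_fetch returns False.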

FAQ

Q. What should I do if Puppeteer MCP is configured but not working?

Run npx @modelcontextprotocol/server-puppeteer on its own in a terminal to check whether it starts. If it fails to start, check your Node.js version; an outdated version may need to be updated.
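The Node.js check can be scripted as part of an environment setup step. This is a minimal sketch: it runs node --version, which prints a string such as v20.11.1, and extracts the major version. The function names are hypothetical, and the minimum version to compare against is not stated here, so check the server package's requirements for the actual value.

```python
import shutil
import subprocess
from typing import Optional


def parse_node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v20.11.1'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


def installed_node_major() -> Optional[int]:
    """Return the installed Node.js major version, or None if node is unavailable."""
    if shutil.which("node") is None:
        return None
    result = subprocess.run(["node", "--version"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return parse_node_major(result.stdout)


if __name__ == "__main__":
    major = installed_node_major()
    print("Node.js not found" if major is None else f"Node.js major version: {major}")
```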

Q. Can I retrieve content from pages that are dynamically generated with JavaScript?

Yes. Puppeteer launches and drives a real browser, so it reads the page content after JavaScript has rendered it.

Hands-On Tutorial

Hands-on tutorial for this level →

Next Level

You now understand the mechanics and usage of browser automation via MCP. Next, you will learn the orchestrator pattern for running multiple agents in parallel.

Let’s move on to Level 8: Orchestrator — Directing Parallel Agents.