AI-Powered Web Scraping with Playwright and Claude: End-to-End Python Tutorial

Build a Python scraper that drives Playwright, screenshots a rendered page, sends the image to Claude Sonnet 4.6, and validates the output with Pydantic.

20 May 2026 Updated 20 May 2026 ~10 min read

What you’ll build

A small Python project that loads a target page in a real browser via Playwright, takes a full-page screenshot of the rendered DOM, sends that screenshot to Claude Sonnet 4.6 with a prompt asking for structured data, and parses Claude’s JSON reply into a Pydantic model so the rest of your code sees a typed object instead of raw text. The whole pipeline fits in roughly 90 lines of Python.

The approach rests on three documented capabilities. Per the Anthropic vision documentation, all current Claude models accept image input alongside text in the same Messages API call,¹ which lets you treat a page screenshot as the source of truth instead of brittle CSS selectors. According to the Playwright Python installation guide, Playwright drives Chromium, Firefox, and WebKit through a single API and auto-waits for elements to be actionable before acting on them;² because it runs a real browser engine, JS-heavy single-page apps execute their scripts and render before the screenshot is captured, provided the script waits for the page to settle. Pydantic v2’s BaseModel gives you a typed validation layer over the model’s JSON output.³

The differentiator over selector-based scraping is robustness. A CSS selector breaks the moment the target site renames a class. A screenshot-plus-multimodal-LLM pipeline reads the page the way a human does: the model identifies the price, title, or table content visually, so a re-skinned layout that keeps the same information visible keeps working without a code change.

Prerequisites

You’ll need:

Python 3.9 or newer.
An Anthropic API key. Per the Anthropic pricing page, Claude Sonnet 4.6 is billed at $3 per million input tokens and $15 per million output tokens⁴ — a single page screenshot plus a short extraction prompt typically lands well under a cent per request, but check the pricing page for current rates before running a large batch.
A working terminal and a code editor.

Set the API key as an environment variable so the SDK picks it up automatically:

export ANTHROPIC_API_KEY="sk-ant-..."

The Anthropic Python SDK reads ANTHROPIC_API_KEY from the environment when you instantiate the client without arguments, per the SDK’s PyPI page.⁵
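
Because the client reads the key implicitly, a missing variable tends to surface only when the first API call fails. A minimal sketch of a fail-fast check follows; require_api_key is a hypothetical helper name, not part of the Anthropic SDK, and the env parameter is an assumption added so the check can be exercised without touching the real environment.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Anthropic API key, or raise a clear error if it is unset.

    Hypothetical helper for this tutorial; the SDK itself does not ship it.
    """
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before running the scraper."
        )
    return key
```

Calling require_api_key() once at the top of the script turns a late authentication failure into an immediate, readable error.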

Step 1: Install Playwright, the Anthropic SDK, and Pydantic

Create a virtual environment, then install the three dependencies:

python -m venv .venv
source .venv/bin/activate
pip install playwright anthropic pydantic

Playwright needs its browser binaries on top of the Python package. Per the Playwright installation guide, the post-install command downloads Chromium, Firefox, and WebKit:²

playwright install

The download is a one-time step and takes a few minutes on the first run. If you only need one browser, playwright install chromium keeps the footprint smaller.

Step 2: Write a basic Playwright scraper

The first piece loads a page and saves a screenshot. Create scraper.py:

from playwright.sync_api import sync_playwright

def capture_page(url: str, output_path: str = "page.png") -> str:
    """Load a URL in headless Chromium and save a full-page screenshot."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(viewport={"width": 1280, "height": 800})
        page = context.new_page()
        page.goto(url, wait_until="networkidle")
        page.screenshot(path=output_path, full_page=True)
        browser.close()
    return output_path

if __name__ == "__main__":
    path = capture_page("https://example.com")
    print(f"Saved screenshot to {path}")

Three details earn their place. sync_playwright opens a context manager that owns the browser lifecycle, so an exception mid-script doesn’t leave a zombie Chromium process. wait_until="networkidle" blocks until there have been no network connections for at least 500 ms.⁶ It is a pragmatic choice for a one-shot scrape of a JS-heavy page, but Playwright’s API reference discourages it as a general readiness check, and it can stall on pages that poll or stream continuously; if that happens, wait for a specific element instead. full_page=True tells Playwright to capture the full scrollable page; per the Playwright screenshots reference, the default captures only the viewport.⁷

Run it:

python scraper.py

A page.png file appears in the working directory. Open it to confirm the render matches what a human visitor sees.

Step 3: Send the screenshot to Claude

Now wire in the multimodal call. Per the Anthropic vision documentation, the Messages API accepts an image content block alongside a text content block in the same user turn; the image can be passed as base64 or as a URL.¹ Base64 is the right choice here because the screenshot lives on the local disk.

Add this to scraper.py:

import base64
from pathlib import Path
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from environment

def extract_with_claude(image_path: str, prompt: str) -> str:
    """Send a local screenshot plus a text prompt to Claude. Return raw text."""
    image_bytes = Path(image_path).read_bytes()
    image_b64 = base64.standard_b64encode(image_bytes).decode("utf-8")

    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": image_b64,
                        },
                    },
                    {
                        "type": "text",
                        "text": prompt,
                    },
                ],
            }
        ],
    )
    return message.content[0].text

A few choices are worth narrating. claude-sonnet-4-6 is the canonical Sonnet model ID per the Anthropic models overview page;⁸ Sonnet is the right tier for an extraction task because it’s faster and cheaper than Opus while still strong on vision and structured output.⁴ max_tokens=1024 caps the response: extraction tasks rarely need more, and a tight cap keeps cost predictable. The image block uses media_type: "image/png" because Playwright’s default screenshot encoding is PNG; switch to image/jpeg if you reconfigure Playwright to emit JPEG.
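
The media-type point can be made mechanical rather than remembered. As a minimal sketch, the helpers below sniff the file signature and build the same image content block shape used in extract_with_claude. Both detect_media_type and build_image_block are hypothetical names introduced for illustration, not part of the Anthropic SDK, and the sketch assumes only PNG and JPEG need handling because those are the two formats discussed above.

```python
import base64

# File signatures ("magic bytes") for the two formats discussed above.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_media_type(data: bytes) -> str:
    """Return the MIME type for PNG or JPEG bytes; raise on anything else."""
    if data.startswith(PNG_SIGNATURE):
        return "image/png"
    if data.startswith(JPEG_SIGNATURE):
        return "image/jpeg"
    raise ValueError("unsupported image format: expected PNG or JPEG")


def build_image_block(data: bytes) -> dict:
    """Build a base64 image content block from raw image bytes.

    Hypothetical helper: it mirrors the dictionary written inline in
    extract_with_claude, but derives media_type from the bytes themselves.
    """
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": detect_media_type(data),
            "data": base64.standard_b64encode(data).decode("utf-8"),
        },
    }
```

Inside extract_with_claude, build_image_block(Path(image_path).read_bytes()) could replace the hand-written image dictionary, so a later switch to JPEG screenshots does not silently send the wrong media type.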

Step 4: Validate the response with Pydantic

Claude can return JSON when asked, but the response is a string until something parses it. Pydantic v2’s BaseModel gives you a typed validation layer; per the Pydantic documentation, model_validate_json parses a JSON string and raises a ValidationError if the shape or types don’t match.⁹

Define the schema and the parser. For this walkthrough the target is a product card with a title, a price, and an availability flag:

from pydantic import BaseModel, Field, ValidationError

class Product(BaseModel):
    title: str = Field(description="The product name shown most prominently.")
    price: str = Field(description="The price as displayed, with currency symbol.")
    in_stock: bool = Field(description="True if the page shows the item as available.")

EXTRACTION_PROMPT = """\
Look at the screenshot of a product page and return ONLY a JSON object that
matches this schema, with no markdown fences and no extra commentary:

{
  "title": "<the product name>",
  "price": "<the price as displayed, e.g. INR1299 or USD24.99>",
  "in_stock": true or false
}
"""

def parse_product(image_path: str) -> Product:
    raw = extract_with_claude(image_path, EXTRACTION_PROMPT)
    try:
        return Product.model_validate_json(raw)
    except ValidationError as exc:
        raise RuntimeError(f"Claude returned non-conforming JSON: {raw!r}") from exc

Two things keep this honest. The prompt is explicit about no markdown fences; Claude will sometimes wrap JSON in triple backticks, and that wrapper trips the parser. The ValidationError handler surfaces the raw model output, which makes debugging a malformed response trivial when the prompt or the page changes.
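
Since the prompt alone cannot guarantee a fence-free reply, a small defensive step before validation can help. The sketch below is one possible approach, not something the Anthropic SDK or Pydantic provides: strip_code_fences and parse_reply are hypothetical helper names, and the pattern assumes the whole reply is either bare JSON or JSON wrapped in a single fenced block.

```python
import json
import re

# Matches a reply that consists entirely of one fenced block,
# with an optional "json" language tag after the opening backticks.
_FENCE_RE = re.compile(r"^\s*```(?:json)?\s*(.*?)\s*```\s*$", re.DOTALL | re.IGNORECASE)


def strip_code_fences(raw: str) -> str:
    """Return the reply without a surrounding fenced-block wrapper, if any."""
    match = _FENCE_RE.match(raw)
    return match.group(1) if match else raw.strip()


def parse_reply(raw: str) -> dict:
    """Parse a model reply as JSON after removing any fence wrapper."""
    return json.loads(strip_code_fences(raw))
```

In parse_product, the validation line could become Product.model_validate_json(strip_code_fences(raw)), which keeps the loud failure for genuinely malformed replies while tolerating the most common formatting slip.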

Step 5: Wire the end-to-end run

Replace the __main__ block from Step 2 with this one, which ties capture, extraction, and validation together:

if __name__ == "__main__":
    import sys

    target_url = sys.argv[1] if len(sys.argv) > 1 else "https://example.com"
    screenshot = capture_page(target_url, "page.png")
    product = parse_product(screenshot)
    print(product.model_dump_json(indent=2))

Run it against a real product page:

python scraper.py https://www.example.com/some-product

model_dump_json(indent=2) is Pydantic v2’s serialisation helper per the BaseModel reference;⁹ it round-trips the validated model back to a JSON string with pretty-printed indentation for human inspection.

Why this beats selector-based scraping

Selector-based scrapers (BeautifulSoup, lxml, even Playwright’s own page.locator(...)) bind extraction to the DOM structure. A div.product-price selector breaks when the site renames the class to product-price--v2, which happens routinely after redesigns. The screenshot-plus-multimodal pipeline reads the rendered visual, so a re-skinned layout that keeps the price visible keeps working.

Two trade-offs are worth naming. Latency is higher: a Playwright render plus a vision-model call typically runs a few seconds per page, against milliseconds for a parsed HTML extract. Cost is higher per request: per the Anthropic pricing page, Claude Sonnet 4.6 charges $3 per million input tokens with image input billed against the input-token meter,⁴ so a high-volume scrape of millions of pages adds up. For low-volume, layout-fragile targets (product pages on small retailers, dashboards behind a login, sites that block plain HTTP clients but render fine in a real browser) the robustness usually wins.
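
To put a rough number on the cost side, the sketch below estimates the per-request price from the screenshot dimensions. It uses the rates quoted in this article and assumes the approximation of one token per 750 pixels that Anthropic’s vision documentation has published; that heuristic, the default prompt and output token counts, and the function names are all assumptions for illustration, and the estimate ignores any server-side downscaling of large images. Check the current documentation before budgeting a batch.

```python
import math

INPUT_USD_PER_MTOK = 3.00    # Sonnet input rate quoted in this article
OUTPUT_USD_PER_MTOK = 15.00  # Sonnet output rate quoted in this article
PIXELS_PER_TOKEN = 750       # assumption: rough image-token heuristic


def estimate_image_tokens(width: int, height: int) -> int:
    """Approximate the input tokens consumed by one image."""
    return math.ceil(width * height / PIXELS_PER_TOKEN)


def estimate_request_cost(
    width: int,
    height: int,
    prompt_tokens: int = 150,   # assumed size of a short extraction prompt
    output_tokens: int = 100,   # assumed size of a small JSON reply
) -> float:
    """Approximate the USD cost of one screenshot-plus-prompt request."""
    input_tokens = estimate_image_tokens(width, height) + prompt_tokens
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000
```

Under these assumptions a 1280x800 viewport screenshot comes to about 1,366 image tokens and roughly $0.006 per request, so a million pages would land in the thousands of dollars.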

JS-heavy SPAs are the strongest case. A React or Vue app that renders content client-side returns an almost-empty HTML shell to a plain requests.get(...); Playwright executes the JavaScript and waits for the DOM to settle, so the screenshot captures the same content a human visitor sees. The model then reads that content without you writing any framework-specific selector logic.

Hardening checklist

A toy scraper is one thing; a production-shaped one needs a few more guards.

Respect robots.txt and rate limits. Per Playwright’s documentation, the framework drives a real browser and consumes resources accordingly;² add a delay between requests and check the target site’s terms of service before scaling.
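Retry the vision call on transient errors. The Anthropic Python SDK raises typed exceptions for rate-limit and server errors and already retries some failures automatically (configurable through the client’s max_retries option); for unattended runs, wrap client.messages.create(...) in a backoff loop on top of that.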
Pin the model ID. claude-sonnet-4-6 is a pinned snapshot per the Anthropic models overview;⁸ pinning the ID rather than relying on an evergreen alias keeps the extraction behaviour stable across SDK upgrades.
Cap the screenshot size. Per the Anthropic vision documentation, the per-image dimension cap is 8000x8000 pixels;¹ a long scrolling page can exceed that. Pass clip={"x": 0, "y": 0, "width": 1280, "height": 4000} to page.screenshot(...) for very long pages and run the extraction in tiles if needed.
Validate the JSON shape, not the prose. Pydantic’s ValidationError is the right boundary: if Claude drifts on schema, fail loud rather than silently shipping a malformed record downstream.
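
The tiling idea from the screenshot-size item can be sketched as a small planning function that turns a page size into a list of clip rectangles. This is one possible layout under stated assumptions: plan_tiles is a hypothetical helper, the 4000-pixel tile height comes from the checklist example, and the 200-pixel overlap is an assumed value meant to keep content that straddles a tile boundary fully visible in at least one tile.

```python
def plan_tiles(
    page_width: int,
    page_height: int,
    tile_height: int = 4000,  # tile height from the checklist example
    overlap: int = 200,       # assumption: vertical overlap between tiles
) -> list[dict[str, int]]:
    """Split a tall page into vertically stacked clip rectangles."""
    if tile_height <= overlap:
        raise ValueError("tile_height must be larger than overlap")
    tiles: list[dict[str, int]] = []
    y = 0
    while y < page_height:
        # The last tile is shortened so it never extends past the page.
        height = min(tile_height, page_height - y)
        tiles.append({"x": 0, "y": y, "width": page_width, "height": height})
        if y + height >= page_height:
            break
        # Step down by less than a full tile so neighbours overlap.
        y += tile_height - overlap
    return tiles
```

Each dictionary has the shape of the clip argument shown in the checklist, so a loop could pass one per page.screenshot(...) call and send each tile to Claude separately. Whether clip must be combined with full_page=True to reach below the viewport may depend on your Playwright version, so verify that on a long test page first.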

Where to take it next

A few natural extensions:

Multi-page traversal. Use Playwright’s page.click(...) and page.wait_for_load_state(...) to walk a list-detail flow, screenshotting each detail page and accumulating the extracted models in a list.
Streaming output. The Anthropic Messages API supports streaming responses per the API reference;¹⁰ useful when extracting from a long page where you want the first fields back before the full response lands.
Schema iteration. Extend the Product model with nested fields (specs, reviews count, seller name) and re-run; Pydantic’s validation does the heavy lifting of catching prompt drift.
Cached screenshots. For development, save the screenshot once and run the extraction loop against the saved file; that decouples prompt iteration from Playwright latency and saves on API calls while you’re tuning.
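
The cached-screenshot idea can be sketched with a small wrapper that only invokes the browser on a cache miss. The helper name cached_screenshot, the .screenshots directory, and the hashed file naming are all assumptions for illustration; the capture parameter is expected to have the same (url, output_path) signature as capture_page from Step 2.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_screenshot(
    url: str,
    capture: Callable[[str, str], str],
    cache_dir: str = ".screenshots",  # assumed cache location
) -> str:
    """Return a screenshot path for url, calling capture only on a cache miss."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Hash the URL so any URL maps to a short, filesystem-safe file name.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    path = directory / f"{digest}.png"
    if not path.exists():
        capture(url, str(path))
    return str(path)
```

During prompt tuning, parse_product(cached_screenshot(target_url, capture_page)) would then hit Playwright once per URL and reuse the saved file on every later run; delete the cache directory when the target page changes.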

The full source for the walkthrough is the Python code from Steps 2 through 5, in order, with the Step 2 __main__ block removed so that only the Step 5 entry point runs. Drop them into a single scraper.py and the end-to-end run takes a screenshot, extracts the structured data, and prints validated JSON for the target URL in one command.

How this article was made: an autonomous AI pipeline researched, drafted, fact-checked, and reviewed this piece, aggregating publicly-available information from the sources consulted below. AI (artificial intelligence) can make mistakes, so please cross-check the consulted sources before acting on anything here. Neural Tech Daily is not liable for decisions or outcomes based on this article.

Sources consulted

Cited Sources

1. Anthropic — Vision capabilities documentation (image content block, base64 and URL source types) (accessed 2026-05-20)
2. Playwright for Python — Installation guide (pip install playwright plus playwright install) (accessed 2026-05-20)
3. Pydantic — Documentation home (v2 BaseModel and validation) (accessed 2026-05-20)
4. Anthropic — API pricing (Claude Sonnet 4.6 input and output token rates) (accessed 2026-05-20)
5. Anthropic — Python SDK (anthropic) on PyPI (ANTHROPIC_API_KEY environment variable) (accessed 2026-05-20)
6. Playwright for Python — Auto-waiting and actionability (networkidle wait condition) (accessed 2026-05-20)
7. Playwright for Python — Screenshots API reference (full_page and clip options) (accessed 2026-05-20)
8. Anthropic — Models overview (claude-sonnet-4-6 canonical model ID) (accessed 2026-05-20)
9. Pydantic — BaseModel API reference (model_validate_json and model_dump_json) (accessed 2026-05-20)
10. Anthropic — Messages API reference (streaming responses) (accessed 2026-05-20)
