Build an AI Image-Tagging Pipeline: Claude Vision + S3 + Postgres in Python

Wire boto3, Claude vision, and psycopg into an end-to-end pipeline that tags every S3 upload and serves a /search?tag= API in FastAPI.

19 May 2026 · Updated 19 May 2026 · ~11 min read

Image: Anthropic vision guide, used for editorial coverage of the API surface the tutorial builds against.

What you’ll build

A small Python service that watches an S3 bucket, sends every newly uploaded image to Claude for tag extraction plus alt-text, writes the result to Postgres, and exposes a /search?tag= endpoint via FastAPI. Per the Anthropic vision guide, every current Claude model accepts image input, so the same code path works against claude-opus-4-7, claude-sonnet-4-6, and claude-haiku-4-5; pick the tier that matches your latency and budget targets.¹ The Haiku tier is the source-recommended default for high-volume tagging given its $1 / $5 per million input/output tokens price point.²

The pipeline has four moving parts:

S3 upload trigger. An s3:ObjectCreated:Put event notification invokes a Lambda function asynchronously with the bucket and object key.³
Tagger. The function pulls the object bytes via the boto3 S3 client’s get_object call, base64-encodes them, and sends them to Claude with a structured-output prompt.⁴
Postgres writer. The response (tags array plus alt-text) is upserted into an images table via psycopg 3, the current stable PostgreSQL adapter family.⁵
Search API. A FastAPI service exposes /search?tag=cat&tag=outdoors, translating list query parameters into a PostgreSQL array-overlap filter.⁶

This piece is a tutorial; it does not benchmark Claude’s tagging accuracy against alternatives. The Anthropic vision guide notes that Opus 4.7 supports images up to 2,576 px on the long edge, more than three times the prior generation’s limit (measured in pixel count rather than edge length), making it the source-recommended choice when input fidelity matters.⁷
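To make the resolution ceiling concrete, here is a minimal sketch that computes the dimensions an image would have after scaling its long edge down to a limit. The 2,576 px default is the figure quoted above; the function name fit_long_edge and the rounding choice are illustrative assumptions, not part of any SDK.

```python
def fit_long_edge(width: int, height: int, max_long_edge: int = 2576) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side is <= max_long_edge.

    Images already within the limit come back unchanged. The aspect ratio is
    preserved and each side is rounded to the nearest pixel (minimum 1).
    """
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height
    scale = max_long_edge / long_edge
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_long_edge(4000, 3000))  # (2576, 1932)
print(fit_long_edge(1024, 768))   # (1024, 768), already small enough
```

The function only does the arithmetic; the actual resize would be handled by whatever imaging library you already use.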

Image: Anthropic models overview, used for editorial coverage of the model IDs the tutorial passes to the SDK.

What you’ll need

Python 3.11 or later.
An Anthropic API key (set ANTHROPIC_API_KEY in your environment).
An AWS account with permission to create an S3 bucket, a Lambda function, and an IAM role.
A PostgreSQL 14 or later instance: local, RDS, or any managed Postgres works.
The packages: anthropic, boto3, psycopg[binary], and fastapi[standard].

python -m venv .venv
source .venv/bin/activate
pip install "anthropic>=0.40" "boto3>=1.43" "psycopg[binary]>=3.3" "fastapi[standard]>=0.110"

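The Anthropic SDK ships on PyPI and is updated alongside model releases. Model IDs are passed as plain strings, so an older SDK will still send claude-opus-4-7, but pinning the floor at a recent minor keeps the SDK’s type hints and request options current.⁸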

Step 1: define the Postgres schema

The schema is a single table. Tags are stored as a native TEXT array so the search endpoint can use Postgres’s array operators directly, per the official array-functions reference.⁹

CREATE TABLE IF NOT EXISTS images (
    id            BIGSERIAL PRIMARY KEY,
    s3_key        TEXT NOT NULL UNIQUE,
    bucket        TEXT NOT NULL,
    tags          TEXT[] NOT NULL DEFAULT '{}',
    alt_text      TEXT NOT NULL DEFAULT '',
    tagged_at     TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS images_tags_gin
    ON images USING GIN (tags);

The GIN index on tags is what makes overlap queries cheap at scale. Without it, every /search request is a sequential scan, which becomes noticeably slow once the table grows past a few thousand rows.

Apply the migration with whatever runner you prefer; a one-shot psql -f schema.sql is enough for the tutorial. Persist the connection string in DATABASE_URL.

Step 2: write the tagger

The tagger is a single Python module that the Lambda handler imports. It fetches the S3 object, sends it to Claude, and parses the JSON response. Keep it framework-agnostic so the same code runs locally, in a script, or in Lambda.

# tagger.py
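import base64
import json
import mimetypes

import boto3
from anthropic import Anthropic

_s3 = boto3.client("s3")
_anthropic = Anthropic()  # reads ANTHROPIC_API_KEY from env

# Media types the vision API accepts for base64 image sources.
SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}

PROMPT = """You are an image-tagging assistant. Return a single JSON
object with two keys:

- tags: a list of 5 to 10 lowercase tags describing the subject,
  setting, and notable attributes.
- alt_text: a single sentence describing the image for a screen
  reader, under 125 characters.

Reply with the JSON object only, no prose."""


def fetch_image_bytes(bucket, key):
    response = _s3.get_object(Bucket=bucket, Key=key)
    body = response["Body"].read()
    media_type = response.get("ContentType", "")
    if media_type not in SUPPORTED_MEDIA_TYPES:
        # S3 stores a generic type when the uploader sets none, so fall
        # back to guessing from the key's file extension.
        media_type = mimetypes.guess_type(key)[0]
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported image type for s3://{bucket}/{key}")
    return body, media_type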


def tag_image(bucket, key, model="claude-haiku-4-5"):
    image_bytes, media_type = fetch_image_bytes(bucket, key)
    encoded = base64.standard_b64encode(image_bytes).decode("ascii")

    message = _anthropic.messages.create(
        model=model,
        max_tokens=512,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": media_type,
                            "data": encoded,
                        },
                    },
                    {"type": "text", "text": PROMPT},
                ],
            }
        ],
    )

    raw = message.content[0].text.strip()
    payload = json.loads(raw)
    return {
        "tags": [t.lower().strip() for t in payload["tags"]],
        "alt_text": payload["alt_text"].strip(),
    }

Two things worth flagging. First, the content array follows the shape documented in the Anthropic vision guide: a type: image block with a source object carrying type: base64, the media type, and base64-encoded bytes, paired with a type: text block for the instruction.¹⁰ Second, the SDK reads ANTHROPIC_API_KEY from the environment automatically, so the constructor takes no arguments in the common case.

If the model returns prose or a Markdown code fence around the JSON (rare for Haiku and Sonnet with this prompt, slightly more common for Opus on adversarial inputs), json.loads raises json.JSONDecodeError. Wrap the call in a try/except and re-prompt; retry-on-parse-failure is the simplest robust pattern.
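One way to sketch the retry-on-parse-failure pattern, using only the standard library. The helper names extract_json_object and parse_with_retry are hypothetical, and ask_model stands in for whatever callable wraps your messages.create call; treat this as an illustration under those assumptions rather than the tutorial's canonical implementation.

```python
import json
import re


def extract_json_object(raw: str) -> dict:
    """Parse a JSON object out of a model reply that may carry extra text.

    Tries the whole string first, then falls back to the span between the
    first "{" and the last "}", which also drops any surrounding code fence.
    Raises ValueError if no JSON object can be recovered.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
        if match is None:
            raise ValueError("no JSON object found in model reply")
        try:
            parsed = json.loads(match.group(0))
        except json.JSONDecodeError as exc:
            raise ValueError(f"unparseable JSON in model reply: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("model reply was valid JSON but not an object")
    return parsed


def parse_with_retry(ask_model, max_attempts: int = 3) -> dict:
    """Call ask_model() until its reply parses, up to max_attempts times.

    ask_model is a hypothetical zero-argument callable returning the raw
    reply text, for example a lambda wrapping the API request.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return extract_json_object(ask_model())
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Capping the attempts matters in Lambda: an unbounded retry loop would burn both tokens and execution time on an image the model cannot describe.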

Image: Anthropic pricing page, used for editorial coverage of the token-rate numbers the tutorial cites.

Step 3: write to Postgres

Use psycopg 3 with the connection context manager; the with block ensures the connection is closed even on exceptions. The cur.execute placeholder is %s, which the driver maps to the correct Postgres parameter form, including arrays.

# db.py
import os
from contextlib import contextmanager

import psycopg

DATABASE_URL = os.environ["DATABASE_URL"]


@contextmanager
def get_conn():
    with psycopg.connect(DATABASE_URL) as conn:
        yield conn


def upsert_image(bucket, key, tags, alt_text):
    sql = """
        INSERT INTO images (s3_key, bucket, tags, alt_text)
        VALUES (%s, %s, %s, %s)
        ON CONFLICT (s3_key) DO UPDATE
        SET tags = EXCLUDED.tags,
            alt_text = EXCLUDED.alt_text,
            tagged_at = NOW();
    """
    with get_conn() as conn:
        with conn.cursor() as cur:
            cur.execute(sql, (key, bucket, tags, alt_text))

The psycopg 3 docs cover this pattern as the basic-usage default; the cursor closes when the inner with exits and the connection commits when the outer with exits.¹¹ No manual commit call needed in the happy path.

Step 4: wire the Lambda handler

The S3 trigger delivers an event with a Records array; each record names the bucket and object key. Iterate the records (S3 usually delivers one per invocation, but the schema is plural for a reason) and call the tagger plus the writer.

# lambda_handler.py
from urllib.parse import unquote_plus

from db import upsert_image
from tagger import tag_image


def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        result = tag_image(bucket, key)
        upsert_image(
            bucket=bucket,
            key=key,
            tags=result["tags"],
            alt_text=result["alt_text"],
        )
    return {"status": "ok"}

unquote_plus matters because S3 URL-encodes spaces and special characters in the event payload; the AWS Lambda S3-trigger tutorial flags this as a common source of NoSuchKey errors when callers forget to decode.¹²
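A quick illustration of the difference, using a made-up key. The encoded form shown is an assumption about how a key containing spaces and parentheses might appear in an event payload.

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical event key for an object named "samples/my cat (1).jpg".
encoded_key = "samples/my+cat+%281%29.jpg"

decoded_key = unquote_plus(encoded_key)
percent_only = unquote(encoded_key)

print(decoded_key)   # samples/my cat (1).jpg
print(percent_only)  # samples/my+cat+(1).jpg  (the "+" signs survive)
```

Plain unquote handles the percent escapes but leaves the plus signs in place, which is enough to make get_object look up a key that does not exist.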

On the AWS side, configure the bucket’s event notification for s3:ObjectCreated:Put (or s3:ObjectCreated:* if you also want multipart uploads, copies, and POST uploads) and grant the Lambda’s execution role s3:GetObject on the bucket’s objects. If the function runs inside a VPC to reach Postgres, it also needs an outbound route, such as a NAT gateway, to reach the Anthropic API. The AWS Lambda docs cover the resource-based policy that lets S3 invoke the function.¹³
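If you prefer to script the notification rather than click through the console, a sketch along these lines should work with boto3's put_bucket_notification_configuration. The function names, the .jpg suffix filter, and the placeholder ARN are assumptions for illustration. Note that this call replaces the bucket's existing notification configuration, and S3 must already be allowed to invoke the function.

```python
def build_notification_config(function_arn: str, suffix: str = ".jpg") -> dict:
    """Build the NotificationConfiguration payload for one Lambda target."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:Put"],
                # Only fire for keys ending in the given suffix.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def attach_trigger(bucket: str, function_arn: str) -> None:
    """Apply the config to the bucket, replacing any existing notifications."""
    import boto3  # imported lazily so the builder above has no AWS dependency

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(function_arn),
    )


# Placeholder ARN; substitute your own function's ARN.
config = build_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:image-tagger"
)
print(config["LambdaFunctionConfigurations"][0]["Events"])
```

Keeping the payload builder separate from the API call makes the configuration easy to inspect or unit-test without touching AWS.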

Image: PostgreSQL array functions and operators, used for editorial coverage of the overlap operator the search endpoint uses.

Step 5: serve the search API

FastAPI maps repeated query parameters to a list when the parameter is typed as a list. The endpoint accepts one or more tags, treats them as an OR-overlap against the stored array, and returns matching rows.

# api.py
from typing import Annotated

from fastapi import FastAPI, Query

from db import get_conn

app = FastAPI()


@app.get("/search")
def search_images(tag: Annotated[list[str], Query(min_length=1)]):
    sql = """
        SELECT s3_key, bucket, tags, alt_text, tagged_at
        FROM images
        WHERE tags && %s
        ORDER BY tagged_at DESC
        LIMIT 50;
    """
    with get_conn() as conn:
        with conn.cursor() as cur:
            cur.execute(sql, (tag,))
            rows = cur.fetchall()

    return [
        {
            "s3_key": row[0],
            "bucket": row[1],
            "tags": row[2],
            "alt_text": row[3],
            "tagged_at": row[4].isoformat(),
        }
        for row in rows
    ]

The FastAPI docs cover list query parameters and the validation that rejects requests carrying no tag parameter before they hit the database.¹⁴ Note that min_length=1 constrains the list rather than each tag, so a blank ?tag= may still arrive as an empty string, in which case it simply matches nothing. The Postgres overlap operator (&&) is true when the two arrays share at least one element, which is the any-match semantics a tag search needs.¹⁵
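For intuition, the any-match and all-match semantics can be mimicked in plain Python with sets. This is only an analogy for how the && and @> array operators behave, not how Postgres evaluates them; the function names are illustrative.

```python
def overlaps(stored: list[str], wanted: list[str]) -> bool:
    """Any-match: true when the two lists share at least one element."""
    return not set(stored).isdisjoint(wanted)


def contains(stored: list[str], wanted: list[str]) -> bool:
    """All-match: true when every wanted element is present in stored."""
    return set(wanted).issubset(stored)


row_tags = ["cat", "sofa", "indoors"]

print(overlaps(row_tags, ["cat", "outdoors"]))  # True: "cat" is shared
print(contains(row_tags, ["cat", "outdoors"]))  # False: "outdoors" is missing
print(contains(row_tags, ["cat", "sofa"]))      # True: both are present
```

Swapping the endpoint from any-match to all-match would therefore be a one-operator change in the SQL.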

Run it locally with fastapi dev api.py and hit http://127.0.0.1:8000/search?tag=cat&tag=outdoors. The response is a JSON array of matching rows, ordered most-recent-first.
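A small standard-library client can build the repeated tag= pairs for you. The sketch below assumes the dev server is listening on the default address shown above; build_search_url and search are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000/search"  # assumes the local dev server


def build_search_url(tags: list[str]) -> str:
    """Return the /search URL with one tag= pair per requested tag."""
    query = urlencode([("tag", tag) for tag in tags])
    return f"{BASE_URL}?{query}"


def search(tags: list[str]) -> list[dict]:
    """Fetch and decode the matching rows (needs the API to be running)."""
    with urlopen(build_search_url(tags)) as response:
        return json.load(response)


print(build_search_url(["cat", "outdoors"]))
# http://127.0.0.1:8000/search?tag=cat&tag=outdoors
```

Passing a list of pairs to urlencode also takes care of escaping, so a tag containing a space is sent correctly.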

Step 6: verify end to end

The fastest local verification path skips Lambda entirely. Drop a test image into the bucket, then invoke the handler manually:

# verify.py
from lambda_handler import lambda_handler

event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "your-bucket"},
                "object": {"key": "samples/cat.jpg"},
            }
        }
    ]
}

print(lambda_handler(event, None))

Then query the API. If the row appears with sensible tags and a screen-reader-friendly alt-text, the pipeline is wired. Promote to Lambda by zipping the three modules plus the deps, uploading via the console or aws lambda update-function-code, and attaching the S3 trigger.

Cost and rate-limit notes

Per Anthropic’s pricing page (verified 2026-05-19; prices change, so verify before deploying), Haiku 4.5 input costs $1 per million tokens and output $5 per million tokens.¹⁶ A 1024 by 1024 image consumes roughly 1,400 tokens by the vision guide’s tokens-per-image formula (width × height / 750); at Haiku rates that works out to about $0.0014 per tagged image, plus about $0.0004 for the ~80-token JSON response.¹⁷ Costs compound quickly at bucket scale. If you expect more than a few uploads per second, buffer the events in an SQS queue so the pipeline can apply backpressure against the API’s per-minute rate limits.
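A back-of-envelope version of this arithmetic is sketched below. It assumes image tokens can be approximated as width × height / 750, which is worth re-checking against the current vision guide, and it uses the Haiku 4.5 rates quoted above; the function names and the 80-token output default are illustrative.

```python
HAIKU_INPUT_USD_PER_MTOK = 1.00   # rate quoted above; verify before relying on it
HAIKU_OUTPUT_USD_PER_MTOK = 5.00


def estimate_image_tokens(width_px: int, height_px: int) -> int:
    """Approximate image tokens as width * height / 750 (assumed formula)."""
    return round(width_px * height_px / 750)


def estimate_cost_usd(
    width_px: int,
    height_px: int,
    prompt_tokens: int = 0,
    output_tokens: int = 80,
) -> float:
    """Estimate the cost of tagging one image at the Haiku rates above."""
    input_tokens = estimate_image_tokens(width_px, height_px) + prompt_tokens
    return (
        input_tokens * HAIKU_INPUT_USD_PER_MTOK
        + output_tokens * HAIKU_OUTPUT_USD_PER_MTOK
    ) / 1_000_000


print(estimate_image_tokens(1024, 1024))          # 1398
print(f"{estimate_cost_usd(1024, 1024):.6f}")     # 0.001798
print(f"{estimate_cost_usd(1024, 1024) * 1e6:,.0f} USD per million images")
```

The text prompt adds a small number of input tokens per request as well; pass prompt_tokens if you want that included in the estimate.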

For higher-fidelity tagging (fine-grained product attributes, brand recognition, text in images), the Anthropic models page surfaces Opus 4.7 as the source-recommended tier given its higher max image resolution.¹⁸ The code path is identical: change the model argument and add a claude-opus-4-7 budget alarm in your billing console.

What sources flag as common failure modes

Forgetting unquote_plus on the S3 key. Spaces in filenames arrive as + characters in the event payload; the AWS docs flag this explicitly.¹²
Sending raw bytes instead of base64. The vision guide is explicit that the base64 source expects an ASCII-encoded string, not raw bytes.¹⁰
Skipping the GIN index. Postgres can answer array-overlap queries without it, but only by scanning every row; GIN is the index type whose default array operator class supports the overlap and contains operators.⁹
Using an IN clause instead of overlap. A scalar IN clause does not do what you want against an array column; use the overlap operator (&&) for any-match semantics or the contains operator (@>) for all-match.⁹
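The base64 point is easy to check in isolation. The sketch below builds the same image block shape used in tagger.py from a few literal bytes (the PNG file signature, chosen only as a tiny stand-in for real image data) and shows that the payload is a str, which is what makes the request JSON-serialisable. The helper name build_image_block is hypothetical.

```python
import base64
import json


def build_image_block(image_bytes: bytes, media_type: str) -> dict:
    """Return an image content block with the payload as a base64 ASCII str."""
    encoded = base64.standard_b64encode(image_bytes).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": encoded},
    }


# The 8-byte PNG signature stands in for real image bytes here.
block = build_image_block(b"\x89PNG\r\n\x1a\n", "image/png")

print(type(block["source"]["data"]).__name__)  # str
print(block["source"]["data"])                 # iVBORw0KGgo=
print(len(json.dumps(block)) > 0)              # True: the block serialises
```

Skipping the .decode("ascii") step would leave a bytes object in the payload, which a JSON encoder cannot serialise.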

Where to take it next

Three reasonable extensions:

Switch to the Files API if you’re uploading the same image more than once; the vision guide flags file_id references as the lower-payload path for high-volume tagging.¹
Add a confidence score by asking Claude to rate each tag from 0 to 1, then store it in a parallel confidences float-array column for ranked retrieval.
Layer pgvector on top of the alt-text column to support semantic search; the Postgres array operators handle exact-tag search, and pgvector closes the loop for “images about quiet rainy mornings” queries.
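For the confidence-score extension, one possible shape is to have the model return a list of objects with a tag and a confidence, then split that into two parallel lists matching the tags and confidences columns. Both the JSON shape and the helper name split_scored_tags are assumptions made for this sketch.

```python
def split_scored_tags(scored: list[dict]) -> tuple[list[str], list[float]]:
    """Turn [{"tag": ..., "confidence": ...}, ...] into two parallel lists.

    Tags are lowercased and stripped, confidences are clamped to [0, 1],
    and entries are ordered most-confident first.
    """
    cleaned = [
        (
            str(item["tag"]).lower().strip(),
            min(1.0, max(0.0, float(item["confidence"]))),
        )
        for item in scored
    ]
    cleaned.sort(key=lambda pair: pair[1], reverse=True)
    tags = [tag for tag, _ in cleaned]
    confidences = [confidence for _, confidence in cleaned]
    return tags, confidences


tags, confidences = split_scored_tags(
    [
        {"tag": "Sofa", "confidence": 0.6},
        {"tag": " cat ", "confidence": 1.2},  # out-of-range score gets clamped
    ]
)
print(tags)         # ['cat', 'sofa']
print(confidences)  # [1.0, 0.6]
```

Keeping the two lists index-aligned is what lets a query rank results by the score of the matched tag.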

How this article was made: an autonomous AI pipeline researched, drafted, fact-checked, and reviewed this piece, aggregating publicly-available information from the sources consulted below. AI (artificial intelligence) can make mistakes, so please cross-check the consulted sources before acting on anything here. Neural Tech Daily is not liable for decisions or outcomes based on this article.

Sources consulted

Cited Sources

1. Anthropic: Vision guide, "Before you upload" and "Files API" sections (accessed 2026-05-19)
2. Anthropic: Models overview, Haiku 4.5 row ($1 / $5 per MTok) (accessed 2026-05-19)
3. AWS Lambda: Process Amazon S3 event notifications with Lambda (accessed 2026-05-19)
4. boto3: S3 client get_object reference (Body StreamingBody.read()) (accessed 2026-05-19)
5. psycopg on PyPI: current stable release family (3.3.x) (accessed 2026-05-19)
6. FastAPI: Query parameter models (list handling) (accessed 2026-05-19)
7. Anthropic: Vision guide, "General limits" section (8000x8000 px ceiling; per-model resolution differences) (accessed 2026-05-19)
8. Anthropic Python SDK on PyPI (accessed 2026-05-19)
9. PostgreSQL: Array functions and operators (overlap, contains, GIN index) (accessed 2026-05-19)
10. Anthropic: Vision guide, base64 image example section (accessed 2026-05-19)
11. psycopg 3: Basic module usage (connection and cursor context managers) (accessed 2026-05-19)
12. AWS Lambda: S3 trigger tutorial (URL-decoding the object key) (accessed 2026-05-19)
13. AWS Lambda: Resource-based policy for S3 invocation (accessed 2026-05-19)
14. FastAPI: Query parameter models, list and validator usage (accessed 2026-05-19)
15. PostgreSQL: Array operators, overlap semantics (accessed 2026-05-19)
16. Anthropic: Pricing (Haiku 4.5: $1 / input MTok, $5 / output MTok) (accessed 2026-05-19)
17. Anthropic: Vision guide, "Calculate image costs" section (accessed 2026-05-19)
18. Anthropic: Models overview, Opus 4.7 image-resolution note (accessed 2026-05-19)
