Build an AI Image-Tagging Pipeline: Claude Vision + S3 + Postgres in Python
Wire boto3, Claude vision, and psycopg into an end-to-end pipeline that tags every S3 upload and serves a /search?tag= API in FastAPI.
Image: Anthropic vision guide, used for editorial coverage of the API surface the tutorial builds against.
What you’ll build
A small Python service that watches an S3 bucket, sends every newly uploaded image to Claude for tag extraction plus alt-text, writes the result to Postgres, and exposes a /search?tag= endpoint via FastAPI. Per the Anthropic vision guide, every current Claude model accepts image input, so the same code path works against claude-opus-4-7, claude-sonnet-4-6, and claude-haiku-4-5; pick the tier that matches your latency and budget targets. 1 The Haiku tier is the source-recommended default for high-volume tagging given its $1 / $5 per million input/output tokens price point. 2
The pipeline has four moving parts:
- S3 upload trigger. An s3:ObjectCreated:Put event notification invokes a Lambda function asynchronously with the bucket and object key. 3
- Tagger. The function pulls the object bytes via the boto3 S3 client’s get_object call, base64-encodes them, and sends them to Claude with a structured-output prompt. 4
- Postgres writer. The response (tags array plus alt-text) is upserted into an images table via psycopg 3, the current stable PostgreSQL adapter family. 5
- Search API. A FastAPI service exposes /search?tag=cat&tag=outdoors, translating list query parameters into a PostgreSQL array-overlap filter. 6
This piece is a tutorial; it does not benchmark Claude’s tagging accuracy against alternatives. The Anthropic vision guide notes that Opus 4.7 supports images up to 2,576 px on the long edge, more than three times the prior generation, making it the source-recommended choice when input fidelity matters. 7
Image: Anthropic models overview, used for editorial coverage of the model IDs the tutorial passes to the SDK.
What you’ll need
- Python 3.11 or later.
- An Anthropic API key (set ANTHROPIC_API_KEY in your environment).
- An AWS account with permission to create an S3 bucket, a Lambda function, and an IAM role.
- A PostgreSQL 14 or later instance: local, RDS, or any managed Postgres works.
- The packages: anthropic, boto3, psycopg[binary], and fastapi[standard].
python -m venv .venv
source .venv/bin/activate
pip install "anthropic>=0.40" "boto3>=1.43" "psycopg[binary]>=3.3" "fastapi[standard]>=0.110"
The Anthropic SDK ships on PyPI and is updated alongside model releases; pin the floor at a recent minor so the claude-opus-4-7 ID is recognised. 8
Step 1: define the Postgres schema
The schema is a single table. Tags are stored as a native TEXT array so the search endpoint can use Postgres’s array operators directly, per the official array-functions reference. 9
CREATE TABLE IF NOT EXISTS images (
    id        BIGSERIAL PRIMARY KEY,
    s3_key    TEXT NOT NULL UNIQUE,
    bucket    TEXT NOT NULL,
    tags      TEXT[] NOT NULL DEFAULT '{}',
    alt_text  TEXT NOT NULL DEFAULT '',
    tagged_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS images_tags_gin
    ON images USING GIN (tags);
The GIN index on tags is what makes overlap queries cheap at scale. Without it, every /search request falls back to a sequential scan of the whole table, which gets noticeably slower as the table grows past a few thousand rows.
Apply the migration with whatever runner you prefer; a one-shot psql -f schema.sql is enough for the tutorial. Persist the connection string in DATABASE_URL.
Step 2: write the tagger
The tagger is a single Python module that the Lambda handler imports. It fetches the S3 object, sends it to Claude, and parses the JSON response. Keep it framework-agnostic so the same code runs locally, in a script, or in Lambda.
# tagger.py
import base64
import json

import boto3
from anthropic import Anthropic

_s3 = boto3.client("s3")
_anthropic = Anthropic()  # reads ANTHROPIC_API_KEY from env

PROMPT = """You are an image-tagging assistant. Return a single JSON
object with two keys:
- tags: a list of 5 to 10 lowercase tags describing the subject,
  setting, and notable attributes.
- alt_text: a single sentence describing the image for a screen
  reader, under 125 characters.
Reply with the JSON object only, no prose."""


def fetch_image_bytes(bucket, key):
    """Download the object and return (bytes, media type)."""
    response = _s3.get_object(Bucket=bucket, Key=key)
    body = response["Body"].read()
    # Falls back to JPEG only when S3 reports no ContentType at all.
    media_type = response.get("ContentType", "image/jpeg")
    return body, media_type


def tag_image(bucket, key, model="claude-haiku-4-5"):
    """Send one S3 image to Claude and return normalised tags plus alt-text."""
    image_bytes, media_type = fetch_image_bytes(bucket, key)
    encoded = base64.standard_b64encode(image_bytes).decode("ascii")
    message = _anthropic.messages.create(
        model=model,
        max_tokens=512,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": media_type,
                            "data": encoded,
                        },
                    },
                    {"type": "text", "text": PROMPT},
                ],
            }
        ],
    )
    # The reply is expected to be a bare JSON object in the first text block.
    raw = message.content[0].text.strip()
    payload = json.loads(raw)
    return {
        "tags": [t.lower().strip() for t in payload["tags"]],
        "alt_text": payload["alt_text"].strip(),
    }
Two things worth flagging. First, the content array follows the shape documented in the Anthropic vision guide: a type: image block with a source object carrying type: base64, the media type, and base64-encoded bytes, paired with a type: text block for the instruction. 10 Second, the SDK reads ANTHROPIC_API_KEY from the environment automatically, so the constructor takes no arguments in the common case.
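One gap in the media-type handling is worth a sketch. S3 reports binary/octet-stream when an uploader sets no Content-Type, and the vision API accepts only JPEG, PNG, GIF, and WebP as base64 media types. The helper below is an illustrative assumption rather than part of the original tagger: the names sniff_media_type and resolve_media_type are hypothetical, and the check covers only those four formats by their leading magic bytes.

```python
from typing import Optional

# Media types accepted for base64 image sources (JPEG, PNG, GIF, WebP).
SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def sniff_media_type(data: bytes) -> Optional[str]:
    """Guess the image media type from the file's leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def resolve_media_type(data: bytes, content_type: Optional[str]) -> str:
    """Prefer the S3 ContentType when it is supported, else sniff the bytes."""
    if content_type in SUPPORTED_MEDIA_TYPES:
        return content_type
    sniffed = sniff_media_type(data)
    if sniffed is None:
        raise ValueError(f"unsupported image type: {content_type!r}")
    return sniffed
```

In fetch_image_bytes this could replace the plain response.get("ContentType", ...) lookup, so a mislabelled upload fails fast locally instead of surfacing as an API error.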
If the model returns prose around the JSON (rare for Haiku and Sonnet with this prompt, slightly more common for Opus on adversarial inputs), wrap the json.loads in a try/except and re-prompt; retry-on-parse-failure is the simplest robust pattern.
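A minimal sketch of that retry pattern follows. It assumes the model call is wrapped in a zero-argument callable (ask_model is a hypothetical name) so the retry logic can be exercised without network access; parse_tag_payload and tag_with_retry are likewise illustrative names, not part of the Anthropic SDK.

```python
import json
from typing import Callable


def parse_tag_payload(raw: str) -> dict:
    """Parse the model reply, tolerating prose or code fences around the JSON."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    payload = json.loads(raw[start : end + 1])
    if not isinstance(payload.get("tags"), list):
        raise ValueError("reply is missing a 'tags' list")
    if not isinstance(payload.get("alt_text"), str):
        raise ValueError("reply is missing an 'alt_text' string")
    return payload


def tag_with_retry(ask_model: Callable[[], str], attempts: int = 3) -> dict:
    """Call ask_model until its reply parses, up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return parse_tag_payload(ask_model())
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            last_error = exc
    raise RuntimeError(f"no valid JSON after {attempts} attempts") from last_error
```

Inside tag_image, ask_model could be a small lambda around the messages.create call that returns message.content[0].text. Capping the attempts keeps a persistently malformed reply from looping and running up token costs.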
Image: Anthropic pricing page, used for editorial coverage of the token-rate numbers the tutorial cites.
Step 3: write to Postgres
Use psycopg 3 with the connection context manager; the with block ensures the connection is closed even on exceptions. The cur.execute placeholder is %s, which the driver maps to the correct Postgres parameter form, including arrays.
# db.py
import os
from contextlib import contextmanager

import psycopg

DATABASE_URL = os.environ["DATABASE_URL"]


@contextmanager
def get_conn():
    with psycopg.connect(DATABASE_URL) as conn:
        yield conn


def upsert_image(bucket, key, tags, alt_text):
    sql = """
        INSERT INTO images (s3_key, bucket, tags, alt_text)
        VALUES (%s, %s, %s, %s)
        ON CONFLICT (s3_key) DO UPDATE
        SET tags = EXCLUDED.tags,
            alt_text = EXCLUDED.alt_text,
            tagged_at = NOW();
    """
    with get_conn() as conn:
        with conn.cursor() as cur:
            cur.execute(sql, (key, bucket, tags, alt_text))
The psycopg 3 docs cover this pattern as the basic-usage default; the cursor closes when the inner with exits and the connection commits when the outer with exits. 11 No manual commit call needed in the happy path.
Step 4: wire the Lambda handler
The S3 trigger delivers an event with a Records array; each record names the bucket and object key. Iterate the records (S3 usually delivers one per invocation, but the schema is plural for a reason) and call the tagger plus the writer.
# lambda_handler.py
from urllib.parse import unquote_plus

from db import upsert_image
from tagger import tag_image


def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        result = tag_image(bucket, key)
        upsert_image(
            bucket=bucket,
            key=key,
            tags=result["tags"],
            alt_text=result["alt_text"],
        )
    return {"status": "ok"}
unquote_plus matters because S3 URL-encodes spaces and special characters in the event payload; the AWS Lambda S3-trigger tutorial flags this as a common source of NoSuchKey errors when callers forget to decode. 12
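To make the decoding concrete, here is a small illustration using a hypothetical key for an object named "samples/my cat (1).jpg"; the encoded form shown is an assumption about how such a key might appear in the event payload.

```python
from urllib.parse import unquote_plus

# Hypothetical key as it might arrive in the S3 event payload.
encoded_key = "samples/my+cat+%281%29.jpg"

# unquote_plus turns "+" into spaces and resolves %XX escapes.
decoded_key = unquote_plus(encoded_key)
print(decoded_key)  # samples/my cat (1).jpg
```

Passing encoded_key straight to get_object would look up an object literally named with plus signs and percent escapes, which is how the NoSuchKey error arises.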
On the AWS side, configure the bucket’s event notification for s3:ObjectCreated:Put (or s3:ObjectCreated:* if you also want multi-part uploads and copies) and grant the Lambda’s execution role s3:GetObject on the bucket. If Postgres is in a VPC, attach the function to that VPC and give it a route to the public internet (for example through a NAT gateway), because a VPC-attached function otherwise cannot reach the Anthropic API. The AWS Lambda docs cover the resource-based policy that lets S3 invoke the function. 13
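The notification itself can also be set from Python. The sketch below only builds the configuration dictionary in the shape boto3's put_bucket_notification_configuration expects; the function name build_notification_config, the .jpg suffix filter, and the ARN in the usage comment are illustrative assumptions.

```python
def build_notification_config(function_arn: str, suffix: str = ".jpg") -> dict:
    """Build an S3 notification config that invokes one Lambda on new objects."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # The wildcard covers Put, Post, Copy and CompleteMultipartUpload.
                "Events": ["s3:ObjectCreated:*"],
                # Optional: only fire for keys ending in the given suffix.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


# Applying it needs AWS credentials, and S3 must already be allowed to invoke
# the function through its resource-based policy:
#
#   import boto3
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="your-bucket",
#       NotificationConfiguration=build_notification_config(
#           "arn:aws:lambda:us-east-1:123456789012:function:image-tagger"  # placeholder ARN
#       ),
#   )
```

Note that this call replaces the bucket's whole notification configuration, so merge in any existing rules before applying it to a shared bucket.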
Image: PostgreSQL array functions and operators, used for editorial coverage of the overlap operator the search endpoint uses.
Step 5: serve the search API
FastAPI maps repeated query parameters to a list when the parameter is typed as a list. The endpoint accepts one or more tags, treats them as an OR-overlap against the stored array, and returns matching rows.
# api.py
from typing import Annotated

from fastapi import FastAPI, Query

from db import get_conn

app = FastAPI()


@app.get("/search")
def search_images(tag: Annotated[list[str], Query(min_length=1)]):
    sql = """
        SELECT s3_key, bucket, tags, alt_text, tagged_at
        FROM images
        WHERE tags && %s
        ORDER BY tagged_at DESC
        LIMIT 50;
    """
    with get_conn() as conn:
        with conn.cursor() as cur:
            cur.execute(sql, (tag,))
            rows = cur.fetchall()
    return [
        {
            "s3_key": row[0],
            "bucket": row[1],
            "tags": row[2],
            "alt_text": row[3],
            "tagged_at": row[4].isoformat(),
        }
        for row in rows
    ]
The FastAPI docs cover list query parameters and the validation that rejects requests carrying no tag parameter before they hit the database. 14 The Postgres overlap operator (&&) is true if any element of the left array appears in the right array, and is the operator the array-functions reference recommends for tag-search workloads. 15
Run it locally with fastapi dev api.py and hit http://127.0.0.1:8000/search?tag=cat&tag=outdoors. The response is a JSON array of matching rows, ordered most-recent-first.
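For scripted checks, the repeated-parameter URL can be built with the standard library instead of by hand. This sketch assumes the local dev server address quoted above and issues no request; it only shows how the repeated tag parameter is encoded and decoded.

```python
from urllib.parse import parse_qs, urlencode

base_url = "http://127.0.0.1:8000/search"

# A list of pairs lets the same parameter name repeat.
query = urlencode([("tag", "cat"), ("tag", "outdoors")])
url = f"{base_url}?{query}"
print(url)  # http://127.0.0.1:8000/search?tag=cat&tag=outdoors

# Decoding the query string yields one list per parameter name,
# which is the shape the list[str] annotation receives.
print(parse_qs(query))  # {'tag': ['cat', 'outdoors']}
```

Using urlencode also percent-encodes tags that contain spaces or ampersands, which a hand-built URL tends to miss.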
Step 6: verify end to end
The fastest local verification path skips Lambda entirely. Drop a test image into the bucket, then invoke the handler manually:
# verify.py
from lambda_handler import lambda_handler

event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "your-bucket"},
                "object": {"key": "samples/cat.jpg"},
            }
        }
    ]
}

print(lambda_handler(event, None))
Then query the API. If the row appears with sensible tags and a screen-reader-friendly alt-text, the pipeline is wired. Promote to Lambda by zipping the three modules plus the deps, uploading via the console or aws lambda update-function-code, and attaching the S3 trigger.
Cost and rate-limit notes
Per Anthropic’s pricing page (verified 2026-05-19; prices fluctuate, verify before deploying), Haiku 4.5 input costs $1 per million tokens and output $5 per million tokens. 16 A 1024 by 1024 image consumes roughly 1,600 tokens by the vision guide’s tokens-per-image formula; at Haiku rates that works out to about $0.0016 per tagged image plus the output cost of the ~80-token JSON response. 17 Bucket-scale uploads compound quickly: batch behind an SQS queue if you expect more than a few requests per second so you can backpressure against the API’s per-minute rate limits.
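The arithmetic above can be worked through as a rough estimator. The sketch uses only the figures quoted in this section (about 1,600 input tokens per image, about 80 output tokens, $1 and $5 per million tokens); it ignores the prompt text's own input tokens, and the 100,000-image monthly volume is a made-up illustration.

```python
# Haiku 4.5 rates quoted in this section, in USD per million tokens.
INPUT_USD_PER_MTOK = 1.00
OUTPUT_USD_PER_MTOK = 5.00


def cost_per_image(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of tagging one image from its token counts."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


# 1,600 image tokens -> $0.0016; 80 output tokens -> $0.0004.
per_image = cost_per_image(input_tokens=1_600, output_tokens=80)
print(f"per image: ${per_image:.4f}")

# Hypothetical volume, for scale only.
monthly_images = 100_000
print(f"per {monthly_images:,} images: ${per_image * monthly_images:,.2f}")
```

On these assumptions output tokens add about a quarter to the image cost, so trimming the JSON reply matters less than resizing large images before upload.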
For higher-fidelity tagging (fine-grained product attributes, brand recognition, text in images), the Anthropic models page surfaces Opus 4.7 as the source-recommended tier given its higher max image resolution. 18 The code path is identical: change the model argument and add a claude-opus-4-7 budget alarm in your billing console.
What sources flag as common failure modes
- Forgetting unquote_plus on the S3 key. Spaces in filenames arrive as + characters in the event payload; the AWS docs flag this explicitly. 12
- Sending raw bytes instead of base64. The vision guide is explicit that the base64 source expects an ASCII-encoded string, not raw bytes. 10
- Skipping the GIN index. Postgres can answer array-overlap queries without it, but the array-functions reference rates GIN as the index type designed for the operator. 9
- Using an IN clause instead of overlap. A scalar IN clause does not do what you want against an array column; use the overlap operator for any-match semantics or the contains operator for all-match. 9
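The difference between any-match and all-match is easy to see in plain Python. The two functions below mimic the semantics of the Postgres overlap (&&) and contains (@>) operators with sets; they are an illustration of the behaviour for arrays without NULL elements, not how Postgres implements the operators.

```python
def overlaps(stored: list[str], wanted: list[str]) -> bool:
    """Any-match, like `tags && wanted`: at least one element in common."""
    return bool(set(stored) & set(wanted))


def contains(stored: list[str], wanted: list[str]) -> bool:
    """All-match, like `tags @> wanted`: every wanted element is stored."""
    return set(wanted) <= set(stored)


row_tags = ["cat", "sofa", "indoors"]

print(overlaps(row_tags, ["cat", "outdoors"]))  # True: "cat" is shared
print(contains(row_tags, ["cat", "outdoors"]))  # False: "outdoors" is missing
print(contains(row_tags, ["cat", "sofa"]))      # True: both are present
```

Switching the search endpoint to all-match semantics would mean replacing && with @> in its WHERE clause; the same GIN index serves both operators.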
Where to take it next
Three reasonable extensions, each named in the cited references:
- Switch to the Files API if you’re uploading the same image more than once; the vision guide flags file_id references as the lower-payload path for high-volume tagging. 1
- Add a confidence score by asking Claude to rate each tag from 0 to 1, then store it in a parallel confidences float-array column for ranked retrieval.
- Layer pgvector on top of the alt-text column to support semantic search; the Postgres array operators handle exact-tag search, and pgvector closes the loop for “images about quiet rainy mornings” queries.
How this article was made: an autonomous AI pipeline researched, drafted, fact-checked, and reviewed this piece, aggregating publicly-available information from the sources consulted below. AI (artificial intelligence) can make mistakes, so please cross-check the consulted sources before acting on anything here. Neural Tech Daily is not liable for decisions or outcomes based on this article.
Sources consulted
Cited Sources
- 1. Anthropic: Vision guide, "Before you upload" and "Files API" sections
- 2. Anthropic: Models overview, Haiku 4.5 row ($1 / $5 per MTok)
- 3. AWS Lambda: Process Amazon S3 event notifications with Lambda
- 4. boto3: S3 client get_object reference (Body StreamingBody.read())
- 5. psycopg on PyPI: current stable release family (3.3.x)
- 6. FastAPI: Query parameter models (list handling)
- 7. Anthropic: Vision guide, "General limits" section (8000x8000 px ceiling; per-model resolution differences)
- 8. Anthropic Python SDK on PyPI
- 9. PostgreSQL: Array functions and operators (overlap, contains, GIN index)
- 10. Anthropic: Vision guide, base64 image example section
- 11. psycopg 3: Basic module usage (connection and cursor context managers)
- 12. AWS Lambda: S3 trigger tutorial (URL-decoding the object key)
- 13. AWS Lambda: Resource-based policy for S3 invocation
- 14. FastAPI: Query parameter models, list and validator usage
- 15. PostgreSQL: Array operators, overlap semantics
- 16. Anthropic: Pricing (Haiku 4.5: $1 / input MTok, $5 / output MTok)
- 17. Anthropic: Vision guide, "Calculate image costs" section
- 18. Anthropic: Models overview, Opus 4.7 image-resolution note