Neural Tech Daily

Build your first Smolagents agent in Python: a 30-minute tutorial without LangGraph or LangChain

A step-by-step Smolagents walkthrough in Python. Roughly 40 lines of code, any LLM, working code-writing agent in 30 minutes — no LangChain, no LangGraph.

Image: Hugging Face Smolagents launch blog, used for editorial coverage.

What you’ll need

Hugging Face frames Smolagents as the simplest path to a working AI agent in Python [1]. In about 30 minutes, with roughly 40 lines of code and any large language model (Anthropic’s Claude, OpenAI’s GPT-5.5, a local Llama via Ollama, or any model on the Hugging Face Hub), this tutorial goes from pip install to a research-assistant agent that searches the web, calls a custom tool, and writes its own loops in Python.

The framework was introduced by Hugging Face in December 2024 [1] as a deliberately small alternative to LangChain and LangGraph. The Smolagents core fits in roughly one thousand lines of code [2], ships with a CodeAgent that writes Python instead of emitting JSON tool calls, integrates with LiteLLM for any-model support, and connects to MCP (Model Context Protocol) servers without extra glue code.

Both Hugging Face’s launch framing [1] and the Smolagents docs [2] support starting here for a first agent: the framework is positioned as a “barebones library” with the CodeAgent pattern as its distinguishing design choice, and the skills carry forward when more orchestration is needed later.

Prerequisites:

  • Python 3.10 or later (Smolagents requires it)
  • An LLM API key from Anthropic, OpenAI, or Hugging Face Hub, or a local model running via Ollama
  • A terminal, a code editor, and roughly 30 minutes

Step 1: install Smolagents

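A single pip command installs the framework. The [toolkit] extra pulls in the optional tools used later (DuckDuckGo search, web browsing, file operations), and the [litellm] extra installs LiteLLM, which the LiteLLMModel class in Step 2 depends on.

pip install "smolagents[toolkit,litellm]"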

Verify the install:

python -c "import smolagents; print(smolagents.__version__)"

The current release is on the 1.x line [3]. Anything 1.0 or later supports the API used below.
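
The prerequisite and install checks can be folded into one pre-flight script. The sketch below is an illustration, not part of Smolagents: the helper names python_ok and installed_version are hypothetical, and it uses only the standard library's importlib.metadata, so a missing package is reported rather than raised.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter meets the stated minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package: str = "smolagents"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("smolagents:", installed_version() or "not installed")
```

Running it before Step 2 catches the two most common setup problems (an old interpreter, or a package installed into a different virtual environment) in one go.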

Image: Smolagents documentation index (huggingface.co/docs/smolagents), used for editorial coverage of the installation step in this tutorial.

Step 2: configure the LLM

Smolagents separates the agent (the loop) from the model (the brain). The model is whatever LLM produces the agent’s reasoning, and LiteLLMModel is the universal adapter. It speaks to Anthropic, OpenAI, Cohere, Mistral, Groq, Google, and dozens of other providers through one interface [4].

Set the API key as an environment variable, then construct the model:

import os
from smolagents import CodeAgent, LiteLLMModel

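# For a quick local test only; in real use, export the key in your shell instead of hardcoding it.
os.environ.setdefault("ANTHROPIC_API_KEY", "sk-ant-...")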

model = LiteLLMModel(
    model_id="anthropic/claude-sonnet-4-6",
    temperature=0.2,  # lower temperature reduces variance in tool-call decisions
)

For OpenAI, the model_id becomes openai/gpt-5.5 (and the env var is OPENAI_API_KEY). For a local Ollama model, use ollama_chat/llama3.1 after running ollama pull llama3.1. The agent code itself does not change; only the model_id string and the API key do.
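
Because only the model_id string and the key variable change, the provider choice can live in one small lookup table. The following is a minimal sketch under stated assumptions: PROVIDERS and model_settings are hypothetical names introduced here, and the model strings are simply the ones quoted in this tutorial. It fails early with a readable message when the expected key is missing.

```python
import os

# Provider -> (model_id, name of the env var holding the key).
# The strings are the ones quoted in this tutorial; Ollama runs locally
# and needs no key, hence None.
PROVIDERS = {
    "anthropic": ("anthropic/claude-sonnet-4-6", "ANTHROPIC_API_KEY"),
    "openai": ("openai/gpt-5.5", "OPENAI_API_KEY"),
    "ollama": ("ollama_chat/llama3.1", None),
}


def model_settings(provider: str, env=os.environ) -> dict:
    """Return keyword arguments for the model, failing early on a missing key."""
    model_id, key_var = PROVIDERS[provider]
    if key_var is not None and not env.get(key_var):
        raise RuntimeError(f"Set {key_var} in your shell before running the agent")
    return {"model_id": model_id, "temperature": 0.2}
```

With Smolagents installed, the result would be passed straight through, for example LiteLLMModel(**model_settings("anthropic")).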

Step 3: define a custom tool

A Smolagents tool is a Python function with a type-annotated signature, a docstring that describes each argument, and the @tool decorator. The model reads the docstring to decide when to call the tool, and the type annotations tell it which arguments the tool expects.

from smolagents import tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a string of text.

    Args:
        text: The text whose words should be counted.
    """
    return len(text.split())

That’s the entire contract. The function name becomes the tool name, the docstring becomes the tool description the LLM sees, and the type annotations become the parameter schema. There is no JSON spec to author and no class to subclass.
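
To make that mapping concrete, the sketch below derives the same three pieces (name, description, parameter schema) from a plain function using only the standard library. It illustrates the idea and is not how Smolagents implements @tool; the describe helper and the shape of its output are assumptions made for this example.

```python
import inspect
from typing import get_type_hints


def word_count(text: str) -> int:
    """Count the number of words in a string of text.

    Args:
        text: The text whose words should be counted.
    """
    return len(text.split())


def describe(func) -> dict:
    """Collect a tool's name, description, and parameter types (illustration only)."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # First docstring line serves as the short description.
        "description": doc.splitlines()[0] if doc else "",
        # Every annotated parameter except the return value.
        "inputs": {k: v.__name__ for k, v in hints.items() if k != "return"},
        "output": hints["return"].__name__ if "return" in hints else None,
    }


spec = describe(word_count)
print(spec)
```

Everything the model needs is already in the function definition, which is why an unannotated or undocumented function makes a poor tool.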

Step 4: add the built-in DuckDuckGo search tool

Most agents need to search the web. Smolagents ships a built-in DuckDuckGoSearchTool that returns search results as a formatted string, with no API key required.

from smolagents import DuckDuckGoSearchTool

search_tool = DuckDuckGoSearchTool()

The class is named DuckDuckGoSearchTool for historical reasons; under the hood it wraps the DDGS package and searches multiple engines. Inside the agent the tool is exposed as web_search, which is the function name the model writes when it calls the tool.

Both the custom word_count tool and the built-in search tool are passed to the agent in a single list. The agent decides when to use each.

Image: Smolagents Tools tutorial (huggingface.co/docs/smolagents/en/tutorials/tools), used for editorial coverage of the @tool decorator pattern this step uses.

Step 5: run the agent

The full working pipeline assembles into one CodeAgent with two tools and one model:

from smolagents import CodeAgent, LiteLLMModel, DuckDuckGoSearchTool, tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a string of text.

    Args:
        text: The text whose words should be counted.
    """
    return len(text.split())

model = LiteLLMModel(model_id="anthropic/claude-sonnet-4-6")

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), word_count],
    model=model,
)

result = agent.run(
    "Search for the latest news on the Hugging Face Smolagents library, "
    "then tell me the word count of the most relevant headline."
)
print(result)

What happens internally is the key detail. The CodeAgent does not emit a JSON tool call. It writes a Python snippet that calls web_search(...), parses the result, calls word_count(...), and returns the answer. The Smolagents launch blog cites Wang et al.’s 2024 paper, Executable Code Actions Elicit Better LLM Agents [5], for the argument that writing actions in code rather than as JSON tool-call snippets gives better composability, object management, and generality: the model can express nested calls, store intermediate results, and use the full expressive range of Python instead of negotiating each tool call as a separate JSON turn.
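
A code action for the task above might look roughly like the following. This is a hand-written illustration, not captured agent output: web_search is a stub returning canned headlines (the real tool returns a differently formatted string), and in a real run the agent would hand its result back to the framework instead of printing it.

```python
def web_search(query: str) -> str:
    """Stub standing in for the real search tool; returns canned headlines."""
    return "Smolagents adds new features\nUnrelated headline about something else"


def word_count(text: str) -> int:
    """Same logic as the custom tool defined in Step 3."""
    return len(text.split())


# The kind of snippet a code-writing agent could produce in a single step:
results = web_search("Hugging Face Smolagents news")
headlines = [line for line in results.splitlines() if line.strip()]
relevant = [h for h in headlines if "smolagents" in h.lower()]
counts = {h: word_count(h) for h in relevant}  # intermediate results stay in variables
print(counts)
```

The search, the filtering, and the per-headline tool calls all happen in one action. With JSON tool calls, each word_count invocation would be its own round trip through the model.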

The agent runs in a restricted local Python interpreter by default (the LocalPythonExecutor), which limits imports and applies AST-level rules but is not a security sandbox per the Smolagents docs [6]. For any untrusted input (user-supplied prompts, scraped content, third-party MCP servers), swap in a remote executor such as E2B, Modal, or Docker.
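
To show what an AST-level import rule means in practice, and why it falls short of a sandbox, here is a small standard-library sketch. It is not the LocalPythonExecutor's implementation; the allowlist contents and the disallowed_imports helper are hypothetical.

```python
import ast

ALLOWED_IMPORTS = {"math", "re", "json"}  # hypothetical allowlist for this sketch


def disallowed_imports(source: str, allowed=ALLOWED_IMPORTS) -> list[str]:
    """Return top-level module names imported by `source` that are not allowed."""
    blocked = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" is judged by its root, "os"
            if root not in allowed:
                blocked.append(root)
    return blocked
```

A check like this rejects import os and from subprocess import run, but an expression such as __import__("os") contains no import statement and passes untouched. Gaps of that kind are why static rules alone do not make a security boundary.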

Image: Smolagents documentation, Conceptual guide — Introduction to Agents (huggingface.co/docs/smolagents), used for editorial coverage.

Step 6 (optional): add an MCP server as a tool source

Model Context Protocol [7] is the open standard Anthropic introduced in November 2024 for connecting LLMs to external data and tools. Any MCP server (a filesystem server, a Postgres server, a Slack server, or anything in the MCP server registry) plugs into a Smolagents agent in two lines.

from smolagents import CodeAgent, LiteLLMModel
from mcp import StdioServerParameters
from smolagents.tools import ToolCollection

server_params = StdioServerParameters(
    command="uvx",
    args=["mcp-server-fetch"],
)

with ToolCollection.from_mcp(server_params, trust_remote_code=True) as tool_collection:
    agent = CodeAgent(
        tools=[*tool_collection.tools],
        model=LiteLLMModel(model_id="anthropic/claude-sonnet-4-6"),
    )
    agent.run("Fetch https://huggingface.co/blog/smolagents and summarise the post.")

The snippet needs the MCP client dependencies, which install with pip install "smolagents[mcp]". The mcp-server-fetch server is a reference implementation that gives the agent a fetch tool for retrieving web pages (pip install mcp-server-fetch, or uvx mcp-server-fetch runs it without a separate install). Swap the command for any MCP server in the registry and the agent gains those capabilities. This is a large part of what makes Smolagents practically useful: any tool ecosystem written for Claude Desktop or Cursor’s MCP integration is available to a Smolagents agent without porting.

Image: Smolagents GitHub repository (github.com/huggingface/smolagents), used for editorial coverage of the open-source framework this tutorial builds on.

Common pitfalls

A few traps catch most first-time users:

  • Rate limits. Free tiers of LLM providers tend to rate-limit aggressively. Anthropic’s free Workbench tier and OpenAI’s free trial typically rate-limit per minute, so expect 429 errors on multi-step tasks. Either upgrade the provider tier or add retries with backoff, for example through LiteLLM’s retry settings.
  • API-key configuration. LiteLLM reads provider credentials from environment variables: ANTHROPIC_API_KEY, OPENAI_API_KEY, or HF_TOKEN (for Hugging Face Inference). A common failure mode is exporting the key in one shell session and running the script from a different one. Keys do not persist across terminals unless added to .bashrc or a .env file.
  • Tool error handling. If a tool raises an exception, the agent sees the error and tries to recover, which is usually fine. But network-bound tools (search, fetch) sometimes hang. Set HTTP timeouts explicitly inside the tool, or run the blocking call in a worker thread with a timeout; asyncio.wait_for only helps when the tool’s work is itself asynchronous.
  • Prompt injection in untrusted content. A web page the agent fetches can contain instructions targeting the LLM. The Smolagents docs flag this explicitly and recommend a remote executor (E2B, Modal, or Docker) for any agent processing third-party content, since the local interpreter is not a security boundary.
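
The rate-limit and hanging-tool pitfalls both come down to bounding how long a call may take and how often it is retried. The helpers below are a generic standard-library sketch, not a LiteLLM or Smolagents feature; call_with_timeout and retry are hypothetical names, and the delays are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def call_with_timeout(func, *args, timeout: float = 10.0, **kwargs):
    """Run a blocking function in a worker thread; raise TimeoutError if it overruns.

    The worker is not killed on timeout, only abandoned, so the underlying
    call should still set its own network timeout where possible.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        name = getattr(func, "__name__", "call")
        raise TimeoutError(f"{name} exceeded {timeout}s") from None
    finally:
        pool.shutdown(wait=False)  # do not block on a hung worker


def retry(func, attempts: int = 3, base_delay: float = 1.0, retry_on=(Exception,)):
    """Call func(); after each failure wait base_delay, then 2x, then 4x, and so on."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)
```

In a tool body, a slow request could be wrapped as call_with_timeout(fetch_page, url, timeout=15), where fetch_page is whatever blocking function the tool uses. A rate-limited model call would be passed to retry with retry_on narrowed to the provider's rate-limit exception.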

Where to go next

Three directions extend this tutorial:

  • Multi-agent setups. Smolagents supports nested agents through the managed_agents parameter on CodeAgent. A “manager” agent can call a “researcher” sub-agent that itself has tools, which is useful for separating planning from execution.
  • Web UI via Gradio. The GradioUI helper wraps an agent in a chat interface in two lines: from smolagents import GradioUI; GradioUI(agent).launch(). The Hugging Face cookbook ships a worked example.
  • Deployment. For production, the same agent runs inside a FastAPI endpoint, a Hugging Face Space, or a Docker container. The remote-executor pattern (E2B, Docker) replaces LocalPythonExecutor for safe untrusted-input handling.

The framework is small enough to read end-to-end. For builders who want to see how the loop actually works, the GitHub repository [8] has the full source, and the agents.py module is the home of the reasoning-and-action loop.

How this article was made: an autonomous AI pipeline researched, drafted, fact-checked, and reviewed this piece, aggregating publicly-available information from the sources consulted below. AI (artificial intelligence) can make mistakes, so please cross-check the consulted sources before acting on anything here. Neural Tech Daily is not liable for decisions or outcomes based on this article.

Sources consulted
