Build your first Smolagents agent in Python: a 30-minute tutorial without LangGraph or LangChain
A step-by-step Smolagents walkthrough in Python. Roughly 40 lines of code, any LLM, working code-writing agent in 30 minutes — no LangChain, no LangGraph.
Image: Hugging Face Smolagents launch blog, used for editorial coverage.
What you’ll need
Hugging Face frames Smolagents as the simplest path to a working AI agent in Python 1 . In about 30 minutes, with roughly 40 lines of code and any large-language model (Anthropic’s Claude, OpenAI’s GPT-5.5, a local Llama via Ollama, or any model on the Hugging Face Hub), this tutorial walks the reader from pip install to a research-assistant agent that searches the web, calls a custom tool, and writes its own loops in Python.
The framework was introduced by Hugging Face in December 2024 1 as a deliberately small alternative to LangChain and LangGraph. The smolagents core fits in roughly one thousand lines of code 2 , ships with a CodeAgent that writes Python instead of emitting JSON tool calls, integrates with LiteLLM for any-model support, and connects to MCP (Model Context Protocol) servers without extra glue code.
Both Hugging Face’s launch post 1 and the Smolagents docs 2 support starting here for a first agent: the framework is positioned as a “barebones library” with the CodeAgent pattern as its distinguishing design choice, and the skills carry forward when more orchestration is needed later.
Prerequisites:
- Python 3.10 or later (Smolagents requires it)
- An LLM API key from Anthropic, OpenAI, or Hugging Face Hub, or a local model running via Ollama
- A terminal, a code editor, and roughly 30 minutes
Step 1: install Smolagents
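A single pip command installs the framework. The [toolkit] extra pulls in the optional tools used later (DuckDuckGo search, web browsing, file operations), and the [litellm] extra installs the LiteLLM dependency that the LiteLLMModel class in Step 2 relies on.
pip install "smolagents[toolkit,litellm]"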
Verify the install:
python -c "import smolagents; print(smolagents.__version__)"
The current release is on the 1.x line 3 . Anything 1.0 or later supports the API used below.
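The two version requirements above (Python 3.10 or later, Smolagents 1.0 or later) can also be checked from a script. The following is a minimal sketch; the helper names parse_major_minor and meets_minimum are hypothetical and not part of Smolagents.

```python
import sys


def parse_major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '1.4.0'."""
    parts = version.split(".")
    major = int(parts[0])
    # Treat a missing or non-numeric minor component as 0.
    minor = int(parts[1]) if len(parts) > 1 and parts[1].isdigit() else 0
    return major, minor


def meets_minimum(version: str, minimum: tuple[int, int] = (1, 0)) -> bool:
    """True when `version` is at least `minimum` (major, minor)."""
    return parse_major_minor(version) >= minimum


# The interpreter itself must be 3.10 or later for this tutorial.
python_ok = sys.version_info >= (3, 10)
```

With the package installed, meets_minimum(smolagents.__version__) would tell a setup script whether the API used below is available.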
Image: Smolagents documentation index (huggingface.co/docs/smolagents), used for editorial coverage of the installation step in this tutorial.
Step 2: configure the LLM
Smolagents separates the agent (the loop) from the model (the brain). The model is whatever LLM produces the agent’s reasoning, and LiteLLMModel is the universal adapter. It speaks to Anthropic, OpenAI, Cohere, Mistral, Groq, Google, and dozens of other providers through one interface 4 .
Set the API key as an environment variable, then construct the model:
import os
from smolagents import CodeAgent, LiteLLMModel

# Placeholder key; in practice, export it in the shell rather than hardcoding it.
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

model = LiteLLMModel(
    model_id="anthropic/claude-sonnet-4-6",
    temperature=0.2,  # lower temperature reduces variance in tool-call decisions
)
For OpenAI, the model_id becomes openai/gpt-5.5 (and the env var is OPENAI_API_KEY). For a local Ollama model, use ollama_chat/llama3.1 after running ollama pull llama3.1. The agent code itself does not change; only the model_id string and the API key do.
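Because only the model_id string changes between providers, the choice can be derived from whichever API key is present. The sketch below assumes the model IDs and environment variable names used in this tutorial; pick_model_id, PROVIDERS, and LOCAL_FALLBACK are hypothetical names, not Smolagents or LiteLLM APIs.

```python
import os
from typing import Mapping, Optional

# (environment variable, model_id) pairs, tried in order.
# The IDs are the ones used in this tutorial.
PROVIDERS = [
    ("ANTHROPIC_API_KEY", "anthropic/claude-sonnet-4-6"),
    ("OPENAI_API_KEY", "openai/gpt-5.5"),
]
# Used when no hosted-provider key is set; assumes a local Ollama model.
LOCAL_FALLBACK = "ollama_chat/llama3.1"


def pick_model_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the model_id for the first provider whose key is set."""
    env = os.environ if env is None else env
    for var, model_id in PROVIDERS:
        if env.get(var):
            return model_id
    return LOCAL_FALLBACK
```

The returned string can be passed straight to LiteLLMModel(model_id=...), which keeps the rest of the agent code identical across providers.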
Step 3: define a custom tool
A Smolagents tool is a Python function with a type-annotated signature, a docstring, and the @tool decorator. The agent reads the docstring to decide when to call the tool and inspects the type annotations to validate the arguments.
from smolagents import tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a string of text.

    Args:
        text: The text whose words should be counted.
    """
    return len(text.split())
That’s the entire contract. The function name becomes the tool name, the docstring becomes the tool description the LLM sees, and the type annotations become the parameter schema. There is no JSON spec to author and no class to subclass.
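To make the contract concrete, the sketch below uses only the standard library to read the same three pieces of information (name, docstring, type annotations) from a plain function. It illustrates the idea only; it is not how the @tool decorator is implemented, and describe_tool is a hypothetical helper.

```python
import inspect
from typing import Any, Callable, get_type_hints


def describe_tool(func: Callable[..., Any]) -> dict:
    """Collect the name, description, and parameter types of a function."""
    hints = get_type_hints(func)
    return_type = hints.pop("return", None)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # First paragraph of the docstring serves as the description.
        "description": doc.split("\n\n")[0].strip(),
        "inputs": {name: getattr(t, "__name__", str(t)) for name, t in hints.items()},
        "output": getattr(return_type, "__name__", None),
    }


def word_count(text: str) -> int:
    """Count the number of words in a string of text.

    Args:
        text: The text whose words should be counted.
    """
    return len(text.split())


spec = describe_tool(word_count)
```

Here spec holds the function name, the first docstring paragraph, and a mapping from parameter name to type name, which is roughly the information an LLM needs in order to call the tool.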
Step 4: add the built-in DuckDuckGo search tool
Most agents need to search the web. Smolagents ships a built-in DuckDuckGoSearchTool that returns search results as a formatted string, with no API key required.
from smolagents import DuckDuckGoSearchTool
search_tool = DuckDuckGoSearchTool()
The class is named DuckDuckGoSearchTool for historical reasons; under the hood it wraps the DDGS package and searches multiple engines. Inside the agent the tool is exposed as web_search, which is the function name the model writes when it calls the tool.
Both the custom word_count tool and the built-in search tool are passed to the agent in a single list. The agent decides when to use each.
Image: Smolagents Tools tutorial (huggingface.co/docs/smolagents/en/tutorials/tools), used for editorial coverage of the @tool decorator pattern this step uses.
Step 5: run the agent
The full working pipeline assembles into one CodeAgent with two tools and one model:
from smolagents import CodeAgent, LiteLLMModel, DuckDuckGoSearchTool, tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a string of text.

    Args:
        text: The text whose words should be counted.
    """
    return len(text.split())

model = LiteLLMModel(model_id="anthropic/claude-sonnet-4-6")

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), word_count],
    model=model,
)

result = agent.run(
    "Search for the latest news on the Hugging Face Smolagents library, "
    "then tell me the word count of the most relevant headline."
)
print(result)
What happens internally is the load-bearing detail. The CodeAgent does not emit a JSON tool call. It writes a Python snippet that calls web_search(...), parses the result, calls word_count(...), and returns. The Smolagents launch blog cites Wang et al.’s 2024 paper Executable Code Actions Elicit Better LLM Agents 5 for the framing that writing actions in code, rather than as JSON tool-call snippets, gives better composability, object management, and generality: the model can express nested calls, store intermediate results, and use the full expressive range of Python instead of negotiating each tool call as a separate JSON turn.
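The following sketch shows the kind of snippet a code-writing agent might produce as a single action. The web_search function here is a stub that returns canned, made-up headlines so the example runs offline; the real tool's output format and the code an LLM actually writes will differ.

```python
def web_search(query: str) -> str:
    """Stub standing in for the real search tool; returns canned headlines."""
    return "\n".join([
        "Smolagents adds new features",
        "Unrelated headline about cooking",
        "Hugging Face Smolagents tutorial for beginners",
    ])


def word_count(text: str) -> int:
    """Same logic as the custom tool defined in Step 3."""
    return len(text.split())


# --- The kind of code an agent might write as one action ---
results = web_search("Hugging Face Smolagents news")
# Filter, loop, and store intermediate values in ordinary Python variables.
headlines = [line for line in results.splitlines() if "smolagents" in line.lower()]
counts = {headline: word_count(headline) for headline in headlines}
longest = max(counts, key=counts.get)
answer = f"{longest!r} has {counts[longest]} words"
```

Filtering, counting, and picking the maximum happen in one action. With JSON tool calls, each of those steps would typically need its own model turn.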
The agent runs in a restricted local Python interpreter by default (the LocalPythonExecutor), which limits imports and applies AST-level rules but is not a security sandbox per the Smolagents docs 6 . For any untrusted input — user-supplied prompts, scraped content, third-party MCP servers — swap in a remote executor such as E2B, Modal, or Docker.
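To give a feel for what an AST-level rule looks like, here is a minimal standard-library sketch that flags imports outside an allowlist. It is not the LocalPythonExecutor implementation; disallowed_imports and ALLOWED_IMPORTS are hypothetical names chosen for illustration.

```python
import ast

# Hypothetical allowlist for illustration only.
ALLOWED_IMPORTS = {"math", "statistics", "json"}


def disallowed_imports(source: str, allowed: set[str] = ALLOWED_IMPORTS) -> list[str]:
    """Return top-level module names imported by `source` that are not allowed."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" counts as "os"
            if root not in allowed:
                found.append(root)
    return found
```

A static check like this only sees import statements. A call such as __import__("os") passes straight through it, which illustrates why rule-based filtering on its own should not be treated as a security boundary.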
Image: Smolagents documentation, Conceptual guide — Introduction to Agents (huggingface.co/docs/smolagents), used for editorial coverage.
Step 6 (optional): add an MCP server as a tool source
Model Context Protocol 7 is the open standard Anthropic introduced in November 2024 for connecting LLMs to external data and tools. Any MCP server (a filesystem server, a Postgres server, a Slack server, or anything in the MCP server registry) plugs into a Smolagents agent in two lines. The MCP integration relies on optional dependencies, installed with pip install "smolagents[mcp]".
from smolagents import CodeAgent, LiteLLMModel
from mcp import StdioServerParameters
from smolagents.tools import ToolCollection

server_params = StdioServerParameters(
    command="uvx",
    args=["mcp-server-fetch"],
)

# The MCP server's tools are only available inside the with block.
with ToolCollection.from_mcp(server_params, trust_remote_code=True) as tool_collection:
    agent = CodeAgent(
        tools=[*tool_collection.tools],
        model=LiteLLMModel(model_id="anthropic/claude-sonnet-4-6"),
    )
    agent.run("Fetch https://huggingface.co/blog/smolagents and summarise the post.")
The mcp-server-fetch server is a reference implementation that gives the agent a fetch tool for retrieving web pages (pip install mcp-server-fetch, or uvx mcp-server-fetch runs it without a separate install). Swap the command for any MCP server in the registry and the agent gains those capabilities. This is what makes Smolagents practically useful: any tool ecosystem written for Claude Desktop or Cursor’s MCP integration is available to a Smolagents agent without porting.
Image: Smolagents GitHub repository (github.com/huggingface/smolagents), used for editorial coverage of the open-source framework this tutorial builds on.
Common pitfalls
A few traps catch most first-time users:
- Rate limits. Free and trial tiers rate-limit aggressively. Anthropic’s free Workbench tier and OpenAI’s free trial typically rate-limit at the per-minute level; expect 429 errors on multi-step tasks. Either upgrade the provider tier or insert retries via LiteLLM’s built-in backoff.
- API-key configuration. LiteLLM reads provider credentials from environment variables: ANTHROPIC_API_KEY, OPENAI_API_KEY, HF_TOKEN (for Hugging Face Inference). A common failure mode is exporting the key in one shell session and running the script from a different one. Keys do not persist across terminals unless added to .bashrc or a .env file.
- Tool error handling. If a tool raises an exception, the agent sees the traceback and tries to recover, which is usually fine. But network-bound tools (search, fetch) sometimes hang. Wrap long-running async calls in asyncio.wait_for, or set HTTP timeouts explicitly.
- Prompt injection in untrusted content. A web page the agent fetches can contain instructions targeting the LLM. The Smolagents docs flag this explicitly and recommend a remote executor (E2B, Modal, or Docker) for any agent processing third-party content, since the local interpreter is not a security boundary.
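The rate-limit and hanging-tool pitfalls can both be softened with small generic wrappers. The sketch below uses only the standard library: call_with_timeout bounds how long a synchronous call is waited on, and retry_with_backoff retries with exponentially growing delays. Both names are hypothetical, and neither is LiteLLM's or Smolagents' own retry mechanism.

```python
import concurrent.futures
import time
from typing import Any, Callable


def call_with_timeout(func: Callable[..., Any], timeout_s: float, *args: Any, **kwargs: Any) -> Any:
    """Run a synchronous function in a worker thread and stop waiting after timeout_s."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        name = getattr(func, "__name__", repr(func))
        raise TimeoutError(f"{name} exceeded {timeout_s}s") from None
    finally:
        # Do not wait for the worker here; a hung call would block the caller too.
        executor.shutdown(wait=False)


def retry_with_backoff(
    func: Callable[[], Any],
    attempts: int = 3,
    base_delay_s: float = 0.5,
    retry_on: tuple = (Exception,),
) -> Any:
    """Call func, retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay_s * 2 ** attempt)
```

The timeout only limits how long the caller waits. The worker thread keeps running in the background until the underlying call returns, so an explicit HTTP timeout on the request itself remains the more robust fix.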
Where to go next
Three directions extend this tutorial:
- Multi-agent setups. Smolagents supports nested agents through the managed_agents parameter on CodeAgent. A “manager” agent can call a “researcher” sub-agent that itself has tools, which is useful for separating planning from execution.
- Web UI via Gradio. The GradioUI helper wraps an agent in a chat interface in three lines: from smolagents import GradioUI; GradioUI(agent).launch(). The Hugging Face cookbook ships a worked example.
- Deployment. For production, the same agent runs inside a FastAPI endpoint, a Hugging Face Space, or a Docker container. The remote-executor pattern (E2B, Docker) replaces LocalPythonExecutor for safe untrusted-input handling.
The framework is small enough to read end-to-end. For builders who want to see how the loop actually works, the GitHub repository 8 has the full source; the agents.py module is the home of the reasoning-and-action loop.
How this article was made: an autonomous AI pipeline researched, drafted, fact-checked, and reviewed this piece, aggregating publicly-available information from the sources consulted below. AI (artificial intelligence) can make mistakes, so please cross-check the consulted sources before acting on anything here. Neural Tech Daily is not liable for decisions or outcomes based on this article.
Sources consulted
Cited Sources
- 1. Hugging Face introductory blog post for smolagents: December 2024 launch announcement, framing the framework as "simple agents that write actions in code"
- 2. Smolagents documentation index: "barebones library that fits in roughly thousand lines of code" framing
- 3. Smolagents documentation index: installation guide and current release version
- 4. LiteLLM documentation: list of supported provider integrations
- 5. Hugging Face blog: "Code agents" section citing Wang et al. 2024 (arXiv:2402.01030) for the composability, object-management, and generality advantages of writing actions in code versus JSON
- 6. Smolagents documentation: secure code execution guide; explicitly states "the built-in LocalPythonExecutor is not a security sandbox" and recommends E2B / Modal / Docker remote executors for untrusted input
- 7. Model Context Protocol: introduction and specification (Anthropic, November 2024)
- 8. Smolagents GitHub repository: agents.py contains the core reasoning-and-action loop
Further Reading
- Smolagents documentation: Building good agents
- Smolagents documentation: Tools tutorial
- Hugging Face Cookbook: Agents recipes