5-Step Guide Empowers Developers to Build Custom AI Agents with LangGraph and Claude

A detailed tutorial outlines step-by-step instructions for building a versatile AI assistant that combines LangGraph with the Claude model from Anthropic. This multi-role agent can handle number crunching, live web searches, weather lookups, text analysis and time reporting. It starts with dependency setup, then moves into code examples covering tool definitions, model integration and workflow orchestration, using code snippets and commentary to guide coders through each feature.

The first code block installs key Python libraries by invoking pip via the subprocess module with -q to suppress verbose output. This toolkit covers dependencies such as langchain, langgraph, langchain-anthropic (which supplies the ChatAnthropic connector), duckduckgo_search and requests, plus any helper modules needed for environment handling. The script checks exit codes and prints errors if a download or build fails. That single-cell approach gives developers a reproducible starting point for notebooks or local setups.
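In outline, that install cell might look like the following sketch (the environment-handling helper, python-dotenv, is an assumption, and exact version pins may differ from the original notebook):

```python
import subprocess
import sys

# Package list mirrors the dependencies above; python-dotenv is an assumed
# helper for .env handling and may not appear in the original notebook.
PACKAGES = [
    "langchain",
    "langgraph",
    "langchain-anthropic",
    "duckduckgo_search",
    "requests",
    "python-dotenv",
]

for pkg in PACKAGES:
    # -q keeps pip quiet; capture output so failures can be reported cleanly.
    result = subprocess.run(
        [sys.executable, "-m", "pip", "install", "-q", pkg],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        print(f"Failed to install {pkg}: {result.stderr.strip()}")
```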

Next, the script imports standard packages—including os for system interaction, json for structured data, datetime for timestamps, math for numeric routines and subprocess when required. From PyPI it brings in requests for HTTP calls and duckduckgo_search for web lookups. Additional imports from the LangChain and LangGraph ecosystems handle chat messages, tool decorators and graph components. The ChatAnthropic connector provides access to Anthropic’s Claude models.
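Gathered in one place, the imports might read roughly as follows (module paths reflect current LangChain and LangGraph packaging and could differ slightly from the original notebook):

```python
import datetime
import json
import math
import os

import requests
from duckduckgo_search import DDGS

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_core.tools import tool

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode
```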

An environment variable named ANTHROPIC_API_KEY must supply a valid secret before model initialization. The code assigns the key via os.environ['ANTHROPIC_API_KEY'], which can be loaded from a .env file or injected through a CI/CD pipeline. Subsequent calls use os.getenv('ANTHROPIC_API_KEY') to avoid repeating the token inline. That pattern keeps sensitive credentials out of shared notebooks and version history.
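A minimal sketch of that pattern, assuming python-dotenv handles local .env loading:

```python
import os

from dotenv import load_dotenv  # assumption: python-dotenv is installed for .env handling

load_dotenv()  # pulls ANTHROPIC_API_KEY from a local .env file, if one exists

api_key = os.getenv("ANTHROPIC_API_KEY")
if not api_key:
    # In CI/CD the key is usually injected into the environment instead.
    print("ANTHROPIC_API_KEY is not set; the agent will fall back to the offline mock model.")
```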

A TypedDict subclass called AgentState captures a basic memory structure for storing past messages. A @tool-decorated calculator function parses and evaluates math expressions. It maps caret (^) to Python’s **, locks the namespace to safe names from the math module (for example, sin, cos, tan, log, sqrt) and handles errors by returning clear feedback instead of crashing.
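A sketch of both pieces, assuming LangGraph's add_messages reducer for message accumulation and an eval namespace restricted to the math module:

```python
import math
from typing import Annotated, TypedDict

from langchain_core.tools import tool
from langgraph.graph.message import add_messages


class AgentState(TypedDict):
    # add_messages appends new messages to the running history instead of overwriting it.
    messages: Annotated[list, add_messages]


@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression such as '2 + 2' or 'sqrt(16) ^ 2'."""
    try:
        expression = expression.replace("^", "**")  # map caret to Python exponentiation
        # Restrict eval to math-module names plus a couple of harmless builtins.
        allowed = {name: getattr(math, name) for name in dir(math) if not name.startswith("_")}
        allowed.update({"abs": abs, "round": round})
        result = eval(expression, {"__builtins__": {}}, allowed)
        return f"Result: {result}"
    except Exception as exc:
        return f"Invalid expression: {exc}"
```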

The web_search tool wraps the DuckDuckGo search client provided by the duckduckgo_search package. It accepts a query string and an optional num_results parameter clamped to lie between 1 and 10. A search call returns raw hits with title, link and snippet. Each result is formatted into an indexed line, then joined into a single block. If the call raises an exception or no results come back, the function returns a notice that nothing was found instead of crashing.
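A version of that tool, assuming the DDGS client from duckduckgo_search (its result dictionaries expose title, href and body fields):

```python
from duckduckgo_search import DDGS
from langchain_core.tools import tool


@tool
def web_search(query: str, num_results: int = 5) -> str:
    """Search the web via DuckDuckGo and return the top results as text."""
    num_results = max(1, min(num_results, 10))  # clamp to the 1-10 range
    try:
        with DDGS() as ddgs:
            hits = list(ddgs.text(query, max_results=num_results))
        if not hits:
            return "No results found for that query."
        lines = [
            f"{i}. {hit['title']}\n   {hit['href']}\n   {hit['body']}"
            for i, hit in enumerate(hits, start=1)
        ]
        return "\n".join(lines)
    except Exception as exc:
        return f"Search failed: {exc}"
```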

A weather_info function acts as a placeholder for real meteorology services. It holds a dictionary of sample records for New York, London, Tokyo and Paris. After normalizing the city name to lowercase, it looks up the entry and builds a reply showing temperature in Celsius, weather condition and humidity. Queries for unsupported locations yield a message that no forecast is available.
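The stub could look like this; the sample values are illustrative rather than taken from the tutorial:

```python
from langchain_core.tools import tool

# Illustrative sample records standing in for a real weather API.
SAMPLE_WEATHER = {
    "new york": {"temp_c": 22, "condition": "Partly cloudy", "humidity": 60},
    "london": {"temp_c": 15, "condition": "Rainy", "humidity": 80},
    "tokyo": {"temp_c": 26, "condition": "Sunny", "humidity": 55},
    "paris": {"temp_c": 18, "condition": "Overcast", "humidity": 65},
}


@tool
def weather_info(city: str) -> str:
    """Return stubbed weather data for a handful of sample cities."""
    record = SAMPLE_WEATHER.get(city.strip().lower())
    if record is None:
        return f"No forecast is available for {city}."
    return (
        f"Weather in {city.title()}: {record['temp_c']}°C, "
        f"{record['condition']}, humidity {record['humidity']}%"
    )
```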

The text_analyzer utility inspects any supplied string. It trims whitespace, counts total characters including spaces, then derives a no-space count by stripping blanks. It splits on whitespace to count words, treats punctuation such as ., ? and ! as sentence ends to count sentences, and computes an average words-per-sentence figure. A quick pass with collections.Counter finds the most frequent token. If the input is empty after trimming, the tool prompts for valid content.
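One way to implement those steps (splitting sentences with a simple regular expression is an assumption; the tutorial may detect terminators differently):

```python
import re
from collections import Counter

from langchain_core.tools import tool


@tool
def text_analyzer(text: str) -> str:
    """Report character, word and sentence statistics for a piece of text."""
    text = text.strip()
    if not text:
        return "Please provide some text to analyze."

    chars_with_spaces = len(text)
    chars_without_spaces = len(text.replace(" ", ""))
    words = text.split()
    sentences = [s for s in re.split(r"[.?!]+", text) if s.strip()]  # ., ? and ! end sentences
    avg_words = len(words) / max(len(sentences), 1)
    top_word, count = Counter(w.lower().strip(".,?!") for w in words).most_common(1)[0]

    return (
        f"Characters (with spaces): {chars_with_spaces}\n"
        f"Characters (without spaces): {chars_without_spaces}\n"
        f"Words: {len(words)}\n"
        f"Sentences: {len(sentences)}\n"
        f"Average words per sentence: {avg_words:.1f}\n"
        f"Most frequent word: '{top_word}' ({count}x)"
    )
```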

To serve time queries, the current_time tool taps datetime.datetime.now(). It formats the result as YYYY-MM-DD HH:MM:SS, delivering a clear timestamp so the agent can answer questions about the present moment or time-stamp its own replies.
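The tool itself is little more than a strftime call; the optional format parameter here is an added convenience, not something the tutorial specifies:

```python
import datetime

from langchain_core.tools import tool


@tool
def current_time(fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Return the current local date and time, formatted as YYYY-MM-DD HH:MM:SS by default."""
    return datetime.datetime.now().strftime(fmt)
```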

Model setup checks for the presence of ANTHROPIC_API_KEY. If it is set, the code loads Claude 3 Haiku (the Anthropic API expects the dated identifier, e.g. claude-3-haiku-20240307) with temperature=0.2 and a reasonable max_tokens setting. If the variable is empty, a MockLLM class intercepts calls, scans the prompt for keywords such as “calculate”, “search”, “weather” and “time”, then returns stubbed JSON that routes to the appropriate tools. That fallback lets testers try most routines offline.
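A sketch of that switch, assuming the dated Claude model identifier and a deliberately simple keyword-matching mock:

```python
import os

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage


class MockLLM:
    """Offline stand-in that routes keyword-matched prompts to tool names."""

    def invoke(self, messages):
        prompt = messages[-1].content.lower()
        if "calculate" in prompt:
            return AIMessage(content='{"tool": "calculator"}')
        if "search" in prompt:
            return AIMessage(content='{"tool": "web_search"}')
        if "weather" in prompt:
            return AIMessage(content='{"tool": "weather_info"}')
        if "time" in prompt:
            return AIMessage(content='{"tool": "current_time"}')
        return AIMessage(content="I can help with math, search, weather and time questions.")

    def bind_tools(self, tools):
        return self  # the mock ignores tool schemas


if os.getenv("ANTHROPIC_API_KEY"):
    llm = ChatAnthropic(
        model="claude-3-haiku-20240307",  # dated Claude 3 Haiku identifier
        temperature=0.2,
        max_tokens=1024,
    )
else:
    llm = MockLLM()  # offline fallback for testing without an API key
```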

Next, a call to bind_tools attaches each Python function to the LLM instance by name. For example, the calculator function registers as "calculator", web_search as "web_search", and so forth. That linkage exposes each tool's schema to the model, so it can emit structured tool calls that the graph executes as Python functions rather than answering with plain text alone.
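Building on the tool functions sketched above, the binding step is a single call:

```python
# Collect the decorated tools and expose their schemas to the model so it
# can emit structured calls to "calculator", "web_search", and so on.
tools = [calculator, web_search, weather_info, text_analyzer, current_time]
llm_with_tools = llm.bind_tools(tools)
```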

The core decision logic lives in an agent_node function. It accepts the current AgentState, passes the accumulated messages to the tool-bound model, and receives a reply that is either a direct answer or an instruction to call a tool. The function appends that output back onto the message history to preserve context for future turns.
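In LangGraph terms the node is a small function over the shared state; a sketch, continuing from the definitions above:

```python
def agent_node(state: AgentState) -> dict:
    """Call the tool-bound model on the conversation so far and record its reply."""
    response = llm_with_tools.invoke(state["messages"])
    # With the add_messages reducer, returning only the new message appends it to the history.
    return {"messages": [response]}
```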

A companion function named should_continue examines the latest entry in AgentState.messages. It looks for a pending tool request: with a real model, a non-empty tool_calls attribute on the AI message; with the mock, the stubbed JSON "tool" field. If that marker exists, the workflow driver sends control to the ToolNode. If not, it signals the graph to end the turn, returning final text to the user.
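With a LangChain-style chat model, that check usually reduces to inspecting the tool_calls attribute; a sketch:

```python
from langgraph.graph import END


def should_continue(state: AgentState) -> str:
    """Route to the tool node if the last AI message requested a tool, else finish the turn."""
    last_message = state["messages"][-1]
    if getattr(last_message, "tool_calls", None):
        return "tools"
    return END
```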

Building the operational graph uses StateGraph primitives. It begins with a node tied to agent_node. A conditional edge routes to a ToolNode whenever should_continue reports a pending tool call. After executing the tool, the graph loops back to agent_node so the conversation can proceed. A MemorySaver checkpointer records each turn in memory, keyed by a thread_id, so sessions can resume within the same process; swapping in a persistent checkpointer extends that across restarts.

With all nodes and edges defined, calling graph.compile() produces an application object. That object can dispatch chat requests over HTTP in a web server or run in-process for local testing. The compiled graph respects the designed transitions, memory hooks and tool calls, delivering a stable conversational engine.
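Putting the wiring together, a sketch of graph construction and compilation (node names and the checkpointer choice are assumptions consistent with the description above):

```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph
from langgraph.prebuilt import ToolNode

workflow = StateGraph(AgentState)
workflow.add_node("agent", agent_node)
workflow.add_node("tools", ToolNode(tools))

workflow.set_entry_point("agent")
# Route to the tool node when a tool call is pending, otherwise end the turn.
workflow.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
# After a tool runs, hand the result back to the agent for the next decision.
workflow.add_edge("tools", "agent")

# The in-memory checkpointer keys saved conversation state by thread_id.
app = workflow.compile(checkpointer=MemorySaver())
```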

An automated test_agent function sends a suite of sample prompts to the compiled app. It tries expressions like 2 + 2, a web query such as “latest AI news,” a weather lookup for “Tokyo,” a time check and a text analysis request. For each input, it prints the prompt and the agent’s reply, making it easy to verify that tools are wired correctly and responses meet expectations.
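A compact version of that harness, using the compiled app from the previous step:

```python
from langchain_core.messages import HumanMessage


def test_agent(app) -> None:
    """Run a fixed battery of prompts through the compiled graph and print each reply."""
    prompts = [
        "Calculate 2 + 2",
        "Search for the latest AI news",
        "What's the weather in Tokyo?",
        "What time is it?",
        "Analyze this text: LangGraph makes agent workflows explicit.",
    ]
    for i, prompt in enumerate(prompts, start=1):
        config = {"configurable": {"thread_id": f"test-{i}"}}
        result = app.invoke({"messages": [HumanMessage(content=prompt)]}, config)
        print(f"\n>>> {prompt}\n{result['messages'][-1].content}")
```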

For interactive exploration, chat_with_agent spins up a console loop. At launch it prints basic guidance—typing help shows commands and quit exits. Each user line passes through the graph app, which returns a response that may include tool output. The cycle repeats until the user ends the session, offering a hands-on look at how the agent juggles mixed requests.
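The loop can be as simple as the following sketch; a single thread_id keeps the whole session in one conversation:

```python
def chat_with_agent(app) -> None:
    """Console loop: each line goes through the graph; 'help' lists commands, 'quit' exits."""
    print("Type 'help' for commands, 'quit' to exit.")
    config = {"configurable": {"thread_id": "interactive"}}
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in {"quit", "exit"}:
            break
        if user_input.lower() == "help":
            print("Ask about math, web search, weather, time or text analysis.")
            continue
        result = app.invoke({"messages": [HumanMessage(content=user_input)]}, config)
        print("Agent:", result["messages"][-1].content)
```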

In the main guard block (if __name__ == '__main__':), the script runs test_agent first, then calls chat_with_agent for live use. A quick demo sequence follows, running a few calls back to back—example calculations, a brief search and a timestamp query—so newcomers can see the agent’s behavior without diving into code. Print statements at launch explain how to set ANTHROPIC_API_KEY, run demos or switch to offline mode with MockLLM.
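The entry point then ties the pieces together:

```python
if __name__ == "__main__":
    print("Set ANTHROPIC_API_KEY to use Claude; otherwise the offline MockLLM is used.")
    test_agent(app)        # automated checks over the sample prompts
    chat_with_agent(app)   # interactive console session
```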

Throughout the notebook, inline comments and Markdown cells explain each segment of logic. Notes show how to swap the weather stub for a real HTTP endpoint or how to raise num_results limits in the search tool. The pattern for adding new utilities remains consistent: use @tool, write your function signature, call bind_tools, then update graph transitions.

To extend the agent with a custom utility, replicate the decorator pattern and function signature. A database_query tool, for instance, could accept SQL text and return formatted rows. Once decorated, added to the tool list and re-bound, the existing agent-to-ToolNode wiring lets the model invoke your service just like the built-in routines, as sketched below. That design enables rapid rollout of domain-specific features—analytics, translation or notifications—without changing core logic.
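As a hypothetical example, a SQLite-backed database_query tool might look like this (the app.db path is illustrative); after defining it, add it to the tool list, re-bind the model and re-run the StateGraph wiring so the ToolNode picks it up:

```python
import sqlite3

from langchain_core.tools import tool


@tool
def database_query(sql: str) -> str:
    """Run a read-only SQL query against a local SQLite file and return formatted rows."""
    try:
        with sqlite3.connect("app.db") as conn:  # hypothetical database file
            rows = conn.execute(sql).fetchall()
        if not rows:
            return "Query returned no rows."
        return "\n".join(" | ".join(str(col) for col in row) for row in rows)
    except Exception as exc:
        return f"Query failed: {exc}"


# Register the new tool, then rebuild and recompile the graph as shown earlier.
tools.append(database_query)
llm_with_tools = llm.bind_tools(tools)
```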

The MemorySaver checkpointer stores conversation snapshots under each thread_id in process memory; swapping in a persistent checkpointer (for example, one backed by SQLite or another database) lets a fresh run retrieve past messages so users see earlier exchanges, creating a seamless multi-session experience.

Error paths are covered, too. If a tool hits an exception, it returns a clear message such as “Invalid expression in calculator” or “Search query returned no results” instead of crashing the graph. The model sees that feedback in context and can offer guidance or ask for corrected input.

Model and tool calls run synchronously by default in the notebook. Developers can adapt the graph to execute tool functions in background threads or deploy them as microservices to reduce latency. The graph transitions and memory logic remain unchanged, allowing infrastructure tweaks without touching agent code.

Though the tutorial uses Claude 3 Haiku by default, any LLM with a compatible chat interface can be substituted. Swapping providers involves replacing the loader and updating the prompt templates. Tool definitions, bind_tools calls and StateGraph wiring all work the same way, giving flexibility for cost, speed or data-residency requirements.

The final section explains how to package the project as a Python module or container image. It suggests adding a setup.py or pyproject.toml, listing dependencies in requirements.txt and defining a console script entry point for easy installs. That setup lets teams publish the agent as a reusable package in private or public registries.

By following this blueprint, organizations can spin up multi-utility AI assistants for websites, chat platforms or internal dashboards in a matter of hours. The tutorial’s mix of safe evaluation, external API calls, persistent memory and interactive UX provides a solid foundation for advanced scenarios such as scheduled reports, event-driven alerts or automated data pipelines.
