# LangChain Middleware Concepts

LangChain middleware lets you insert custom logic at specific stages of Agent execution, so you can implement retry, fallback, caching, content filtering, logging, and similar behavior without modifying the Agent's own code.

* * *

## What is Middleware

Middleware is a set of **hooks** in the Agent execution flow. Each hook lets you run custom code at a specific point in time:

```python
# Intuitive understanding of Middleware:
# Assume the Agent's execution flow is like this:
# 1. User input → 2. Model thinking → 3. May call tools → 4. Model thinks again → 5. Output result

# Middleware lets you insert custom logic between these 5 stages:
# 1. User input
#    ↓ [before_agent hook: logging, permission check]
# 2. Model thinking
#    ↓ [before_model hook: message preprocessing]
#    ↓ [wrap_model_call hook: retry, fallback, caching]
#    ↓ [after_model hook: content moderation]
# 3. Tool execution
#    ↓ [wrap_tool_call hook: tool call retry]
# 4. Back to model thinking (loop until complete)
#    ↓ [after_agent hook: result formatting, statistical analysis]
# 5. Output result
```

* * *

## Six Hook Points

LangChain's Middleware provides 6 hooks, which differ in how often they run: two run once per Agent invocation, and the other four run repeatedly inside the Agent loop.

| Hook | Execution Frequency | Execution Position | Main Purpose |
| --- | --- | --- | --- |
| before_agent | Once | Before Agent starts | Initialization, permission check, input preprocessing |
| before_model | Each loop | Before model call | Message preprocessing, dynamic context injection |
| wrap_model_call | Each loop | Wraps model call | Retry, fallback, caching, request rewriting |
| after_model | Each loop | After model call | Content moderation, response filtering, logging |
| wrap_tool_call | Each tool call | Wraps tool execution | Tool retry, result caching, parameter rewriting |
| after_agent | Once | After Agent ends | Formatting output, statistics, resource cleanup |

* * *

## Two Usage Methods

Middleware can be written with decorators or through class inheritance:

### Method 1: Decorator (Recommended)

```python
from langchain.agents.middleware import before_model, after_model

# Decorator method: simple, intuitive
@before_model
def log_before(state, runtime):
    """Log before each model call"""
    msg_count = len(state.get("messages", []))
    print(f" Current message count: {msg_count}")
    return None  # None means "no state update"

@after_model
def log_after(state, runtime):
    """Log after each model call"""
    # The model's reply is the last message in the state
    last_msg = state["messages"][-1] if state.get("messages") else None
    if last_msg and hasattr(last_msg, 'tool_calls') and last_msg.tool_calls:
        print(" Model requested tool call")
    return None
```

### Method 2: Class Inheritance (Suitable for complex logic)

```python
from langchain.agents.middleware import AgentMiddleware

class LoggingMiddleware(AgentMiddleware):
    """Custom logging middleware"""

    @property
    def name(self) -> str:
        # Custom middleware name (default is class name)
        return "logging"

    def before_agent(self, state, runtime):
        """Logic before Agent starts"""
        print(" Agent started execution")
        return None

    def before_model(self, state, runtime):
        """Logic before model call"""
        msg_count = len(state.get("messages", []))
        print(f" Preparing to call model, currently {msg_count} messages")
        return None

    def after_model(self, state, runtime):
        """Logic after model call"""
        print(" Model call completed")
        return None

    def after_agent(self, state, runtime):
        """Logic after Agent ends"""
        print(" Agent execution ended")
        return None
```

* * *

## Complete Lifecycle Example

```python
from dotenv import load_dotenv
load_dotenv()

from langchain.agents import create_agent
from langchain.agents.middleware import (
    before_agent, after_agent,
    before_model, after_model,
)
from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage
from langchain.tools import tool

@before_agent
def start_log(state, runtime):
    """Before Agent starts"""
    print(">>> Agent started <<<")
    return None

@before_model
def pre_model(state, runtime):
    """Before each model call"""
    msg_count = len(state.get("messages", []))
    print(f"  -> Message {msg_count}")
    return None

@after_model
def post_model(state, runtime):
    """After each model call"""
    # The model's reply is the last message in the state
    last = state["messages"][-1] if state.get("messages") else None
    if hasattr(last, 'tool_calls') and last.tool_calls:
        tools = [tc['name'] for tc in last.tool_calls]
        print(f"  <- Requested tools: {tools}")
    else:
        content = str(last.content)[:50] if last and hasattr(last, 'content') else ""
        print(f"  <- Direct reply: {content}...")
    return None

@after_agent
def end_log(state, runtime):
    """After Agent ends"""
    total_msgs = len(state.get("messages", []))
    print(f"<<< Agent ended, total {total_msgs} messages <<<")
    return None

# Weather tool (the name get_weather is assumed here; only the signature
# fragment and body survive in the source)
@tool
def get_weather(city: str) -> str:
    """Query weather"""
    return f"{city}: Sunny, 25°C"

model = init_chat_model("deepseek:deepseek-v4-flash", temperature=0)
agent = create_agent(
    model=model,
    tools=[get_weather],
    middleware=[start_log, pre_model, post_model, end_log],
    system_prompt="You are an assistant.",
)

print("\n========== First question (requires tool) ==========")
result = agent.invoke({
    "messages": [HumanMessage(content="Hangzhou weather?")]
})
print(f"\n Final reply: {result['messages'][-1].content}")

print("\n========== Second question (no tool needed) ==========")
result = agent.invoke({
    "messages": [HumanMessage(content="Hello")]
})
print(f"\n Final reply: {result['messages'][-1].content}")
```

Execution result (the model's wording will vary between runs):

```
========== First question (requires tool) ==========
>>> Agent started <<<
  -> Message 1
  <- Requested tools: ['get_weather']
  -> Message 3
  <- Direct reply: Hangzhou today is sunny, temperature 25°C....
<<< Agent ended, total 4 messages <<<

 Final reply: Hangzhou today is sunny, temperature 25°C.

========== Second question (no tool needed) ==========
>>> Agent started <<<
  -> Message 1
  <- Direct reply: Hello! How can I help you?...
<<< Agent ended, total 2 messages <<<

 Final reply: Hello! How can I help you?
```

> A hook can return `None` to leave the state unchanged, or a dict of state updates. The returned dict is merged through the Agent state's reducer. For the `messages` field, the `add_messages` reducer is used, so returned messages are appended rather than overwritten.
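The merge behavior described in the note can be illustrated without LangChain at all. The sketch below is a simplified stand-in, not LangGraph's actual implementation: `append_messages` and `merge_update` are hypothetical helpers written for this example, and the real `add_messages` reducer does more (for instance, it can replace an existing message that has the same ID). It only shows the core idea that a field with a reducer is combined with the old value, while a field without one is overwritten.

```python
from typing import Any, Callable, Dict, List


def append_messages(existing: List[Any], new: List[Any]) -> List[Any]:
    """Hypothetical append-style reducer: concatenate instead of overwriting."""
    return list(existing) + list(new)


def merge_update(
    state: Dict[str, Any],
    update: Dict[str, Any],
    reducers: Dict[str, Callable[[Any, Any], Any]],
) -> Dict[str, Any]:
    """Merge a hook's returned dict into the state, field by field."""
    merged = dict(state)  # copy so the original state is left untouched
    for key, value in update.items():
        reducer = reducers.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)  # combine old and new
        else:
            merged[key] = value  # no reducer: plain overwrite
    return merged


state = {"messages": ["user: Hangzhou weather?"], "step": 1}
update = {"messages": ["note: keep the answer short"], "step": 2}

new_state = merge_update(state, update, {"messages": append_messages})
print(new_state["messages"])  # both messages, old one first
print(new_state["step"])      # overwritten by the update
```

In this sketch `step` has no reducer, so the update simply replaces it, whereas `messages` grows. That is why a hook that wants to add a message only needs to return the new message, not the whole history.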

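To make the firing order of the six hooks concrete, here is a minimal pure-Python simulation of the Agent loop. Everything in it is an assumption made for illustration: `TraceMiddleware`, `fake_model`, `fake_tool`, and `run_agent` are hypothetical names, the hook signatures are simplified, and messages are plain strings rather than LangChain message objects. It is a sketch of the control flow only, not how `create_agent` is implemented.

```python
class TraceMiddleware:
    """Hypothetical middleware that records the name of each hook as it fires."""

    def __init__(self) -> None:
        self.trace = []

    def before_agent(self, state: dict) -> None:
        self.trace.append("before_agent")

    def before_model(self, state: dict) -> None:
        self.trace.append("before_model")

    def wrap_model_call(self, state: dict, handler):
        self.trace.append("wrap_model_call")
        return handler(state)  # a real wrapper could retry or fall back here

    def after_model(self, state: dict) -> None:
        self.trace.append("after_model")

    def wrap_tool_call(self, name: str, handler):
        self.trace.append("wrap_tool_call")
        return handler(name)  # a real wrapper could cache or retry here

    def after_agent(self, state: dict) -> None:
        self.trace.append("after_agent")


def fake_model(state: dict) -> dict:
    """Toy model: request one tool call, then answer once a tool result exists."""
    if any(m.startswith("tool:") for m in state["messages"]):
        return {"content": "Sunny, 25°C", "tool_calls": []}
    return {"content": "", "tool_calls": ["weather"]}


def fake_tool(name: str) -> str:
    """Toy tool: always returns the same result string."""
    return f"tool:{name} result"


def run_agent(state: dict, mw: TraceMiddleware) -> dict:
    mw.before_agent(state)                    # once per run
    while True:
        mw.before_model(state)                # each loop iteration
        reply = mw.wrap_model_call(state, fake_model)
        state["messages"].append("ai:" + reply["content"])
        mw.after_model(state)
        if not reply["tool_calls"]:
            break                             # no tools requested: loop ends
        for name in reply["tool_calls"]:      # once per tool call
            state["messages"].append(mw.wrap_tool_call(name, fake_tool))
    mw.after_agent(state)                     # once per run
    return state


mw = TraceMiddleware()
final = run_agent({"messages": ["user:Hangzhou weather?"]}, mw)
print(mw.trace)
print(len(final["messages"]))
```

With one tool round trip, the per-loop hooks fire twice while `before_agent` and `after_agent` fire once each, which matches the frequencies listed in the hook table.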