# LangChain Middleware Concepts
Middleware is one of LangChain's most powerful features. It lets you insert custom logic at various stages of Agent execution to implement retry, fallback, caching, content filtering, logging, and similar functions, all without modifying the Agent's own code.
* * *
## What is Middleware
Middleware is a **Hook** in the Agent execution flow. Each hook lets you execute custom code at a specific point in time:
## Example
```python
# Intuitive understanding of Middleware:
# Assume the Agent's execution flow is like this:
# 1. User input → 2. Model thinking → 3. May call tools → 4. Model thinks again → 5. Output result

# Middleware lets you insert custom logic between these 5 stages:

# 1. User input
#    ↓ [before_agent hook: logging, permission check]
# 2. Model thinking
#    ↓ [before_model hook: message preprocessing]
#    ↓ [wrap_model_call hook: retry, fallback, caching]
#    ↓ [after_model hook: content moderation]
# 3. Tool execution
#    ↓ [wrap_tool_call hook: tool call retry]
# 4. Back to model thinking (loop until complete)
#    ↓ [after_agent hook: result formatting, statistical analysis]
# 5. Output result
```
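The ordering above can be made concrete with a toy simulation. This is only a sketch in plain Python, not LangChain's implementation: `run_toy_agent` and its scripted `model_outputs` list are hypothetical names invented for illustration. It records which hook would fire at each point of the loop.

```python
def run_toy_agent(model_outputs):
    """Simulate the agent loop and record the order in which hooks fire.

    model_outputs is a scripted list: "tool" means the model requests a
    tool call, anything else is treated as the final answer.
    """
    trace = ["before_agent"]              # once, before the loop starts
    for output in model_outputs:
        trace.append("before_model")      # every loop iteration
        trace.append("wrap_model_call")   # wraps the model call itself
        trace.append("after_model")       # every loop iteration
        if output != "tool":
            break                         # final answer: leave the loop
        trace.append("wrap_tool_call")    # once per tool call
    trace.append("after_agent")           # once, after the loop ends
    return trace


# One tool round-trip followed by a final answer
trace = run_toy_agent(["tool", "final answer"])
for step in trace:
    print(step)
```

In this scripted run the once-only hooks appear a single time each, while the model hooks appear twice because the model is called before and after the tool.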
* * *
## Six Hook Points
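LangChain's Middleware provides 6 hooks. By execution frequency they fall into two categories: hooks that run once per Agent invocation, and hooks that run repeatedly (on every loop iteration or every tool call):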
| Hook | Execution Frequency | Execution Position | Main Purpose |
| --- | --- | --- | --- |
| before_agent | Once | Before Agent starts | Initialization, permission check, input preprocessing |
| before_model | Each loop | Before model call | Message preprocessing, dynamic context injection |
| wrap_model_call | Each loop | Wraps model call | Retry, fallback, caching, request rewriting |
| after_model | Each loop | After model call | Content moderation, response filtering, logging |
| wrap_tool_call | Each tool call | Wraps tool execution | Tool retry, result caching, parameter rewriting |
| after_agent | Once | After Agent ends | Formatting output, statistics, resource cleanup |
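The `wrap_*` hooks differ from the `before_*`/`after_*` hooks in that they surround the call itself, which is what makes retry and fallback possible. The following is a minimal sketch in plain Python, assuming a wrap-style hook receives the request plus a `handler` callable that performs the real call. `retry_model_call`, `flaky_model`, and `max_attempts` are hypothetical names for illustration, not part of LangChain.

```python
def retry_model_call(request, handler, max_attempts=3):
    """Call handler(request), retrying when it raises RuntimeError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(request)
        except RuntimeError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error


calls = {"count": 0}


def flaky_model(request):
    """Stand-in for a model call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return f"reply to: {request}"


result = retry_model_call("Hangzhou weather?", flaky_model)
print(result, "| attempts:", calls["count"])
```

Because the wrapper owns the call to `handler`, it can decide how many times to invoke it, or whether to invoke it at all, which is also the basis for caching and fallback.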
* * *
## Two Usage Methods
Middleware can be used through class inheritance or decorators:
### Method 1: Decorator (Recommended)
## Example
```python
from langchain.agents.middleware import before_model, after_model

# Decorator method: simple, intuitive
@before_model
def log_before(state, runtime):
    """Log before each model call"""
    msg_count = len(state.get("messages", []))
    print(f"  Current message count: {msg_count}")
    return None

@after_model
def log_after(state, runtime):
    """Log after each model call"""
    # Inspect the most recent message (the model's response), if any
    last_msg = state["messages"][-1] if state.get("messages") else None
    if last_msg and hasattr(last_msg, 'tool_calls') and last_msg.tool_calls:
        print("  Model requested tool call")
    return None
```
### Method 2: Class Inheritance (Suitable for complex logic)
## Example
```python
from langchain.agents.middleware import AgentMiddleware

class LoggingMiddleware(AgentMiddleware):
    """Custom logging middleware"""

    @property
    def name(self) -> str:
        # Custom middleware name (default is class name)
        return "logging"

    def before_agent(self, state, runtime):
        """Logic before Agent starts"""
        print("  Agent started execution")
        return None

    def before_model(self, state, runtime):
        """Logic before model call"""
        msg_count = len(state.get("messages", []))
        print(f"  Preparing to call model, currently {msg_count} messages")
        return None

    def after_model(self, state, runtime):
        """Logic after model call"""
        print("  Model call completed")
        return None

    def after_agent(self, state, runtime):
        """Logic after Agent ends"""
        print("  Agent execution ended")
        return None
```
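One way to see why the two methods are interchangeable is to imagine the decorator as a shortcut that builds a middleware object around a single function. The sketch below is a toy model in plain Python, not LangChain's internals: `ToyMiddleware`, `toy_before_model`, and the single-argument hook signature are all hypothetical simplifications.

```python
class ToyMiddleware:
    """Hypothetical base class: every hook defaults to a no-op."""

    def before_model(self, state):
        return None

    def after_model(self, state):
        return None


def toy_before_model(func):
    """Hypothetical decorator: wrap a plain function in a middleware object."""

    class _FunctionMiddleware(ToyMiddleware):
        def before_model(self, state):
            return func(state)

    _FunctionMiddleware.__name__ = func.__name__
    return _FunctionMiddleware()


# Decorator style: one function, one hook
@toy_before_model
def count_messages(state):
    return {"seen": len(state["messages"])}


# Class style: the same hook written as a method
class CountMessages(ToyMiddleware):
    def before_model(self, state):
        return {"seen": len(state["messages"])}


state = {"messages": ["hi", "hello"]}
from_decorator = count_messages.before_model(state)
from_class = CountMessages().before_model(state)
print(from_decorator == from_class)
```

Under this model, the decorator suits a single stateless hook, while a class is the natural fit when several hooks need to share configuration or state.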
* * *
## Complete Lifecycle Example
## Example
```python
from dotenv import load_dotenv
load_dotenv()

from langchain.agents import create_agent
from langchain.agents.middleware import (
    before_agent, after_agent,
    before_model, after_model,
)
from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage
from langchain.tools import tool

@before_agent
def start_log(state, runtime):
    """Before Agent starts"""
    print(">>> Agent started <<<")
    return None

@before_model
def pre_model(state, runtime):
    """Before each model call"""
    msg_count = len(state.get("messages", []))
    print(f"  -> Message {msg_count}")
    return None

@after_model
def post_model(state, runtime):
    """After each model call"""
    # The last message is the model's response
    last = state["messages"][-1] if state.get("messages") else None
    if hasattr(last, 'tool_calls') and last.tool_calls:
        tools = [tc['name'] for tc in last.tool_calls]
        print(f"  <- Requested tools: {tools}")
    else:
        content = str(last.content)[:50] if last and hasattr(last, 'content') else ""
        print(f"  <- Direct reply: {content}...")
    return None

@after_agent
def end_log(state, runtime):
    """After Agent ends"""
    total_msgs = len(state.get("messages", []))
    print(f"<<< Agent ended, total {total_msgs} messages <<<")
    return None

# Simple demo tool; the name get_weather is an assumed placeholder
@tool
def get_weather(city: str) -> str:
    """Query weather"""
    return f"{city}: Sunny, 25°C"

model = init_chat_model("deepseek:deepseek-v4-flash", temperature=0)

agent = create_agent(
    model=model,
    tools=[get_weather],
    middleware=[start_log, pre_model, post_model, end_log],
    system_prompt="You are an assistant.",
)

print("\n========== First question (requires tool) ==========")
result = agent.invoke({
    "messages": [HumanMessage(content="Hangzhou weather?")]
})
print(f"\n  Final reply: {result['messages'][-1].content}")

print("\n========== Second question (no tool needed) ==========")
result = agent.invoke({
    "messages": [HumanMessage(content="Hello")]
})
print(f"\n  Final reply: {result['messages'][-1].content}")
```
Execution result:

```
========== First question (requires tool) ==========
>>> Agent started <<<
  -> Message 2
  <- Requested tools: ['get_weather']
  -> Message 3
  <- Direct reply: Hangzhou today is sunny, temperature 25°C....
<<< Agent ended, total 4 messages <<<

========== Second question (no tool needed) ==========
>>> Agent started <<<
  -> Message 1
  <- Direct reply: Hello! How can I help you?...
<<< Agent ended, total 2 messages <<<
```

A hook can return `None`, as in the examples above, or a dict of state updates. The returned dict is merged through the Agent state's reducer. For the `messages` field, the `add_messages` reducer is used, so returned messages are appended rather than overwritten.
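The difference between appending and overwriting can be sketched with a simplified reducer in plain Python. This is an illustration only: `append_messages` and `merge_update` are hypothetical stand-ins, and the real `add_messages` reducer does more than plain list concatenation (for example, it works with message objects rather than strings).

```python
def append_messages(existing, new):
    """Simplified stand-in for an append-style reducer."""
    return existing + new


def merge_update(state, update, reducers):
    """Merge a hook's returned dict into the state.

    Keys with a registered reducer are combined with the old value;
    all other keys are simply overwritten. None means "no change".
    """
    merged = dict(state)
    for key, value in (update or {}).items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["user: hi"], "step": 1}
update = {"messages": ["ai: hello"], "step": 2}

new_state = merge_update(state, update, {"messages": append_messages})
print(new_state)
```

Here `messages` grows to two entries because it has a reducer, while `step` is replaced because it does not. Passing `None` as the update leaves the state unchanged, which mirrors hooks that return `None`.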