Agent Context Engineering

`Agent Context Engineering` is a technical practice for systematically designing and optimizing the context information passed to AI Agents. Its goal is to enable the Agent to obtain the most effective information within a limited context window, thereby improving the accuracy, efficiency, and reliability of task execution. Unlike traditional programs with fixed execution logic, AI Agents' behavior depends entirely on the received context—including system prompts, tool descriptions, historical dialogue, and externally retrieved information. The core of context engineering lies in answering one question: **What does the Agent need to know most in this call?** !(#) > Context engineering is not a one-time prompt writing, but a continuous optimization process throughout the entire lifecycle of Agent system development, testing, and operations. ### Core Elements Agent context consists of the following layers: | Layer | Content | Characteristics | | --- | --- | --- | | System Prompt | Role definition, behavior rules, output format requirements | Carried in every call, relatively stable | | Tool Definitions | Available tools' names, parameters, and functionality descriptions | Occupies a large number of tokens, needs streamlining | | History | Current session's dialogue and tool call records | Continuously grows, requires truncation or summarization strategies | | Retrieved Context | Content retrieved from external knowledge base or code base | Injected on demand, requires relevance ranking | | User Input | User's current instruction or question | Uncontrollable, but can be optimized through clarification | * * * ## Core Practices ### Context Budget Management Treat the context window as a limited "budget" and allocate it reasonably among different types of context. Not all information is worth occupying context space; every piece of context should have a clear ROI. Context budget allocation suggestions (using 200K token window as example):System Prompt ██████████ 10% (20K tokens)Tool Definitions ████████████████ 20% (40K tokens)Retrieved Context ████████████████████ 25% (50K tokens)History ████████████████████████ 30% (60K tokens)User Input ████████ 5%~10% (10-20K tokens)Reserved Buffer ████ 5% (10K tokens)Actual ratios should be dynamically adjusted based on the Agent's specific task type. If the Agent primarily does code generation, the retrieved context ratio can be increased. If the Agent is a multi-turn dialogue assistant, the history budget should be prioritized. ### Engineering System Prompts System prompts are the cornerstone of Agent behavior and need to follow structured writing principles. **Writing Principles:** | Principle | Description | Example | | --- | --- | --- | | Layered Organization | Divide instructions into chunks by role, rules, process, format | Use Markdown headings for separation | | Positive Expression | Tell the Agent what to do, not what not to do | "Keep answers concise" is better than "Don't be verbose" | | Provide Examples | Use few-shot examples instead of long descriptions | Output format examples are more efficient than text descriptions | | Clear Priority | When rules conflict, which one the Agent should follow | "Safety rules take priority over efficiency requirements" | | Remove Redundancy | Delete rules that were never triggered and duplicate explanations | Regularly review and streamline system prompts | The following is an example skeleton of a structured system prompt: ## RoleYou are a Python code review assistant.## Core Rules- Provide fix suggestions for each issue- Sort output by severity- Mark uncertain items as "Pending Confirmation"## Workflow1. Analyze code changes2. Categorize issues by severity3. Output each issue with suggestions one by one## Output Format**Issue**: **Severity**: [High/Medium/Low]**Suggestion**: ### Optimizing Tool Descriptions Tool descriptions often occupy the largest proportion of context, but many tools are never actually called. Optimization strategies include: | Strategy | Method | Effect | | --- | --- | --- | | Tool Simplification | Remove tool definitions unrelated to current task | Reduce tool context by 30%~50% | | Description Refinement | Describe each tool's purpose and parameters in one sentence | Improve model's understanding accuracy of tools | | Parameter Constraints | Clarify parameter usage scenarios and limitations in description | Reduce parameter errors in tool calls | | Grouped Registration | Dynamically register different tool sets by task stage | Avoid "choice difficulty" caused by too many tools | > Tool definitions are a burden if not zero. Every retained tool requires trade-offs between description quality and token cost. ### History Compression Strategies In multi-turn dialogues, history grows rapidly and occupies context. Appropriate compression strategies need to be chosen to balance context completeness and window limitations. Comparison of common history compression strategies:Strategy Method Applicable Scenario─────────────────────────────────────────────────Sliding Window Keep last N complete rounds Short dialogues, real-time interactionTiered Summary Incremental summary per round Long conversations, customer service scenariosLayered Summary Keep recent rounds complete, summarize distant ones Document generation, complex tasksKey Turn Marking Mark important turns, discard others Debugging, step-by-step execution tasksLayered summarization is the most commonly used strategy: keep the last 3-5 rounds complete, and replace earlier dialogue with structured summaries. Structured summaries should contain key information from the original dialogue: what was the user's goal, what did the Agent do, what output was produced, what errors were encountered. ### Quality Control of Retrieved Context When an Agent relies on external knowledge bases, the quality of retrieved context directly determines output quality. Key points for improving retrieved context quality: | Stage | Common Issues | Improvement Methods | | --- | --- | --- | | Query Rewriting | User input is not precise enough | Let Agent rewrite query before retrieval | | Relevance Filtering | Retrieved results contain irrelevant content | Set relevance threshold, proactively prompt when insufficient | | Source Attribution | Agent cannot judge information credibility | Attach source path and update time | | Length Trimming | Retrieved results occupy too much context | Truncate by paragraphs, keep most relevant fragments | * * * ## Common Patterns ### Progressive Disclosure Don't stuff all information into context at once; disclose it gradually based on execution stages. Initial context only contains system prompts and necessary tools for the current stage. When the Agent completes the current stage, inject the next stage's tools and rules. This pattern keeps the context "density" within a reasonable range, avoiding the Agent being disturbed by irrelevant information. ### Context Compression Chain When a task requires processing large amounts of data, use chain calls to split into multiple steps, keeping only key intermediate results at each step. Step 1: Read all log files → Output { Error count, time range, error type list }Step 2: Based on Step 1 output, analyze Top 3 errors → Output { Root cause analysis, fix suggestions }Step 3: Based on Step 2 output, generate fix PR → Output { PR title, description, code changes }Each chain's input is a refined summary from the previous step, not raw data, keeping context at a manageable size. ### Context Watermarks Place "watermark" markers at key positions in context to help monitor and diagnose Agent behavior. Example of context watermark:Current context strategy version

YouTip

Agent Context Engineering

📂 Categories