How Skills Work

You might wonder: since an agent is not human, how does it know what a Skill can do and how to use it? **The answer is simple: by reading files.** Behind every Skill there is a description file, usually called SKILL.md, which serves as its instruction manual.

### How Skills Work

Skills use Progressive Disclosure: the agent loads information at different levels of detail at different stages. This design lets the agent manage a large number of skills at once without exhausting its context space.

### Three Stages

The agent loads skills in the following three stages:

* **1. Discovery Stage:** When a conversation starts, the agent scans all available skill folders, reading only the name and description of each skill. This is the lightest information, just enough for the agent to determine whether a skill might be relevant to the current task.
* **2. Activation Stage:** When the agent determines that a skill's description matches the user's request, it loads the complete SKILL.md file into the context. At this point, the agent reads the full instructions and explanations.
* **3. Execution Stage:** The agent executes the task according to the instructions in SKILL.md. During this process, the agent may invoke scripts included with the skill, read reference materials, or use other resources.

### Advantages of Progressive Loading

| Stage | Loaded Content | Typical Size |
| --- | --- | --- |
| Discovery Stage | name + description | About 100 tokens |
| Activation Stage | Complete SKILL.md | No more than 5,000 tokens recommended |
| Execution Stage | Scripts, reference materials, etc. | Loaded on demand |

Before executing a task, the agent reads this instruction manual to figure out three things:

* What can this Skill do? (Capability description)
* What do I need to provide to use it? (Input parameters)
* What are the rules for doing this? (Environmental constraints and precautions)

After reading, the agent can conclude: "This Skill suits the current problem, so I will invoke it."

The flow below illustrates this process:

### Skill Execution Flow

* Starting from the user's instruction, the system first performs Skill intent recognition to decide whether to enter the controlled execution path.
* Once a Skill is matched, the system loads SKILL.md, establishes tool permissions and behavioral boundaries, and then reasons over the instructions together with the conversation context.
* External tools are invoked only when truly necessary, and only those the Skill allows; otherwise the agent completes the logic within the rules.
* The final result is output after the Skill's constraints have been applied, and the user's next input triggers a new complete cycle.

* * *

## Skills Workflow

After the user makes a request, the Agent first understands the task and then analyzes which capabilities (Skills) are needed to complete it. A Skill can be understood as an independent capability module or plugin, such as weather queries, search, file processing, or database access. The Agent is responsible for thinking and decision-making, while the Skill is responsible for executing actions. After execution, the result is returned to the Agent, which organizes it into a user-friendly reply and outputs it. The entire process forms a closed loop: Understanding → Decision → Execution → Return Result.

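
The decision step in this loop is where progressive loading matters: the agent chooses among skills using only their lightweight summaries, and reads a full SKILL.md only for the one it selects. The following is a minimal sketch of that idea, not any particular agent's implementation. It assumes each skill lives in its own folder and that its SKILL.md starts with `name:` and `description:` lines; the helper names (`read_summary`, `discover`, `match`, `activate`) are hypothetical, and the keyword overlap in `match` is only a stand-in for the model's own relevance judgment.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SkillSummary:
    name: str
    description: str
    path: Path  # location of the skill's SKILL.md


def read_summary(skill_md: Path) -> SkillSummary:
    """Discovery: read only the name and description (assumed header layout)."""
    fields = {}
    with skill_md.open(encoding="utf-8") as fh:
        for line in fh:
            key, sep, value = line.partition(":")
            if sep and key.strip() in ("name", "description"):
                fields[key.strip()] = value.strip()
            if len(fields) == 2:
                break  # stop early; the rest of the file is not needed yet
    return SkillSummary(
        name=fields.get("name", skill_md.parent.name),
        description=fields.get("description", ""),
        path=skill_md,
    )


def discover(skills_root: Path) -> list[SkillSummary]:
    """Scan every skill folder and collect the lightweight summaries."""
    return [read_summary(p) for p in sorted(skills_root.glob("*/SKILL.md"))]


def match(request: str, summaries: list[SkillSummary]) -> Optional[SkillSummary]:
    """Toy relevance check: pick the summary sharing the most words with the request."""
    words = set(request.lower().split())
    best, best_score = None, 0
    for summary in summaries:
        score = len(words & set(summary.description.lower().split()))
        if score > best_score:
            best, best_score = summary, score
    return best


def activate(summary: SkillSummary) -> str:
    """Activation: load the complete SKILL.md content for the chosen skill."""
    return summary.path.read_text(encoding="utf-8")
```

With this layout, `discover` touches only the first lines of each file, so adding more skills increases the discovery cost by a small summary per skill, while `activate` is called once for the skill that actually matches.
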

| Step | Stage | Agent Internal Action | Example |
| --- | --- | --- | --- |
| ① | User Initiates Request | Receive natural language task | "Help me check today's weather in Beijing" |
| ② | Understand User Intent | Analyze what the user really wants to do, and extract parameters | Action: Query; Type: Weather; Location: Beijing |
| ③ | Formulate Execution Plan | Break down task steps | Find weather service → Get weather data → Organize results |
| ④ | Select Skill | Match the most suitable capability module | Select Weather Skill |
| ⑤ | Execute Skill | Call external tools or APIs | Request weather API |
| ⑥ | Get Execution Result | Receive data returned by Skill | Temperature, humidity, air quality |
| ⑦ | Generate Final Reply | Organize and convert to natural language | "Beijing is sunny today, 18℃~30℃" |
| ⑧ | Return to User | Output final result | User sees the complete reply |

Skill Types:

| Skill Type | Function Description | Common Scenarios |
| --- | --- | --- |
| Search Skill | Query external information | Search news, knowledge Q&A |
| Weather Skill | Get weather data | Weather queries |
| File Skill | Read file contents | PDF, Word, Excel |
| Data Analysis Skill | Data calculation and statistics | Report analysis |
| Database Skill | Query database | MySQL, PostgreSQL |
| Browser Skill | Automate web operations | Auto-login, web scraping |
| Code Skill | Execute code | Python, Node.js |
| Image Skill | Image processing capabilities | OCR, image generation |
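
The eight steps in the first table can be traced in a few lines of code. The sketch below is illustrative only: `weather_skill` is a hypothetical stub that returns hard-coded data instead of calling a real weather API, `understand` uses a toy keyword rule where a real agent would rely on the model, and the `SKILLS` registry is an assumed structure, not a standard interface.

```python
from typing import Callable, Optional


def weather_skill(params: dict) -> dict:
    """Hypothetical stub for steps 5 and 6; a real skill would request a weather API."""
    return {"location": params["location"], "condition": "sunny", "low_c": 18, "high_c": 30}


# Assumed registry mapping an intent type to the skill that handles it.
SKILLS: dict[str, Callable[[dict], dict]] = {"weather": weather_skill}


def understand(request: str) -> dict:
    """Step 2: extract the intent type and parameters (toy rule, for illustration)."""
    intent: dict[str, Optional[str]] = {"type": None, "location": None}
    if "weather" in request.lower():
        intent["type"] = "weather"
    for city in ("Beijing", "Shanghai"):  # tiny illustrative list of known cities
        if city in request:
            intent["location"] = city
    return intent


def handle(request: str) -> str:
    """Run one pass of Understanding -> Decision -> Execution -> Return Result."""
    intent = understand(request)                 # step 2: understand intent
    skill = SKILLS.get(intent["type"] or "")     # step 4: select a skill
    if skill is None:
        return "Sorry, no skill matches this request."
    if intent["location"] is None:
        return "Which city would you like the weather for?"
    result = skill(intent)                       # steps 5-6: execute, get result
    # Step 7: convert the structured result into natural language.
    return (f'{result["location"]} is {result["condition"]} today, '
            f'{result["low_c"]}℃~{result["high_c"]}℃')


if __name__ == "__main__":
    print(handle("Help me check today's weather in Beijing"))  # step 8: return to user
```

Keeping the agent logic (`understand`, `handle`) separate from the skill function mirrors the division of labor described above: supporting another entry from the Skill Types table would mean registering one more callable in `SKILLS` rather than changing the loop.
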
