How Skills Work
You might wonder: since an agent is not human, how does it know what a Skill can do and how to use it?
**The answer is simple: by reading files.**
Behind every Skill is a description file, usually called SKILL.md, that serves as its instruction manual.
### How Skills Work
Skills use progressive disclosure: the agent loads information at increasing levels of detail as a task moves through its stages. This design lets the agent keep a large number of skills available at once without exhausting its context window.
### Three Stages
The agent loads skills in the following three stages:
* **1. Discovery Stage:** When a conversation starts, the agent scans all available skill folders, reading only the name and description of each skill. This is the lightest information, enough for the agent to determine whether a skill might be relevant to the current task.
* **2. Activation Stage:** When the agent determines that a skill's description matches the user's request, it loads the complete SKILL.md file content into the context. At this point, the agent reads the full instructions and explanations.
* **3. Execution Stage:** The agent executes the task according to the instructions in SKILL.md. During this process, the agent may invoke scripts included with the skill, read reference materials, or use other resources.
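The first two stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes each skill lives in its own folder and that SKILL.md begins with a small `key: value` header between two `---` lines holding `name` and `description`. The names `SkillSummary`, `read_header`, `discover`, and `activate` are hypothetical, not part of any particular agent framework.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillSummary:
    name: str
    description: str
    path: Path  # location of the skill's SKILL.md


def read_header(skill_md: Path) -> dict[str, str]:
    """Read only the 'key: value' lines between the leading '---' markers."""
    header: dict[str, str] = {}
    with skill_md.open(encoding="utf-8") as f:
        if f.readline().strip() != "---":
            return header  # the assumed header block is not present
        for line in f:
            if line.strip() == "---":
                break  # stop before the body: full instructions stay unloaded
            key, sep, value = line.partition(":")
            if sep:
                header[key.strip()] = value.strip()
    return header


def discover(skills_root: Path) -> list[SkillSummary]:
    """Discovery stage: scan skill folders, keeping only name and description."""
    summaries = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        header = read_header(skill_md)
        summaries.append(SkillSummary(
            name=header.get("name", skill_md.parent.name),
            description=header.get("description", ""),
            path=skill_md,
        ))
    return summaries


def activate(skill: SkillSummary) -> str:
    """Activation stage: load the complete SKILL.md into the context."""
    return skill.path.read_text(encoding="utf-8")
```

In this sketch, `discover` runs once when the conversation starts, and `activate` is called only for a skill whose description matches the request, so unmatched skills cost just their summary.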
### Advantages of Progressive Loading
| Stage | Loaded Content | Typical Size |
| --- | --- | --- |
| Discovery Stage | name + description | About 100 tokens |
| Activation Stage | Complete SKILL.md | Recommended no more than 5000 tokens |
| Execution Stage | Scripts, reference materials, etc. | Loaded on demand |
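To see why this matters, here is a rough back-of-the-envelope comparison using the figures from the table. It treats 100 and 5000 tokens as per-skill estimates and ignores execution-stage resources, which are loaded on demand; the function names are illustrative only.

```python
DISCOVERY_TOKENS = 100    # "about 100 tokens" per skill, from the table
ACTIVATION_TOKENS = 5000  # recommended upper bound for one SKILL.md, from the table


def eager_cost(num_skills: int) -> int:
    """Context used if every SKILL.md were loaded in full up front."""
    return num_skills * ACTIVATION_TOKENS


def progressive_cost(num_skills: int, num_activated: int) -> int:
    """Context used when only the matched skills are loaded in full."""
    return num_skills * DISCOVERY_TOKENS + num_activated * ACTIVATION_TOKENS


print(eager_cost(50))           # 250000
print(progressive_cost(50, 1))  # 10000
```

With 50 installed skills and one activated, the progressive approach uses roughly 10,000 tokens instead of up to 250,000 under these assumptions.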
Before executing a task, the agent reads this instruction manual to figure out three things:
* What can this Skill do? (Capability description)
* What do I need to provide to use it? (Input parameters)
* What are the rules for doing this? (Environmental constraints and precautions)
After reading, the agent can conclude: "This Skill is suitable for solving the current problem, so I will invoke it."
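One way to picture what the agent takes away from the manual is a small record with one field per question. The sketch below is hypothetical: the `SkillManual` class, its field names, and the sample weather entry are invented for illustration and do not reflect a required SKILL.md layout.

```python
from dataclasses import dataclass, field


@dataclass
class SkillManual:
    capability: str             # what the Skill can do
    required_inputs: list[str]  # what the caller must provide
    constraints: list[str] = field(default_factory=list)  # rules and precautions


def missing_inputs(manual: SkillManual, provided: dict[str, str]) -> list[str]:
    """Return the required inputs the agent has not collected yet."""
    return [name for name in manual.required_inputs if name not in provided]


# Hypothetical manual for a weather skill.
weather = SkillManual(
    capability="Look up the current weather for a city",
    required_inputs=["location"],
    constraints=["Only call the weather tool; do not guess values"],
)

print(missing_inputs(weather, {}))                       # ['location']
print(missing_inputs(weather, {"location": "Beijing"}))  # []
```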
The image below illustrates this process:
### Skill Execution Flow
* Starting from the user's instruction, the system first performs Skill intent recognition to decide whether to enter the controlled execution path.
* Once a Skill is matched, the system loads SKILL.md, establishes tool permissions and behavioral boundaries, and then reasons over the task together with the context.
* The agent invokes allowed external tools only when truly necessary; otherwise, it completes the logic within the rules.
* The final result is output after the constraints have been applied, and the user's next input triggers a new complete cycle.
[Figure: Skill execution flow]
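The "tool permissions" step in the flow above can be illustrated with a simple allow-list check. This is a minimal sketch, not how any specific agent runtime enforces permissions: `SkillSession`, `ToolNotAllowed`, and the two toy tools are hypothetical names introduced here.

```python
from typing import Callable


class ToolNotAllowed(Exception):
    """Raised when a Skill tries to call a tool outside its allowed set."""


class SkillSession:
    def __init__(self, allowed_tools: set[str],
                 tools: dict[str, Callable[[str], str]]):
        self.allowed_tools = allowed_tools  # boundary established at activation
        self.tools = tools                  # every tool the system knows about

    def call_tool(self, name: str, argument: str) -> str:
        # Refuse anything the active Skill was not granted.
        if name not in self.allowed_tools:
            raise ToolNotAllowed(name)
        return self.tools[name](argument)


# Two toy tools; only "echo" is permitted for this session.
tools = {
    "echo": lambda text: text,
    "shout": lambda text: text.upper(),
}
session = SkillSession(allowed_tools={"echo"}, tools=tools)

print(session.call_tool("echo", "hello"))  # hello
try:
    session.call_tool("shout", "hello")
except ToolNotAllowed as exc:
    print(f"blocked: {exc}")  # blocked: shout
```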
* * *
## Skills Workflow
After the user makes a request, the Agent first understands the task content, and then analyzes which capabilities (Skills) should be used to complete the work.
A Skill can be understood as an independent capability module or plugin, such as weather queries, search, file processing, database access, etc.
The Agent is responsible for thinking and decision-making, while the Skill is responsible for executing actions. After execution, the Skill returns its result to the Agent, which turns it into a user-friendly reply.
The entire process forms a complete closed loop: Understanding → Decision → Execution → Return Result.
| Step | Stage | Agent Internal Action | Example |
| --- | --- | --- | --- |
| ① | User Initiates Request | Receive natural language task | "Help me check today's weather in Beijing" |
| ② | Understand User Intent | Analyze what the user really wants to do, and extract parameters | Action: Query; Type: Weather; Location: Beijing |
| ③ | Formulate Execution Plan | Break down task steps | Find weather service → Get weather data → Organize results |
| ④ | Select Skill | Match the most suitable capability module | Select Weather Skill |
| ⑤ | Execute Skill | Call external tools or APIs | Request weather API |
| ⑥ | Get Execution Result | Receive data returned by Skill | Temperature, humidity, air quality |
| ⑦ | Generate Final Reply | Organize and convert to natural language | "Beijing is sunny today, 18℃~30℃" |
| ⑧ | Return to User | Output final result | User sees the complete reply |
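The steps in the table can be traced with a toy version of the loop. Everything here is a stand-in: `weather_skill` returns fixed sample data instead of calling a real weather API, and `understand` uses naive keyword matching where a real agent would rely on the model's own reasoning. All function names are hypothetical.

```python
from typing import Callable


def weather_skill(params: dict[str, str]) -> dict[str, str]:
    """Hypothetical stand-in for a weather API call; returns fixed sample data."""
    return {"location": params["location"], "condition": "sunny",
            "low": "18", "high": "30"}


# Available capability modules, keyed by the intent type they serve.
SKILLS: dict[str, Callable[[dict[str, str]], dict[str, str]]] = {
    "weather": weather_skill,
}


def understand(request: str) -> dict[str, str]:
    """Toy intent analysis: keyword match plus the word following 'in'."""
    intent = {"type": "weather" if "weather" in request.lower() else "unknown"}
    words = request.rstrip("?.!").split()
    if "in" in words[:-1]:
        intent["location"] = words[words.index("in") + 1]
    return intent


def handle(request: str) -> str:
    intent = understand(request)        # understand the request
    skill = SKILLS.get(intent["type"])  # decide which Skill fits
    if skill is None or "location" not in intent:
        return "Sorry, I cannot handle that request."
    data = skill(intent)                # execute the Skill
    return (f"{data['location']} is {data['condition']} today, "
            f"{data['low']} to {data['high']} degrees Celsius")  # reply


print(handle("Help me check today's weather in Beijing"))
# Beijing is sunny today, 18 to 30 degrees Celsius
```

Keeping the Skill as a plain function behind a lookup table mirrors the division of labor described earlier: the agent side decides, the Skill side acts and hands data back.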
Skill Types:
| Skill Type | Function Description | Common Scenarios |
| --- | --- | --- |
| Search Skill | Query external information | Search news, knowledge Q&A |
| Weather Skill | Get weather data | Weather queries |
| File Skill | Read file contents | PDF, Word, Excel |
| Data Analysis Skill | Data calculation and statistics | Report analysis |
| Database Skill | Query database | MySQL, PostgreSQL |
| Browser Skill | Automate web operations | Auto-login, web scraping |
| Code Skill | Execute code | Python, Node.js |
| Image Skill | Image processing capabilities | OCR, image generation |