What exactly is a Skill?
If I were to explain it in the simplest terms, I would say this:
It’s like taking the expert knowledge from a seasoned employee’s head and organizing it into a reusable, triggerable, and actionable manual. That thing is a Skill.
The official OpenAI documentation defines it very directly: A Skill is a package of capabilities designed for specific tasks. It can contain instructions, reference materials, and optional scripts. The goal isn’t to make the model “smarter,” but rather to ensure that when performing a certain type of task, it outputs results consistently according to a fixed workflow.
What is it most like?
- It’s not like a regular prompt, because it’s not something you just say once and are done with.
- It’s not like an `MCP`, because it doesn’t handle connecting tools and data sources.
- And it’s not like the general rules in `AGENTS.md`, because it’s not a universal rule for the entire repository.

It is more like a specialized Standard Operating Procedure (SOP), or a trade manual. For example:

- When handling GitHub PR comments, first identify which comments need attention, then ask the user which ones to handle, then modify the code, and finally provide feedback.
- When debugging a CI failure, first pull the GitHub Actions logs, then extract the failing segments, then propose a fix plan, and only execute after approval.
- When writing a blog post, first generate titles according to a fixed style, then supplement with facts, then add frontmatter, and finally publish it.
The common thread among these tasks is not that “the model doesn’t know how to answer,” but rather that “the model’s approach varies every time, making it easy for it to drift off course.” This is where the value of a `Skill` comes into play.
The Difference Between Skill and MCP
This issue really needs to be viewed within a real workflow; otherwise it’s easy to stay too theoretical.
MCP is like the interface layer.
The official definition of MCP is an open standard for connecting AI applications to external systems. Files, local databases, search engines, design mockups, third-party services—all of these can be connected via MCP. Therefore, it solves the problem of “connecting things.”
Skill is like the process layer.
The OpenAI Agent Skills documentation makes it very clear that a Skill is the writing format for reusable workflows. A Skill must have at least a SKILL.md, and can also include scripts/, references/, and assets/. Codex first reads only its name and description; the full instructions are loaded into the context only when Codex determines it needs the skill. This is what the official documentation calls progressive disclosure.
So, the division of labor between the two is very clear:
- MCP: connects the capabilities (the “what”).
- Skill: defines the order of operations (the “how”).
To give a very intuitive example, something like converting Figma to code. If the agent cannot read the design mockup at all, then what you need first is an MCP. But if it can already read the design mockup, but just writes things randomly—writing components today, cutting pages tomorrow, and forgetting visual checks the day after—then what you need to improve is the Skill.
How to Develop Skills
This is actually not as heavy as you might think.
OpenAI officially recommends starting with the built-in $skill-creator, which will help you build the basic structure, such as trigger conditions, scope, and whether a script is needed. By default, it prioritizes instruction-only, meaning don’t rush to write scripts; first, make sure your instructions are clear.
If writing manually, the minimum structure is very simple:
```
my-skill/
├── SKILL.md
├── scripts/
├── references/
└── assets/
```
Of these, only SKILL.md is truly essential. Furthermore, this file must have at least two pieces of metadata:
```yaml
---
name: skill-name
description: Explain exactly when this skill should and should not trigger.
---
```
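Below the frontmatter, the rest of the file is plain Markdown instructions. As a rough sketch (the step wording here is illustrative, not from any official skill), a minimal body might look like this:

```markdown
# Skill Name

## Workflow

1. Gather the inputs the task needs; stop and ask the user if anything is missing.
2. Work through the steps in a fixed order, one at a time.
3. Output in the agreed format, and wait for explicit approval before any irreversible action.
```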
I think the most crucial steps when developing a Skill are the following.
1. Look for “Recurring Deviations”
Not every task is worth turning into a Skill.
If you only ask the agent to do something occasionally, a regular prompt is enough. The situations that are truly suitable for creating a Skill are usually like this:
- You have repeated the same thing many times.
- The agent has the capability, but the execution order always changes.
- It makes mistakes in similar places every time.

Simply put, it’s not an “ability gap,” but rather an “inconsistent process.”
2. Write the description as a trigger condition, not as marketing copy
This step is crucial.
The official documentation specifically emphasizes that whether Codex will implicitly invoke a Skill depends heavily on the description. So don’t write things like “This is a very useful skill”; instead, spell out when it should be used and when it should not.
For example, the description for the official gh-fix-ci skill is very clear: Use this when the user asks you to debug or fix failed GitHub PR checks; the focus is on checking logs, summarizing the failure reasons, providing a remediation plan, and only implementing it after receiving explicit approval. Just by reading it, you know its boundaries.
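To make the contrast concrete, here are two made-up descriptions for the same hypothetical skill; only the second style gives Codex a real trigger condition to match against:

```yaml
# Too vague: reads like marketing copy, gives Codex nothing to match.
description: A very useful skill for working with CI.

# Trigger-oriented: says when to fire and when not to.
description: >-
  Use when the user asks to debug or fix failing GitHub PR checks.
  Do not use for local test failures that never reached CI.
```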
3. If it can be solved with instructions, don’t write a script first
OpenAI’s documentation is also very practical: unless you explicitly need deterministic behavior or external tools, prioritize using instructions rather than scripting everything right away.
Why? Because the more scripts you have, the higher the maintenance cost becomes.
Many Skills initially only require clearly explaining the steps:
- What to do first
- What to do next
- What the output format should be
- In which situations to stop and ask the user
This can solve most of the problems.
Only when a step is particularly stable, mechanical, or highly suitable for automation should you move it down into `scripts/`.
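When you do promote a step, the instructions simply point at the script. A hypothetical snippet (the script name and arguments are invented for illustration):

```markdown
## Steps

1. Summarize the failing checks in the agreed format.
2. Run `scripts/collect_logs.sh <run-id>` to fetch and trim the logs deterministically.
3. Propose a fix plan and wait for approval.
```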
4. Extract Data, Templates, and Resources
As you write the Skill, it’s easy for it to become one long block of descriptive text. This is when you need to break things out.
- `SKILL.md` is responsible for explaining rules and sequence.
- `references/` holds background documents and reference materials.
- `assets/` stores templates, icons, and examples.
- `scripts/` contains actions that can be executed reliably.

Doing this has two benefits. First, the main file won’t become increasingly bloated. Second, the model only loads details when necessary, which aligns with the concept of progressive disclosure.
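In practice, the main file then just points outward, so details load only on demand. A hypothetical snippet (the file names are invented):

```markdown
Use the post template in `assets/post-template.md`.
For terminology and tone, consult `references/style-guide.md` before writing any prose.
```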
5. Local Use First, Cross-Project Distribution Later
The official documentation explains this very clearly.
If you are only using it for your current repository, placing it in .agents/skills/ is sufficient. Codex scans the skills directory from locations such as the repository, user, and system.
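For a repo-scoped skill, that just means the skill folder lives under this path, for example:

```
.agents/skills/
└── my-skill/
    ├── SKILL.md
    └── scripts/
```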
However, if you find that this functionality is useful across more than one repository, or you want to package and distribute multiple skills together, then don’t stop at the skill-folder level; consider using a plugin. The official OpenAI documentation is also very clear on this: a Skill is the workflow itself, while a plugin is the unit better suited to installation and distribution.
What Scenarios are Suitable for Skills
Having more Skills is not always better; they are best suited for the following types of tasks.
High-Frequency Repetitive Tasks
Tasks you do every week, and the sequence is pretty much the same each time. For example:
- Handling PR review comments
- Writing a blog post and adding frontmatter
- Debugging CI failures
- Performing pre-release checks

What’s most daunting about these kinds of tasks isn’t that the model can’t do them, but having to explain them all over again every single time.
Tasks with Fixed Procedures
Some tasks naturally have a sequence.
For example, when troubleshooting an issue, you should first check the logs, then narrow the scope, then propose a plan, and finally make changes. This is where a Skill is particularly suitable, because it can enforce the order and reduce reliance on the model’s ad-hoc performance.
Tasks Requiring Domain Context Binding
Some tasks do not have fixed steps and come with strong constraints. For example:
- Must only check official documentation
- Must output in a specific review format
- Must retain existing terminology within the team
- Must comply with the writing or development standards of a certain repository

If you rely on prompts alone every time, it’s easy to miss these details.
Tasks Requiring Tools and Processes Together
This scenario is particularly typical.
It’s not just about “connecting to GitHub” and being done; after connecting, you still need to examine the logs in a certain way, deduce the problems, decide whether or not to modify anything, and finally decide how to provide feedback.
This is often when MCP + Skill appear jointly.
When Not to Use a Skill
It’s also important to be clear about this, otherwise it’s easy to want to make everything into a Skill.
One-off Casual Tasks
When a user asks a quick question, a standard prompt is often sufficient.
You just want to “connect an external system”
In that case, prioritize MCP, not Skill.
You want to constrain the long-term behavior of the entire repository
This is more like the job of AGENTS.md, not a Skill.
So, to summarize simply:
- Missing connections: use `MCP`.
- Missing processes/workflows: use a `Skill`.
- Missing global rules: use `AGENTS.md`.
Several Representative Examples
It’s better to look at a few real-world examples than to hear too many concepts.
1. roll-dice: The Minimal Viable Entry Case
This example comes from the official OpenAI Agent Skills documentation.
It is very small; the directory contains little more than a SKILL.md, which lets the agent call PowerShell’s random number command when the user asks to roll dice.
Why is this example good?
Because it directly exposes the most core skeleton of a Skill:
- It has clear triggering conditions
- It has a clear execution method
- It has clear boundaries
It illustrates one thing: a `Skill` doesn’t have to be large. As long as something happens repeatedly and you don’t want the model to improvise wildly, you can create it as a `Skill`.
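To make that concrete, here is a sketch of what such a SKILL.md could look like; the exact wording of the official example differs:

```markdown
---
name: roll-dice
description: Use when the user asks to roll one or more dice. Do not use for other randomness requests.
---

When asked to roll dice, run PowerShell's Get-Random once per die. For a
six-sided die: `Get-Random -Minimum 1 -Maximum 7` (Maximum is exclusive).
Report each roll's result to the user.
```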
2. gh-address-comments: A Workflow Example for Handling GitHub Comments
This example comes from the official OpenAI openai/skills repository.
Its goal is not to “connect to GitHub,” but rather to encapsulate the process of “handling comments on the current branch’s PR.” The steps in the official version are very typical:
- First, confirm that `gh` is already authenticated.
- Then, fetch the comments and review threads for the current PR.
- Number and summarize these comments.
- Allow the user to explicitly select which ones need processing.
- Only then start the actual work.
This example particularly illustrates the value of a `Skill`. For many engineering tasks, the difficulty isn’t whether “the model knows what GitHub is,” but rather “whether it will process things in the correct order.” `gh-address-comments` solves exactly this kind of sequencing problem.
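The fetching step, for instance, boils down to a couple of `gh` invocations that the skill’s instructions can spell out. A rough sketch (the official skill’s exact commands may differ):

```markdown
1. Check authentication with `gh auth status`.
2. Fetch the current PR's discussion, e.g. `gh pr view --comments`.
3. Present the comments as a numbered list and ask the user which ones to address.
```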
3. gh-fix-ci: An Engineering Example for Troubleshooting CI Failures
This is also the official skill in the openai/skills repository.
It addresses another very typical engineering task: a PR check has failed; should it be fixed, and how?
The workflow defined in this Skill is also very representative:
- First, confirm the `gh` login status.
- Find the current PR.
- Pull the failed checks and logs from GitHub Actions.
- Extract the failure snippets.
- Propose a fix plan first.
- Only act after getting approval.

This scenario cannot be reliably handled by a simple prompt like “Help me see why CI failed,” because it involves permissions, logs, external tools, approval boundaries, and execution order, all of which need to be defined.
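For the log-pulling steps, the instructions can again lean on `gh`. A rough sketch (the official skill’s exact commands may differ):

```markdown
1. List the failing checks with `gh pr checks`.
2. Pull only the failed job output, e.g. `gh run view <run-id> --log-failed`.
3. Quote the failing snippets, propose a fix plan, and wait for explicit approval.
```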
4. Private Repository Skills: Solidifying Team’s Own Methodologies
Beyond the official examples, I think the greater value of a Skill lies within private repositories.
For instance, the blog-writer in this blog repository is essentially a very typical repo-scoped skill. It doesn’t aim to “teach the model how to write Chinese,” but rather codifies the writing style, structure, fact-checking process, output path, and final storage format that have already been established within this specific repository into a workflow.
These kinds of Skills often hold the most practical value. This is because they are not designed for everyone; instead, they specifically solve the problem: “In this repository, what kind of task keeps recurring, and where do we tend to go off track?”
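A frontmatter for such a repo-scoped skill might look like this; the block below is my own sketch, not the actual file:

```yaml
---
name: blog-writer
description: >-
  Use when the user asks to write or publish a blog post in this repository.
  Follow the repo's established style and structure, run the fact-check pass,
  and write the result to the agreed output path. Do not use for general writing.
---
```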
When to Actually Use a Skill
So, let’s get back to the most practical question: when should you seriously implement a Skill?
My answer is:
When you realize the problem is no longer “the model lacks capability,” but rather “its execution is inconsistent every time.”
At this point, continuously stacking prompts yields diminishing returns. If you add one sentence today and another tomorrow, eventually the prompt becomes like a rambling diary entry, and the model will still make mistakes.
Instead, organizing it into a Skill—separating the trigger conditions, steps, boundaries, scripts, and necessary data—results in more stable and reusable outcomes.
The MCP brings in the external world; the Skill solidifies the internal methodology.
The former solves “if it can be done,” while the latter solves “how to do it reliably.”
This is my current understanding of a Skill.
It’s not just a new prompt, nor is it a new protocol.
It’s more like an operational manual for the agent.
References
- Agent Skills - Codex | OpenAI Developers
- Using skills to accelerate OSS maintenance | OpenAI Developers
- What is the Model Context Protocol (MCP)? | Model Context Protocol
- openai/skills | GitHub
- gh-address-comments/SKILL.md | openai/skills
- gh-fix-ci/SKILL.md | openai/skills
Writing Notes
Original Prompt
Prompt: In AI large-model programming, MCP appeared first, and then came Skill. Using plain language, explain what a Skill is, how to develop a Skill, and what scenarios are suitable for a Skill, and provide specific, representative examples with their sources.
Writing Approach Summary
- Instead of repeating a “concept overview of `MCP` and `Skill`,” the focus is placed on the role and boundaries of `Skill`.
- Abstract definitions are grounded in real workflows using common analogies like “trade manuals” or “specialized SOPs.”
- The development section follows the actual structure of the official documentation, retaining key points such as `description`, `progressive disclosure`, and `instruction-only`.
- Case studies prioritize official sources, using `roll-dice` from the official documentation along with `gh-address-comments` and `gh-fix-ci` from the `openai/skills` repository.
- The conclusion re-clarifies the boundaries between `MCP`, `Skill`, and `AGENTS.md` to prevent readers from remaining confused after reading.