<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Skill on Uncle Xiang&#39;s Notebook</title>
        <link>https://ttf248.life/en/tags/skill/</link>
        <description>Recent content in Skill on Uncle Xiang&#39;s Notebook</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Thu, 09 Apr 2026 15:45:31 +0800</lastBuildDate><atom:link href="https://ttf248.life/en/tags/skill/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>AI Writing a Blog: The Next Steps Towards Engineering (Part 1)</title>
        <link>https://ttf248.life/en/p/why-blog-writer-had-to-exist/</link>
        <pubDate>Fri, 03 Apr 2026 20:58:02 +0800</pubDate>
        
        <guid>https://ttf248.life/en/p/why-blog-writer-had-to-exist/</guid>
        <description>&lt;p&gt;I wrote quite a few AI articles last year. The most basic workflow back then was: first, organize an outline or a list of questions myself; let the large model spit out the main body text; then copy the content into a local &lt;code&gt;md&lt;/code&gt; document, add frontmatter, tags, categories, and titles, and finally publish it.
This process isn&amp;rsquo;t unusable, but it&amp;rsquo;s tedious. The part that really wastes time isn&amp;rsquo;t the main body text, but the repetitive labor surrounding it. Especially after using &lt;code&gt;Codex&lt;/code&gt; a lot recently, this awkwardness has become even stronger. It can read repositories, modify files, supplement materials, and even write articles directly into the directory. If I still have to copy and paste things manually, it feels like I&amp;rsquo;m tying down the tool&amp;rsquo;s legs.&lt;/p&gt;
&lt;p&gt;This series of articles is actually trying to convey one thing: AI writing blogs cannot rely solely on a single prompt in the long run. This current article first discusses why &lt;code&gt;blog-writer&lt;/code&gt; came into existence; the next article will continue with &lt;a class=&#34;link&#34; href=&#34;https://ttf248.life/en/p/how-blog-style-suite-split-style-and-token-cost/&#34; &gt;AI Writing Blogs: Later, It Still Needs to Be Engineered (Part II): How blog-style-suite Separates Style Learning and Token Costs&lt;/a&gt;; and the last article concludes with &lt;a class=&#34;link&#34; href=&#34;https://ttf248.life/en/p/how-i-split-local-online-and-minimax-models/&#34; &gt;AI Writing Blogs: Later, It Still Needs to Be Engineered (Part III): How Local Models, Online Models, and Minimax Will Finally Divide Labor&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;whats-truly-annoying-isnt-writing-the-draft-but-that-sequence-of-mechanical-actions&#34;&gt;What&amp;rsquo;s truly annoying isn&amp;rsquo;t writing the draft, but that sequence of mechanical actions.
&lt;/h2&gt;&lt;p&gt;The early workflow was essentially like an outsourced assembly line.
I would first list out the problems clearly, or build a rough outline. The model is responsible for laying out the main body text. Then, a human comes back to complete the remaining publishing steps.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Copy to local &lt;code&gt;md&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;Fill in &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;date&lt;/code&gt;, and &lt;code&gt;slug&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Add tags and categories&lt;/li&gt;
&lt;li&gt;Insert &lt;code&gt;&amp;lt;!--more--&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Organize reference materials&lt;/li&gt;
&lt;li&gt;Finally, decide which directory it should go into&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Looked at individually, none of these steps is difficult. Strung together, they become tedious. What&amp;rsquo;s annoying isn&amp;rsquo;t the technical difficulty; it&amp;rsquo;s that every step is mechanical, yet none can be skipped.
This is why I increasingly feel that changes like &lt;a class=&#34;link&#34; href=&#34;https://ttf248.life/en/p/command-line-ai-coding-interaction/&#34; &gt;command-line-based AI coding interaction&lt;/a&gt; are not just about &amp;ldquo;changing the entry point.&amp;rdquo; When AI can read and write files directly inside a repository, a blog workflow that still stops at &amp;ldquo;copying the body text into a local document&amp;rdquo; is simply outdated.&lt;/p&gt;
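&lt;p&gt;Those mechanical steps are exactly what a small script can absorb. As a minimal sketch (the function and field names here are illustrative, not the actual &lt;code&gt;write_post.py&lt;/code&gt;), assembling frontmatter and landing the file directly in the content directory might look like this:&lt;/p&gt;

```python
from datetime import datetime
from pathlib import Path

def write_draft(title, slug, tags, categories, body, root="content/post"):
    # Illustrative sketch: assemble Hugo frontmatter so no field
    # relies on manual copy-and-paste edits.
    front = [
        "---",
        f"title: {title}",
        f"date: {datetime.now().isoformat()}",
        f"slug: {slug}",
        "tags: [" + ", ".join(tags) + "]",
        "categories: [" + ", ".join(categories) + "]",
        "---",
        "",
    ]
    # Land the article directly in the content tree instead of a chat window.
    path = Path(root) / f"{slug}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(front) + body, encoding="utf-8")
    return path
```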
&lt;h2 id=&#34;blog-writer-the-first-layer-of-value-its-not-the-style-its-locking-down-the-contract&#34;&gt;The First Layer of Value in blog-writer: It&amp;rsquo;s Not the Style, It&amp;rsquo;s Locking Down the Contract
&lt;/h2&gt;&lt;p&gt;The very first node for &lt;code&gt;blog-writer&lt;/code&gt; was at 17:00 on April 1, 2026, with the commit hash &lt;code&gt;991536a&lt;/code&gt;. Looking at the git commit history, this version included &lt;code&gt;SKILL.md&lt;/code&gt;, &lt;code&gt;write_post.py&lt;/code&gt;, and an initial set of style guidelines all together.&lt;/p&gt;
&lt;p&gt;However, when I looked back later, the most valuable part of this draft wasn&amp;rsquo;t that &amp;ldquo;the AI learned my writing style,&amp;rdquo; but rather that it established a rigid contract for content creation.&lt;/p&gt;
&lt;p&gt;What does &amp;ldquo;locking down the contract&amp;rdquo; mean?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The input must include at least an outline and factual anchors.&lt;/li&gt;
&lt;li&gt;The output must be complete Markdown, not a work in progress.&lt;/li&gt;
&lt;li&gt;Frontmatter cannot rely on manual additions anymore.&lt;/li&gt;
&lt;li&gt;The article cannot just stay in the chat window; it must land directly into &lt;code&gt;content/post&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
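&lt;p&gt;A contract like this is easy to check mechanically. The sketch below is illustrative (the field names are assumptions, not the real implementation), but it shows how each clause becomes a hard validation rule rather than a convention:&lt;/p&gt;

```python
def check_contract(draft):
    # Illustrative field names; the point is that each clause of the
    # drafting contract becomes a hard check instead of a hope.
    problems = []
    if not draft.get("outline"):
        problems.append("input must include an outline")
    if not draft.get("fact_anchors"):
        problems.append("input must include factual anchors")
    if not draft.get("body", "").strip().startswith("---"):
        problems.append("output must be complete Markdown with frontmatter")
    if not draft.get("target", "").startswith("content/post"):
        problems.append("article must land in content/post")
    return problems
```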
&lt;p&gt;This point is crucial because prompts themselves are inherently unstable. If you say, &amp;ldquo;Write it like before,&amp;rdquo; today, it might understand that only the tone should be similar; if you repeat it tomorrow, it might only learn superficial sentence structures. But once it&amp;rsquo;s written as a &lt;code&gt;Skill&lt;/code&gt;, the rule shifts from &amp;ldquo;improvisation&amp;rdquo; to a &amp;ldquo;fixed workflow.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The subsequent nodes were actually all about reinforcing this contract.&lt;/p&gt;
&lt;p&gt;The commit at 22:54 on April 2, 2026, with hash &lt;code&gt;8eb735a&lt;/code&gt;, standardized elements like author fields, writing notes, and original prompts. By this stage, the blog draft was no longer considered &amp;ldquo;finished once the body text is done&amp;rdquo;; instead, metadata, traceability, and public notes were all standardized together.&lt;/p&gt;
&lt;p&gt;Therefore, the first layer of value for &lt;code&gt;blog-writer&lt;/code&gt; has never been about making the model &lt;em&gt;seem&lt;/em&gt; better at writing; it&amp;rsquo;s about finally giving the act of drafting repeatable boundaries.&lt;/p&gt;
&lt;h2 id=&#34;series-mode-which-is-actually-one-step-forward-in-writing-workflow&#34;&gt;Series Mode: Actually a Step Forward for the Writing Workflow
&lt;/h2&gt;&lt;p&gt;After stabilizing writing a single article, the next problem quickly emerged.
Some topics are simply not suitable to be crammed into one piece. If you force it, the result often becomes a long article that is information-heavy, has a scattered main thread, and fails to fully explain every point.
This is why the commit &lt;code&gt;1a5604e&lt;/code&gt; on April 2, 2026, at 23:55 was so crucial. That commit added the series mode along with &lt;code&gt;write_post_series.py&lt;/code&gt;. Articles are linked using &lt;code&gt;relref&lt;/code&gt;, and the links are filled in uniformly during batch writing.
This might look like a minor upgrade to a file-writing script, but it&amp;rsquo;s not.
It illustrates one thing: content engineering is no longer just about &amp;ldquo;how to generate this single article,&amp;rdquo; but rather starts considering &amp;ldquo;how to stably save this set of content, how to guarantee the order, and how to link between them on the site.&amp;rdquo;
The next day&amp;rsquo;s commit, &lt;code&gt;04dccb9&lt;/code&gt; on April 3, 2026, at 09:29, pushed this process one step further. The timestamps for series articles now increment by minutes instead of sharing a single timestamp. This change is small but very &amp;ldquo;engineering-y&amp;rdquo; because it solves real problems like Hugo list pages, previous/next article navigation, and series ordering.
Simply put, the series mode isn&amp;rsquo;t about looking advanced; it&amp;rsquo;s about eliminating the need for manual fixes when publishing multiple articles together.&lt;/p&gt;
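&lt;p&gt;The timestamp trick is tiny but concrete. A minimal sketch (names are illustrative, not the actual &lt;code&gt;write_post_series.py&lt;/code&gt;) of offsetting each article in a series by one minute:&lt;/p&gt;

```python
from datetime import datetime, timedelta

def series_dates(start, count):
    # One-minute offsets keep Hugo list pages and previous/next
    # navigation in a stable, deterministic order, instead of all
    # articles in the series sharing a single timestamp.
    return [start + timedelta(minutes=i) for i in range(count)]
```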
&lt;h2 id=&#34;but-relying-on-just-one-skill-will-eventually-hit-a-token-wall&#34;&gt;But relying on just one skill will eventually hit a token wall.
&lt;/h2&gt;&lt;p&gt;The problem lies here.
Once you start seriously tinkering with style learning, the context of &lt;code&gt;blog-writer&lt;/code&gt; quickly becomes bloated. You don&amp;rsquo;t just want it to write; you want it to write the way you do. The most natural approach is to dump all your historical articles into it.
This works for a single run, of course.
But as soon as you&amp;rsquo;re not writing an occasional piece, but trying to make it a long-term workflow, the problems immediately arise:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;High token consumption&lt;/li&gt;
&lt;li&gt;Repeatedly feeding the same batch of old articles every time&lt;/li&gt;
&lt;li&gt;Model attention is diluted by old material&lt;/li&gt;
&lt;li&gt;Drafting and style maintenance are intertwined; neither is easy.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It was from here that I slowly realized that &lt;code&gt;blog-writer&lt;/code&gt; is better suited for the consumption side, rather than trying to feed it everything.
The act of drafting should stay light and direct, reading only the effective style versions. How that style data is generated, filtered, or compressed is a matter for a separate production pipeline. This realization finally pushed me to the next step: &lt;a class=&#34;link&#34; href=&#34;https://ttf248.life/en/p/how-blog-style-suite-split-style-and-token-cost/&#34; &gt;AI Blog Writing: It Still Needs to Be Engineered (Part II): How blog-style-suite Separates Style Learning from Token Costs&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;first-stabilize-the-process-only-then-can-we-talk-about-style-and-models&#34;&gt;First, stabilize the process; only then can we talk about style and models.
&lt;/h2&gt;&lt;p&gt;Looking back now, &lt;code&gt;blog-writer&lt;/code&gt; didn&amp;rsquo;t emerge because I suddenly wanted to build a blog writing assistant.
It was more because the original workflow started failing to keep up with new ways of working.
Once a tool like &lt;code&gt;Codex&lt;/code&gt; can connect to the internet for supplementary material, read and write within repositories, and directly call scripts, the act of writing a blog shouldn&amp;rsquo;t stop at &amp;ldquo;copying the body text to a local document.&amp;rdquo; If you don&amp;rsquo;t automate this part, it will actually become the clumsiest link in the entire chain.
So, I&amp;rsquo;ll leave the conclusion for the first post here.
What &lt;code&gt;blog-writer&lt;/code&gt; solved initially wasn&amp;rsquo;t writing style, but the repetitive labor of the publishing action. Without this layer of contract, any subsequent discussion about tokens, data structures, or local models is actually baseless.&lt;/p&gt;
&lt;h2 id=&#34;references&#34;&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Repository Commit: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ttf248/notebook/commit/991536a237d04aba7c44dec501b3d98c644040c8&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;code&gt;991536a237d04aba7c44dec501b3d98c644040c8&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Repository Commit: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ttf248/notebook/commit/8eb735aa8448c97deb2af1ea46b86772008fa9e3&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;code&gt;8eb735aa8448c97deb2af1ea46b86772008fa9e3&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Repository Commit: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ttf248/notebook/commit/1a5604e7e6ce0a13f260fcbb8c2c1d964cdd0892&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;code&gt;1a5604e7e6ce0a13f260fcbb8c2c1d964cdd0892&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Repository Commit: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ttf248/notebook/commit/04dccb98c55a6ea3b81408012b33a6219cf8ab77&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;code&gt;04dccb98c55a6ea3b81408012b33a6219cf8ab77&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Repository File: &lt;code&gt;.agents/skills/blog-writer/SKILL.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Repository File: &lt;code&gt;.agents/skills/blog-writer/scripts/write_post.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Repository File: &lt;code&gt;.agents/skills/blog-writer/scripts/write_post_series.py&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;writing-notes&#34;&gt;Writing Notes
&lt;/h2&gt;&lt;h3 id=&#34;original-prompt&#34;&gt;Original Prompt
&lt;/h3&gt;&lt;pre&gt;&lt;code class=&#34;language-text&#34;&gt;$blog-writer This content is quite extensive, so I&#39;ve split it into a series of articles: Last year, many drafts were written using large models. Back then, the process was to create an outline or a list of questions myself, and then have the AI generate the draft, copy the content into a local md document, fill in header information, tag information, and publish the article; recently, I used Codex a lot and found that its web search capability is very strong. So, could I write a skill to automate these tasks? This led to the first draft of the skill blog-writer. I also thought about having the AI learn my previous writing style, which caused blog-writer to consume a lot of tokens when running. Subsequently, I optimized blog-writer in several versions, splitting out the data module and the data generation module. The original data generation module was still an independent skill. As I continued writing, I realized that it would be better as a Python project, which led to blog-style-suite. Then, I found that training on style data also consumes a lot of tokens, so I wanted to use a local large model and connected to one locally. I then thought about comparing the differences between the local large model and the online version, so I integrated minimax. The evolution history of blog-style-suite and blog-writer can be analyzed from the git commit history. Additionally, based on the code for local blog-writer and blog-style-suite, I can discuss the design ideas, how token saving was achieved, and how the data structure was designed—the core design concepts. If tokens are abundant, it can consume entire historical articles; preprocessing can save a lot of tokens.
&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&#34;writing-strategy-summary&#34;&gt;Writing Strategy Summary
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;The first article should focus on the workflow trigger point, without rushing to detail the division of labor between tokens and models, to avoid having all three articles compete for the main narrative.&lt;/li&gt;
&lt;li&gt;It retains the key insight: &amp;ldquo;The body content is not difficult; what&amp;rsquo;s troublesome are the mechanical actions before and after publishing.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;By using nodes like &lt;code&gt;991536a&lt;/code&gt;, &lt;code&gt;8eb735a&lt;/code&gt;, &lt;code&gt;1a5604e&lt;/code&gt;, and &lt;code&gt;04dccb9&lt;/code&gt;, we ground the idea of &amp;ldquo;turning the process into a contract&amp;rdquo; in actual Git evolution.&lt;/li&gt;
&lt;li&gt;The series pattern is reserved for this article to illustrate that blog writing has moved from generating single pieces to managing entire sets of deliverables.&lt;/li&gt;
&lt;li&gt;The ending deliberately points toward the token wall, setting up the groundwork for data engineering and preprocessing in the second article.&lt;/li&gt;
&lt;/ul&gt;</description>
        </item>
        <item>
        <title>Skill is not a new prompt, it is the job manual for the agent.</title>
        <link>https://ttf248.life/en/p/skill-is-an-agent-handbook/</link>
        <pubDate>Thu, 02 Apr 2026 22:43:16 +0800</pubDate>
        
        <guid>https://ttf248.life/en/p/skill-is-an-agent-handbook/</guid>
<description>&lt;p&gt;Reading about AI programming these past few days, the discussion jumped from &lt;code&gt;MCP&lt;/code&gt; straight to &lt;code&gt;Skill&lt;/code&gt;. Many people seeing the term for the first time will instinctively treat it as another new protocol or another advanced prompt.&lt;/p&gt;
&lt;p&gt;My judgment is very straightforward: &lt;code&gt;Skill&lt;/code&gt; isn&amp;rsquo;t here to replace &lt;code&gt;MCP&lt;/code&gt;; rather, it&amp;rsquo;s more like providing an occupational manual for the agent. &lt;code&gt;MCP&lt;/code&gt; solves the problem of &amp;ldquo;enabling the agent to connect to the external world,&amp;rdquo; while &lt;code&gt;Skill&lt;/code&gt; solves the problem of &amp;ldquo;how to reliably get the job done after connecting.&amp;rdquo; These two are not a replacement relationship; they are more like one following the other.&lt;/p&gt;
&lt;p&gt;Simply put, &lt;code&gt;MCP&lt;/code&gt; gives the agent hands and feet, and &lt;code&gt;Skill&lt;/code&gt; tells the agent not to mess around.&lt;/p&gt;
&lt;h2 id=&#34;what-exactly-is-a-skill&#34;&gt;What exactly is a Skill?
&lt;/h2&gt;&lt;p&gt;If I were to explain it in the simplest terms, I would say this:
It&amp;rsquo;s like taking the expert knowledge from a seasoned employee&amp;rsquo;s head and organizing it into a reusable, triggerable, and actionable manual. That thing is a &lt;code&gt;Skill&lt;/code&gt;.
The official OpenAI documentation defines it very directly: A &lt;code&gt;Skill&lt;/code&gt; is a package of capabilities designed for specific tasks. It can contain instructions, reference materials, and optional scripts. The goal isn&amp;rsquo;t to make the model &amp;ldquo;smarter,&amp;rdquo; but rather to ensure that when performing a certain type of task, it outputs results consistently according to a fixed workflow.
What is it most like?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It&amp;rsquo;s not like a regular prompt because it&amp;rsquo;s not something you just say once and are done with.&lt;/li&gt;
&lt;li&gt;It&amp;rsquo;s not like an &lt;code&gt;MCP&lt;/code&gt; because it doesn&amp;rsquo;t handle connecting tools and data sources.&lt;/li&gt;
&lt;li&gt;And it&amp;rsquo;s not like the general rules in &lt;code&gt;AGENTS.md&lt;/code&gt; because it&amp;rsquo;s not a universal rule for the entire repository.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is more like a specialized Standard Operating Procedure (SOP), or a trade manual.
For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;When handling GitHub PR comments, first identify which comments need attention, then ask the user which ones to handle, then modify the code, and finally provide feedback.&lt;/li&gt;
&lt;li&gt;When debugging a CI failure, first pull the GitHub Actions logs, then extract the failing segments, then propose a fix plan, and only execute after approval.&lt;/li&gt;
&lt;li&gt;When writing a blog post, first generate titles according to a fixed style, then supplement with facts, then add frontmatter, and finally publish it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The common thread among these tasks is not that &amp;ldquo;the model doesn&amp;rsquo;t know how to answer,&amp;rdquo; but that &amp;ldquo;the model&amp;rsquo;s approach varies every time, making it easy to drift off course.&amp;rdquo; This is where the value of a &lt;code&gt;Skill&lt;/code&gt; comes in.&lt;/p&gt;
&lt;h2 id=&#34;the-difference-between-skill-and-mcp&#34;&gt;The Difference Between Skill and MCP
&lt;/h2&gt;&lt;p&gt;This issue really needs to be viewed within a real workflow, otherwise, it&amp;rsquo;s easy to talk too theoretically.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;MCP&lt;/code&gt; is like the interface layer.
The official definition of &lt;code&gt;MCP&lt;/code&gt; is an open standard for connecting AI applications to external systems. Files, local databases, search engines, design mockups, third-party services—all of these can be connected via &lt;code&gt;MCP&lt;/code&gt;. Therefore, it solves the problem of &amp;ldquo;connecting things.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Skill&lt;/code&gt; is like the process layer.
The OpenAI &lt;code&gt;Agent Skills&lt;/code&gt; documentation makes it very clear that a &lt;code&gt;Skill&lt;/code&gt; is the writing format for reusable workflows. A &lt;code&gt;Skill&lt;/code&gt; must have at least an &lt;code&gt;SKILL.md&lt;/code&gt;, and can also include &lt;code&gt;scripts/&lt;/code&gt;, &lt;code&gt;references/&lt;/code&gt;, and &lt;code&gt;assets/&lt;/code&gt;. Codex first reads its &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt;; only when it determines that it needs to use it does it load the full description into the context. This is what the official documentation calls &lt;code&gt;progressive disclosure&lt;/code&gt;.&lt;/p&gt;
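&lt;p&gt;That loading strategy can be sketched in a few lines. This is an illustrative model of progressive disclosure, not Codex&amp;rsquo;s actual implementation, and the crude keyword match stands in for whatever real relevance test the agent applies:&lt;/p&gt;

```python
def select_skill(skills, task):
    # Scan only cheap metadata (name plus description) for every skill;
    # load the full SKILL.md body just for the one that matches.
    for meta in skills:
        keywords = meta["description"].lower().split()
        if any(word in task.lower() for word in keywords):
            return meta["load_full"]()  # pay the context cost only here
    return None
```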
&lt;p&gt;So, the division of labor between the two is very clear:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;MCP&lt;/code&gt;: Connects the capabilities (the &amp;ldquo;what&amp;rdquo;).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Skill&lt;/code&gt;: Defines the order of operations (the &amp;ldquo;how&amp;rdquo;).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To give a very intuitive example, something like converting Figma to code. If the agent cannot read the design mockup at all, then what you need first is an &lt;code&gt;MCP&lt;/code&gt;. But if it can already read the design mockup, but just writes things randomly—writing components today, cutting pages tomorrow, and forgetting visual checks the day after—then what you need to improve is the &lt;code&gt;Skill&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;how-to-develop-skills&#34;&gt;How to Develop Skills
&lt;/h2&gt;&lt;p&gt;This is actually not as heavy as you might think.
OpenAI officially recommends starting with the built-in &lt;code&gt;$skill-creator&lt;/code&gt;, which will help you build the basic structure, such as trigger conditions, scope, and whether a script is needed. By default, it prioritizes instruction-only, meaning don&amp;rsquo;t rush to write scripts; first, make sure your instructions are clear.
If writing manually, the minimum structure is very simple:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-text&#34;&gt;my-skill/
├── SKILL.md
├── scripts/
├── references/
└── assets/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of these, only &lt;code&gt;SKILL.md&lt;/code&gt; is truly essential. Furthermore, this file must have at least two pieces of metadata:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;---
name: skill-name
description: Explain exactly when this skill should and should not trigger.
---
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I think the most crucial steps when developing a &lt;code&gt;Skill&lt;/code&gt; are the following.&lt;/p&gt;
&lt;h3 id=&#34;1-look-for-recurring-biases&#34;&gt;1. Look for &amp;ldquo;Recurring Biases&amp;rdquo;
&lt;/h3&gt;&lt;p&gt;Not every task is worth turning into a &lt;code&gt;Skill&lt;/code&gt;.
If you only ask the agent to do something occasionally, a regular prompt is enough. The situations that are truly suitable for creating a &lt;code&gt;Skill&lt;/code&gt; are usually like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You have repeated the same thing many times.&lt;/li&gt;
&lt;li&gt;The agent has the capability, but the execution order always changes.&lt;/li&gt;
&lt;li&gt;The location where it makes mistakes is similar every time.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Simply put, it&amp;rsquo;s not an &amp;ldquo;ability gap,&amp;rdquo; but an &amp;ldquo;inconsistent process.&amp;rdquo;&lt;/p&gt;
&lt;h3 id=&#34;2-write-the-description-as-a-trigger-condition-not-as-marketing-copy&#34;&gt;2. Write the &lt;code&gt;description&lt;/code&gt; as a trigger condition, not as marketing copy
&lt;/h3&gt;&lt;p&gt;This step is crucial.
The official documentation specifically emphasizes that whether Codex will implicitly call a &lt;code&gt;Skill&lt;/code&gt; heavily depends on the &lt;code&gt;description&lt;/code&gt;. So, don&amp;rsquo;t write things like &amp;ldquo;This is a very useful skill&amp;rdquo;; instead, write &amp;ldquo;When it should be used and when it should &lt;em&gt;not&lt;/em&gt; be used.&amp;rdquo;
For example, the description for the official &lt;code&gt;gh-fix-ci&lt;/code&gt; skill is very clear: Use this when the user asks you to debug or fix failed GitHub PR checks; the focus is on checking logs, summarizing the failure reasons, providing a remediation plan, and only implementing it after receiving explicit approval. Just by reading it, you know its boundaries.&lt;/p&gt;
&lt;h3 id=&#34;3-if-it-can-be-solved-with-instructions-dont-write-a-script-first&#34;&gt;3. If it can be solved with instructions, don&amp;rsquo;t write a script first
&lt;/h3&gt;&lt;p&gt;OpenAI&amp;rsquo;s documentation is also very practical: unless you explicitly need deterministic behavior or external tools, prioritize using instructions rather than scripting everything right away.
Why? Because the more scripts you have, the higher the maintenance cost becomes.
Many &lt;code&gt;Skills&lt;/code&gt; initially only require clearly explaining the steps:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What to do first&lt;/li&gt;
&lt;li&gt;What to do next&lt;/li&gt;
&lt;li&gt;What the output format should be&lt;/li&gt;
&lt;li&gt;In which situations to stop and ask the user&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This covers most of the problems.
Only when a step is particularly stable, mechanical, or highly suitable for automation should you move it down into &lt;code&gt;scripts/&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;4-extract-data-templates-and-resources&#34;&gt;4. Extract Data, Templates, and Resources
&lt;/h3&gt;&lt;p&gt;As you write the &lt;code&gt;Skill&lt;/code&gt;, it&amp;rsquo;s easy for it to become one long block of descriptive text. This is when you need to break things out.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;SKILL.md&lt;/code&gt; is responsible for explaining rules and sequence.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;references/&lt;/code&gt; holds background documents and reference materials.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;assets/&lt;/code&gt; stores templates, icons, and examples.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;scripts/&lt;/code&gt; contains actions that can be executed reliably.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Doing this has two benefits. First, the main file won&amp;rsquo;t become increasingly bloated. Second, the model only loads details when necessary, which aligns with the concept of &lt;em&gt;progressive disclosure&lt;/em&gt;.&lt;/p&gt;
&lt;h3 id=&#34;5-local-use-first-cross-project-distribution-later&#34;&gt;5. Local Use First, Cross-Project Distribution Later
&lt;/h3&gt;&lt;p&gt;The official documentation explains this very clearly.
If you are only using it for your current repository, placing it in &lt;code&gt;.agents/skills/&lt;/code&gt; is sufficient. Codex scans the skills directory from locations such as the repository, user, and system.
However, if you find that this functionality can be used across more than one repository, or if you want to package and distribute multiple skills together, then don&amp;rsquo;t stop at just the &lt;code&gt;skill&lt;/code&gt; folder level; you should consider using &lt;code&gt;plugin&lt;/code&gt;. The official OpenAI documentation is also very clear: &lt;code&gt;Skill&lt;/code&gt; is the workflow itself, while &lt;code&gt;plugin&lt;/code&gt; is the unit that is better suited for installation and distribution.&lt;/p&gt;
&lt;h2 id=&#34;what-scenarios-are-suitable-for-skills&#34;&gt;What Scenarios are Suitable for Skills
&lt;/h2&gt;&lt;p&gt;Having more &lt;code&gt;Skills&lt;/code&gt; is not always better; they are best suited for the following types of tasks.&lt;/p&gt;
&lt;h3 id=&#34;high-frequency-repetitive-tasks&#34;&gt;High-Frequency Repetitive Tasks
&lt;/h3&gt;&lt;p&gt;Tasks you do every week, and the sequence is pretty much the same each time.
For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Handling PR review comments&lt;/li&gt;
&lt;li&gt;Writing a blog post and adding frontmatter&lt;/li&gt;
&lt;li&gt;Debugging CI failures&lt;/li&gt;
&lt;li&gt;Performing pre-release checks&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What&amp;rsquo;s most daunting about these tasks isn&amp;rsquo;t that the model can&amp;rsquo;t do them, but having to explain everything all over again each time.&lt;/p&gt;
&lt;h3 id=&#34;tasks-with-fixed-procedures&#34;&gt;Tasks with Fixed Procedures
&lt;/h3&gt;&lt;p&gt;Some tasks naturally have a sequence.
For example, when troubleshooting an issue, you should first check the logs, then narrow down the scope, then propose a plan, and finally make changes. In this case, &lt;code&gt;Skill&lt;/code&gt; is particularly suitable because it can enforce the order, reducing reliance on the model&amp;rsquo;s ad-hoc performance.&lt;/p&gt;
&lt;h3 id=&#34;tasks-requiring-domain-context-binding&#34;&gt;Tasks Requiring Domain Context Binding
&lt;/h3&gt;&lt;p&gt;Some tasks do not have fixed steps and come with strong constraints.
For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Must only check official documentation&lt;/li&gt;
&lt;li&gt;Must output in a specific review format&lt;/li&gt;
&lt;li&gt;Must retain existing terminology within the team&lt;/li&gt;
&lt;li&gt;Must comply with the writing or development standards of a certain repository&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you rely on prompts alone every time, it&amp;rsquo;s easy to miss these details.&lt;/p&gt;
&lt;h3 id=&#34;tasks-requiring-tools-and-processes-together&#34;&gt;Tasks Requiring Tools and Processes Together
&lt;/h3&gt;&lt;p&gt;This scenario is particularly typical.
It&amp;rsquo;s not just about &amp;ldquo;connecting to GitHub&amp;rdquo; and being done; after connecting, you still need to examine the logs in a certain way, deduce the problems, decide whether or not to modify anything, and finally how to provide feedback. In other words, external connections and internal processes must work together.
This is often when &lt;code&gt;MCP + Skill&lt;/code&gt; appear jointly.&lt;/p&gt;
&lt;h2 id=&#34;when-not-to-use-a-skill&#34;&gt;When Not to Use a Skill
&lt;/h2&gt;&lt;p&gt;It&amp;rsquo;s also important to be clear about this, otherwise it&amp;rsquo;s easy to want to make everything into a &lt;code&gt;Skill&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;one-off-casual-tasks&#34;&gt;One-off Casual Tasks
&lt;/h3&gt;&lt;p&gt;When a user asks a quick question, a standard prompt is often sufficient.&lt;/p&gt;
&lt;h3 id=&#34;you-just-want-to-connect-an-external-system&#34;&gt;You just want to &amp;ldquo;connect an external system&amp;rdquo;
&lt;/h3&gt;&lt;p&gt;In that case, prioritize &lt;code&gt;MCP&lt;/code&gt;, not &lt;code&gt;Skill&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&#34;you-want-to-constrain-the-long-term-behavior-of-the-entire-repository&#34;&gt;You want to constrain the long-term behavior of the entire repository
&lt;/h3&gt;&lt;p&gt;This is more like the job of &lt;code&gt;AGENTS.md&lt;/code&gt;, not a &lt;code&gt;Skill&lt;/code&gt;.
So, to summarize simply:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Missing connections, use &lt;code&gt;MCP&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Missing processes/workflows, use &lt;code&gt;Skill&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Missing global rules, use &lt;code&gt;AGENTS.md&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;several-representative-examples&#34;&gt;Several Representative Examples
&lt;/h2&gt;&lt;p&gt;It&amp;rsquo;s better to look at a few real-world examples than to hear too many concepts.&lt;/p&gt;
&lt;h3 id=&#34;1-roll-dice-the-minimal-viable-entry-case&#34;&gt;1. &lt;code&gt;roll-dice&lt;/code&gt;: The Minimal Viable Entry Case
&lt;/h3&gt;&lt;p&gt;This example comes from the official OpenAI &lt;code&gt;Agent Skills&lt;/code&gt; documentation.
It is tiny: the directory contains little more than a &lt;code&gt;SKILL.md&lt;/code&gt;, which tells the agent to call PowerShell&amp;rsquo;s random number command when the user asks to roll dice.
Why is this example good?
Because it directly exposes the core skeleton of a &lt;code&gt;Skill&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It has clear triggering conditions&lt;/li&gt;
&lt;li&gt;It has a clear execution method&lt;/li&gt;
&lt;li&gt;It has clear boundaries&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It illustrates one thing: a &lt;code&gt;Skill&lt;/code&gt; doesn&amp;rsquo;t have to be large. As long as something happens repeatedly, and you don&amp;rsquo;t want the model to improvise wildly, you can create it as a &lt;code&gt;Skill&lt;/code&gt;.&lt;/p&gt;
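&lt;p&gt;Its whole &lt;code&gt;SKILL.md&lt;/code&gt; follows the standard shape: a short frontmatter telling the agent when to trigger, followed by the steps. The sketch below is my paraphrase of that shape, not the official file verbatim (note that &lt;code&gt;Get-Random -Minimum 1 -Maximum 7&lt;/code&gt; returns 1&amp;ndash;6, since the maximum is exclusive):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;---
name: roll-dice
description: Roll dice when the user asks for a dice roll.
---

When the user asks to roll dice:

1. Determine how many dice and how many sides (default: one six-sided die).
2. Generate each result with PowerShell: Get-Random -Minimum 1 -Maximum 7
3. Report each roll, and the total if there is more than one die.
&lt;/code&gt;&lt;/pre&gt;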
&lt;h3 id=&#34;2-gh-address-comments-a-workflow-example-for-handling-github-comments&#34;&gt;2. &lt;code&gt;gh-address-comments&lt;/code&gt;: A Workflow Example for Handling GitHub Comments
&lt;/h3&gt;&lt;p&gt;This example comes from the official OpenAI &lt;a class=&#34;link&#34; href=&#34;https://github.com/openai/skills&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;&lt;code&gt;openai/skills&lt;/code&gt;&lt;/a&gt; repository.
Its goal is not to &amp;ldquo;connect to GitHub,&amp;rdquo; but rather to encapsulate the process of &amp;ldquo;handling comments on the current branch&amp;rsquo;s PR.&amp;rdquo; The steps in the official version are very typical:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;First, confirm if &lt;code&gt;gh&lt;/code&gt; is already authenticated.&lt;/li&gt;
&lt;li&gt;Then, fetch the comments and review threads for the current PR.&lt;/li&gt;
&lt;li&gt;Number and summarize these comments.&lt;/li&gt;
&lt;li&gt;Allow the user to explicitly select which ones need processing.&lt;/li&gt;
&lt;li&gt;Only then start the actual work.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This example illustrates the value of a &lt;code&gt;Skill&lt;/code&gt; particularly well.
For many engineering tasks, the difficulty isn&amp;rsquo;t whether &amp;ldquo;the model knows what GitHub is,&amp;rdquo; but whether &amp;ldquo;it will process things in the correct order.&amp;rdquo; &lt;code&gt;gh-address-comments&lt;/code&gt; solves exactly this kind of sequencing problem.&lt;/p&gt;
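&lt;p&gt;In &lt;code&gt;SKILL.md&lt;/code&gt; form, that ordering might be sketched like this (a paraphrase in the same spirit, not the official file&amp;rsquo;s literal text; the &lt;code&gt;gh&lt;/code&gt; invocations are standard GitHub CLI commands):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;1. Run `gh auth status`; stop and ask the user to log in if it fails.
2. Find the PR for the current branch with `gh pr view --json number,url`,
   then fetch its review comments via `gh api`.
3. Number and summarize the comments for the user.
4. Ask which numbers to address; do not touch the rest.
5. Only after the user confirms, start editing code.
&lt;/code&gt;&lt;/pre&gt;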
&lt;h3 id=&#34;3-gh-fix-ci-troubleshooting-engineering-cases-with-failed-ci&#34;&gt;3. &lt;code&gt;gh-fix-ci&lt;/code&gt;: Troubleshooting Engineering Cases with Failed CI
&lt;/h3&gt;&lt;p&gt;This is also an official skill from the &lt;code&gt;openai/skills&lt;/code&gt; repository.
It addresses another very typical engineering task: a PR check has failed; should it be fixed, and if so, how?
The workflow defined in this &lt;code&gt;Skill&lt;/code&gt; is just as representative:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;First, confirm the &lt;code&gt;gh&lt;/code&gt; login status.&lt;/li&gt;
&lt;li&gt;Find the current PR.&lt;/li&gt;
&lt;li&gt;Pull the failed checks and logs from GitHub Actions.&lt;/li&gt;
&lt;li&gt;Extract the failure snippets.&lt;/li&gt;
&lt;li&gt;Propose a fix plan first.&lt;/li&gt;
&lt;li&gt;Only act after getting approval.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This scenario cannot be handled reliably by a one-line prompt like &amp;ldquo;Help me see why CI failed,&amp;rdquo; because it involves permissions, logs, external tools, approval boundaries, and execution order, all of which need to be defined.&lt;/p&gt;
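&lt;p&gt;A sketch of those steps in &lt;code&gt;SKILL.md&lt;/code&gt; style (again a paraphrase; the &lt;code&gt;gh&lt;/code&gt; subcommands and flags shown are real CLI features, but the official wording differs):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;1. Run `gh auth status` and stop if not logged in.
2. Locate the PR for the current branch: `gh pr view --json number,url`.
3. List its checks with `gh pr checks` and note the failing ones.
4. Pull only the failing logs: `gh run view RUN_ID --log-failed`.
5. Quote the failing snippets and propose a fix plan.
6. Do not modify anything until the user approves the plan.
&lt;/code&gt;&lt;/pre&gt;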
&lt;h3 id=&#34;4-private-repository-skills-solidifying-teams-own-methodologies&#34;&gt;4. Private Repository Skills: Solidifying a Team&amp;rsquo;s Own Methodologies
&lt;/h3&gt;&lt;p&gt;Beyond the official examples, I think the greater value of &lt;code&gt;Skill&lt;/code&gt; lies in private repositories.
For instance, the &lt;code&gt;blog-writer&lt;/code&gt; in this blog repository is essentially a very typical repo-scoped skill. It doesn&amp;rsquo;t aim to &amp;ldquo;teach the model how to write Chinese,&amp;rdquo; but rather codifies the writing style, structure, fact-checking process, output path, and final storage format that have already been established within this specific repository into a workflow.
These kinds of &lt;code&gt;Skills&lt;/code&gt; often hold the most practical value. This is because they are not designed for everyone; instead, they specifically solve the problem: &amp;ldquo;In this repository, what kind of task keeps recurring, and where do we tend to go off track?&amp;rdquo;&lt;/p&gt;
&lt;h2 id=&#34;when-to-actually-use-a-skill&#34;&gt;When to Actually Use a Skill
&lt;/h2&gt;&lt;p&gt;So, back to the most practical question: when should you seriously build a &lt;code&gt;Skill&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;My answer: when you realize the problem is no longer &amp;ldquo;the model lacks capability,&amp;rdquo; but &amp;ldquo;its execution is inconsistent every time.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;At this point, stacking more prompt text yields diminishing returns. Add one sentence today and another tomorrow, and eventually the prompt reads like a rambling diary entry, yet the model still makes mistakes. Organizing it into a &lt;code&gt;Skill&lt;/code&gt; instead, with the trigger conditions, steps, boundaries, scripts, and necessary data each in their place, gives more stable and reusable results.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;MCP&lt;/code&gt; brings in the external world; a &lt;code&gt;Skill&lt;/code&gt; solidifies the internal methodology. The former answers &amp;ldquo;whether it can be done,&amp;rdquo; the latter &amp;ldquo;how to do it reliably.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;This is my current understanding of a &lt;code&gt;Skill&lt;/code&gt;: not just another prompt, nor another protocol, but something closer to an operations manual for the agent.&lt;/p&gt;
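&lt;p&gt;That separation of trigger conditions, steps, scripts, and data usually shows up directly in a skill&amp;rsquo;s directory layout. A typical shape looks like this (the file names besides &lt;code&gt;SKILL.md&lt;/code&gt; are conventions of my own, not requirements):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;my-skill/
├── SKILL.md       # trigger description + step-by-step workflow
├── scripts/       # deterministic helpers the steps can call
│   └── check.sh
└── reference.md   # extra detail, loaded only when actually needed
&lt;/code&gt;&lt;/pre&gt;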
&lt;h2 id=&#34;references&#34;&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://developers.openai.com/codex/skills&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Agent Skills - Codex | OpenAI Developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://developers.openai.com/blog/skills-agents-sdk&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Using skills to accelerate OSS maintenance | OpenAI Developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://modelcontextprotocol.io/docs/getting-started/intro&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;What is the Model Context Protocol (MCP)? | Model Context Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/openai/skills&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;openai/skills | GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/openai/skills/blob/main/skills/.curated/gh-address-comments/SKILL.md&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;gh-address-comments/SKILL.md | openai/skills&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://github.com/openai/skills/blob/main/skills/.curated/gh-fix-ci/SKILL.md&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;gh-fix-ci/SKILL.md | openai/skills&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;writing-notes&#34;&gt;Writing Notes
&lt;/h2&gt;&lt;h3 id=&#34;original-prompt&#34;&gt;Original Prompt
&lt;/h3&gt;&lt;blockquote&gt;
&lt;p&gt;Prompt: AI Large Model Programming, first appeared MCP, then Skill. Using plain language, explain what Skill is, how to develop a Skill, what scenarios are suitable for Skill, and provide specific, representative examples from each source.&lt;/p&gt;&lt;/blockquote&gt;
&lt;h3 id=&#34;writing-approach-summary&#34;&gt;Writing Approach Summary
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Instead of repeating a &amp;ldquo;concept overview of &lt;code&gt;MCP&lt;/code&gt; and &lt;code&gt;Skill&lt;/code&gt;,&amp;rdquo; the focus is placed on the role and boundaries of &lt;code&gt;Skill&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Abstract definitions are grounded in real workflows using common analogies like &amp;ldquo;trade manuals&amp;rdquo; or &amp;ldquo;specialized SOPs.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;The development section follows the actual structure of official documentation, retaining key points such as &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;progressive disclosure&lt;/code&gt;, and &lt;code&gt;instruction-only&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Case studies prioritize official sources, utilizing &lt;code&gt;roll-dice&lt;/code&gt; from the official documentation, along with &lt;code&gt;gh-address-comments&lt;/code&gt; and &lt;code&gt;gh-fix-ci&lt;/code&gt; from the &lt;code&gt;openai/skills&lt;/code&gt; repository.&lt;/li&gt;
&lt;li&gt;The conclusion re-clarifies the boundaries between &lt;code&gt;MCP&lt;/code&gt;, &lt;code&gt;Skill&lt;/code&gt;, and &lt;code&gt;AGENTS.md&lt;/code&gt; to prevent readers from remaining confused after reading.&lt;/li&gt;
&lt;/ul&gt;</description>
        </item>
        
    </channel>
</rss>
