Lessons Learned While Using Claude Code Skills

Recently, I started experimenting with Claude Code Skills.

I first came across them in this YouTube video:
https://www.youtube.com/watch?v=CEvIs9y1uog

The idea of skills looked very promising, so I wanted to get familiar with how they work in practice.

I took a real task I am currently working on and turned the entire testing and verification flow into a single skill.
The flow looks like this:

  • Get a list of ids
  • Send requests to the server based on these ids
  • Verify the side effects after the requests
  • Check whether each downstream system is in the expected state

After the skill was ready, I started testing it. Very quickly, I noticed that the results were not accurate:

  • I wanted to test 100 ids, but only about 30 were actually tested, and the result still showed success
  • Some downstream checks clearly had errors, but the skill still reported success
  • Only 50 out of 100 ids were verified, but the final result was still marked as correct
  • Even when earlier steps were incomplete, downstream steps continued to run

At this point, I realized that skills do not behave like scripts that strictly execute step by step and stop when something goes wrong. Even when the state is incorrect, the flow can continue.
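
For contrast, this is roughly the kind of check a script would enforce for the coverage problem above. It is only a minimal sketch in Python: the names `missing_ids` and `summarize` are made up for illustration, and nothing here is part of Claude Code Skills itself.

```python
def missing_ids(requested_ids, verified_ids):
    """Return the ids that were requested but never verified."""
    return sorted(set(requested_ids) - set(verified_ids))


def summarize(requested_ids, verified_ids):
    """Report success only when every requested id was verified."""
    missing = missing_ids(requested_ids, verified_ids)
    if missing:
        # Partial coverage is a failure, not a success with fewer ids.
        return {"status": "failed", "missing": missing}
    return {"status": "success", "missing": []}
```

With 100 requested ids and only 30 verified, `summarize` returns a failed status and lists the 70 ids that were skipped, instead of reporting success.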

Because I spent many hours testing this, I briefly wondered whether it would be simpler to just write a script instead, even though I had already connected MCP servers to the skill.

Later, I realized that the main issue was my initial assumption that a skill would enforce this strictness on its own.
If I want skills to behave strictly, I need to design them that way.

So I made several changes:

  • Defined clear success and failure conditions for each step
  • Stopped the flow immediately when results were not as expected
  • Required root cause checks before continuing
  • Added multiple MCP servers to support different types of verification
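
The first two changes describe a fail-fast control flow. A skill is written as instructions rather than Python, so the sketch below only illustrates the behavior I was asking for. `Step`, `StepFailed`, and `run_flow` are hypothetical names, not part of any skill or MCP API.

```python
from dataclasses import dataclass
from typing import Callable, List


class StepFailed(Exception):
    """Raised when a step does not meet its success condition."""


@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]         # performs the step, returns the new state
    succeeded: Callable[[dict], bool]   # explicit success condition for the step


def run_flow(steps: List[Step], state: dict) -> dict:
    """Run steps in order and stop at the first one that fails its check."""
    for step in steps:
        state = step.run(state)
        if not step.succeeded(state):
            # Stop immediately so downstream steps never run on a bad state.
            raise StepFailed(f"step '{step.name}' did not meet its success condition")
    return state
```

Each step carries its own success condition, so an incomplete earlier step raises `StepFailed` before any downstream check runs.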

These changes reduced some of the false positives.
MCP servers also turned out to be quite flexible. For example, when writing code, checking service logs is not always convenient, but with skills and MCP servers, querying different data sources becomes easier.

I am still using skills, but the experience does not feel fully smooth yet.
I am also not sure whether there are established best practices, especially when the number of skills grows and management becomes a concern.
