TL;DR
Anthropic has published lessons from using hundreds of Claude Code Skills inside its engineering organization, arguing that Skills work best as reusable folders rather than saved prompts. The company says verification Skills had the strongest effect on output quality, while the broader lesson for teams is that agent instructions can become shared, versioned operating knowledge.
Anthropic says its experience running hundreds of Claude Code Skills across its engineering organization shows that a Skill is best treated as a reusable folder of working knowledge, not a saved prompt. The company’s June 3, 2026 write-up matters because it points to a more durable way for teams to make AI coding agents follow shared practices, use approved tools and check their own work.
The confirmed development is that Thariq Shihipar, a Claude Code engineer at Anthropic, published a post titled “Lessons from building Claude Code: How we use skills.” According to the source material, Anthropic’s internal use of Skills led it to frame them as folders that can include SKILL.md instructions, deeper references, scripts, templates, configuration, assets, hooks and memory.
Anthropic’s reported model is based on progressive disclosure: the agent reads a short root file first, then pulls in more detailed materials only when the task calls for them. The source material says Anthropic grouped its internal Skills into nine categories: library references, product verification, data analysis, business-process automation, scaffolding, code review, CI/CD, runbooks and infrastructure operations.
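Progressive disclosure can be illustrated with a minimal Python sketch. It is not Anthropic’s implementation: the folder name deploy-check, the references/rollback.md file and the helper names are all hypothetical, and the keyword test stands in for a decision the agent would make itself.

```python
import tempfile
from pathlib import Path


def load_root(skill_dir: Path) -> str:
    """Read only the short root file; nothing else is loaded up front."""
    return (skill_dir / "SKILL.md").read_text(encoding="utf-8")


def load_reference(skill_dir: Path, relative_path: str) -> str:
    """Read one deeper file, and only when the task calls for it."""
    target = (skill_dir / relative_path).resolve()
    # Refuse paths that escape the Skill folder.
    if not target.is_relative_to(skill_dir.resolve()):
        raise ValueError(f"{relative_path} is outside the Skill folder")
    return target.read_text(encoding="utf-8")


def make_demo_skill(base: Path) -> Path:
    """Create a tiny hypothetical Skill folder for the demonstration."""
    skill = base / "deploy-check"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text(
        "Use when verifying a deploy. Details: references/rollback.md\n",
        encoding="utf-8",
    )
    (skill / "references" / "rollback.md").write_text(
        "Rollback steps go here.\n", encoding="utf-8"
    )
    return skill


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill = make_demo_skill(Path(tmp))
        context = [load_root(skill)]  # step 1: the root file only
        task = "plan a rollback for the failed deploy"
        if "rollback" in task:  # stand-in for the agent's own judgment
            context.append(load_reference(skill, "references/rollback.md"))
        print(f"{len(context)} file(s) loaded into context")
```

The point of the split is that the deeper file costs nothing until a task actually needs it.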
The strongest performance claim in the material is attributed to Anthropic: verification Skills, meaning Skills that check whether work was done correctly, had the largest measured effect on output quality. The company’s exact measurement method is not included in the supplied material, so that claim should be read as Anthropic’s reported finding rather than an independently verified benchmark.
A Skill is a folder, not a prompt
Anthropic published what it learned from running hundreds of Skills across its own engineering organization. Read as a business memo, the point is bigger than a coding trick: this is how ad-hoc prompting becomes durable institutional capability, in the form of SOPs that agents actually follow, versioned and shared.
The misconception: “A Skill is just a clever markdown prompt you save in a file.”
The reality: a folder the agent can discover, read and run, holding instructions, scripts, references, templates, config and on-demand hooks.
The knowledge of how your organization actually operates can be captured, versioned, shared and executed, and the thing capturing it is a humble folder with a script and a gotchas list inside. For the builder, that’s context engineering with real tools attached. For whoever owns the budget, it’s the difference between AI that starts from zero every morning and an asset that compounds. Caveats: best practices are still evolving, checked-in Skills cost context, and curation beats accumulation. Start with one Skill, one gotcha, and the category that catches your mistakes.
Agent Workflows Become Reusable Assets
The report matters because many teams still rely on repeated prompting to get coding agents to behave consistently. Anthropic’s approach suggests a shift toward versioned operating procedures that agents can read, run and improve over time.
For engineering leaders, the practical issue is repeatability. A Skill can package the specific way a team reviews code, validates a product change, deploys software or handles an incident. That can reduce dependence on informal knowledge held by individual employees, though the source material does not prove how broadly the approach works outside Anthropic.
For developers, the main implication is that effective agent guidance may require real tooling, not just better prose. The supplied analysis emphasizes scripts, templates and guardrails because they give the agent executable steps instead of asking it to recreate boilerplate from memory.
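To make the “template instead of memory” idea concrete, here is a small Python sketch of a script a Skill could ship. The changelog format and the function name are invented for illustration and do not come from Anthropic’s post.

```python
from string import Template

# Hypothetical template a Skill might keep alongside its instructions.
CHANGELOG_ENTRY = Template(
    "## $version ($date)\n"
    "- $summary\n"
)


def render_changelog_entry(version: str, date: str, summary: str) -> str:
    """Fill the shipped template so the agent never retypes the boilerplate."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which surfaces mistakes instead of emitting a half-complete entry.
    return CHANGELOG_ENTRY.substitute(version=version, date=date, summary=summary)


if __name__ == "__main__":
    print(render_changelog_entry("1.2.0", "2026-06-03", "Fix login redirect"))
```

The agent calls the script with three values and gets the same layout every time, rather than reconstructing the format from a prose description.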

From Prompting To Packaged Practice
The Anthropic post sits within the wider push to make AI coding agents more reliable in day-to-day software work. The supplied Thorsten Meyer AI analysis, published July 1, 2026, argues that the larger business point is not a coding trick but the conversion of informal practices into shared assets.
The source material describes a typical Skill folder as containing root instructions, references, scripts, assets, configuration, hooks and memory. The root SKILL.md acts as both instruction and trigger: it tells the model when the Skill applies, while additional files provide details only when needed.
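The dual role of the root file can be sketched in Python as follows. The sketch assumes the root file opens with a simple front-matter block carrying `name` and `description` fields; the supplied material does not spell out that layout, and the example Skill, the parser and the catalog format are all hypothetical.

```python
# Hypothetical root file: a short trigger description, then the instructions.
EXAMPLE_SKILL_MD = """\
---
name: verify-checkout
description: Use after changing checkout code to confirm the flow still works.
---
Steps:
1. Run scripts/verify.py
2. If it fails, read references/known-failures.md
"""


def parse_front_matter(text: str) -> dict[str, str]:
    """Parse simple 'key: value' lines between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front-matter block
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def catalog_line(skill_md: str) -> str:
    """Build the one-line summary a model could scan to decide relevance."""
    meta = parse_front_matter(skill_md)
    return f"{meta.get('name', '(unnamed)')}: {meta.get('description', '')}"


if __name__ == "__main__":
    print(catalog_line(EXAMPLE_SKILL_MD))
```

Only the one-line summary needs to sit in front of the model by default; the steps and anything they point to stay on disk until the Skill is judged relevant.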
The analysis also highlights several design lessons attributed to Anthropic: describe the Skill for the model, avoid obvious instructions, use scripts where possible, add on-demand guardrails, let the Skill keep a record where appropriate and allow the agent room to adapt to the task.
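One of those lessons, letting a Skill keep a record, can be sketched as an append-only notes file. This is an assumed design, not one described in the source: the file name gotchas.jsonl, the JSON Lines format and both function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def record_gotcha(memory_file: Path, note: str, date: str) -> None:
    """Append one dated note as a JSON line; earlier notes are left untouched."""
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"date": date, "note": note}) + "\n")


def read_gotchas(memory_file: Path) -> list[dict]:
    """Return every recorded note, oldest first; an absent file means no notes."""
    if not memory_file.exists():
        return []
    with memory_file.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        memory = Path(tmp) / "gotchas.jsonl"  # hypothetical file name
        record_gotcha(memory, "Staging DB needs the VPN", "2026-06-03")
        record_gotcha(memory, "Run migrations before seeding", "2026-06-04")
        for entry in read_gotchas(memory):
            print(entry["date"], entry["note"])
```

An append-only file keeps the history reviewable in version control, which fits the article’s emphasis on curation over accumulation.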
“Lessons from building Claude Code: How we use skills”
— Thariq Shihipar, Anthropic Claude Code engineer

Evidence Outside Anthropic Is Limited
Several details remain unclear from the supplied material. Anthropic’s post is described as based on hundreds of internal Skills, but the material does not provide the full evaluation setup, sample size by category or numerical quality gains.
It is also unclear how easily the same approach transfers to smaller teams, companies with less mature documentation or organizations using different coding agents. The source material flags that best practices are still changing and that checked-in Skills can add context cost, meaning more material for the model to consider.
The strongest confirmed takeaway is architectural: Anthropic treats Skills as folders with instructions and tools. The broader claim that Skills become an appreciating organizational asset is an interpretation in the supplied analysis, not a settled industry result.

Teams Start With Verification Skills
The next practical step suggested by the source material is narrow adoption: start with one Skill, capture one known failure mode and focus first on the category that catches mistakes. Based on Anthropic’s reported finding, that points many teams toward verification Skills before broader libraries.
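As a starting point, a first verification Skill could wrap a handful of named checks in one script. The Python sketch below is illustrative only: the check names and the sample configuration are invented, and the source does not describe how Anthropic’s own verification Skills are built.

```python
from typing import Callable

# A check is a name plus a zero-argument function returning True on success.
Check = tuple[str, Callable[[], bool]]


def run_checks(checks: list[Check]) -> list[str]:
    """Run every check and return the names of the ones that failed."""
    failures = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False  # a crashing check counts as a failure
        if not ok:
            failures.append(name)
    return failures


if __name__ == "__main__":
    config = {"owner": "platform-team", "port": 8080}  # hypothetical input
    failures = run_checks([
        ("config names an owner", lambda: bool(config.get("owner"))),
        ("port is in range", lambda: 0 < config["port"] < 65536),
        ("timeout is set", lambda: config["timeout"] > 0),  # raises KeyError
    ])
    print("failed checks:", failures)
```

A real script would likely exit with a non-zero status when the list is non-empty, so the agent gets an unambiguous signal that the work is not finished.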
For Anthropic, the next signal to watch is whether it publishes more detailed measurements, examples or customer guidance for Skills. For engineering teams, the near-term test is whether a Skill library can stay curated, current and useful rather than becoming another neglected documentation store.
Key Questions
What did Anthropic publish about Claude Code Skills?
Anthropic published a June 3, 2026 Claude Code post describing lessons from using hundreds of Skills across its engineering organization. The post frames Skills as reusable folders that can include instructions, scripts, references and supporting files.
Why is a Skill described as a folder rather than a prompt?
The supplied material says a Skill can contain more than markdown instructions. It may include runnable scripts, templates, configuration, references, hooks and memory, allowing an agent to read and run parts of the folder as needed.
Which type of Skill had the biggest reported effect?
According to the source material, Anthropic’s own measurement found that verification Skills had the biggest effect on output quality. The material does not include the full measurement details.
What does this mean for companies using AI coding agents?
The main implication is that teams may get more consistent results by turning repeat instructions, review steps and internal practices into shared, versioned Skills. That could make agent behavior less dependent on one-off prompts.
What remains unproven about Anthropic’s Skill approach?
It is not yet clear how broadly Anthropic’s results apply outside its own engineering organization. The supplied material does not provide full benchmark data, and it notes that best practices are still changing.
Source: Thorsten Meyer AI