Jun 2026

AI Skills Are a Cottage Industry - And That Might Be Permanent


Everyone in my organisation is building their own AI skills. Prompts for summarising documents, triaging emails, drafting updates, analysing data. Each one slightly different. Each one optimised for the person who built it.

The obvious reaction is: we need to standardise this. Create a shared library. Avoid duplication. But I'm not sure that instinct is right - or even achievable.


The Python library analogy, and where it breaks

At first it looks a lot like the early days of Python. A massive proliferation of libraries, everyone solving the same problems slightly differently, and only gradually a market emerging where you could tell the mature, trustworthy packages from the abandoned proof-of-concepts.

And the numbers are striking. PyPI grew from 60,000 packages in 2015 to over 500,000 a decade later - today it lists more than 800,000. Most of them built by single individuals, not organisations. A huge long tail of abandoned or barely-maintained work, and a tiny fraction dominating actual usage. The ecosystem became navigable - but it never converged to a clean, curated set of canonical packages. The market never fully cleared.

But there's a critical difference. Writing a Python library required real technical depth. You had to understand APIs, packaging, dependency management. That friction was also a filter - it meant most libraries were written by people with enough rigour to make something coherent.

With AI skills, that bar is gone. Anyone who understands a problem and knows how to talk to an LLM can build a skill. That's genuinely exciting - domain experts can now build tools for their own domains without needing a developer. But it also means the signal-to-noise ratio will be far worse than PyPI ever was. The Custom GPT store gives a glimpse of what's coming: over 3 million custom GPTs were created within two months of launch, yet only around 160,000 were ever made public. The rest built privately, used once, or quietly abandoned. And unlike code, there's no agreed standard for what a good prompt even looks like - let alone how to evaluate one.


Trust, not quality

Here's what I think changes most fundamentally. With code, you can inspect quality. Read the source, check the tests, look at the issue tracker. With a skill, the source is a prompt - and even if you read it, there's no coding standard for prompts, no way to formally verify correctness, and the outputs are probabilistic anyway. The same skill can give you something sharp on Monday and something subtly off on Friday.
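To make the Monday/Friday point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `run_skill` is a hypothetical stand-in for a real LLM call, and the 80% figure is invented, not measured. It only shows the shape of the problem: one run tells you almost nothing, and many runs give you a rate, not a verdict.

```python
import random


def run_skill(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call.

    The prompt is ignored here; we simply assume the skill produces a
    good answer about 80% of the time (an invented figure).
    """
    return "sharp" if rng.random() < 0.8 else "subtly off"


def pass_rate(prompt: str, runs: int, seed: int) -> float:
    """Run the same skill repeatedly and return the share of good outputs."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    outputs = [run_skill(prompt, rng) for _ in range(runs)]
    return outputs.count("sharp") / runs


if __name__ == "__main__":
    prompt = "Summarise this document in three bullet points."
    # A single run can only ever report 0.0 or 1.0.
    print("one run:   ", pass_rate(prompt, runs=1, seed=1))
    # Many runs estimate a rate, which is still not a proof of correctness.
    print("1,000 runs:", pass_rate(prompt, runs=1000, seed=1))
```

With code, a passing test is a statement about the artifact. Here the best you get is a statement about a sample of its behaviour.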

So you can't really evaluate a skill. You can only trust it - or not.

Trust built through repeated use, word of mouth, and the credibility of whoever built it. That's a very different relationship to tooling. It's closer to how you trust a colleague than how you trust a library.

This connects to something deeper. Michael Polanyi observed that "we can know more than we can tell" - that the most valuable knowledge is often the hardest to articulate. Skills are a kind of externalised tacit knowledge. Every refinement you make, every edge case you quietly handle, every implicit assumption about your workflow - it all folds into the prompt in ways you probably couldn't fully explain. And that kind of knowledge doesn't transfer cleanly. Hand someone the skill and you're not handing them the context that shaped it.


Skills resist generalisation

There's another dynamic at play. Skills are so malleable that you can refine them endlessly for your own use case - ask the LLM to adjust them, specialise them further, fold in your own context. The result is something exquisitely fitted to your niche. And unlike extending a Python library, you don't need to understand how it was constructed to modify it.

But that personalisation is also what makes skills hard to share.

This is niche construction in the biological sense - individuals adapting their local environment in ways that work for them, which is rational at the individual level, but which fragments the shared environment over time. With a skill, the tacit knowledge doesn't just surround the artifact. It lives inside it. The divergence compounds with every refinement.

So the ecosystem might not converge the way PyPI did. It might stay fragmented almost by nature - not because people aren't trying to share, but because the artifact resists generalisation.


The marketplace is forming - but the rules are different

There are early signs of infrastructure. GitHub repos cataloguing agent skills are accumulating stars. Dedicated marketplaces are emerging: SkillsMP alone indexes over 800,000 skills scraped from public GitHub repositories. Collections curated by practitioners who've earned credibility are appearing. Social proof is forming.

But look more closely and the PyPI parallel reasserts itself. SkillsMP's catalogue size is the most misleading metric in the category - much of it abandoned experiments, half-finished tests, and duplicates, with no quality filter beyond having at least two GitHub stars. And the stakes of getting it wrong are higher than with a badly written library. The first large-scale empirical study of skills in the wild analysed over 31,000 skills from major marketplaces and found that 26.1% contained at least one security vulnerability - spanning prompt injection, data exfiltration, and supply chain risks. Skills execute with implicit trust. There's no pip-audit equivalent. Anthropic's own documentation advises users to install only from trusted sources and thoroughly audit anything unverified.

That word again: trusted.

A starred skill repo isn't quite like a popular library. Forking a library produces something legible - another developer can read it, understand what changed, and evaluate the fork on its own terms. Forking a skill produces something that looks almost identical but behaves differently in ways that are hard to articulate. The non-determinism means two forks of the same skill can diverge silently, with no diff to inspect and no test suite to run. Stars travel with the original; trust doesn't automatically transfer to the fork.


The cottage industry might be permanent

The Python ecosystem partially converged - not through top-down standardisation, but through a few dominant packages pulling away on genuine merit, and experienced practitioners developing taste. But even there, the long tail never disappeared. PyPI is navigable, not clean.

I'm not sure the same convergence is coming for AI skills. The artifact itself resists generalisation. There's no deterministic behaviour to diff, no breaking change to announce, no changelog to read. Versioning a skill is a conceptual problem we haven't even started to solve - what does it mean for a skill to "change" if the output drifts rather than breaks?
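One way to frame the question, sketched in Python under loud assumptions: treat a skill version as a sample of outputs, and ask whether two versions differ from each other more than each version differs from itself. The function names and example outputs below are hypothetical, and character-level similarity from the standard library's `difflib` is a crude proxy for meaning. This is an illustration of the problem, not a proposed solution.

```python
from difflib import SequenceMatcher
from itertools import combinations, product
from statistics import mean


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a crude proxy for meaning."""
    return SequenceMatcher(None, a, b).ratio()


def within(outputs: list[str]) -> float:
    """Average similarity among runs of one version (its own noise level)."""
    return mean(similarity(a, b) for a, b in combinations(outputs, 2))


def between(old: list[str], new: list[str]) -> float:
    """Average similarity across runs of two different versions."""
    return mean(similarity(a, b) for a, b in product(old, new))


def drift(old: list[str], new: list[str]) -> float:
    """Positive when the versions differ more than either varies internally."""
    return min(within(old), within(new)) - between(old, new)


if __name__ == "__main__":
    # Invented outputs standing in for repeated runs of two skill versions.
    old_runs = ["Revenue rose 5% in Q2.", "Revenue rose 5% in Q2", "Q2 revenue rose 5%."]
    new_runs = ["Costs fell sharply last year.", "Costs fell sharply last year",
                "Last year costs fell sharply."]
    print("drift:", round(drift(old_runs, new_runs), 2))
```

Even in this toy form, the result is a number on a continuum. Deciding where "noise" ends and "a new version" begins is still a judgement call, which is exactly the conceptual gap.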

So the cottage industry might not be a phase. It might be the natural state of this ecosystem.

If that's true, the question isn't how to standardise skills eventually. It's how to function well when fragmentation is the baseline. How trust networks replace registries. How taste develops without inspection. How knowledge survives when it lives in the practitioner, not the artifact.

Cite this blog post:


Michele Pasin. AI Skills Are a Cottage Industry - And That Might Be Permanent. Blog post on www.michelepasin.org. Published on June 16, 2026.
