[epic] v2.7: refactor top-20 skills to fit Matt Pocock's 100-line SKILL.md ceiling #655

@alirezarezvani

Summary

The v2.6.1 audit (scripts/audit_skills.py) reveals 263 of 298 legacy SKILL.md files (88%) exceed Matt Pocock's 100-line ceiling — the dominant pattern violation across the repo.

Per v2.6.1's engineering/write-a-skill/skills/write-a-skill/references/quality_gates_for_skills.md, this rule is now formally advisory for legacy skills (binding only for new skills post-v2.6.0). But the long-term goal is to grow the PASS rate (currently 9/298 = 3%) by refactoring the highest-impact skills.

This issue tracks the planning + execution sweep for v2.7.

Background

| Audit metric | v2.6.0 baseline | v2.6.1 |
|---|---|---|
| PASS (6/6) | 4 (1%) | 9 (3%) |
| WARN (5/6) | 111 (37%) | 137 (46%) |
| FAIL (≤4/6) | 183 (61%) | 152 (51%) |
| "SKILL.md ≤ 100 lines" failures | 263 | 263 (unchanged) |

The other 5 of Matt's 6 rules have improved (description triggers, terminology, examples, time-sensitivity, references depth). Only the 100-line ceiling has not moved, because fixing it requires structural refactoring: splitting SKILL.md content into references/<topic>.md files per Matt's progressive disclosure pattern.

Scope (proposed)

Phase 1 — Prioritize (1 PR, low effort):

  • Identify top-20 high-traffic skills (most-installed via marketplace metrics OR most-mentioned in domain CLAUDE.md files)
  • Refactor only these 20 to ≤100 lines via references/ splits
  • Expected impact: PASS rate jumps from 9/298 (3%) to ~25-30/298 (8-10%)
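One way to approximate "most-mentioned in domain CLAUDE.md files" is to count whole-name occurrences of each skill across those files. The sketch below is a minimal illustration, not an existing script in this repo: `rank_by_mentions` and the sample skill name `code-reviewer` are hypothetical, and the caller is assumed to have already read the CLAUDE.md files into strings.

```python
import re
from collections import Counter

def rank_by_mentions(skill_names, claude_md_texts, top_n=20):
    """Rank skills by how often their names appear in CLAUDE.md texts.

    Hypothetical helper: names are matched as whole tokens, so "skill"
    does not match inside "write-a-skill".
    """
    counts = Counter()
    for name in skill_names:
        # Reject matches glued to a word character or hyphen on either side.
        pattern = re.compile(r"(?<![\w-])" + re.escape(name) + r"(?![\w-])")
        counts[name] = sum(len(pattern.findall(text)) for text in claude_md_texts)
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

if __name__ == "__main__":
    texts = ["Use write-a-skill and code-reviewer.", "code-reviewer again"]
    print(rank_by_mentions(["write-a-skill", "code-reviewer", "skill"], texts))
```

Mention counts are only a proxy for traffic; they would need a sanity check against whatever install data becomes available.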

Phase 2 — Domain sweep (3-4 PRs, medium effort):

  • Address the engineering-advanced tier first (40 skills; most cs-* commands depend on these)
  • Then the engineering-team tier (32 skills)
  • Then C-level, product, and marketing in parallel
  • Expected impact: PASS rate reaches ~50%

Phase 3 — Long tail (ongoing):

  • Remaining skills addressed when touched for other reasons (Boy Scout rule)

Per-skill refactoring pattern

For each over-100-line SKILL.md:

  1. Identify the largest section (often "Workflows" or "Use Cases")
  2. Move to references/<topic>.md
  3. Replace inline content in SKILL.md with: See [references/<topic>.md](references/<topic>.md) for X.
  4. Validate: python3 engineering/write-a-skill/skills/write-a-skill/scripts/skill_structure_validator.py <skill-folder>
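Steps 1 to 3 above could be sketched roughly as follows. This is an illustration under assumptions, not the repo's tooling: it treats every `## ` heading as a section boundary, derives the reference filename from the heading text, and works on strings rather than writing files. The function names `split_sections` and `extract_largest` are hypothetical.

```python
import re

HEADING = re.compile(r"^## +(.+?)\s*$")  # level-2 headings only

def split_sections(text):
    """Return (preamble_lines, [(title, lines), ...]) split at '## ' headings."""
    preamble, sections = [], []
    in_fence = False
    for line in text.splitlines():
        # Ignore headings that appear inside fenced code blocks.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            sections.append((match.group(1), [line]))
        elif sections:
            sections[-1][1].append(line)
        else:
            preamble.append(line)
    return preamble, sections

def extract_largest(text):
    """Move the longest section out of SKILL.md.

    Returns (new_skill_md, reference_path, reference_body); the last two
    are None when the file has no level-2 headings.
    """
    preamble, sections = split_sections(text)
    if not sections:
        return text, None, None
    idx = max(range(len(sections)), key=lambda i: len(sections[i][1]))
    title, lines = sections[idx]
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    ref_path = f"references/{slug}.md"
    # Keep the heading, replace the body with a pointer to the reference file.
    pointer = [lines[0], "", f"See [{ref_path}]({ref_path}) for {title.lower()}.", ""]
    sections[idx] = (title, pointer)
    out = preamble + [ln for _, sec in sections for ln in sec]
    return "\n".join(out) + "\n", ref_path, "\n".join(lines) + "\n"
```

A caller would write `reference_body` to `reference_path` inside the skill folder, overwrite SKILL.md with `new_skill_md`, and then run the validator from step 4. Repeating the call handles skills that are still over the ceiling after one split.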

Acceptance criteria for v2.7 release

  • At minimum: top-20 high-traffic skills refactored to ≤100 lines
  • PASS rate ≥ 25/298 (8.4%, up from 3%)
  • "SKILL.md ≤ 100 lines" failure rate ≤ 75% (down from 88%)
  • No regression on the 9 currently-PASS skills
  • Documentation: per-domain refactoring playbook in engineering/write-a-skill/
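The thresholds above can be checked against the audit counts with a little arithmetic. The sketch below uses only numbers stated in this issue; the variable and function names are illustrative. It suggests that refactoring exactly 20 skills would leave the ≤100-line failure rate near 82%, so the ≤75% target appears to need about 40 refactors unless that criterion is relaxed.

```python
import math

TOTAL_SKILLS = 298      # legacy SKILL.md files audited
OVER_LIMIT_NOW = 263    # files currently over 100 lines
TOP_N = 20              # Phase 1 refactor count

def pct(count, total=TOTAL_SKILLS):
    """Percentage of the audited skills, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Failure count if exactly the top-20 skills drop under the ceiling.
over_limit_after_top_n = OVER_LIMIT_NOW - TOP_N

# Largest failure count that still satisfies the 75% criterion.
max_allowed_failures = math.floor(0.75 * TOTAL_SKILLS)

# Refactors needed to get down to that count.
refactors_needed = OVER_LIMIT_NOW - max_allowed_failures

if __name__ == "__main__":
    print("current failure rate:", pct(OVER_LIMIT_NOW))
    print("after top-20 refactor:", pct(over_limit_after_top_n))
    print("refactors needed for <= 75%:", refactors_needed)
```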

Open questions

  1. Should we batch all 20 in one PR or one-per-skill PRs?
  2. How do we determine "high-traffic"? Marketplace install metrics aren't exposed. Could use ClawHub downloads if available, or domain-CLAUDE-md prominence as a proxy.
  3. Does the 100-line ceiling apply to the whole SKILL.md including frontmatter, or just the body?
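For question 3, the two interpretations can differ by several lines per file. The sketch below counts both, assuming frontmatter is a leading block delimited by `---` lines. It is not a description of how scripts/audit_skills.py counts, and `count_lines` is a hypothetical helper.

```python
def count_lines(text):
    """Return (total_lines, body_lines) for a SKILL.md string.

    body_lines excludes a leading YAML frontmatter block, including its
    two '---' delimiter lines. An unclosed block is treated as body.
    """
    lines = text.splitlines()
    total = len(lines)
    body_start = 0
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                body_start = i + 1
                break
    return total, total - body_start

if __name__ == "__main__":
    sample = "---\nname: demo\ndescription: x\n---\n# Demo\nbody\n"
    total, body = count_lines(sample)
    print(f"total={total} body={body}")
```

Running both counts over the repo would show how many skills sit between the two thresholds, which may help settle the question.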

Surfaced by

v2.6.1 ecosystem verification — Agent 5 (regression + quality evaluator), SC9.

🤖 Filed via Claude Code session