Case study · Team Enablement · 2026

Building five CS skills for a team that didn't have a standard.

I'm not a CSM. The skills had to bake in pattern-matching I don't have. A skill is a POV — a specific take on how a workflow should be done. The way through wasn't to pretend I was the SME — it was to render the team's distributed POV into a working v0, ship something specific enough to argue about, and design an intake step that lets each CSM's own take land on top of every run.

5 skills

Shipped end-to-end on one shared foundation

9 voice rules

Hard-coded into every SKILL.md after Phase 1 came back unreadable

0 specs

From CSMs first — the skills had to be the spec, then CSMs would say where they were wrong

5 skills assumes the foundation already exists. _cs_lib v0.1.5 took two days before any skill got written — plus the data_query MCP for warehouse access and live Lark base reads. The skill count is real; the foundation is the price of admission.

01 The setup

A team with no shared standard. Five workflows repeating every week.

Every CSM does the same five workflows differently. Pre-meeting prep. Monday morning data-foundation triage. The 30-day account snapshot. The 90-day account plan. The week-one kickoff brief for a newly-closed deal. Output quality tracks tenure, not effort. A senior CSM produces a brief a new CSM can't match in her first six months — and the gap is taste plus pattern recognition, neither of which transfers by Slack message.

The team's not-a-standard isn't a process gap. It's the structural state of a service that scaled faster than its rituals could codify. Every CSM holds a POV on what good looks like — but the POVs are distributed across heads, not yet concrete enough to argue about. Shipping these skills is an act of codification, not automation: the v0 is the POV made specific enough that the team can sharpen it, replace it, or split it into something better.

I'm not the SME on CS workflows. The CSMs are. What they don't have is a technical entry-point or the time on top of customer-facing work to render their POV into a working artifact — and a v0 is not actually in most operators' role description. My job was two things at once: the meta-skill (I am the SME on building a skill suite for a team I'm not on) and the scaffolding (a v0 that makes the distributed POVs collide and articulate). The skill ships at v0; the CSMs take it from v0 to v10.

"A skill is a POV. Its ceiling cannot exceed how well a human does the work without AI: the skill is a reflection of the expertise behind it, and for AI to work better, the domain depth has to go deeper."

The thesis I keep coming back to

02 What shipped

Five skills. One foundation.

A 244KB tarball. An agent-readable bootstrap. A brand-new-CSM install guide. Validated on real accounts before I shipped — including a Customer NL kickoff run that wrote one Account Status row and ten Key Contact records into live Lark on the first try.

Five skills, one shared foundation

Slash command	What it does	Status
/data-readiness	Book-level data foundation triage, ranked by severity	ship
/meeting-prep	Pre-meeting brief: state + recent context + open items + flags	ship
/account-snapshot	30-day state + open tickets + gaps + 14-day action plan	ship
/account-plan	90-day plan, canonical-six pillars, T-45 conversion check-in	ship
/kickoff-deck-builder	Newly-closed account kickoff brief, 5-layer fallback chain	ship

All five import from one foundation — _cs_lib v0.1.5. Queries, Lark reads, gated writers, scoring, attribution decoding all live in one place. Skills change shape; the foundation does not.
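The "gated writers" idea can be sketched roughly as follows. This is a minimal illustration, not the actual _cs_lib code: `GatedWriter`, `PendingWrite`, and the `confirm` flag are hypothetical names, and the in-memory `committed` list stands in for a live Lark base.

```python
from dataclasses import dataclass, field


@dataclass
class PendingWrite:
    """One proposed record, held until a human approves it."""
    table: str
    record: dict


@dataclass
class GatedWriter:
    """Hypothetical gated writer: stages writes, commits only on explicit confirm."""
    staged: list = field(default_factory=list)
    committed: list = field(default_factory=list)  # stand-in for the live base

    def stage(self, table: str, record: dict) -> PendingWrite:
        write = PendingWrite(table, record)
        self.staged.append(write)
        return write

    def preview(self) -> list:
        """Human-readable summary of what would be written."""
        return [f"{w.table}: {w.record}" for w in self.staged]

    def commit(self, confirm: bool = False) -> int:
        """Write staged records only when confirm is True; return the count written."""
        if not confirm:
            return 0  # gate closed: nothing touches the base
        written = len(self.staged)
        self.committed.extend(self.staged)
        self.staged.clear()
        return written
```

The point of the shape is that a skill can always call `stage` and `preview` safely, and the write itself is a separate, deliberate step.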

03 Methodology

Five principles for building a skill suite for a team you're not on.

I am not a CSM. The things that helped most weren't AI tricks — they were process choices about how to render the team's distributed POV into a working v0 while keeping the SME at the center of the loop.

Principle 01

A skill's ceiling is the human behind it.

AI does not compress expertise. It codifies a POV — and the POV is only as sharp as the person whose expertise it's drawn from. The gating question before writing any skill is whether you have a senior-quality version of the workflow in your head, or know someone who does and is willing to push back on the drafts. If you don't, the skill will come out competent and flat. The intake will read generic. The risk flags will sound rehearsed. The voice will land somewhere between "PM blog post" and "product release note" — and neither is what a CSM hands a customer.

The corollary is the uncomfortable one. For AI to do better work for you, your domain depth has to go deeper. The skill is the artifact; the artifact is the reflection of the POV. You can't shortcut taste with a longer prompt.

Principle 02

A skill is a POV. Ship the v0 the team can sharpen.

Every CSM on the team holds a POV on what good looks like for these workflows — but the POVs are distributed across heads and not concrete enough to argue about. 0 to 1 is the hardest move in any creative work, and a v0 sits outside most operators' role description anyway. Someone has to render the distributed POV into a working artifact so the team can sharpen it, replace it, or split it into something better.

That someone doesn't have to be the SME — but the POV they ship has to be specific enough to be wrong in interesting ways. A vague v0 generates vague feedback. A concrete v0 generates the kind of pushback that names what the team actually means. You have to throw something specific out before the team can move.

Below is the actual arc — v0 to v2 — and the CSM feedback that drove each step. The architecture pivot between v1 and v2 (keyword-search retired, Company-row-as-SoT introduced) is the moment the v0 POV got sharpened by someone who knew the work better than I did.

Iteration log: from v0 to architecture pivot

v0 — ship something that works on the easy account

Six skills built in parallel against the cleanest accounts. Output is technically correct. Looks plausible. Nothing has been read by an actual CSM yet.

Output contains lines like "last-2-weeks vs prior-6-weeks: model-A ↗ · model-B ↗ · model-C →." Theme numbers in risk flags. Internal column names everywhere. A CSM reads it and can't translate it for a client.

Principle 03

One skill per lego.

The bigger the boundary, the worse the output. If a workflow takes multiple judgment calls to complete, that's multiple skills, not one big one. A skill is sharp when it solves one thing — and writing the SKILL.md is what forces you to find out what that one thing actually is.

The moment I learned this: I scoped /book-overview as one skill that would tell a CSM what to work on this week across her whole book. Two days in, the SKILL.md was a Frankenstein — it composed data-readiness logic, account-snapshot logic, todos logic, and an opinion about how to weight them. It was four skills wearing a trench coat. The decomposition below is what landed.

The unsolved part. I still don't have a clean rule for skill size. Heuristics work: one workflow, one output, one writeback target, gated writes. The exact boundary is a per-skill judgment call I make case by case. /renewal-prep was the other direction of scope failure: too customer-specific to template at all. I dropped it. Sometimes the right move is killing a skill, not improving it.

The original /book-overview — one giant skill

Pull data readiness across the CSM's book

Pull account snapshots for top 5 at-risk accounts

Pull this week's todos for the CSM

Synthesize a priority order with an opinion

Render a kanban-style output

→ SKILL.md becomes a Frankenstein. Eval is impossible — what does "good" even mean for the priority order?

Decomposed — /book-overview composes four legos

/data-readiness — book-level foundation triage

/account-snapshot — per-account state (called for top N)

/todos — punch list for the next 3 days

/book-overview — orchestrator only: prioritize + render

→ Each lego is a skill a CSM could own. The orchestrator's opinion is its own thing, debatable on its own.
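The decomposition above can be sketched as an orchestrator that owns only the ordering and the rendering, with each lego passed in as a plain function. The signatures, the stub data, and the "worst severity first" rule are assumptions for illustration, not the shipped skill.

```python
from typing import Callable


def book_overview(
    book: list[str],
    readiness: Callable[[list[str]], dict[str, int]],
    snapshot: Callable[[str], str],
    todos: Callable[[list[str]], list[str]],
    top_n: int = 2,
) -> str:
    """Orchestrator only: call the legos, then prioritize and render."""
    severity = readiness(book)  # lego 1: book-level triage
    # The orchestrator's one opinion (assumed here): worst severity first.
    ranked = sorted(book, key=lambda account: severity[account], reverse=True)
    lines = [f"{a} [severity {severity[a]}] {snapshot(a)}" for a in ranked[:top_n]]
    lines += [f"todo: {item}" for item in todos(book)]
    return "\n".join(lines)


# Stub legos with made-up data, standing in for the real skills.
def stub_readiness(book):
    return {"Account A": 1, "Account B": 3, "Account C": 2}


def stub_snapshot(account):
    return f"snapshot for {account}"


def stub_todos(book):
    return ["confirm open tickets"]


report = book_overview(
    ["Account A", "Account B", "Account C"],
    stub_readiness,
    stub_snapshot,
    stub_todos,
)
```

Because the legos are injected, each one can be evaluated on its own, and the priority rule is the only thing left to argue about in the orchestrator.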

Principle 04

Intake is the spec.

Without an intake at runtime, you can't eval the output. You don't know what good looks like for this run — and every run is different. The intake captures the specific question the CSM is trying to answer this time. That question is the spec.

The intake also resolves the codification tension. The skill suite codifies a shared POV — the shared shell, the canonical pillars, the voice rules. The intake is where each CSM's individual POV lands on top of that shell. Same account, same skill, different intake, different surfaced pillars and milestones. The shell is shared; the take is individual.

If the CSM skips the intake, the output renders with a 🟡 signal-only pending intake marker and the intake prompts surface as the first section. The skill still produces useful analysis — the intake just sharpens which gaps and milestones come forward.
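A minimal sketch of that skip-the-intake behavior, assuming the brief is rendered as plain text. `render_brief` and `INTAKE_PROMPTS` are hypothetical names, and the prompt wording is invented for the example.

```python
from typing import Optional

PENDING_MARKER = "🟡 signal-only pending intake"

INTAKE_PROMPTS = [  # hypothetical prompt, for illustration only
    "What question is the client trying to answer right now?",
]


def render_brief(analysis: str, intake: Optional[str] = None) -> str:
    """Render a brief; without an intake, flag it and lead with the prompts."""
    if intake and intake.strip():
        return f"Intake: {intake.strip()}\n\n{analysis}"
    prompts = "\n".join(f"- {p}" for p in INTAKE_PROMPTS)
    # No intake: the analysis still renders, but after the marker and prompts.
    return f"{PENDING_MARKER}\n\nIntake prompts\n{prompts}\n\n{analysis}"
```

Either way the analysis is produced; the marker only tells the reader that nothing has steered it yet.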

Same account. Same skill. Two intakes.

CSM intake

"Client wants to know whether to cut a major channel from spend allocation given the recent test result."

Surfaced pillars (selected from canonical six)

Adopt incrementality measurement	selected
Diversify media mix	selected
Operationalize attribution decisions	deprioritized

Milestones the plan emphasizes

● Wk 1: Walk the channel-vs-attribution-model delta; show why directional last-click reads overstate the channel
● Wk 3: Propose a 15% reallocation test scoped to a single geo
● Wk 6: Read out the geo test; decide channel mix for next quarter
● Wk 9: Quarterly check on incrementality assumptions

Same skill. Same account. The shared shell is identical — current state, foundation, recent lift results, Key Contact amendments. The intake decides which of the canonical pillars get pulled forward and which milestones become non-negotiable. That's where the CSM's judgment lives.
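One way to keep the surfaced pillars inside the canonical six is a hard check on whatever the model returns. This is a sketch under assumptions: `check_pillars` is a hypothetical helper, and only three pillar names appear in this write-up, so the other three entries are placeholders.

```python
CANONICAL_PILLARS = frozenset({
    "Adopt incrementality measurement",
    "Diversify media mix",
    "Operationalize attribution decisions",
    # The remaining three pillars are not named in this write-up;
    # these placeholders only keep the set at six.
    "<pillar 4>",
    "<pillar 5>",
    "<pillar 6>",
})


def check_pillars(surfaced: list[str]) -> list[str]:
    """Return the surfaced pillars unchanged, or raise if any is not canonical."""
    invented = [p for p in surfaced if p not in CANONICAL_PILLARS]
    if invented:
        raise ValueError(f"non-canonical pillar(s): {invented}")
    return surfaced
```

The intake still chooses which pillars come forward; the check only guarantees the choice is made from the shared list.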

Principle 05

Dry-run on the hardest real account the SME has.

Synthetic data passes. Generic test accounts pass. The bugs hide where the SME knows the ground truth so completely she can spot the gap instantly. The hardest real account is the only useful test.

The Customer KP bug from Principle 02 is the canonical scene: four open tickets, two missing "Customer KP" in the title, my keyword search returning zero. Synthetic data with on-the-nose titles would never have surfaced that. The CSM saw it in five seconds. The architecture pivoted that same afternoon.
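A simplified illustration of why title keyword search under-counts and a company-row link does not. The tickets, the `company_id` field, and both function names are made up; this shows the mechanism only and does not reproduce the actual failure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    title: str
    company_id: str  # link to the company row; field name is an assumption
    is_open: bool


TICKETS = [  # made-up tickets
    Ticket("Customer KP: dashboard totals off", "kp", True),
    Ticket("Customer KP: export failing", "kp", True),
    Ticket("UTM tags dropped on import", "kp", True),
    Ticket("Attribution window question", "kp", True),
    Ticket("Customer BL: login loop", "bl", True),
]


def open_tickets_by_keyword(tickets, name):
    """Fragile: depends on whoever filed the ticket typing the name in the title."""
    return [t for t in tickets if t.is_open and name.lower() in t.title.lower()]


def open_tickets_by_company(tickets, company_id):
    """Sturdier: resolve the company row once, then follow the link."""
    return [t for t in tickets if t.is_open and t.company_id == company_id]
```

With this data the keyword search finds two of the four open tickets for the account, while the company-row lookup finds all four.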

Below is the validation log — every skill, the real account I shipped against, and the specific edge case that account surfaced. Each one was a bug that wouldn't have shown up on synthetic data.

Real accounts, real bugs surfaced

Skill	Validated against	What real data surfaced
/data-readiness	Customer BL	Phase 1 surfaced an old ad-source disconnect as the urgent issue. Phase 2 (live unified view) showed a more recent UTM-tagging issue had taken its place. The snapshot was 30 days stale; the live view was correct.
/meeting-prep	Multiple meetings on a senior CSM's calendar	The first pass framed open platform-side items as "TODO." CSM read it and said the action might already be closed — switched every item to "status unknown — confirm if still relevant."
/account-snapshot	Customer KP	Four open CS tickets returned zero. Two titles didn't contain "Customer KP." Keyword search retired. Company-row-as-SoT introduced.
/account-plan	Customer MR	Phase 1 invented a pillar name not in the canonical six. Phase 2 hard-coded the canonical six in the prompt. ASSERT enforced — the LLM cannot make a new one up.
/kickoff-deck-builder	Customer NL (newly closed)	Layer 1 (sales-meeting transcript) was empty. Layer 2 (CRM raw via warehouse sync) carried the build with a dozen contacts and a month of email engagement. The fallback chain is the skill — without it, the brief would have been a manual stub.

Synthetic data would have passed every one of these. Real accounts where the SME knows the ground truth caught all five. Customer codes are anonymized; the validation work and the bugs caught are faithful to the actual builds.
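The fallback chain behind /kickoff-deck-builder can be sketched as an ordered list of sources where the first one with data wins. `first_usable_layer` is a hypothetical name, the fetchers are stubs with made-up return values, and only the first two layer names come from the validation log; the remaining layers are left as a placeholder.

```python
from typing import Callable, Optional

Layer = tuple[str, Callable[[str], Optional[dict]]]


def first_usable_layer(account: str, layers: list[Layer]) -> tuple[str, dict]:
    """Try each source in order; return the first one that yields data."""
    for name, fetch in layers:
        data = fetch(account)
        if data:  # None or an empty dict means "fall through to the next layer"
            return name, data
    return "manual stub", {}


layers = [
    ("sales-meeting transcript", lambda account: None),  # empty in this example
    ("CRM raw via warehouse sync", lambda account: {"contacts": 12}),
    ("hypothetical layer 3", lambda account: {"contacts": 1}),
]
source, data = first_usable_layer("Customer NL", layers)
```

Returning the layer name alongside the data lets the brief say which source it was built from.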

04 The honest version

Where this isn't settled.

Four things I'm not sure about, named before a reader points them out.

What this is

A v0 of a POV — rendered concrete by a non-SME so the team's distributed POV can sharpen it. The CSM-led iteration from v0 to v10 is the actual finish line, not the ship.
Proof that the bottleneck in skill-building isn't technical. It's whether someone is willing to render a distributed POV into something specific enough to argue about.
A reusable methodology: shared foundation first, intake step always, dry-run on the hardest real account, gated writebacks, banned-string greps.
An act of codification — a POV made concrete. The skill suite says what a Monday morning data triage looks like, what a 30-day snapshot looks like. The CSMs who push back are sharpening the POV — that's the system working.
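The "banned-string greps" in the methodology can be approximated with a small scan over rendered output. The three patterns below are hypothetical examples modeled on the v0 complaints (internal column names, raw trend arrows, unconfirmed TODOs), not the actual rule set.

```python
import re

BANNED_PATTERNS = {  # hypothetical rules, for illustration only
    "internal column name": re.compile(r"\b[a-z]+(?:_[a-z0-9]+)+\b"),
    "unconfirmed TODO": re.compile(r"\bTODO\b"),
    "trend arrow": re.compile(r"[↗→↘]"),
}


def find_banned(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, rule name, matched text) for every banned hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in BANNED_PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((lineno, rule, match.group()))
    return hits
```

An empty result means the draft is clear of these patterns; any hit names the line and the rule, which makes the failure easy to fix or to argue with.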

What this isn't

Not "SMEs shouldn't build skills." I still believe the team that executes the work is the right long-term builder. A v0 from a non-SME is a starter that frees the SME to iterate on the POV, not a replacement.
Not a settled answer on skill size. /book-overview was too big; /renewal-prep was too customer-specific to template. The exact boundary is a judgment call I make per-skill, not a principle I can defend.
Not voice-neutral. Every skill is a POV with a narrow band of acceptable output. A CSM whose natural style sits outside that band should fork the SKILL.md and ship her own POV — not wait for me to widen mine.
Not done. The interesting metric is which skills get re-invoked daily by the same CSM. That tells me which POVs the team actually agreed with. I'll know in three weeks.

A skill is a POV the team can argue with. The fastest way to find out what's wrong with the POV is to ship the v0. The fastest way to find out what's wrong with v1 is to ship the v0 in the first place.