mcp tuning
@@ -4,7 +4,7 @@ description: Generate a custom checklist for the current feature based on user r

 ## Checklist Purpose: "Unit Tests for English"

-**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
+**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, completeness, and decision-memory readiness of requirements in a given domain.

 **NOT for verification/testing**:

@@ -20,6 +20,7 @@ description: Generate a custom checklist for the current feature based on user r
 - ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
 - ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
 - ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
+- ✅ "Do repo-shaping choices have explicit rationale and rejected alternatives before task decomposition?" (decision memory)

 **Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.

@@ -47,7 +48,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
 2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
 3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
-4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
+4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria, decision-memory needs.
 5. Formulate questions chosen from these archetypes:
 - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
 - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
@@ -55,6 +56,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
 - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
 - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
+- Decision-memory gap (e.g., "Do we need explicit ADR and rejected-path checks for this feature?")

 Question formatting rules:
 - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
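
Steps 1 and 2 of the prompt (extract signals, then cluster them into at most four ranked focus areas) can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the `FOCUS_KEYWORDS` map, the area names, and the function `rank_focus_areas` are hypothetical and do not appear in the prompt, which leaves the clustering to the model's judgment rather than to keyword counts.

```python
import re

# Hypothetical keyword map: the focus-area names and groupings are assumptions
# for illustration; the keywords themselves are taken from the prompt's examples.
FOCUS_KEYWORDS = {
    "security": {"auth", "compliance", "security"},
    "performance": {"latency", "performance"},
    "ux": {"ux", "a11y", "accessibility"},
    "api": {"api", "contracts"},
    "resilience": {"rollback", "recovery"},
}


def rank_focus_areas(user_input, max_areas=4):
    """Rank candidate focus areas by keyword hits and keep at most max_areas."""
    tokens = re.findall(r"[a-z0-9]+", user_input.lower())
    hits = {}
    for area, keywords in FOCUS_KEYWORDS.items():
        count = sum(1 for token in tokens if token in keywords)
        if count:
            hits[area] = count
    # sorted() is stable, so areas with equal counts keep the map's order.
    ranked = sorted(hits.items(), key=lambda pair: -pair[1])
    return [area for area, _ in ranked[:max_areas]]


request = (
    "Critical auth flow: security team must review latency "
    "and rollback of the API contracts"
)
focus_areas = rank_focus_areas(request)
```

Counting keyword hits is the crudest possible relevance measure; it is used here only to make the "max 4, ranked by relevance" constraint concrete.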
@@ -76,9 +78,10 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Infer any missing context from spec/plan/tasks (do NOT hallucinate)

 4. **Load feature context**: Read from FEATURE_DIR:
-- spec.md: Feature requirements and scope
-- plan.md (if exists): Technical details, dependencies
-- tasks.md (if exists): Implementation tasks
+- `spec.md`: Feature requirements and scope
+- `plan.md` (if exists): Technical details, dependencies, ADR references
+- `tasks.md` (if exists): Implementation tasks and inherited guardrails
+- ADR artifacts (if present): `[DEF:id:ADR]`, `@RATIONALE`, `@REJECTED`

 **Context Loading Strategy**:
 - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
@@ -102,6 +105,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - **Consistency**: Do requirements align with each other?
 - **Measurability**: Can requirements be objectively verified?
 - **Coverage**: Are all scenarios/edge cases addressed?
+- **Decision Memory**: Are durable choices and rejected alternatives explicit before implementation starts?

 **Category Structure** - Group items by requirement quality dimensions:
 - **Requirement Completeness** (Are all necessary requirements documented?)
@@ -112,6 +116,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - **Edge Case Coverage** (Are boundary conditions defined?)
 - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
 - **Dependencies & Assumptions** (Are they documented and validated?)
+- **Decision Memory & ADRs** (Are architectural choices, rationale, and rejected paths explicit?)
 - **Ambiguities & Conflicts** (What needs clarification?)

 **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
@@ -127,8 +132,8 @@ You **MUST** consider the user input before proceeding (if not empty).
 - "Are hover state requirements consistent across all interactive elements?" [Consistency]
 - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
 - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
-- "Are loading states defined for asynchronous episode data?" [Completeness]
-- "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
+- "Are blocking architecture decisions recorded with explicit rationale and rejected alternatives before task generation?" [Decision Memory]
+- "Does the plan make clear which implementation shortcuts are forbidden for this feature?" [Decision Memory, Gap]

 **ITEM STRUCTURE**:
 Each item should follow this pattern:
@@ -163,6 +168,11 @@ You **MUST** consider the user input before proceeding (if not empty).
 - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
 - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"

+Decision Memory:
+- "Do all repo-shaping technical choices have explicit rationale before tasks are generated? [Decision Memory, Plan]"
+- "Are rejected alternatives documented for architectural branches that would materially change implementation scope? [Decision Memory, Gap]"
+- "Can a coder determine from the planning artifacts which tempting shortcut is forbidden? [Decision Memory, Clarity]"
+
 **Scenario Classification & Coverage** (Requirements Quality Focus):
 - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
 - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
@@ -171,7 +181,7 @@ You **MUST** consider the user input before proceeding (if not empty).

 **Traceability Requirements**:
 - MINIMUM: ≥80% of items MUST include at least one traceability reference
-- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
+- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`, `[ADR]`
 - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
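
The ≥80% traceability minimum above lends itself to a mechanical check. The following Python sketch is one possible way to measure it, under assumptions: the regular expression `TRACE_PATTERN` and the helpers `traceability_ratio` and `meets_minimum` are hypothetical names, not part of the prompt or of any Spec Kit tooling. It treats an item as traceable when a bracketed tag contains a `Spec §…` reference or one of the markers listed in the updated rule.

```python
import re

# Assumed pattern: a bracketed tag that contains a spec reference or one of
# the markers named in the traceability rule (including the new [ADR]).
TRACE_PATTERN = re.compile(
    r"\[[^\]]*\b(?:Spec §[\w.\-]+|Gap|Ambiguity|Conflict|Assumption|ADR)\b[^\]]*\]"
)


def traceability_ratio(items):
    """Return the fraction of items with at least one traceability reference."""
    if not items:
        return 0.0
    traced = sum(1 for item in items if TRACE_PATTERN.search(item))
    return traced / len(items)


def meets_minimum(items, threshold=0.8):
    """Check the ratio against the minimum stated in the prompt (80%)."""
    return traceability_ratio(items) >= threshold


sample_items = [
    "Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]",
    'Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]',
    "Are loading state requirements defined for asynchronous episode data? [Gap]",
    "Are ADR decisions traceable to requirements or constraints in the spec? [Traceability, ADR]",
    "Are hover state requirements consistent across all interactive elements? [Consistency]",
]
ratio = traceability_ratio(sample_items)
```

Here four of the five sample items carry a reference, so the set sits exactly at the threshold; a quality-dimension tag such as `[Consistency]` alone does not count as traceability in this sketch.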

 **Surface & Resolve Issues** (Requirements Quality Problems):
@@ -181,6 +191,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
 - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
 - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
+- Decision-memory drift: "Do tasks inherit the same rejected-path guardrails defined in planning? [Decision Memory, Conflict]"

 **Content Consolidation**:
 - Soft cap: If raw candidate items > 40, prioritize by risk/impact
@@ -193,7 +204,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - ❌ "Displays correctly", "works properly", "functions as expected"
 - ❌ "Click", "navigate", "render", "load", "execute"
 - ❌ Test cases, test plans, QA procedures
-- ❌ Implementation details (frameworks, APIs, algorithms)
+- ❌ Implementation details (frameworks, APIs, algorithms) unless the checklist is asking whether those decisions were explicitly documented and bounded by rationale/rejected alternatives

 **✅ REQUIRED PATTERNS** - These test requirements quality:
 - ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
@@ -202,6 +213,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - ✅ "Can [requirement] be objectively measured/verified?"
 - ✅ "Are [edge cases/scenarios] addressed in requirements?"
 - ✅ "Does the spec define [missing aspect]?"
+- ✅ "Does the plan record why [accepted path] was chosen and why [rejected path] is forbidden?"

 6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.

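
The fallback layout in step 6 (H1 title, purpose/created meta lines, `##` category sections, and `CHK###` IDs that increment globally from CHK001) can be sketched in Python as follows. This is a minimal illustration, not the canonical template: the function name `render_checklist`, its parameters, and the exact `**Purpose**` / `**Created**` label formatting are assumptions, since the prompt does not spell out how the meta lines are written.

```python
from datetime import date


def render_checklist(title, purpose, categories, created=None):
    """Render the fallback checklist layout described in step 6.

    categories maps a category heading to its list of requirement items.
    The meta-line labels below are an assumed formatting.
    """
    created = created or date.today().isoformat()
    lines = [f"# {title}", "", f"**Purpose**: {purpose}", f"**Created**: {created}", ""]
    next_id = 1  # IDs increment globally, not per category
    for heading, items in categories.items():
        lines.append(f"## {heading}")
        lines.append("")
        for item in items:
            lines.append(f"- [ ] CHK{next_id:03d} {item}")
            next_id += 1
        lines.append("")
    return "\n".join(lines)


checklist = render_checklist(
    "Architecture Decision Quality",
    "Validate decision-memory readiness of the plan",
    {
        "Requirement Completeness": [
            "Are loading state requirements defined for asynchronous episode data? [Gap]",
        ],
        "Decision Memory & ADRs": [
            "Are rejected alternatives documented for each blocking technology branch? [Decision Memory, Gap]",
            "Are ADR decisions traceable to requirements or constraints in the spec? [Traceability, ADR]",
        ],
    },
    created="2024-01-01",
)
```

Keeping a single counter outside the category loop is what makes the IDs global: the second category continues at CHK002 instead of restarting at CHK001.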
@@ -210,6 +222,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Depth level
 - Actor/timing
 - Any explicit user-specified must-have items incorporated
+- Whether ADR / decision-memory checks were included

 **Important**: Each `/speckit.checklist` command invocation creates a checklist file using short, descriptive names unless file already exists. This allows:

@@ -262,6 +275,15 @@ Sample items:
 - "Are security requirements consistent with compliance obligations? [Consistency]"
 - "Are security failure/breach response requirements defined? [Gap, Exception Flow]"

+**Architecture Decision Quality:** `architecture.md`
+
+Sample items:
+
+- "Do all repo-shaping architecture choices have explicit rationale before tasks are generated? [Decision Memory]"
+- "Are rejected alternatives documented for each blocking technology branch? [Decision Memory, Gap]"
+- "Can an implementer tell which shortcuts are forbidden without re-reading research artifacts? [Clarity, ADR]"
+- "Are ADR decisions traceable to requirements or constraints in the spec? [Traceability, ADR]"
+
 ## Anti-Examples: What NOT To Do

 **❌ WRONG - These test implementation, not requirements:**
@@ -282,6 +304,7 @@ Sample items:
 - [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
 - [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
 - [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
+- [ ] CHK007 - Do planning artifacts state why the accepted architecture was chosen and which alternative is rejected? [Decision Memory, ADR]
 ```

 **Key Differences:**