mcp tuning
@@ -12,7 +12,7 @@ You **MUST** consider the user input before proceeding (if not empty).

## Goal

-Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
+Identify inconsistencies, duplications, ambiguities, underspecified items, and decision-memory drift across the core artifacts (`spec.md`, `plan.md`, `tasks.md`, and ADR sources) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.

## Operating Constraints

@@ -29,6 +29,7 @@ Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --inclu
- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md
+- ADR = `docs/architecture.md` and/or feature-local decision files when present

Abort with an error message if any required file is missing (instruct the user to run the missing prerequisite command).
For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
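The escaping rule above can be sketched in a short shell session (values are illustrative):

```shell
# Passing an argument that contains a single quote.
# 'I'\''m Groot' = close the quote, insert an escaped literal ', reopen the quote.
arg='I'\''m Groot'
printf '%s\n' "$arg"   # prints: I'm Groot

# Double-quoting avoids the escape dance entirely.
alt="I'm Groot"
printf '%s\n' "$alt"   # prints: I'm Groot
```

Both forms produce the identical argument; prefer double quotes when the value contains no `$`, backtick, or `\`.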
@@ -37,7 +38,7 @@ For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot

Load only the minimal necessary context from each artifact:

-**From spec.md:**
+**From `spec.md`:**

- Overview/Context
- Functional Requirements
@@ -45,20 +46,29 @@ Load only the minimal necessary context from each artifact:
- User Stories
- Edge Cases (if present)

-**From plan.md:**
+**From `plan.md`:**

- Architecture/stack choices
- Data Model references
- Phases
- Technical constraints
+- ADR references or emitted decisions

-**From tasks.md:**
+**From `tasks.md`:**

- Task IDs
- Descriptions
- Phase grouping
- Parallel markers [P]
- Referenced file paths
+- Guardrail summaries derived from `@RATIONALE` / `@REJECTED`
+
+**From ADR sources:**
+
+- `[DEF:id:ADR]` nodes
+- `@RATIONALE`
+- `@REJECTED`
+- `@RELATION`

**From constitution:**

@@ -73,6 +83,7 @@ Create internal representations (do not include raw artifacts in output):
- **User story/action inventory**: Discrete user actions with acceptance criteria
- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
+- **Decision-memory inventory**: ADR ids, accepted paths, rejected paths, and the tasks/contracts expected to inherit them

### 4. Detection Passes (Token-Efficient Analysis)

@@ -112,13 +123,21 @@ Focus on high-signal findings. Limit to 50 findings total; aggregate remainder i
- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without a dependency note)
- Conflicting requirements (e.g., one requires Next.js while another specifies Vue)

+#### G. Decision-Memory Drift
+
+- ADR exists in planning but has no downstream task guardrail
+- Task carries a guardrail with no upstream ADR or plan rationale
+- Task text accidentally schedules an ADR-rejected path
+- Missing preventive `@RATIONALE` / `@REJECTED` summaries for known traps
+- Rejected-path notes that contradict later plan or task language without explicit decision revision

### 5. Severity Assignment

Use this heuristic to prioritize findings:

-- **CRITICAL**: Violates constitution MUST, missing core spec artifact, or requirement with zero coverage that blocks baseline functionality
-- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
-- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
+- **CRITICAL**: Violates constitution MUST, missing core spec artifact, missing blocking ADR, rejected path scheduled as work, or requirement with zero coverage that blocks baseline functionality
+- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion, ADR guardrail drift
+- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case, incomplete decision-memory propagation
- **LOW**: Style/wording improvements, minor redundancy not affecting execution order

### 6. Produce Compact Analysis Report
@@ -138,6 +157,11 @@ Output a Markdown report (no file writes) with the following structure:
| Requirement Key | Has Task? | Task IDs | Notes |
|-----------------|-----------|----------|-------|

+**Decision Memory Summary Table:**
+
+| ADR / Guardrail | Present in Plan | Propagated to Tasks | Rejected Path Protected | Notes |
+|-----------------|-----------------|---------------------|-------------------------|-------|
+
**Constitution Alignment Issues:** (if any)

**Unmapped Tasks:** (if any)
@@ -150,6 +174,8 @@ Output a Markdown report (no file writes) with the following structure:
- Ambiguity Count
- Duplication Count
- Critical Issues Count
+- ADR Count
+- Guardrail Drift Count

### 7. Provide Next Actions

@@ -179,6 +205,7 @@ Ask the user: "Would you like me to suggest concrete remediation edits for the t
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)
+- **Treat missing ADR propagation as a real defect, not a documentation nit**

## Context

@@ -4,7 +4,7 @@ description: Generate a custom checklist for the current feature based on user r

## Checklist Purpose: "Unit Tests for English"

-**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
+**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, completeness, and decision-memory readiness of requirements in a given domain.

**NOT for verification/testing**:

@@ -20,6 +20,7 @@ description: Generate a custom checklist for the current feature based on user r
- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
- ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
+- ✅ "Do repo-shaping choices have explicit rationale and rejected alternatives before task decomposition?" (decision memory)

**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.

@@ -47,7 +48,7 @@ You **MUST** consider the user input before proceeding (if not empty).
1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
-4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
+4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria, decision-memory needs.
5. Formulate questions chosen from these archetypes:
   - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
   - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
@@ -55,6 +56,7 @@ You **MUST** consider the user input before proceeding (if not empty).
   - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
   - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
   - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
+   - Decision-memory gap (e.g., "Do we need explicit ADR and rejected-path checks for this feature?")

Question formatting rules:
- If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
@@ -76,9 +78,10 @@ You **MUST** consider the user input before proceeding (if not empty).
- Infer any missing context from spec/plan/tasks (do NOT hallucinate)

4. **Load feature context**: Read from FEATURE_DIR:
-   - spec.md: Feature requirements and scope
-   - plan.md (if exists): Technical details, dependencies
-   - tasks.md (if exists): Implementation tasks
+   - `spec.md`: Feature requirements and scope
+   - `plan.md` (if exists): Technical details, dependencies, ADR references
+   - `tasks.md` (if exists): Implementation tasks and inherited guardrails
+   - ADR artifacts (if present): `[DEF:id:ADR]`, `@RATIONALE`, `@REJECTED`

**Context Loading Strategy**:
- Load only necessary portions relevant to active focus areas (avoid full-file dumping)
@@ -102,6 +105,7 @@ You **MUST** consider the user input before proceeding (if not empty).
- **Consistency**: Do requirements align with each other?
- **Measurability**: Can requirements be objectively verified?
- **Coverage**: Are all scenarios/edge cases addressed?
+- **Decision Memory**: Are durable choices and rejected alternatives explicit before implementation starts?

**Category Structure** - Group items by requirement quality dimensions:
- **Requirement Completeness** (Are all necessary requirements documented?)
@@ -112,6 +116,7 @@ You **MUST** consider the user input before proceeding (if not empty).
- **Edge Case Coverage** (Are boundary conditions defined?)
- **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
- **Dependencies & Assumptions** (Are they documented and validated?)
+- **Decision Memory & ADRs** (Are architectural choices, rationale, and rejected paths explicit?)
- **Ambiguities & Conflicts** (What needs clarification?)

**HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
@@ -127,8 +132,8 @@ You **MUST** consider the user input before proceeding (if not empty).
- "Are hover state requirements consistent across all interactive elements?" [Consistency]
- "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
- "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
- "Are loading states defined for asynchronous episode data?" [Completeness]
- "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
+- "Are blocking architecture decisions recorded with explicit rationale and rejected alternatives before task generation?" [Decision Memory]
+- "Does the plan make clear which implementation shortcuts are forbidden for this feature?" [Decision Memory, Gap]

**ITEM STRUCTURE**:
Each item should follow this pattern:
@@ -163,6 +168,11 @@ You **MUST** consider the user input before proceeding (if not empty).
- "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
- "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"

+Decision Memory:
+- "Do all repo-shaping technical choices have explicit rationale before tasks are generated? [Decision Memory, Plan]"
+- "Are rejected alternatives documented for architectural branches that would materially change implementation scope? [Decision Memory, Gap]"
+- "Can a coder determine from the planning artifacts which tempting shortcut is forbidden? [Decision Memory, Clarity]"
+
**Scenario Classification & Coverage** (Requirements Quality Focus):
- Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
- For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
@@ -171,7 +181,7 @@ You **MUST** consider the user input before proceeding (if not empty).

**Traceability Requirements**:
- MINIMUM: ≥80% of items MUST include at least one traceability reference
-- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
+- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`, `[ADR]`
- If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"

**Surface & Resolve Issues** (Requirements Quality Problems):
@@ -181,6 +191,7 @@ You **MUST** consider the user input before proceeding (if not empty).
- Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
- Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
- Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
+- Decision-memory drift: "Do tasks inherit the same rejected-path guardrails defined in planning? [Decision Memory, Conflict]"

**Content Consolidation**:
- Soft cap: If raw candidate items > 40, prioritize by risk/impact
@@ -193,7 +204,7 @@ You **MUST** consider the user input before proceeding (if not empty).
- ❌ "Displays correctly", "works properly", "functions as expected"
- ❌ "Click", "navigate", "render", "load", "execute"
- ❌ Test cases, test plans, QA procedures
-- ❌ Implementation details (frameworks, APIs, algorithms)
+- ❌ Implementation details (frameworks, APIs, algorithms) unless the checklist is asking whether those decisions were explicitly documented and bounded by rationale/rejected alternatives

**✅ REQUIRED PATTERNS** - These test requirements quality:
- ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
@@ -202,6 +213,7 @@ You **MUST** consider the user input before proceeding (if not empty).
- ✅ "Can [requirement] be objectively measured/verified?"
- ✅ "Are [edge cases/scenarios] addressed in requirements?"
- ✅ "Does the spec define [missing aspect]?"
+- ✅ "Does the plan record why [accepted path] was chosen and why [rejected path] is forbidden?"

6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.

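Under the fallback rules above, a generated file might look like this sketch (the feature name, dates, and items are invented for illustration):

```markdown
# UX Checklist: Podcast Player

**Purpose**: Validate requirement quality for the playback UX domain
**Created**: 2024-01-01

## Requirement Completeness

- [ ] CHK001 - Are loading state requirements defined for asynchronous episode data? [Gap]

## Decision Memory & ADRs

- [ ] CHK002 - Are rejected alternatives documented for the audio engine choice? [Decision Memory, Gap]
```

Note the globally incrementing CHK IDs: numbering continues across category sections rather than restarting.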
@@ -210,6 +222,7 @@ You **MUST** consider the user input before proceeding (if not empty).
- Depth level
- Actor/timing
- Any explicit user-specified must-have items incorporated
+- Whether ADR / decision-memory checks were included

**Important**: Each `/speckit.checklist` command invocation creates a checklist file using short, descriptive names unless file already exists. This allows:

@@ -262,6 +275,15 @@ Sample items:
- "Are security requirements consistent with compliance obligations? [Consistency]"
- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"

+**Architecture Decision Quality:** `architecture.md`
+
+Sample items:
+
+- "Do all repo-shaping architecture choices have explicit rationale before tasks are generated? [Decision Memory]"
+- "Are rejected alternatives documented for each blocking technology branch? [Decision Memory, Gap]"
+- "Can an implementer tell which shortcuts are forbidden without re-reading research artifacts? [Clarity, ADR]"
+- "Are ADR decisions traceable to requirements or constraints in the spec? [Traceability, ADR]"

## Anti-Examples: What NOT To Do

**❌ WRONG - These test implementation, not requirements:**
@@ -282,6 +304,7 @@ Sample items:
- [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
+- [ ] CHK007 - Do planning artifacts state why the accepted architecture was chosen and which alternative is rejected? [Decision Memory, ADR]
```

**Key Differences:**

@@ -56,35 +56,36 @@ You **MUST** consider the user input before proceeding (if not empty).

3. Load and analyze the implementation context:
   - **REQUIRED**: Read `.ai/standards/semantics.md` for strict coding standards and contract requirements
-   - **REQUIRED**: Read tasks.md for the complete task list and execution plan
-   - **REQUIRED**: Read plan.md for tech stack, architecture, and file structure
-   - **IF EXISTS**: Read data-model.md for entities and relationships
-   - **IF EXISTS**: Read contracts/ for API specifications and test requirements
-   - **IF EXISTS**: Read research.md for technical decisions and constraints
-   - **IF EXISTS**: Read quickstart.md for integration scenarios
+   - **REQUIRED**: Read `tasks.md` for the complete task list and execution plan
+   - **REQUIRED**: Read `plan.md` for tech stack, architecture, and file structure
+   - **REQUIRED IF PRESENT**: Read ADR artifacts containing `[DEF:id:ADR]` nodes and build a blocked-path inventory from `@REJECTED`
+   - **IF EXISTS**: Read `data-model.md` for entities and relationships
+   - **IF EXISTS**: Read `contracts/` for API specifications and test requirements
+   - **IF EXISTS**: Read `research.md` for technical decisions and constraints
+   - **IF EXISTS**: Read `quickstart.md` for integration scenarios

4. **Project Setup Verification**:
   - **REQUIRED**: Create/verify ignore files based on actual project setup:

   **Detection & Creation Logic**:
-   - Check if the following command succeeds to determine if the repository is a git repo (create/verify .gitignore if so):
+   - Check if the following command succeeds to determine if the repository is a git repo (create/verify `.gitignore` if so):

     ```sh
     git rev-parse --git-dir 2>/dev/null
     ```

-   - Check if Dockerfile* exists or Docker in plan.md → create/verify .dockerignore
-   - Check if .eslintrc* exists → create/verify .eslintignore
-   - Check if eslint.config.* exists → ensure the config's `ignores` entries cover required patterns
-   - Check if .prettierrc* exists → create/verify .prettierignore
-   - Check if .npmrc or package.json exists → create/verify .npmignore (if publishing)
-   - Check if terraform files (*.tf) exist → create/verify .terraformignore
-   - Check if .helmignore needed (helm charts present) → create/verify .helmignore
+   - Check if Dockerfile* exists or Docker in `plan.md` → create/verify `.dockerignore`
+   - Check if `.eslintrc*` exists → create/verify `.eslintignore`
+   - Check if `eslint.config.*` exists → ensure the config's `ignores` entries cover required patterns
+   - Check if `.prettierrc*` exists → create/verify `.prettierignore`
+   - Check if `.npmrc` or `package.json` exists → create/verify `.npmignore` (if publishing)
+   - Check if terraform files (`*.tf`) exist → create/verify `.terraformignore`
+   - Check if `.helmignore` needed (helm charts present) → create/verify `.helmignore`

   **If ignore file already exists**: Verify it contains essential patterns, append missing critical patterns only
   **If ignore file missing**: Create with full pattern set for detected technology

-   **Common Patterns by Technology** (from plan.md tech stack):
+   **Common Patterns by Technology** (from `plan.md` tech stack):
   - **Node.js/JavaScript/TypeScript**: `node_modules/`, `dist/`, `build/`, `*.log`, `.env*`
   - **Python**: `__pycache__/`, `*.pyc`, `.venv/`, `venv/`, `dist/`, `*.egg-info/`
   - **Java**: `target/`, `*.class`, `*.jar`, `.gradle/`, `build/`
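The detection-and-creation logic above can be sketched as a small POSIX shell routine (the `ensure_ignore` helper and the seed patterns are illustrative, not part of the workflow):

```shell
#!/bin/sh
# Create an ignore file with a seed pattern if it does not exist;
# existing files are left in place for pattern verification instead.
ensure_ignore() {
  [ -f "$1" ] || printf '%s\n' "$2" > "$1"
}

# Git repository present -> .gitignore
if git rev-parse --git-dir >/dev/null 2>&1; then
  ensure_ignore .gitignore "node_modules/"
fi

# Dockerfile present -> .dockerignore
if ls Dockerfile* >/dev/null 2>&1; then
  ensure_ignore .dockerignore ".git/"
fi

# Terraform files present -> .terraformignore
if ls ./*.tf >/dev/null 2>&1; then
  ensure_ignore .terraformignore ".terraform/"
fi
```

The real workflow additionally verifies existing files and appends only missing critical patterns; this sketch only covers the create-if-missing branch.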
@@ -107,11 +108,12 @@ You **MUST** consider the user input before proceeding (if not empty).
   - **Terraform**: `.terraform/`, `*.tfstate*`, `*.tfvars`, `.terraform.lock.hcl`
   - **Kubernetes/k8s**: `*.secret.yaml`, `secrets/`, `.kube/`, `kubeconfig*`, `*.key`, `*.crt`

-5. Parse tasks.md structure and extract:
+5. Parse `tasks.md` structure and extract:
   - **Task phases**: Setup, Tests, Core, Integration, Polish
   - **Task dependencies**: Sequential vs parallel execution rules
   - **Task details**: ID, description, file paths, parallel markers [P]
   - **Execution flow**: Order and dependency requirements
+   - **Decision-memory requirements**: which tasks inherit ADR ids, `@RATIONALE`, and `@REJECTED` guardrails

6. Execute implementation following the task plan:
   - **Phase-by-phase execution**: Complete each phase before moving to the next
@@ -119,6 +121,7 @@ You **MUST** consider the user input before proceeding (if not empty).
   - **Follow TDD approach**: Execute test tasks before their corresponding implementation tasks
   - **File-based coordination**: Tasks affecting the same files must run sequentially
   - **Validation checkpoints**: Verify each phase completion before proceeding
+   - **ADR guardrail discipline**: if a task packet or local contract forbids a path via `@REJECTED`, do not treat it as an implementation option

7. Implementation execution rules:
   - **Strict Adherence**: Apply `.ai/standards/semantics.md` rules:
@@ -134,8 +137,10 @@ You **MUST** consider the user input before proceeding (if not empty).
   - For Python Complexity 5 modules, `belief_scope(...)` is mandatory and the critical path must be instrumented with `logger.reason()` / `logger.reflect()` according to the contract.
   - For Svelte components, require `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`; runes-only reactivity is allowed (`$state`, `$derived`, `$effect`, `$props`).
   - Reject pseudo-semantic markup: docstrings containing loose `@PURPOSE` / `@PRE` text do **NOT** satisfy the protocol unless represented in canonical anchored metadata blocks.
+   - Preserve and propagate decision-memory tags. Upstream `@RATIONALE` / `@REJECTED` are mandatory when carried by the task packet or contract.
+   - If `logger.explore()` or equivalent runtime evidence leads to a retained workaround, mutate the same contract header with reactive Micro-ADR tags: `@RATIONALE` and `@REJECTED`.
   - **Self-Audit**: The Coder MUST use `axiom-core` tools (like `audit_contracts_tool`) to verify semantic compliance before completion.
-   - **Semantic Rejection Gate**: If self-audit reveals broken anchors, missing closing tags, missing required metadata for the effective complexity, orphaned critical classes/functions, or Complexity 4/5 Python code without required belief-state logging, the task is NOT complete and cannot be handed off as accepted work.
+   - **Semantic Rejection Gate**: If self-audit reveals broken anchors, missing closing tags, missing required metadata for the effective complexity, orphaned critical classes/functions, Complexity 4/5 Python code without required belief-state logging, or retained workarounds without decision-memory tags, the task is NOT complete and cannot be handed off as accepted work.
   - **CRITICAL Contracts**: If a task description contains a contract summary (e.g., `CRITICAL: PRE: ..., POST: ...`), these constraints are **MANDATORY** and must be strictly implemented in the code using guards/assertions (if applicable per protocol).
   - **Setup first**: Initialize project structure, dependencies, configuration
   - **Tests before code**: Write tests for contracts, entities, and integration scenarios where needed
@@ -150,11 +155,13 @@ You **MUST** consider the user input before proceeding (if not empty).
   - Provide clear error messages with context for debugging.
   - Suggest next steps if implementation cannot proceed.
   - **IMPORTANT** For completed tasks, mark as [X] only AFTER local verification and self-audit.
+   - If blocked because the only apparent fix is listed in upstream `@REJECTED`, escalate for decision revision instead of silently overriding the guardrail.

9. **Handoff to Tester (Audit Loop)**:
   - Once a task or phase is complete, the Coder hands off to the Tester.
   - Handoff includes: file paths, declared complexity, expected contracts (`@PRE`, `@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, `@INVARIANT` when applicable), and a short logic overview.
   - Handoff MUST explicitly disclose any contract exceptions or known semantic debt. Hidden semantic debt is forbidden.
+   - Handoff MUST disclose decision-memory changes: inherited ADR ids, new or updated `@RATIONALE`, new or updated `@REJECTED`, and any blocked paths that remain active.
   - The handoff payload MUST instruct the Tester to execute the dedicated testing workflow [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md), not just perform an informal review.

10. **Tester Verification & Orchestrator Gate**:
@@ -164,11 +171,12 @@ You **MUST** consider the user input before proceeding (if not empty).
   - Reject code that only imitates the protocol superficially, such as free-form docstrings with `@PURPOSE` text but without canonical `[DEF]...[/DEF]` anchors and header metadata.
   - Verify that effective complexity and required metadata match [`.ai/standards/semantics.md`](.ai/standards/semantics.md).
   - Verify that Python Complexity 4/5 implementations include required belief-state instrumentation (`belief_scope`, `logger.reason()`, `logger.reflect()`).
+   - Verify that upstream rejected paths were not silently restored.
   - Emulate algorithms "in mind" step-by-step to ensure logic consistency.
   - Verify unit tests match the declared contracts.
   - If Tester finds issues:
-     - Emit `[AUDIT_FAIL: semantic_noncompliance | contract_mismatch | logic_mismatch | test_mismatch | speckit_test_not_run]`.
-     - Provide concrete file-path-based reasons, for example: missing anchors, module/class contract mismatch, missing `@DATA_CONTRACT`, missing `logger.reason()`, illegal docstring-only annotations, or missing execution of [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md).
+     - Emit `[AUDIT_FAIL: semantic_noncompliance | contract_mismatch | logic_mismatch | test_mismatch | speckit_test_not_run | rejected_path_regression]`.
+     - Provide concrete file-path-based reasons, for example: missing anchors, module/class contract mismatch, missing `@DATA_CONTRACT`, missing `logger.reason()`, illegal docstring-only annotations, missing decision-memory tags, re-enabled upstream rejected path, or missing execution of [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md).
     - Notify the Orchestrator.
     - Orchestrator redirects the feedback to the Coder for remediation.
   - Orchestrator green-status rule:
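A failing handoff emitted under these rules might look like the following sketch (file paths, the ADR name, and the specific reasons are invented for illustration):

```text
[AUDIT_FAIL: contract_mismatch | rejected_path_regression]
- src/storage/repo.py: `@POST` declares a sorted result; implementation returns insertion order.
- src/storage/cache.py: re-enables the in-process cache blocked by ADR CacheStrategy `@REJECTED`.
- Missing execution of .kilocode/workflows/speckit.test.md for the affected phase.
```

Each reason is file-path-based and maps to one of the enumerated failure classes, which is what lets the Orchestrator route remediation back to the Coder precisely.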
@@ -187,7 +195,9 @@ You **MUST** consider the user input before proceeding (if not empty).
   - class/function-level docstring contracts standing in for canonical anchors,
   - missing closing anchors,
   - missing required metadata for declared complexity,
-   - Complexity 5 repository/service code using only `belief_scope(...)` without explicit `logger.reason()` / `logger.reflect()` checkpoints.
+   - Complexity 5 repository/service code using only `belief_scope(...)` without explicit `logger.reason()` / `logger.reflect()` checkpoints,
+   - retained workarounds missing local `@RATIONALE` / `@REJECTED`,
+   - silent resurrection of paths already blocked by upstream ADR or task guardrails.
- Report final status with summary of completed and audited work.

-Note: This command assumes a complete task breakdown exists in tasks.md. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.
+Note: This command assumes a complete task breakdown exists in `tasks.md`. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.

@@ -28,12 +28,13 @@ You **MUST** consider the user input before proceeding (if not empty).
   - Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
   - Fill Constitution Check section from constitution
   - Evaluate gates (ERROR if violations unjustified)
-   - Phase 0: Generate research.md (resolve all NEEDS CLARIFICATION)
-   - Phase 1: Generate data-model.md, contracts/, quickstart.md
+   - Phase 0: Generate `research.md` (resolve all NEEDS CLARIFICATION)
+   - Phase 1: Generate `data-model.md`, `contracts/`, `quickstart.md`
+   - Phase 1: Generate global ADR artifacts and connect them to the plan
   - Phase 1: Update agent context by running the agent script
   - Re-evaluate Constitution Check post-design

-4. **Stop and report**: Command ends after Phase 2 planning. Report branch, IMPL_PLAN path, and generated artifacts.
+4. **Stop and report**: Command ends after Phase 2 planning. Report branch, IMPL_PLAN path, generated artifacts, and ADR decisions created.

## Phases

@@ -58,9 +59,9 @@ You **MUST** consider the user input before proceeding (if not empty).
   - Rationale: [why chosen]
   - Alternatives considered: [what else evaluated]

-**Output**: research.md with all NEEDS CLARIFICATION resolved
+**Output**: `research.md` with all NEEDS CLARIFICATION resolved

-### Phase 1: Design & Contracts
+### Phase 1: Design, ADRs & Contracts

**Prerequisites:** `research.md` complete

@@ -72,7 +73,23 @@ You **MUST** consider the user input before proceeding (if not empty).
|
||||
1. **Extract entities from feature spec** → `data-model.md`:
|
||||
- Entity name, fields, relationships, validation rules.
|
||||
|
||||
2. **Design & Verify Contracts (Semantic Protocol)**:
|
||||
2. **Generate Global ADRs (Decision Memory Root Layer)**:
|
||||
- Read `spec.md`, `research.md`, and the technical context to identify repo-shaping decisions: storage, auth pattern, framework boundaries, integration patterns, deployment assumptions, failure strategy.
|
||||
- For each durable architectural choice, emit a standalone semantic ADR block using `[DEF:DecisionId:ADR]`.
|
||||
- Every ADR block MUST include:
|
||||
- `@COMPLEXITY: 3` or `4` depending on blast radius
|
||||
- `@PURPOSE`
|
||||
- `@RATIONALE`
|
||||
- `@REJECTED`
|
||||
- `@RELATION` back to the originating spec/research/plan boundary or target module family
|
||||
- Preferred destinations:
|
||||
- `docs/architecture.md` for cross-cutting repository decisions
|
||||
- feature-local design docs when the decision is feature-scoped
|
||||
- root module headers only when the decision scope is truly local
|
||||
- **Hard Gate**: do not continue to task decomposition until the blocking global decisions have been materialized as ADR nodes.
|
||||
- **Anti-Regression Goal**: a later orchestrator must be able to read these ADRs and avoid creating tasks for rejected branches.
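As a hedged illustration of the ADR block shape described above (every identifier here is invented, and `CONSTRAINS` is only an assumed example of a relation back to a target module family):

```
[DEF:StorageChoice:ADR]
@COMPLEXITY: 4
@PURPOSE: Fix the persistence layer for this feature family.
@RATIONALE: SQLite keeps deployment single-file and testable offline.
@REJECTED: Per-developer Postgres containers; ad-hoc JSON files on disk.
@RELATION: CONSTRAINS ->[DataModelBoundary]
[/DEF]
```

A later orchestrator reading such a node knows both the accepted path and the branches it must not reschedule.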

3. **Design & Verify Contracts (Semantic Protocol)**:
   - **Drafting**: Define semantic headers, metadata, and closing anchors for all new modules strictly from `.ai/standards/semantics.md`.
   - **Complexity Classification**: Classify each contract with `@COMPLEXITY: [1|2|3|4|5]` or `@C:`. Treat `@TIER` only as a legacy compatibility hint and never as the primary rule source.
   - **Adaptive Contract Requirements**:
@@ -81,34 +98,42 @@ You **MUST** consider the user input before proceeding (if not empty).
     - **Complexity 3**: require `@PURPOSE` and `@RELATION`; UI also requires `@UX_STATE`.
     - **Complexity 4**: require `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`; Python modules must define a meaningful `logger.reason()` / `logger.reflect()` path or equivalent belief-state mechanism.
     - **Complexity 5**: require full level-4 contract plus `@DATA_CONTRACT` and `@INVARIANT`; Python modules must require `belief_scope`; UI modules must define UX contracts including `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`.
   - **Decision-Memory Propagation**:
     - If a module/function/component realizes or is constrained by an ADR, add local `@RATIONALE` and `@REJECTED` guardrails before coding begins.
     - Use `@RELATION: IMPLEMENTS ->[AdrId]` when the contract realizes the ADR.
     - Use `@RELATION: DEPENDS_ON ->[AdrId]` when the contract is merely constrained by the ADR.
     - Record known LLM traps directly in the contract header so the implementer inherits the guardrail from the start.
   - **Relation Syntax**: Write dependency edges in canonical GraphRAG form: `@RELATION: [PREDICATE] ->[TARGET_ID]`.
   - **Context Guard**: If a target relation, DTO, or required dependency cannot be named confidently, stop generation and emit `[NEED_CONTEXT: target]` instead of inventing placeholders.
   - **Context Guard**: If a target relation, DTO, required dependency, or decision rationale cannot be named confidently, stop generation and emit `[NEED_CONTEXT: target]` instead of inventing placeholders.
   - **Testing Contracts**: Add `@TEST_CONTRACT`, `@TEST_SCENARIO`, `@TEST_FIXTURE`, `@TEST_EDGE`, and `@TEST_INVARIANT` when the design introduces audit-critical or explicitly test-governed contracts, especially for Complexity 5 boundaries.
   - **Self-Review**:
     - *Complexity Fit*: Does each contract include exactly the metadata and contract density required by its complexity level?
     - *Completeness*: Do `@PRE`/`@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, and UX tags cover the edge cases identified in Research and UX Reference?
     - *Completeness*: Do `@PRE`/`@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, UX tags, and decision-memory tags cover the edge cases identified in Research and UX Reference?
     - *Connectivity*: Do `@RELATION` tags form a coherent graph using canonical `@RELATION: [PREDICATE] ->[TARGET_ID]` syntax?
     - *Compliance*: Are all anchors properly opened and closed, and does the chosen comment syntax match the target medium?
     - *Belief-State Requirements*: Do Complexity 4/5 Python modules explicitly account for `logger.reason()`, `logger.reflect()`, and `belief_scope` requirements?
     - *ADR Continuity*: Does every blocking architectural decision have a corresponding ADR node and at least one downstream guarded contract?
   - **Output**: Write verified contracts to `contracts/modules.md`.
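A hedged sketch of one verified Complexity 4 entry in `contracts/modules.md` (the module name, ADR target, and tag values are invented for illustration; the required tags follow the level-4 rules above):

```
[DEF:PayloadParser:Class]
@COMPLEXITY: 4
@PURPOSE: Parse inbound API payloads into validated DTOs.
@RELATION: IMPLEMENTS ->[StorageChoice]
@PRE: raw payload is non-empty UTF-8 text
@POST: returns a schema-validated DTO or raises a validation error
@SIDE_EFFECT: emits a logger.reason() checkpoint on validation failure
@RATIONALE: strict validation because the frontend sends numeric strings
@REJECTED: json.loads() without schema validation
[/DEF]
```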

3. **Simulate Contract Usage**:
4. **Simulate Contract Usage**:
   - Trace one key user scenario through the defined contracts to ensure data flow continuity.
   - If a contract interface mismatch is found, fix it immediately.
   - Verify that no traced path accidentally realizes an alternative already named in any ADR `@REJECTED` tag.

4. **Generate API contracts**:
5. **Generate API contracts**:
   - Output OpenAPI/GraphQL schema to `/contracts/` for backend-frontend sync.

5. **Agent context update**:
6. **Agent context update**:
   - Run `.specify/scripts/bash/update-agent-context.sh kilocode`
   - These scripts detect which AI agent is in use
   - Update the appropriate agent-specific context file
   - Add only new technology from current plan
   - Preserve manual additions between markers

**Output**: data-model.md, /contracts/*, quickstart.md, agent-specific file
**Output**: `data-model.md`, `/contracts/*`, `quickstart.md`, ADR artifact(s), agent-specific file

## Key rules

- Use absolute paths
- ERROR on gate failures or unresolved clarifications
- Do not hand off to [`speckit.tasks`](.kilocode/workflows/speckit.tasks.md) until blocking ADRs exist and rejected branches are explicit

@@ -12,7 +12,7 @@ You **MUST** consider the user input before proceeding (if not empty).

## Goal

Ensure the codebase adheres to the semantic standards defined in `.ai/standards/semantics.md` by using the AXIOM MCP semantic graph as the primary execution engine. This involves reindexing the workspace, measuring semantic health, auditing contract compliance, and optionally delegating contract-safe fixes through MCP-aware agents.
Ensure the codebase adheres to the semantic standards defined in `.ai/standards/semantics.md` by using the AXIOM MCP semantic graph as the primary execution engine. This involves reindexing the workspace, measuring semantic health, auditing contract compliance, auditing decision-memory continuity, and optionally delegating contract-safe fixes through MCP-aware agents.

## Operating Constraints

@@ -25,6 +25,7 @@ Ensure the codebase adheres to the semantic standards defined in `.ai/standards/
7. **ID NAMING (CRITICAL)**: NEVER use fully-qualified Python import paths in `[DEF:id:Type]`. Use short, domain-driven semantic IDs (e.g., `[DEF:AuthService:Class]`). Follow the exact style shown in `.ai/standards/semantics.md`.
8. **ORPHAN PREVENTION**: To reduce the orphan count, you MUST physically wrap actual class and function definitions with `[DEF:id:Type] ... [/DEF]` blocks in the code. Modifying `@RELATION` tags does NOT fix orphans. The AST parser flags any unwrapped function as an orphan.
   - **Exception for Tests**: In test modules, use `BINDS_TO` to link major helpers to the module root. Small helpers remain C1 and don't need relations.
9. **DECISION-MEMORY CONTINUITY**: Audit ADR nodes, preventive task guardrails, and reactive Micro-ADR tags as one anti-regression chain. Missing or contradictory `@RATIONALE` / `@REJECTED` is a first-class semantic defect.

## Execution Steps

@@ -48,8 +49,13 @@ Treat high orphan counts and unresolved relations as first-class health indicato
Use [`audit_contracts_tool`](.kilo/mcp.json) and classify findings into:
- **Critical Parsing/Structure Errors**: malformed or incoherent semantic contract regions
- **Critical Contract Gaps**: missing [`@DATA_CONTRACT`](.ai/standards/semantics.md), [`@PRE`](.ai/standards/semantics.md), [`@POST`](.ai/standards/semantics.md), [`@SIDE_EFFECT`](.ai/standards/semantics.md) on CRITICAL contracts
- **Decision-Memory Gaps**:
  - missing standalone `[DEF:id:ADR]` for repo-shaping decisions
  - missing `@RATIONALE` / `@REJECTED` where task or implementation context clearly requires guardrails
  - retained workaround code without local reactive Micro-ADR tags
  - implementation that silently re-enables a path declared in upstream `@REJECTED`
- **Coverage Gaps**: missing [`@TIER`](.ai/standards/semantics.md), missing [`@PURPOSE`](.ai/standards/semantics.md)
- **Graph Breakages**: unresolved relations, broken references, isolated critical contracts
- **Graph Breakages**: unresolved relations, broken references, isolated critical contracts, ADR nodes without downstream guarded contracts

### 4. Build Remediation Context

@@ -58,12 +64,14 @@ For the top failing contracts, use MCP semantic context tools such as [`get_sema
2. Upstream/downstream semantic impact
3. Related tests and fixtures
4. Whether relation recovery is needed
5. Whether decision-memory continuity is broken between ADR, task contract, and implementation

### 5. Execute Fixes (Optional/Handoff)

If $ARGUMENTS contains `fix` or `apply`:
- Hand off to the [`semantic`](.kilocodemodes) mode or a dedicated implementation agent instead of applying naive textual edits in orchestration.
- Require the fixing agent to prefer MCP contract mutation tools such as [`simulate_patch_tool`](.kilo/mcp.json), [`guarded_patch_contract_tool`](.kilo/mcp.json), [`patch_contract_tool`](.kilo/mcp.json), and [`infer_missing_relations_tool`](.kilo/mcp.json).
- Require the fixing agent to preserve or restore `@RATIONALE` / `@REJECTED` continuity whenever blocked-path knowledge exists.
- After changes, re-run reindex, health, and audit MCP steps to verify the delta.

### 6. Review Gate
@@ -74,8 +82,9 @@ Before completion, request or perform an MCP-based review path aligned with the

Provide a summary of the semantic state:
- **Health Metrics**: contracts / relations / orphans / unresolved_relations / files
- **Status**: [PASS/FAIL] (FAIL if CRITICAL gaps or semantically significant unresolved relations exist)
- **Status**: [PASS/FAIL] (FAIL if CRITICAL gaps, rejected-path regressions, or semantically significant unresolved relations exist)
- **Top Issues**: List top 3-5 contracts or files needing attention.
- **Decision Memory**: summarize missing ADRs, missing guardrails, and rejected-path regression risks.
- **Action Taken**: Summary of MCP analysis performed, context gathered, and fixes or handoffs initiated.

## Context

@@ -24,26 +24,29 @@ You **MUST** consider the user input before proceeding (if not empty).
1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

2. **Load design documents**: Read from FEATURE_DIR:
   - **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities), ux_reference.md (experience source of truth)
   - **Optional**: data-model.md (entities), contracts/ (API endpoints), research.md (decisions), quickstart.md (test scenarios)
   - **Required**: `plan.md` (tech stack, libraries, structure), `spec.md` (user stories with priorities), `ux_reference.md` (experience source of truth)
   - **Optional**: `data-model.md` (entities), `contracts/` (API endpoints), `research.md` (decisions), `quickstart.md` (test scenarios)
   - **Required when present in plan output**: ADR artifacts such as `docs/architecture.md` or feature-local architecture decision files containing `[DEF:id:ADR]` nodes
   - Note: Not all projects have all documents. Generate tasks based on what's available.

3. **Execute task generation workflow**:
   - Load plan.md and extract tech stack, libraries, project structure
   - Load spec.md and extract user stories with their priorities (P1, P2, P3, etc.)
   - If data-model.md exists: Extract entities and map to user stories
   - If contracts/ exists: Map endpoints to user stories
   - If research.md exists: Extract decisions for setup tasks
   - Load `plan.md` and extract tech stack, libraries, project structure
   - Load `spec.md` and extract user stories with their priorities (P1, P2, P3, etc.)
   - Load ADR nodes and build a decision-memory inventory: `DecisionId`, `@RATIONALE`, `@REJECTED`, dependent modules
   - If `data-model.md` exists: Extract entities and map to user stories
   - If `contracts/` exists: Map endpoints to user stories
   - If `research.md` exists: Extract decisions for setup tasks
   - Generate tasks organized by user story (see Task Generation Rules below)
   - Generate dependency graph showing user story completion order
   - Create parallel execution examples per user story
   - Validate task completeness (each user story has all needed tasks, independently testable)
   - Validate guardrail continuity: no task may realize an ADR path named in `@REJECTED`
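The "decision-memory inventory" step above can be sketched in Python. The ADR block grammar follows the document's own `[DEF:id:ADR] … [/DEF]` notation; the regexes, the sample document, and the output field names are illustrative assumptions, not a prescribed parser:

```python
import re

# Match a whole ADR block: [DEF:SomeId:ADR] ... [/DEF]
ADR_BLOCK = re.compile(r"\[DEF:(?P<id>[A-Za-z0-9_]+):ADR\](?P<body>.*?)\[/DEF\]", re.S)
# Match the guardrail tags inside a block body.
TAG = re.compile(r"@(?P<name>RATIONALE|REJECTED):\s*(?P<value>[^\n]+)")

def build_decision_inventory(text):
    """Return {DecisionId: {"rationale": ..., "rejected": ...}} for all ADR blocks."""
    inventory = {}
    for block in ADR_BLOCK.finditer(text):
        tags = {m.group("name"): m.group("value").strip()
                for m in TAG.finditer(block.group("body"))}
        inventory[block.group("id")] = {
            "rationale": tags.get("RATIONALE"),
            "rejected": tags.get("REJECTED"),
        }
    return inventory

# Hypothetical ADR source text for demonstration.
doc = """[DEF:StorageChoice:ADR]
@RATIONALE: SQLite keeps the deployment single-file.
@REJECTED: Postgres container per developer machine.
[/DEF]"""

print(build_decision_inventory(doc))
```

Downstream steps can then consult `inventory[decision_id]["rejected"]` before scheduling any task.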

4. **Generate tasks.md**: Use `.specify/templates/tasks-template.md` as structure, fill with:
   - Correct feature name from plan.md
4. **Generate `tasks.md`**: Use `.specify/templates/tasks-template.md` as structure, fill with:
   - Correct feature name from `plan.md`
   - Phase 1: Setup tasks (project initialization)
   - Phase 2: Foundational tasks (blocking prerequisites for all user stories)
   - Phase 3+: One phase per user story (in priority order from spec.md)
   - Phase 3+: One phase per user story (in priority order from `spec.md`)
   - Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
   - Final Phase: Polish & cross-cutting concerns
   - All tasks must follow the strict checklist format (see Task Generation Rules below)
@@ -51,18 +54,20 @@ You **MUST** consider the user input before proceeding (if not empty).
   - Dependencies section showing story completion order
   - Parallel execution examples per story
   - Implementation strategy section (MVP first, incremental delivery)
   - Decision-memory notes for guarded tasks when ADRs or known traps apply

5. **Report**: Output path to generated tasks.md and summary:
5. **Report**: Output path to generated `tasks.md` and summary:
   - Total task count
   - Task count per user story
   - Parallel opportunities identified
   - Independent test criteria for each story
   - Suggested MVP scope (typically just User Story 1)
   - Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
   - ADR propagation summary: which ADRs were inherited into task guardrails and which paths were rejected

Context for task generation: $ARGUMENTS

The tasks.md should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
The `tasks.md` should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.

## Task Generation Rules

@@ -72,10 +77,11 @@ The tasks.md should be immediately executable - each task must be specific enoug

### UX & Semantic Preservation (CRITICAL)

- **Source of Truth**: `ux_reference.md` for UX, `.ai/standards/semantics.md` for Code.
- **Violation Warning**: If any task violates UX or GRACE standards, flag it immediately.
- **Source of Truth**: `ux_reference.md` for UX, `.ai/standards/semantics.md` for code, and ADR artifacts for upstream technology decisions.
- **Violation Warning**: If any task violates UX, ADR guardrails, or GRACE standards, flag it immediately.
- **Verification Task (UX)**: Add a task at the end of each Story phase: `- [ ] Txxx [USx] Verify implementation matches ux_reference.md (Happy Path & Errors)`
- **Verification Task (Audit)**: Add a mandatory audit task at the end of each Story phase: `- [ ] Txxx [USx] Acceptance: Perform semantic audit & algorithm emulation by Tester`
- **Guardrail Rule**: If an ADR or contract says `@REJECTED`, task text must not schedule that path as implementation work.

### Checklist Format (REQUIRED)

@@ -91,7 +97,7 @@ Every task MUST strictly follow this format:
2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
4. **[Story] label**: REQUIRED for user story phase tasks only
   - Format: [US1], [US2], [US3], etc. (maps to user stories from spec.md)
   - Format: [US1], [US2], [US3], etc. (maps to user stories from `spec.md`)
   - Setup phase: NO story label
   - Foundational phase: NO story label
   - User Story phases: MUST have story label
@@ -111,7 +117,7 @@ Every task MUST strictly follow this format:

### Task Organization

1. **From User Stories (spec.md)** - PRIMARY ORGANIZATION:
1. **From User Stories (`spec.md`)** - PRIMARY ORGANIZATION:
   - Each user story (P1, P2, P3...) gets its own phase
   - Map all related components to their story:
     - Models needed for that story
@@ -127,12 +133,18 @@ Every task MUST strictly follow this format:
   - Map each contract/endpoint → the user story it serves
   - If tests requested: Each contract → contract test task [P] before implementation in that story's phase

3. **From Data Model**:
3. **From ADRs and Decision Memory**:
   - For each implementation task constrained by an ADR, append a concise guardrail summary drawn from `@RATIONALE` and `@REJECTED`.
   - Example: `- [ ] T021 [US1] Implement payload parsing guardrails in src/api/input.py (RATIONALE: strict validation because frontend sends numeric strings; REJECTED: json.loads() without schema validation)`
   - If a task would naturally branch into an ADR-rejected alternative, rewrite the task around the accepted path instead of leaving the choice ambiguous.
   - If no safe executable path remains because ADR context is incomplete, stop and emit `[NEED_CONTEXT: target]`.

4. **From Data Model**:
   - Map each entity to the user story(ies) that need it
   - If entity serves multiple stories: Put in earliest story or Setup phase
   - Relationships → service layer tasks in appropriate story phase

4. **From Setup/Infrastructure**:
5. **From Setup/Infrastructure**:
   - Shared infrastructure → Setup phase (Phase 1)
   - Foundational/blocking tasks → Foundational phase (Phase 2)
   - Story-specific setup → within that story's phase
@@ -145,3 +157,11 @@ Every task MUST strictly follow this format:
- Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
- Each phase should be a complete, independently testable increment
- **Final Phase**: Polish & Cross-Cutting Concerns

### Decision-Memory Validation Gate

Before finalizing `tasks.md`, verify all of the following:
- Every repo-shaping ADR from planning is either represented in a setup/foundational task or inherited by a downstream story task.
- Every guarded task that could tempt an implementer into a known wrong branch carries preventive `@RATIONALE` / `@REJECTED` guidance in its text.
- No task instructs the implementer to realize an ADR path already named as rejected.
- At least one explicit audit/verification task exists for checking rejected-path regressions in code review or test stages.
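A minimal sketch of the third gate check above (no task schedules a rejected path). Plain substring matching is an assumed simplification of what a semantic-graph-backed validator would do; task texts and the rejected phrase are invented:

```python
def find_rejected_path_tasks(tasks, rejected_phrases):
    """Return (task, phrase) pairs where a task's text schedules a rejected path."""
    violations = []
    for task in tasks:
        for phrase in rejected_phrases:
            if phrase.lower() in task.lower():
                violations.append((task, phrase))
    return violations

tasks = [
    "- [ ] T021 [US1] Implement payload parsing with schema validation",
    "- [ ] T022 [US1] Parse payload via json.loads() without schema validation",
]
rejected = ["json.loads() without schema validation"]
print(find_rejected_path_tasks(tasks, rejected))
```

Any non-empty result fails the gate and sends the offending task back for rewriting around the accepted path.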
@@ -14,7 +14,7 @@ You **MUST** consider the user input before proceeding (if not empty).

## Goal

Execute semantic audit and full testing cycle: verify contract compliance, emulate logic, ensure maximum coverage, and maintain test quality.
Execute semantic audit and full testing cycle: verify contract compliance, verify decision-memory continuity, emulate logic, ensure maximum coverage, and maintain test quality.

## Operating Constraints

@@ -22,6 +22,7 @@ Execute semantic audit and full testing cycle: verify contract compliance, emula
2. **NEVER duplicate tests** - Check existing tests first before creating new ones
3. **Use TEST_FIXTURE fixtures** - For CRITICAL tier modules, read @TEST_FIXTURE from .ai/standards/semantics.md
4. **Co-location required** - Write tests in `__tests__` directories relative to the code being tested
5. **Decision-memory regression guard** - Tests and audits must not normalize silent reintroduction of any path documented in upstream `@REJECTED`

## Execution Steps

@@ -31,18 +32,25 @@ Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --inclu

Determine:
- FEATURE_DIR - where the feature is located
- TASKS_FILE - path to tasks.md
- TASKS_FILE - path to `tasks.md`
- Which modules need testing based on task status
- Which ADRs or task guardrails define rejected paths for the touched scope

### 2. Load Relevant Artifacts

**From tasks.md:**
**From `tasks.md`:**
- Identify completed implementation tasks (not test tasks)
- Extract file paths that need tests
- Extract guardrail summaries and blocked paths

**From .ai/standards/semantics.md:**
- Read @TIER annotations for modules
- For CRITICAL modules: Read @TEST_ fixtures
**From `.ai/standards/semantics.md`:**
- Read effective complexity expectations
- Read decision-memory rules for ADR, preventive guardrails, and reactive Micro-ADR
- For CRITICAL modules: Read `@TEST_` fixtures

**From ADR sources and touched code:**
- Read `[DEF:id:ADR]` nodes when present
- Read local `@RATIONALE` and `@REJECTED` in touched contracts

**From existing tests:**
- Scan `__tests__` directories for existing tests
@@ -52,9 +60,9 @@ Determine:

Create coverage matrix:

| Module | File | Has Tests | TIER | TEST_FIXTURE Available |
|--------|------|-----------|------|----------------------|
| ... | ... | ... | ... | ... |
| Module | File | Has Tests | Complexity / Tier | TEST_FIXTURE Available | Rejected Path Guarded |
|--------|------|-----------|-------------------|------------------------|-----------------------|
| ... | ... | ... | ... | ... | ... |

### 4. Semantic Audit & Logic Emulation (CRITICAL)

@@ -66,9 +74,12 @@ Before writing tests, the Tester MUST:
- Reject Python Complexity 4+ modules that omit meaningful `logger.reason()` / `logger.reflect()` checkpoints.
- Reject Python Complexity 5 modules that omit `belief_scope(...)`, `@DATA_CONTRACT`, or `@INVARIANT`.
- Treat broken or missing closing anchors as blocking violations.
- Reject retained workaround code if the local contract lacks `@RATIONALE` / `@REJECTED`.
- Reject code that silently re-enables a path declared in upstream ADR or local guardrails as rejected.
3. **Emulate Algorithm**: Step through the code implementation in your mind.
   - Verify it adheres to the `@PURPOSE` and `@INVARIANT`.
   - Verify `@PRE` and `@POST` conditions are correctly handled.
   - Verify the implementation follows accepted-path rationale rather than drifting into a blocked path.
4. **Validation Verdict**:
   - If audit fails: Emit `[AUDIT_FAIL: semantic_noncompliance]` with concrete file-path reasons and notify Orchestrator.
   - Example blocking case: [`backend/src/services/dataset_review/repositories/session_repository.py`](backend/src/services/dataset_review/repositories/session_repository.py) contains a module anchor, but its nested repository class/method semantics are expressed as loose docstrings instead of canonical anchored contracts; this MUST be rejected until remediated or explicitly waived.
@@ -79,7 +90,7 @@ Before writing tests, the Tester MUST:
For each module requiring tests:

1. **Check existing tests**: Scan `__tests__/` for duplicates.
2. **Read TEST_FIXTURE**: If CRITICAL tier, read @TEST_FIXTURE from semantics header.
2. **Read TEST_FIXTURE**: If CRITICAL tier, read `@TEST_FIXTURE` from semantics header.
3. **Do not normalize broken semantics through tests**:
   - The Tester must not write tests that silently accept malformed semantic protocol usage.
   - If implementation is semantically invalid, stop and reject instead of adapting tests around the invalid structure.
@@ -87,6 +98,8 @@ For each module requiring tests:
- Python: `src/module/__tests__/test_module.py`
- Svelte: `src/lib/components/__tests__/test_component.test.js`
5. **Use mocks**: Use `unittest.mock.MagicMock` for external dependencies
6. **Add rejected-path regression coverage when relevant**:
   - If ADR or local contract names a blocked path in `@REJECTED`, add or verify at least one test or explicit audit check that would fail if that forbidden path were silently restored.
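A hedged sketch combining rules 5 and 6 above: `PayloadParser` is a hypothetical module whose ADR rejected schema-less parsing, the external validator is mocked with `unittest.mock.MagicMock`, and the assertion on the mock is the regression guard that would fail if the forbidden `json.loads()`-only path were silently restored:

```python
from unittest.mock import MagicMock

class PayloadParser:
    """Stand-in for a module whose ADR rejected parsing without schema validation."""
    def __init__(self, schema_validator):
        self.schema_validator = schema_validator  # external dependency, injected

    def parse(self, raw):
        # Accepted path: every payload goes through the validator first.
        return self.schema_validator.validate(raw)

def test_parse_uses_schema_validator():
    validator = MagicMock()
    validator.validate.return_value = {"amount": 42}
    parser = PayloadParser(validator)
    assert parser.parse('{"amount": "42"}') == {"amount": 42}
    # Rejected-path guard: a schema-less json.loads() shortcut would skip this call.
    validator.validate.assert_called_once()

test_parse_uses_schema_validator()
```

Such a test would live co-located, e.g. under the module's `__tests__/` directory, per the rules above.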

### 4a. UX Contract Testing (Frontend Components)

@@ -103,9 +116,10 @@ For Svelte components with `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY` tags:
  expect(screen.getByTestId('sidebar')).toHaveClass('expanded');
});
```
3. **Test @UX_FEEDBACK**: Verify visual feedback (toast, shake, color changes)
4. **Test @UX_RECOVERY**: Verify error recovery mechanisms (retry, clear input)
5. **Use @UX_TEST fixtures**: If component has `@UX_TEST` tags, use them as test specifications
3. **Test `@UX_FEEDBACK`**: Verify visual feedback (toast, shake, color changes)
4. **Test `@UX_RECOVERY`**: Verify error recovery mechanisms (retry, clear input)
5. **Use `@UX_TEST` fixtures**: If component has `@UX_TEST` tags, use them as test specifications
6. **Verify decision memory**: If the UI contract declares `@REJECTED`, ensure browser-visible behavior does not regress into the rejected path.

**UX Test Template:**
```javascript
@@ -139,6 +153,8 @@ tests/
└── YYYY-MM-DD-report.md
```

Include decision-memory coverage notes when ADR or rejected-path regressions were checked.

### 6. Execute Tests

Run tests and report results:
@@ -155,10 +171,11 @@ cd frontend && npm run test

### 7. Update Tasks

Mark test tasks as completed in tasks.md with:
Mark test tasks as completed in `tasks.md` with:
- Test file path
- Coverage achieved
- Any issues found
- Whether rejected-path regression checks passed or remain manual audit items

## Output

@@ -188,10 +205,15 @@ Generate test execution report:
- Verdict: PASS | FAIL
- Blocking Violations:
  - [file path] -> [reason]
- Decision Memory:
  - ADRs checked: [...]
  - Rejected-path regressions: PASS | FAIL
  - Missing `@RATIONALE` / `@REJECTED`: [...]
- Notes:
  - Reject docstring-only semantic pseudo-markup
  - Reject complexity/contract mismatches
  - Reject missing belief-state instrumentation for Python Complexity 4/5
  - Reject silent resurrection of rejected paths

## Issues Found

@@ -203,6 +225,7 @@ Generate test execution report:

- [ ] Fix failed tests
- [ ] Fix blocking semantic violations before acceptance
- [ ] Fix decision-memory drift or rejected-path regressions
- [ ] Add more coverage for [module]
- [ ] Review TEST_FIXTURE fixtures
```
