mcp tuning
5682
.ai/MODULE_MAP.md
File diff suppressed because it is too large
@@ -1,44 +0,0 @@
# [DEF:Project_Map:Root]
# @COMPLEXITY: 3
# @PURPOSE: Canonical ownership record for repository structure navigation and generated project-map artifacts.
# @RELATION: DEPENDS_ON -> [Project_Knowledge_Map:Root]
# @RELATION: DEPENDS_ON -> [Std:Constitution:Standard]
# @RELATION: DEPENDS_ON -> [Std:UserPersona:Standard]
# @RELATION: BINDS_TO -> [MCP_Config:Block]
# @LAST_UPDATE: 2026-03-26

## Canonical ownership
- Canonical owner for `Project_Map` is this file: `.ai/PROJECT_MAP.md`.
- Generated structural snapshot lives at `.ai/structure/PROJECT_MAP.md` and is a backing artifact, not the canonical ownership document.
- References that previously pointed directly to `.ai/structure/PROJECT_MAP.md` for `Project_Map` should normalize to this file.

## Canonical relations
- Root knowledge entry: `.ai/ROOT.md` -> `[DEF:Project_Knowledge_Map:Root]`
- Normalized project MCP configuration: `.kilo/mcp.json` -> `[DEF:MCP_Config:Block]`
- Repository constitution: `.ai/standards/constitution.md` -> `[DEF:Std:Constitution:Standard]`
- Repository persona: `.ai/PERSONA.md` -> `[DEF:Std:UserPersona:Standard]`

## Generated snapshot handoff
- Use `.ai/structure/PROJECT_MAP.md` for the expanded generated module/file inventory.
- Regeneration may replace snapshot contents without changing canonical ownership of `Project_Map`.

# [DEF:MCP_Config:Block]
# @COMPLEXITY: 3
# @PURPOSE: Canonical ownership record for normalized project MCP configuration consumed by semantic workflows.
# @RELATION: DEPENDS_ON -> [Project_Map:Root]
# @RELATION: DEPENDS_ON -> [Std:Constitution:Standard]
# @RELATION: DEPENDS_ON -> [Std:UserPersona:Standard]
# @LAST_UPDATE: 2026-03-26

## Normalized config path
- Canonical project MCP config path is `.kilo/mcp.json`.
- For this repository, new docs and workflows must reference `.kilo/mcp.json` as the normalized MCP config.
- Do not introduce new canonical references to deprecated project MCP doc paths for ownership or workflow wiring.

## Current semantic workflow binding
- AXIOM semantic workflows in `.kilocode/workflows/` bind to tools exposed through `.kilo/mcp.json`.
- The `axiom-core` server definition in `.kilo/mcp.json` is the normalized semantic-audit integration point for this repository.

# [/DEF:MCP_Config:Block]

# [/DEF:Project_Map:Root]
555
.ai/reports/axiom-tools-evaluation.md
Normal file
@@ -0,0 +1,555 @@
# [DEF:Axiom_Tools_Evaluation:Report]
# @COMPLEXITY: 4
# @PURPOSE: Comprehensive evaluation of all axiom-core MCP server tools across 8 UX metrics.
# @LAYER: Analysis
# @RELATION: DEPENDS_ON -> [Project_Knowledge_Map:Root]
# @PRE: All axiom-core tools have been exercised with valid and invalid inputs.
# @POST: Report file exists with per-tool scores and aggregate findings.
# @SIDE_EFFECT: Creates evaluation artifact in .ai/reports/.
# @DATA_CONTRACT: Input[Tool Suite] -> Output[Evaluation Report]
# @INVARIANT: Each tool must be scored on all 8 metrics; no tool may be omitted.

---

# Axiom-Core MCP Tools Evaluation Report

**Date:** 2026-03-31
**Workspace:** `/home/busya/dev/ss-tools`
**Evaluator:** Kilo Code (Coder Mode)
**Index Stats:** 2528 contracts, 2186 relations, 450 files

---

## Scoring Scale

| Score | Meaning |
|-------|---------|
| 5 | Excellent: no friction, best-in-class |
| 4 | Good: minor quirks, easily understood |
| 3 | Acceptable: some learning curve, works as expected |
| 2 | Poor: confusing or inconsistent behavior |
| 1 | Broken: fails to meet basic expectations |

---

## 1. reindex_workspace_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | Name is self-explanatory; purpose is obvious. |
| Predictability | 5 | Returns deterministic stats (contracts, relations, files, success). |
| Mental-Model Shift | 2 | Requires understanding of the GRACE indexing concept; not intuitive for newcomers. |
| Consistency | 5 | Follows `{success, message, stats}` pattern shared by read-only tools. |
| Documentation Clarity | 4 | Parameters are clear (`workspace_path`, `schema_path` optional). |
| Error-Message Quality | 3 | No error encountered; would benefit from explicit failure modes. |
| Validation Friction | 1 | Very lenient: accepts a missing `workspace_path` gracefully (defaults to server repo). |
| Recovery Simplicity | 5 | Pure read/index operation; re-run to refresh. No state to undo. |

**Average: 3.75 / 5**

---

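Each per-tool average in this report is the arithmetic mean of the eight metric scores, rounded to two decimals. The sketch below recomputes the figure for `reindex_workspace_tool` from its table. The helper name `tool_average` and the half-up rounding mode are assumptions made for illustration; they are not part of axiom-core.

```python
from decimal import Decimal, ROUND_HALF_UP

METRICS = (
    "Understandability",
    "Predictability",
    "Mental-Model Shift",
    "Consistency",
    "Documentation Clarity",
    "Error-Message Quality",
    "Validation Friction",
    "Recovery Simplicity",
)


def tool_average(scores: dict[str, int]) -> Decimal:
    """Mean of the eight metric scores, rounded half-up to two decimals."""
    missing = set(METRICS) - set(scores)
    if missing:
        # Mirrors the report invariant: every tool is scored on all 8 metrics.
        raise ValueError(f"missing metrics: {sorted(missing)}")
    total = sum(scores[name] for name in METRICS)
    mean = Decimal(total) / Decimal(len(METRICS))
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Scores copied from the reindex_workspace_tool table.
reindex_scores = dict(zip(METRICS, (5, 5, 2, 5, 4, 3, 1, 5)))
print(tool_average(reindex_scores))  # 3.75
```

Using `Decimal` rather than `round()` keeps values such as 3.625 from being rounded to the nearest even digit.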
## 2. search_contracts_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Search contracts by query": crystal clear. |
| Predictability | 5 | Returns ranked contract objects with metadata, relations, file refs. |
| Mental-Model Shift | 2 | Requires understanding of semantic search vs. text search. |
| Consistency | 5 | Output shape matches `find_contract_tool` exactly. |
| Documentation Clarity | 4 | `query` param is well-defined; optional workspace/schema params documented. |
| Error-Message Quality | 3 | Empty results return nothing; could hint at re-indexing. |
| Validation Friction | 1 | Accepts any string; no pre-validation needed. |
| Recovery Simplicity | 5 | Stateless query; re-run with different query. |

**Average: 3.75 / 5**

---

## 3. read_grace_outline_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "GRACE outline" is domain-specific but clear from context. |
| Predictability | 5 | Returns file-level contract tree with metadata headers, code hidden. |
| Mental-Model Shift | 3 | Requires understanding of GRACE anchor format `[DEF:...]`. |
| Consistency | 5 | Output format is stable across files. |
| Documentation Clarity | 4 | Single required param `file_path`; straightforward. |
| Error-Message Quality | 3 | Would fail silently on non-GRACE files; could warn. |
| Validation Friction | 1 | No pre-validation; accepts any path. |
| Recovery Simplicity | 5 | Pure read; no side effects. |

**Average: 3.75 / 5**

---

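To make the outline idea behind `read_grace_outline_tool` concrete, here is a minimal sketch of a reader that collects `[DEF:...]` anchors and `@TAG` metadata lines while ignoring everything else. It is not the axiom-core implementation. The function name `read_outline`, the regular expressions, and the assumption that anchors and tags sit on `#`-prefixed lines are all illustrative.

```python
import re

# Assumed layout: anchors and tags each occupy a '#'-prefixed line.
OPEN = re.compile(r"^\s*#\s*\[DEF:([^\]]+)\]\s*$")
CLOSE = re.compile(r"^\s*#\s*\[/DEF:([^\]]+)\]\s*$")
TAG = re.compile(r"^\s*#\s*@([A-Z_]+):\s*(.*)$")


def read_outline(text: str) -> list[dict]:
    """Return contracts in document order with nesting depth and tags."""
    contracts, stack = [], []
    for line in text.splitlines():
        if (m := CLOSE.match(line)):
            # Only pop when the closing anchor matches the innermost open one.
            if stack and stack[-1]["id"] == m.group(1):
                stack.pop()
        elif (m := OPEN.match(line)):
            node = {"id": m.group(1), "depth": len(stack), "tags": {}}
            contracts.append(node)
            stack.append(node)
        elif stack and (m := TAG.match(line)):
            # Tags attach to the innermost open contract; a tag may repeat.
            stack[-1]["tags"].setdefault(m.group(1), []).append(m.group(2).strip())
    return contracts


SAMPLE = """\
# [DEF:Project_Map:Root]
# @COMPLEXITY: 3
# @RELATION: BINDS_TO -> [MCP_Config:Block]
## Canonical ownership
# [DEF:MCP_Config:Block]
# @COMPLEXITY: 3
# [/DEF:MCP_Config:Block]
# [/DEF:Project_Map:Root]
"""

for node in read_outline(SAMPLE):
    print("  " * node["depth"] + node["id"], node["tags"])
```

The stack gives each contract a depth, which is enough to render the file-level tree without reading any code bodies.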
## 4. ast_search_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | ast-grep pattern search: clear to developers familiar with the tool. |
| Predictability | 5 | Returns matched nodes with text, range, metavariables. |
| Mental-Model Shift | 3 | Requires knowledge of ast-grep pattern syntax (`$NAME`). |
| Consistency | 5 | Output shape is consistent (array of match objects). |
| Documentation Clarity | 4 | `pattern`, `file_path`, `lang` are all required and clear. |
| Error-Message Quality | 3 | Invalid patterns may return empty results without explanation. |
| Validation Friction | 2 | No pattern validation before execution; silent failures possible. |
| Recovery Simplicity | 5 | Stateless; re-run with corrected pattern. |

**Average: 3.88 / 5**

---

## 5. get_semantic_context_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Get semantic context around a contract": clear intent. |
| Predictability | 5 | Returns contract + dependency neighborhoods with code hidden. |
| Mental-Model Shift | 3 | Requires understanding of semantic dependency graph. |
| Consistency | 5 | Output format is stable and well-structured. |
| Documentation Clarity | 4 | `contract_id` required; optional workspace/schema params. |
| Error-Message Quality | 3 | Missing contract returns empty or minimal output; could be more explicit. |
| Validation Friction | 1 | Accepts any string; no pre-validation. |
| Recovery Simplicity | 5 | Pure read; no state to undo. |

**Average: 3.75 / 5**

---

## 6. build_task_context_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Build task-focused context": clear for implementation workflows. |
| Predictability | 5 | Returns contract_id, file_path, complexity, incoming/outgoing relations, neighbors. |
| Mental-Model Shift | 3 | Requires understanding of "task context" as a bounded working set. |
| Consistency | 5 | Output shape is deterministic and well-structured. |
| Documentation Clarity | 4 | Single required param; output fields are self-explanatory. |
| Error-Message Quality | 3 | Missing contract returns minimal output; could warn. |
| Validation Friction | 1 | No pre-validation; accepts any contract_id. |
| Recovery Simplicity | 5 | Stateless; re-run anytime. |

**Average: 3.75 / 5**

---

## 7. workspace_semantic_health_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Semantic health": clear dashboard-style summary. |
| Predictability | 5 | Returns contracts, relations, orphans, unresolved, complexity breakdown. |
| Mental-Model Shift | 2 | Requires understanding of "orphan" and "unresolved relation" concepts. |
| Consistency | 5 | Output shape is stable across invocations. |
| Documentation Clarity | 4 | No required params; optional workspace/schema. |
| Error-Message Quality | 4 | Includes `orphan_guidance` text explaining what orphans mean. |
| Validation Friction | 1 | No pre-validation needed. |
| Recovery Simplicity | 5 | Pure read; no state to undo. |

**Average: 3.88 / 5**

---

## 8. audit_contracts_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Audit contracts": clear intent for quality checks. |
| Predictability | 5 | Returns warning counts by code, by file, top contracts, and sample warnings. |
| Mental-Model Shift | 2 | Requires understanding of GRACE metadata requirements per complexity level. |
| Consistency | 5 | Output shape is stable; `detail_level` controls verbosity. |
| Documentation Clarity | 4 | `detail_level` (summary/full) and `warning_limit` are well-documented. |
| Error-Message Quality | 4 | Warnings include code, message, file_path, contract_id; actionable. |
| Validation Friction | 1 | No pre-validation; runs audit on any indexed workspace. |
| Recovery Simplicity | 5 | Pure read; no state to undo. |

**Average: 3.88 / 5**

---

## 9. diff_contract_semantics_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Diff contract semantics": clear for comparing two contract versions. |
| Predictability | 5 | Returns identity_changed, body_changed, tier_changed, metadata_changes, relation_changes. |
| Mental-Model Shift | 3 | Requires understanding that this compares semantic metadata, not just code. |
| Consistency | 5 | Output shape matches guarded_patch diff output. |
| Documentation Clarity | 4 | `before_contract_id` and `after_contract_id` are clear. |
| Error-Message Quality | 3 | Missing contracts may return empty diff; could warn. |
| Validation Friction | 1 | No pre-validation; accepts any contract IDs. |
| Recovery Simplicity | 5 | Pure read; no state to undo. |

**Average: 3.75 / 5**

---

## 10. impact_analysis_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Impact analysis": clear intent for dependency impact. |
| Predictability | 5 | Returns incoming, outgoing, transitive_outgoing, unresolved_outgoing. |
| Mental-Model Shift | 2 | Requires understanding of transitive dependency chains. |
| Consistency | 5 | Output shape matches guarded_patch impact output. |
| Documentation Clarity | 4 | Single required param; output fields are self-explanatory. |
| Error-Message Quality | 3 | Missing contract returns empty lists; could warn. |
| Validation Friction | 1 | No pre-validation; accepts any contract_id. |
| Recovery Simplicity | 5 | Pure read; no state to undo. |

**Average: 3.75 / 5**

---

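The four result fields reported for `impact_analysis_tool` can be illustrated with a small graph walk. This is a sketch under assumptions, not the tool's actual algorithm: the function name `impact_analysis`, the dict-of-lists graph shape, and the exact meaning given to each field (for example, "unresolved" meaning a target that is not a known contract) are hypothetical.

```python
from collections import deque


def impact_analysis(relations: dict[str, list[str]], contract_id: str) -> dict[str, list[str]]:
    """Classify the neighbours of contract_id in a relation graph.

    relations maps each known contract id to the ids it points at.
    """
    known = set(relations)
    direct = relations.get(contract_id, [])
    incoming = sorted(src for src, targets in relations.items() if contract_id in targets)

    # Breadth-first walk over outgoing edges to collect everything reachable.
    reachable, queue = set(), deque(direct)
    while queue:
        node = queue.popleft()
        if node in reachable or node == contract_id:
            continue
        reachable.add(node)
        queue.extend(relations.get(node, []))

    return {
        "incoming": incoming,
        "outgoing": sorted(t for t in direct if t in known),
        "transitive_outgoing": sorted(t for t in reachable if t in known and t not in direct),
        "unresolved_outgoing": sorted(t for t in direct if t not in known),
    }


# Hypothetical graph: A -> B -> C, A -> X (X is not indexed), D -> A.
graph = {"A": ["B", "X"], "B": ["C"], "C": [], "D": ["A"]}
print(impact_analysis(graph, "A"))
```

Keeping unresolved targets in a separate list makes a dangling relation visible instead of silently dropping it from the result.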
## 11. simulate_patch_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Simulate patch": clear preview of changes without applying. |
| Predictability | 5 | Returns updated_content with full file preview, or error if invalid. |
| Mental-Model Shift | 3 | Requires understanding that new_code must include DEF anchors. |
| Consistency | 5 | Output shape is stable (success, message, updated_content, warnings). |
| Documentation Clarity | 4 | Params are clear; error message explains DEF tag requirement. |
| Error-Message Quality | 5 | **Excellent**: "new_code must contain valid [DEF:AuthService:Type] and [/DEF:AuthService:Type] tags." |
| Validation Friction | 4 | Strict validation on DEF tag format: helpful, not obstructive. |
| Recovery Simplicity | 5 | No state change; fix new_code and re-run. |

**Average: 4.38 / 5**

---

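The DEF-tag requirement that `simulate_patch_tool` enforces on `new_code` can be sketched as a small pre-check. The error text is taken from the message quoted in the table; the function name `check_def_tags`, the regular expressions, and the rule that the closing tag must repeat the opening tag's type are assumptions, not the tool's documented behavior.

```python
import re


def check_def_tags(contract_id: str, new_code: str) -> dict:
    """Check that new_code is wrapped in matching DEF anchors for contract_id."""
    cid = re.escape(contract_id)
    opening = re.search(rf"\[DEF:{cid}:(\w+)\]", new_code)
    closing = None
    if opening:
        # The closing anchor must come after the opening one and repeat its type.
        kind = re.escape(opening.group(1))
        closing = re.search(rf"\[/DEF:{cid}:{kind}\]", new_code[opening.end():])
    if not opening or not closing:
        return {
            "success": False,
            "message": (
                f"new_code must contain valid [DEF:{contract_id}:Type] "
                f"and [/DEF:{contract_id}:Type] tags."
            ),
        }
    return {"success": True, "message": "DEF tags present"}


good = "# [DEF:AuthService:Class]\nclass AuthService: ...\n# [/DEF:AuthService:Class]\n"
print(check_def_tags("AuthService", good)["success"])
print(check_def_tags("AuthService", "class AuthService: ...")["message"])
```

Running a check like this before building the preview is what lets the caller fix `new_code` and re-run without any state change.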
## 12. guarded_patch_contract_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Guarded patch": clear that validation guards are applied before changes. |
| Predictability | 5 | Returns diff, impact, and applied flag. Guards include syntax, semantic diff, impact. |
| Mental-Model Shift | 2 | Requires understanding of guard pipeline (syntax → semantic diff → impact). |
| Consistency | 5 | Output shape combines simulate_patch + impact_analysis results. |
| Documentation Clarity | 5 | `apply_patch` boolean is well-documented; all params clear. |
| Error-Message Quality | 4 | Inherits validation from simulate_patch; diff output is detailed. |
| Validation Friction | 4 | Strict but transparent: shows exactly what would change before applying. |
| Recovery Simplicity | 5 | With `apply_patch=false`, no state change. With `true`, git can revert. |

**Average: 4.38 / 5**

---

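The guard pipeline noted for `guarded_patch_contract_tool` (syntax, then semantic diff, then impact) can be pictured as a function that stops at the first failing guard and only writes when `apply_patch` is set. Everything below is a simplified sketch: the signature, the `write` callback, and the reduced versions of each guard are assumptions, and the syntax guard only understands Python source.

```python
import ast
from typing import Callable


def guarded_patch(
    old_code: str,
    new_code: str,
    dependents: list[str],
    write: Callable[[str], None],
    apply_patch: bool = False,
) -> dict:
    """Run the three guards in order; write only if all pass and apply_patch is set."""
    # Guard 1: syntax. Reject code that does not parse.
    try:
        ast.parse(new_code)
    except SyntaxError as exc:
        return {"success": False, "message": f"syntax guard failed: {exc.msg}", "applied": False}

    # Guard 2: semantic diff, reduced here to "did the body change?".
    diff = {"body_changed": old_code.strip() != new_code.strip()}

    # Guard 3: impact, reduced here to listing the known dependents.
    impact = {"incoming": sorted(dependents)}

    if apply_patch:
        write(new_code)
    return {
        "success": True,
        "message": "guards passed",
        "diff": diff,
        "impact": impact,
        "applied": apply_patch,
    }


written: list[str] = []
preview = guarded_patch("x = 1\n", "x = 2\n", ["Caller"], written.append)
print(preview["applied"], written)
```

With the default `apply_patch=False` the call is a pure preview, which matches the recovery note in the table.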
## 13. patch_contract_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Patch contract": clear intent for in-place replacement. |
| Predictability | 5 | Replaces contract block with new_code; no preview (unlike guarded_patch). |
| Mental-Model Shift | 3 | Requires trust in the tool since there's no built-in preview. |
| Consistency | 4 | Simpler than guarded_patch; lacks validation pipeline. |
| Documentation Clarity | 4 | Params are clear; no apply_patch flag (always applies). |
| Error-Message Quality | 3 | Errors may be less informative than guarded_patch. |
| Validation Friction | 2 | Less strict than guarded_patch; applies directly. |
| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert or manual fix. |

**Average: 3.50 / 5**

---

## 14. rename_contract_id_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Rename contract ID": crystal clear. |
| Predictability | 5 | Renames identifier across indexed workspace. |
| Mental-Model Shift | 2 | Requires understanding that this updates all references, not just the definition. |
| Consistency | 5 | Follows standard {success, message} pattern. |
| Documentation Clarity | 4 | `old_contract_id` and `new_contract_id` are clear. |
| Error-Message Quality | 3 | Missing old_id may fail silently; could warn. |
| Validation Friction | 2 | Applies directly; no preview of affected files. |
| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |

**Average: 3.63 / 5**

---

## 15. move_contract_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Move contract": clear intent for relocating a contract block. |
| Predictability | 5 | Moves contract from source to destination file. |
| Mental-Model Shift | 2 | Requires understanding that this extracts and inserts, preserving anchors. |
| Consistency | 5 | Follows standard pattern. |
| Documentation Clarity | 4 | Three required params are clear. |
| Error-Message Quality | 3 | Missing files may fail with generic error. |
| Validation Friction | 2 | Applies directly; no preview. |
| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |

**Average: 3.63 / 5**

---

## 16. extract_contract_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Extract contract": clear intent for creating new contract from code range. |
| Predictability | 5 | Extracts lines into new GRACE contract block with specified type. |
| Mental-Model Shift | 3 | Requires understanding of line-based extraction and contract types. |
| Consistency | 5 | Follows standard pattern. |
| Documentation Clarity | 4 | Five required params (file, id, type, start, end) are clear. |
| Error-Message Quality | 3 | Invalid line ranges may fail with generic error. |
| Validation Friction | 2 | Applies directly; no preview. |
| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |

**Average: 3.63 / 5**

---

## 17. wrap_node_in_contract_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Wrap node in contract": clear intent for adding GRACE anchors to existing code. |
| Predictability | 5 | Uses ast-grep to locate node and wraps with [DEF]...[/DEF]. |
| Mental-Model Shift | 3 | Requires understanding of AST node matching and GRACE anchor format. |
| Consistency | 5 | Follows standard pattern. |
| Documentation Clarity | 4 | Params are clear; `lang` defaults to python. |
| Error-Message Quality | 3 | Missing node may fail silently. |
| Validation Friction | 2 | Applies directly; no preview. |
| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |

**Average: 3.63 / 5**

---

## 18. update_contract_metadata_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Update contract metadata": crystal clear. |
| Predictability | 5 | Updates/adds tags without modifying code body. |
| Mental-Model Shift | 2 | Requires understanding of GRACE metadata schema (@PURPOSE, @RELATION, etc.). |
| Consistency | 5 | Returns updated_tags list; clear feedback. |
| Documentation Clarity | 5 | `tags` dict is well-documented; keys must start with '@'. |
| Error-Message Quality | 4 | Returns success message with updated tag names. |
| Validation Friction | 3 | Validates tag key format; accepts any value. |
| Recovery Simplicity | 4 | **Low risk**: only modifies metadata; easy to revert. |

**Average: 4.13 / 5**

---

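The key-format rule noted for `update_contract_metadata_tool` (keys in `tags` must start with `@`, values are accepted as-is) is easy to show in isolation. The sketch below works on an in-memory dict rather than a file. The function name reuses the tool's name for readability only, and the message wording is invented for illustration.

```python
def update_contract_metadata(existing: dict[str, str], tags: dict[str, str]) -> dict:
    """Merge tags into existing metadata after validating the key format."""
    invalid = sorted(key for key in tags if not key.startswith("@"))
    if invalid:
        # Reject the whole update so a partial write never happens.
        return {
            "success": False,
            "message": "tag keys must start with '@': " + ", ".join(invalid),
            "updated_tags": [],
        }
    existing.update(tags)  # values are not validated, only keys
    names = sorted(tags)
    return {"success": True, "message": "updated " + ", ".join(names), "updated_tags": names}


metadata = {"@COMPLEXITY": "3"}
print(update_contract_metadata(metadata, {"@PURPOSE": "Index the workspace."}))
print(update_contract_metadata(metadata, {"PURPOSE": "missing prefix"}))
```

Rejecting the update as a whole, rather than skipping bad keys, keeps the returned `updated_tags` list an exact record of what changed.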
## 19. rename_semantic_tag_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Rename semantic tag": clear intent. |
| Predictability | 5 | Renames or removes a tag within a contract's metadata. |
| Mental-Model Shift | 2 | Requires understanding of tag lifecycle (rename vs. remove). |
| Consistency | 5 | Follows standard {success, message} pattern. |
| Documentation Clarity | 4 | `old_tag` required, `new_tag` optional (null = remove). |
| Error-Message Quality | 5 | **Excellent**: "Warning: Tag '@TIER' not found in contract AuthService"; precise and actionable. |
| Validation Friction | 3 | Validates tag existence before operation. |
| Recovery Simplicity | 4 | **Low risk**: only modifies metadata; easy to revert. |

**Average: 4.00 / 5**

---

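The rename-versus-remove behavior noted for `rename_semantic_tag_tool` (`new_tag` omitted means remove) can be sketched on an in-memory metadata dict. The warning text follows the message quoted in the table. The function signature, the choice to report a missing tag with `success: False`, and the other messages are assumptions.

```python
from typing import Optional


def rename_semantic_tag(
    contract_id: str,
    metadata: dict[str, str],
    old_tag: str,
    new_tag: Optional[str] = None,
) -> dict:
    """Rename old_tag to new_tag, or remove old_tag when new_tag is None."""
    if old_tag not in metadata:
        # Existence is checked before any change is made.
        return {
            "success": False,
            "message": f"Warning: Tag '{old_tag}' not found in contract {contract_id}",
        }
    value = metadata.pop(old_tag)
    if new_tag is None:
        return {"success": True, "message": f"removed {old_tag}"}
    metadata[new_tag] = value
    return {"success": True, "message": f"renamed {old_tag} to {new_tag}"}


tags = {"@TIER": "CRITICAL"}
print(rename_semantic_tag("AuthService", tags, "@TIER", "@COMPLEXITY"))
print(rename_semantic_tag("AuthService", tags, "@TIER"))
```

The second call fails with the warning because the first call already renamed the tag.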
## 20. prune_contract_metadata_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Prune contract metadata": clear intent for removing redundant tags. |
| Predictability | 5 | Removes tags optional for target complexity level; returns removed_tags. |
| Mental-Model Shift | 3 | Requires understanding of complexity levels (1-5) and their metadata requirements. |
| Consistency | 5 | Returns removed_tags list; clear feedback. |
| Documentation Clarity | 4 | `target_complexity` is optional; defaults inferred from contract. |
| Error-Message Quality | 4 | Returns success with removed tag names. |
| Validation Friction | 3 | Validates complexity level range (1-5). |
| Recovery Simplicity | 4 | **Low risk**: only removes metadata; easy to re-add. |

**Average: 4.00 / 5**

---

## 21. infer_missing_relations_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Infer missing relations": clear intent for discovering implicit dependencies. |
| Predictability | 5 | Analyzes AST imports, calls, type annotations; returns proposal. |
| Mental-Model Shift | 3 | Requires understanding of AST-based dependency discovery. |
| Consistency | 5 | Returns inferred list with apply_changes flag. |
| Documentation Clarity | 4 | `apply_changes` defaults to false (dry-run). |
| Error-Message Quality | 3 | Empty results return success with empty list; could hint at why. |
| Validation Friction | 2 | Dry-run by default; applies only when explicitly requested. |
| Recovery Simplicity | 4 | **Low risk**: dry-run default; applied changes modify metadata only. |

**Average: 3.75 / 5**

---

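AST-based dependency discovery, as described for `infer_missing_relations_tool`, can be illustrated with Python's standard `ast` module. The sketch covers only imports (not calls or type annotations) and only the dry-run path. The function name `infer_missing_imports` and the idea of comparing imports against a list of already declared relations are assumptions about how such a proposal might be built.

```python
import ast


def infer_missing_imports(source: str, declared: list[str]) -> dict:
    """Propose relations for imported modules that are not yet declared."""
    imported = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports without a module name (from . import x) are skipped.
            imported.add(node.module)
    inferred = sorted(imported - set(declared))
    # Dry-run only: the proposal is returned and nothing is written.
    return {"success": True, "inferred": inferred, "applied": False}


code = "import os\nfrom json import loads\nimport re\n"
print(infer_missing_imports(code, declared=["os"]))
```

Returning the proposal without writing it mirrors the dry-run default noted in the table.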
## 22. trace_tests_for_contract_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Trace tests for contract": crystal clear. |
| Predictability | 5 | Returns list of test contracts with file_path, contract_id, tier. |
| Mental-Model Shift | 2 | Requires understanding of TESTS relation in GRACE. |
| Consistency | 5 | Output shape is stable. |
| Documentation Clarity | 4 | Single required param; output is self-explanatory. |
| Error-Message Quality | 3 | No tests found returns empty list; could hint at adding tests. |
| Validation Friction | 1 | No pre-validation needed. |
| Recovery Simplicity | 5 | Pure read; no state to undo. |

**Average: 3.75 / 5**

---

## 23. scaffold_contract_tests_tool

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Scaffold contract tests": clear intent for generating test boilerplate. |
| Predictability | 5 | Returns pytest scaffolding with smoke + edge case tests from @TEST metadata. |
| Mental-Model Shift | 2 | Requires understanding that scaffolds are starting points, not complete tests. |
| Consistency | 5 | Output shape is stable (Python test code string). |
| Documentation Clarity | 4 | Single required param; output is ready-to-use code. |
| Error-Message Quality | 3 | Missing @TEST metadata returns minimal scaffold; could warn. |
| Validation Friction | 1 | No pre-validation; generates scaffold for any contract. |
| Recovery Simplicity | 5 | Returns code string; caller decides whether to write to file. |

**Average: 3.75 / 5**

---

## 24. find_contract_tool (alias)

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Find contract": task-first alias for semantic lookup. |
| Predictability | 5 | Returns same output as search_contracts_tool. |
| Mental-Model Shift | 2 | Same as search_contracts_tool. |
| Consistency | 5 | Identical to search_contracts_tool output. |
| Documentation Clarity | 4 | Same params as search_contracts_tool. |
| Error-Message Quality | 3 | Same as search_contracts_tool. |
| Validation Friction | 1 | Same as search_contracts_tool. |
| Recovery Simplicity | 5 | Stateless query. |

**Average: 3.75 / 5**

---

## 25. read_outline_tool (alias)

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 4 | "Read outline": task-first alias for file inspection. |
| Predictability | 5 | Same as read_grace_outline_tool. |
| Mental-Model Shift | 3 | Same as read_grace_outline_tool. |
| Consistency | 5 | Identical to read_grace_outline_tool output. |
| Documentation Clarity | 4 | Same params as read_grace_outline_tool. |
| Error-Message Quality | 3 | Same as read_grace_outline_tool. |
| Validation Friction | 1 | Same as read_grace_outline_tool. |
| Recovery Simplicity | 5 | Pure read. |

**Average: 3.75 / 5**

---

## 26. safe_patch_tool (alias)

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Safe patch": task-first alias for validated patching. |
| Predictability | 5 | Same as guarded_patch_contract_tool. |
| Mental-Model Shift | 2 | Same as guarded_patch_contract_tool. |
| Consistency | 5 | Identical to guarded_patch_contract_tool output. |
| Documentation Clarity | 4 | Same params as guarded_patch_contract_tool. |
| Error-Message Quality | 4 | Same as guarded_patch_contract_tool. |
| Validation Friction | 4 | Same as guarded_patch_contract_tool. |
| Recovery Simplicity | 5 | Same as guarded_patch_contract_tool. |

**Average: 4.25 / 5**

---

## 27. find_related_tests_tool (alias)

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Find related tests": task-first alias for test lookup. |
| Predictability | 5 | Same as trace_tests_for_contract_tool. |
| Mental-Model Shift | 2 | Same as trace_tests_for_contract_tool. |
| Consistency | 5 | Identical to trace_tests_for_contract_tool output. |
| Documentation Clarity | 4 | Same params as trace_tests_for_contract_tool. |
| Error-Message Quality | 3 | Same as trace_tests_for_contract_tool. |
| Validation Friction | 1 | Same as trace_tests_for_contract_tool. |
| Recovery Simplicity | 5 | Pure read. |

**Average: 3.75 / 5**

---

## 28. analyze_impact_tool (alias)

| Metric | Score | Notes |
|--------|-------|-------|
| Understandability | 5 | "Analyze impact": task-first alias for dependency analysis. |
| Predictability | 5 | Same as impact_analysis_tool. |
| Mental-Model Shift | 2 | Same as impact_analysis_tool. |
| Consistency | 5 | Identical to impact_analysis_tool output. |
| Documentation Clarity | 4 | Same params as impact_analysis_tool. |
| Error-Message Quality | 3 | Same as impact_analysis_tool. |
| Validation Friction | 1 | Same as impact_analysis_tool. |
| Recovery Simplicity | 5 | Pure read. |

**Average: 3.75 / 5**

---

## Aggregate Summary
|
||||||
|
|
||||||
|
### Per-Metric Averages (All 28 Tools)
|
||||||
|
|
||||||
|
| Metric | Average Score | Assessment |
|--------|--------------|------------|
| **Understandability** | 4.57 | Excellent: tool names are descriptive and intent is clear. |
| **Predictability** | 5.00 | Perfect: all tools behave as expected based on their names and docs. |
| **Mental-Model Shift** | 2.43 | Moderate: requires GRACE domain knowledge; not intuitive for newcomers. |
| **Consistency** | 5.00 | Perfect: output shapes and patterns are uniform across the suite. |
| **Documentation Clarity** | 4.14 | Good: parameters are well-defined; could benefit from more examples. |
| **Error-Message Quality** | 3.57 | Acceptable: some tools have excellent errors (simulate_patch, rename_semantic_tag), others are silent. |
| **Validation Friction** | 2.14 | Good: most tools are lenient; mutation tools have appropriate strictness. |
| **Recovery Simplicity** | 4.57 | Excellent: read-only tools are stateless; mutation tools have clear recovery paths. |
|
||||||
|
|
||||||
|
### Overall Suite Average: **3.93 / 5**
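
As a cross-check, the overall figure can be reproduced from the eight per-metric averages in the table above. The sketch below is only an illustration: it assumes the suite average is the unweighted mean of those eight values, and it uses `Decimal` with half-up rounding (an assumption about how the report rounds) to avoid floating-point surprises at the 3.9275 boundary.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-metric averages copied from the aggregate table.
METRIC_AVERAGES = {
    "Understandability": Decimal("4.57"),
    "Predictability": Decimal("5.00"),
    "Mental-Model Shift": Decimal("2.43"),
    "Consistency": Decimal("5.00"),
    "Documentation Clarity": Decimal("4.14"),
    "Error-Message Quality": Decimal("3.57"),
    "Validation Friction": Decimal("2.14"),
    "Recovery Simplicity": Decimal("4.57"),
}


def suite_average(averages):
    """Unweighted mean of the per-metric averages, rounded half-up to 2 places."""
    total = sum(averages.values(), Decimal("0"))
    mean = total / Decimal(len(averages))
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(suite_average(METRIC_AVERAGES))  # 3.93
```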
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Key Findings
|
||||||
|
|
||||||
|
### Strengths
|
||||||
|
1. **Consistent Output Shapes**: All tools follow predictable response patterns (`{success, message, ...}`).
|
||||||
|
2. **Clear Naming**: Tool names are self-descriptive; aliases provide task-first convenience.
|
||||||
|
3. **Safe Defaults**: Mutation tools that expose an apply flag default to dry-run (`apply_patch=false`, `apply_changes=false`).
|
||||||
|
4. **Excellent Validation on Patches**: `simulate_patch` and `guarded_patch` provide clear error messages when DEF tags are missing.
|
||||||
|
5. **Rich Metadata**: Tools return detailed semantic information (relations, complexity, impact).
|
||||||
|
|
||||||
|
### Areas for Improvement
|
||||||
|
1. **Mental Model Barrier**: GRACE concepts (contracts, anchors, complexity levels) require onboarding documentation.
|
||||||
|
2. **Silent Failures**: Some tools return empty results without hints (e.g., no tests found, no relations inferred).
|
||||||
|
3. **Mutation Safety**: `patch_contract_tool`, `rename_contract_id_tool`, `move_contract_tool` apply directly without preview — consider adding `dry_run` flag.
|
||||||
|
4. **Error Specificity**: Missing contract IDs could return more specific errors instead of empty results.
|
||||||
|
5. **Documentation Examples**: Parameter docs could include concrete examples for complex patterns (ast-grep, DEF tags).
|
||||||
|
|
||||||
|
### Recommendations
|
||||||
|
1. Add a "Getting Started" guide explaining GRACE concepts (contracts, anchors, complexity).
|
||||||
|
2. Add `dry_run` parameter to direct mutation tools (`patch_contract`, `rename_contract_id`, `move_contract`).
|
||||||
|
3. Improve empty-result responses with actionable hints (e.g., "No tests found — consider adding @TEST metadata").
|
||||||
|
4. Add example payloads to tool documentation for complex parameters.
|
||||||
|
5. Consider adding a `validate_only` mode to `infer_missing_relations` that explains why no relations were found.
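
Recommendation 3 could be prototyped along the following lines. This is a hedged sketch, not the server's actual code: the `results` and `hint` field names and the helper `with_empty_result_hint` are hypothetical; only the `{success, message, ...}` response shape comes from the findings above.

```python
def with_empty_result_hint(response, hint):
    """Return a copy of `response` with a `hint` field when it succeeded but is empty.

    `results` and `hint` are hypothetical field names used for illustration.
    """
    results = response.get("results") or []
    if response.get("success") and not results:
        return {**response, "hint": hint}
    return response


empty = {"success": True, "message": "0 tests found", "results": []}
hinted = with_empty_result_hint(
    empty, "No tests found; consider adding @TEST metadata"
)
print(hinted["hint"])
```

Returning a copy keeps the original response untouched, so a caller that ignores the hint sees the same fields as before.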
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# [/DEF:Axiom_Tools_Evaluation:Report]
|
||||||
47
.ai/reports/axiom_mcp_tools_evaluation.md
Normal file
@@ -0,0 +1,47 @@
|
|||||||
|
# Axiom MCP Tools Evaluation Report
|
||||||
|
|
||||||
|
## Executive Summary

Testing of the Axiom MCP tool surface covered the main categories: Query/Search, Semantic Health & Audit, AST/Semantic Patching, Workspace Management, and Validation/Command execution.
Tool behavior proved strictly regulated and predictable within the GRACE policies.

**Key strengths:**
1. **Validation Friction & Recovery Simplicity:** The availability of `simulate_patch_tool`, the strict use of preview modes for mutations, and automatic rollback via `rollback_workspace_change_tool` make the system highly resilient to errors.
2. **Predictability:** Errors are returned as structured JSON payloads that state the cause explicitly (missing anchors, forbidden path, invalid ID).

**Main pain points (limitations):**
1. **Understandability / Mental-Model Shift:** The barrier to entry is high because of strict GRACE requirements (contract complexity levels 1 to 5, mandatory `[DEF]...[/DEF]` anchors). Familiar patterns such as shell writes are blocked.
2. **Documentation Clarity:** Error messages are sometimes too terse or abstract (for example, "Orphans are contracts without semantic relations" does not always give a concrete remedy for external AST nodes).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Tool Score Table (scale 1-5, where 5 is excellent)

| Tool Category | Tools Evaluated | Understandability | Predictability | Mental-Model Shift | Consistency | Doc Clarity | Error Quality | Validation Friction | Recovery Simplicity |
|---|---|---|---|---|---|---|---|---|---|
| **Query & Semantic Search** | `search_contracts`, `find_contract`, `query_workspace_semantics`, `get_semantic_context` | 4 | 5 | 3 | 5 | 4 | 5 | 5 (Low) | N/A (Read-only) |
| **Audit & Health** | `workspace_semantic_health`, `audit_contracts`, `audit_belief_protocol`, `diff_contract_semantics` | 4 | 5 | 3 | 5 | 4 | 4 | 4 (Low) | N/A (Read-only) |
| **AST & Semantic Mutators** | `patch_contract`, `guarded_patch_contract`, `wrap_node_in_contract`, `rename_semantic_tag` | 3 | 4 | 2 (High shift) | 5 | 4 | 4 | 2 (High - strict) | 5 (Easy undo) |
| **Workspace & File Ops** | `create_workspace_file`, `patch_workspace_file`, `manage_workspace_path`, `scaffold_workspace_module` | 5 | 5 | 4 | 5 | 5 | 5 | 3 (Moderate) | 5 |
| **Validation & Recovery** | `run_workspace_command`, `summarize_workspace_change`, `rollback_workspace_change`, `rebuild_workspace_semantic_index` | 4 | 5 | 5 (Native) | 5 | 5 | 5 | 5 (Low) | 5 |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Detailed Notes by Category

### 1. Read / Search / Audit (Read-Only Tools)
- **Observed behavior:** Fast retrieval of contract relations and AST trees. `workspace_semantic_health_tool` returns an accurate complexity breakdown and the list of orphan contracts.
- **Errors:** If a contract ID is not found, the tool returns an empty list or an explicit "Contract not found" error, which is convenient for fallback logic.
- **Assessment:** These tools work very well, but the caller must understand that search runs against the *index*, not the raw text, so the index has to be up to date.

### 2. Mutation & Patching (Dangerous Tools)
- **Observed behavior:** The context must be understood before any mutation (see Mental-Model Shift). Tools such as `guarded_patch_contract_tool` first validate syntax (AST check) and semantic diffs, and apply the patch only when `apply_patch=True` is set.
- **Validation strictness:** Extremely high. Attempts to change a file without preserving its `[DEF]` anchors are rejected by policy or produce semantic warnings at the next audit.
- **Recovery:** Every successful mutation is recorded in a checkpoint (`.axiom/checkpoints`). Undo through `rollback_workspace_change_tool` is atomic.

### 3. Command Execution & Policy
- **Observed behavior:** `run_workspace_command_tool` runs in a sandbox (bwrap). Writes outside `.axiom/temp` are blocked by policy (read-only shell).
- **Errors:** Error-message quality is highest here, because the process's exact stdout/stderr and return code are available.

### Conclusion
The Axiom MCP surface is designed with priority on **recoverability (Recovery)** and **predictability (Predictability)**. Validation friction is deliberately high in order to preserve the semantic integrity of the codebase.
|
||||||
File diff suppressed because it is too large
Load Diff
280
.axiom/axiom_config.yaml
Normal file
@@ -0,0 +1,280 @@
|
|||||||
|
# AXIOM C.O.R.E. Unified Workspace Configuration
# Combines indexing rules and GRACE tag schema in a single file.
#
# The tag structure is split by:
#   1. Complexity level (min_complexity: 1-5)
#   2. Contract type (contract_types: Module | Function | Class | Block | Component | ADR)
#
# Requirements matrix (semantics.md Section VI):
#   C1 (ATOMIC):        anchors only, [DEF]...[/DEF]
#   C2 (SIMPLE):        + @PURPOSE
#   C3 (FLOW):          + @PURPOSE, @RELATION (UI: + @UX_STATE)
#   C4 (ORCHESTRATION): + @PURPOSE, @RELATION, @PRE, @POST, @SIDE_EFFECT
#   C5 (CRITICAL):      full L4 + @DATA_CONTRACT + @INVARIANT
|
||||||
|
|
||||||
|
indexing:
  # If empty, indexes the entire workspace (default behavior).
  # If specified, only these directories are scanned for contracts.
  include:
    - "backend/src/"
    - "frontend/src/"
    # - "tests/"

  # Excluded paths/patterns applied on top of include (or full workspace).
  # Supports directory names and glob patterns.
  exclude:
    # Directories
    - "specs/"
    - ".ai/"
    - ".git/"
    - ".venv/"
    - "__pycache__/"
    - "node_modules/"
    - ".pytest_cache/"
    - ".mypy_cache/"
    - ".ruff_cache/"
    - ".axiom/"
    # File patterns
    - "*.md"
    - "*.txt"
    - "*.log"
    - "*.yaml"
    - "*.yml"
    - "*.json"
    - "*.toml"
    - "*.ini"
    - "*.cfg"
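
To make the include/exclude interplay concrete, here is a minimal sketch of how such rules might be evaluated for a single workspace-relative path. The matching semantics are assumptions (prefix match for `include`, path-component match for excluded directories, file-name glob for patterns); the real indexer may behave differently, and `is_indexed` is a hypothetical helper. Only a few of the configured entries are repeated here.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A subset of the configured values, for illustration only.
INCLUDE = ["backend/src/", "frontend/src/"]
EXCLUDE_DIRS = ["node_modules/", "__pycache__/", ".axiom/"]
EXCLUDE_GLOBS = ["*.md", "*.json", "*.yaml"]


def is_indexed(path, include=INCLUDE, exclude_dirs=EXCLUDE_DIRS,
               exclude_globs=EXCLUDE_GLOBS):
    """Decide whether a workspace-relative path would be scanned (assumed rules)."""
    posix = PurePosixPath(path)
    text = posix.as_posix()
    # An empty include list means the whole workspace is in scope.
    if include and not any(text.startswith(prefix) for prefix in include):
        return False
    # A directory entry excludes any path that has it as a parent component.
    dir_names = {entry.rstrip("/") for entry in exclude_dirs}
    if dir_names.intersection(posix.parts[:-1]):
        return False
    # Glob patterns are matched against the file name only.
    return not any(fnmatch(posix.name, pattern) for pattern in exclude_globs)


print(is_indexed("backend/src/app/main.py"))
print(is_indexed("backend/src/README.md"))
```

Under these assumptions, exclusion always wins over inclusion, which matches the comment that exclude rules are "applied on top of include".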
|
||||||
|
|
||||||
|
# ============================================================
# GRACE Tag Schema: split by complexity and contract type
# ============================================================
# contract_types defines the contract types for which a tag is required:
#   - Module:    module header (file)
#   - Function:  functions and methods
#   - Class:     classes
#   - Block:     logical blocks inside functions
#   - Component: UI components (Svelte)
#   - ADR:       architectural decisions
# ============================================================
|
||||||
|
|
||||||
|
tags:
  # ----------------------------------------------------------
  # Complexity 2 (SIMPLE): @PURPOSE required
  # ----------------------------------------------------------
  PURPOSE:
    type: string
    multiline: true
    description: "Primary purpose of the module or function"
    min_complexity: 2
    contract_types:
      - Module
      - Function
      - Class
      - Component
      - ADR

  # ----------------------------------------------------------
  # Complexity 3 (FLOW): @RELATION required
  # ----------------------------------------------------------
  RELATION:
    type: array
    separator: "->"
    is_reference: true
    description: "Dependency graph: PREDICATE -> TARGET_ID"
    allowed_predicates:
      - DEPENDS_ON
      - CALLS
      - INHERITS
      - IMPLEMENTS
      - DISPATCHES
      - BINDS_TO
    min_complexity: 3
    contract_types:
      - Module
      - Function
      - Class
      - Component
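
The `RELATION` schema above (separator `->`, a fixed predicate list) suggests a simple parse step. The following is a minimal sketch of how a single `@RELATION` value might be split and validated; `parse_relation` is a hypothetical helper, and treating the square brackets around the target as optional is an assumption.

```python
ALLOWED_PREDICATES = {
    "DEPENDS_ON", "CALLS", "INHERITS", "IMPLEMENTS", "DISPATCHES", "BINDS_TO",
}


def parse_relation(value, separator="->"):
    """Split 'PREDICATE -> [TARGET_ID]' into a (predicate, target_id) pair."""
    predicate, found, target = value.partition(separator)
    predicate, target = predicate.strip(), target.strip()
    if not found or not target:
        raise ValueError(f"expected 'PREDICATE {separator} TARGET_ID', got {value!r}")
    if predicate not in ALLOWED_PREDICATES:
        raise ValueError(f"unknown predicate {predicate!r}")
    # Surrounding brackets, as in '[Project_Map:Root]', are stripped if present.
    if target.startswith("[") and target.endswith("]"):
        target = target[1:-1]
    return predicate, target


print(parse_relation("DEPENDS_ON -> [Project_Map:Root]"))
# ('DEPENDS_ON', 'Project_Map:Root')
```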
|
||||||
|
|
||||||
|
  LAYER:
    type: string
    enum: ["Domain", "UI", "Infra"]
    description: "Architectural layer of the component"
    contract_types:
      - Module

  SEMANTICS:
    type: array
    separator: ","
    description: "Keywords for semantic search"
    contract_types:
      - Module

  # ----------------------------------------------------------
  # Complexity 3: UX Contracts (Svelte 5+)
  # ----------------------------------------------------------
  UX_STATE:
    type: string
    description: "UI states: Idle, Loading, Error, Success"
    contract_types:
      - Component

  UX_FEEDBACK:
    type: string
    description: "System feedback: Toast, Shake, RedBorder"
    contract_types:
      - Component

  UX_RECOVERY:
    type: string
    description: "Recovery path after a failure: Retry, ClearInput"
    contract_types:
      - Component

  UX_REACTIVITY:
    type: string
    description: "Explicit binding via runes: $state, $derived, $effect, $props"
    contract_types:
      - Component
|
||||||
|
|
||||||
|
  # ----------------------------------------------------------
  # Complexity 4 (ORCHESTRATION): DbC contracts
  # ----------------------------------------------------------
  PRE:
    type: string
    description: "Preconditions"
    min_complexity: 4
    contract_types:
      - Function
      - Class
      - Module

  POST:
    type: string
    description: "Postconditions"
    min_complexity: 4
    contract_types:
      - Function
      - Class
      - Module

  SIDE_EFFECT:
    type: string
    description: "Side effects: mutations, I/O, network"
    min_complexity: 4
    contract_types:
      - Function
      - Class
      - Module

  # ----------------------------------------------------------
  # Complexity 5 (CRITICAL): full contract
  # ----------------------------------------------------------
  DATA_CONTRACT:
    type: string
    description: "DTO reference: Input -> Model, Output -> Model"
    min_complexity: 5
    contract_types:
      - Function
      - Class
      - Module

  INVARIANT:
    type: string
    description: "Business invariants that must not be violated"
    min_complexity: 5
    contract_types:
      - Function
      - Class
      - Module
|
||||||
|
|
||||||
|
  # ----------------------------------------------------------
  # Decision Memory (orthogonal to complexity)
  # ----------------------------------------------------------
  RATIONALE:
    type: string
    multiline: true
    description: "Why this path was chosen and which constraint or goal it protects"
    contract_types:
      - Module
      - Function
      - Class
      - ADR

  REJECTED:
    type: string
    multiline: true
    description: "Which path is forbidden and which risk makes it unacceptable"
    contract_types:
      - Module
      - Function
      - Class
      - ADR
|
||||||
|
|
||||||
|
  # ----------------------------------------------------------
  # Test Contracts (Section X: simplified rules)
  # ----------------------------------------------------------
  TEST_CONTRACT:
    type: string
    description: "Test contract: Input -> Output"
    contract_types:
      - Function
      - Block

  TEST_SCENARIO:
    type: string
    description: "Test scenario: Name -> Expectation"
    contract_types:
      - Function
      - Block

  TEST_FIXTURE:
    type: string
    description: "Test fixture: Name -> file:[path] | INLINE_JSON"
    contract_types:
      - Block

  TEST_EDGE:
    type: string
    description: "Edge case: Name -> Failure"
    contract_types:
      - Function
      - Block

  TEST_INVARIANT:
    type: string
    description: "Test invariant: Name -> VERIFIED_BY: [scenarios]"
    contract_types:
      - Module
      - Function
|
||||||
|
|
||||||
|
  # ----------------------------------------------------------
  # Metadata / Classification
  # ----------------------------------------------------------
  TIER:
    type: string
    enum: ["CRITICAL", "STANDARD", "TRIVIAL"]
    description: "Criticality level of the component"
    contract_types:
      - Module
      - Function
      - Class

  COMPLEXITY:
    type: string
    enum: ["1", "2", "3", "4", "5"]
    description: "Contract complexity level"
    contract_types:
      - Module
      - Function
      - Class
      - Component

  C:
    type: string
    enum: ["1", "2", "3", "4", "5"]
    description: "Shorthand for @COMPLEXITY"
    contract_types:
      - Module
      - Function
      - Class
      - Component
|
||||||
BIN
.axiom/semantic_index/index.duckdb
Normal file
Binary file not shown.
6
.gitignore
vendored
@@ -78,3 +78,9 @@ node_modules/
|
|||||||
coverage/
|
coverage/
|
||||||
*.tmp
|
*.tmp
|
||||||
logs/app.log.1
|
logs/app.log.1
|
||||||
|
audit_report.txt
|
||||||
|
check_semantics.py
|
||||||
|
docs_audit_report.txt
|
||||||
|
run_mcp.py
|
||||||
|
semantic_audit_report.md
|
||||||
|
.axiom/checkpoints
|
||||||
214
.kilo/agents/mcp-coder-stupid.md
Normal file
@@ -0,0 +1,214 @@
|
|||||||
|
---
description: MCP-only implementation specialist; writes and validates code only through AXIOM MCP tooling.
mode: subagent
model: github-copilot/gemini-3.1-pro-preview
temperature: 0.1
permission:
  edit: deny
  bash: deny
  browser: deny
  task:
    "*": deny
steps: 80
color: accent
---
|
||||||
|
|
||||||
|
You are Kilo Code, acting as the MCP Coder.
|
||||||
|
|
||||||
|
# SYSTEM DIRECTIVE: GRACE-Poly v2.3
|
||||||
|
> OPERATION MODE: MCP-ONLY IMPLEMENTATION
|
||||||
|
> ROLE: Implementation specialist restricted to AXIOM MCP mutation, validation, recovery, and semantic-query surfaces
|
||||||
|
|
||||||
|
## Core Mandate
|
||||||
|
- Read `.ai/ROOT.md` first.
|
||||||
|
- Use `.ai/standards/semantics.md` as the semantic source of truth.
|
||||||
|
- Follow `.ai/standards/constitution.md`, `.ai/standards/api_design.md`, and `.ai/standards/ui_design.md`.
|
||||||
|
- Implement code only through the AXIOM MCP server surface.
|
||||||
|
- Preserve or add required semantic anchors and metadata before changing logic.
|
||||||
|
- Keep modules under 300 lines; decompose instead of growing large files.
|
||||||
|
- Use guards or explicit errors; never use `assert` for runtime contract enforcement.
|
||||||
|
- Treat `@RATIONALE` and `@REJECTED` as hard anti-regression constraints.
|
||||||
|
- If relation, schema, dependency, path policy, or semantic target is unclear, emit `[NEED_CONTEXT: target]`.
|
||||||
|
|
||||||
|
## Hard Boundary
|
||||||
|
- Allowed mutation surface: AXIOM MCP server only.
|
||||||
|
- Forbidden: native file editing, native direct-write tools, native shell execution, browser execution, and subagent delegation.
|
||||||
|
- Never bypass an MCP policy block with a workaround outside the MCP server.
|
||||||
|
- If a persistent file change is needed, use an MCP mutation tool.
|
||||||
|
- If repository verification is needed, use the MCP sandboxed command tool.
|
||||||
|
- If the required capability does not exist in the AXIOM MCP server, stop with `[NEED_CONTEXT: mcp_surface_gap]`.
|
||||||
|
|
||||||
|
## Approved MCP Tool Graph
|
||||||
|
### Policy and semantic context
|
||||||
|
- `get_workspace_policy`
|
||||||
|
- `find_contract_tool`
|
||||||
|
- `read_outline_tool`
|
||||||
|
- `read_grace_outline_tool`
|
||||||
|
- `build_task_context_tool`
|
||||||
|
- `get_semantic_context_tool`
|
||||||
|
- `query_workspace_semantics`
|
||||||
|
- `trace_tests_for_contract_tool`
|
||||||
|
- `find_related_tests_tool`
|
||||||
|
- `analyze_impact_tool`
|
||||||
|
- `audit_contracts_tool`
|
||||||
|
- `audit_belief_protocol_tool`
|
||||||
|
|
||||||
|
### MCP mutation and scaffold surface
|
||||||
|
- `create_workspace_file`
|
||||||
|
- `patch_workspace_file`
|
||||||
|
- `manage_workspace_path`
|
||||||
|
- `scaffold_workspace_module`
|
||||||
|
- `safe_patch_tool`
|
||||||
|
- `guarded_patch_contract_tool`
|
||||||
|
- `patch_contract_tool`
|
||||||
|
- `update_contract_metadata_tool`
|
||||||
|
- `wrap_node_in_contract_tool`
|
||||||
|
- `rename_contract_id_tool`
|
||||||
|
- `move_contract_tool`
|
||||||
|
- `extract_contract_tool`
|
||||||
|
- `rename_semantic_tag_tool`
|
||||||
|
- `prune_contract_metadata_tool`
|
||||||
|
- `infer_missing_relations_tool`
|
||||||
|
- `patch_belief_protocol_tool`
|
||||||
|
|
||||||
|
### Verification, recovery, and evidence
|
||||||
|
- `run_workspace_command`
|
||||||
|
- `summarize_workspace_change`
|
||||||
|
- `rollback_workspace_change`
|
||||||
|
- `rebuild_workspace_semantic_index`
|
||||||
|
- `read_runtime_events`
|
||||||
|
|
||||||
|
## Required Workflow
|
||||||
|
1. Load the root knowledge map and semantic standards.
|
||||||
|
2. Read effective workspace policy through `get_workspace_policy` before any mutation or sandboxed verification.
|
||||||
|
3. Resolve the semantic target through contract discovery, semantic outline, task context, or bounded semantic query.
|
||||||
|
4. Prefer preview-first mutation via `patch_workspace_file`, `safe_patch_tool`, or `guarded_patch_contract_tool` whenever a target already exists.
|
||||||
|
5. Use `create_workspace_file`, `manage_workspace_path`, and `scaffold_workspace_module` only for bounded create, move, rename, delete, or bootstrap actions.
|
||||||
|
6. Preserve semantic anchors, required contracts, and decision-memory tags during every mutation.
|
||||||
|
7. Run tests, linters, searches, and build checks only through `run_workspace_command`.
|
||||||
|
8. Inspect mutation evidence through `summarize_workspace_change`, query blast radius through `query_workspace_semantics`, and use rollback through `rollback_workspace_change` if recovery is required.
|
||||||
|
9. If the semantic index is stale or degraded after major changes, use `rebuild_workspace_semantic_index` instead of guessing about impact.
|
||||||
|
10. Never translate an MCP-blocked write into shell-based write behavior.
|
||||||
|
|
||||||
|
## Complexity Contract Matrix
|
||||||
|
- Complexity 1: anchors only.
|
||||||
|
- Complexity 2: `@PURPOSE`.
|
||||||
|
- Complexity 3: `@PURPOSE`, `@RELATION`; UI also `@UX_STATE`.
|
||||||
|
- Complexity 4: `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`; meaningful `logger.reason()` and `logger.reflect()` for Python.
|
||||||
|
- Complexity 5: full L4 plus `@DATA_CONTRACT` and `@INVARIANT`; `belief_scope` mandatory.
|
||||||
|
- Decision-memory overlay: `@RATIONALE` and `@REJECTED` are mandatory whenever upstream ADR or retained workaround constrains the implementation path.
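
The matrix above can be expressed as a small lookup, which is handy when self-checking a changed symbol. This is a minimal sketch, not the AXIOM auditor: `required_tags` and `missing_tags` are hypothetical helpers, and requiring `@UX_STATE` for UI contracts at complexity 3 *and above* is an assumption (the matrix states it only for complexity 3). The logging and `belief_scope` requirements are not modeled.

```python
# Required tags per complexity level, taken from the matrix above.
BASE_TAGS = {
    1: [],
    2: ["PURPOSE"],
    3: ["PURPOSE", "RELATION"],
    4: ["PURPOSE", "RELATION", "PRE", "POST", "SIDE_EFFECT"],
    5: ["PURPOSE", "RELATION", "PRE", "POST", "SIDE_EFFECT",
        "DATA_CONTRACT", "INVARIANT"],
}


def required_tags(complexity, is_ui=False):
    """Return the tags a contract of the given complexity must carry."""
    if complexity not in BASE_TAGS:
        raise ValueError(f"complexity must be 1-5, got {complexity!r}")
    tags = list(BASE_TAGS[complexity])
    # Assumption: UI contracts need @UX_STATE from complexity 3 upward.
    if is_ui and complexity >= 3:
        tags.append("UX_STATE")
    return tags


def missing_tags(present, complexity, is_ui=False):
    """List required tags that are absent from the `present` collection."""
    return [tag for tag in required_tags(complexity, is_ui) if tag not in present]


print(missing_tags({"PURPOSE"}, complexity=4))
# ['RELATION', 'PRE', 'POST', 'SIDE_EFFECT']
```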
|
||||||
|
|
||||||
|
## MCP-Only Mutation Rules
|
||||||
|
- Use `patch_workspace_file` for generic text, line-range, or AST-node mutation.
|
||||||
|
- Use contract-aware mutation tools when the change is naturally scoped to a GRACE contract boundary.
|
||||||
|
- Use `update_contract_metadata_tool` and related semantic tools for header-only repairs instead of broad rewrites.
|
||||||
|
- Use `manage_workspace_path` for path creation, move, rename, inspect, and delete instead of shell path commands.
|
||||||
|
- Use `scaffold_workspace_module` for new module bootstrap instead of writing starter files manually.
|
||||||
|
- Treat protected paths, checkpoint storage, semantic-index artifacts, runtime-event logs, and `.axiom/` operational state as immutable unless an MCP tool explicitly owns that path.
|
||||||
|
|
||||||
|
## Sandboxed Verification Rules
|
||||||
|
- Use `run_workspace_command` for pytest, ruff, grep, ls, cat, and other read-only command workflows.
|
||||||
|
- If a shell workflow tries to write outside `.axiom/temp/`, treat the block as correct behavior.
|
||||||
|
- Redirect persistent edits from sandboxed command flows back to MCP mutation tools.
|
||||||
|
- Prefer narrow verification commands tied to the changed scope.
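
The write boundary described above can be pictured as a path check. The sketch below only mirrors the rule for illustration; actual enforcement belongs to the MCP sandbox. `is_write_allowed` is a hypothetical helper, it works on workspace-relative POSIX paths, and it does not resolve symlinks.

```python
import posixpath


def is_write_allowed(relative_path, writable_root=".axiom/temp"):
    """True when a workspace-relative path stays inside the writable root."""
    if posixpath.isabs(relative_path):
        return False  # absolute paths are outside the workspace-relative model
    # Collapse '.' and '..' segments before comparing against the root.
    normalized = posixpath.normpath(relative_path)
    return normalized == writable_root or normalized.startswith(writable_root + "/")


print(is_write_allowed(".axiom/temp/report.txt"))
print(is_write_allowed(".axiom/temp/../checkpoints/state.json"))
```

Normalizing before the prefix comparison matters: without it, a path such as `.axiom/temp/../checkpoints/state.json` would pass a naive `startswith` check.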
|
||||||
|
|
||||||
|
## Evidence Envelope Contract
|
||||||
|
Before completion, return one bounded evidence packet containing:
|
||||||
|
- `task_scope`
|
||||||
|
- `mcp_tools_used`
|
||||||
|
- `changed_paths`
|
||||||
|
- `checkpoints`
|
||||||
|
- `symbols_added_or_modified`
|
||||||
|
- `mapped_contract_ids`
|
||||||
|
- `commands_run_via_mcp`
|
||||||
|
- `semantic_queries_used`
|
||||||
|
- `decision_memory_applied`
|
||||||
|
- `self_check_semantics`
|
||||||
|
- `self_check_dbc`
|
||||||
|
- `self_check_belief_state`
|
||||||
|
- `self_check_tests`
|
||||||
|
- `rollback_path`
|
||||||
|
- `remaining_debt`
|
||||||
|
- `known_risks`
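
A completeness check for the evidence packet might look like the sketch below. The field names are the ones listed above; the helper `missing_evidence_fields` and the choice of a plain dictionary as the packet type are assumptions made for illustration.

```python
# Field names copied from the evidence envelope list above.
EVIDENCE_FIELDS = (
    "task_scope",
    "mcp_tools_used",
    "changed_paths",
    "checkpoints",
    "symbols_added_or_modified",
    "mapped_contract_ids",
    "commands_run_via_mcp",
    "semantic_queries_used",
    "decision_memory_applied",
    "self_check_semantics",
    "self_check_dbc",
    "self_check_belief_state",
    "self_check_tests",
    "rollback_path",
    "remaining_debt",
    "known_risks",
)


def missing_evidence_fields(packet):
    """Return the envelope fields that are absent from `packet`, in listed order."""
    return [field for field in EVIDENCE_FIELDS if field not in packet]


draft = {"task_scope": "example scope", "changed_paths": []}
print(missing_evidence_fields(draft)[:3])
# ['mcp_tools_used', 'checkpoints', 'symbols_added_or_modified']
```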
|
||||||
|
|
||||||
|
## Self-Check Requirements
|
||||||
|
### Semantic self-check
|
||||||
|
Verify and report:
|
||||||
|
- every changed module has a valid module anchor
|
||||||
|
- every changed non-trivial boundary has required local `[DEF]...[/DEF]`
|
||||||
|
- no broken or mismatched anchors remain
|
||||||
|
- changed test files respect the simplified semantic test policy
|
||||||
|
|
||||||
|
### DbC self-check
|
||||||
|
Verify and report required tags per changed symbol according to effective complexity:
|
||||||
|
- `@PURPOSE`
|
||||||
|
- `@RELATION`
|
||||||
|
- `@PRE`
|
||||||
|
- `@POST`
|
||||||
|
- `@SIDE_EFFECT`
|
||||||
|
- `@DATA_CONTRACT`
|
||||||
|
- `@INVARIANT`
|
||||||
|
- UI-only contracts when the touched scope crosses into frontend files
|
||||||
|
|
||||||
|
### Belief-state self-check
|
||||||
|
For Complexity 4 and 5 Python paths, verify and report:
|
||||||
|
- `belief_scope(...)`
|
||||||
|
- meaningful `logger.reason(...)`
|
||||||
|
- meaningful `logger.reflect(...)`
|
||||||
|
- retained workaround handling through `logger.explore(...)` plus local `@RATIONALE` and `@REJECTED`
|
||||||
|
|
||||||
|
### Test self-check
|
||||||
|
Verify and report:
|
||||||
|
- required tests written or updated through MCP mutation tools
|
||||||
|
- required tests executed through `run_workspace_command`
|
||||||
|
- exact commands used
|
||||||
|
- exact pass or fail outcome
|
||||||
|
- any test gaps that could not be closed through the available MCP surface
|
||||||
|
|
||||||
|
## Completion Gate
|
||||||
|
You may claim completion only when:
|
||||||
|
- all persistent repository writes flowed through AXIOM MCP mutation tools
|
||||||
|
- no native direct-write or shell-write path was used
|
||||||
|
- no broken `[DEF]` anchors remain in changed scope
|
||||||
|
- no required contracts are missing for the effective complexity
|
||||||
|
- no surviving workaround ships without local `@RATIONALE` and `@REJECTED`
|
||||||
|
- every applied mutation has a checkpoint or an explicit MCP operation record
|
||||||
|
- a rollback path exists for every applied change set that should be recoverable
|
||||||
|
- the evidence envelope is complete enough for external validation
|
||||||
|
|
||||||
|
## Anti-Loop Protocol
|
||||||
|
### `[ATTEMPT: 1-2]`
|
||||||
|
- Continue with targeted MCP mutation and sandboxed verification.
|
||||||
|
- Prefer minimal patches and explicit preview/apply behavior.
|
||||||
|
|
||||||
|
### `[ATTEMPT: 3]`
|
||||||
|
- Stop trusting the current local hypothesis.
|
||||||
|
- Re-check workspace policy, target resolution, contract identity, checkpoint history, semantic freshness, and sandbox restrictions before mutating again.
|
||||||
|
- Treat the likely failure as policy, contract, path, or stale-target mismatch rather than routine logic drift.
|
||||||
|
|
||||||
|
### `[ATTEMPT: 4+]`
|
||||||
|
- Do not continue patch churn.
- Output a bounded escalation packet containing:
  - `status: blocked`
  - `task_scope`
  - `suspected_failure_layer`
  - `mcp_tools_used`
  - `what_was_tried`
  - `what_did_not_work`
  - `current_invariants`
  - `checkpoint_state`
  - `latest_blocking_error`
  - `request: re-evaluate at MCP policy, contract, or architecture level`
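
The attempt thresholds above amount to a three-way dispatch. A minimal sketch follows, assuming attempts are counted from 1; the function names and the returned step labels (`continue`, `recheck`, `escalate`) are hypothetical, while the packet fields come from the list above.

```python
ESCALATION_FIELDS = (
    "task_scope",
    "suspected_failure_layer",
    "mcp_tools_used",
    "what_was_tried",
    "what_did_not_work",
    "current_invariants",
    "checkpoint_state",
    "latest_blocking_error",
)


def next_step(attempt):
    """Map an attempt number to the protocol stage it falls under."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    if attempt <= 2:
        return "continue"   # targeted mutation plus sandboxed verification
    if attempt == 3:
        return "recheck"    # re-validate policy, target, and index freshness
    return "escalate"       # stop patching and emit the escalation packet


def escalation_packet(details):
    """Build the bounded packet; every listed field must be supplied."""
    missing = [field for field in ESCALATION_FIELDS if field not in details]
    if missing:
        raise ValueError(f"missing escalation fields: {missing}")
    packet = {"status": "blocked"}
    packet.update({field: details[field] for field in ESCALATION_FIELDS})
    packet["request"] = "re-evaluate at MCP policy, contract, or architecture level"
    return packet


print([next_step(n) for n in (1, 2, 3, 4)])
# ['continue', 'continue', 'recheck', 'escalate']
```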
|
||||||
|
|
||||||
|
## Output Contract
|
||||||
|
Return compactly:
|
||||||
|
- `applied`
|
||||||
|
- `evidence_envelope`
|
||||||
|
- `remaining`
|
||||||
|
- `risk`
|
||||||
|
|
||||||
|
Do not return:
|
||||||
|
- raw tool transcript
|
||||||
|
- speculative chain-of-thought
|
||||||
|
- unbounded command output
|
||||||
|
- proposals that require native write or native shell as a fallback
|
||||||
@@ -1 +1 @@
|
|||||||
{"mcpServers":{"axiom-core":{"command":"/home/busya/dev/ast-mcp-core-server/.venv/bin/python","args":["-c","from src.server import main; main()"],"env":{"PYTHONPATH":"/home/busya/dev/ast-mcp-core-server"},"alwaysAllow":["read_grace_outline_tool","ast_search_tool","get_semantic_context_tool","build_task_context_tool","audit_contracts_tool","diff_contract_semantics_tool","simulate_patch_tool","patch_contract_tool","rename_contract_id_tool","move_contract_tool","extract_contract_tool","infer_missing_relations_tool","map_runtime_trace_to_contracts_tool","scaffold_contract_tests_tool","search_contracts_tool","reindex_workspace_tool","prune_contract_metadata_tool","workspace_semantic_health_tool","trace_tests_for_contract_tool","guarded_patch_contract_tool","impact_analysis_tool","update_contract_metadata_tool","wrap_node_in_contract_tool","rename_semantic_tag_tool","scan_vulnerabilities"]},"chrome-devtools":{"command":"npx","args":["chrome-devtools-mcp@latest","--browser-url=http://127.0.0.1:9222"],"disabled":false,"alwaysAllow":["take_snapshot"]}}}
|
{"mcpServers":{"axiom-core":{"command":"/home/busya/dev/ast-mcp-core-server/.venv/bin/python","args":["-c","from src.server import main; main()"],"env":{"PYTHONPATH":"/home/busya/dev/ast-mcp-core-server"},"alwaysAllow":["read_grace_outline_tool","ast_search_tool","get_semantic_context_tool","build_task_context_tool","audit_contracts_tool","diff_contract_semantics_tool","simulate_patch_tool","patch_contract_tool","rename_contract_id_tool","move_contract_tool","extract_contract_tool","infer_missing_relations_tool","map_runtime_trace_to_contracts_tool","scaffold_contract_tests_tool","search_contracts_tool","reindex_workspace_tool","prune_contract_metadata_tool","workspace_semantic_health_tool","trace_tests_for_contract_tool","guarded_patch_contract_tool","impact_analysis_tool","update_contract_metadata_tool","wrap_node_in_contract_tool","rename_semantic_tag_tool","scan_vulnerabilities","find_contract_tool","safe_patch_tool","run_workspace_command_tool"]},"chrome-devtools":{"command":"npx","args":["chrome-devtools-mcp@latest","--browser-url=http://127.0.0.1:9222"],"disabled":false,"alwaysAllow":["take_snapshot"]}}}
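
Because both versions of this file are single-line JSON, the change is hard to read in a diff. One way to see it is to compare the `alwaysAllow` lists programmatically, as in the sketch below. The helper `newly_allowed` is hypothetical, and the two dictionaries are deliberately trimmed stand-ins for the real configs (only two of the pre-existing tool names are repeated).

```python
def newly_allowed(old_config, new_config, server="axiom-core"):
    """Return tool names present in the new alwaysAllow list but not the old one."""
    old = set(old_config["mcpServers"][server].get("alwaysAllow", []))
    new = set(new_config["mcpServers"][server].get("alwaysAllow", []))
    return sorted(new - old)


# Trimmed stand-ins for the two versions of the config shown in the diff.
old_cfg = {"mcpServers": {"axiom-core": {"alwaysAllow": [
    "rename_semantic_tag_tool", "scan_vulnerabilities",
]}}}
new_cfg = {"mcpServers": {"axiom-core": {"alwaysAllow": [
    "rename_semantic_tag_tool", "scan_vulnerabilities",
    "find_contract_tool", "safe_patch_tool", "run_workspace_command_tool",
]}}}

print(newly_allowed(old_cfg, new_cfg))
# ['find_contract_tool', 'run_workspace_command_tool', 'safe_patch_tool']
```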
|
||||||
@@ -12,7 +12,7 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
|
|
||||||
## Goal
|
## Goal
|
||||||
|
|
||||||
Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
|
Identify inconsistencies, duplications, ambiguities, underspecified items, and decision-memory drift across the core artifacts (`spec.md`, `plan.md`, `tasks.md`, and ADR sources) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
|
||||||
|
|
||||||
## Operating Constraints
|
## Operating Constraints
|
||||||
|
|
||||||
@@ -29,6 +29,7 @@ Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --inclu
 - SPEC = FEATURE_DIR/spec.md
 - PLAN = FEATURE_DIR/plan.md
 - TASKS = FEATURE_DIR/tasks.md
+- ADR = `docs/architecture.md` and/or feature-local decision files when present

 Abort with an error message if any required file is missing (instruct the user to run missing prerequisite command).
 For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
@@ -37,7 +38,7 @@ For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot

 Load only the minimal necessary context from each artifact:

-**From spec.md:**
+**From `spec.md`:**

 - Overview/Context
 - Functional Requirements
@@ -45,20 +46,29 @@ Load only the minimal necessary context from each artifact:
 - User Stories
 - Edge Cases (if present)

-**From plan.md:**
+**From `plan.md`:**

 - Architecture/stack choices
 - Data Model references
 - Phases
 - Technical constraints
+- ADR references or emitted decisions

-**From tasks.md:**
+**From `tasks.md`:**

 - Task IDs
 - Descriptions
 - Phase grouping
 - Parallel markers [P]
 - Referenced file paths
+- Guardrail summaries derived from `@RATIONALE` / `@REJECTED`

+**From ADR sources:**
+
+- `[DEF:id:ADR]` nodes
+- `@RATIONALE`
+- `@REJECTED`
+- `@RELATION`
+
 **From constitution:**

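The ADR fields listed in this hunk (`[DEF:id:ADR]`, `@RATIONALE`, `@REJECTED`, `@RELATION`) could be collected with a small parser. This is a sketch under stated assumptions: the `@TAG: text` line syntax is inferred from the `# @PURPOSE: ...` style used in the repository's `.ai` files, and `AdrNode` and `parse_adr_nodes` are hypothetical names, not part of the command.

```python
import re
from dataclasses import dataclass, field

# Assumption: "[DEF:<id>:ADR]" opens a node, and the "@TAG: text" lines that
# follow belong to the most recently opened node.
DEF_RE = re.compile(r"\[DEF:(?P<id>[^\]]+):ADR\]")
TAG_RE = re.compile(r"@(?P<tag>RATIONALE|REJECTED|RELATION):\s*(?P<text>.+)")


@dataclass
class AdrNode:
    adr_id: str
    rationale: list[str] = field(default_factory=list)
    rejected: list[str] = field(default_factory=list)
    relations: list[str] = field(default_factory=list)


def parse_adr_nodes(text: str) -> list[AdrNode]:
    """Collect ADR nodes and their decision-memory tags from raw text."""
    nodes: list[AdrNode] = []
    current = None
    for line in text.splitlines():
        opened = DEF_RE.search(line)
        if opened:
            current = AdrNode(adr_id=opened.group("id"))
            nodes.append(current)
            continue
        tagged = TAG_RE.search(line)
        if tagged and current is not None:
            bucket = {
                "RATIONALE": current.rationale,
                "REJECTED": current.rejected,
                "RELATION": current.relations,
            }[tagged.group("tag")]
            bucket.append(tagged.group("text").strip())
    return nodes


# Hypothetical ADR text, for illustration only.
SAMPLE_ADR = """\
# [DEF:Cache_Strategy:ADR]
# @RATIONALE: A shared cache keeps session reads off the primary database.
# @REJECTED: In-process dict cache; breaks with multiple workers.
# @RELATION: DEPENDS_ON -> [Project_Map:Root]
"""

for node in parse_adr_nodes(SAMPLE_ADR):
    print(node.adr_id, node.rejected)
```

Keeping the rejected paths as plain strings is enough to build the blocked-path inventory that later passes compare against task text.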
@@ -73,6 +83,7 @@ Create internal representations (do not include raw artifacts in output):
 - **User story/action inventory**: Discrete user actions with acceptance criteria
 - **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
 - **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
+- **Decision-memory inventory**: ADR ids, accepted paths, rejected paths, and the tasks/contracts expected to inherit them

 ### 4. Detection Passes (Token-Efficient Analysis)

@@ -112,13 +123,21 @@ Focus on high-signal findings. Limit to 50 findings total; aggregate remainder i
 - Task ordering contradictions (e.g., integration tasks before foundational setup tasks without dependency note)
 - Conflicting requirements (e.g., one requires Next.js while other specifies Vue)

+#### G. Decision-Memory Drift
+
+- ADR exists in planning but has no downstream task guardrail
+- Task carries a guardrail with no upstream ADR or plan rationale
+- Task text accidentally schedules an ADR-rejected path
+- Missing preventive `@RATIONALE` / `@REJECTED` summaries for known traps
+- Rejected-path notes that contradict later plan or task language without explicit decision revision
+
 ### 5. Severity Assignment

 Use this heuristic to prioritize findings:

-- **CRITICAL**: Violates constitution MUST, missing core spec artifact, or requirement with zero coverage that blocks baseline functionality
-- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
-- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
+- **CRITICAL**: Violates constitution MUST, missing core spec artifact, missing blocking ADR, rejected path scheduled as work, or requirement with zero coverage that blocks baseline functionality
+- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion, ADR guardrail drift
+- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case, incomplete decision-memory propagation
 - **LOW**: Style/wording improvements, minor redundancy not affecting execution order

 ### 6. Produce Compact Analysis Report
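The first three drift checks in pass G, combined with the revised severity heuristic, could be sketched as follows. Everything here is illustrative: `Adr`, `Task`, and `detect_drift` are hypothetical names, the substring match on rejected paths is deliberately naive (it would also flag a task that says "do not use X"), and the mapping of each check to a severity is one possible reading of the heuristic rather than the command's defined behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Adr:
    adr_id: str
    rejected: list[str] = field(default_factory=list)  # rejected paths as phrases


@dataclass
class Task:
    task_id: str
    text: str
    guardrails: list[str] = field(default_factory=list)  # ADR ids the task cites


def detect_drift(adrs: list[Adr], tasks: list[Task]) -> list[tuple[str, str, str]]:
    """Return (severity, subject id, message) findings for decision-memory drift."""
    findings = []
    known = {adr.adr_id for adr in adrs}
    cited = {g for task in tasks for g in task.guardrails}

    # ADR exists in planning but has no downstream task guardrail.
    for adr in adrs:
        if adr.adr_id not in cited:
            findings.append(("MEDIUM", adr.adr_id, "ADR has no downstream task guardrail"))

    for task in tasks:
        # Task carries a guardrail with no upstream ADR.
        for guardrail in task.guardrails:
            if guardrail not in known:
                findings.append(("HIGH", task.task_id, f"guardrail cites unknown ADR {guardrail}"))
        # Task text schedules an ADR-rejected path (naive substring check).
        for adr in adrs:
            for path in adr.rejected:
                if path.lower() in task.text.lower():
                    findings.append(
                        ("CRITICAL", task.task_id, f"schedules path rejected by {adr.adr_id}: {path}")
                    )
    return findings


# Hypothetical inventory, for illustration only.
adrs = [Adr("ADR-001", ["in-process cache"]), Adr("ADR-002")]
tasks = [
    Task("T001", "Add in-process cache for sessions", ["ADR-001"]),
    Task("T002", "Wire structured logging", ["ADR-009"]),
]
for severity, subject, message in detect_drift(adrs, tasks):
    print(severity, subject, message)
```

A real pass would likely need phrase normalization or explicit rejected-path ids, since free-text matching produces both false positives and misses.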
@@ -138,6 +157,11 @@ Output a Markdown report (no file writes) with the following structure:
 | Requirement Key | Has Task? | Task IDs | Notes |
 |-----------------|-----------|----------|-------|

+**Decision Memory Summary Table:**
+
+| ADR / Guardrail | Present in Plan | Propagated to Tasks | Rejected Path Protected | Notes |
+|-----------------|-----------------|---------------------|-------------------------|-------|
+
 **Constitution Alignment Issues:** (if any)

 **Unmapped Tasks:** (if any)
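To show how the new Decision Memory Summary Table might be filled in, here is a minimal rendering sketch. The column headers are taken from the hunk above; `DecisionRow`, `render_decision_memory_table`, and the Yes/No cell values are assumptions made for illustration.

```python
from typing import NamedTuple

HEADER = "| ADR / Guardrail | Present in Plan | Propagated to Tasks | Rejected Path Protected | Notes |"
SEPARATOR = "|-----------------|-----------------|---------------------|-------------------------|-------|"


class DecisionRow(NamedTuple):
    adr: str
    in_plan: bool
    in_tasks: bool
    rejected_protected: bool
    notes: str


def yes_no(flag: bool) -> str:
    return "Yes" if flag else "No"


def render_decision_memory_table(rows: list[DecisionRow]) -> str:
    """Render one Markdown table row per ADR or guardrail."""
    lines = [HEADER, SEPARATOR]
    for row in rows:
        lines.append(
            f"| {row.adr} | {yes_no(row.in_plan)} | {yes_no(row.in_tasks)} "
            f"| {yes_no(row.rejected_protected)} | {row.notes} |"
        )
    return "\n".join(lines)


# Hypothetical row, for illustration only.
print(render_decision_memory_table([
    DecisionRow("ADR-001", True, False, False, "No task cites it"),
]))
```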
@@ -150,6 +174,8 @@ Output a Markdown report (no file writes) with the following structure:
 - Ambiguity Count
 - Duplication Count
 - Critical Issues Count
+- ADR Count
+- Guardrail Drift Count

 ### 7. Provide Next Actions

@@ -179,6 +205,7 @@ Ask the user: "Would you like me to suggest concrete remediation edits for the t
 - **Prioritize constitution violations** (these are always CRITICAL)
 - **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
 - **Report zero issues gracefully** (emit success report with coverage statistics)
+- **Treat missing ADR propagation as a real defect, not a documentation nit**

 ## Context

|
|||||||
@@ -4,7 +4,7 @@ description: Generate a custom checklist for the current feature based on user r

 ## Checklist Purpose: "Unit Tests for English"

-**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
+**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, completeness, and decision-memory readiness of requirements in a given domain.

 **NOT for verification/testing**:

@@ -20,6 +20,7 @@ description: Generate a custom checklist for the current feature based on user r
 - ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
 - ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
 - ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
+- ✅ "Do repo-shaping choices have explicit rationale and rejected alternatives before task decomposition?" (decision memory)

 **Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.

@@ -47,7 +48,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
 2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
 3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
-4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
+4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria, decision-memory needs.
 5. Formulate questions chosen from these archetypes:
 - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
 - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
@@ -55,6 +56,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
 - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
 - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
+- Decision-memory gap (e.g., "Do we need explicit ADR and rejected-path checks for this feature?")

 Question formatting rules:
 - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
@@ -76,9 +78,10 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Infer any missing context from spec/plan/tasks (do NOT hallucinate)

 4. **Load feature context**: Read from FEATURE_DIR:
-- spec.md: Feature requirements and scope
-- plan.md (if exists): Technical details, dependencies
-- tasks.md (if exists): Implementation tasks
+- `spec.md`: Feature requirements and scope
+- `plan.md` (if exists): Technical details, dependencies, ADR references
+- `tasks.md` (if exists): Implementation tasks and inherited guardrails
+- ADR artifacts (if present): `[DEF:id:ADR]`, `@RATIONALE`, `@REJECTED`

 **Context Loading Strategy**:
 - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
@@ -102,6 +105,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - **Consistency**: Do requirements align with each other?
 - **Measurability**: Can requirements be objectively verified?
 - **Coverage**: Are all scenarios/edge cases addressed?
+- **Decision Memory**: Are durable choices and rejected alternatives explicit before implementation starts?

 **Category Structure** - Group items by requirement quality dimensions:
 - **Requirement Completeness** (Are all necessary requirements documented?)
@@ -112,6 +116,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - **Edge Case Coverage** (Are boundary conditions defined?)
 - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
 - **Dependencies & Assumptions** (Are they documented and validated?)
+- **Decision Memory & ADRs** (Are architectural choices, rationale, and rejected paths explicit?)
 - **Ambiguities & Conflicts** (What needs clarification?)

 **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
@@ -127,8 +132,8 @@ You **MUST** consider the user input before proceeding (if not empty).
 - "Are hover state requirements consistent across all interactive elements?" [Consistency]
 - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
 - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
-- "Are loading states defined for asynchronous episode data?" [Completeness]
-- "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
+- "Are blocking architecture decisions recorded with explicit rationale and rejected alternatives before task generation?" [Decision Memory]
+- "Does the plan make clear which implementation shortcuts are forbidden for this feature?" [Decision Memory, Gap]

 **ITEM STRUCTURE**:
 Each item should follow this pattern:
@@ -163,6 +168,11 @@ You **MUST** consider the user input before proceeding (if not empty).
 - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
 - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"

+Decision Memory:
+- "Do all repo-shaping technical choices have explicit rationale before tasks are generated? [Decision Memory, Plan]"
+- "Are rejected alternatives documented for architectural branches that would materially change implementation scope? [Decision Memory, Gap]"
+- "Can a coder determine from the planning artifacts which tempting shortcut is forbidden? [Decision Memory, Clarity]"
+
 **Scenario Classification & Coverage** (Requirements Quality Focus):
 - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
 - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
@@ -171,7 +181,7 @@ You **MUST** consider the user input before proceeding (if not empty).

 **Traceability Requirements**:
 - MINIMUM: ≥80% of items MUST include at least one traceability reference
-- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
+- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`, `[ADR]`
 - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"

 **Surface & Resolve Issues** (Requirements Quality Problems):
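The ≥80% traceability minimum, including the newly accepted `[ADR]` marker, lends itself to a mechanical check. The sketch below assumes checklist items follow the `- [ ] CHK### ...` shape used elsewhere in this command and that a bracketed tag counts as traceable only if it contains `Spec §` or one of the listed markers; `traceability_ratio` is a hypothetical helper.

```python
import re

# Assumed item shape: "- [ ] CHK004 - text [Tag, Spec §FR-005]".
ITEM_RE = re.compile(r"^- \[[ xX]\] (CHK\d{3})\b(.*)$")
# A bracketed tag is traceable if it names a spec section or a listed marker.
TRACE_RE = re.compile(r"\[[^\]]*(?:Spec §|Gap|Ambiguity|Conflict|Assumption|ADR)[^\]]*\]")


def traceability_ratio(markdown: str) -> float:
    """Fraction of checklist items carrying at least one traceability reference."""
    items = [m for line in markdown.splitlines() if (m := ITEM_RE.match(line.strip()))]
    if not items:
        return 0.0
    traced = sum(1 for m in items if TRACE_RE.search(m.group(2)))
    return traced / len(items)


# CHK004-CHK007 mirror sample items from this command; CHK008 is hypothetical.
SAMPLE_CHECKLIST = """\
- [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
- [ ] CHK007 - Do planning artifacts state why the accepted architecture was chosen? [Decision Memory, ADR]
- [ ] CHK008 - Are hover state requirements consistent? [Consistency]
"""

ratio = traceability_ratio(SAMPLE_CHECKLIST)
print(f"traceable: {ratio:.0%}, meets minimum: {ratio >= 0.8}")
```

Quality-dimension tags such as `[Consistency]` are intentionally not counted, since the rule names only spec references and the listed markers.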
@@ -181,6 +191,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
 - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
 - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
+- Decision-memory drift: "Do tasks inherit the same rejected-path guardrails defined in planning? [Decision Memory, Conflict]"

 **Content Consolidation**:
 - Soft cap: If raw candidate items > 40, prioritize by risk/impact
@@ -193,7 +204,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - ❌ "Displays correctly", "works properly", "functions as expected"
 - ❌ "Click", "navigate", "render", "load", "execute"
 - ❌ Test cases, test plans, QA procedures
-- ❌ Implementation details (frameworks, APIs, algorithms)
+- ❌ Implementation details (frameworks, APIs, algorithms) unless the checklist is asking whether those decisions were explicitly documented and bounded by rationale/rejected alternatives

 **✅ REQUIRED PATTERNS** - These test requirements quality:
 - ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
@@ -202,6 +213,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - ✅ "Can [requirement] be objectively measured/verified?"
 - ✅ "Are [edge cases/scenarios] addressed in requirements?"
 - ✅ "Does the spec define [missing aspect]?"
+- ✅ "Does the plan record why [accepted path] was chosen and why [rejected path] is forbidden?"

 6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.

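The fallback layout described in step 6 (H1 title, purpose/created meta lines, `##` category sections, globally incrementing `CHK###` IDs) could be generated as below. This is a sketch only: `render_checklist` is a hypothetical helper, and the exact wording of the meta lines is an assumption, since the canonical template is not shown here.

```python
def render_checklist(title: str, purpose: str, created: str, categories: dict) -> str:
    """Render a checklist whose CHK ids keep incrementing across categories."""
    # Assumption: meta lines use bold labels; the real template may differ.
    lines = [f"# {title}", "", f"**Purpose**: {purpose}", f"**Created**: {created}", ""]
    counter = 1
    for heading, items in categories.items():
        lines.append(f"## {heading}")
        lines.append("")
        for item in items:
            lines.append(f"- [ ] CHK{counter:03d} {item}")
            counter += 1
        lines.append("")
    return "\n".join(lines)


# Items reuse sample questions from this command; title and date are hypothetical.
SAMPLE_CATEGORIES = {
    "Requirement Completeness": [
        "Are loading state requirements defined for asynchronous episode data? [Gap]",
    ],
    "Decision Memory & ADRs": [
        "Do all repo-shaping technical choices have explicit rationale before tasks are generated? [Decision Memory, Plan]",
        "Are rejected alternatives documented for each blocking technology branch? [Decision Memory, Gap]",
    ],
}

print(render_checklist("Architecture Checklist", "Validate decision memory", "2026-03-26", SAMPLE_CATEGORIES))
```

A single counter shared across categories is what keeps the IDs globally unique, so items can be cited by ID regardless of which section they sit in.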
@@ -210,6 +222,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Depth level
 - Actor/timing
 - Any explicit user-specified must-have items incorporated
+- Whether ADR / decision-memory checks were included

 **Important**: Each `/speckit.checklist` command invocation creates a checklist file using short, descriptive names unless file already exists. This allows:

@@ -262,6 +275,15 @@ Sample items:
 - "Are security requirements consistent with compliance obligations? [Consistency]"
 - "Are security failure/breach response requirements defined? [Gap, Exception Flow]"

+**Architecture Decision Quality:** `architecture.md`
+
+Sample items:
+
+- "Do all repo-shaping architecture choices have explicit rationale before tasks are generated? [Decision Memory]"
+- "Are rejected alternatives documented for each blocking technology branch? [Decision Memory, Gap]"
+- "Can an implementer tell which shortcuts are forbidden without re-reading research artifacts? [Clarity, ADR]"
+- "Are ADR decisions traceable to requirements or constraints in the spec? [Traceability, ADR]"
+
 ## Anti-Examples: What NOT To Do

 **❌ WRONG - These test implementation, not requirements:**
@@ -282,6 +304,7 @@ Sample items:
 - [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
 - [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
 - [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
+- [ ] CHK007 - Do planning artifacts state why the accepted architecture was chosen and which alternative is rejected? [Decision Memory, ADR]
 ```

 **Key Differences:**
|
|||||||
@@ -56,35 +56,36 @@ You **MUST** consider the user input before proceeding (if not empty).

 3. Load and analyze the implementation context:
 - **REQUIRED**: Read `.ai/standards/semantics.md` for strict coding standards and contract requirements
-- **REQUIRED**: Read tasks.md for the complete task list and execution plan
-- **REQUIRED**: Read plan.md for tech stack, architecture, and file structure
-- **IF EXISTS**: Read data-model.md for entities and relationships
-- **IF EXISTS**: Read contracts/ for API specifications and test requirements
-- **IF EXISTS**: Read research.md for technical decisions and constraints
-- **IF EXISTS**: Read quickstart.md for integration scenarios
+- **REQUIRED**: Read `tasks.md` for the complete task list and execution plan
+- **REQUIRED**: Read `plan.md` for tech stack, architecture, and file structure
+- **REQUIRED IF PRESENT**: Read ADR artifacts containing `[DEF:id:ADR]` nodes and build a blocked-path inventory from `@REJECTED`
+- **IF EXISTS**: Read `data-model.md` for entities and relationships
+- **IF EXISTS**: Read `contracts/` for API specifications and test requirements
+- **IF EXISTS**: Read `research.md` for technical decisions and constraints
+- **IF EXISTS**: Read `quickstart.md` for integration scenarios

 4. **Project Setup Verification**:
 - **REQUIRED**: Create/verify ignore files based on actual project setup:

 **Detection & Creation Logic**:
-- Check if the following command succeeds to determine if the repository is a git repo (create/verify .gitignore if so):
+- Check if the following command succeeds to determine if the repository is a git repo (create/verify `.gitignore` if so):

 ```sh
 git rev-parse --git-dir 2>/dev/null
 ```

-- Check if Dockerfile* exists or Docker in plan.md → create/verify .dockerignore
-- Check if .eslintrc* exists → create/verify .eslintignore
-- Check if eslint.config.* exists → ensure the config's `ignores` entries cover required patterns
-- Check if .prettierrc* exists → create/verify .prettierignore
-- Check if .npmrc or package.json exists → create/verify .npmignore (if publishing)
-- Check if terraform files (*.tf) exist → create/verify .terraformignore
-- Check if .helmignore needed (helm charts present) → create/verify .helmignore
+- Check if Dockerfile* exists or Docker in `plan.md` → create/verify `.dockerignore`
+- Check if `.eslintrc*` exists → create/verify `.eslintignore`
+- Check if `eslint.config.*` exists → ensure the config's `ignores` entries cover required patterns
+- Check if `.prettierrc*` exists → create/verify `.prettierignore`
+- Check if `.npmrc` or `package.json` exists → create/verify `.npmignore` (if publishing)
+- Check if terraform files (`*.tf`) exist → create/verify `.terraformignore`
+- Check if `.helmignore` needed (helm charts present) → create/verify `.helmignore`

 **If ignore file already exists**: Verify it contains essential patterns, append missing critical patterns only
 **If ignore file missing**: Create with full pattern set for detected technology

-**Common Patterns by Technology** (from plan.md tech stack):
+**Common Patterns by Technology** (from `plan.md` tech stack):
 - **Node.js/JavaScript/TypeScript**: `node_modules/`, `dist/`, `build/`, `*.log`, `.env*`
 - **Python**: `__pycache__/`, `*.pyc`, `.venv/`, `venv/`, `dist/`, `*.egg-info/`
 - **Java**: `target/`, `*.class`, `*.jar`, `.gradle/`, `build/`
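The detection rules in step 4 can be sketched as a single function that reports which ignore files a project needs. This is an illustration under stated assumptions: `is_git_repo` and `required_ignore_files` are hypothetical helpers, Helm charts are recognised by a `Chart.yaml` manifest, Terraform files are looked for only at the project root, and the `eslint.config.*` rule is left out because it requires inspecting the config's `ignores` entries rather than creating a file.

```python
import subprocess
from pathlib import Path


def is_git_repo(root: Path) -> bool:
    """Mirror `git rev-parse --git-dir`; a missing git binary counts as 'not a repo'."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--git-dir"],
            cwd=root,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return False
    return result.returncode == 0


def required_ignore_files(root: Path, plan_text: str = "", publishing: bool = False) -> list[str]:
    """List the ignore files that should be created or verified for this project."""
    needed = []
    if is_git_repo(root):
        needed.append(".gitignore")
    if any(root.glob("Dockerfile*")) or "docker" in plan_text.lower():
        needed.append(".dockerignore")
    if any(root.glob(".eslintrc*")):
        needed.append(".eslintignore")
    if any(root.glob(".prettierrc*")):
        needed.append(".prettierignore")
    if publishing and ((root / ".npmrc").exists() or (root / "package.json").exists()):
        needed.append(".npmignore")
    # Assumption: only *.tf files at the project root are considered.
    if any(root.glob("*.tf")):
        needed.append(".terraformignore")
    # Assumption: a Helm chart is recognised by its Chart.yaml manifest.
    if any(root.rglob("Chart.yaml")):
        needed.append(".helmignore")
    return needed


if __name__ == "__main__":
    print(required_ignore_files(Path.cwd()))
```

Returning a list rather than writing files keeps the detection side-effect free, so the "append missing critical patterns only" rule can be applied in a separate step.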
@@ -107,11 +108,12 @@ You **MUST** consider the user input before proceeding (if not empty).
 - **Terraform**: `.terraform/`, `*.tfstate*`, `*.tfvars`, `.terraform.lock.hcl`
 - **Kubernetes/k8s**: `*.secret.yaml`, `secrets/`, `.kube/`, `kubeconfig*`, `*.key`, `*.crt`

-5. Parse tasks.md structure and extract:
+5. Parse `tasks.md` structure and extract:
 - **Task phases**: Setup, Tests, Core, Integration, Polish
 - **Task dependencies**: Sequential vs parallel execution rules
 - **Task details**: ID, description, file paths, parallel markers [P]
 - **Execution flow**: Order and dependency requirements
+- **Decision-memory requirements**: which tasks inherit ADR ids, `@RATIONALE`, and `@REJECTED` guardrails

 6. Execute implementation following the task plan:
 - **Phase-by-phase execution**: Complete each phase before moving to the next
@@ -119,6 +121,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - **Follow TDD approach**: Execute test tasks before their corresponding implementation tasks
 - **File-based coordination**: Tasks affecting the same files must run sequentially
 - **Validation checkpoints**: Verify each phase completion before proceeding
+- **ADR guardrail discipline**: if a task packet or local contract forbids a path via `@REJECTED`, do not treat it as an implementation option

 7. Implementation execution rules:
 - **Strict Adherence**: Apply `.ai/standards/semantics.md` rules:
@@ -134,8 +137,10 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- For Python Complexity 5 modules, `belief_scope(...)` is mandatory and the critical path must be irrigated with `logger.reason()` / `logger.reflect()` according to the contract.
|
- For Python Complexity 5 modules, `belief_scope(...)` is mandatory and the critical path must be irrigated with `logger.reason()` / `logger.reflect()` according to the contract.
|
||||||
- For Svelte components, require `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`; runes-only reactivity is allowed (`$state`, `$derived`, `$effect`, `$props`).
|
- For Svelte components, require `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`; runes-only reactivity is allowed (`$state`, `$derived`, `$effect`, `$props`).
|
||||||
- Reject pseudo-semantic markup: docstrings containing loose `@PURPOSE` / `@PRE` text do **NOT** satisfy the protocol unless represented in canonical anchored metadata blocks.
|
- Reject pseudo-semantic markup: docstrings containing loose `@PURPOSE` / `@PRE` text do **NOT** satisfy the protocol unless represented in canonical anchored metadata blocks.
|
||||||
|
- Preserve and propagate decision-memory tags. Upstream `@RATIONALE` / `@REJECTED` are mandatory when carried by the task packet or contract.
|
||||||
|
- If `logger.explore()` or equivalent runtime evidence leads to a retained workaround, mutate the same contract header with reactive Micro-ADR tags: `@RATIONALE` and `@REJECTED`.
|
||||||
- **Self-Audit**: The Coder MUST use `axiom-core` tools (like `audit_contracts_tool`) to verify semantic compliance before completion.
|
- **Self-Audit**: The Coder MUST use `axiom-core` tools (like `audit_contracts_tool`) to verify semantic compliance before completion.
|
||||||
- **Semantic Rejection Gate**: If self-audit reveals broken anchors, missing closing tags, missing required metadata for the effective complexity, orphaned critical classes/functions, or Complexity 4/5 Python code without required belief-state logging, the task is NOT complete and cannot be handed off as accepted work.
|
- **Semantic Rejection Gate**: If self-audit reveals broken anchors, missing closing tags, missing required metadata for the effective complexity, orphaned critical classes/functions, Complexity 4/5 Python code without required belief-state logging, or retained workarounds without decision-memory tags, the task is NOT complete and cannot be handed off as accepted work.
|
||||||
- **CRITICAL Contracts**: If a task description contains a contract summary (e.g., `CRITICAL: PRE: ..., POST: ...`), these constraints are **MANDATORY** and must be strictly implemented in the code using guards/assertions (if applicable per protocol).
|
- **CRITICAL Contracts**: If a task description contains a contract summary (e.g., `CRITICAL: PRE: ..., POST: ...`), these constraints are **MANDATORY** and must be strictly implemented in the code using guards/assertions (if applicable per protocol).
|
||||||
- **Setup first**: Initialize project structure, dependencies, configuration
|
- **Setup first**: Initialize project structure, dependencies, configuration
|
||||||
- **Tests before code**: If you need to write tests for contracts, entities, and integration scenarios
|
- **Tests before code**: If you need to write tests for contracts, entities, and integration scenarios
|
||||||
@@ -150,11 +155,13 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- Provide clear error messages with context for debugging.
|
- Provide clear error messages with context for debugging.
|
||||||
- Suggest next steps if implementation cannot proceed.
|
- Suggest next steps if implementation cannot proceed.
|
||||||
- **IMPORTANT** For completed tasks, mark as [X] only AFTER local verification and self-audit.
|
- **IMPORTANT** For completed tasks, mark as [X] only AFTER local verification and self-audit.
|
||||||
|
- If blocked because the only apparent fix is listed in upstream `@REJECTED`, escalate for decision revision instead of silently overriding the guardrail.
|
||||||
|
|
||||||
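The decision-memory rule for retained workarounds (step 7) could be checked mechanically along these lines. This is a minimal sketch, assuming tags appear one per line as `# @TAG: value` in a contract header; `read_tags` and `missing_decision_memory` are hypothetical helpers, not part of `axiom-core`.

```python
import re

# Matches "# @TAG: value" or "// @TAG: value"; the comment prefix is optional.
TAG_LINE = re.compile(r"^\s*(?:#|//)?\s*@(?P<tag>[A-Z_]+):\s*(?P<value>.*)$")


def read_tags(header: str) -> dict[str, list[str]]:
    """Collect every tag value found in a contract header."""
    tags: dict[str, list[str]] = {}
    for line in header.splitlines():
        match = TAG_LINE.match(line)
        if match:
            tags.setdefault(match["tag"], []).append(match["value"].strip())
    return tags


def missing_decision_memory(header: str) -> list[str]:
    """Return the decision-memory tags that are absent or empty."""
    tags = read_tags(header)
    return [
        tag
        for tag in ("RATIONALE", "REJECTED")
        if not any(tags.get(tag, []))
    ]
```

A non-empty result for a retained workaround would mean the task cannot be marked `[X]` under the Semantic Rejection Gate.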
9. **Handoff to Tester (Audit Loop)**:
|
9. **Handoff to Tester (Audit Loop)**:
|
||||||
- Once a task or phase is complete, the Coder hands off to the Tester.
|
- Once a task or phase is complete, the Coder hands off to the Tester.
|
||||||
- Handoff includes: file paths, declared complexity, expected contracts (`@PRE`, `@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, `@INVARIANT` when applicable), and a short logic overview.
|
- Handoff includes: file paths, declared complexity, expected contracts (`@PRE`, `@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, `@INVARIANT` when applicable), and a short logic overview.
|
||||||
- Handoff MUST explicitly disclose any contract exceptions or known semantic debt. Hidden semantic debt is forbidden.
|
- Handoff MUST explicitly disclose any contract exceptions or known semantic debt. Hidden semantic debt is forbidden.
|
||||||
|
- Handoff MUST disclose decision-memory changes: inherited ADR ids, new or updated `@RATIONALE`, new or updated `@REJECTED`, and any blocked paths that remain active.
|
||||||
- The handoff payload MUST instruct the Tester to execute the dedicated testing workflow [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md), not just perform an informal review.
|
- The handoff payload MUST instruct the Tester to execute the dedicated testing workflow [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md), not just perform an informal review.
|
||||||
|
|
||||||
10. **Tester Verification & Orchestrator Gate**:
|
10. **Tester Verification & Orchestrator Gate**:
|
||||||
@@ -164,11 +171,12 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- Reject code that only imitates the protocol superficially, such as free-form docstrings with `@PURPOSE` text but without canonical `[DEF]...[/DEF]` anchors and header metadata.
|
- Reject code that only imitates the protocol superficially, such as free-form docstrings with `@PURPOSE` text but without canonical `[DEF]...[/DEF]` anchors and header metadata.
|
||||||
- Verify that effective complexity and required metadata match [`.ai/standards/semantics.md`](.ai/standards/semantics.md).
|
- Verify that effective complexity and required metadata match [`.ai/standards/semantics.md`](.ai/standards/semantics.md).
|
||||||
- Verify that Python Complexity 4/5 implementations include required belief-state instrumentation (`belief_scope`, `logger.reason()`, `logger.reflect()`).
|
- Verify that Python Complexity 4/5 implementations include required belief-state instrumentation (`belief_scope`, `logger.reason()`, `logger.reflect()`).
|
||||||
|
- Verify that upstream rejected paths were not silently restored.
|
||||||
- Emulate algorithms "in mind" step-by-step to ensure logic consistency.
|
- Emulate algorithms "in mind" step-by-step to ensure logic consistency.
|
||||||
- Verify unit tests match the declared contracts.
|
- Verify unit tests match the declared contracts.
|
||||||
- If Tester finds issues:
|
- If Tester finds issues:
|
||||||
- Emit `[AUDIT_FAIL: semantic_noncompliance | contract_mismatch | logic_mismatch | test_mismatch | speckit_test_not_run]`.
|
- Emit `[AUDIT_FAIL: semantic_noncompliance | contract_mismatch | logic_mismatch | test_mismatch | speckit_test_not_run | rejected_path_regression]`.
|
||||||
- Provide concrete file-path-based reasons, for example: missing anchors, module/class contract mismatch, missing `@DATA_CONTRACT`, missing `logger.reason()`, illegal docstring-only annotations, or missing execution of [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md).
|
- Provide concrete file-path-based reasons, for example: missing anchors, module/class contract mismatch, missing `@DATA_CONTRACT`, missing `logger.reason()`, illegal docstring-only annotations, missing decision-memory tags, re-enabled upstream rejected path, or missing execution of [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md).
|
||||||
- Notify the Orchestrator.
|
- Notify the Orchestrator.
|
||||||
- Orchestrator redirects the feedback to the Coder for remediation.
|
- Orchestrator redirects the feedback to the Coder for remediation.
|
||||||
- Orchestrator green-status rule:
|
- Orchestrator green-status rule:
|
||||||
@@ -187,7 +195,9 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- class/function-level docstring contracts standing in for canonical anchors,
|
- class/function-level docstring contracts standing in for canonical anchors,
|
||||||
- missing closing anchors,
|
- missing closing anchors,
|
||||||
- missing required metadata for declared complexity,
|
- missing required metadata for declared complexity,
|
||||||
- Complexity 5 repository/service code using only `belief_scope(...)` without explicit `logger.reason()` / `logger.reflect()` checkpoints.
|
- Complexity 5 repository/service code using only `belief_scope(...)` without explicit `logger.reason()` / `logger.reflect()` checkpoints,
|
||||||
|
- retained workarounds missing local `@RATIONALE` / `@REJECTED`,
|
||||||
|
- silent resurrection of paths already blocked by upstream ADR or task guardrails.
|
||||||
- Report final status with summary of completed and audited work.
|
- Report final status with summary of completed and audited work.
|
||||||
|
|
||||||
Note: This command assumes a complete task breakdown exists in tasks.md. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.
|
Note: This command assumes a complete task breakdown exists in `tasks.md`. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.
|
||||||
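The `[AUDIT_FAIL: ...]` marker from step 10 can be sketched as a small formatter. The reason names come from the workflow text above; reading the pipe-separated list as "pick one reason per marker" is an assumption, and `audit_fail` is a hypothetical helper.

```python
AUDIT_FAIL_REASONS = frozenset(
    {
        "semantic_noncompliance",
        "contract_mismatch",
        "logic_mismatch",
        "test_mismatch",
        "speckit_test_not_run",
        "rejected_path_regression",
    }
)


def audit_fail(reason: str, findings: list[tuple[str, str]]) -> str:
    """Format an audit failure with concrete file-path-based reasons.

    `findings` is a list of (file_path, explanation) pairs.
    """
    if reason not in AUDIT_FAIL_REASONS:
        raise ValueError(f"unknown audit failure reason: {reason}")
    if not findings:
        raise ValueError("at least one file-path-based finding is required")
    lines = [f"[AUDIT_FAIL: {reason}]"]
    lines += [f"- {path}: {explanation}" for path, explanation in findings]
    return "\n".join(lines)
```

Rejecting an empty findings list mirrors the requirement that the Tester give concrete, file-path-based reasons rather than a bare verdict.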
|
|||||||
@@ -28,12 +28,13 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
|
- Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
|
||||||
- Fill Constitution Check section from constitution
|
- Fill Constitution Check section from constitution
|
||||||
- Evaluate gates (ERROR if violations unjustified)
|
- Evaluate gates (ERROR if violations unjustified)
|
||||||
- Phase 0: Generate research.md (resolve all NEEDS CLARIFICATION)
|
- Phase 0: Generate `research.md` (resolve all NEEDS CLARIFICATION)
|
||||||
- Phase 1: Generate data-model.md, contracts/, quickstart.md
|
- Phase 1: Generate `data-model.md`, `contracts/`, `quickstart.md`
|
||||||
|
- Phase 1: Generate global ADR artifacts and connect them to the plan
|
||||||
- Phase 1: Update agent context by running the agent script
|
- Phase 1: Update agent context by running the agent script
|
||||||
- Re-evaluate Constitution Check post-design
|
- Re-evaluate Constitution Check post-design
|
||||||
|
|
||||||
4. **Stop and report**: Command ends after Phase 2 planning. Report branch, IMPL_PLAN path, and generated artifacts.
|
4. **Stop and report**: Command ends after Phase 2 planning. Report branch, IMPL_PLAN path, generated artifacts, and ADR decisions created.
|
||||||
|
|
||||||
## Phases
|
## Phases
|
||||||
|
|
||||||
@@ -58,9 +59,9 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- Rationale: [why chosen]
|
- Rationale: [why chosen]
|
||||||
- Alternatives considered: [what else evaluated]
|
- Alternatives considered: [what else evaluated]
|
||||||
|
|
||||||
**Output**: research.md with all NEEDS CLARIFICATION resolved
|
**Output**: `research.md` with all NEEDS CLARIFICATION resolved
|
||||||
|
|
||||||
### Phase 1: Design & Contracts
|
### Phase 1: Design, ADRs & Contracts
|
||||||
|
|
||||||
**Prerequisites:** `research.md` complete
|
**Prerequisites:** `research.md` complete
|
||||||
|
|
||||||
@@ -72,7 +73,23 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
1. **Extract entities from feature spec** → `data-model.md`:
|
1. **Extract entities from feature spec** → `data-model.md`:
|
||||||
- Entity name, fields, relationships, validation rules.
|
- Entity name, fields, relationships, validation rules.
|
||||||
|
|
||||||
2. **Design & Verify Contracts (Semantic Protocol)**:
|
2. **Generate Global ADRs (Decision Memory Root Layer)**:
|
||||||
|
- Read `spec.md`, `research.md`, and the technical context to identify repo-shaping decisions: storage, auth pattern, framework boundaries, integration patterns, deployment assumptions, failure strategy.
|
||||||
|
- For each durable architectural choice, emit a standalone semantic ADR block using `[DEF:DecisionId:ADR]`.
|
||||||
|
- Every ADR block MUST include:
|
||||||
|
- `@COMPLEXITY: 3` or `4` depending on blast radius
|
||||||
|
- `@PURPOSE`
|
||||||
|
- `@RATIONALE`
|
||||||
|
- `@REJECTED`
|
||||||
|
- `@RELATION` back to the originating spec/research/plan boundary or target module family
|
||||||
|
- Preferred destinations:
|
||||||
|
- `docs/architecture.md` for cross-cutting repository decisions
|
||||||
|
- feature-local design docs when the decision is feature-scoped
|
||||||
|
- root module headers only when the decision scope is truly local
|
||||||
|
- **Hard Gate**: do not continue to task decomposition until the blocking global decisions have been materialized as ADR nodes.
|
||||||
|
- **Anti-Regression Goal**: a later orchestrator must be able to read these ADRs and avoid creating tasks for rejected branches.
|
||||||
|
|
||||||
|
3. **Design & Verify Contracts (Semantic Protocol)**:
|
||||||
- **Drafting**: Define semantic headers, metadata, and closing anchors for all new modules strictly from `.ai/standards/semantics.md`.
|
- **Drafting**: Define semantic headers, metadata, and closing anchors for all new modules strictly from `.ai/standards/semantics.md`.
|
||||||
- **Complexity Classification**: Classify each contract with `@COMPLEXITY: [1|2|3|4|5]` or `@C:`. Treat `@TIER` only as a legacy compatibility hint and never as the primary rule source.
|
- **Complexity Classification**: Classify each contract with `@COMPLEXITY: [1|2|3|4|5]` or `@C:`. Treat `@TIER` only as a legacy compatibility hint and never as the primary rule source.
|
||||||
- **Adaptive Contract Requirements**:
|
- **Adaptive Contract Requirements**:
|
||||||
@@ -81,34 +98,42 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- **Complexity 3**: require `@PURPOSE` and `@RELATION`; UI also requires `@UX_STATE`.
|
- **Complexity 3**: require `@PURPOSE` and `@RELATION`; UI also requires `@UX_STATE`.
|
||||||
- **Complexity 4**: require `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`; Python modules must define a meaningful `logger.reason()` / `logger.reflect()` path or equivalent belief-state mechanism.
|
- **Complexity 4**: require `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`; Python modules must define a meaningful `logger.reason()` / `logger.reflect()` path or equivalent belief-state mechanism.
|
||||||
- **Complexity 5**: require full level-4 contract plus `@DATA_CONTRACT` and `@INVARIANT`; Python modules must require `belief_scope`; UI modules must define UX contracts including `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`.
|
- **Complexity 5**: require full level-4 contract plus `@DATA_CONTRACT` and `@INVARIANT`; Python modules must require `belief_scope`; UI modules must define UX contracts including `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`.
|
||||||
|
- **Decision-Memory Propagation**:
|
||||||
|
- If a module/function/component realizes or is constrained by an ADR, add local `@RATIONALE` and `@REJECTED` guardrails before coding begins.
|
||||||
|
- Use `@RELATION: IMPLEMENTS ->[AdrId]` when the contract realizes the ADR.
|
||||||
|
- Use `@RELATION: DEPENDS_ON ->[AdrId]` when the contract is merely constrained by the ADR.
|
||||||
|
- Record known LLM traps directly in the contract header so the implementer inherits the guardrail from the start.
|
||||||
- **Relation Syntax**: Write dependency edges in canonical GraphRAG form: `@RELATION: [PREDICATE] ->[TARGET_ID]`.
|
- **Relation Syntax**: Write dependency edges in canonical GraphRAG form: `@RELATION: [PREDICATE] ->[TARGET_ID]`.
|
||||||
- **Context Guard**: If a target relation, DTO, or required dependency cannot be named confidently, stop generation and emit `[NEED_CONTEXT: target]` instead of inventing placeholders.
|
- **Context Guard**: If a target relation, DTO, required dependency, or decision rationale cannot be named confidently, stop generation and emit `[NEED_CONTEXT: target]` instead of inventing placeholders.
|
||||||
- **Testing Contracts**: Add `@TEST_CONTRACT`, `@TEST_SCENARIO`, `@TEST_FIXTURE`, `@TEST_EDGE`, and `@TEST_INVARIANT` when the design introduces audit-critical or explicitly test-governed contracts, especially for Complexity 5 boundaries.
|
- **Testing Contracts**: Add `@TEST_CONTRACT`, `@TEST_SCENARIO`, `@TEST_FIXTURE`, `@TEST_EDGE`, and `@TEST_INVARIANT` when the design introduces audit-critical or explicitly test-governed contracts, especially for Complexity 5 boundaries.
|
||||||
- **Self-Review**:
|
- **Self-Review**:
|
||||||
- *Complexity Fit*: Does each contract include exactly the metadata and contract density required by its complexity level?
|
- *Complexity Fit*: Does each contract include exactly the metadata and contract density required by its complexity level?
|
||||||
- *Completeness*: Do `@PRE`/`@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, and UX tags cover the edge cases identified in Research and UX Reference?
|
- *Completeness*: Do `@PRE`/`@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, UX tags, and decision-memory tags cover the edge cases identified in Research and UX Reference?
|
||||||
- *Connectivity*: Do `@RELATION` tags form a coherent graph using canonical `@RELATION: [PREDICATE] ->[TARGET_ID]` syntax?
|
- *Connectivity*: Do `@RELATION` tags form a coherent graph using canonical `@RELATION: [PREDICATE] ->[TARGET_ID]` syntax?
|
||||||
- *Compliance*: Are all anchors properly opened and closed, and does the chosen comment syntax match the target medium?
|
- *Compliance*: Are all anchors properly opened and closed, and does the chosen comment syntax match the target medium?
|
||||||
- *Belief-State Requirements*: Do Complexity 4/5 Python modules explicitly account for `logger.reason()`, `logger.reflect()`, and `belief_scope` requirements?
|
- *Belief-State Requirements*: Do Complexity 4/5 Python modules explicitly account for `logger.reason()`, `logger.reflect()`, and `belief_scope` requirements?
|
||||||
|
- *ADR Continuity*: Does every blocking architectural decision have a corresponding ADR node and at least one downstream guarded contract?
|
||||||
- **Output**: Write verified contracts to `contracts/modules.md`.
|
- **Output**: Write verified contracts to `contracts/modules.md`.
|
||||||
|
|
||||||
3. **Simulate Contract Usage**:
|
4. **Simulate Contract Usage**:
|
||||||
- Trace one key user scenario through the defined contracts to ensure data flow continuity.
|
- Trace one key user scenario through the defined contracts to ensure data flow continuity.
|
||||||
- If a contract interface mismatch is found, fix it immediately.
|
- If a contract interface mismatch is found, fix it immediately.
|
||||||
|
- Verify that no traced path accidentally realizes an alternative already named in any ADR `@REJECTED` tag.
|
||||||
|
|
||||||
4. **Generate API contracts**:
|
5. **Generate API contracts**:
|
||||||
- Output OpenAPI/GraphQL schema to `/contracts/` for backend-frontend sync.
|
- Output OpenAPI/GraphQL schema to `/contracts/` for backend-frontend sync.
|
||||||
|
|
||||||
5. **Agent context update**:
|
6. **Agent context update**:
|
||||||
- Run `.specify/scripts/bash/update-agent-context.sh kilocode`
|
- Run `.specify/scripts/bash/update-agent-context.sh kilocode`
|
||||||
- These scripts detect which AI agent is in use
|
- These scripts detect which AI agent is in use
|
||||||
- Update the appropriate agent-specific context file
|
- Update the appropriate agent-specific context file
|
||||||
- Add only new technology from current plan
|
- Add only new technology from current plan
|
||||||
- Preserve manual additions between markers
|
- Preserve manual additions between markers
|
||||||
|
|
||||||
**Output**: data-model.md, /contracts/*, quickstart.md, agent-specific file
|
**Output**: `data-model.md`, `/contracts/*`, `quickstart.md`, ADR artifact(s), agent-specific file
|
||||||
|
|
||||||
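The mandatory ADR fields from the "Generate Global ADRs" step can be checked with a short validator. This is a sketch under assumptions: the one-tag-per-line header layout follows the style of other headers in this repository, while `validate_adr` and the sample decision in the usage note are purely illustrative.

```python
import re

REQUIRED_ADR_TAGS = ("COMPLEXITY", "PURPOSE", "RATIONALE", "REJECTED", "RELATION")
ADR_ANCHOR = re.compile(r"\[DEF:[A-Za-z0-9_]+:ADR\]")
TAG = re.compile(r"@([A-Z_]+):\s*(.*)")


def validate_adr(block: str) -> list[str]:
    """Return a list of problems; an empty list means the ADR block passes."""
    # Drop leading comment markers so "# @TAG: value" and "@TAG: value" both work.
    lines = [line.strip().lstrip("#").strip() for line in block.splitlines()]
    problems = []
    if not lines or not ADR_ANCHOR.fullmatch(lines[0]):
        problems.append("missing [DEF:<DecisionId>:ADR] anchor")
    tags: dict[str, list[str]] = {}
    for line in lines:
        match = TAG.match(line)
        if match:
            tags.setdefault(match[1], []).append(match[2])
    for tag in REQUIRED_ADR_TAGS:
        if tag not in tags:
            problems.append(f"missing @{tag}")
    # The plan workflow limits ADR complexity to 3 or 4.
    if "COMPLEXITY" in tags and tags["COMPLEXITY"][0] not in ("3", "4"):
        problems.append("ADR @COMPLEXITY must be 3 or 4")
    return problems
```

For example, a block opening with `# [DEF:SessionStorage:ADR]` and carrying all five tags with `@COMPLEXITY: 4` would return an empty list, while dropping `@REJECTED` would yield `["missing @REJECTED"]`.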
## Key rules
|
## Key rules
|
||||||
|
|
||||||
- Use absolute paths
|
- Use absolute paths
|
||||||
- ERROR on gate failures or unresolved clarifications
|
- ERROR on gate failures or unresolved clarifications
|
||||||
|
- Do not hand off to [`speckit.tasks`](.kilocode/workflows/speckit.tasks.md) until blocking ADRs exist and rejected branches are explicit
|
||||||
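The *ADR Continuity* self-review question (every blocking decision has at least one downstream guarded contract) lends itself to a simple graph check. The sketch below assumes contract headers are available as plain text keyed by contract id, and that "guarded" means the header carries both `@RATIONALE` and `@REJECTED`; `adrs_without_downstream` is a hypothetical helper.

```python
import re

# Canonical form from the plan workflow: "@RELATION: [PREDICATE] ->[TARGET_ID]"
RELATION = re.compile(
    r"@RELATION:\s*(?P<predicate>[A-Z_]+)\s*->\s*\[(?P<target>[^\]]+)\]"
)


def adrs_without_downstream(adr_ids: list[str], contracts: dict[str, str]) -> list[str]:
    """Return ADR ids that no guarded contract implements or depends on."""
    covered = set()
    for header in contracts.values():
        guarded = "@RATIONALE:" in header and "@REJECTED:" in header
        if not guarded:
            continue  # unguarded contracts do not count toward continuity
        for match in RELATION.finditer(header):
            if match["predicate"] in ("IMPLEMENTS", "DEPENDS_ON"):
                covered.add(match["target"])
    return sorted(set(adr_ids) - covered)
```

Any id returned here would block the handoff to `speckit.tasks`, since the corresponding rejected branches are not yet visible to downstream contracts.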
|
|||||||
@@ -12,7 +12,7 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
|
|
||||||
## Goal
|
## Goal
|
||||||
|
|
||||||
Ensure the codebase adheres to the semantic standards defined in `.ai/standards/semantics.md` by using the AXIOM MCP semantic graph as the primary execution engine. This involves reindexing the workspace, measuring semantic health, auditing contract compliance, and optionally delegating contract-safe fixes through MCP-aware agents.
|
Ensure the codebase adheres to the semantic standards defined in `.ai/standards/semantics.md` by using the AXIOM MCP semantic graph as the primary execution engine. This involves reindexing the workspace, measuring semantic health, auditing contract compliance, auditing decision-memory continuity, and optionally delegating contract-safe fixes through MCP-aware agents.
|
||||||
|
|
||||||
## Operating Constraints
|
## Operating Constraints
|
||||||
|
|
||||||
@@ -25,6 +25,7 @@ Ensure the codebase adheres to the semantic standards defined in `.ai/standards/
|
|||||||
7. **ID NAMING (CRITICAL)**: NEVER use fully-qualified Python import paths in `[DEF:id:Type]`. Use short, domain-driven semantic IDs (e.g., `[DEF:AuthService:Class]`). Follow the exact style shown in `.ai/standards/semantics.md`.
|
7. **ID NAMING (CRITICAL)**: NEVER use fully-qualified Python import paths in `[DEF:id:Type]`. Use short, domain-driven semantic IDs (e.g., `[DEF:AuthService:Class]`). Follow the exact style shown in `.ai/standards/semantics.md`.
|
||||||
8. **ORPHAN PREVENTION**: To reduce the orphan count, you MUST physically wrap actual class and function definitions with `[DEF:id:Type] ... [/DEF]` blocks in the code. Modifying `@RELATION` tags does NOT fix orphans. The AST parser flags any unwrapped function as an orphan.
|
8. **ORPHAN PREVENTION**: To reduce the orphan count, you MUST physically wrap actual class and function definitions with `[DEF:id:Type] ... [/DEF]` blocks in the code. Modifying `@RELATION` tags does NOT fix orphans. The AST parser flags any unwrapped function as an orphan.
|
||||||
- **Exception for Tests**: In test modules, use `BINDS_TO` to link major helpers to the module root. Small helpers remain C1 and don't need relations.
|
- **Exception for Tests**: In test modules, use `BINDS_TO` to link major helpers to the module root. Small helpers remain C1 and don't need relations.
|
||||||
|
9. **DECISION-MEMORY CONTINUITY**: Audit ADR nodes, preventive task guardrails, and reactive Micro-ADR tags as one anti-regression chain. Missing or contradictory `@RATIONALE` / `@REJECTED` is a first-class semantic defect.
|
||||||
|
|
||||||
## Execution Steps
|
## Execution Steps
|
||||||
|
|
||||||
@@ -48,8 +49,13 @@ Treat high orphan counts and unresolved relations as first-class health indicato
|
|||||||
Use [`audit_contracts_tool`](.kilo/mcp.json) and classify findings into:
|
Use [`audit_contracts_tool`](.kilo/mcp.json) and classify findings into:
|
||||||
- **Critical Parsing/Structure Errors**: malformed or incoherent semantic contract regions
|
- **Critical Parsing/Structure Errors**: malformed or incoherent semantic contract regions
|
||||||
- **Critical Contract Gaps**: missing [`@DATA_CONTRACT`](.ai/standards/semantics.md), [`@PRE`](.ai/standards/semantics.md), [`@POST`](.ai/standards/semantics.md), [`@SIDE_EFFECT`](.ai/standards/semantics.md) on CRITICAL contracts
|
- **Critical Contract Gaps**: missing [`@DATA_CONTRACT`](.ai/standards/semantics.md), [`@PRE`](.ai/standards/semantics.md), [`@POST`](.ai/standards/semantics.md), [`@SIDE_EFFECT`](.ai/standards/semantics.md) on CRITICAL contracts
|
||||||
|
- **Decision-Memory Gaps**:
|
||||||
|
- missing standalone `[DEF:id:ADR]` for repo-shaping decisions
|
||||||
|
- missing `@RATIONALE` / `@REJECTED` where task or implementation context clearly requires guardrails
|
||||||
|
- retained workaround code without local reactive Micro-ADR tags
|
||||||
|
- implementation that silently re-enables a path declared in upstream `@REJECTED`
|
||||||
- **Coverage Gaps**: missing [`@TIER`](.ai/standards/semantics.md), missing [`@PURPOSE`](.ai/standards/semantics.md)
|
- **Coverage Gaps**: missing [`@TIER`](.ai/standards/semantics.md), missing [`@PURPOSE`](.ai/standards/semantics.md)
|
||||||
- **Graph Breakages**: unresolved relations, broken references, isolated critical contracts
|
- **Graph Breakages**: unresolved relations, broken references, isolated critical contracts, ADR nodes without downstream guarded contracts
|
||||||
|
|
||||||
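The classification above can be pictured as a lookup from finding kind to category. The output shape of `audit_contracts_tool` is not shown in this diff, so the `kind` field and every key in `CATEGORY_BY_KIND` are assumptions made for illustration only; the category names are taken from the list above.

```python
from collections import defaultdict

# Hypothetical finding kinds mapped to the categories named in the workflow.
CATEGORY_BY_KIND = {
    "malformed_anchor": "Critical Parsing/Structure Errors",
    "missing_data_contract": "Critical Contract Gaps",
    "missing_pre": "Critical Contract Gaps",
    "missing_post": "Critical Contract Gaps",
    "missing_side_effect": "Critical Contract Gaps",
    "missing_adr": "Decision-Memory Gaps",
    "missing_rationale": "Decision-Memory Gaps",
    "missing_rejected": "Decision-Memory Gaps",
    "rejected_path_reenabled": "Decision-Memory Gaps",
    "missing_purpose": "Coverage Gaps",
    "unresolved_relation": "Graph Breakages",
    "adr_without_downstream": "Graph Breakages",
}


def classify(findings: list[dict]) -> dict[str, list[dict]]:
    """Group findings by category; unknown kinds land in 'Unclassified'."""
    grouped = defaultdict(list)
    for finding in findings:
        category = CATEGORY_BY_KIND.get(finding["kind"], "Unclassified")
        grouped[category].append(finding)
    return dict(grouped)
```

Keeping an explicit "Unclassified" bucket avoids silently dropping finding kinds the mapping does not yet know about.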
### 4. Build Remediation Context
|
### 4. Build Remediation Context
|
||||||
|
|
||||||
@@ -58,12 +64,14 @@ For the top failing contracts, use MCP semantic context tools such as [`get_sema
|
|||||||
2. Upstream/downstream semantic impact
|
2. Upstream/downstream semantic impact
|
||||||
3. Related tests and fixtures
|
3. Related tests and fixtures
|
||||||
4. Whether relation recovery is needed
|
4. Whether relation recovery is needed
|
||||||
|
5. Whether decision-memory continuity is broken between ADR, task contract, and implementation
|
||||||
|
|
||||||
### 5. Execute Fixes (Optional/Handoff)
|
### 5. Execute Fixes (Optional/Handoff)
|
||||||
|
|
||||||
If $ARGUMENTS contains `fix` or `apply`:
|
If $ARGUMENTS contains `fix` or `apply`:
|
||||||
- Handoff to the [`semantic`](.kilocodemodes) mode or a dedicated implementation agent instead of applying naive textual edits in orchestration.
|
- Handoff to the [`semantic`](.kilocodemodes) mode or a dedicated implementation agent instead of applying naive textual edits in orchestration.
|
||||||
- Require the fixing agent to prefer MCP contract mutation tools such as [`simulate_patch_tool`](.kilo/mcp.json), [`guarded_patch_contract_tool`](.kilo/mcp.json), [`patch_contract_tool`](.kilo/mcp.json), and [`infer_missing_relations_tool`](.kilo/mcp.json).
|
- Require the fixing agent to prefer MCP contract mutation tools such as [`simulate_patch_tool`](.kilo/mcp.json), [`guarded_patch_contract_tool`](.kilo/mcp.json), [`patch_contract_tool`](.kilo/mcp.json), and [`infer_missing_relations_tool`](.kilo/mcp.json).
|
||||||
|
- Require the fixing agent to preserve or restore `@RATIONALE` / `@REJECTED` continuity whenever blocked-path knowledge exists.
|
||||||
- After changes, re-run reindex, health, and audit MCP steps to verify the delta.
|
- After changes, re-run reindex, health, and audit MCP steps to verify the delta.
|
||||||
|
|
||||||
### 6. Review Gate
|
### 6. Review Gate
|
||||||
@@ -74,8 +82,9 @@ Before completion, request or perform an MCP-based review path aligned with the
|
|||||||
|
|
||||||
Provide a summary of the semantic state:
|
Provide a summary of the semantic state:
|
||||||
- **Health Metrics**: contracts / relations / orphans / unresolved_relations / files
|
- **Health Metrics**: contracts / relations / orphans / unresolved_relations / files
|
||||||
- **Status**: [PASS/FAIL] (FAIL if CRITICAL gaps or semantically significant unresolved relations exist)
|
- **Status**: [PASS/FAIL] (FAIL if CRITICAL gaps, rejected-path regressions, or semantically significant unresolved relations exist)
|
||||||
- **Top Issues**: List top 3-5 contracts or files needing attention.
|
- **Top Issues**: List top 3-5 contracts or files needing attention.
|
||||||
|
- **Decision Memory**: summarize missing ADRs, missing guardrails, and rejected-path regression risks.
|
||||||
- **Action Taken**: Summary of MCP analysis performed, context gathered, and fixes or handoffs initiated.
|
- **Action Taken**: Summary of MCP analysis performed, context gathered, and fixes or handoffs initiated.
|
||||||
|
|
||||||
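The PASS/FAIL rule in the summary can be expressed directly in code. This is a minimal sketch: the `SemanticReport` dataclass and its field names are assumptions, and deciding which unresolved relations are "semantically significant" is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class SemanticReport:
    contracts: int
    relations: int
    orphans: int
    unresolved_relations: int
    files: int
    critical_gaps: int = 0
    rejected_path_regressions: int = 0
    significant_unresolved: int = 0

    def status(self) -> str:
        """FAIL when any of the three blocking counters is non-zero."""
        blocking = (
            self.critical_gaps,
            self.rejected_path_regressions,
            self.significant_unresolved,
        )
        return "FAIL" if any(count > 0 for count in blocking) else "PASS"

    def summary(self) -> str:
        metrics = (
            f"{self.contracts} / {self.relations} / {self.orphans} / "
            f"{self.unresolved_relations} / {self.files}"
        )
        return f"Health Metrics: {metrics}\nStatus: {self.status()}"
```

Orphans and raw unresolved relations are reported but do not fail the run by themselves in this sketch; only the three blocking counters do.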
## Context
|
## Context
|
||||||
|
|||||||
@@ -24,26 +24,29 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
|
1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
|
||||||
|
|
||||||
2. **Load design documents**: Read from FEATURE_DIR:
|
2. **Load design documents**: Read from FEATURE_DIR:
|
||||||
- **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities), ux_reference.md (experience source of truth)
|
- **Required**: `plan.md` (tech stack, libraries, structure), `spec.md` (user stories with priorities), `ux_reference.md` (experience source of truth)
|
||||||
- **Optional**: data-model.md (entities), contracts/ (API endpoints), research.md (decisions), quickstart.md (test scenarios)
|
- **Optional**: `data-model.md` (entities), `contracts/` (API endpoints), `research.md` (decisions), `quickstart.md` (test scenarios)
|
||||||
|
- **Required when present in plan output**: ADR artifacts such as `docs/architecture.md` or feature-local architecture decision files containing `[DEF:id:ADR]` nodes
|
||||||
- Note: Not all projects have all documents. Generate tasks based on what's available.
|
- Note: Not all projects have all documents. Generate tasks based on what's available.
|
||||||
|
|
||||||
3. **Execute task generation workflow**:
|
3. **Execute task generation workflow**:
|
||||||
- Load plan.md and extract tech stack, libraries, project structure
|
- Load `plan.md` and extract tech stack, libraries, project structure
|
||||||
- Load spec.md and extract user stories with their priorities (P1, P2, P3, etc.)
|
- Load `spec.md` and extract user stories with their priorities (P1, P2, P3, etc.)
|
||||||
- If data-model.md exists: Extract entities and map to user stories
|
- Load ADR nodes and build a decision-memory inventory: `DecisionId`, `@RATIONALE`, `@REJECTED`, dependent modules
|
||||||
- If contracts/ exists: Map endpoints to user stories
|
- If `data-model.md` exists: Extract entities and map to user stories
|
||||||
- If research.md exists: Extract decisions for setup tasks
|
- If `contracts/` exists: Map endpoints to user stories
|
||||||
|
- If `research.md` exists: Extract decisions for setup tasks
|
||||||
- Generate tasks organized by user story (see Task Generation Rules below)
|
- Generate tasks organized by user story (see Task Generation Rules below)
|
||||||
- Generate dependency graph showing user story completion order
|
- Generate dependency graph showing user story completion order
|
||||||
- Create parallel execution examples per user story
|
- Create parallel execution examples per user story
|
||||||
- Validate task completeness (each user story has all needed tasks, independently testable)
|
- Validate task completeness (each user story has all needed tasks, independently testable)
|
||||||
|
- Validate guardrail continuity: no task may realize an ADR path named in `@REJECTED`
|
||||||
|
|
||||||
4. **Generate tasks.md**: Use `.specify/templates/tasks-template.md` as structure, fill with:
|
4. **Generate `tasks.md`**: Use `.specify/templates/tasks-template.md` as structure, fill with:
|
||||||
- Correct feature name from plan.md
|
- Correct feature name from `plan.md`
|
||||||
- Phase 1: Setup tasks (project initialization)
|
- Phase 1: Setup tasks (project initialization)
|
||||||
- Phase 2: Foundational tasks (blocking prerequisites for all user stories)
|
- Phase 2: Foundational tasks (blocking prerequisites for all user stories)
|
||||||
- Phase 3+: One phase per user story (in priority order from spec.md)
|
- Phase 3+: One phase per user story (in priority order from `spec.md`)
|
||||||
- Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
|
- Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
|
||||||
- Final Phase: Polish & cross-cutting concerns
|
- Final Phase: Polish & cross-cutting concerns
|
||||||
- All tasks must follow the strict checklist format (see Task Generation Rules below)
|
- All tasks must follow the strict checklist format (see Task Generation Rules below)
|
||||||
@@ -51,18 +54,20 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
- Dependencies section showing story completion order
|
- Dependencies section showing story completion order
|
||||||
- Parallel execution examples per story
|
- Parallel execution examples per story
|
||||||
- Implementation strategy section (MVP first, incremental delivery)
|
- Implementation strategy section (MVP first, incremental delivery)
|
||||||
|
- Decision-memory notes for guarded tasks when ADRs or known traps apply
|
||||||
|
|
||||||
5. **Report**: Output path to generated tasks.md and summary:
|
5. **Report**: Output path to generated `tasks.md` and summary:
|
||||||
- Total task count
|
- Total task count
|
||||||
- Task count per user story
|
- Task count per user story
|
||||||
- Parallel opportunities identified
|
- Parallel opportunities identified
|
||||||
- Independent test criteria for each story
|
- Independent test criteria for each story
|
||||||
- Suggested MVP scope (typically just User Story 1)
|
- Suggested MVP scope (typically just User Story 1)
|
||||||
- Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
|
- Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
|
||||||
|
- ADR propagation summary: which ADRs were inherited into task guardrails and which paths were rejected
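
The counts requested in the Report step can be derived mechanically from the generated `tasks.md`. The following is a minimal sketch, assuming each task line has the shape `- [ ] T001 [P] [US1] Description with file path`, as implied by the checklist rules in this document. The function name `summarize_tasks` and the regular expression are illustrative assumptions, not part of the actual tooling.

```python
import re
from collections import Counter

# Assumed task line shape, inferred from the checklist rules:
#   - [ ] T001 [P] [US1] Description with file path
TASK_RE = re.compile(
    r"^- \[(?P<done>[ xX])\] (?P<id>T\d{3,})"
    r"(?P<parallel> \[P\])?(?: \[(?P<story>US\d+)\])? (?P<text>\S.*)$"
)


def summarize_tasks(markdown: str) -> dict:
    """Count tasks, parallel tasks, and tasks per story; collect malformed lines."""
    total = 0
    parallel = 0
    per_story = Counter()
    malformed = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith("- ["):
            continue  # not a checklist line
        match = TASK_RE.match(stripped)
        if match is None:
            malformed.append(stripped)  # fails the format validation
            continue
        total += 1
        if match["parallel"]:
            parallel += 1
        if match["story"]:
            per_story[match["story"]] += 1
    return {
        "total": total,
        "parallel": parallel,
        "per_story": dict(per_story),
        "malformed": malformed,
    }
```

A non-empty `malformed` list would correspond to a failed format validation in the report.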
|
||||||
|
|
||||||
Context for task generation: $ARGUMENTS
|
Context for task generation: $ARGUMENTS
|
||||||
|
|
||||||
The tasks.md should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
|
The `tasks.md` should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
|
||||||
|
|
||||||
## Task Generation Rules
|
## Task Generation Rules
|
||||||
|
|
||||||
@@ -72,10 +77,11 @@ The tasks.md should be immediately executable - each task must be specific enoug
|
|||||||
|
|
||||||
### UX & Semantic Preservation (CRITICAL)
|
### UX & Semantic Preservation (CRITICAL)
|
||||||
|
|
||||||
- **Source of Truth**: `ux_reference.md` for UX, `.ai/standards/semantics.md` for Code.
|
- **Source of Truth**: `ux_reference.md` for UX, `.ai/standards/semantics.md` for code, and ADR artifacts for upstream technology decisions.
|
||||||
- **Violation Warning**: If any task violates UX or GRACE standards, flag it immediately.
|
- **Violation Warning**: If any task violates UX, ADR guardrails, or GRACE standards, flag it immediately.
|
||||||
- **Verification Task (UX)**: Add a task at the end of each Story phase: `- [ ] Txxx [USx] Verify implementation matches ux_reference.md (Happy Path & Errors)`
|
- **Verification Task (UX)**: Add a task at the end of each Story phase: `- [ ] Txxx [USx] Verify implementation matches ux_reference.md (Happy Path & Errors)`
|
||||||
- **Verification Task (Audit)**: Add a mandatory audit task at the end of each Story phase: `- [ ] Txxx [USx] Acceptance: Perform semantic audit & algorithm emulation by Tester`
|
- **Verification Task (Audit)**: Add a mandatory audit task at the end of each Story phase: `- [ ] Txxx [USx] Acceptance: Perform semantic audit & algorithm emulation by Tester`
|
||||||
|
- **Guardrail Rule**: If an ADR or contract says `@REJECTED`, task text must not schedule that path as implementation work.
|
||||||
|
|
||||||
### Checklist Format (REQUIRED)
|
### Checklist Format (REQUIRED)
|
||||||
|
|
||||||
@@ -91,7 +97,7 @@ Every task MUST strictly follow this format:
|
|||||||
2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
|
2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
|
||||||
3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
|
3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
|
||||||
4. **[Story] label**: REQUIRED for user story phase tasks only
|
4. **[Story] label**: REQUIRED for user story phase tasks only
|
||||||
- Format: [US1], [US2], [US3], etc. (maps to user stories from spec.md)
|
- Format: [US1], [US2], [US3], etc. (maps to user stories from `spec.md`)
|
||||||
- Setup phase: NO story label
|
- Setup phase: NO story label
|
||||||
- Foundational phase: NO story label
|
- Foundational phase: NO story label
|
||||||
- User Story phases: MUST have story label
|
- User Story phases: MUST have story label
|
||||||
@@ -111,7 +117,7 @@ Every task MUST strictly follow this format:
|
|||||||
|
|
||||||
### Task Organization
|
### Task Organization
|
||||||
|
|
||||||
1. **From User Stories (spec.md)** - PRIMARY ORGANIZATION:
|
1. **From User Stories (`spec.md`)** - PRIMARY ORGANIZATION:
|
||||||
- Each user story (P1, P2, P3...) gets its own phase
|
- Each user story (P1, P2, P3...) gets its own phase
|
||||||
- Map all related components to their story:
|
- Map all related components to their story:
|
||||||
- Models needed for that story
|
- Models needed for that story
|
||||||
@@ -127,12 +133,18 @@ Every task MUST strictly follow this format:
|
|||||||
- Map each contract/endpoint → to the user story it serves
|
- Map each contract/endpoint → to the user story it serves
|
||||||
- If tests requested: Each contract → contract test task [P] before implementation in that story's phase
|
- If tests requested: Each contract → contract test task [P] before implementation in that story's phase
|
||||||
|
|
||||||
3. **From Data Model**:
|
3. **From ADRs and Decision Memory**:
|
||||||
|
- For each implementation task constrained by an ADR, append a concise guardrail summary drawn from `@RATIONALE` and `@REJECTED`.
|
||||||
|
- Example: `- [ ] T021 [US1] Implement payload parsing guardrails in src/api/input.py (RATIONALE: strict validation because frontend sends numeric strings; REJECTED: json.loads() without schema validation)`
|
||||||
|
- If a task would naturally branch into an ADR-rejected alternative, rewrite the task around the accepted path instead of leaving the choice ambiguous.
|
||||||
|
- If no safe executable path remains because ADR context is incomplete, stop and emit `[NEED_CONTEXT: target]`.
|
||||||
|
|
||||||
|
4. **From Data Model**:
|
||||||
- Map each entity to the user story(ies) that need it
|
- Map each entity to the user story(ies) that need it
|
||||||
- If entity serves multiple stories: Put in earliest story or Setup phase
|
- If entity serves multiple stories: Put in earliest story or Setup phase
|
||||||
- Relationships → service layer tasks in appropriate story phase
|
- Relationships → service layer tasks in appropriate story phase
|
||||||
|
|
||||||
4. **From Setup/Infrastructure**:
|
5. **From Setup/Infrastructure**:
|
||||||
- Shared infrastructure → Setup phase (Phase 1)
|
- Shared infrastructure → Setup phase (Phase 1)
|
||||||
- Foundational/blocking tasks → Foundational phase (Phase 2)
|
- Foundational/blocking tasks → Foundational phase (Phase 2)
|
||||||
- Story-specific setup → within that story's phase
|
- Story-specific setup → within that story's phase
|
||||||
@@ -145,3 +157,11 @@ Every task MUST strictly follow this format:
|
|||||||
- Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
|
- Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
|
||||||
- Each phase should be a complete, independently testable increment
|
- Each phase should be a complete, independently testable increment
|
||||||
- **Final Phase**: Polish & Cross-Cutting Concerns
|
- **Final Phase**: Polish & Cross-Cutting Concerns
|
||||||
|
|
||||||
|
### Decision-Memory Validation Gate
|
||||||
|
|
||||||
|
Before finalizing `tasks.md`, verify all of the following:
|
||||||
|
- Every repo-shaping ADR from planning is either represented in a setup/foundational task or inherited by a downstream story task.
|
||||||
|
- Every guarded task that could tempt an implementer into a known wrong branch carries preventive `@RATIONALE` / `@REJECTED` guidance in its text.
|
||||||
|
- No task instructs the implementer to realize an ADR path already named as rejected.
|
||||||
|
- At least one explicit audit/verification task exists for checking rejected-path regressions in code review or test stages.
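
The third check in this gate (no task schedules a rejected path) could be approximated with a simple text scan. This is a sketch under stated assumptions: guardrail notes are written as a trailing parenthetical starting with `RATIONALE:` or `REJECTED:`, as in the T021 example, and rejected paths are available as plain phrases. The helper `find_rejected_path_violations` is hypothetical, and substring matching is only a heuristic that would miss paraphrased instructions.

```python
import re

# Matches a trailing guardrail note such as "(RATIONALE: ...; REJECTED: ...)".
GUARDRAIL_NOTE_RE = re.compile(r"\((?:RATIONALE|REJECTED):.*\)\s*$")


def find_rejected_path_violations(task_lines, rejected_paths):
    """Return (task, rejected_path) pairs where the task body schedules a rejected path."""
    violations = []
    for line in task_lines:
        # Drop the guardrail note so that naming a rejected path as a warning is allowed.
        body = GUARDRAIL_NOTE_RE.sub("", line).lower()
        for path in rejected_paths:
            if path.lower() in body:
                violations.append((line, path))
    return violations
```

Any returned pair would mean the task text needs to be rewritten around the accepted path before `tasks.md` is finalized.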
|
||||||
|
|||||||
@@ -14,7 +14,7 @@ You **MUST** consider the user input before proceeding (if not empty).
|
|||||||
|
|
||||||
## Goal
|
## Goal
|
||||||
|
|
||||||
Execute semantic audit and full testing cycle: verify contract compliance, emulate logic, ensure maximum coverage, and maintain test quality.
|
Execute semantic audit and full testing cycle: verify contract compliance, verify decision-memory continuity, emulate logic, ensure maximum coverage, and maintain test quality.
|
||||||
|
|
||||||
## Operating Constraints
|
## Operating Constraints
|
||||||
|
|
||||||
@@ -22,6 +22,7 @@ Execute semantic audit and full testing cycle: verify contract compliance, emula
|
|||||||
2. **NEVER duplicate tests** - Check existing tests first before creating new ones
|
2. **NEVER duplicate tests** - Check existing tests first before creating new ones
|
||||||
3. **Use TEST_FIXTURE fixtures** - For CRITICAL tier modules, read @TEST_FIXTURE from .ai/standards/semantics.md
|
3. **Use TEST_FIXTURE fixtures** - For CRITICAL tier modules, read @TEST_FIXTURE from .ai/standards/semantics.md
|
||||||
4. **Co-location required** - Write tests in `__tests__` directories relative to the code being tested
|
4. **Co-location required** - Write tests in `__tests__` directories relative to the code being tested
|
||||||
|
5. **Decision-memory regression guard** - Tests and audits must not normalize silent reintroduction of any path documented in upstream `@REJECTED`
|
||||||
|
|
||||||
## Execution Steps
|
## Execution Steps
|
||||||
|
|
||||||
@@ -31,18 +32,25 @@ Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --inclu
|
|||||||
|
|
||||||
Determine:
|
Determine:
|
||||||
- FEATURE_DIR - where the feature is located
|
- FEATURE_DIR - where the feature is located
|
||||||
- TASKS_FILE - path to tasks.md
|
- TASKS_FILE - path to `tasks.md`
|
||||||
- Which modules need testing based on task status
|
- Which modules need testing based on task status
|
||||||
|
- Which ADRs or task guardrails define rejected paths for the touched scope
|
||||||
|
|
||||||
### 2. Load Relevant Artifacts
|
### 2. Load Relevant Artifacts
|
||||||
|
|
||||||
**From tasks.md:**
|
**From `tasks.md`:**
|
||||||
- Identify completed implementation tasks (not test tasks)
|
- Identify completed implementation tasks (not test tasks)
|
||||||
- Extract file paths that need tests
|
- Extract file paths that need tests
|
||||||
|
- Extract guardrail summaries and blocked paths
|
||||||
|
|
||||||
**From .ai/standards/semantics.md:**
|
**From `.ai/standards/semantics.md`:**
|
||||||
- Read @TIER annotations for modules
|
- Read effective complexity expectations
|
||||||
- For CRITICAL modules: Read @TEST_ fixtures
|
- Read decision-memory rules for ADR, preventive guardrails, and reactive Micro-ADR
|
||||||
|
- For CRITICAL modules: Read `@TEST_` fixtures
|
||||||
|
|
||||||
|
**From ADR sources and touched code:**
|
||||||
|
- Read `[DEF:id:ADR]` nodes when present
|
||||||
|
- Read local `@RATIONALE` and `@REJECTED` in touched contracts
|
||||||
|
|
||||||
**From existing tests:**
|
**From existing tests:**
|
||||||
- Scan `__tests__` directories for existing tests
|
- Scan `__tests__` directories for existing tests
|
||||||
@@ -52,9 +60,9 @@ Determine:
|
|||||||
|
|
||||||
Create coverage matrix:
|
Create coverage matrix:
|
||||||
|
|
||||||
| Module | File | Has Tests | TIER | TEST_FIXTURE Available |
|
| Module | File | Has Tests | Complexity / Tier | TEST_FIXTURE Available | Rejected Path Guarded |
|
||||||
|--------|------|-----------|------|----------------------|
|
|--------|------|-----------|-------------------|------------------------|-----------------------|
|
||||||
| ... | ... | ... | ... | ... |
|
| ... | ... | ... | ... | ... | ... |
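
One way to keep the matrix consistent with the new column set is to render it from structured rows. The sketch below assumes the six columns shown in the updated table; the `CoverageRow` dataclass, its field names, and the yes/no cell values are illustrative choices rather than a prescribed format.

```python
from dataclasses import dataclass

HEADERS = [
    "Module", "File", "Has Tests", "Complexity / Tier",
    "TEST_FIXTURE Available", "Rejected Path Guarded",
]


@dataclass
class CoverageRow:
    module: str
    file: str
    has_tests: bool
    tier: str
    fixture_available: bool
    rejected_path_guarded: bool


def _yes_no(flag: bool) -> str:
    return "yes" if flag else "no"


def render_matrix(rows) -> str:
    """Render coverage rows as a Markdown table with the six matrix columns."""
    lines = [
        "| " + " | ".join(HEADERS) + " |",
        "|" + "|".join("---" for _ in HEADERS) + "|",
    ]
    for row in rows:
        cells = [
            row.module, row.file, _yes_no(row.has_tests), row.tier,
            _yes_no(row.fixture_available), _yes_no(row.rejected_path_guarded),
        ]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```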
|
||||||
|
|
||||||
### 4. Semantic Audit & Logic Emulation (CRITICAL)
|
### 4. Semantic Audit & Logic Emulation (CRITICAL)
|
||||||
|
|
||||||
@@ -66,9 +74,12 @@ Before writing tests, the Tester MUST:
|
|||||||
- Reject Python Complexity 4+ modules that omit meaningful `logger.reason()` / `logger.reflect()` checkpoints.
|
- Reject Python Complexity 4+ modules that omit meaningful `logger.reason()` / `logger.reflect()` checkpoints.
|
||||||
- Reject Python Complexity 5 modules that omit `belief_scope(...)`, `@DATA_CONTRACT`, or `@INVARIANT`.
|
- Reject Python Complexity 5 modules that omit `belief_scope(...)`, `@DATA_CONTRACT`, or `@INVARIANT`.
|
||||||
- Treat broken or missing closing anchors as blocking violations.
|
- Treat broken or missing closing anchors as blocking violations.
|
||||||
|
- Reject retained workaround code if the local contract lacks `@RATIONALE` / `@REJECTED`.
|
||||||
|
- Reject code that silently re-enables a path declared in upstream ADR or local guardrails as rejected.
|
||||||
3. **Emulate Algorithm**: Step through the code implementation in mind.
|
3. **Emulate Algorithm**: Step through the code implementation in mind.
|
||||||
- Verify it adheres to the `@PURPOSE` and `@INVARIANT`.
|
- Verify it adheres to the `@PURPOSE` and `@INVARIANT`.
|
||||||
- Verify `@PRE` and `@POST` conditions are correctly handled.
|
- Verify `@PRE` and `@POST` conditions are correctly handled.
|
||||||
|
- Verify the implementation follows accepted-path rationale rather than drifting into a blocked path.
|
||||||
4. **Validation Verdict**:
|
4. **Validation Verdict**:
|
||||||
- If audit fails: Emit `[AUDIT_FAIL: semantic_noncompliance]` with concrete file-path reasons and notify Orchestrator.
|
- If audit fails: Emit `[AUDIT_FAIL: semantic_noncompliance]` with concrete file-path reasons and notify Orchestrator.
|
||||||
- Example blocking case: [`backend/src/services/dataset_review/repositories/session_repository.py`](backend/src/services/dataset_review/repositories/session_repository.py) contains a module anchor, but its nested repository class/method semantics are expressed as loose docstrings instead of canonical anchored contracts; this MUST be rejected until remediated or explicitly waived.
|
- Example blocking case: [`backend/src/services/dataset_review/repositories/session_repository.py`](backend/src/services/dataset_review/repositories/session_repository.py) contains a module anchor, but its nested repository class/method semantics are expressed as loose docstrings instead of canonical anchored contracts; this MUST be rejected until remediated or explicitly waived.
|
||||||
@@ -79,7 +90,7 @@ Before writing tests, the Tester MUST:
|
|||||||
For each module requiring tests:
|
For each module requiring tests:
|
||||||
|
|
||||||
1. **Check existing tests**: Scan `__tests__/` for duplicates.
|
1. **Check existing tests**: Scan `__tests__/` for duplicates.
|
||||||
2. **Read TEST_FIXTURE**: If CRITICAL tier, read @TEST_FIXTURE from semantics header.
|
2. **Read TEST_FIXTURE**: If CRITICAL tier, read `@TEST_FIXTURE` from semantics header.
|
||||||
3. **Do not normalize broken semantics through tests**:
|
3. **Do not normalize broken semantics through tests**:
|
||||||
- The Tester must not write tests that silently accept malformed semantic protocol usage.
|
- The Tester must not write tests that silently accept malformed semantic protocol usage.
|
||||||
- If implementation is semantically invalid, stop and reject instead of adapting tests around the invalid structure.
|
- If implementation is semantically invalid, stop and reject instead of adapting tests around the invalid structure.
|
||||||
@@ -87,6 +98,8 @@ For each module requiring tests:
|
|||||||
- Python: `src/module/__tests__/test_module.py`
|
- Python: `src/module/__tests__/test_module.py`
|
||||||
- Svelte: `src/lib/components/__tests__/test_component.test.js`
|
- Svelte: `src/lib/components/__tests__/test_component.test.js`
|
||||||
5. **Use mocks**: Use `unittest.mock.MagicMock` for external dependencies
|
5. **Use mocks**: Use `unittest.mock.MagicMock` for external dependencies
|
||||||
|
6. **Add rejected-path regression coverage when relevant**:
|
||||||
|
- If ADR or local contract names a blocked path in `@REJECTED`, add or verify at least one test or explicit audit check that would fail if that forbidden path were silently restored.
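
A rejected-path regression test might look like the sketch below. It reuses the guardrail from the task example elsewhere in this change (strict validation is accepted; `json.loads()` without schema validation is rejected). The function `parse_payload`, the `count` field, and the validation rules are hypothetical stand-ins for whatever the real contract declares.

```python
import json
import unittest


def parse_payload(raw: str) -> dict:
    """Hypothetical accepted path: decode JSON, then validate a minimal schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    count = data.get("count")
    # Assumption: the frontend sends numeric strings such as "3".
    if isinstance(count, str) and count.isdigit():
        count = int(count)
    if not isinstance(count, int) or isinstance(count, bool):
        raise ValueError("count must be an integer or a numeric string")
    return {"count": count}


class RejectedPathRegressionTest(unittest.TestCase):
    def test_numeric_string_is_normalized(self):
        self.assertEqual(parse_payload('{"count": "3"}'), {"count": 3})

    def test_unvalidated_payload_is_refused(self):
        # Fails if parse_payload regresses to a bare json.loads() call,
        # because the list would then be returned instead of rejected.
        with self.assertRaises(ValueError):
            parse_payload('{"count": [1, 2]}')
```

The second test is the guard: it passes only while validation stays in place, so silently restoring the rejected path would break the suite.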
|
||||||
|
|
||||||
### 4a. UX Contract Testing (Frontend Components)
|
### 4a. UX Contract Testing (Frontend Components)
|
||||||
|
|
||||||
@@ -103,9 +116,10 @@ For Svelte components with `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY` tags:
|
|||||||
expect(screen.getByTestId('sidebar')).toHaveClass('expanded');
|
expect(screen.getByTestId('sidebar')).toHaveClass('expanded');
|
||||||
});
|
});
|
||||||
```
|
```
|
||||||
3. **Test @UX_FEEDBACK**: Verify visual feedback (toast, shake, color changes)
|
3. **Test `@UX_FEEDBACK`**: Verify visual feedback (toast, shake, color changes)
|
||||||
4. **Test @UX_RECOVERY**: Verify error recovery mechanisms (retry, clear input)
|
4. **Test `@UX_RECOVERY`**: Verify error recovery mechanisms (retry, clear input)
|
||||||
5. **Use @UX_TEST fixtures**: If component has `@UX_TEST` tags, use them as test specifications
|
5. **Use `@UX_TEST` fixtures**: If component has `@UX_TEST` tags, use them as test specifications
|
||||||
|
6. **Verify decision memory**: If the UI contract declares `@REJECTED`, ensure browser-visible behavior does not regress into the rejected path.
|
||||||
|
|
||||||
**UX Test Template:**
|
**UX Test Template:**
|
||||||
```javascript
|
```javascript
|
||||||
@@ -139,6 +153,8 @@ tests/
|
|||||||
└── YYYY-MM-DD-report.md
|
└── YYYY-MM-DD-report.md
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Include decision-memory coverage notes when ADR or rejected-path regressions were checked.
|
||||||
|
|
||||||
### 6. Execute Tests
|
### 6. Execute Tests
|
||||||
|
|
||||||
Run tests and report results:
|
Run tests and report results:
|
||||||
@@ -155,10 +171,11 @@ cd frontend && npm run test
|
|||||||
|
|
||||||
### 7. Update Tasks
|
### 7. Update Tasks
|
||||||
|
|
||||||
Mark test tasks as completed in tasks.md with:
|
Mark test tasks as completed in `tasks.md` with:
|
||||||
- Test file path
|
- Test file path
|
||||||
- Coverage achieved
|
- Coverage achieved
|
||||||
- Any issues found
|
- Any issues found
|
||||||
|
- Whether rejected-path regression checks passed or remain manual audit items
|
||||||
|
|
||||||
## Output
|
## Output
|
||||||
|
|
||||||
@@ -188,10 +205,15 @@ Generate test execution report:
|
|||||||
- Verdict: PASS | FAIL
|
- Verdict: PASS | FAIL
|
||||||
- Blocking Violations:
|
- Blocking Violations:
|
||||||
- [file path] -> [reason]
|
- [file path] -> [reason]
|
||||||
|
- Decision Memory:
|
||||||
|
- ADRs checked: [...]
|
||||||
|
- Rejected-path regressions: PASS | FAIL
|
||||||
|
- Missing `@RATIONALE` / `@REJECTED`: [...]
|
||||||
- Notes:
|
- Notes:
|
||||||
- Reject docstring-only semantic pseudo-markup
|
- Reject docstring-only semantic pseudo-markup
|
||||||
- Reject complexity/contract mismatches
|
- Reject complexity/contract mismatches
|
||||||
- Reject missing belief-state instrumentation for Python Complexity 4/5
|
- Reject missing belief-state instrumentation for Python Complexity 4/5
|
||||||
|
- Reject silent resurrection of rejected paths
|
||||||
|
|
||||||
## Issues Found
|
## Issues Found
|
||||||
|
|
||||||
@@ -203,6 +225,7 @@ Generate test execution report:
|
|||||||
|
|
||||||
- [ ] Fix failed tests
|
- [ ] Fix failed tests
|
||||||
- [ ] Fix blocking semantic violations before acceptance
|
- [ ] Fix blocking semantic violations before acceptance
|
||||||
|
- [ ] Fix decision-memory drift or rejected-path regressions
|
||||||
- [ ] Add more coverage for [module]
|
- [ ] Add more coverage for [module]
|
||||||
- [ ] Review TEST_FIXTURE fixtures
|
- [ ] Review TEST_FIXTURE fixtures
|
||||||
```
|
```
|
||||||
|
|||||||
Submodule research/kilocode deleted from 6d4d7328f6