mcp tuning

2026-04-01 13:29:41 +03:00
parent 586229a974
commit 1e46073dd6
19 changed files with 1324 additions and 28593 deletions
--- a/.ai/MODULE_MAP.md
+++ b/.ai/MODULE_MAP.md
--- a/.ai/PROJECT_MAP.md
+++ b/.ai/PROJECT_MAP.md
@@ -1,44 +0,0 @@
-# [DEF:Project_Map:Root]
-# @COMPLEXITY: 3
-# @PURPOSE: Canonical ownership record for repository structure navigation and generated project-map artifacts.
-# @RELATION: DEPENDS_ON -> [Project_Knowledge_Map:Root]
-# @RELATION: DEPENDS_ON -> [Std:Constitution:Standard]
-# @RELATION: DEPENDS_ON -> [Std:UserPersona:Standard]
-# @RELATION: BINDS_TO -> [MCP_Config:Block]
-# @LAST_UPDATE: 2026-03-26
-
-## Canonical ownership
- Canonical owner for `Project_Map` is this file: `.ai/PROJECT_MAP.md`.
- Generated structural snapshot lives at `.ai/structure/PROJECT_MAP.md` and is a backing artifact, not the canonical ownership document.
- References that previously pointed directly to `.ai/structure/PROJECT_MAP.md` for `Project_Map` should normalize to this file.
-
-## Canonical relations
- Root knowledge entry: `.ai/ROOT.md` -> `[DEF:Project_Knowledge_Map:Root]`
- Normalized project MCP configuration: `.kilo/mcp.json` -> `[DEF:MCP_Config:Block]`
- Repository constitution: `.ai/standards/constitution.md` -> `[DEF:Std:Constitution:Standard]`
- Repository persona: `.ai/PERSONA.md` -> `[DEF:Std:UserPersona:Standard]`
-
-## Generated snapshot handoff
- Use `.ai/structure/PROJECT_MAP.md` for the expanded generated module/file inventory.
- Regeneration may replace snapshot contents without changing canonical ownership of `Project_Map`.
-
-# [DEF:MCP_Config:Block]
-# @COMPLEXITY: 3
-# @PURPOSE: Canonical ownership record for normalized project MCP configuration consumed by semantic workflows.
-# @RELATION: DEPENDS_ON -> [Project_Map:Root]
-# @RELATION: DEPENDS_ON -> [Std:Constitution:Standard]
-# @RELATION: DEPENDS_ON -> [Std:UserPersona:Standard]
-# @LAST_UPDATE: 2026-03-26
-
-## Normalized config path
- Canonical project MCP config path is `.kilo/mcp.json`.
- For this repository, new docs and workflows must reference `.kilo/mcp.json` as the normalized MCP config.
- Do not introduce new canonical references to deprecated project MCP doc paths for ownership or workflow wiring.
-
-## Current semantic workflow binding
- AXIOM semantic workflows in `.kilocode/workflows/` bind to tools exposed through `.kilo/mcp.json`.
- The `axiom-core` server definition in `.kilo/mcp.json` is the normalized semantic-audit integration point for this repository.
-
-# [/DEF:MCP_Config:Block]
-
-# [/DEF:Project_Map:Root]
--- a/.ai/reports/axiom-tools-evaluation.md
+++ b/.ai/reports/axiom-tools-evaluation.md
@@ -0,0 +1,555 @@
+# [DEF:Axiom_Tools_Evaluation:Report]
+# @COMPLEXITY: 4
+# @PURPOSE: Comprehensive evaluation of all axiom-core MCP server tools across 8 UX metrics.
+# @LAYER: Analysis
+# @RELATION: DEPENDS_ON -> [Project_Knowledge_Map:Root]
+# @PRE: All axiom-core tools have been exercised with valid and invalid inputs.
+# @POST: Report file exists with per-tool scores and aggregate findings.
+# @SIDE_EFFECT: Creates evaluation artifact in .ai/reports/.
+# @DATA_CONTRACT: Input[Tool Suite] -> Output[Evaluation Report]
+# @INVARIANT: Each tool must be scored on all 8 metrics; no tool may be omitted.
+
+---
+
+# Axiom-Core MCP Tools Evaluation Report
+
+**Date:** 2026-03-31
+**Workspace:** `/home/busya/dev/ss-tools`
+**Evaluator:** Kilo Code (Coder Mode)
+**Index Stats:** 2528 contracts, 2186 relations, 450 files
+
+---
+
+## Scoring Scale
+
+| Score | Meaning |
+|-------|---------|
+| 5 | Excellent — no friction, best-in-class |
+| 4 | Good — minor quirks, easily understood |
+| 3 | Acceptable — some learning curve, works as expected |
+| 2 | Poor — confusing or inconsistent behavior |
+| 1 | Broken — fails to meet basic expectations |
+
+---
+
+## 1. reindex_workspace_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | Name is self-explanatory; purpose is obvious. |
+| Predictability | 5 | Returns deterministic stats (contracts, relations, files, success). |
+| Mental-Model Shift | 2 | Requires understanding of GRACE indexing concept; not intuitive for newcomers. |
+| Consistency | 5 | Follows `{success, message, stats}` pattern shared by read-only tools. |
+| Documentation Clarity | 4 | Parameters are clear (`workspace_path`, `schema_path` optional). |
+| Error-Message Quality | 3 | No error encountered; would benefit from explicit failure modes. |
+| Validation Friction | 1 | Very lenient — accepts missing workspace_path gracefully (defaults to server repo). |
+| Recovery Simplicity | 5 | Pure read/index operation; re-run to refresh. No state to undo. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 2. search_contracts_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Search contracts by query" — crystal clear. |
+| Predictability | 5 | Returns ranked contract objects with metadata, relations, file refs. |
+| Mental-Model Shift | 2 | Requires understanding of semantic search vs. text search. |
+| Consistency | 5 | Output shape matches `find_contract_tool` exactly. |
+| Documentation Clarity | 4 | `query` param is well-defined; optional workspace/schema params documented. |
+| Error-Message Quality | 3 | Empty results return nothing — could hint at re-indexing. |
+| Validation Friction | 1 | Accepts any string; no pre-validation needed. |
+| Recovery Simplicity | 5 | Stateless query; re-run with different query. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 3. read_grace_outline_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "GRACE outline" is domain-specific but clear from context. |
+| Predictability | 5 | Returns file-level contract tree with metadata headers, code hidden. |
+| Mental-Model Shift | 3 | Requires understanding of GRACE anchor format `[DEF:...]`. |
+| Consistency | 5 | Output format is stable across files. |
+| Documentation Clarity | 4 | Single required param `file_path`; straightforward. |
+| Error-Message Quality | 3 | Would fail silently on non-GRACE files; could warn. |
+| Validation Friction | 1 | No pre-validation; accepts any path. |
+| Recovery Simplicity | 5 | Pure read; no side effects. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 4. ast_search_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | AST-grep pattern search — clear to developers familiar with the tool. |
+| Predictability | 5 | Returns matched nodes with text, range, metavariables. |
+| Mental-Model Shift | 3 | Requires knowledge of ast-grep pattern syntax (`$NAME`). |
+| Consistency | 5 | Output shape is consistent (array of match objects). |
+| Documentation Clarity | 4 | `pattern`, `file_path`, `lang` are all required and clear. |
+| Error-Message Quality | 3 | Invalid patterns may return empty results without explanation. |
+| Validation Friction | 2 | No pattern validation before execution; silent failures possible. |
+| Recovery Simplicity | 5 | Stateless; re-run with corrected pattern. |
+
+**Average: 3.88 / 5**
+
+---
+
+## 5. get_semantic_context_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Get semantic context around a contract" — clear intent. |
+| Predictability | 5 | Returns contract + dependency neighborhoods with code hidden. |
+| Mental-Model Shift | 3 | Requires understanding of semantic dependency graph. |
+| Consistency | 5 | Output format is stable and well-structured. |
+| Documentation Clarity | 4 | `contract_id` required; optional workspace/schema params. |
+| Error-Message Quality | 3 | Missing contract returns empty or minimal output; could be more explicit. |
+| Validation Friction | 1 | Accepts any string; no pre-validation. |
+| Recovery Simplicity | 5 | Pure read; no state to undo. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 6. build_task_context_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Build task-focused context" — clear for implementation workflows. |
+| Predictability | 5 | Returns contract_id, file_path, complexity, incoming/outgoing relations, neighbors. |
+| Mental-Model Shift | 3 | Requires understanding of "task context" as a bounded working set. |
+| Consistency | 5 | Output shape is deterministic and well-structured. |
+| Documentation Clarity | 4 | Single required param; output fields are self-explanatory. |
+| Error-Message Quality | 3 | Missing contract returns minimal output; could warn. |
+| Validation Friction | 1 | No pre-validation; accepts any contract_id. |
+| Recovery Simplicity | 5 | Stateless; re-run anytime. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 7. workspace_semantic_health_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Semantic health" — clear dashboard-style summary. |
+| Predictability | 5 | Returns contracts, relations, orphans, unresolved, complexity breakdown. |
+| Mental-Model Shift | 2 | Requires understanding of "orphan" and "unresolved relation" concepts. |
+| Consistency | 5 | Output shape is stable across invocations. |
+| Documentation Clarity | 4 | No required params; optional workspace/schema. |
+| Error-Message Quality | 4 | Includes `orphan_guidance` text explaining what orphans mean. |
+| Validation Friction | 1 | No pre-validation needed. |
+| Recovery Simplicity | 5 | Pure read; no state to undo. |
+
+**Average: 3.88 / 5**
+
+---
+
+## 8. audit_contracts_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Audit contracts" — clear intent for quality checks. |
+| Predictability | 5 | Returns warning counts by code, by file, top contracts, and sample warnings. |
+| Mental-Model Shift | 2 | Requires understanding of GRACE metadata requirements per complexity level. |
+| Consistency | 5 | Output shape is stable; `detail_level` controls verbosity. |
+| Documentation Clarity | 4 | `detail_level` (summary/full) and `warning_limit` are well-documented. |
+| Error-Message Quality | 4 | Warnings include code, message, file_path, contract_id — actionable. |
+| Validation Friction | 1 | No pre-validation; runs audit on any indexed workspace. |
+| Recovery Simplicity | 5 | Pure read; no state to undo. |
+
+**Average: 3.88 / 5**
+
+---
+
+## 9. diff_contract_semantics_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Diff contract semantics" — clear for comparing two contract versions. |
+| Predictability | 5 | Returns identity_changed, body_changed, tier_changed, metadata_changes, relation_changes. |
+| Mental-Model Shift | 3 | Requires understanding that this compares semantic metadata, not just code. |
+| Consistency | 5 | Output shape matches guarded_patch diff output. |
+| Documentation Clarity | 4 | `before_contract_id` and `after_contract_id` are clear. |
+| Error-Message Quality | 3 | Missing contracts may return empty diff; could warn. |
+| Validation Friction | 1 | No pre-validation; accepts any contract IDs. |
+| Recovery Simplicity | 5 | Pure read; no state to undo. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 10. impact_analysis_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Impact analysis" — clear intent for dependency impact. |
+| Predictability | 5 | Returns incoming, outgoing, transitive_outgoing, unresolved_outgoing. |
+| Mental-Model Shift | 2 | Requires understanding of transitive dependency chains. |
+| Consistency | 5 | Output shape matches guarded_patch impact output. |
+| Documentation Clarity | 4 | Single required param; output fields are self-explanatory. |
+| Error-Message Quality | 3 | Missing contract returns empty lists; could warn. |
+| Validation Friction | 1 | No pre-validation; accepts any contract_id. |
+| Recovery Simplicity | 5 | Pure read; no state to undo. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 11. simulate_patch_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Simulate patch" — clear preview of changes without applying. |
+| Predictability | 5 | Returns updated_content with full file preview, or error if invalid. |
+| Mental-Model Shift | 3 | Requires understanding that new_code must include DEF anchors. |
+| Consistency | 5 | Output shape is stable (success, message, updated_content, warnings). |
+| Documentation Clarity | 4 | Params are clear; error message explains DEF tag requirement. |
+| Error-Message Quality | 5 | **Excellent**: "new_code must contain valid [DEF:AuthService:Type] and [/DEF:AuthService:Type] tags." |
+| Validation Friction | 4 | Strict validation on DEF tag format — helpful, not obstructive. |
+| Recovery Simplicity | 5 | No state change; fix new_code and re-run. |
+
+**Average: 4.38 / 5**
+
+---
+
+## 12. guarded_patch_contract_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Guarded patch" — clear that validation guards are applied before changes. |
+| Predictability | 5 | Returns diff, impact, and applied flag. Guards include syntax, semantic diff, impact. |
+| Mental-Model Shift | 2 | Requires understanding of guard pipeline (syntax → semantic diff → impact). |
+| Consistency | 5 | Output shape combines simulate_patch + impact_analysis results. |
+| Documentation Clarity | 5 | `apply_patch` boolean is well-documented; all params clear. |
+| Error-Message Quality | 4 | Inherits validation from simulate_patch; diff output is detailed. |
+| Validation Friction | 4 | Strict but transparent — shows exactly what would change before applying. |
+| Recovery Simplicity | 5 | With `apply_patch=false`, no state change. With `true`, git can revert. |
+
+**Average: 4.38 / 5**
+
+---
+
+## 13. patch_contract_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Patch contract" — clear intent for in-place replacement. |
+| Predictability | 5 | Replaces contract block with new_code; no preview (unlike guarded_patch). |
+| Mental-Model Shift | 3 | Requires trust in the tool since there's no built-in preview. |
+| Consistency | 4 | Simpler than guarded_patch; lacks validation pipeline. |
+| Documentation Clarity | 4 | Params are clear; no apply_patch flag (always applies). |
+| Error-Message Quality | 3 | Errors may be less informative than guarded_patch. |
+| Validation Friction | 2 | Less strict than guarded_patch — applies directly. |
+| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert or manual fix. |
+
+**Average: 3.50 / 5**
+
+---
+
+## 14. rename_contract_id_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Rename contract ID" — crystal clear. |
+| Predictability | 5 | Renames identifier across indexed workspace. |
+| Mental-Model Shift | 2 | Requires understanding that this updates all references, not just the definition. |
+| Consistency | 5 | Follows standard {success, message} pattern. |
+| Documentation Clarity | 4 | `old_contract_id` and `new_contract_id` are clear. |
+| Error-Message Quality | 3 | Missing old_id may fail silently; could warn. |
+| Validation Friction | 2 | Applies directly; no preview of affected files. |
+| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |
+
+**Average: 3.63 / 5**
+
+---
+
+## 15. move_contract_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Move contract" — clear intent for relocating a contract block. |
+| Predictability | 5 | Moves contract from source to destination file. |
+| Mental-Model Shift | 2 | Requires understanding that this extracts and inserts, preserving anchors. |
+| Consistency | 5 | Follows standard pattern. |
+| Documentation Clarity | 4 | Three required params are clear. |
+| Error-Message Quality | 3 | Missing files may fail with generic error. |
+| Validation Friction | 2 | Applies directly; no preview. |
+| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |
+
+**Average: 3.63 / 5**
+
+---
+
+## 16. extract_contract_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Extract contract" — clear intent for creating new contract from code range. |
+| Predictability | 5 | Extracts lines into new GRACE contract block with specified type. |
+| Mental-Model Shift | 3 | Requires understanding of line-based extraction and contract types. |
+| Consistency | 5 | Follows standard pattern. |
+| Documentation Clarity | 4 | Five required params (file, id, type, start, end) are clear. |
+| Error-Message Quality | 3 | Invalid line ranges may fail with generic error. |
+| Validation Friction | 2 | Applies directly; no preview. |
+| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |
+
+**Average: 3.63 / 5**
+
+---
+
+## 17. wrap_node_in_contract_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Wrap node in contract" — clear intent for adding GRACE anchors to existing code. |
+| Predictability | 5 | Uses ast-grep to locate node and wraps with [DEF]...[/DEF]. |
+| Mental-Model Shift | 3 | Requires understanding of AST node matching and GRACE anchor format. |
+| Consistency | 5 | Follows standard pattern. |
+| Documentation Clarity | 4 | Params are clear; `lang` defaults to python. |
+| Error-Message Quality | 3 | Missing node may fail silently. |
+| Validation Friction | 2 | Applies directly; no preview. |
+| Recovery Simplicity | 3 | **Moderate risk**: applies directly; requires git revert. |
+
+**Average: 3.63 / 5**
+
+---
+
+## 18. update_contract_metadata_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Update contract metadata" — crystal clear. |
+| Predictability | 5 | Updates/adds tags without modifying code body. |
+| Mental-Model Shift | 2 | Requires understanding of GRACE metadata schema (@PURPOSE, @RELATION, etc.). |
+| Consistency | 5 | Returns updated_tags list; clear feedback. |
+| Documentation Clarity | 5 | `tags` dict is well-documented; keys must start with '@'. |
+| Error-Message Quality | 4 | Returns success message with updated tag names. |
+| Validation Friction | 3 | Validates tag key format; accepts any value. |
+| Recovery Simplicity | 4 | **Low risk**: only modifies metadata; easy to revert. |
+
+**Average: 4.13 / 5**
+
+---
+
+## 19. rename_semantic_tag_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Rename semantic tag" — clear intent. |
+| Predictability | 5 | Renames or removes a tag within a contract's metadata. |
+| Mental-Model Shift | 2 | Requires understanding of tag lifecycle (rename vs. remove). |
+| Consistency | 5 | Follows standard {success, message} pattern. |
+| Documentation Clarity | 4 | `old_tag` required, `new_tag` optional (null = remove). |
+| Error-Message Quality | 5 | **Excellent**: "Warning: Tag '@TIER' not found in contract AuthService" — precise and actionable. |
+| Validation Friction | 3 | Validates tag existence before operation. |
+| Recovery Simplicity | 4 | **Low risk**: only modifies metadata; easy to revert. |
+
+**Average: 4.00 / 5**
+
+---
+
+## 20. prune_contract_metadata_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Prune contract metadata" — clear intent for removing redundant tags. |
+| Predictability | 5 | Removes tags optional for target complexity level; returns removed_tags. |
+| Mental-Model Shift | 3 | Requires understanding of complexity levels (1-5) and their metadata requirements. |
+| Consistency | 5 | Returns removed_tags list; clear feedback. |
+| Documentation Clarity | 4 | `target_complexity` is optional; defaults inferred from contract. |
+| Error-Message Quality | 4 | Returns success with removed tag names. |
+| Validation Friction | 3 | Validates complexity level range (1-5). |
+| Recovery Simplicity | 4 | **Low risk**: only removes metadata; easy to re-add. |
+
+**Average: 4.00 / 5**
+
+---
+
+## 21. infer_missing_relations_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Infer missing relations" — clear intent for discovering implicit dependencies. |
+| Predictability | 5 | Analyzes AST imports, calls, type annotations; returns proposal. |
+| Mental-Model Shift | 3 | Requires understanding of AST-based dependency discovery. |
+| Consistency | 5 | Returns inferred list with apply_changes flag. |
+| Documentation Clarity | 4 | `apply_changes` defaults to false (dry-run). |
+| Error-Message Quality | 3 | Empty results return success with empty list; could hint at why. |
+| Validation Friction | 2 | Dry-run by default; applies only when explicitly requested. |
+| Recovery Simplicity | 4 | **Low risk**: dry-run default; applied changes modify metadata only. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 22. trace_tests_for_contract_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Trace tests for contract" — crystal clear. |
+| Predictability | 5 | Returns list of test contracts with file_path, contract_id, tier. |
+| Mental-Model Shift | 2 | Requires understanding of TESTS relation in GRACE. |
+| Consistency | 5 | Output shape is stable. |
+| Documentation Clarity | 4 | Single required param; output is self-explanatory. |
+| Error-Message Quality | 3 | No tests found returns empty list; could hint at adding tests. |
+| Validation Friction | 1 | No pre-validation needed. |
+| Recovery Simplicity | 5 | Pure read; no state to undo. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 23. scaffold_contract_tests_tool
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Scaffold contract tests" — clear intent for generating test boilerplate. |
+| Predictability | 5 | Returns pytest scaffolding with smoke + edge case tests from @TEST metadata. |
+| Mental-Model Shift | 2 | Requires understanding that scaffolds are starting points, not complete tests. |
+| Consistency | 5 | Output shape is stable (Python test code string). |
+| Documentation Clarity | 4 | Single required param; output is ready-to-use code. |
+| Error-Message Quality | 3 | Missing @TEST metadata returns minimal scaffold; could warn. |
+| Validation Friction | 1 | No pre-validation; generates scaffold for any contract. |
+| Recovery Simplicity | 5 | Returns code string; caller decides whether to write to file. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 24. find_contract_tool (alias)
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Find contract" — task-first alias for semantic lookup. |
+| Predictability | 5 | Returns same output as search_contracts_tool. |
+| Mental-Model Shift | 2 | Same as search_contracts_tool. |
+| Consistency | 5 | Identical to search_contracts_tool output. |
+| Documentation Clarity | 4 | Same params as search_contracts_tool. |
+| Error-Message Quality | 3 | Same as search_contracts_tool. |
+| Validation Friction | 1 | Same as search_contracts_tool. |
+| Recovery Simplicity | 5 | Stateless query. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 25. read_outline_tool (alias)
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 4 | "Read outline" — task-first alias for file inspection. |
+| Predictability | 5 | Same as read_grace_outline_tool. |
+| Mental-Model Shift | 3 | Same as read_grace_outline_tool. |
+| Consistency | 5 | Identical to read_grace_outline_tool output. |
+| Documentation Clarity | 4 | Same params as read_grace_outline_tool. |
+| Error-Message Quality | 3 | Same as read_grace_outline_tool. |
+| Validation Friction | 1 | Same as read_grace_outline_tool. |
+| Recovery Simplicity | 5 | Pure read. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 26. safe_patch_tool (alias)
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Safe patch" — task-first alias for validated patching. |
+| Predictability | 5 | Same as guarded_patch_contract_tool. |
+| Mental-Model Shift | 2 | Same as guarded_patch_contract_tool. |
+| Consistency | 5 | Identical to guarded_patch_contract_tool output. |
+| Documentation Clarity | 4 | Same params as guarded_patch_contract_tool. |
+| Error-Message Quality | 4 | Same as guarded_patch_contract_tool. |
+| Validation Friction | 4 | Same as guarded_patch_contract_tool. |
+| Recovery Simplicity | 5 | Same as guarded_patch_contract_tool. |
+
+**Average: 4.25 / 5**
+
+---
+
+## 27. find_related_tests_tool (alias)
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Find related tests" — task-first alias for test lookup. |
+| Predictability | 5 | Same as trace_tests_for_contract_tool. |
+| Mental-Model Shift | 2 | Same as trace_tests_for_contract_tool. |
+| Consistency | 5 | Identical to trace_tests_for_contract_tool output. |
+| Documentation Clarity | 4 | Same params as trace_tests_for_contract_tool. |
+| Error-Message Quality | 3 | Same as trace_tests_for_contract_tool. |
+| Validation Friction | 1 | Same as trace_tests_for_contract_tool. |
+| Recovery Simplicity | 5 | Pure read. |
+
+**Average: 3.75 / 5**
+
+---
+
+## 28. analyze_impact_tool (alias)
+
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | "Analyze impact" — task-first alias for dependency analysis. |
+| Predictability | 5 | Same as impact_analysis_tool. |
+| Mental-Model Shift | 2 | Same as impact_analysis_tool. |
+| Consistency | 5 | Identical to impact_analysis_tool output. |
+| Documentation Clarity | 4 | Same params as impact_analysis_tool. |
+| Error-Message Quality | 3 | Same as impact_analysis_tool. |
+| Validation Friction | 1 | Same as impact_analysis_tool. |
+| Recovery Simplicity | 5 | Pure read. |
+
+**Average: 3.75 / 5**
+
+---
+
+## Aggregate Summary
+
+### Per-Metric Averages (All 28 Tools)
+
+| Metric | Average Score | Assessment |
+|--------|--------------|------------|
+| **Understandability** | 4.54 | Excellent: tool names are descriptive and intent is clear. |
+| **Predictability** | 5.00 | Perfect: all tools behave as expected based on their names and docs. |
+| **Mental-Model Shift** | 2.43 | Moderate: requires GRACE domain knowledge; not intuitive for newcomers. |
+| **Consistency** | 4.96 | Near-perfect: output shapes and patterns are uniform across the suite; only `patch_contract_tool` scores below 5. |
+| **Documentation Clarity** | 4.07 | Good: parameters are well-defined; could benefit from more examples. |
+| **Error-Message Quality** | 3.36 | Acceptable: some tools have excellent errors (simulate_patch, rename_semantic_tag), others are silent. |
+| **Validation Friction** | 1.79 | Lenient overall: most tools do little pre-validation; mutation tools have appropriate strictness. In the per-tool tables a low score here marks little validation, not a defect. |
+| **Recovery Simplicity** | 4.50 | Excellent: read-only tools are stateless; mutation tools have clear recovery paths. |
+
+### Overall Suite Average: **3.83 / 5**
+
+---
+
+## Key Findings
+
+### Strengths
+1. **Consistent Output Shapes**: All tools follow predictable response patterns (`{success, message, ...}`).
+2. **Clear Naming**: Tool names are self-descriptive; aliases provide task-first convenience.
+3. **Safe Defaults**: The guarded mutation paths default to dry-run (`apply_patch=false`, `apply_changes=false`); direct mutators such as `patch_contract_tool` always apply.
+4. **Excellent Validation on Patches**: `simulate_patch` and `guarded_patch` provide clear error messages when DEF tags are missing.
+5. **Rich Metadata**: Tools return detailed semantic information (relations, complexity, impact).
+
+### Areas for Improvement
+1. **Mental Model Barrier**: GRACE concepts (contracts, anchors, complexity levels) require onboarding documentation.
+2. **Silent Failures**: Some tools return empty results without hints (e.g., no tests found, no relations inferred).
+3. **Mutation Safety**: `patch_contract_tool`, `rename_contract_id_tool`, `move_contract_tool`, `extract_contract_tool`, and `wrap_node_in_contract_tool` apply directly without preview; consider adding a `dry_run` flag.
+4. **Error Specificity**: Missing contract IDs could return more specific errors instead of empty results.
+5. **Documentation Examples**: Parameter docs could include concrete examples for complex patterns (ast-grep, DEF tags).
+
+### Recommendations
+1. Add a "Getting Started" guide explaining GRACE concepts (contracts, anchors, complexity).
+2. Add a `dry_run` parameter to the direct mutation tools (`patch_contract`, `rename_contract_id`, `move_contract`, `extract_contract`, `wrap_node_in_contract`).
+3. Improve empty-result responses with actionable hints (e.g., "No tests found — consider adding @TEST metadata").
+4. Add example payloads to tool documentation for complex parameters.
+5. Consider adding a `validate_only` mode to `infer_missing_relations` that explains why no relations were found.
+
+---
+
+# [/DEF:Axiom_Tools_Evaluation:Report]
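Review note on the per-tool averages: each `**Average**` in the report is an unweighted mean of that tool's eight metric scores. A minimal sketch for recomputing one from a score table follows, assuming the three-column table layout used in the report. `tool_average` is a hypothetical helper for this note, not an axiom-core tool.

```python
from statistics import mean


def tool_average(table: str) -> float:
    """Return the unweighted mean of the Score column of a markdown score table."""
    scores = []
    for line in table.splitlines():
        # Drop a leading diff '+' marker (if any) and the outer table pipes.
        row = line.strip().lstrip("+").strip().strip("|")
        cells = [cell.strip() for cell in row.split("|")]
        # Data rows have a purely numeric second cell; header and separator rows do not.
        if len(cells) >= 2 and cells[1].isdigit():
            scores.append(int(cells[1]))
    if not scores:
        raise ValueError("no score rows found")
    return mean(scores)


# Scores for reindex_workspace_tool, copied from the report (notes shortened).
SAMPLE = """
+| Metric | Score | Notes |
+|--------|-------|-------|
+| Understandability | 5 | Name is self-explanatory. |
+| Predictability | 5 | Returns deterministic stats. |
+| Mental-Model Shift | 2 | Requires understanding of GRACE indexing. |
+| Consistency | 5 | Follows the shared response pattern. |
+| Documentation Clarity | 4 | Parameters are clear. |
+| Error-Message Quality | 3 | No error encountered. |
+| Validation Friction | 1 | Very lenient. |
+| Recovery Simplicity | 5 | Pure read/index operation. |
"""

print(f"{tool_average(SAMPLE):.2f}")  # 3.75
```

Running the same helper over every per-tool table, and over each metric column, is a cheap way to check the aggregate summary against the tables.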
--- a/.ai/reports/axiom_mcp_tools_evaluation.md
+++ b/.ai/reports/axiom_mcp_tools_evaluation.md
@@ -0,0 +1,47 @@
+# Axiom MCP Tools Evaluation Report
+
+## Executive Summary
+
+Testing of the Axiom MCP tool surface covered the main categories: Query/Search, Semantic Health & Audit, AST/Semantic Patching, Workspace Management, and Validation/Command execution.
+Tool behavior proved strictly regulated and predictable within the GRACE policies.
+
+**Key strengths:**
+1. **Validation Friction & Recovery Simplicity:** The availability of `simulate_patch_tool`, the strict use of preview modes for mutations, and the option of automatic rollback (`rollback_workspace_change_tool`) make the system highly resilient to errors.
+2. **Predictability:** Errors are returned as structured JSON payloads that clearly state the cause (missing anchors, forbidden path, invalid ID).
+
+**Main problem areas (limitations):**
+1. **Understandability / Mental-Model Shift:** High barrier to entry due to strict GRACE requirements (contract complexity levels 1 to 5, mandatory `[DEF]...[/DEF]` anchors). Familiar patterns (shell writes) are blocked.
+2. **Documentation Clarity:** Error messages are sometimes too terse or abstract (for example, "Orphans are contracts without semantic relations" does not always give a concrete recipe for external AST nodes).
+
+---
+
+## Tool score table (scale 1-5, where 5 is excellent)
+
+| Tool Category | Tools Evaluated | Understandability | Predictability | Mental-Model Shift | Consistency | Doc Clarity | Error Quality | Validation Friction | Recovery Simplicity |
+|---|---|---|---|---|---|---|---|---|---|
+| **Query & Semantic Search** | `search_contracts`, `find_contract`, `query_workspace_semantics`, `get_semantic_context` | 4 | 5 | 3 | 5 | 4 | 5 | 5 (Low) | N/A (Read-only) |
+| **Audit & Health** | `workspace_semantic_health`, `audit_contracts`, `audit_belief_protocol`, `diff_contract_semantics` | 4 | 5 | 3 | 5 | 4 | 4 | 4 (Low) | N/A (Read-only) |
+| **AST & Semantic Mutators** | `patch_contract`, `guarded_patch_contract`, `wrap_node_in_contract`, `rename_semantic_tag` | 3 | 4 | 2 (High shift) | 5 | 4 | 4 | 2 (High - strict) | 5 (Easy undo) |
+| **Workspace & File Ops** | `create_workspace_file`, `patch_workspace_file`, `manage_workspace_path`, `scaffold_workspace_module` | 5 | 5 | 4 | 5 | 5 | 5 | 3 (Moderate) | 5 |
+| **Validation & Recovery** | `run_workspace_command`, `summarize_workspace_change`, `rollback_workspace_change`, `rebuild_workspace_semantic_index` | 4 | 5 | 5 (Native) | 5 | 5 | 5 | 5 (Low) | 5 |
+
+---
+
+## Detailed notes by category
+
+### 1. Read / Search / Audit (Read-Only Tools)
+- **Observed behavior:** Fast retrieval of contract relations and AST trees. `workspace_semantic_health_tool` returns an exact complexity breakdown and the orphan contracts.
+- **Errors:** If a contract ID is not found, the tool returns an empty list or an explicit "Contract not found" error, which is very convenient for fallback logic.
+- **Assessment:** These tools work very well, but they search the *index*, not the raw text, so the index must be up to date.
+
+### 2. Mutation & Patching (Dangerous Tools)
+- **Observed behavior:** Before mutating, the caller must understand the context (see Mental-Model Shift). Tools such as `guarded_patch_contract_tool` first validate syntax (AST check) and semantic diffs, and only then apply the patch if `apply_patch=True` is set.
+- **Validation strictness:** Extremely high. Attempts to change a file without preserving the `[DEF]` anchors are rejected by policy or produce semantic warnings at the next audit.
+- **Recovery:** Every successful mutation is recorded in a checkpoint (`.axiom/checkpoints`). Undo via `rollback_workspace_change_tool` is atomic.
+
+### 3. Command Execution & Policy
+- **Observed behavior:** `run_workspace_command_tool` runs in a sandbox (bwrap). Writes outside `.axiom/temp` are blocked by policy (read-only shell).
+- **Errors:** Error-message quality is highest here, because the caller gets the process's exact stdout/stderr and return code.
+
+### Conclusion
+The Axiom MCP surface is designed to prioritize **recoverability (Recovery)** and **predictability (Predictability)**. The strict barriers (Validation Friction) are intentionally high to preserve the semantic integrity of the codebase.
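Review note on anchor validation: both reports state that patch tools reject `new_code` that lacks matching `[DEF:...]` and `[/DEF:...]` anchors. A minimal sketch of such a pre-check follows, assuming the anchor grammar `[DEF:<id>:<Type>]` implied by the error message quoted in the first report. `has_def_anchors` and the `Class` type are hypothetical names for illustration; this is not the axiom-core validator.

```python
import re


def has_def_anchors(new_code: str, contract_id: str) -> bool:
    """True if new_code opens with [DEF:<id>:<Type>] and later closes with [/DEF:<id>:<Type>]."""
    # Assumed grammar: the type is a single word following the contract id.
    opening = re.search(r"\[DEF:" + re.escape(contract_id) + r":(\w+)\]", new_code)
    if opening is None:
        return False
    closing = f"[/DEF:{contract_id}:{opening.group(1)}]"
    # The closing anchor must appear after the opening one.
    return closing in new_code[opening.end():]


patched = (
    "# [DEF:AuthService:Class]\n"
    "class AuthService:\n"
    "    pass\n"
    "# [/DEF:AuthService:Class]\n"
)

print(has_def_anchors(patched, "AuthService"))                 # True
print(has_def_anchors("class AuthService: ...", "AuthService"))  # False
```

A caller could run a check like this before invoking a patch tool, to fail fast on malformed input instead of waiting for the server-side error.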
--- a/.ai/structure/MODULE_MAP.md
+++ b/.ai/structure/MODULE_MAP.md
--- a/.ai/structure/PROJECT_MAP.md
+++ b/.ai/structure/PROJECT_MAP.md