mcp tuning

2026-04-01 13:29:41 +03:00
parent 586229a974
commit 1e46073dd6
19 changed files with 1324 additions and 28593 deletions

@@ -14,7 +14,7 @@ You **MUST** consider the user input before proceeding (if not empty).
## Goal
Execute semantic audit and full testing cycle: verify contract compliance, emulate logic, ensure maximum coverage, and maintain test quality.
Execute semantic audit and full testing cycle: verify contract compliance, verify decision-memory continuity, emulate logic, ensure maximum coverage, and maintain test quality.
## Operating Constraints
@@ -22,6 +22,7 @@ Execute semantic audit and full testing cycle: verify contract compliance, emula
2. **NEVER duplicate tests** - Check existing tests first before creating new ones
3. **Use TEST_FIXTURE fixtures** - For CRITICAL tier modules, read @TEST_FIXTURE from .ai/standards/semantics.md
4. **Co-location required** - Write tests in `__tests__` directories relative to the code being tested
5. **Decision-memory regression guard** - Tests and audits must not normalize silent reintroduction of any path documented in upstream `@REJECTED`
## Execution Steps
@@ -31,18 +32,25 @@ Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --inclu
Determine:
- FEATURE_DIR - where the feature is located
- TASKS_FILE - path to tasks.md
- TASKS_FILE - path to `tasks.md`
- Which modules need testing based on task status
- Which ADRs or task guardrails define rejected paths for the touched scope
### 2. Load Relevant Artifacts
**From tasks.md:**
**From `tasks.md`:**
- Identify completed implementation tasks (not test tasks)
- Extract file paths that need tests
- Extract guardrail summaries and blocked paths
**From .ai/standards/semantics.md:**
- Read @TIER annotations for modules
- For CRITICAL modules: Read @TEST_ fixtures
**From `.ai/standards/semantics.md`:**
- Read effective complexity expectations
- Read decision-memory rules for ADR, preventive guardrails, and reactive Micro-ADR
- For CRITICAL modules: Read `@TEST_` fixtures
**From ADR sources and touched code:**
- Read `[DEF:id:ADR]` nodes when present
- Read local `@RATIONALE` and `@REJECTED` in touched contracts
**From existing tests:**
- Scan `__tests__` directories for existing tests
@@ -52,9 +60,9 @@ Determine:
Create coverage matrix:
| Module | File | Has Tests | TIER | TEST_FIXTURE Available |
|--------|------|-----------|------|----------------------|
| ... | ... | ... | ... | ... |
| Module | File | Has Tests | Complexity / Tier | TEST_FIXTURE Available | Rejected Path Guarded |
|--------|------|-----------|-------------------|------------------------|-----------------------|
| ... | ... | ... | ... | ... | ... |
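As a rough illustration, the six-column matrix above could be assembled programmatically. This is only a sketch: `CoverageRow`, `render_matrix`, and `unguarded` are hypothetical names that do not appear in the prompt file, and the yes/no cell values are an assumed convention.

```python
from dataclasses import dataclass


@dataclass
class CoverageRow:
    """One row of the coverage matrix (field names are illustrative)."""
    module: str
    file: str
    has_tests: bool
    tier: str
    fixture_available: bool
    rejected_path_guarded: bool


HEADER = (
    "| Module | File | Has Tests | Complexity / Tier "
    "| TEST_FIXTURE Available | Rejected Path Guarded |"
)


def yes_no(flag: bool) -> str:
    """Assumed cell convention: booleans are shown as yes/no."""
    return "yes" if flag else "no"


def render_matrix(rows: list[CoverageRow]) -> str:
    """Render rows as a Markdown table with the six columns listed above."""
    lines = [HEADER, "|" + "---|" * 6]
    for r in rows:
        cells = [
            r.module,
            r.file,
            yes_no(r.has_tests),
            r.tier,
            yes_no(r.fixture_available),
            yes_no(r.rejected_path_guarded),
        ]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


def unguarded(rows: list[CoverageRow]) -> list[str]:
    """List modules whose rejected paths have no regression guard yet."""
    return [r.module for r in rows if not r.rejected_path_guarded]
```

Calling `unguarded(rows)` gives a quick list of modules that still need rejected-path coverage before the testing step.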
### 4. Semantic Audit & Logic Emulation (CRITICAL)
@@ -66,9 +74,12 @@ Before writing tests, the Tester MUST:
- Reject Python Complexity 4+ modules that omit meaningful `logger.reason()` / `logger.reflect()` checkpoints.
- Reject Python Complexity 5 modules that omit `belief_scope(...)`, `@DATA_CONTRACT`, or `@INVARIANT`.
- Treat broken or missing closing anchors as blocking violations.
- Reject retained workaround code if the local contract lacks `@RATIONALE` / `@REJECTED`.
- Reject code that silently re-enables a path declared in upstream ADR or local guardrails as rejected.
3. **Emulate Algorithm**: Mentally step through the implementation.
- Verify it adheres to the `@PURPOSE` and `@INVARIANT`.
- Verify `@PRE` and `@POST` conditions are correctly handled.
- Verify the implementation follows accepted-path rationale rather than drifting into a blocked path.
4. **Validation Verdict**:
- If audit fails: Emit `[AUDIT_FAIL: semantic_noncompliance]` with concrete file-path reasons and notify Orchestrator.
- Example blocking case: [`backend/src/services/dataset_review/repositories/session_repository.py`](backend/src/services/dataset_review/repositories/session_repository.py) contains a module anchor, but its nested repository class/method semantics are expressed as loose docstrings instead of canonical anchored contracts; this MUST be rejected until remediated or explicitly waived.
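The decision-memory part of this audit could be approximated with a small check like the one below. It is a minimal sketch under stated assumptions: the tags are assumed to appear literally in the contract text (for example as `# @RATIONALE: ...` comments), and `missing_decision_tags` and `audit_verdict` are hypothetical helpers, not part of the project's tooling.

```python
import re

# Tags that a contract retaining workaround code is expected to carry.
REQUIRED_TAGS = ("@RATIONALE", "@REJECTED")


def missing_decision_tags(contract_text: str) -> list[str]:
    """Return the decision-memory tags absent from a contract's text."""
    return [
        tag
        for tag in REQUIRED_TAGS
        if not re.search(re.escape(tag) + r"\b", contract_text)
    ]


def audit_verdict(findings: dict[str, list[str]]) -> str:
    """Format a verdict from a {file_path: [missing tags]} mapping."""
    blocking = {path: tags for path, tags in findings.items() if tags}
    if not blocking:
        return "PASS"
    lines = ["[AUDIT_FAIL: semantic_noncompliance]"]
    for path, tags in sorted(blocking.items()):
        lines.append(f"- {path} -> missing {', '.join(tags)}")
    return "\n".join(lines)
```

A text match like this only detects missing tags. Deciding whether code actually re-enables a rejected path still requires reading the implementation against the ADR.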
@@ -79,7 +90,7 @@ Before writing tests, the Tester MUST:
For each module requiring tests:
1. **Check existing tests**: Scan `__tests__/` for duplicates.
2. **Read TEST_FIXTURE**: If CRITICAL tier, read @TEST_FIXTURE from semantics header.
2. **Read TEST_FIXTURE**: If CRITICAL tier, read `@TEST_FIXTURE` from semantics header.
3. **Do not normalize broken semantics through tests**:
- The Tester must not write tests that silently accept malformed semantic protocol usage.
- If implementation is semantically invalid, stop and reject instead of adapting tests around the invalid structure.
@@ -87,6 +98,8 @@ For each module requiring tests:
- Python: `src/module/__tests__/test_module.py`
- Svelte: `src/lib/components/__tests__/test_component.test.js`
5. **Use mocks**: Use `unittest.mock.MagicMock` for external dependencies
6. **Add rejected-path regression coverage when relevant**:
- If ADR or local contract names a blocked path in `@REJECTED`, add or verify at least one test or explicit audit check that would fail if that forbidden path were silently restored.
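A rejected-path regression test might look like the sketch below. Everything in it is hypothetical: `load_profile`, `client.fetch`, and `client.legacy_fetch` are invented for illustration, and the assumed contract is that falling back to `legacy_fetch` is the path named in `@REJECTED`.

```python
import unittest
from unittest.mock import MagicMock


def load_profile(client, user_id):
    """Hypothetical unit under test.

    Assumed contract: @REJECTED falling back to client.legacy_fetch().
    """
    return client.fetch(user_id)


class TestRejectedPathGuard(unittest.TestCase):
    def test_legacy_fallback_is_not_restored(self):
        # MagicMock stands in for the external dependency.
        client = MagicMock()
        client.fetch.return_value = {"id": 7}

        result = load_profile(client, 7)

        self.assertEqual(result, {"id": 7})
        client.fetch.assert_called_once_with(7)
        # Fails if the rejected fallback is silently reintroduced.
        client.legacy_fetch.assert_not_called()
```

The `assert_not_called()` check is what makes this a guard: a test that only asserted the return value would keep passing if the rejected fallback came back.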
### 4a. UX Contract Testing (Frontend Components)
@@ -103,9 +116,10 @@ For Svelte components with `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY` tags:
expect(screen.getByTestId('sidebar')).toHaveClass('expanded');
});
```
3. **Test @UX_FEEDBACK**: Verify visual feedback (toast, shake, color changes)
4. **Test @UX_RECOVERY**: Verify error recovery mechanisms (retry, clear input)
5. **Use @UX_TEST fixtures**: If component has `@UX_TEST` tags, use them as test specifications
3. **Test `@UX_FEEDBACK`**: Verify visual feedback (toast, shake, color changes)
4. **Test `@UX_RECOVERY`**: Verify error recovery mechanisms (retry, clear input)
5. **Use `@UX_TEST` fixtures**: If component has `@UX_TEST` tags, use them as test specifications
6. **Verify decision memory**: If the UI contract declares `@REJECTED`, ensure browser-visible behavior does not regress into the rejected path.
**UX Test Template:**
```javascript
@@ -139,6 +153,8 @@ tests/
└── YYYY-MM-DD-report.md
```
Include decision-memory coverage notes when ADR or rejected-path regressions were checked.
### 6. Execute Tests
Run tests and report results:
@@ -155,10 +171,11 @@ cd frontend && npm run test
### 7. Update Tasks
Mark test tasks as completed in tasks.md with:
Mark test tasks as completed in `tasks.md` with:
- Test file path
- Coverage achieved
- Any issues found
- Whether rejected-path regression checks passed or remain manual audit items
## Output
@@ -188,10 +205,15 @@ Generate test execution report:
- Verdict: PASS | FAIL
- Blocking Violations:
- [file path] -> [reason]
- Decision Memory:
- ADRs checked: [...]
- Rejected-path regressions: PASS | FAIL
- Missing `@RATIONALE` / `@REJECTED`: [...]
- Notes:
- Reject docstring-only semantic pseudo-markup
- Reject complexity/contract mismatches
- Reject missing belief-state instrumentation for Python Complexity 4/5
- Reject silent resurrection of rejected paths
## Issues Found
@@ -203,6 +225,7 @@ Generate test execution report:
- [ ] Fix failed tests
- [ ] Fix blocking semantic violations before acceptance
- [ ] Fix decision-memory drift or rejected-path regressions
- [ ] Add more coverage for [module]
- [ ] Review TEST_FIXTURE fixtures
```