---
description: Generate tests, manage test documentation, and ensure maximum code coverage
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Goal

Execute a semantic audit and full testing cycle: verify contract compliance, verify decision-memory continuity, emulate logic, ensure maximum coverage, and maintain test quality.

## Operating Constraints

1. **NEVER delete existing tests** - Only update a test if it fails due to a bug in the test or the implementation
2. **NEVER duplicate tests** - Check existing tests before creating new ones
3. **Use `@TEST_FIXTURE` fixtures** - For CRITICAL tier modules, read `@TEST_FIXTURE` from `.ai/standards/semantics.md`
4. **Co-location required** - Write tests in `__tests__` directories relative to the code being tested
5. **Decision-memory regression guard** - Tests and audits must not normalize the silent reintroduction of any path documented in an upstream `@REJECTED`

## Execution Steps

### 1. Analyze Context

Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from the repo root and parse FEATURE_DIR and AVAILABLE_DOCS.

Determine:
- FEATURE_DIR - where the feature is located
- TASKS_FILE - path to `tasks.md`
- Which modules need testing based on task status
- Which ADRs or task guardrails define rejected paths for the touched scope
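Step 1's parse can be sketched as follows. This is a minimal sketch only: the key names `FEATURE_DIR` and `AVAILABLE_DOCS` are assumed to appear verbatim in the script's JSON output, as the wording above suggests.

```python
import json
import subprocess


def parse_prerequisites(raw: str) -> dict:
    """Extract the fields step 1 needs from the script's JSON output.

    Assumption: the script emits a JSON object whose keys match the
    names used above (FEATURE_DIR, AVAILABLE_DOCS).
    """
    data = json.loads(raw)
    feature_dir = data["FEATURE_DIR"]
    return {
        "feature_dir": feature_dir,
        "tasks_file": f"{feature_dir}/tasks.md",
        "available_docs": data.get("AVAILABLE_DOCS", []),
    }


def analyze_context(repo_root: str) -> dict:
    """Run check-prerequisites.sh from the repo root and parse its output."""
    result = subprocess.run(
        [".specify/scripts/bash/check-prerequisites.sh",
         "--json", "--require-tasks", "--include-tasks"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    )
    return parse_prerequisites(result.stdout)
```

`parse_prerequisites` is kept separate from the subprocess call so the parsing logic is itself testable without invoking the script.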
### 2. Load Relevant Artifacts

**From `tasks.md`:**
- Identify completed implementation tasks (not test tasks)
- Extract file paths that need tests
- Extract guardrail summaries and blocked paths

**From `.ai/standards/semantics.md`:**
- Read effective complexity expectations
- Read decision-memory rules for ADR, preventive guardrails, and reactive Micro-ADR
- For CRITICAL modules: Read `@TEST_` fixtures

**From ADR sources and touched code:**
- Read `[DEF:id:ADR]` nodes when present
- Read local `@RATIONALE` and `@REJECTED` in touched contracts

**From existing tests:**
- Scan `__tests__` directories for existing tests
- Identify test patterns and coverage gaps

### 3. Test Coverage Analysis

Create a coverage matrix:

| Module | File | Has Tests | Complexity / Tier | TEST_FIXTURE Available | Rejected Path Guarded |
|--------|------|-----------|-------------------|------------------------|-----------------------|
| ... | ... | ... | ... | ... | ... |

### 4. Semantic Audit & Logic Emulation (CRITICAL)

Before writing tests, the Tester MUST:

1. **Run `axiom-core.audit_contracts_tool`**: Identify semantic violations.
2. **Run a protocol-shape review on touched files**:
   - Reject non-canonical semantic markup, including docstring-only annotations such as `@PURPOSE`, `@PRE`, or `@INVARIANT` written inside class/function docstrings without canonical `[DEF]...[/DEF]` anchors and header metadata.
   - Reject files whose effective complexity contract is under-specified relative to [`.ai/standards/semantics.md`](.ai/standards/semantics.md).
   - Reject Python Complexity 4+ modules that omit meaningful `logger.reason()` / `logger.reflect()` checkpoints.
   - Reject Python Complexity 5 modules that omit `belief_scope(...)`, `@DATA_CONTRACT`, or `@INVARIANT`.
   - Treat broken or missing closing anchors as blocking violations.
   - Reject retained workaround code if the local contract lacks `@RATIONALE` / `@REJECTED`.
   - Reject code that silently re-enables a path declared in an upstream ADR or local guardrails as rejected.
3. **Emulate Algorithm**: Step through the implementation mentally.
   - Verify it adheres to the `@PURPOSE` and `@INVARIANT`.
   - Verify `@PRE` and `@POST` conditions are correctly handled.
   - Verify the implementation follows the accepted-path rationale rather than drifting into a blocked path.
4. **Validation Verdict**:
   - If the audit fails: Emit `[AUDIT_FAIL: semantic_noncompliance]` with concrete file-path reasons and notify the Orchestrator.
   - Example blocking case: [`backend/src/services/dataset_review/repositories/session_repository.py`](backend/src/services/dataset_review/repositories/session_repository.py) contains a module anchor, but its nested repository class/method semantics are expressed as loose docstrings instead of canonical anchored contracts; this MUST be rejected until remediated or explicitly waived.
   - If the audit passes: Proceed to writing/verifying tests.

### 5. Write Tests (TDD Approach)

For each module requiring tests:

1. **Check existing tests**: Scan `__tests__/` for duplicates.
2. **Read TEST_FIXTURE**: If CRITICAL tier, read `@TEST_FIXTURE` from the semantics header.
3. **Do not normalize broken semantics through tests**:
   - The Tester must not write tests that silently accept malformed semantic protocol usage.
   - If the implementation is semantically invalid, stop and reject instead of adapting tests around the invalid structure.
4. **Write test**: Follow the co-location strategy.
   - Python: `src/module/__tests__/test_module.py`
   - Svelte: `src/lib/components/__tests__/test_component.test.js`
5. **Use mocks**: Use `unittest.mock.MagicMock` for external dependencies.
6. **Add rejected-path regression coverage when relevant**:
   - If an ADR or local contract names a blocked path in `@REJECTED`, add or verify at least one test or explicit audit check that would fail if that forbidden path were silently restored.
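A minimal sketch of points 4-6 for the Python side. The unit under test (`load_session`), its injected `client`, and the `GLOBAL_DB_CLIENT` name in the regression guard are all hypothetical; only the co-location path and the `unittest.mock.MagicMock` usage follow the rules above.

```python
# src/module/__tests__/test_module.py (co-located per point 4 above)
from unittest.mock import MagicMock


# Hypothetical unit under test; in a real suite this would be imported
# from the parent package, e.g. `from ..module import load_session`.
def load_session(session_id, client):
    record = client.fetch(session_id)
    return {"id": session_id, "data": record}


def test_load_session_uses_injected_client():
    # MagicMock stands in for the external dependency (point 5 above).
    client = MagicMock()
    client.fetch.return_value = {"state": "open"}
    result = load_session("s-1", client)
    client.fetch.assert_called_once_with("s-1")
    assert result == {"id": "s-1", "data": {"state": "open"}}


def test_rejected_singleton_path_not_restored():
    # Rejected-path regression guard (point 6): fail if the path named
    # in @REJECTED (here, a hypothetical module-level DB singleton)
    # is ever silently restored in the module under test.
    assert "GLOBAL_DB_CLIENT" not in globals(), \
        "Rejected path restored: module-level DB singleton"
```

The second test is the cheapest form of a rejected-path guard; a structural audit check over the contract anchors could serve the same purpose when the forbidden path is not observable from the module surface.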
### 5a. UX Contract Testing (Frontend Components)

For Svelte components with `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY` tags:

1. **Parse UX tags**: Read the component file and extract all `@UX_*` annotations
2. **Generate UX tests**: Create tests for each UX state transition

   ```javascript
   // Example: Testing @UX_STATE: Idle -> Expanded
   it('should transition from Idle to Expanded on toggle click', async () => {
     render(Sidebar);
     const toggleBtn = screen.getByRole('button', { name: /toggle/i });
     await fireEvent.click(toggleBtn);
     expect(screen.getByTestId('sidebar')).toHaveClass('expanded');
   });
   ```

3. **Test `@UX_FEEDBACK`**: Verify visual feedback (toast, shake, color changes)
4. **Test `@UX_RECOVERY`**: Verify error recovery mechanisms (retry, clear input)
5. **Use `@UX_TEST` fixtures**: If the component has `@UX_TEST` tags, use them as test specifications
6. **Verify decision memory**: If the UI contract declares `@REJECTED`, ensure browser-visible behavior does not regress into the rejected path.

**UX Test Template:**

```javascript
// [DEF:ComponentUXTests:Module]
// @C: 3
// @RELATION: VERIFIES -> ../Component.svelte
// @PURPOSE: Test UX states and transitions
describe('Component UX States', () => {
  // @UX_STATE: Idle -> {action: click, expected: Active}
  it('should transition Idle -> Active on click', async () => { ... });

  // @UX_FEEDBACK: Toast on success
  it('should show toast on successful action', async () => { ... });

  // @UX_RECOVERY: Retry on error
  it('should allow retry on error', async () => { ... });
});
// [/DEF:ComponentUXTests:Module]
```

### 6. Test Documentation

Create/update documentation in `specs//tests/`:

```
tests/
├── README.md       # Test strategy and overview
├── coverage.md     # Coverage matrix and reports
└── reports/
    └── YYYY-MM-DD-report.md
```

Include decision-memory coverage notes when ADR or rejected-path regressions were checked.
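The "Has Tests" column of the step 3 matrix can be generated mechanically for `coverage.md`. A sketch under one stated assumption: a module counts as covered when the co-located `__tests__` directory next to it holds at least one `test_*.py` file. Tier, fixture, and guardrail columns still require manual judgment.

```python
from pathlib import Path


def coverage_matrix(src_root: str) -> str:
    """Render the Module / File / Has Tests columns as a markdown table.

    Assumption: a module is covered when the __tests__ directory next
    to it contains at least one test_*.py (the co-location rule above).
    """
    rows = [
        "| Module | File | Has Tests |",
        "|--------|------|-----------|",
    ]
    for py in sorted(Path(src_root).rglob("*.py")):
        if "__tests__" in py.parts:
            continue  # test files themselves are not matrix rows
        tests_dir = py.parent / "__tests__"
        covered = tests_dir.is_dir() and any(tests_dir.glob("test_*.py"))
        rows.append(
            f"| {py.stem} | {py.relative_to(src_root)} | "
            f"{'yes' if covered else 'NO'} |"
        )
    return "\n".join(rows)
```

Uncovered rows surface as `NO`, which makes the gaps greppable when the report lands in `coverage.md`.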
### 7. Execute Tests

Run tests and report results:

**Backend:**
```bash
cd backend && .venv/bin/python3 -m pytest -v
```

**Frontend:**
```bash
cd frontend && npm run test
```

### 8. Update Tasks

Mark test tasks as completed in `tasks.md` with:
- Test file path
- Coverage achieved
- Any issues found
- Whether rejected-path regression checks passed or remain manual audit items

## Output

Generate a test execution report:

```markdown
# Test Report: [FEATURE]

**Date**: [YYYY-MM-DD]
**Executed by**: Tester Agent

## Coverage Summary

| Module | Tests | Coverage % |
|--------|-------|------------|
| ... | ... | ... |

## Test Results
- Total: [X]
- Passed: [X]
- Failed: [X]
- Skipped: [X]

## Semantic Audit Verdict
- Verdict: PASS | FAIL
- Blocking Violations:
  - [file path] -> [reason]
- Decision Memory:
  - ADRs checked: [...]
  - Rejected-path regressions: PASS | FAIL
  - Missing `@RATIONALE` / `@REJECTED`: [...]
- Notes:
  - Reject docstring-only semantic pseudo-markup
  - Reject complexity/contract mismatches
  - Reject missing belief-state instrumentation for Python Complexity 4/5
  - Reject silent resurrection of rejected paths

## Issues Found

| Test | Error | Resolution |
|------|-------|------------|
| ... | ... | ... |

## Next Steps
- [ ] Fix failed tests
- [ ] Fix blocking semantic violations before acceptance
- [ ] Fix decision-memory drift or rejected-path regressions
- [ ] Add more coverage for [module]
- [ ] Review TEST_FIXTURE fixtures
```

## Context for Testing

$ARGUMENTS