---
description: Generate tests, manage test documentation, and ensure maximum code coverage
---
## User Input
$ARGUMENTS
You MUST consider the user input before proceeding (if not empty).
## Goal
Execute semantic audit and full testing cycle: verify contract compliance, verify decision-memory continuity, emulate logic, ensure maximum coverage, and maintain test quality.
## Operating Constraints

- NEVER delete existing tests - Only update if they fail due to bugs in the test or implementation
- NEVER duplicate tests - Check existing tests first before creating new ones
- Use TEST_FIXTURE fixtures - For CRITICAL tier modules, read `@TEST_FIXTURE` from `.ai/standards/semantics.md`
- Co-location required - Write tests in `__tests__` directories relative to the code being tested
- Decision-memory regression guard - Tests and audits must not normalize silent reintroduction of any path documented in upstream `@REJECTED`
## Execution Steps

### 1. Analyze Context

Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse `FEATURE_DIR` and `AVAILABLE_DOCS`.

Determine:

- FEATURE_DIR - where the feature is located
- TASKS_FILE - path to `tasks.md`
- Which modules need testing based on task status
- Which ADRs or task guardrails define rejected paths for the touched scope
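The JSON-parsing half of this step can be sketched as below. The script path and the `FEATURE_DIR` / `AVAILABLE_DOCS` keys come from this document; the function name and sample payload are illustrative assumptions.

```python
import json

def load_prerequisites(raw: str) -> tuple[str, list[str]]:
    """Parse the JSON emitted by check-prerequisites.sh."""
    data = json.loads(raw)
    return data["FEATURE_DIR"], data.get("AVAILABLE_DOCS", [])

# Real flow (run from repo root), sketched as a comment:
#   import subprocess
#   raw = subprocess.run(
#       [".specify/scripts/bash/check-prerequisites.sh",
#        "--json", "--require-tasks", "--include-tasks"],
#       capture_output=True, text=True, check=True,
#   ).stdout
sample = '{"FEATURE_DIR": "specs/001-example", "AVAILABLE_DOCS": ["tasks.md"]}'
feature_dir, docs = load_prerequisites(sample)
```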
### 2. Load Relevant Artifacts

From `tasks.md`:

- Identify completed implementation tasks (not test tasks)
- Extract file paths that need tests
- Extract guardrail summaries and blocked paths

From `.ai/standards/semantics.md`:

- Read effective complexity expectations
- Read decision-memory rules for ADR, preventive guardrails, and reactive Micro-ADR
- For CRITICAL modules: read `@TEST_FIXTURE` fixtures

From ADR sources and touched code:

- Read `[DEF:id:ADR]` nodes when present
- Read local `@RATIONALE` and `@REJECTED` in touched contracts

From existing tests:

- Scan `__tests__` directories for existing tests
- Identify test patterns and coverage gaps
### 3. Test Coverage Analysis

Create a coverage matrix:
| Module | File | Has Tests | Complexity / Tier | TEST_FIXTURE Available | Rejected Path Guarded |
|---|---|---|---|---|---|
| ... | ... | ... | ... | ... | ... |
### 4. Semantic Audit & Logic Emulation (CRITICAL)

Before writing tests, the Tester MUST:

- Run `axiom-core.audit_contracts_tool`: identify semantic violations.
- Run a protocol-shape review on touched files:
  - Reject non-canonical semantic markup, including docstring-only annotations such as `@PURPOSE`, `@PRE`, or `@INVARIANT` written inside class/function docstrings without canonical `[DEF]...[/DEF]` anchors and header metadata.
  - Reject files whose effective complexity contract is under-specified relative to `.ai/standards/semantics.md`.
  - Reject Python Complexity 4+ modules that omit meaningful `logger.reason()` / `logger.reflect()` checkpoints.
  - Reject Python Complexity 5 modules that omit `belief_scope(...)`, `@DATA_CONTRACT`, or `@INVARIANT`.
  - Treat broken or missing closing anchors as blocking violations.
  - Reject retained workaround code if the local contract lacks `@RATIONALE` / `@REJECTED`.
  - Reject code that silently re-enables a path declared in upstream ADR or local guardrails as rejected.
- Emulate Algorithm: step through the code implementation mentally.
  - Verify it adheres to the `@PURPOSE` and `@INVARIANT`.
  - Verify `@PRE` and `@POST` conditions are correctly handled.
  - Verify the implementation follows accepted-path rationale rather than drifting into a blocked path.
- Validation Verdict:
  - If audit fails: emit `[AUDIT_FAIL: semantic_noncompliance]` with concrete file-path reasons and notify the Orchestrator.
    - Example blocking case: `backend/src/services/dataset_review/repositories/session_repository.py` contains a module anchor, but its nested repository class/method semantics are expressed as loose docstrings instead of canonical anchored contracts; this MUST be rejected until remediated or explicitly waived.
  - If audit passes: proceed to writing/verifying tests.
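The closing-anchor check in the protocol-shape review can be mechanized with a simple scan. This is an illustrative sketch, not part of `axiom-core`; the regex assumes the `[DEF:...]` / `[/DEF:...]` anchor syntax shown elsewhere in this document.

```python
import re

def audit_anchors(source_text: str) -> list[str]:
    """Flag [DEF:...] anchors that lack a matching [/DEF:...] close."""
    opens = re.findall(r"\[DEF:([^\]]+)\]", source_text)
    closes = set(re.findall(r"\[/DEF:([^\]]+)\]", source_text))
    return [
        f"[AUDIT_FAIL: semantic_noncompliance] unclosed anchor [DEF:{anchor}]"
        for anchor in opens
        if anchor not in closes
    ]
```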
### 5. Write Tests (TDD Approach)

For each module requiring tests:

- Check existing tests: scan `__tests__/` for duplicates.
- Read TEST_FIXTURE: if CRITICAL tier, read `@TEST_FIXTURE` from the semantics header.
- Do not normalize broken semantics through tests:
  - The Tester must not write tests that silently accept malformed semantic protocol usage.
  - If the implementation is semantically invalid, stop and reject instead of adapting tests around the invalid structure.
- Write test: follow the co-location strategy.
  - Python: `src/module/__tests__/test_module.py`
  - Svelte: `src/lib/components/__tests__/test_component.test.js`
- Use mocks: use `unittest.mock.MagicMock` for external dependencies.
- Add rejected-path regression coverage when relevant:
  - If an ADR or local contract names a blocked path in `@REJECTED`, add or verify at least one test or explicit audit check that would fail if that forbidden path were silently restored.
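A co-located test using mocks might look like the following; `SessionService` and its repository interface are hypothetical names used only to illustrate the `MagicMock` pattern.

```python
# src/module/__tests__/test_session_service.py (illustrative path)
from unittest.mock import MagicMock

class SessionService:
    """Hypothetical service under test; normally imported from the module."""
    def __init__(self, repo):
        self.repo = repo

    def close_session(self, session_id):
        session = self.repo.get(session_id)
        session.closed = True
        self.repo.save(session)
        return session

def test_close_session_marks_closed_and_persists():
    repo = MagicMock()  # stands in for the external dependency
    service = SessionService(repo)
    result = service.close_session("s-1")
    assert result.closed is True
    repo.get.assert_called_once_with("s-1")
    repo.save.assert_called_once_with(result)
```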
### 5a. UX Contract Testing (Frontend Components)

For Svelte components with `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY` tags:

- Parse UX tags: read the component file and extract all `@UX_*` annotations.
- Generate UX tests: create tests for each UX state transition.

  ```js
  // Example: Testing @UX_STATE: Idle -> Expanded
  it('should transition from Idle to Expanded on toggle click', async () => {
    render(Sidebar);
    const toggleBtn = screen.getByRole('button', { name: /toggle/i });
    await fireEvent.click(toggleBtn);
    expect(screen.getByTestId('sidebar')).toHaveClass('expanded');
  });
  ```

- Test `@UX_FEEDBACK`: verify visual feedback (toast, shake, color changes).
- Test `@UX_RECOVERY`: verify error recovery mechanisms (retry, clear input).
- Use `@UX_TEST` fixtures: if the component has `@UX_TEST` tags, use them as test specifications.
- Verify decision memory: if the UI contract declares `@REJECTED`, ensure browser-visible behavior does not regress into the rejected path.
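Extracting the `@UX_*` annotations can be sketched with a simple regex pass; the function name and the exact annotation grammar assumed here (one `@UX_*: value` pair per line) are illustrative.

```python
import re

def extract_ux_tags(component_source: str) -> dict[str, list[str]]:
    """Collect @UX_STATE / @UX_FEEDBACK / @UX_RECOVERY / @UX_TEST annotations."""
    tags: dict[str, list[str]] = {}
    for kind, body in re.findall(r"@(UX_[A-Z]+):\s*(.+)", component_source):
        tags.setdefault(kind, []).append(body.strip())
    return tags
```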
UX Test Template:

```js
// [DEF:ComponentUXTests:Module]
// @C: 3
// @RELATION: VERIFIES -> ../Component.svelte
// @PURPOSE: Test UX states and transitions

describe('Component UX States', () => {
  // @UX_STATE: Idle -> {action: click, expected: Active}
  it('should transition Idle -> Active on click', async () => { ... });

  // @UX_FEEDBACK: Toast on success
  it('should show toast on successful action', async () => { ... });

  // @UX_RECOVERY: Retry on error
  it('should allow retry on error', async () => { ... });
});
// [/DEF:ComponentUXTests:Module]
```
### 6. Test Documentation

Create/update documentation in `specs/<feature>/tests/`:

```text
tests/
├── README.md       # Test strategy and overview
├── coverage.md     # Coverage matrix and reports
└── reports/
    └── YYYY-MM-DD-report.md
```

Include decision-memory coverage notes when ADR or rejected-path regressions were checked.
### 7. Execute Tests

Run tests and report results.

Backend:

```shell
cd backend && .venv/bin/python3 -m pytest -v
```

Frontend:

```shell
cd frontend && npm run test
```
### 8. Update Tasks

Mark test tasks as completed in `tasks.md` with:

- Test file path
- Coverage achieved
- Any issues found
- Whether rejected-path regression checks passed or remain manual audit items
## Output

Generate a test execution report:
```markdown
# Test Report: [FEATURE]

**Date**: [YYYY-MM-DD]
**Executed by**: Tester Agent

## Coverage Summary

| Module | Tests | Coverage % |
|--------|-------|------------|
| ... | ... | ... |

## Test Results

- Total: [X]
- Passed: [X]
- Failed: [X]
- Skipped: [X]

## Semantic Audit Verdict

- Verdict: PASS | FAIL
- Blocking Violations:
  - [file path] -> [reason]
- Decision Memory:
  - ADRs checked: [...]
  - Rejected-path regressions: PASS | FAIL
  - Missing `@RATIONALE` / `@REJECTED`: [...]
- Notes:
  - Reject docstring-only semantic pseudo-markup
  - Reject complexity/contract mismatches
  - Reject missing belief-state instrumentation for Python Complexity 4/5
  - Reject silent resurrection of rejected paths

## Issues Found

| Test | Error | Resolution |
|------|-------|------------|
| ... | ... | ... |

## Next Steps

- [ ] Fix failed tests
- [ ] Fix blocking semantic violations before acceptance
- [ ] Fix decision-memory drift or rejected-path regressions
- [ ] Add more coverage for [module]
- [ ] Review TEST_FIXTURE fixtures
```
## Context for Testing
$ARGUMENTS