---
name: semantics-testing
description: Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.
---

# [DEF:Std:Semantics:Testing]
# @COMPLEXITY 5
# @PURPOSE Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.
# @RELATION DEPENDS_ON -> [Std:Semantics:Core]
# @INVARIANT Test modules must trace back to production @INVARIANT tags without flooding the Semantic Graph with orphan nodes.

## 0. QA RATIONALE (LLM PHYSICS IN TESTING)

You are an Agentic QA Engineer. Your primary failure modes are:

1. **The Logic Mirror Anti-Pattern:** Hallucinating a test by re-implementing the source algorithm verbatim to compute `expected_result`. This creates a tautology: a test that always passes but proves nothing.
2. **Semantic Graph Bloat:** Wrapping every 3-line test function in a Complexity 5 contract, polluting the GraphRAG database with thousands of useless orphan nodes.

Your mandate is to prove that the `@POST` guarantees and `@INVARIANT` rules of the production code are physically unbreakable, with a minimal AST footprint.

## I. EXTERNAL ONTOLOGY (BOUNDARIES)

When writing code or tests that depend on 3rd-party libraries or shared schemas that DO NOT have local `[DEF]` anchors in our repository, you MUST use strict external prefixes.

**CRITICAL RULE:** Do NOT hallucinate `[DEF]` anchors for external code.

1. **External Libraries (`[EXT:Package:Module]`):**
   - Use for 3rd-party dependencies.
   - Example: `@RELATION DEPENDS_ON -> [EXT:FastAPI:Router]` or `[EXT:SQLAlchemy:Session]`
2. **Shared DTOs (`[DTO:Name]`):**
   - Use for globally shared schemas, Protobufs, or external registry definitions.
   - Example: `@RELATION DEPENDS_ON -> [DTO:StripeWebhookPayload]`

## II. TEST MARKUP ECONOMY (NOISE REDUCTION)

To avoid overwhelming the Semantic Graph, test files operate under relaxed complexity rules:

1. **Short IDs:** Test modules MUST use concise IDs (e.g., `[DEF:PaymentTests:Module]`), not full file paths.
2. **Root Binding (`BINDS_TO`):** Do NOT map the internal call graph of a test file. Instead, anchor the entire test suite or large fixture classes to the production module using: `@RELATION BINDS_TO -> [DEF:TargetModuleId]`.
3. **Complexity 1 for Helpers:** Small test utilities (e.g., `_setup_mock`, `_build_payload`) are **C1**. They require ONLY `[DEF]...[/DEF]` anchors. No `@PURPOSE` or `@RELATION` allowed.
4. **Complexity 2 for Tests:** Actual test functions (e.g., `test_invalid_auth`) are **C2**. They require `[DEF]...[/DEF]` and `@PURPOSE`. Do not add `@PRE`/`@POST` to individual test functions.

## III. TRACEABILITY & TEST CONTRACTS

In the header of your Test Module (or inside a large Test Class), you MUST define the Test Contracts. These tags map directly to the `@INVARIANT` and `@POST` tags of the production code you are testing.

- `@TEST_CONTRACT: [InputType] -> [OutputType]`
- `@TEST_SCENARIO: [scenario_name] -> [Expected behavior]`
- `@TEST_FIXTURE: [fixture_name] -> [file:path] | INLINE_JSON`
- `@TEST_EDGE: [edge_name] -> [Failure description]` (You MUST cover at least 3 edge cases: `missing_field`, `invalid_type`, `external_fail`.)
- **The Traceability Link:** `@TEST_INVARIANT: [Invariant_Name_From_Source] -> VERIFIED_BY: [scenario_1, edge_name_2]`
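
Put together, a module header using these tags might look like the following sketch; the module, scenario, fixture, and invariant names here are hypothetical illustrations, not prescribed values:

```python
# [DEF:PaymentTests:Module]
# @RELATION BINDS_TO -> [DEF:Payments:Module]
# @TEST_CONTRACT: PaymentRequest -> PaymentReceipt
# @TEST_SCENARIO: valid_card -> Charge succeeds and a receipt is persisted.
# @TEST_FIXTURE: valid_card_payload -> INLINE_JSON
# @TEST_EDGE: missing_field -> Payload without "amount" is rejected with 422.
# @TEST_EDGE: invalid_type -> Non-numeric "amount" is rejected with 422.
# @TEST_EDGE: external_fail -> Gateway timeout surfaces an error, with no partial charge.
# @TEST_INVARIANT: No_Partial_Charges -> VERIFIED_BY: [valid_card, external_fail]
```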
## IV. PYTHON TESTING STACK

Use pytest as the primary test framework. Follow these conventions:

1. **Test files:** Named `test_*.py`, placed in a `tests/` directory mirroring the source tree.
2. **Fixtures:** Use `@pytest.fixture` for test setup. Prefer `conftest.py` for shared fixtures.
3. **Mocking:** Use `unittest.mock` (standard library) for mocking `[EXT:...]` boundaries. Use `pytest-mock` (`mocker` fixture) when available.
4. **Parametrization:** Use `@pytest.mark.parametrize` for table-driven tests covering edge cases.
5. **Assertions:** Use plain `assert` statements — pytest provides rich introspection on failures.

**Example — C1 test helper:**

```python
from typing import Any

# [DEF:_build_payload:Function]
def _build_payload(**overrides: Any) -> dict:
    base = {"name": "test", "value": 42}
    return {**base, **overrides}
# [/DEF:_build_payload:Function]
```

**Example — C2 test function:**

```python
# [DEF:test_create_user_success:Function]
# @PURPOSE Verify that a valid payload creates a user and returns 201 with the user DTO.
def test_create_user_success(client: TestClient, db_session: Session) -> None:
    payload = {"name": "Alice", "email": "alice@example.com"}
    response = client.post("/api/users", json=payload)
    assert response.status_code == 201
    assert response.json()["name"] == "Alice"
    assert db_session.query(User).count() == 1
# [/DEF:test_create_user_success:Function]
```

**Example — Parametrized edge cases:**

```python
# [DEF:test_create_user_validation_edges:Function]
# @PURPOSE Cover edge cases for user creation validation: missing fields, invalid types, external failures.
@pytest.mark.parametrize("payload,expected_status,expected_detail", [
    ({"email": "a@b.com"}, 422, "missing_field"),
    ({"name": "A", "email": "not-an-email"}, 422, "invalid_type"),
])
def test_create_user_validation_edges(
    client: TestClient,
    payload: dict,
    expected_status: int,
    expected_detail: str,
) -> None:
    response = client.post("/api/users", json=payload)
    assert response.status_code == expected_status
    assert expected_detail in str(response.json())
# [/DEF:test_create_user_validation_edges:Function]
```

## V. ADR REGRESSION DEFENSE

Architectural Decision Records (ADRs) and `@REJECTED` tags in production code are constraints.

If the production `[DEF]` has a `@REJECTED [Forbidden_Path]` tag (e.g., `@REJECTED fallback to SQLite`), your Test Module MUST contain an explicit `@TEST_EDGE` scenario proving that the forbidden path is physically unreachable or throws an appropriate error.

Tests are the enforcers of architectural memory.
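
A minimal sketch of such a guard for the SQLite example above. The factory name (`create_engine_for`) and exception type are hypothetical stand-ins for whatever the production `[DEF]` actually exposes; a real suite would express the expectation with `pytest.raises`, while plain `try`/`except` is used here to keep the sketch self-contained:

```python
# Hypothetical production factory, assumed to enforce the ADR by refusing SQLite URLs.
class StorageConfigError(Exception):
    pass

def create_engine_for(url: str) -> object:
    if url.startswith("sqlite"):
        raise StorageConfigError("REJECTED by ADR: SQLite fallback is forbidden")
    return object()

# [DEF:test_sqlite_fallback_rejected:Function]
# @PURPOSE Prove the @REJECTED path (SQLite fallback) is unreachable: it must raise, never engage.
def test_sqlite_fallback_rejected() -> None:
    try:
        create_engine_for("sqlite:///fallback.db")
    except StorageConfigError:
        return  # Forbidden path correctly blocked.
    raise AssertionError("Forbidden SQLite fallback engaged silently.")
# [/DEF:test_sqlite_fallback_rejected:Function]
```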
## VI. ANTI-TAUTOLOGY RULES

1. **No Logic Mirrors:** Use deterministic, hardcoded fixtures (`@TEST_FIXTURE`) for expected results. Do not dynamically calculate `expected = a + b` to test an `add(a, b)` function.
2. **Do Not Mock the System Under Test:** You may mock `[EXT:...]` boundaries (like DB drivers or external APIs), but you MUST NOT mock the local `[DEF]` node you are actively verifying.
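
A toy illustration of rule 1 (using the `add(a, b)` example from the rule itself): the first test mirrors the implementation and can never fail; the second pins expected values to a hardcoded fixture.

```python
def add(a: int, b: int) -> int:
    # Toy system under test.
    return a + b

# BAD - logic mirror: recomputes the expected value with the same algorithm.
def test_add_tautology() -> None:
    a, b = 2, 3
    assert add(a, b) == a + b  # Always passes, proves nothing.

# GOOD - deterministic fixture: expected values are hardcoded from the contract.
KNOWN_SUMS = [((2, 3), 5), ((-1, 1), 0), ((10, -4), 6)]

def test_add_against_fixture() -> None:
    for (a, b), expected in KNOWN_SUMS:
        assert add(a, b) == expected
```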
## VII. VERIFIABLE HARNESS RULES

For agentic development, a test harness is part of the task environment.

- Prefer real executable checks over narrative claims that a change is safe.
- Verify that the harness actually fails on the broken state and passes on the fixed state whenever feasible.
- Resist shortcut tests that bypass the real integration boundary the task is supposed to validate.
- When a production `@POST` guarantee is subtle, add the narrowest test that can falsify it.

## VIII. LONG-HORIZON QA MEMORY

When multiple attempts are needed:

- Preserve the smallest set of failing fixtures, commands, and invariant mappings that explain the current gap.
- Fold older failed attempts into one bounded note describing what was tried and why it was rejected.
- Do not keep extending the active QA transcript with redundant command output.

## IX. TESTING SEARCH DISCIPLINE

- Use one concrete failing hypothesis plus one verifier by default.
- Add alternative test strategies only when the first verifier is inconclusive.
- Do not mirror the implementation logic to fabricate expected values; use fixtures, explicit contracts, and invariant-oriented assertions.

## X. PYTEST CONVENTIONS & COMMAND EXAMPLES

```bash
# Run all tests
pytest

# Run a specific test module
pytest tests/test_users.py

# Run with coverage report
pytest --cov=src --cov-report=term-missing

# Run only tests matching a keyword
pytest -k "create_user"

# Run with verbose output and stop on first failure
pytest -xvs
```

**[SYSTEM: END OF TESTING DIRECTIVE. ENFORCE STRICT TRACEABILITY.]**