---
name: semantics-testing
description: Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.
---

[DEF:Std:Semantics:Testing]

@COMPLEXITY 5

@PURPOSE Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.

@RELATION DEPENDS_ON -> [Std:Semantics:Core]

@INVARIANT Test modules must trace back to production @INVARIANT tags without flooding the Semantic Graph with orphan nodes.

0. QA RATIONALE (LLM PHYSICS IN TESTING)

You are an Agentic QA Engineer. Your primary failure modes are:

  1. The Logic Mirror Anti-Pattern: Hallucinating a test by re-implementing the exact same algorithm from the source code to compute expected_result. This creates a tautology (a test that always passes but proves nothing).
  2. Semantic Graph Bloat: Wrapping every 3-line test function in a Complexity 5 contract, polluting the GraphRAG database with thousands of useless orphan nodes.

Your mandate is to prove that the @POST guarantees and @INVARIANT rules of the production code are physically unbreakable, using a minimal AST footprint.

I. EXTERNAL ONTOLOGY (BOUNDARIES)

When writing code or tests that depend on 3rd-party libraries or shared schemas that DO NOT have local [DEF] anchors in our repository, you MUST use strict external prefixes. CRITICAL RULE: Do NOT hallucinate [DEF] anchors for external code.

  1. External Libraries ([EXT:Package:Module]):
    • Use for 3rd-party dependencies.
    • Example: @RELATION DEPENDS_ON -> [EXT:FastAPI:Router] or [EXT:SQLAlchemy:Session]
  2. Shared DTOs ([DTO:Name]):
    • Use for globally shared schemas, Protobufs, or external registry definitions.
    • Example: @RELATION DEPENDS_ON -> [DTO:StripeWebhookPayload]
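The prefix rule is mechanical enough to lint. A minimal sketch (the regex and function name are our own, not part of the protocol):

```python
import re

# Accept only targets with an explicit [DEF:...], [EXT:...], or [DTO:...] prefix,
# so hallucinated local anchors for external code fail the check.
RELATION_RE = re.compile(r"@RELATION\s+\w+\s*->\s*\[(?:DEF|EXT|DTO):[^\]]+\]")

def relation_is_valid(line: str) -> bool:
    """Return True if a @RELATION line uses an allowed target prefix."""
    return bool(RELATION_RE.search(line))
```

For example, `relation_is_valid("# @RELATION DEPENDS_ON -> [EXT:FastAPI:Router]")` passes, while a bare `[FastAPI:Router]` target without a prefix is rejected.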

II. TEST MARKUP ECONOMY (NOISE REDUCTION)

To prevent overwhelming the Semantic Graph, test files operate under relaxed complexity rules:

  1. Short IDs: Test modules MUST use concise IDs (e.g., [DEF:PaymentTests:Module]), not full file paths.
  2. Root Binding (BINDS_TO): Do NOT map the internal call graph of a test file. Instead, anchor the entire test suite or large fixture classes to the production module using: @RELATION BINDS_TO -> [DEF:TargetModuleId].
  3. Complexity 1 for Helpers: Small test utilities (e.g., _setup_mock, _build_payload) are C1. They require ONLY [DEF]...[/DEF] anchors. No @PURPOSE or @RELATION allowed.
  4. Complexity 2 for Tests: Actual test functions (e.g., test_invalid_auth) are C2. They require [DEF]...[/DEF] and @PURPOSE. Do not add @PRE/@POST to individual test functions.
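Putting rules 1 and 2 together, a test-module header might look like the sketch below (module and target IDs are invented for illustration):

```python
# [DEF:PaymentTests:Module]
# @PURPOSE Verify capture and refund invariants of the Payments service.
# @RELATION BINDS_TO -> [DEF:Payments:Service]
#
# Note: the ID is short ("PaymentTests:Module"), not a file path, and the whole
# suite binds to one production anchor instead of mapping its internal call graph.

MODULE_ID = "PaymentTests:Module"

def is_short_id(module_id: str) -> bool:
    """A short ID carries no path separators or file extensions."""
    return "/" not in module_id and not module_id.endswith(".py")
```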

III. TRACEABILITY & TEST CONTRACTS

In the Header of your Test Module (or inside a large Test Class), you MUST define the Test Contracts. These tags map directly to the @INVARIANT and @POST tags of the production code you are testing.

  • @TEST_CONTRACT: [InputType] -> [OutputType]
  • @TEST_SCENARIO: [scenario_name] -> [Expected behavior]
  • @TEST_FIXTURE: [fixture_name] -> [file:path] | INLINE_JSON
  • @TEST_EDGE: [edge_name] -> [Failure description] (You MUST cover at least 3 edge cases: missing_field, invalid_type, external_fail).
  • The Traceability Link: @TEST_INVARIANT: [Invariant_Name_From_Source] -> VERIFIED_BY: [scenario_1, edge_name_2]
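A complete header combining these tags might look like the sketch below (all names are hypothetical); the small helper shows how the VERIFIED_BY list can be extracted mechanically for traceability audits:

```python
import re

# Hypothetical test-module header tracing back to a production invariant
# named No_Partial_Writes.
HEADER = """
# [DEF:UserApiTests:Module]
# @RELATION BINDS_TO -> [DEF:UserApi:Service]
# @TEST_CONTRACT: UserCreateRequest -> UserResponse
# @TEST_SCENARIO: create_user_success -> 201 with persisted user DTO
# @TEST_FIXTURE: valid_user -> INLINE_JSON
# @TEST_EDGE: missing_field -> name absent yields 422
# @TEST_EDGE: invalid_type -> malformed email yields 422
# @TEST_EDGE: external_fail -> DB down yields 503, no partial write
# @TEST_INVARIANT: No_Partial_Writes -> VERIFIED_BY: [create_user_success, external_fail]
"""

def verified_scenarios(header: str) -> list[str]:
    """Extract the scenario names from the @TEST_INVARIANT traceability link."""
    match = re.search(r"VERIFIED_BY:\s*\[([^\]]*)\]", header)
    return [name.strip() for name in match.group(1).split(",")] if match else []
```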

IV. PYTHON TESTING STACK

Use pytest as the primary test framework. Follow these conventions:

  1. Test files: Named test_*.py, placed in a tests/ directory mirroring the source tree.
  2. Fixtures: Use @pytest.fixture for test setup. Prefer conftest.py for shared fixtures.
  3. Mocking: Use unittest.mock (standard library) for mocking [EXT:...] boundaries. Use pytest-mock (mocker fixture) when available.
  4. Parametrization: Use @pytest.mark.parametrize for table-driven tests covering edge cases.
  5. Assertions: Use plain assert statements — pytest provides rich introspection on failures.

Example — C1 test helper:

from typing import Any

# [DEF:_build_payload:Function]
def _build_payload(**overrides: Any) -> dict:
    base = {"name": "test", "value": 42}
    return {**base, **overrides}
# [/DEF:_build_payload:Function]

Example — C2 test function:

# Assumed imports (module paths are illustrative):
#   from fastapi.testclient import TestClient
#   from sqlalchemy.orm import Session
#   from app.models import User
# [DEF:test_create_user_success:Function]
# @PURPOSE Verify that a valid payload creates a user and returns 201 with the user DTO.
def test_create_user_success(client: TestClient, db_session: Session) -> None:
    payload = {"name": "Alice", "email": "alice@example.com"}
    response = client.post("/api/users", json=payload)
    assert response.status_code == 201
    assert response.json()["name"] == "Alice"
    assert db_session.query(User).count() == 1
# [/DEF:test_create_user_success:Function]

Example — Parametrized edge cases:

import pytest

# [DEF:test_create_user_validation_edges:Function]
# @PURPOSE Cover edge cases for user creation validation: missing fields, invalid types, external failures.
# NOTE: the expected_detail substrings below name the @TEST_EDGE cases; swap in
# the actual error text your validation layer emits.
@pytest.mark.parametrize("payload,expected_status,expected_detail", [
    ({"email": "a@b.com"}, 422, "missing_field"),
    ({"name": "A", "email": "not-an-email"}, 422, "invalid_type"),
])
def test_create_user_validation_edges(
    client: TestClient,
    payload: dict,
    expected_status: int,
    expected_detail: str,
) -> None:
    response = client.post("/api/users", json=payload)
    assert response.status_code == expected_status
    assert expected_detail in str(response.json())
# [/DEF:test_create_user_validation_edges:Function]

V. ADR REGRESSION DEFENSE

The Architectural Decision Records (ADR) and @REJECTED tags in production code are constraints. If the production [DEF] has a @REJECTED [Forbidden_Path] tag (e.g., @REJECTED fallback to SQLite), your Test Module MUST contain an explicit @TEST_EDGE scenario proving that the forbidden path is physically unreachable or throws an appropriate error. Tests are the enforcers of architectural memory.
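For example, given a production tag @REJECTED fallback to SQLite, the regression guard might look like this (make_engine is a stand-in for the real production factory):

```python
import pytest

# Stand-in for the production factory; the real one lives behind its own [DEF].
def make_engine(url: str):
    if url.startswith("sqlite"):
        # ADR constraint: the SQLite fallback path must be unreachable.
        raise ValueError("SQLite fallback is architecturally rejected")
    return object()  # placeholder for a real engine

# [DEF:test_rejected_sqlite_fallback:Function]
# @PURPOSE @TEST_EDGE rejected_sqlite_fallback -> forbidden SQLite path raises.
def test_rejected_sqlite_fallback() -> None:
    with pytest.raises(ValueError):
        make_engine("sqlite:///fallback.db")
# [/DEF:test_rejected_sqlite_fallback:Function]
```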

VI. ANTI-TAUTOLOGY RULES

  1. No Logic Mirrors: Use deterministic, hardcoded fixtures (@TEST_FIXTURE) for expected results. Do not dynamically calculate expected = a + b to test an add(a, b) function.
  2. Do Not Mock The System Under Test: You may mock [EXT:...] boundaries (like DB drivers or external APIs), but you MUST NOT mock the local [DEF] node you are actively verifying.
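The difference is easiest to see side by side (add is a stand-in for the system under test):

```python
# System under test (stand-in).
def add(a: int, b: int) -> int:
    return a + b

# BAD — logic mirror: the test recomputes the implementation, so it can
# never disagree with the code it claims to verify.
def test_add_mirror() -> None:
    a, b = 3, 4
    assert add(a, b) == a + b  # tautology

# GOOD — hardcoded fixture: the expected value is fixed independently.
def test_add_fixture() -> None:
    assert add(3, 4) == 7
```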

VII. VERIFIABLE HARNESS RULES

For agentic development, a test harness is part of the task environment.

  • Prefer real executable checks over narrative claims that a change is safe.
  • Verify that the harness actually fails on the broken state and passes on the fixed state whenever feasible.
  • Resist shortcut tests that bypass the real integration boundary the task is supposed to validate.
  • When a production @POST guarantee is subtle, add the narrowest test that can falsify it.

VIII. LONG-HORIZON QA MEMORY

When multiple attempts are needed:

  • Preserve the smallest set of failing fixtures, commands, and invariant mappings that explain the current gap.
  • Fold older failed attempts into one bounded note describing what was tried and why it was rejected.
  • Do not keep extending the active QA transcript with redundant command output.

IX. TESTING SEARCH DISCIPLINE

  • Use one concrete failing hypothesis plus one verifier by default.
  • Add alternative test strategies only when the first verifier is inconclusive.
  • Do not mirror the implementation logic to fabricate expected values; use fixtures, explicit contracts, and invariant-oriented assertions.

X. PYTEST CONVENTIONS & COMMAND EXAMPLES

# Run all tests
pytest

# Run a specific test module
pytest tests/test_users.py

# Run with coverage report
pytest --cov=src --cov-report=term-missing

# Run only tests matching a keyword
pytest -k "create_user"

# Run verbose (-v), without output capture (-s), stopping on first failure (-x)
pytest -xvs

[SYSTEM: END OF TESTING DIRECTIVE. ENFORCE STRICT TRACEABILITY.]