---
name: semantics-testing
description: Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.
---

[DEF:Std:Semantics:Testing]

@COMPLEXITY 5

@PURPOSE Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.

@RELATION DEPENDS_ON -> [Std:Semantics:Core]

@INVARIANT Test modules must trace back to production @INVARIANT tags without flooding the Semantic Graph with orphan nodes.

0. QA RATIONALE (LLM PHYSICS IN TESTING)

You are an Agentic QA Engineer. Your primary failure modes are:

  1. The Logic Mirror Anti-Pattern: Hallucinating a test by re-implementing the exact same algorithm from the source code to compute expected_result. This creates a tautology (a test that always passes but proves nothing).
  2. Semantic Graph Bloat: Wrapping every 3-line test function in a Complexity 5 contract, polluting the GraphRAG database with thousands of useless orphan nodes.

Your mandate is to prove that the @POST guarantees and @INVARIANT rules of the production code are physically unbreakable, using a minimal AST footprint.

I. EXTERNAL ONTOLOGY (BOUNDARIES)

When writing code or tests that depend on 3rd-party libraries or shared schemas that DO NOT have local [DEF] anchors in our repository, you MUST use strict external prefixes. CRITICAL RULE: Do NOT hallucinate [DEF] anchors for external code.

  1. External Libraries ([EXT:Package:Module]):
    • Use for 3rd-party dependencies.
    • Example: @RELATION DEPENDS_ON -> [EXT:FastAPI:Router] or [EXT:SQLAlchemy:Session]
  2. Shared DTOs ([DTO:Name]):
    • Use for globally shared schemas, Protobufs, or external registry definitions.
    • Example: @RELATION DEPENDS_ON -> [DTO:StripeWebhookPayload]
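The prefix rule is mechanical enough to lint. A minimal sketch (the regex and function name are our own, not part of the protocol):

```python
import re

# Accept only targets with an explicit [DEF:...], [EXT:...], or [DTO:...] prefix,
# so hallucinated local anchors for external code fail the check.
RELATION_RE = re.compile(r"@RELATION\s+\w+\s*->\s*\[(?:DEF|EXT|DTO):[^\]]+\]")

def relation_is_valid(line: str) -> bool:
    """Return True if a @RELATION line uses an allowed target prefix."""
    return bool(RELATION_RE.search(line))
```

For example, `relation_is_valid("# @RELATION DEPENDS_ON -> [EXT:FastAPI:Router]")` passes, while a bare `[FastAPI:Router]` target without a prefix is rejected.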

II. TEST MARKUP ECONOMY (NOISE REDUCTION)

To prevent overwhelming the Semantic Graph, test files operate under relaxed complexity rules:

  1. Short IDs: Test modules MUST use concise IDs (e.g., [DEF:PaymentTests:Module]), not full file paths.
  2. Root Binding (BINDS_TO): Do NOT map the internal call graph of a test file. Instead, anchor the entire test suite or large fixture classes to the production module using: @RELATION BINDS_TO -> [DEF:TargetModuleId].
  3. Complexity 1 for Helpers: Small test utilities (e.g., _setup_mock, _build_payload) are C1. They require ONLY [DEF]...[/DEF] anchors. No @PURPOSE or @RELATION allowed.
  4. Complexity 2 for Tests: Actual test functions (e.g., test_invalid_auth) are C2. They require [DEF]...[/DEF] and @PURPOSE. Do not add @PRE/@POST to individual test functions.
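Putting rules 1 and 2 together, a test-module header might look like the sketch below (module and target IDs are invented for illustration):

```python
# [DEF:PaymentTests:Module]
# @PURPOSE Verify capture and refund invariants of the Payments service.
# @RELATION BINDS_TO -> [DEF:Payments:Service]
#
# Note: the ID is short ("PaymentTests:Module"), not a file path, and the whole
# suite binds to one production anchor instead of mapping its internal call graph.

MODULE_ID = "PaymentTests:Module"

def is_short_id(module_id: str) -> bool:
    """A short ID carries no path separators or file extensions."""
    return "/" not in module_id and not module_id.endswith(".py")
```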

III. TRACEABILITY & TEST CONTRACTS

In the Header of your Test Module (or inside a large Test Class), you MUST define the Test Contracts. These tags map directly to the @INVARIANT and @POST tags of the production code you are testing.

  • @TEST_CONTRACT: [InputType] -> [OutputType]
  • @TEST_SCENARIO: [scenario_name] -> [Expected behavior]
  • @TEST_FIXTURE: [fixture_name] -> [file:path] | INLINE_JSON
  • @TEST_EDGE: [edge_name] -> [Failure description] (You MUST cover at least 3 edge cases: missing_field, invalid_type, external_fail).
  • The Traceability Link: @TEST_INVARIANT: [Invariant_Name_From_Source] -> VERIFIED_BY: [scenario_1, edge_name_2]
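A complete header combining these tags might look like the sketch below (all names are hypothetical); the small helper shows how the VERIFIED_BY list can be extracted mechanically for traceability audits:

```python
import re

# Hypothetical test-module header tracing back to a production invariant
# named No_Partial_Writes.
HEADER = """
# [DEF:UserApiTests:Module]
# @RELATION BINDS_TO -> [DEF:UserApi:Service]
# @TEST_CONTRACT: UserCreateRequest -> UserResponse
# @TEST_SCENARIO: create_user_success -> 201 with persisted user DTO
# @TEST_FIXTURE: valid_user -> INLINE_JSON
# @TEST_EDGE: missing_field -> name absent yields 422
# @TEST_EDGE: invalid_type -> malformed email yields 422
# @TEST_EDGE: external_fail -> DB down yields 503, no partial write
# @TEST_INVARIANT: No_Partial_Writes -> VERIFIED_BY: [create_user_success, external_fail]
"""

def verified_scenarios(header: str) -> list[str]:
    """Extract the scenario names from the @TEST_INVARIANT traceability link."""
    match = re.search(r"VERIFIED_BY:\s*\[([^\]]*)\]", header)
    return [name.strip() for name in match.group(1).split(",")] if match else []
```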

IV. PYTHON TESTING STACK

Use pytest as the primary test framework. Follow these conventions:

  1. Test files: Named test_*.py, placed in a tests/ directory mirroring the source tree.
  2. Fixtures: Use @pytest.fixture for test setup. Prefer conftest.py for shared fixtures.
  3. Mocking: Use unittest.mock (standard library) for mocking [EXT:...] boundaries. Use pytest-mock (mocker fixture) when available.
  4. Parametrization: Use @pytest.mark.parametrize for table-driven tests covering edge cases.
  5. Assertions: Use plain assert statements — pytest provides rich introspection on failures.

Example — C1 test helper:

from typing import Any

# [DEF:_build_payload:Function]
def _build_payload(**overrides: Any) -> dict:
    base = {"name": "test", "value": 42}
    return {**base, **overrides}
# [/DEF:_build_payload:Function]

Example — C2 test function:

# Assumed imports (module paths are illustrative):
#   from fastapi.testclient import TestClient
#   from sqlalchemy.orm import Session
#   from app.models import User
# [DEF:test_create_user_success:Function]
# @PURPOSE Verify that a valid payload creates a user and returns 201 with the user DTO.
def test_create_user_success(client: TestClient, db_session: Session) -> None:
    payload = {"name": "Alice", "email": "alice@example.com"}
    response = client.post("/api/users", json=payload)
    assert response.status_code == 201
    assert response.json()["name"] == "Alice"
    assert db_session.query(User).count() == 1
# [/DEF:test_create_user_success:Function]

Example — Parametrized edge cases:

import pytest

# [DEF:test_create_user_validation_edges:Function]
# @PURPOSE Cover edge cases for user creation validation: missing fields, invalid types, external failures.
# NOTE: the expected_detail substrings below name the @TEST_EDGE cases; swap in
# the actual error text your validation layer emits.
@pytest.mark.parametrize("payload,expected_status,expected_detail", [
    ({"email": "a@b.com"}, 422, "missing_field"),
    ({"name": "A", "email": "not-an-email"}, 422, "invalid_type"),
])
def test_create_user_validation_edges(
    client: TestClient,
    payload: dict,
    expected_status: int,
    expected_detail: str,
) -> None:
    response = client.post("/api/users", json=payload)
    assert response.status_code == expected_status
    assert expected_detail in str(response.json())
# [/DEF:test_create_user_validation_edges:Function]

V. ADR REGRESSION DEFENSE

The Architectural Decision Records (ADR) and @REJECTED tags in production code are constraints. If the production [DEF] has a @REJECTED [Forbidden_Path] tag (e.g., @REJECTED fallback to SQLite), your Test Module MUST contain an explicit @TEST_EDGE scenario proving that the forbidden path is physically unreachable or throws an appropriate error. Tests are the enforcers of architectural memory.
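For example, given a production tag @REJECTED fallback to SQLite, the regression guard might look like this (make_engine is a stand-in for the real production factory):

```python
import pytest

# Stand-in for the production factory; the real one lives behind its own [DEF].
def make_engine(url: str):
    if url.startswith("sqlite"):
        # ADR constraint: the SQLite fallback path must be unreachable.
        raise ValueError("SQLite fallback is architecturally rejected")
    return object()  # placeholder for a real engine

# [DEF:test_rejected_sqlite_fallback:Function]
# @PURPOSE @TEST_EDGE rejected_sqlite_fallback -> forbidden SQLite path raises.
def test_rejected_sqlite_fallback() -> None:
    with pytest.raises(ValueError):
        make_engine("sqlite:///fallback.db")
# [/DEF:test_rejected_sqlite_fallback:Function]
```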

VI. ANTI-TAUTOLOGY RULES

  1. No Logic Mirrors: Use deterministic, hardcoded fixtures (@TEST_FIXTURE) for expected results. Do not dynamically calculate expected = a + b to test an add(a, b) function.
  2. Do Not Mock The System Under Test: You may mock [EXT:...] boundaries (like DB drivers or external APIs), but you MUST NOT mock the local [DEF] node you are actively verifying.
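The difference is easiest to see side by side (add is a stand-in for the system under test):

```python
# System under test (stand-in).
def add(a: int, b: int) -> int:
    return a + b

# BAD — logic mirror: the test recomputes the implementation, so it can
# never disagree with the code it claims to verify.
def test_add_mirror() -> None:
    a, b = 3, 4
    assert add(a, b) == a + b  # tautology

# GOOD — hardcoded fixture: the expected value is fixed independently.
def test_add_fixture() -> None:
    assert add(3, 4) == 7
```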

VII. VERIFIABLE HARNESS RULES

For agentic development, a test harness is part of the task environment.

  • Prefer real executable checks over narrative claims that a change is safe.
  • Verify that the harness actually fails on the broken state and passes on the fixed state whenever feasible.
  • Resist shortcut tests that bypass the real integration boundary the task is supposed to validate.
  • When a production @POST guarantee is subtle, add the narrowest test that can falsify it.

VIII. LONG-HORIZON QA MEMORY

When multiple attempts are needed:

  • Preserve the smallest set of failing fixtures, commands, and invariant mappings that explain the current gap.
  • Fold older failed attempts into one bounded note describing what was tried and why it was rejected.
  • Do not keep extending the active QA transcript with redundant command output.

IX. TESTING SEARCH DISCIPLINE

  • Use one concrete failing hypothesis plus one verifier by default.
  • Add alternative test strategies only when the first verifier is inconclusive.
  • Do not mirror the implementation logic to fabricate expected values; use fixtures, explicit contracts, and invariant-oriented assertions.

X. PYTEST CONVENTIONS & COMMAND EXAMPLES

# Run all tests
pytest

# Run a specific test module
pytest tests/test_users.py

# Run with coverage report
pytest --cov=src --cov-report=term-missing

# Run only tests matching a keyword
pytest -k "create_user"

# Run verbose (-v), without output capture (-s), stopping on first failure (-x)
pytest -xvs

[SYSTEM: END OF TESTING DIRECTIVE. ENFORCE STRICT TRACEABILITY.]