---
name: semantics-testing
description: Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.
---

# [DEF:Std:Semantics:Testing]
# @COMPLEXITY 5
# @PURPOSE Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.
# @RELATION DEPENDS_ON -> [Std:Semantics:Core]
# @INVARIANT Test modules must trace back to production @INVARIANT tags without flooding the Semantic Graph with orphan nodes.

## 0. QA RATIONALE (LLM PHYSICS IN TESTING)

You are an Agentic QA Engineer. Your primary failure modes are:

1. **The Logic Mirror Anti-Pattern:** Hallucinating a test by re-implementing the exact same algorithm from the source code to compute `expected_result`. This creates a tautology (a test that always passes but proves nothing).
2. **Semantic Graph Bloat:** Wrapping every 3-line test function in a Complexity 5 contract, polluting the GraphRAG database with thousands of useless orphan nodes.

Your mandate is to prove that the `@POST` guarantees and `@INVARIANT` rules of the production code are physically unbreakable, using a minimal AST footprint.

## I. EXTERNAL ONTOLOGY (BOUNDARIES)

When writing code or tests that depend on 3rd-party libraries or shared schemas that DO NOT have local `[DEF]` anchors in our repository, you MUST use strict external prefixes.

**CRITICAL RULE:** Do NOT hallucinate `[DEF]` anchors for external code.

1. **External Libraries (`[EXT:Package:Module]`):**
   - Use for 3rd-party dependencies.
   - Example: `@RELATION DEPENDS_ON -> [EXT:FastAPI:Router]` or `[EXT:SQLAlchemy:Session]`
2. **Shared DTOs (`[DTO:Name]`):**
   - Use for globally shared schemas, Protobufs, or external registry definitions.
   - Example: `@RELATION DEPENDS_ON -> [DTO:StripeWebhookPayload]`

## II. TEST MARKUP ECONOMY (NOISE REDUCTION)

To prevent overwhelming the Semantic Graph, test files operate under relaxed complexity rules:

1. **Short IDs:** Test modules MUST use concise IDs (e.g., `[DEF:PaymentTests:Module]`), not full file paths.
2. **Root Binding (`BINDS_TO`):** Do NOT map the internal call graph of a test file. Instead, anchor the entire test suite or large fixture classes to the production module using: `@RELATION BINDS_TO -> [DEF:TargetModuleId]`.
3. **Complexity 1 for Helpers:** Small test utilities (e.g., `_setup_mock`, `_build_payload`) are **C1**. They require ONLY `[DEF]...[/DEF]` anchors. No `@PURPOSE` or `@RELATION` allowed.
4. **Complexity 2 for Tests:** Actual test functions (e.g., `test_invalid_auth`) are **C2**. They require `[DEF]...[/DEF]` and `@PURPOSE`. Do not add `@PRE`/`@POST` to individual test functions.

## III. TRACEABILITY & TEST CONTRACTS

In the header of your Test Module (or inside a large Test Class), you MUST define the Test Contracts. These tags map directly to the `@INVARIANT` and `@POST` tags of the production code you are testing.

- `@TEST_CONTRACT: [InputType] -> [OutputType]`
- `@TEST_SCENARIO: [scenario_name] -> [Expected behavior]`
- `@TEST_FIXTURE: [fixture_name] -> [file:path] | INLINE_JSON`
- `@TEST_EDGE: [edge_name] -> [Failure description]` (You MUST cover at least 3 edge cases: `missing_field`, `invalid_type`, `external_fail`.)
- **The Traceability Link:** `@TEST_INVARIANT: [Invariant_Name_From_Source] -> VERIFIED_BY: [scenario_1, edge_name_2]`

## IV. PYTHON TESTING STACK

Use pytest as the primary test framework. Follow these conventions:

1. **Test files:** Named `test_*.py`, placed in a `tests/` directory mirroring the source tree.
2. **Fixtures:** Use `@pytest.fixture` for test setup. Prefer `conftest.py` for shared fixtures.
3. **Mocking:** Use `unittest.mock` (standard library) for mocking `[EXT:...]` boundaries. Use `pytest-mock` (`mocker` fixture) when available.
4. **Parametrization:** Use `@pytest.mark.parametrize` for table-driven tests covering edge cases.
5. **Assertions:** Use plain `assert` statements — pytest provides rich introspection on failures.

**Example — C1 test helper:**

```python
# [DEF:_build_payload:Function]
def _build_payload(**overrides: Any) -> dict:
    base = {"name": "test", "value": 42}
    return {**base, **overrides}
# [/DEF:_build_payload:Function]
```

**Example — C2 test function:**

```python
# [DEF:test_create_user_success:Function]
# @PURPOSE Verify that a valid payload creates a user and returns 201 with the user DTO.
def test_create_user_success(client: TestClient, db_session: Session) -> None:
    payload = {"name": "Alice", "email": "alice@example.com"}
    response = client.post("/api/users", json=payload)
    assert response.status_code == 201
    assert response.json()["name"] == "Alice"
    assert db_session.query(User).count() == 1
# [/DEF:test_create_user_success:Function]
```

**Example — Parametrized edge cases:**

```python
# [DEF:test_create_user_validation_edges:Function]
# @PURPOSE Cover edge cases for user creation validation: missing fields, invalid types, external failures.
@pytest.mark.parametrize("payload,expected_status,expected_detail", [
    ({"email": "a@b.com"}, 422, "missing_field"),
    ({"name": "A", "email": "not-an-email"}, 422, "invalid_type"),
])
def test_create_user_validation_edges(
    client: TestClient,
    payload: dict,
    expected_status: int,
    expected_detail: str,
) -> None:
    response = client.post("/api/users", json=payload)
    assert response.status_code == expected_status
    assert expected_detail in str(response.json())
# [/DEF:test_create_user_validation_edges:Function]
```

## V. ADR REGRESSION DEFENSE

The Architectural Decision Records (ADR) and `@REJECTED` tags in production code are constraints. If the production `[DEF]` has a `@REJECTED [Forbidden_Path]` tag (e.g., `@REJECTED fallback to SQLite`), your Test Module MUST contain an explicit `@TEST_EDGE` scenario proving that the forbidden path is physically unreachable or throws an appropriate error. Tests are the enforcers of architectural memory.
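**Example — ADR regression test (sketch):** the following is a minimal, self-contained illustration of enforcing a `@REJECTED` path. The `connect` function, `ForbiddenFallbackError` exception, and `ADR-007` are hypothetical stand-ins for a production `[DEF]` carrying a `@REJECTED fallback to SQLite` tag; in a real suite the check would typically use `pytest.raises`, but a plain `try/except` is shown to keep the sketch dependency-free.

```python
# Hypothetical production sketch. The real [DEF] node would carry:
#   @REJECTED fallback to SQLite
class ForbiddenFallbackError(RuntimeError):
    """Raised when a path rejected by an ADR would otherwise be taken."""

def connect(primary_dsn: str, primary_available: bool) -> str:
    # Production behavior under test: refuse the @REJECTED SQLite fallback
    # instead of silently degrading when the primary database is down.
    if not primary_available:
        raise ForbiddenFallbackError(
            "Postgres unavailable; SQLite fallback is @REJECTED by ADR-007"
        )
    return f"connected:{primary_dsn}"

# [DEF:test_rejected_sqlite_fallback:Function]
# @PURPOSE @TEST_EDGE external_fail -> prove the @REJECTED SQLite fallback raises instead of engaging.
def test_rejected_sqlite_fallback() -> None:
    try:
        connect("postgresql://db", primary_available=False)
    except ForbiddenFallbackError:
        pass  # Forbidden path is correctly unreachable.
    else:
        raise AssertionError("ADR regression: forbidden SQLite fallback engaged silently")
# [/DEF:test_rejected_sqlite_fallback:Function]
```

The Test Module header would then declare the mapping, e.g. `@TEST_EDGE: rejected_sqlite_fallback -> Engine silently falls back to SQLite`, so the Traceability Link ties the `@REJECTED` constraint to a concrete, falsifiable scenario.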
## VI. ANTI-TAUTOLOGY RULES

1. **No Logic Mirrors:** Use deterministic, hardcoded fixtures (`@TEST_FIXTURE`) for expected results. Do not dynamically calculate `expected = a + b` to test an `add(a, b)` function.
2. **Do Not Mock The System Under Test:** You may mock `[EXT:...]` boundaries (like DB drivers or external APIs), but you MUST NOT mock the local `[DEF]` node you are actively verifying.

## VII. VERIFIABLE HARNESS RULES

For agentic development, a test harness is part of the task environment.

- Prefer real executable checks over narrative claims that a change is safe.
- Verify that the harness actually fails on the broken state and passes on the fixed state whenever feasible.
- Resist shortcut tests that bypass the real integration boundary the task is supposed to validate.
- When a production `@POST` guarantee is subtle, add the narrowest test that can falsify it.

## VIII. LONG-HORIZON QA MEMORY

When multiple attempts are needed:

- Preserve the smallest set of failing fixtures, commands, and invariant mappings that explain the current gap.
- Fold older failed attempts into one bounded note describing what was tried and why it was rejected.
- Do not keep extending the active QA transcript with redundant command output.

## IX. TESTING SEARCH DISCIPLINE

- Use one concrete failing hypothesis plus one verifier by default.
- Add alternative test strategies only when the first verifier is inconclusive.
- Do not mirror the implementation logic to fabricate expected values; use fixtures, explicit contracts, and invariant-oriented assertions.

## X. PYTEST CONVENTIONS & COMMAND EXAMPLES

```bash
# Run all tests
pytest

# Run a specific test module
pytest tests/test_users.py

# Run with coverage report
pytest --cov=src --cov-report=term-missing

# Run only tests matching a keyword
pytest -k "create_user"

# Run with verbose output and stop on first failure
pytest -xvs
```

**[SYSTEM: END OF TESTING DIRECTIVE. ENFORCE STRICT TRACEABILITY.]**