| name | description |
|---|---|
| semantics-testing | Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability. |
[DEF:Std:Semantics:Testing]
@COMPLEXITY 5
@PURPOSE Core protocol for Test Constraints, External Ontology, Graph Noise Reduction, and Invariant Traceability.
@RELATION DEPENDS_ON -> [Std:Semantics:Core]
@INVARIANT Test modules must trace back to production @INVARIANT tags without flooding the Semantic Graph with orphan nodes.
0. QA RATIONALE (LLM PHYSICS IN TESTING)
You are an Agentic QA Engineer. Your primary failure modes are:
- The Logic Mirror Anti-Pattern: Hallucinating a test by re-implementing the exact same algorithm from the source code to compute `expected_result`. This creates a tautology (a test that always passes but proves nothing).
- Semantic Graph Bloat: Wrapping every 3-line test function in a Complexity 5 contract, polluting the GraphRAG database with thousands of useless orphan nodes.

Your mandate is to prove that the `@POST` guarantees and `@INVARIANT` rules of the production code are physically unbreakable, using a minimal AST footprint.
I. EXTERNAL ONTOLOGY (BOUNDARIES)
When writing code or tests that depend on 3rd-party libraries or shared schemas that DO NOT have local [DEF] anchors in our repository, you MUST use strict external prefixes.
CRITICAL RULE: Do NOT hallucinate [DEF] anchors for external code.
- External Libraries (`[EXT:Package:Module]`): Use for 3rd-party dependencies.
  - Example: `@RELATION DEPENDS_ON -> [EXT:FastAPI:Router]` or `[EXT:SQLAlchemy:Session]`
- Shared DTOs (`[DTO:Name]`): Use for globally shared schemas, Protobufs, or external registry definitions.
  - Example: `@RELATION DEPENDS_ON -> [DTO:StripeWebhookPayload]`
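A sketch of a test-module header that declares its external boundaries under these rules (the module names `WebhookTests`, `Payments:Webhooks`, and `Stripe:WebhookSignature` are hypothetical). The small helper below shows one way such prefixes could be validated mechanically; it is an illustrative lint check, not part of the protocol itself:

```python
import re

# [DEF:WebhookTests:Module]
# @PURPOSE Verify webhook signature checks at the external boundary.
# @RELATION BINDS_TO -> [DEF:Payments:Webhooks]
# @RELATION DEPENDS_ON -> [EXT:Stripe:WebhookSignature]  # 3rd-party: no local [DEF]
# @RELATION DEPENDS_ON -> [DTO:StripeWebhookPayload]     # shared schema: no local [DEF]

# Hypothetical lint helper: accepts only well-formed external prefixes,
# rejecting hallucinated [DEF] anchors for code that lives outside the repo.
_EXTERNAL_REF = re.compile(r"\[(EXT:\w+:\w+|DTO:\w+)\]")

def is_valid_external_ref(ref: str) -> bool:
    """True for [EXT:Package:Module] or [DTO:Name]; False for anything else."""
    return _EXTERNAL_REF.fullmatch(ref) is not None
```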
II. TEST MARKUP ECONOMY (NOISE REDUCTION)
To prevent overwhelming the Semantic Graph, test files operate under relaxed complexity rules:
- Short IDs: Test modules MUST use concise IDs (e.g., `[DEF:PaymentTests:Module]`), not full file paths.
- Root Binding (`BINDS_TO`): Do NOT map the internal call graph of a test file. Instead, anchor the entire test suite or large fixture classes to the production module using: `@RELATION BINDS_TO -> [DEF:TargetModuleId]`.
- Complexity 1 for Helpers: Small test utilities (e.g., `_setup_mock`, `_build_payload`) are C1. They require ONLY `[DEF]...[/DEF]` anchors. No `@PURPOSE` or `@RELATION` allowed.
- Complexity 2 for Tests: Actual test functions (e.g., `test_invalid_auth`) are C2. They require `[DEF]...[/DEF]` and `@PURPOSE`. Do not add `@PRE`/`@POST` to individual test functions.
III. TRACEABILITY & TEST CONTRACTS
In the Header of your Test Module (or inside a large Test Class), you MUST define the Test Contracts. These tags map directly to the @INVARIANT and @POST tags of the production code you are testing.
- `@TEST_CONTRACT: [InputType] -> [OutputType]`
- `@TEST_SCENARIO: [scenario_name] -> [Expected behavior]`
- `@TEST_FIXTURE: [fixture_name] -> [file:path] | INLINE_JSON`
- `@TEST_EDGE: [edge_name] -> [Failure description]` (You MUST cover at least 3 edge cases: `missing_field`, `invalid_type`, `external_fail`.)
- The Traceability Link: `@TEST_INVARIANT: [Invariant_Name_From_Source] -> VERIFIED_BY: [scenario_1, edge_name_2]`
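A sketch of a complete test-module header under these rules (the module, scenario, and invariant names are hypothetical illustrations, not prescribed values):

```python
# [DEF:PaymentTests:Module]
# @PURPOSE Prove the @POST and @INVARIANT guarantees of [DEF:Payments:Charge].
# @RELATION BINDS_TO -> [DEF:Payments:Charge]
#
# @TEST_CONTRACT: ChargeRequest -> ChargeResult
# @TEST_SCENARIO: happy_path -> Valid card charges and returns a receipt id
# @TEST_FIXTURE: charge_request -> INLINE_JSON
# @TEST_EDGE: missing_field -> Payload without "amount" is rejected with 422
# @TEST_EDGE: invalid_type -> Non-numeric "amount" is rejected with 422
# @TEST_EDGE: external_fail -> Gateway timeout surfaces as 502, never a silent retry
# @TEST_INVARIANT: Charge_Is_Idempotent -> VERIFIED_BY: [happy_path, external_fail]
```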
IV. PYTHON TESTING STACK
Use pytest as the primary test framework. Follow these conventions:
- Test files: Named `test_*.py`, placed in a `tests/` directory mirroring the source tree.
- Fixtures: Use `@pytest.fixture` for test setup. Prefer `conftest.py` for shared fixtures.
- Mocking: Use `unittest.mock` (standard library) for mocking `[EXT:...]` boundaries. Use `pytest-mock` (`mocker` fixture) when available.
- Parametrization: Use `@pytest.mark.parametrize` for table-driven tests covering edge cases.
- Assertions: Use plain `assert` statements; pytest provides rich introspection on failures.
Example — C1 test helper:
from typing import Any

# [DEF:_build_payload:Function]
def _build_payload(**overrides: Any) -> dict:
    base = {"name": "test", "value": 42}
    return {**base, **overrides}
# [/DEF:_build_payload:Function]
Example — C2 test function:
# [DEF:test_create_user_success:Function]
# @PURPOSE Verify that a valid payload creates a user and returns 201 with the user DTO.
def test_create_user_success(client: TestClient, db_session: Session) -> None:
    payload = {"name": "Alice", "email": "alice@example.com"}
    response = client.post("/api/users", json=payload)
    assert response.status_code == 201
    assert response.json()["name"] == "Alice"
    assert db_session.query(User).count() == 1
# [/DEF:test_create_user_success:Function]
Example — Parametrized edge cases:
import pytest

# [DEF:test_create_user_validation_edges:Function]
# @PURPOSE Cover validation edge cases for user creation: missing fields and invalid types.
@pytest.mark.parametrize("payload,expected_status,expected_detail", [
    ({"email": "a@b.com"}, 422, "missing_field"),
    ({"name": "A", "email": "not-an-email"}, 422, "invalid_type"),
])
def test_create_user_validation_edges(
    client: TestClient,
    payload: dict,
    expected_status: int,
    expected_detail: str,
) -> None:
    response = client.post("/api/users", json=payload)
    assert response.status_code == expected_status
    assert expected_detail in str(response.json())
# [/DEF:test_create_user_validation_edges:Function]
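The parametrized table above covers `missing_field` and `invalid_type`; the third required edge, `external_fail`, needs a mocked `[EXT:...]` boundary. A hedged sketch follows, in which the gateway, `send_welcome_email`, and its `@POST` guarantee are hypothetical stand-ins inlined so the example is self-contained:

```python
from unittest import mock

# Hypothetical production code under test (normally imported; inlined here so
# the sketch is self-contained). The @POST is the guarantee being verified.
class GatewayTimeout(Exception):
    pass

def send_welcome_email(gateway, user_email: str) -> str:
    """@POST On gateway timeout, returns "queued_retry" instead of raising."""
    try:
        gateway.send(to=user_email, template="welcome")
        return "sent"
    except GatewayTimeout:
        return "queued_retry"

# [DEF:test_send_welcome_email_external_fail:Function]
# @PURPOSE external_fail edge: the [EXT:...] gateway times out, yet the @POST holds.
def test_send_welcome_email_external_fail() -> None:
    gateway = mock.Mock()                        # mock ONLY the external boundary,
    gateway.send.side_effect = GatewayTimeout()  # never the local [DEF] under test
    assert send_welcome_email(gateway, "a@b.com") == "queued_retry"
# [/DEF:test_send_welcome_email_external_fail:Function]
```

Note that only the external gateway is mocked; the local `send_welcome_email` node runs for real, as Section VI requires.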
V. ADR REGRESSION DEFENSE
The Architectural Decision Records (ADRs) and `@REJECTED` tags in production code are constraints.
If the production `[DEF]` has a `@REJECTED [Forbidden_Path]` tag (e.g., `@REJECTED fallback to SQLite`), your Test Module MUST contain an explicit `@TEST_EDGE` scenario proving that the forbidden path is physically unreachable or throws an appropriate error.
Tests are the enforcers of architectural memory.
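A sketch of such an enforcer for the SQLite example above. The guard, the `RejectedPathError` type, and the "ADR-012" identifier are hypothetical stand-ins for a real production `[DEF]` carrying the `@REJECTED` tag:

```python
# Hypothetical production guard (normally imported; inlined for self-containment).
class RejectedPathError(RuntimeError):
    pass

def make_engine(url: str) -> dict:
    """@REJECTED fallback to SQLite (hypothetical ADR-012: Postgres only)."""
    if url.startswith("sqlite"):
        raise RejectedPathError("ADR-012: SQLite fallback is a rejected path")
    return {"url": url}  # stand-in for a real engine object

# [DEF:test_sqlite_fallback_unreachable:Function]
# @PURPOSE @TEST_EDGE rejected_sqlite_fallback: the forbidden path must throw.
def test_sqlite_fallback_unreachable() -> None:
    try:
        make_engine("sqlite:///fallback.db")
    except RejectedPathError:
        return  # forbidden path correctly blocked
    raise AssertionError("@REJECTED path was reachable: SQLite engine was built")
# [/DEF:test_sqlite_fallback_unreachable:Function]
```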
VI. ANTI-TAUTOLOGY RULES
- No Logic Mirrors: Use deterministic, hardcoded fixtures (`@TEST_FIXTURE`) for expected results. Do not dynamically calculate `expected = a + b` to test an `add(a, b)` function.
- Do Not Mock the System Under Test: You may mock `[EXT:...]` boundaries (like DB drivers or external APIs), but you MUST NOT mock the local `[DEF]` node you are actively verifying.
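The contrast can be made concrete with a minimal sketch (the trivial `add` function is a hypothetical stand-in for real production logic):

```python
def add(a: int, b: int) -> int:
    """Trivial stand-in for the production function under test."""
    return a + b

# BAD -- logic mirror: expected is recomputed with the same algorithm as the
# source, so this test is a tautology and can never fail.
def test_add_tautology() -> None:
    a, b = 2, 3
    assert add(a, b) == a + b

# GOOD -- hardcoded fixtures (@TEST_FIXTURE, INLINE_JSON style): the expected
# values were decided independently of the implementation.
ADD_FIXTURES = [((2, 3), 5), ((-1, 1), 0), ((10, -4), 6)]

def test_add_against_fixtures() -> None:
    for (a, b), expected in ADD_FIXTURES:
        assert add(a, b) == expected
```

If `add` were broken to return `a - b`, the tautology would still pass while the fixture-based test would fail, which is exactly the asymmetry this rule exists to create.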
VII. VERIFIABLE HARNESS RULES
For agentic development, a test harness is part of the task environment.
- Prefer real executable checks over narrative claims that a change is safe.
- Verify that the harness actually fails on the broken state and passes on the fixed state whenever feasible.
- Resist shortcut tests that bypass the real integration boundary the task is supposed to validate.
- When a production `@POST` guarantee is subtle, add the narrowest test that can falsify it.
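The "fails on the broken state, passes on the fixed state" check can itself be sketched in a few lines (all names here are hypothetical; the deliberately broken variant exists only to validate the harness):

```python
# Hypothetical self-check: before trusting the harness, prove it actually fails
# on a known-broken state and passes on the fixed state.
def normalize(s: str) -> str:
    """Fixed implementation. @POST: result is stripped AND lowercased."""
    return s.strip().lower()

def broken_normalize(s: str) -> str:
    """Deliberately broken variant, used only to validate the harness."""
    return s.strip()  # forgets to lowercase

def harness_upholds_post(fn) -> bool:
    """Narrowest falsifier for the @POST guarantee."""
    return fn("  MiXeD  ") == "mixed"
```

A harness that returned `True` for both variants would be narrative reassurance, not a verifier.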
VIII. LONG-HORIZON QA MEMORY
When multiple attempts are needed:
- Preserve the smallest set of failing fixtures, commands, and invariant mappings that explain the current gap.
- Fold older failed attempts into one bounded note describing what was tried and why it was rejected.
- Do not keep extending the active QA transcript with redundant command output.
IX. TESTING SEARCH DISCIPLINE
- Use one concrete failing hypothesis plus one verifier by default.
- Add alternative test strategies only when the first verifier is inconclusive.
- Do not mirror the implementation logic to fabricate expected values; use fixtures, explicit contracts, and invariant-oriented assertions.
X. PYTEST CONVENTIONS & COMMAND EXAMPLES
# Run all tests
pytest
# Run a specific test module
pytest tests/test_users.py
# Run with coverage report
pytest --cov=src --cov-report=term-missing
# Run only tests matching a keyword
pytest -k "create_user"
# Run with verbose output, no output capture, stopping on first failure
pytest -xvs
[SYSTEM: END OF TESTING DIRECTIVE. ENFORCE STRICT TRACEABILITY.]