semantic cleanup

2026-05-08 10:07:05 +03:00
parent 505864438e
commit d8df1fff59
90 changed files with 148541 additions and 2251 deletions
--- a/.kilo/workflows/read_semantic.md
+++ b/.kilo/workflows/read_semantic.md
@@ -0,0 +1,4 @@
+---
+description: USE SEMANTIC
+---
+Прочитай .ai/standards/semantics.md. ОБЯЗАТЕЛЬНО используй его при разработке
--- a/.kilo/workflows/speckit.analyze.md
+++ b/.kilo/workflows/speckit.analyze.md
@@ -0,0 +1,212 @@
+---
+description: Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation.
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Goal
+
+Identify inconsistencies, duplications, ambiguities, underspecified items, and decision-memory drift across the core artifacts (`spec.md`, `plan.md`, `tasks.md`, and ADR sources) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
+
+## Operating Constraints
+
+**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).
+
+**Constitution Authority**: The project constitution (`.ai/standards/constitution.md`) is **non-negotiable** within this analysis scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, or tasks—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.analyze`.
+
+## Execution Steps
+
+### 1. Initialize Analysis Context
+
+Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
+
+- SPEC = FEATURE_DIR/spec.md
+- PLAN = FEATURE_DIR/plan.md
+- TASKS = FEATURE_DIR/tasks.md
+- ADR = `docs/architecture.md` and/or feature-local decision files when present
+
+Abort with an error message if any required file is missing (instruct the user to run missing prerequisite command).
+For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+### 2. Load Artifacts (Progressive Disclosure)
+
+Load only the minimal necessary context from each artifact:
+
+**From `spec.md`:**
+
+- Overview/Context
+- Functional Requirements
+- Non-Functional Requirements
+- User Stories
+- Edge Cases (if present)
+
+**From `plan.md`:**
+
+- Architecture/stack choices
+- Data Model references
+- Phases
+- Technical constraints
+- ADR references or emitted decisions
+
+**From `tasks.md`:**
+
+- Task IDs
+- Descriptions
+- Phase grouping
+- Parallel markers [P]
+- Referenced file paths
+- Guardrail summaries derived from `@RATIONALE` / `@REJECTED`
+
+**From ADR sources:**
+
+- `[DEF:id:ADR]` nodes
+- `@RATIONALE`
+- `@REJECTED`
+- `@RELATION`
+
+**From constitution:**
+
+- Load `.ai/standards/constitution.md` for principle validation
+- Load `.ai/standards/semantics.md` for technical standard validation
+
+### 3. Build Semantic Models
+
+Create internal representations (do not include raw artifacts in output):
+
+- **Requirements inventory**: Each functional + non-functional requirement with a stable key (derive slug based on imperative phrase; e.g., "User can upload file" → `user-can-upload-file`)
+- **User story/action inventory**: Discrete user actions with acceptance criteria
+- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
+- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
+- **Decision-memory inventory**: ADR ids, accepted paths, rejected paths, and the tasks/contracts expected to inherit them
+
+### 4. Detection Passes (Token-Efficient Analysis)
+
+Focus on high-signal findings. Limit to 50 findings total; aggregate remainder in overflow summary.
+
+#### A. Duplication Detection
+
+- Identify near-duplicate requirements
+- Mark lower-quality phrasing for consolidation
+
+#### B. Ambiguity Detection
+
+- Flag vague adjectives (fast, scalable, secure, intuitive, robust) lacking measurable criteria
+- Flag unresolved placeholders (TODO, TKTK, ???, `<placeholder>`, etc.)
+
+#### C. Underspecification
+
+- Requirements with verbs but missing object or measurable outcome
+- User stories missing acceptance criteria alignment
+- Tasks referencing files or components not defined in spec/plan
+
+#### D. Constitution Alignment
+
+- Any requirement or plan element conflicting with a MUST principle
+- Missing mandated sections or quality gates from constitution
+
+#### E. Coverage Gaps
+
+- Requirements with zero associated tasks
+- Tasks with no mapped requirement/story
+- Non-functional requirements not reflected in tasks (e.g., performance, security)
+
+#### F. Inconsistency
+
+- Terminology drift (same concept named differently across files)
+- Data entities referenced in plan but absent in spec (or vice versa)
+- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without dependency note)
+- Conflicting requirements (e.g., one requires Next.js while other specifies Vue)
+
+#### G. Decision-Memory Drift
+
+- ADR exists in planning but has no downstream task guardrail
+- Task carries a guardrail with no upstream ADR or plan rationale
+- Task text accidentally schedules an ADR-rejected path
+- Missing preventive `@RATIONALE` / `@REJECTED` summaries for known traps
+- Rejected-path notes that contradict later plan or task language without explicit decision revision
+
+### 5. Severity Assignment
+
+Use this heuristic to prioritize findings:
+
+- **CRITICAL**: Violates constitution MUST, missing core spec artifact, missing blocking ADR, rejected path scheduled as work, or requirement with zero coverage that blocks baseline functionality
+- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion, ADR guardrail drift
+- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case, incomplete decision-memory propagation
+- **LOW**: Style/wording improvements, minor redundancy not affecting execution order
+
+### 6. Produce Compact Analysis Report
+
+Output a Markdown report (no file writes) with the following structure:
+
+## Specification Analysis Report
+
+| ID | Category | Severity | Location(s) | Summary | Recommendation |
+|----|----------|----------|-------------|---------|----------------|
+| A1 | Duplication | HIGH | spec.md:L120-134 | Two similar requirements ... | Merge phrasing; keep clearer version |
+
+(Add one row per finding; generate stable IDs prefixed by category initial.)
+
+**Coverage Summary Table:**
+
+| Requirement Key | Has Task? | Task IDs | Notes |
+|-----------------|-----------|----------|-------|
+
+**Decision Memory Summary Table:**
+
+| ADR / Guardrail | Present in Plan | Propagated to Tasks | Rejected Path Protected | Notes |
+|-----------------|-----------------|---------------------|-------------------------|-------|
+
+**Constitution Alignment Issues:** (if any)
+
+**Unmapped Tasks:** (if any)
+
+**Metrics:**
+
+- Total Requirements
+- Total Tasks
+- Coverage % (requirements with >=1 task)
+- Ambiguity Count
+- Duplication Count
+- Critical Issues Count
+- ADR Count
+- Guardrail Drift Count
+
+### 7. Provide Next Actions
+
+At end of report, output a concise Next Actions block:
+
+- If CRITICAL issues exist: Recommend resolving before `/speckit.implement`
+- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
+- Provide explicit command suggestions: e.g., "Run /speckit.specify with refinement", "Run /speckit.plan to adjust architecture", "Manually edit tasks.md to add coverage for 'performance-metrics'"
+
+### 8. Offer Remediation
+
+Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)
+
+## Operating Principles
+
+### Context Efficiency
+
+- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
+- **Progressive disclosure**: Load artifacts incrementally; don't dump all content into analysis
+- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
+- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts
+
+### Analysis Guidelines
+
+- **NEVER modify files** (this is read-only analysis)
+- **NEVER hallucinate missing sections** (if absent, report them accurately)
+- **Prioritize constitution violations** (these are always CRITICAL)
+- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
+- **Report zero issues gracefully** (emit success report with coverage statistics)
+- **Treat missing ADR propagation as a real defect, not a documentation nit**
+
+## Context
+
+$ARGUMENTS
--- a/.kilo/workflows/speckit.checklist.md
+++ b/.kilo/workflows/speckit.checklist.md
@@ -0,0 +1,317 @@
+---
+description: Generate a custom checklist for the current feature based on user requirements.
+---
+
+## Checklist Purpose: "Unit Tests for English"
+
+**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, completeness, and decision-memory readiness of requirements in a given domain.
+
+**NOT for verification/testing**:
+
+- ❌ NOT "Verify the button clicks correctly"
+- ❌ NOT "Test error handling works"
+- ❌ NOT "Confirm the API returns 200"
+- ❌ NOT checking if code/implementation matches the spec
+
+**FOR requirements quality validation**:
+
+- ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
+- ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
+- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
+- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
+- ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
+- ✅ "Do repo-shaping choices have explicit rationale and rejected alternatives before task decomposition?" (decision memory)
+
+**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Execution Steps
+
+1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
+   - All file paths must be absolute.
+   - For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
+   - Be generated from the user's phrasing + extracted signals from spec/plan/tasks
+   - Only ask about information that materially changes checklist content
+   - Be skipped individually if already unambiguous in `$ARGUMENTS`
+   - Prefer precision over breadth
+
+   Generation algorithm:
+   1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
+   2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
+   3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
+   4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria, decision-memory needs.
+   5. Formulate questions chosen from these archetypes:
+      - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
+      - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
+      - Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
+      - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
+      - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
+      - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
+      - Decision-memory gap (e.g., "Do we need explicit ADR and rejected-path checks for this feature?")
+
+   Question formatting rules:
+   - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
+   - Limit to A–E options maximum; omit table if a free-form answer is clearer
+   - Never ask the user to restate what they already said
+   - Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."
+
+   Defaults when interaction impossible:
+   - Depth: Standard
+   - Audience: Reviewer (PR) if code-related; Author otherwise
+   - Focus: Top 2 relevance clusters
+
+   Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow‑ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more.
+
+3. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers:
+   - Derive checklist theme (e.g., security, review, deploy, ux)
+   - Consolidate explicit must-have items mentioned by user
+   - Map focus selections to category scaffolding
+   - Infer any missing context from spec/plan/tasks (do NOT hallucinate)
+
+4. **Load feature context**: Read from FEATURE_DIR:
+   - `spec.md`: Feature requirements and scope
+   - `plan.md` (if exists): Technical details, dependencies, ADR references
+   - `tasks.md` (if exists): Implementation tasks and inherited guardrails
+   - ADR artifacts (if present): `[DEF:id:ADR]`, `@RATIONALE`, `@REJECTED`
+
+   **Context Loading Strategy**:
+   - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
+   - Prefer summarizing long sections into concise scenario/requirement bullets
+   - Use progressive disclosure: add follow-on retrieval only if gaps detected
+   - If source docs are large, generate interim summary items instead of embedding raw text
+
+5. **Generate checklist** - Create "Unit Tests for Requirements":
+   - Create `FEATURE_DIR/checklists/` directory if it doesn't exist
+   - Generate unique checklist filename:
+     - Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
+     - Format: `[domain].md`
+     - If file exists, append to existing file
+   - Number items sequentially starting from CHK001
+   - Each `/speckit.checklist` run creates a NEW file (never overwrites existing checklists)
+
+   **CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
+   Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
+   - **Completeness**: Are all necessary requirements present?
+   - **Clarity**: Are requirements unambiguous and specific?
+   - **Consistency**: Do requirements align with each other?
+   - **Measurability**: Can requirements be objectively verified?
+   - **Coverage**: Are all scenarios/edge cases addressed?
+   - **Decision Memory**: Are durable choices and rejected alternatives explicit before implementation starts?
+
+   **Category Structure** - Group items by requirement quality dimensions:
+   - **Requirement Completeness** (Are all necessary requirements documented?)
+   - **Requirement Clarity** (Are requirements specific and unambiguous?)
+   - **Requirement Consistency** (Do requirements align without conflicts?)
+   - **Acceptance Criteria Quality** (Are success criteria measurable?)
+   - **Scenario Coverage** (Are all flows/cases addressed?)
+   - **Edge Case Coverage** (Are boundary conditions defined?)
+   - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
+   - **Dependencies & Assumptions** (Are they documented and validated?)
+   - **Decision Memory & ADRs** (Are architectural choices, rationale, and rejected paths explicit?)
+   - **Ambiguities & Conflicts** (What needs clarification?)
+
+   **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
+
+   ❌ **WRONG** (Testing implementation):
+   - "Verify landing page displays 3 episode cards"
+   - "Test hover states work on desktop"
+   - "Confirm logo click navigates home"
+
+   ✅ **CORRECT** (Testing requirements quality):
+   - "Are the exact number and layout of featured episodes specified?" [Completeness]
+   - "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
+   - "Are hover state requirements consistent across all interactive elements?" [Consistency]
+   - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
+   - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
+   - "Are blocking architecture decisions recorded with explicit rationale and rejected alternatives before task generation?" [Decision Memory]
+   - "Does the plan make clear which implementation shortcuts are forbidden for this feature?" [Decision Memory, Gap]
+
+   **ITEM STRUCTURE**:
+   Each item should follow this pattern:
+   - Question format asking about requirement quality
+   - Focus on what's WRITTEN (or not written) in the spec/plan
+   - Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
+   - Reference spec section `[Spec §X.Y]` when checking existing requirements
+   - Use `[Gap]` marker when checking for missing requirements
+
+   **EXAMPLES BY QUALITY DIMENSION**:
+
+   Completeness:
+   - "Are error handling requirements defined for all API failure modes? [Gap]"
+   - "Are accessibility requirements specified for all interactive elements? [Completeness]"
+   - "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
+
+   Clarity:
+   - "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
+   - "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
+   - "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"
+
+   Consistency:
+   - "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
+   - "Are card component requirements consistent between landing and detail pages? [Consistency]"
+
+   Coverage:
+   - "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
+   - "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
+   - "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
+
+   Measurability:
+   - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
+   - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"
+
+   Decision Memory:
+   - "Do all repo-shaping technical choices have explicit rationale before tasks are generated? [Decision Memory, Plan]"
+   - "Are rejected alternatives documented for architectural branches that would materially change implementation scope? [Decision Memory, Gap]"
+   - "Can a coder determine from the planning artifacts which tempting shortcut is forbidden? [Decision Memory, Clarity]"
+
+   **Scenario Classification & Coverage** (Requirements Quality Focus):
+   - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
+   - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
+   - If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
+   - Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
+
+   **Traceability Requirements**:
+   - MINIMUM: ≥80% of items MUST include at least one traceability reference
+   - Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`, `[ADR]`
+   - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
+
+   **Surface & Resolve Issues** (Requirements Quality Problems):
+   Ask questions about the requirements themselves:
+   - Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
+   - Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
+   - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
+   - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
+   - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
+   - Decision-memory drift: "Do tasks inherit the same rejected-path guardrails defined in planning? [Decision Memory, Conflict]"
+
+   **Content Consolidation**:
+   - Soft cap: If raw candidate items > 40, prioritize by risk/impact
+   - Merge near-duplicates checking the same requirement aspect
+   - If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"
+
+   **🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
+   - ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
+   - ❌ References to code execution, user actions, system behavior
+   - ❌ "Displays correctly", "works properly", "functions as expected"
+   - ❌ "Click", "navigate", "render", "load", "execute"
+   - ❌ Test cases, test plans, QA procedures
+   - ❌ Implementation details (frameworks, APIs, algorithms) unless the checklist is asking whether those decisions were explicitly documented and bounded by rationale/rejected alternatives
+
+   **✅ REQUIRED PATTERNS** - These test requirements quality:
+   - ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
+   - ✅ "Is [vague term] quantified/clarified with specific criteria?"
+   - ✅ "Are requirements consistent between [section A] and [section B]?"
+   - ✅ "Can [requirement] be objectively measured/verified?"
+   - ✅ "Are [edge cases/scenarios] addressed in requirements?"
+   - ✅ "Does the spec define [missing aspect]?"
+   - ✅ "Does the plan record why [accepted path] was chosen and why [rejected path] is forbidden?"
+
+6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.
+
+7. **Report**: Output full path to created checklist, item count, and remind user that each run creates a new file. Summarize:
+   - Focus areas selected
+   - Depth level
+   - Actor/timing
+   - Any explicit user-specified must-have items incorporated
+   - Whether ADR / decision-memory checks were included
+
+**Important**: Each `/speckit.checklist` command invocation creates a checklist file using short, descriptive names unless file already exists. This allows:
+
+- Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`)
+- Simple, memorable filenames that indicate checklist purpose
+- Easy identification and navigation in the `checklists/` folder
+
+To avoid clutter, use descriptive types and clean up obsolete checklists when done.
+
+## Example Checklist Types & Sample Items
+
+**UX Requirements Quality:** `ux.md`
+
+Sample items (testing the requirements, NOT the implementation):
+
+- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
+- "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
+- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
+- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
+- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
+- "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"
+
+**API Requirements Quality:** `api.md`
+
+Sample items:
+
+- "Are error response formats specified for all failure scenarios? [Completeness]"
+- "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
+- "Are authentication requirements consistent across all endpoints? [Consistency]"
+- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
+- "Is versioning strategy documented in requirements? [Gap]"
+
+**Performance Requirements Quality:** `performance.md`
+
+Sample items:
+
+- "Are performance requirements quantified with specific metrics? [Clarity]"
+- "Are performance targets defined for all critical user journeys? [Coverage]"
+- "Are performance requirements under different load conditions specified? [Completeness]"
+- "Can performance requirements be objectively measured? [Measurability]"
+- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
+
+**Security Requirements Quality:** `security.md`
+
+Sample items:
+
+- "Are authentication requirements specified for all protected resources? [Coverage]"
+- "Are data protection requirements defined for sensitive information? [Completeness]"
+- "Is the threat model documented and requirements aligned to it? [Traceability]"
+- "Are security requirements consistent with compliance obligations? [Consistency]"
+- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
+
+**Architecture Decision Quality:** `architecture.md`
+
+Sample items:
+
+- "Do all repo-shaping architecture choices have explicit rationale before tasks are generated? [Decision Memory]"
+- "Are rejected alternatives documented for each blocking technology branch? [Decision Memory, Gap]"
+- "Can an implementer tell which shortcuts are forbidden without re-reading research artifacts? [Clarity, ADR]"
+- "Are ADR decisions traceable to requirements or constraints in the spec? [Traceability, ADR]"
+
+## Anti-Examples: What NOT To Do
+
+**❌ WRONG - These test implementation, not requirements:**
+
+```markdown
+- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
+- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
+- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
+- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
+```
+
+**✅ CORRECT - These test requirements quality:**
+
+```markdown
+- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
+- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
+- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
+- [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
+- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
+- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
+- [ ] CHK007 - Do planning artifacts state why the accepted architecture was chosen and which alternative is rejected? [Decision Memory, ADR]
+```
+
+**Key Differences:**
+
+- Wrong: Tests if the system works correctly
+- Correct: Tests if the requirements are written correctly
+- Wrong: Verification of behavior
+- Correct: Validation of requirement quality
+- Wrong: "Does it do X?"
+- Correct: "Is X clearly specified?"
--- a/.kilo/workflows/speckit.clarify.md
+++ b/.kilo/workflows/speckit.clarify.md
@@ -0,0 +1,181 @@
+---
+description: Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec.
+handoffs: 
+  - label: Build Technical Plan
+    agent: speckit.plan
+    prompt: Create a plan for the spec. I am building with...
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.
+
+Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit.plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.
+
+Execution steps:
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
+   - `FEATURE_DIR`
+   - `FEATURE_SPEC`
+   - (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
+   - If JSON parsing fails, abort and instruct user to re-run `/speckit.specify` or verify feature branch environment.
+   - For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. Load the current spec file. Perform a structured ambiguity & coverage scan using this taxonomy. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output raw map unless no questions will be asked).
+
+   Functional Scope & Behavior:
+   - Core user goals & success criteria
+   - Explicit out-of-scope declarations
+   - User roles / personas differentiation
+
+   Domain & Data Model:
+   - Entities, attributes, relationships
+   - Identity & uniqueness rules
+   - Lifecycle/state transitions
+   - Data volume / scale assumptions
+
+   Interaction & UX Flow:
+   - Critical user journeys / sequences
+   - Error/empty/loading states
+   - Accessibility or localization notes
+
+   Non-Functional Quality Attributes:
+   - Performance (latency, throughput targets)
+   - Scalability (horizontal/vertical, limits)
+   - Reliability & availability (uptime, recovery expectations)
+   - Observability (logging, metrics, tracing signals)
+   - Security & privacy (authN/Z, data protection, threat assumptions)
+   - Compliance / regulatory constraints (if any)
+
+   Integration & External Dependencies:
+   - External services/APIs and failure modes
+   - Data import/export formats
+   - Protocol/versioning assumptions
+
+   Edge Cases & Failure Handling:
+   - Negative scenarios
+   - Rate limiting / throttling
+   - Conflict resolution (e.g., concurrent edits)
+
+   Constraints & Tradeoffs:
+   - Technical constraints (language, storage, hosting)
+   - Explicit tradeoffs or rejected alternatives
+
+   Terminology & Consistency:
+   - Canonical glossary terms
+   - Avoided synonyms / deprecated terms
+
+   Completion Signals:
+   - Acceptance criteria testability
+   - Measurable Definition of Done style indicators
+
+   Misc / Placeholders:
+   - TODO markers / unresolved decisions
+   - Ambiguous adjectives ("robust", "intuitive") lacking quantification
+
+   For each category with Partial or Missing status, add a candidate question opportunity unless:
+   - Clarification would not materially change implementation or validation strategy
+   - Information is better deferred to planning phase (note internally)
+
+3. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints:
+    - Maximum of 10 total questions across the whole session.
+    - Each question must be answerable with EITHER:
+       - A short multiple‑choice selection (2–5 distinct, mutually exclusive options), OR
+       - A one-word / short‑phrase answer (explicitly constrain: "Answer in <=5 words").
+    - Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation.
+    - Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved.
+    - Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness).
+    - Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests.
+    - If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic.
+
+4. Sequential questioning loop (interactive):
+    - Present EXACTLY ONE question at a time.
+    - For multiple‑choice questions:
+       - **Analyze all options** and determine the **most suitable option** based on:
+          - Best practices for the project type
+          - Common patterns in similar implementations
+          - Risk reduction (security, performance, maintainability)
+          - Alignment with any explicit project goals or constraints visible in the spec
+       - Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice).
+       - Format as: `**Recommended:** Option [X] - <reasoning>`
+       - Then render all options as a Markdown table:
+
+       | Option | Description |
+       |--------|-------------|
+       | A | <Option A description> |
+       | B | <Option B description> |
+       | C | <Option C description> (add D/E as needed up to 5) |
+       | Short | Provide a different short answer (<=5 words) (Include only if free-form alternative is appropriate) |
+
+       - After the table, add: `You can reply with the option letter (e.g., "A"), accept the recommendation by saying "yes" or "recommended", or provide your own short answer.`
+    - For short‑answer style (no meaningful discrete options):
+       - Provide your **suggested answer** based on best practices and context.
+       - Format as: `**Suggested:** <your proposed answer> - <brief reasoning>`
+       - Then output: `Format: Short answer (<=5 words). You can accept the suggestion by saying "yes" or "suggested", or provide your own answer.`
+    - After the user answers:
+       - If the user replies with "yes", "recommended", or "suggested", use your previously stated recommendation/suggestion as the answer.
+       - Otherwise, validate the answer maps to one option or fits the <=5 word constraint.
+       - If ambiguous, ask for a quick disambiguation (count still belongs to same question; do not advance).
+       - Once satisfactory, record it in working memory (do not yet write to disk) and move to the next queued question.
+    - Stop asking further questions when:
+       - All critical ambiguities resolved early (remaining queued items become unnecessary), OR
+       - User signals completion ("done", "good", "no more"), OR
+       - You reach 5 asked questions.
+    - Never reveal future queued questions in advance.
+    - If no valid questions exist at start, immediately report no critical ambiguities.
+
+5. Integration after EACH accepted answer (incremental update approach):
+    - Maintain in-memory representation of the spec (loaded once at start) plus the raw file contents.
+    - For the first integrated answer in this session:
+       - Ensure a `## Clarifications` section exists (create it just after the highest-level contextual/overview section per the spec template if missing).
+       - Under it, create (if not present) a `### Session YYYY-MM-DD` subheading for today.
+    - Append a bullet line immediately after acceptance: `- Q: <question> → A: <final answer>`.
+    - Then immediately apply the clarification to the most appropriate section(s):
+       - Functional ambiguity → Update or add a bullet in Functional Requirements.
+       - User interaction / actor distinction → Update User Stories or Actors subsection (if present) with clarified role, constraint, or scenario.
+       - Data shape / entities → Update Data Model (add fields, types, relationships) preserving ordering; note added constraints succinctly.
+       - Non-functional constraint → Add/modify measurable criteria in Non-Functional / Quality Attributes section (convert vague adjective to metric or explicit target).
+       - Edge case / negative flow → Add a new bullet under Edge Cases / Error Handling (or create such subsection if template provides placeholder for it).
+       - Terminology conflict → Normalize term across spec; retain original only if necessary by adding `(formerly referred to as "X")` once.
+    - If the clarification invalidates an earlier ambiguous statement, replace that statement instead of duplicating; leave no obsolete contradictory text.
+    - Save the spec file AFTER each integration to minimize risk of context loss (atomic overwrite).
+    - Preserve formatting: do not reorder unrelated sections; keep heading hierarchy intact.
+    - Keep each inserted clarification minimal and testable (avoid narrative drift).
+
+6. Validation (performed after EACH write plus final pass):
+   - Clarifications session contains exactly one bullet per accepted answer (no duplicates).
+   - Total asked (accepted) questions ≤ 5.
+   - Updated sections contain no lingering vague placeholders the new answer was meant to resolve.
+   - No contradictory earlier statement remains (scan for now-invalid alternative choices removed).
+   - Markdown structure valid; only allowed new headings: `## Clarifications`, `### Session YYYY-MM-DD`.
+   - Terminology consistency: same canonical term used across all updated sections.
+
+7. Write the updated spec back to `FEATURE_SPEC`.
+
+8. Report completion (after questioning loop ends or early termination):
+   - Number of questions asked & answered.
+   - Path to updated spec.
+   - Sections touched (list names).
+   - Coverage summary table listing each taxonomy category with Status: Resolved (was Partial/Missing and addressed), Deferred (exceeds question quota or better suited for planning), Clear (already sufficient), Outstanding (still Partial/Missing but low impact).
+   - If any Outstanding or Deferred remain, recommend whether to proceed to `/speckit.plan` or run `/speckit.clarify` again later post-plan.
+   - Suggested next command.
+
+Behavior rules:
+
+- If no meaningful ambiguities found (or all potential questions would be low-impact), respond: "No critical ambiguities detected worth formal clarification." and suggest proceeding.
+- If spec file missing, instruct user to run `/speckit.specify` first (do not create a new spec here).
+- Never exceed 5 total asked questions (clarification retries for a single question do not count as new questions).
+- Avoid speculative tech stack questions unless the absence blocks functional clarity.
+- Respect user early termination signals ("stop", "done", "proceed").
+- If no questions asked due to full coverage, output a compact coverage summary (all categories Clear) then suggest advancing.
+- If quota reached with unresolved high-impact categories remaining, explicitly flag them under Deferred with rationale.
+
+Context for prioritization: $ARGUMENTS
--- a/.kilo/workflows/speckit.constitution.md
+++ b/.kilo/workflows/speckit.constitution.md
@@ -0,0 +1,82 @@
+---
+description: Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync.
+handoffs: 
+  - label: Build Specification
+    agent: speckit.specify
+    prompt: Implement the feature specification based on the updated constitution. I want to build...
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+You are updating the project constitution at `.ai/standards/constitution.md`. This file is a TEMPLATE containing placeholder tokens in square brackets (e.g. `[PROJECT_NAME]`, `[PRINCIPLE_1_NAME]`). Your job is to (a) collect/derive concrete values, (b) fill the template precisely, and (c) propagate any amendments across dependent artifacts.
+
+Follow this execution flow:
+
+1. Load the existing constitution template at `.ai/standards/constitution.md`.
+   - Identify every placeholder token of the form `[ALL_CAPS_IDENTIFIER]`.
+   **IMPORTANT**: The user might require less or more principles than the ones used in the template. If a number is specified, respect that - follow the general template. You will update the doc accordingly.
+
+2. Collect/derive values for placeholders:
+   - If user input (conversation) supplies a value, use it.
+   - Otherwise infer from existing repo context (README, docs, prior constitution versions if embedded).
+   - For governance dates: `RATIFICATION_DATE` is the original adoption date (if unknown ask or mark TODO), `LAST_AMENDED_DATE` is today if changes are made, otherwise keep previous.
+   - `CONSTITUTION_VERSION` must increment according to semantic versioning rules:
+     - MAJOR: Backward incompatible governance/principle removals or redefinitions.
+     - MINOR: New principle/section added or materially expanded guidance.
+     - PATCH: Clarifications, wording, typo fixes, non-semantic refinements.
+   - If version bump type ambiguous, propose reasoning before finalizing.
+
+3. Draft the updated constitution content:
+   - Replace every placeholder with concrete text (no bracketed tokens left except intentionally retained template slots that the project has chosen not to define yet—explicitly justify any left).
+   - Preserve heading hierarchy and comments can be removed once replaced unless they still add clarifying guidance.
+   - Ensure each Principle section: succinct name line, paragraph (or bullet list) capturing non‑negotiable rules, explicit rationale if not obvious.
+   - Ensure Governance section lists amendment procedure, versioning policy, and compliance review expectations.
+
+4. Consistency propagation checklist (convert prior checklist into active validations):
+   - Read `.specify/templates/plan-template.md` and ensure any "Constitution Check" or rules align with updated principles.
+   - Read `.specify/templates/spec-template.md` for scope/requirements alignment—update if constitution adds/removes mandatory sections or constraints.
+   - Read `.specify/templates/tasks-template.md` and ensure task categorization reflects new or removed principle-driven task types (e.g., observability, versioning, testing discipline).
+   - Read each command file in `.specify/templates/commands/*.md` (including this one) to verify no outdated references (agent-specific names like CLAUDE only) remain when generic guidance is required.
+   - Read any runtime guidance docs (e.g., `README.md`, `docs/quickstart.md`, or agent-specific guidance files if present). Update references to principles changed.
+
+5. Produce a Sync Impact Report (prepend as an HTML comment at top of the constitution file after update):
+   - Version change: old → new
+   - List of modified principles (old title → new title if renamed)
+   - Added sections
+   - Removed sections
+   - Templates requiring updates (✅ updated / ⚠ pending) with file paths
+   - Follow-up TODOs if any placeholders intentionally deferred.
+
+6. Validation before final output:
+   - No remaining unexplained bracket tokens.
+   - Version line matches report.
+   - Dates ISO format YYYY-MM-DD.
+   - Principles are declarative, testable, and free of vague language ("should" → replace with MUST/SHOULD rationale where appropriate).
+
+7. Write the completed constitution back to `.ai/standards/constitution.md` (overwrite).
+
+8. Output a final summary to the user with:
+   - New version and bump rationale.
+   - Any files flagged for manual follow-up.
+   - Suggested commit message (e.g., `docs: amend constitution to vX.Y.Z (principle additions + governance update)`).
+
+Formatting & Style Requirements:
+
+- Use Markdown headings exactly as in the template (do not demote/promote levels).
+- Wrap long rationale lines to keep readability (<100 chars ideally) but do not hard enforce with awkward breaks.
+- Keep a single blank line between sections.
+- Avoid trailing whitespace.
+
+If the user supplies partial updates (e.g., only one principle revision), still perform validation and version decision steps.
+
+If critical info missing (e.g., ratification date truly unknown), insert `TODO(<FIELD_NAME>): explanation` and include in the Sync Impact Report under deferred items.
+
+Do not create a new template; always operate on the existing `.ai/standards/constitution.md` file.
--- a/.kilo/workflows/speckit.implement.md
+++ b/.kilo/workflows/speckit.implement.md
@@ -0,0 +1,203 @@
+---
+description: Execute the implementation plan by processing and executing all tasks defined in tasks.md
+handoffs:
+  - label: Audit & Verify (Tester)
+    agent: tester
+    prompt: Perform semantic audit, algorithm emulation, and unit test verification for the completed tasks.
+    send: true
+  - label: Orchestration Control
+    agent: orchestrator
+    prompt: Review Tester's feedback and coordinate next steps.
+    send: true
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Check checklists status** (if FEATURE_DIR/checklists/ exists):
+   - Scan all checklist files in the checklists/ directory
+   - For each checklist, count:
+     - Total items: All lines matching `- [ ]` or `- [X]` or `- [x]`
+     - Completed items: Lines matching `- [X]` or `- [x]`
+     - Incomplete items: Lines matching `- [ ]`
+   - Create a status table:
+
+     ```text
+     | Checklist | Total | Completed | Incomplete | Status |
+     |-----------|-------|-----------|------------|--------|
+     | ux.md     | 12    | 12        | 0          | ✓ PASS |
+     | test.md   | 8     | 5         | 3          | ✗ FAIL |
+     | security.md | 6   | 6         | 0          | ✓ PASS |
+     ```
+
+   - Calculate overall status:
+     - **PASS**: All checklists have 0 incomplete items
+     - **FAIL**: One or more checklists have incomplete items
+
+   - **If any checklist is incomplete**:
+     - Display the table with incomplete item counts
+     - **STOP** and ask: "Some checklists are incomplete. Do you want to proceed with implementation anyway? (yes/no)"
+     - Wait for user response before continuing
+     - If user says "no" or "wait" or "stop", halt execution
+     - If user says "yes" or "proceed" or "continue", proceed to step 3
+
+   - **If all checklists are complete**:
+     - Display the table showing all checklists passed
+     - Automatically proceed to step 3
+
+3. Load and analyze the implementation context:
+   - **REQUIRED**: Read `.ai/standards/semantics.md` for strict coding standards and contract requirements
+   - **REQUIRED**: Read `tasks.md` for the complete task list and execution plan
+   - **REQUIRED**: Read `plan.md` for tech stack, architecture, and file structure
+   - **REQUIRED IF PRESENT**: Read ADR artifacts containing `[DEF:id:ADR]` nodes and build a blocked-path inventory from `@REJECTED`
+   - **IF EXISTS**: Read `data-model.md` for entities and relationships
+   - **IF EXISTS**: Read `contracts/` for API specifications and test requirements
+   - **IF EXISTS**: Read `research.md` for technical decisions and constraints
+   - **IF EXISTS**: Read `quickstart.md` for integration scenarios
+
+4. **Project Setup Verification**:
+   - **REQUIRED**: Create/verify ignore files based on actual project setup:
+
+   **Detection & Creation Logic**:
+   - Check if the following command succeeds to determine if the repository is a git repo (create/verify `.gitignore` if so):
+
+     ```sh
+     git rev-parse --git-dir 2>/dev/null
+     ```
+
+   - Check if Dockerfile* exists or Docker in `plan.md` → create/verify `.dockerignore`
+   - Check if `.eslintrc*` exists → create/verify `.eslintignore`
+   - Check if `eslint.config.*` exists → ensure the config's `ignores` entries cover required patterns
+   - Check if `.prettierrc*` exists → create/verify `.prettierignore`
+   - Check if `.npmrc` or `package.json` exists → create/verify `.npmignore` (if publishing)
+   - Check if terraform files (`*.tf`) exist → create/verify `.terraformignore`
+   - Check if `.helmignore` needed (helm charts present) → create/verify `.helmignore`
+
+   **If ignore file already exists**: Verify it contains essential patterns, append missing critical patterns only
+   **If ignore file missing**: Create with full pattern set for detected technology
+
+   **Common Patterns by Technology** (from `plan.md` tech stack):
+   - **Node.js/JavaScript/TypeScript**: `node_modules/`, `dist/`, `build/`, `*.log`, `.env*`
+   - **Python**: `__pycache__/`, `*.pyc`, `.venv/`, `venv/`, `dist/`, `*.egg-info/`
+   - **Java**: `target/`, `*.class`, `*.jar`, `.gradle/`, `build/`
+   - **C#/.NET**: `bin/`, `obj/`, `*.user`, `*.suo`, `packages/`
+   - **Go**: `*.exe`, `*.test`, `vendor/`, `*.out`
+   - **Ruby**: `.bundle/`, `log/`, `tmp/`, `*.gem`, `vendor/bundle/`
+   - **PHP**: `vendor/`, `*.log`, `*.cache`, `*.env`
+   - **Rust**: `target/`, `debug/`, `release/`, `*.rs.bk`, `*.rlib`, `*.prof*`, `.idea/`, `*.log`, `.env*`
+   - **Kotlin**: `build/`, `out/`, `.gradle/`, `.idea/`, `*.class`, `*.jar`, `*.iml`, `*.log`, `.env*`
+   - **C++**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.so`, `*.a`, `*.exe`, `*.dll`, `.idea/`, `*.log`, `.env*`
+   - **C**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.a`, `*.so`, `*.exe`, `Makefile`, `config.log`, `.idea/`, `*.log`, `.env*`
+   - **Swift**: `.build/`, `DerivedData/`, `*.swiftpm/`, `Packages/`
+   - **R**: `.Rproj.user/`, `.Rhistory`, `.RData`, `.Ruserdata`, `*.Rproj`, `packrat/`, `renv/`
+   - **Universal**: `.DS_Store`, `Thumbs.db`, `*.tmp`, `*.swp`, `.vscode/`, `.idea/`
+
+   **Tool-Specific Patterns**:
+   - **Docker**: `node_modules/`, `.git/`, `Dockerfile*`, `.dockerignore`, `*.log*`, `.env*`, `coverage/`
+   - **ESLint**: `node_modules/`, `dist/`, `build/`, `coverage/`, `*.min.js`
+   - **Prettier**: `node_modules/`, `dist/`, `build/`, `coverage/`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`
+   - **Terraform**: `.terraform/`, `*.tfstate*`, `*.tfvars`, `.terraform.lock.hcl`
+   - **Kubernetes/k8s**: `*.secret.yaml`, `secrets/`, `.kube/`, `kubeconfig*`, `*.key`, `*.crt`
+
+5. Parse `tasks.md` structure and extract:
+   - **Task phases**: Setup, Tests, Core, Integration, Polish
+   - **Task dependencies**: Sequential vs parallel execution rules
+   - **Task details**: ID, description, file paths, parallel markers [P]
+   - **Execution flow**: Order and dependency requirements
+   - **Decision-memory requirements**: which tasks inherit ADR ids, `@RATIONALE`, and `@REJECTED` guardrails
+
+6. Execute implementation following the task plan:
+   - **Phase-by-phase execution**: Complete each phase before moving to the next
+   - **Respect dependencies**: Run sequential tasks in order, parallel tasks [P] can run together  
+   - **Follow TDD approach**: Execute test tasks before their corresponding implementation tasks
+   - **File-based coordination**: Tasks affecting the same files must run sequentially
+   - **Validation checkpoints**: Verify each phase completion before proceeding
+   - **ADR guardrail discipline**: if a task packet or local contract forbids a path via `@REJECTED`, do not treat it as an implementation option
+
+7. Implementation execution rules:
+   - **Strict Adherence**: Apply `.ai/standards/semantics.md` rules:
+     - Every file MUST start with a `[DEF:id:Type]` header and end with a matching closing `[/DEF:id:Type]` anchor.
+     - Use `@COMPLEXITY` / `@C:` as the primary control tag; treat `@TIER` only as legacy compatibility metadata.
+     - Contract density MUST match effective complexity from [`.ai/standards/semantics.md`](.ai/standards/semantics.md):
+       - Complexity 1: anchors only.
+       - Complexity 2: require `@PURPOSE`.
+       - Complexity 3: require `@PURPOSE` and `@RELATION`.
+       - Complexity 4: require `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`.
+       - Complexity 5: require full level-4 contract plus `@DATA_CONTRACT` and `@INVARIANT`.
+     - For Python Complexity 4+ modules, implementation MUST include a meaningful semantic logging path using `logger.reason()` and `logger.reflect()`.
+     - For Python Complexity 5 modules, `belief_scope(...)` is mandatory and the critical path must be irrigated with `logger.reason()` / `logger.reflect()` according to the contract.
+     - For Svelte components, require `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`; runes-only reactivity is allowed (`$state`, `$derived`, `$effect`, `$props`).
+     - Reject pseudo-semantic markup: docstrings containing loose `@PURPOSE` / `@PRE` text do **NOT** satisfy the protocol unless represented in canonical anchored metadata blocks.
+     - Preserve and propagate decision-memory tags. Upstream `@RATIONALE` / `@REJECTED` are mandatory when carried by the task packet or contract.
+     - If `logger.explore()` or equivalent runtime evidence leads to a retained workaround, mutate the same contract header with reactive Micro-ADR tags: `@RATIONALE` and `@REJECTED`.
+   - **Self-Audit**: The Coder MUST use `axiom-core` tools (like `audit_contracts_tool`) to verify semantic compliance before completion.
+   - **Semantic Rejection Gate**: If self-audit reveals broken anchors, missing closing tags, missing required metadata for the effective complexity, orphaned critical classes/functions, Complexity 4/5 Python code without required belief-state logging, or retained workarounds without decision-memory tags, the task is NOT complete and cannot be handed off as accepted work.
+   - **CRITICAL Contracts**: If a task description contains a contract summary (e.g., `CRITICAL: PRE: ..., POST: ...`), these constraints are **MANDATORY** and must be strictly implemented in the code using guards/assertions (if applicable per protocol).
+   - **Setup first**: Initialize project structure, dependencies, configuration
+   - **Tests before code**: If you need to write tests for contracts, entities, and integration scenarios
+   - **Core development**: Implement models, services, CLI commands, endpoints
+   - **Integration work**: Database connections, middleware, logging, external services
+   - **Polish and validation**: Unit tests, performance optimization, documentation
+
+8. Progress tracking and error handling:
+   - Report progress after each completed task.
+   - Halt execution if any non-parallel task fails.
+   - For parallel tasks [P], continue with successful tasks, report failed ones.
+   - Provide clear error messages with context for debugging.
+   - Suggest next steps if implementation cannot proceed.
+   - **IMPORTANT** For completed tasks, mark as [X] only AFTER local verification and self-audit.
+   - If blocked because the only apparent fix is listed in upstream `@REJECTED`, escalate for decision revision instead of silently overriding the guardrail.
+
+9. **Handoff to Tester (Audit Loop)**:
+   - Once a task or phase is complete, the Coder hands off to the Tester.
+   - Handoff includes: file paths, declared complexity, expected contracts (`@PRE`, `@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, `@INVARIANT` when applicable), and a short logic overview.
+   - Handoff MUST explicitly disclose any contract exceptions or known semantic debt. Hidden semantic debt is forbidden.
+   - Handoff MUST disclose decision-memory changes: inherited ADR ids, new or updated `@RATIONALE`, new or updated `@REJECTED`, and any blocked paths that remain active.
+   - The handoff payload MUST instruct the Tester to execute the dedicated testing workflow [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md), not just perform an informal review.
+
+10. **Tester Verification & Orchestrator Gate**:
+   - Tester MUST:
+     - Explicitly run the [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md) workflow as the verification procedure for the delivered implementation batch.
+     - Perform mandatory semantic audit (using `audit_contracts_tool`).
+     - Reject code that only imitates the protocol superficially, such as free-form docstrings with `@PURPOSE` text but without canonical `[DEF]...[/DEF]` anchors and header metadata.
+     - Verify that effective complexity and required metadata match [`.ai/standards/semantics.md`](.ai/standards/semantics.md).
+     - Verify that Python Complexity 4/5 implementations include required belief-state instrumentation (`belief_scope`, `logger.reason()`, `logger.reflect()`).
+     - Verify that upstream rejected paths were not silently restored.
+     - Emulate algorithms "in mind" step-by-step to ensure logic consistency.
+     - Verify unit tests match the declared contracts.
+   - If Tester finds issues:
+     - Emit `[AUDIT_FAIL: semantic_noncompliance | contract_mismatch | logic_mismatch | test_mismatch | speckit_test_not_run | rejected_path_regression]`.
+     - Provide concrete file-path-based reasons, for example: missing anchors, module/class contract mismatch, missing `@DATA_CONTRACT`, missing `logger.reason()`, illegal docstring-only annotations, missing decision-memory tags, re-enabled upstream rejected path, or missing execution of [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md).
+     - Notify the Orchestrator.
+     - Orchestrator redirects the feedback to the Coder for remediation.
+   - Orchestrator green-status rule:
+     - The Orchestrator MUST NOT assign green/accepted status unless the Tester confirms that [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md) was executed.
+     - Missing execution evidence for [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md) is an automatic gate failure even if the Tester verbally reports that the code "looks fine".
+   - Acceptance (Final mark [X]):
+     - Only after the Tester is satisfied with semantics, emulation, and tests.
+     - Any semantic audit warning relevant to touched files blocks acceptance until remediated or explicitly waived by the user.
+     - No final green status is allowed without explicit confirmation that [`.kilocode/workflows/speckit.test.md`](.kilocode/workflows/speckit.test.md) was run.
+
+11. Completion validation:
+    - Verify all required tasks are completed and accepted by the Tester.
+    - Check that implemented features match the original specification.
+    - Confirm the implementation follows the technical plan and GRACE standards.
+    - Confirm touched files do not contain protocol-invalid patterns such as:
+      - class/function-level docstring contracts standing in for canonical anchors,
+      - missing closing anchors,
+      - missing required metadata for declared complexity,
+      - Complexity 5 repository/service code using only `belief_scope(...)` without explicit `logger.reason()` / `logger.reflect()` checkpoints,
+      - retained workarounds missing local `@RATIONALE` / `@REJECTED`,
+      - silent resurrection of paths already blocked by upstream ADR or task guardrails.
+    - Report final status with summary of completed and audited work.
+
+Note: This command assumes a complete task breakdown exists in `tasks.md`. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.
--- a/.kilo/workflows/speckit.plan.md
+++ b/.kilo/workflows/speckit.plan.md
@@ -0,0 +1,139 @@
+---
+description: Execute the implementation planning workflow using the plan template to generate design artifacts.
+handoffs: 
+  - label: Create Tasks
+    agent: speckit.tasks
+    prompt: Break the plan into tasks
+    send: true
+  - label: Create Checklist
+    agent: speckit.checklist
+    prompt: Create a checklist for the following domain...
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+1. **Setup**: Run `.specify/scripts/bash/setup-plan.sh --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Load context**: Read `.ai/ROOT.md` and `.ai/PROJECT_MAP.md` to understand the project structure and navigation. Then read required standards: `.ai/standards/constitution.md` and `.ai/standards/semantics.md`. Load IMPL_PLAN template.
+
+3. **Execute plan workflow**: Follow the structure in IMPL_PLAN template to:
+   - Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
+   - Fill Constitution Check section from constitution
+   - Evaluate gates (ERROR if violations unjustified)
+   - Phase 0: Generate `research.md` (resolve all NEEDS CLARIFICATION)
+   - Phase 1: Generate `data-model.md`, `contracts/`, `quickstart.md`
+   - Phase 1: Generate global ADR artifacts and connect them to the plan
+   - Phase 1: Update agent context by running the agent script
+   - Re-evaluate Constitution Check post-design
+
+4. **Stop and report**: Command ends after Phase 2 planning. Report branch, IMPL_PLAN path, generated artifacts, and ADR decisions created.
+
+## Phases
+
+### Phase 0: Outline & Research
+
+1. **Extract unknowns from Technical Context** above:
+   - For each NEEDS CLARIFICATION → research task
+   - For each dependency → best practices task
+   - For each integration → patterns task
+
+2. **Generate and dispatch research agents**:
+
+   ```text
+   For each unknown in Technical Context:
+     Task: "Research {unknown} for {feature context}"
+   For each technology choice:
+     Task: "Find best practices for {tech} in {domain}"
+   ```
+
+3. **Consolidate findings** in `research.md` using format:
+   - Decision: [what was chosen]
+   - Rationale: [why chosen]
+   - Alternatives considered: [what else evaluated]
+
+**Output**: `research.md` with all NEEDS CLARIFICATION resolved
+
+### Phase 1: Design, ADRs & Contracts
+
+**Prerequisites:** `research.md` complete
+
+0. **Validate Design against UX Reference**:
+   - Check if the proposed architecture supports the latency, interactivity, and flow defined in `ux_reference.md`.
+   - **Linkage**: Ensure key UI states from `ux_reference.md` map to Component Contracts (`@UX_STATE`).
+   - **CRITICAL**: If the technical plan compromises the UX (e.g. "We can't do real-time validation"), you **MUST STOP** and warn the user.
+
+1. **Extract entities from feature spec** → `data-model.md`:
+   - Entity name, fields, relationships, validation rules.
+
+2. **Generate Global ADRs (Decision Memory Root Layer)**:
+   - Read `spec.md`, `research.md`, and the technical context to identify repo-shaping decisions: storage, auth pattern, framework boundaries, integration patterns, deployment assumptions, failure strategy.
+   - For each durable architectural choice, emit a standalone semantic ADR block using `[DEF:DecisionId:ADR]`.
+   - Every ADR block MUST include:
+     - `@COMPLEXITY: 3` or `4` depending on blast radius
+     - `@PURPOSE`
+     - `@RATIONALE`
+     - `@REJECTED`
+     - `@RELATION` back to the originating spec/research/plan boundary or target module family
+   - Preferred destinations:
+     - `docs/architecture.md` for cross-cutting repository decisions
+     - feature-local design docs when the decision is feature-scoped
+     - root module headers only when the decision scope is truly local
+   - **Hard Gate**: do not continue to task decomposition until the blocking global decisions have been materialized as ADR nodes.
+   - **Anti-Regression Goal**: a later orchestrator must be able to read these ADRs and avoid creating tasks for rejected branches.
+
+3. **Design & Verify Contracts (Semantic Protocol)**:
+   - **Drafting**: Define semantic headers, metadata, and closing anchors for all new modules strictly from `.ai/standards/semantics.md`.
+   - **Complexity Classification**: Classify each contract with `@COMPLEXITY: [1|2|3|4|5]` or `@C:`. Treat `@TIER` only as a legacy compatibility hint and never as the primary rule source.
+   - **Adaptive Contract Requirements**:
+     - **Complexity 1**: anchors only; `@PURPOSE` optional.
+     - **Complexity 2**: require `@PURPOSE`.
+     - **Complexity 3**: require `@PURPOSE` and `@RELATION`; UI also requires `@UX_STATE`.
+     - **Complexity 4**: require `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`; Python modules must define a meaningful `logger.reason()` / `logger.reflect()` path or equivalent belief-state mechanism.
+     - **Complexity 5**: require full level-4 contract plus `@DATA_CONTRACT` and `@INVARIANT`; Python modules must require `belief_scope`; UI modules must define UX contracts including `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY`, and `@UX_REACTIVITY`.
+   - **Decision-Memory Propagation**:
+     - If a module/function/component realizes or is constrained by an ADR, add local `@RATIONALE` and `@REJECTED` guardrails before coding begins.
+     - Use `@RELATION: IMPLEMENTS ->[AdrId]` when the contract realizes the ADR.
+     - Use `@RELATION: DEPENDS_ON ->[AdrId]` when the contract is merely constrained by the ADR.
+     - Record known LLM traps directly in the contract header so the implementer inherits the guardrail from the start.
+   - **Relation Syntax**: Write dependency edges in canonical GraphRAG form: `@RELATION: [PREDICATE] ->[TARGET_ID]`.
+   - **Context Guard**: If a target relation, DTO, required dependency, or decision rationale cannot be named confidently, stop generation and emit `[NEED_CONTEXT: target]` instead of inventing placeholders.
+   - **Testing Contracts**: Add `@TEST_CONTRACT`, `@TEST_SCENARIO`, `@TEST_FIXTURE`, `@TEST_EDGE`, and `@TEST_INVARIANT` when the design introduces audit-critical or explicitly test-governed contracts, especially for Complexity 5 boundaries.
+   - **Self-Review**:
+     - *Complexity Fit*: Does each contract include exactly the metadata and contract density required by its complexity level?
+     - *Completeness*: Do `@PRE`/`@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, UX tags, and decision-memory tags cover the edge cases identified in Research and UX Reference?
+     - *Connectivity*: Do `@RELATION` tags form a coherent graph using canonical `@RELATION: [PREDICATE] ->[TARGET_ID]` syntax?
+     - *Compliance*: Are all anchors properly opened and closed, and does the chosen comment syntax match the target medium?
+     - *Belief-State Requirements*: Do Complexity 4/5 Python modules explicitly account for `logger.reason()`, `logger.reflect()`, and `belief_scope` requirements?
+     - *ADR Continuity*: Does every blocking architectural decision have a corresponding ADR node and at least one downstream guarded contract?
+   - **Output**: Write verified contracts to `contracts/modules.md`.
+
+4. **Simulate Contract Usage**:
+   - Trace one key user scenario through the defined contracts to ensure data flow continuity.
+   - If a contract interface mismatch is found, fix it immediately.
+   - Verify that no traced path accidentally realizes an alternative already named in any ADR `@REJECTED` tag.
+
+5. **Generate API contracts**:
+   - Output OpenAPI/GraphQL schema to `/contracts/` for backend-frontend sync.
+
+6. **Agent context update**:
+   - Run `.specify/scripts/bash/update-agent-context.sh kilocode`
+   - These scripts detect which AI agent is in use
+   - Update the appropriate agent-specific context file
+   - Add only new technology from current plan
+   - Preserve manual additions between markers
+
+**Output**: `data-model.md`, `/contracts/*`, `quickstart.md`, ADR artifact(s), agent-specific file
+
+## Key rules
+
+- Use absolute paths
+- ERROR on gate failures or unresolved clarifications
+- Do not hand off to [`speckit.tasks`](.kilocode/workflows/speckit.tasks.md) until blocking ADRs exist and rejected branches are explicit
--- a/.kilo/workflows/speckit.semantics.md
+++ b/.kilo/workflows/speckit.semantics.md
@@ -0,0 +1,92 @@
+---
+description: Maintain semantic integrity by generating maps and auditing compliance reports.
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Goal
+
+Ensure the codebase adheres to the semantic standards defined in `.ai/standards/semantics.md` by using the AXIOM MCP semantic graph as the primary execution engine. This involves reindexing the workspace, measuring semantic health, auditing contract compliance, auditing decision-memory continuity, and optionally delegating contract-safe fixes through MCP-aware agents.
+
+## Operating Constraints
+
+1. **ROLE: Orchestrator**: You are responsible for the high-level coordination of semantic maintenance.
+2. **MCP-FIRST**: Use the connected AXIOM MCP server as the default mechanism for discovery, health checks, audit, semantic context, impact analysis, and contract mutation planning.
+3. **STRICT ADHERENCE**: Follow `.ai/standards/semantics.md` for all anchor and tag syntax.
+4. **NON-DESTRUCTIVE**: Do not remove existing code logic; only add or update semantic annotations.
+5. **TIER AWARENESS**: Prioritize CRITICAL and STANDARD modules for compliance fixes.
+6. **NO PSEUDO-CONTRACTS (CRITICAL)**: You are STRICTLY FORBIDDEN from using automated scripts (e.g., Python/Bash/sed) to mechanically inject boilerplate, placeholders, or "pseudo-contracts" merely to artificially inflate the compliance score. Every semantic tag, anchor, and contract you add MUST reflect a genuine, deep understanding of the code's actual logic and business requirements.
+7. **ID NAMING (CRITICAL)**: NEVER use fully-qualified Python import paths in `[DEF:id:Type]`. Use short, domain-driven semantic IDs (e.g., `[DEF:AuthService:Class]`). Follow the exact style shown in `.ai/standards/semantics.md`.
+8. **ORPHAN PREVENTION**: To reduce the orphan count, you MUST physically wrap actual class and function definitions with `[DEF:id:Type] ... [/DEF]` blocks in the code. Modifying `@RELATION` tags does NOT fix orphans. The AST parser flags any unwrapped function as an orphan.
+   - **Exception for Tests**: In test modules, use `BINDS_TO` to link major helpers to the module root. Small helpers remain C1 and don't need relations.
+9. **DECISION-MEMORY CONTINUITY**: Audit ADR nodes, preventive task guardrails, and reactive Micro-ADR tags as one anti-regression chain. Missing or contradictory `@RATIONALE` / `@REJECTED` is a first-class semantic defect.
+
+## Execution Steps
+
+### 1. Reindex Semantic Workspace
+
+Use MCP to refresh the semantic graph for the current workspace with [`reindex_workspace_tool`](.kilo/mcp.json).
+
+### 2. Analyze Semantic Health
+
+Use [`workspace_semantic_health_tool`](.kilo/mcp.json) and capture:
+- `contracts`
+- `relations`
+- `orphans`
+- `unresolved_relations`
+- `files`
+
+Treat high orphan counts and unresolved relations as first-class health indicators, not just informational noise.
+
+### 3. Audit Critical Issues
+
+Use [`audit_contracts_tool`](.kilo/mcp.json) and classify findings into:
+- **Critical Parsing/Structure Errors**: malformed or incoherent semantic contract regions
+- **Critical Contract Gaps**: missing [`@DATA_CONTRACT`](.ai/standards/semantics.md), [`@PRE`](.ai/standards/semantics.md), [`@POST`](.ai/standards/semantics.md), [`@SIDE_EFFECT`](.ai/standards/semantics.md) on CRITICAL contracts
+- **Decision-Memory Gaps**:
+  - missing standalone `[DEF:id:ADR]` for repo-shaping decisions
+  - missing `@RATIONALE` / `@REJECTED` where task or implementation context clearly requires guardrails
+  - retained workaround code without local reactive Micro-ADR tags
+  - implementation that silently re-enables a path declared in upstream `@REJECTED`
+- **Coverage Gaps**: missing [`@TIER`](.ai/standards/semantics.md), missing [`@PURPOSE`](.ai/standards/semantics.md)
+- **Graph Breakages**: unresolved relations, broken references, isolated critical contracts, ADR nodes without downstream guarded contracts
+
+### 4. Build Remediation Context
+
+For the top failing contracts, use MCP semantic context tools such as [`get_semantic_context_tool`](.kilo/mcp.json), [`build_task_context_tool`](.kilo/mcp.json), [`impact_analysis_tool`](.kilo/mcp.json), and [`trace_tests_for_contract_tool`](.kilo/mcp.json) to understand:
+1. Local contract intent
+2. Upstream/downstream semantic impact
+3. Related tests and fixtures
+4. Whether relation recovery is needed
+5. Whether decision-memory continuity is broken between ADR, task contract, and implementation
+
+### 5. Execute Fixes (Optional/Handoff)
+
+If $ARGUMENTS contains `fix` or `apply`:
+- Handoff to the [`semantic`](.kilocodemodes) mode or a dedicated implementation agent instead of applying naive textual edits in orchestration.
+- Require the fixing agent to prefer MCP contract mutation tools such as [`simulate_patch_tool`](.kilo/mcp.json), [`guarded_patch_contract_tool`](.kilo/mcp.json), [`patch_contract_tool`](.kilo/mcp.json), and [`infer_missing_relations_tool`](.kilo/mcp.json).
+- Require the fixing agent to preserve or restore `@RATIONALE` / `@REJECTED` continuity whenever blocked-path knowledge exists.
+- After changes, re-run reindex, health, and audit MCP steps to verify the delta.
+
+### 6. Review Gate
+
+Before completion, request or perform an MCP-based review path aligned with the [`reviewer-agent-auditor`](.kilocodemodes) mode so the workflow produces a semantic PASS/FAIL gate, not just a remediation list.
+
+## Output
+
+Provide a summary of the semantic state:
+- **Health Metrics**: contracts / relations / orphans / unresolved_relations / files
+- **Status**: [PASS/FAIL] (FAIL if CRITICAL gaps, rejected-path regressions, or semantically significant unresolved relations exist)
+- **Top Issues**: List top 3-5 contracts or files needing attention.
+- **Decision Memory**: summarize missing ADRs, missing guardrails, and rejected-path regression risks.
+- **Action Taken**: Summary of MCP analysis performed, context gathered, and fixes or handoffs initiated.
+
+## Context
+
+$ARGUMENTS
--- a/.kilo/workflows/speckit.specify.md
+++ b/.kilo/workflows/speckit.specify.md
@@ -0,0 +1,279 @@
+---
+description: Create or update the feature specification from a natural language feature description.
+handoffs: 
+  - label: Build Technical Plan
+    agent: speckit.plan
+    prompt: Create a plan for the spec. I am building with...
+  - label: Clarify Spec Requirements
+    agent: speckit.clarify
+    prompt: Clarify specification requirements
+    send: true
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+The text the user typed after `/speckit.specify` in the triggering message **is** the feature description. Assume you always have it available in this conversation even if `$ARGUMENTS` appears literally below. Do not ask the user to repeat it unless they provided an empty command.
+
+Given that feature description, do this:
+
+1. **Generate a concise short name** (2-4 words) for the branch:
+   - Analyze the feature description and extract the most meaningful keywords
+   - Create a 2-4 word short name that captures the essence of the feature
+   - Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
+   - Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
+   - Keep it concise but descriptive enough to understand the feature at a glance
+   - Examples:
+     - "I want to add user authentication" → "user-auth"
+     - "Implement OAuth2 integration for the API" → "oauth2-api-integration"
+     - "Create a dashboard for analytics" → "analytics-dashboard"
+     - "Fix payment processing timeout bug" → "fix-payment-timeout"
+
+2. **Check for existing branches before creating new one**:
+
+   a. First, fetch all remote branches to ensure we have the latest information:
+
+      ```bash
+      git fetch --all --prune
+      ```
+
+   b. Find the highest feature number across all sources for the short-name:
+      - Remote branches: `git ls-remote --heads origin | grep -E 'refs/heads/[0-9]+-<short-name>$'`
+      - Local branches: `git branch | grep -E '^[* ]*[0-9]+-<short-name>$'`
+      - Specs directories: Check for directories matching `specs/[0-9]+-<short-name>`
+
+   c. Determine the next available number:
+      - Extract all numbers from all three sources
+      - Find the highest number N
+      - Use N+1 for the new branch number
+
+   d. Run the script `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS"` with the calculated number and short-name:
+      - Pass `--number N+1` and `--short-name "your-short-name"` along with the feature description
+      - Bash example: `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS" --json --number 5 --short-name "user-auth" "Add user authentication"`
+      - PowerShell example: `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS" -Json -Number 5 -ShortName "user-auth" "Add user authentication"`
+
+   **IMPORTANT**:
+   - Check all three sources (remote branches, local branches, specs directories) to find the highest number
+   - Only match branches/directories with the exact short-name pattern
+   - If no existing branches/directories found with this short-name, start with number 1
+   - You must only ever run this script once per feature
+   - The JSON is provided in the terminal as output - always refer to it to get the actual content you're looking for
+   - The JSON output will contain BRANCH_NAME and SPEC_FILE paths
+   - For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot")
+
+3. Load `.specify/templates/spec-template.md` to understand required sections.
+
+4. **Generate UX Reference**:
+   
+   a. Load `.specify/templates/ux-reference-template.md`.
+   
+   b. **Design the User Experience**:
+      - **Imagine you are the user**: Visualize the interface and interaction.
+      - **Persona**: Define who is using this.
+      - **Happy Path**: Write the story of the perfect interaction.
+      - **Mockups**: Create concrete CLI text blocks or UI descriptions.
+      - **Errors**: Define how the system guides the user out of failure.
+   
+   c. Write the `ux_reference.md` file in the feature directory.
+   
+   d. **CRITICAL**: This UX Reference is now the source of truth for the "feel" of the feature. The technical spec MUST support this experience.
+
+5. Follow this execution flow:
+
+    1. Parse user description from Input
+       If empty: ERROR "No feature description provided"
+    2. Extract key concepts from description
+       Identify: actors, actions, data, constraints
+    3. For unclear aspects:
+       - Make informed guesses based on context and industry standards
+       - Only mark with [NEEDS CLARIFICATION: specific question] if:
+         - The choice significantly impacts feature scope or user experience
+         - Multiple reasonable interpretations exist with different implications
+         - No reasonable default exists
+       - **LIMIT: Maximum 3 [NEEDS CLARIFICATION] markers total**
+       - Prioritize clarifications by impact: scope > security/privacy > user experience > technical details
+    4. Fill User Scenarios & Testing section
+       If no clear user flow: ERROR "Cannot determine user scenarios"
+    5. Generate Functional Requirements
+       Each requirement must be testable
+       Use reasonable defaults for unspecified details (document assumptions in Assumptions section)
+    6. Define Success Criteria
+       Create measurable, technology-agnostic outcomes
+       Include both quantitative metrics (time, performance, volume) and qualitative measures (user satisfaction, task completion)
+       Each criterion must be verifiable without implementation details
+    7. Identify Key Entities (if data involved)
+    8. Return: SUCCESS (spec ready for planning)
+
+5. Write the specification to SPEC_FILE using the template structure, replacing placeholders with concrete details derived from the feature description (arguments) while preserving section order and headings.
+
+6. **Specification Quality Validation**: After writing the initial spec, validate it against quality criteria:
+
+   a. **Create Spec Quality Checklist**: Generate a checklist file at `FEATURE_DIR/checklists/requirements.md` using the checklist template structure with these validation items:
+
+      ```markdown
+      # Specification Quality Checklist: [FEATURE NAME]
+      
+      **Purpose**: Validate specification completeness and quality before proceeding to planning
+      **Created**: [DATE]
+      **Feature**: [Link to spec.md]
+      
+      ## Content Quality
+      
+      - [ ] No implementation details (languages, frameworks, APIs)
+      - [ ] Focused on user value and business needs
+      - [ ] Written for non-technical stakeholders
+      - [ ] All mandatory sections completed
+
+      ## UX Consistency
+      
+      - [ ] Functional requirements fully support the 'Happy Path' in ux_reference.md
+      - [ ] Error handling requirements match the 'Error Experience' in ux_reference.md
+      - [ ] No requirements contradict the defined User Persona or Context
+      
+      ## Requirement Completeness
+      
+      - [ ] No [NEEDS CLARIFICATION] markers remain
+      - [ ] Requirements are testable and unambiguous
+      - [ ] Success criteria are measurable
+      - [ ] Success criteria are technology-agnostic (no implementation details)
+      - [ ] All acceptance scenarios are defined
+      - [ ] Edge cases are identified
+      - [ ] Scope is clearly bounded
+      - [ ] Dependencies and assumptions identified
+      
+      ## Feature Readiness
+      
+      - [ ] All functional requirements have clear acceptance criteria
+      - [ ] User scenarios cover primary flows
+      - [ ] Feature meets measurable outcomes defined in Success Criteria
+      - [ ] No implementation details leak into specification
+      
+      ## Notes
+      
+      - Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`
+      ```
+
+   b. **Run Validation Check**: Review the spec against each checklist item:
+      - For each item, determine if it passes or fails
+      - Document specific issues found (quote relevant spec sections)
+
+   c. **Handle Validation Results**:
+
+      - **If all items pass**: Mark checklist complete and proceed to step 6
+
+      - **If items fail (excluding [NEEDS CLARIFICATION])**:
+        1. List the failing items and specific issues
+        2. Update the spec to address each issue
+        3. Re-run validation until all items pass (max 3 iterations)
+        4. If still failing after 3 iterations, document remaining issues in checklist notes and warn user
+
+      - **If [NEEDS CLARIFICATION] markers remain**:
+        1. Extract all [NEEDS CLARIFICATION: ...] markers from the spec
+        2. **LIMIT CHECK**: If more than 3 markers exist, keep only the 3 most critical (by scope/security/UX impact) and make informed guesses for the rest
+        3. For each clarification needed (max 3), present options to user in this format:
+
+           ```markdown
+           ## Question [N]: [Topic]
+           
+           **Context**: [Quote relevant spec section]
+           
+           **What we need to know**: [Specific question from NEEDS CLARIFICATION marker]
+           
+           **Suggested Answers**:
+           
+           | Option | Answer | Implications |
+           |--------|--------|--------------|
+           | A      | [First suggested answer] | [What this means for the feature] |
+           | B      | [Second suggested answer] | [What this means for the feature] |
+           | C      | [Third suggested answer] | [What this means for the feature] |
+           | Custom | Provide your own answer | [Explain how to provide custom input] |
+           
+           **Your choice**: _[Wait for user response]_
+           ```
+
+        4. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
+           - Use consistent spacing with pipes aligned
+           - Each cell should have spaces around content: `| Content |` not `|Content|`
+           - Header separator must have at least 3 dashes: `|--------|`
+           - Test that the table renders correctly in markdown preview
+        5. Number questions sequentially (Q1, Q2, Q3 - max 3 total)
+        6. Present all questions together before waiting for responses
+        7. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
+        8. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
+        9. Re-run validation after all clarifications are resolved
+
+   d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status
+
+7. Report completion with branch name, spec file path, ux_reference file path, checklist results, and readiness for the next phase (`/speckit.clarify` or `/speckit.plan`).
+
+**NOTE:** The script creates and checks out the new branch and initializes the spec file before writing.
+
+## General Guidelines
+
+## Quick Guidelines
+
+- Focus on **WHAT** users need and **WHY**.
+- Avoid HOW to implement (no tech stack, APIs, code structure).
+- Written for business stakeholders, not developers.
+- DO NOT create any checklists that are embedded in the spec. That will be a separate command.
+
+### Section Requirements
+
+- **Mandatory sections**: Must be completed for every feature
+- **Optional sections**: Include only when relevant to the feature
+- When a section doesn't apply, remove it entirely (don't leave as "N/A")
+
+### For AI Generation
+
+When creating this spec from a user prompt:
+
+1. **Make informed guesses**: Use context, industry standards, and common patterns to fill gaps
+2. **Document assumptions**: Record reasonable defaults in the Assumptions section
+3. **Limit clarifications**: Maximum 3 [NEEDS CLARIFICATION] markers - use only for critical decisions that:
+   - Significantly impact feature scope or user experience
+   - Have multiple reasonable interpretations with different implications
+   - Lack any reasonable default
+4. **Prioritize clarifications**: scope > security/privacy > user experience > technical details
+5. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
+6. **Common areas needing clarification** (only if no reasonable default exists):
+   - Feature scope and boundaries (include/exclude specific use cases)
+   - User types and permissions (if multiple conflicting interpretations possible)
+   - Security/compliance requirements (when legally/financially significant)
+
+**Examples of reasonable defaults** (don't ask about these):
+
+- Data retention: Industry-standard practices for the domain
+- Performance targets: Standard web/mobile app expectations unless specified
+- Error handling: User-friendly messages with appropriate fallbacks
+- Authentication method: Standard session-based or OAuth2 for web apps
+- Integration patterns: RESTful APIs unless specified otherwise
+
+### Success Criteria Guidelines
+
+Success criteria must be:
+
+1. **Measurable**: Include specific metrics (time, percentage, count, rate)
+2. **Technology-agnostic**: No mention of frameworks, languages, databases, or tools
+3. **User-focused**: Describe outcomes from user/business perspective, not system internals
+4. **Verifiable**: Can be tested/validated without knowing implementation details
+
+**Good examples**:
+
+- "Users can complete checkout in under 3 minutes"
+- "System supports 10,000 concurrent users"
+- "95% of searches return results in under 1 second"
+- "Task completion rate improves by 40%"
+
+**Bad examples** (implementation-focused):
+
+- "API response time is under 200ms" (too technical, use "Users see results instantly")
+- "Database can handle 1000 TPS" (implementation detail, use user-facing metric)
+- "React components render efficiently" (framework-specific)
+- "Redis cache hit rate above 80%" (technology-specific)
--- a/.kilo/workflows/speckit.tasks.md
+++ b/.kilo/workflows/speckit.tasks.md
@@ -0,0 +1,167 @@
+---
+description: Generate an actionable, dependency-ordered tasks.md for the feature based on available design artifacts.
+handoffs: 
+  - label: Analyze For Consistency
+    agent: speckit.analyze
+    prompt: Run a project analysis for consistency
+    send: true
+  - label: Implement Project
+    agent: speckit.implement
+    prompt: Start the implementation in phases
+    send: true
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Load design documents**: Read from FEATURE_DIR:
+   - **Required**: `plan.md` (tech stack, libraries, structure), `spec.md` (user stories with priorities), `ux_reference.md` (experience source of truth)
+   - **Optional**: `data-model.md` (entities), `contracts/` (API endpoints), `research.md` (decisions), `quickstart.md` (test scenarios)
+   - **Required when present in plan output**: ADR artifacts such as `docs/architecture.md` or feature-local architecture decision files containing `[DEF:id:ADR]` nodes
+   - Note: Not all projects have all documents. Generate tasks based on what's available.
+
+3. **Execute task generation workflow**:
+   - Load `plan.md` and extract tech stack, libraries, project structure
+   - Load `spec.md` and extract user stories with their priorities (P1, P2, P3, etc.)
+   - Load ADR nodes and build a decision-memory inventory: `DecisionId`, `@RATIONALE`, `@REJECTED`, dependent modules
+   - If `data-model.md` exists: Extract entities and map to user stories
+   - If `contracts/` exists: Map endpoints to user stories
+   - If `research.md` exists: Extract decisions for setup tasks
+   - Generate tasks organized by user story (see Task Generation Rules below)
+   - Generate dependency graph showing user story completion order
+   - Create parallel execution examples per user story
+   - Validate task completeness (each user story has all needed tasks, independently testable)
+   - Validate guardrail continuity: no task may realize an ADR path named in `@REJECTED`
+
+4. **Generate `tasks.md`**: Use `.specify/templates/tasks-template.md` as structure, fill with:
+   - Correct feature name from `plan.md`
+   - Phase 1: Setup tasks (project initialization)
+   - Phase 2: Foundational tasks (blocking prerequisites for all user stories)
+   - Phase 3+: One phase per user story (in priority order from `spec.md`)
+   - Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
+   - Final Phase: Polish & cross-cutting concerns
+   - All tasks must follow the strict checklist format (see Task Generation Rules below)
+   - Clear file paths for each task
+   - Dependencies section showing story completion order
+   - Parallel execution examples per story
+   - Implementation strategy section (MVP first, incremental delivery)
+   - Decision-memory notes for guarded tasks when ADRs or known traps apply
+
+5. **Report**: Output path to generated `tasks.md` and summary:
+   - Total task count
+   - Task count per user story
+   - Parallel opportunities identified
+   - Independent test criteria for each story
+   - Suggested MVP scope (typically just User Story 1)
+   - Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
+   - ADR propagation summary: which ADRs were inherited into task guardrails and which paths were rejected
+
+Context for task generation: $ARGUMENTS
+
+The `tasks.md` should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
+
+## Task Generation Rules
+
+**CRITICAL**: Tasks MUST be organized by user story to enable independent implementation and testing.
+
+**Tests are OPTIONAL**: Only generate test tasks if explicitly requested in the feature specification or if user requests TDD approach.
+
+### UX & Semantic Preservation (CRITICAL)
+
+- **Source of Truth**: `ux_reference.md` for UX, `.ai/standards/semantics.md` for code, and ADR artifacts for upstream technology decisions.
+- **Violation Warning**: If any task violates UX, ADR guardrails, or GRACE standards, flag it immediately.
+- **Verification Task (UX)**: Add a task at the end of each Story phase: `- [ ] Txxx [USx] Verify implementation matches ux_reference.md (Happy Path & Errors)`
+- **Verification Task (Audit)**: Add a mandatory audit task at the end of each Story phase: `- [ ] Txxx [USx] Acceptance: Perform semantic audit & algorithm emulation by Tester`
+- **Guardrail Rule**: If an ADR or contract says `@REJECTED`, task text must not schedule that path as implementation work.
+
+### Checklist Format (REQUIRED)
+
+Every task MUST strictly follow this format:
+
+```text
+- [ ] [TaskID] [P?] [Story?] Description with file path
+```
+
+**Format Components**:
+
+1. **Checkbox**: ALWAYS start with `- [ ]` (markdown checkbox)
+2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
+3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
+4. **[Story] label**: REQUIRED for user story phase tasks only
+   - Format: [US1], [US2], [US3], etc. (maps to user stories from `spec.md`)
+   - Setup phase: NO story label
+   - Foundational phase: NO story label  
+   - User Story phases: MUST have story label
+   - Polish phase: NO story label
+5. **Description**: Clear action with exact file path
+
+**Examples**:
+
+- ✅ CORRECT: `- [ ] T001 Create project structure per implementation plan`
+- ✅ CORRECT: `- [ ] T005 [P] Implement authentication middleware in src/middleware/auth.py`
+- ✅ CORRECT: `- [ ] T012 [P] [US1] Create User model in src/models/user.py`
+- ✅ CORRECT: `- [ ] T014 [US1] Implement UserService in src/services/user_service.py`
+- ❌ WRONG: `- [ ] Create User model` (missing ID and Story label)
+- ❌ WRONG: `T001 [US1] Create model` (missing checkbox)
+- ❌ WRONG: `- [ ] [US1] Create User model` (missing Task ID)
+- ❌ WRONG: `- [ ] T001 [US1] Create model` (missing file path)
+
+### Task Organization
+
+1. **From User Stories (`spec.md`)** - PRIMARY ORGANIZATION:
+   - Each user story (P1, P2, P3...) gets its own phase
+   - Map all related components to their story:
+     - Models needed for that story
+     - Services needed for that story
+     - Endpoints/UI needed for that story
+     - If tests requested: Tests specific to that story
+   - Mark story dependencies (most stories should be independent)
+
+2. **From Contracts (CRITICAL TIER)**:
+   - Identify components marked as `@TIER: CRITICAL` in `contracts/modules.md`.
+   - For these components, **MUST** append the summary of `@PRE`, `@POST`, `@UX_STATE`, and test contracts (`@TEST_FIXTURE`, `@TEST_EDGE`) directly to the task description.
+   - Example: `- [ ] T005 [P] [US1] Implement Auth (CRITICAL: PRE: token exists, POST: returns User, TESTS: 2 edges) in src/auth.py`
+   - Map each contract/endpoint → to the user story it serves
+   - If tests requested: Each contract → contract test task [P] before implementation in that story's phase
+
+3. **From ADRs and Decision Memory**:
+   - For each implementation task constrained by an ADR, append a concise guardrail summary drawn from `@RATIONALE` and `@REJECTED`.
+   - Example: `- [ ] T021 [US1] Implement payload parsing guardrails in src/api/input.py (RATIONALE: strict validation because frontend sends numeric strings; REJECTED: json.loads() without schema validation)`
+   - If a task would naturally branch into an ADR-rejected alternative, rewrite the task around the accepted path instead of leaving the choice ambiguous.
+   - If no safe executable path remains because ADR context is incomplete, stop and emit `[NEED_CONTEXT: target]`.
+
+4. **From Data Model**:
+   - Map each entity to the user story(ies) that need it
+   - If entity serves multiple stories: Put in earliest story or Setup phase
+   - Relationships → service layer tasks in appropriate story phase
+
+5. **From Setup/Infrastructure**:
+   - Shared infrastructure → Setup phase (Phase 1)
+   - Foundational/blocking tasks → Foundational phase (Phase 2)
+   - Story-specific setup → within that story's phase
+
+### Phase Structure
+
+- **Phase 1**: Setup (project initialization)
+- **Phase 2**: Foundational (blocking prerequisites - MUST complete before user stories)
+- **Phase 3+**: User Stories in priority order (P1, P2, P3...)
+  - Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
+  - Each phase should be a complete, independently testable increment
+- **Final Phase**: Polish & Cross-Cutting Concerns
+
+### Decision-Memory Validation Gate
+
+Before finalizing `tasks.md`, verify all of the following:
+- Every repo-shaping ADR from planning is either represented in a setup/foundational task or inherited by a downstream story task.
+- Every guarded task that could tempt an implementer into a known wrong branch carries preventive `@RATIONALE` / `@REJECTED` guidance in its text.
+- No task instructs the implementer to realize an ADR path already named as rejected.
+- At least one explicit audit/verification task exists for checking rejected-path regressions in code review or test stages.
--- a/.kilo/workflows/speckit.taskstoissues.md
+++ b/.kilo/workflows/speckit.taskstoissues.md
@@ -0,0 +1,30 @@
+---
+description: Convert existing tasks into actionable, dependency-ordered GitHub issues for the feature based on available design artifacts.
+tools: ['github/github-mcp-server/issue_write']
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+1. From the executed script, extract the path to **tasks**.
+1. Get the Git remote by running:
+
+```bash
+git config --get remote.origin.url
+```
+
+> [!CAUTION]
+> ONLY PROCEED TO NEXT STEPS IF THE REMOTE IS A GITHUB URL
+
+1. For each task in the list, use the GitHub MCP server to create a new issue in the repository that is representative of the Git remote.
+
+> [!CAUTION]
+> UNDER NO CIRCUMSTANCES EVER CREATE ISSUES IN REPOSITORIES THAT DO NOT MATCH THE REMOTE URL
--- a/.kilo/workflows/speckit.test.md
+++ b/.kilo/workflows/speckit.test.md
@@ -0,0 +1,235 @@
+---
+
+description: Generate tests, manage test documentation, and ensure maximum code coverage
+
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Goal
+
+Execute semantic audit and full testing cycle: verify contract compliance, verify decision-memory continuity, emulate logic, ensure maximum coverage, and maintain test quality.
+
+## Operating Constraints
+
+1. **NEVER delete existing tests** - Only update if they fail due to bugs in the test or implementation
+2. **NEVER duplicate tests** - Check existing tests first before creating new ones
+3. **Use TEST_FIXTURE fixtures** - For CRITICAL tier modules, read @TEST_FIXTURE from .ai/standards/semantics.md
+4. **Co-location required** - Write tests in `__tests__` directories relative to the code being tested
+5. **Decision-memory regression guard** - Tests and audits must not normalize silent reintroduction of any path documented in upstream `@REJECTED`
+
+## Execution Steps
+
+### 1. Analyze Context
+
+Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS.
+
+Determine:
+- FEATURE_DIR - where the feature is located
+- TASKS_FILE - path to `tasks.md`
+- Which modules need testing based on task status
+- Which ADRs or task guardrails define rejected paths for the touched scope
+
+### 2. Load Relevant Artifacts
+
+**From `tasks.md`:**
+- Identify completed implementation tasks (not test tasks)
+- Extract file paths that need tests
+- Extract guardrail summaries and blocked paths
+
+**From `.ai/standards/semantics.md`:**
+- Read effective complexity expectations
+- Read decision-memory rules for ADR, preventive guardrails, and reactive Micro-ADR
+- For CRITICAL modules: Read `@TEST_` fixtures
+
+**From ADR sources and touched code:**
+- Read `[DEF:id:ADR]` nodes when present
+- Read local `@RATIONALE` and `@REJECTED` in touched contracts
+
+**From existing tests:**
+- Scan `__tests__` directories for existing tests
+- Identify test patterns and coverage gaps
+
+### 3. Test Coverage Analysis
+
+Create coverage matrix:
+
+| Module | File | Has Tests | Complexity / Tier | TEST_FIXTURE Available | Rejected Path Guarded |
+|--------|------|-----------|-------------------|------------------------|-----------------------|
+| ... | ... | ... | ... | ... | ... |
+
+### 4. Semantic Audit & Logic Emulation (CRITICAL)
+
+Before writing tests, the Tester MUST:
+1. **Run `axiom-core.audit_contracts_tool`**: Identify semantic violations.
+2. **Run a protocol-shape review on touched files**:
+   - Reject non-canonical semantic markup, including docstring-only annotations such as `@PURPOSE`, `@PRE`, or `@INVARIANT` written inside class/function docstrings without canonical `[DEF]...[/DEF]` anchors and header metadata.
+   - Reject files whose effective complexity contract is under-specified relative to [`.ai/standards/semantics.md`](.ai/standards/semantics.md).
+   - Reject Python Complexity 4+ modules that omit meaningful `logger.reason()` / `logger.reflect()` checkpoints.
+   - Reject Python Complexity 5 modules that omit `belief_scope(...)`, `@DATA_CONTRACT`, or `@INVARIANT`.
+   - Treat broken or missing closing anchors as blocking violations.
+   - Reject retained workaround code if the local contract lacks `@RATIONALE` / `@REJECTED`.
+   - Reject code that silently re-enables a path declared in upstream ADR or local guardrails as rejected.
+3. **Emulate Algorithm**: Step through the code implementation in mind.
+   - Verify it adheres to the `@PURPOSE` and `@INVARIANT`.
+   - Verify `@PRE` and `@POST` conditions are correctly handled.
+   - Verify the implementation follows accepted-path rationale rather than drifting into a blocked path.
+4. **Validation Verdict**:
+   - If audit fails: Emit `[AUDIT_FAIL: semantic_noncompliance]` with concrete file-path reasons and notify Orchestrator.
+   - Example blocking case: [`backend/src/services/dataset_review/repositories/session_repository.py`](backend/src/services/dataset_review/repositories/session_repository.py) contains a module anchor, but its nested repository class/method semantics are expressed as loose docstrings instead of canonical anchored contracts; this MUST be rejected until remediated or explicitly waived.
+   - If audit passes: Proceed to writing/verifying tests.
+
+### 5. Write Tests (TDD Approach)
+
+For each module requiring tests:
+
+1. **Check existing tests**: Scan `__tests__/` for duplicates.
+2. **Read TEST_FIXTURE**: If CRITICAL tier, read `@TEST_FIXTURE` from semantics header.
+3. **Do not normalize broken semantics through tests**:
+   - The Tester must not write tests that silently accept malformed semantic protocol usage.
+   - If implementation is semantically invalid, stop and reject instead of adapting tests around the invalid structure.
+4. **Write test**: Follow co-location strategy.
+   - Python: `src/module/__tests__/test_module.py`
+   - Svelte: `src/lib/components/__tests__/test_component.test.js`
+5. **Use mocks**: Use `unittest.mock.MagicMock` for external dependencies
+6. **Add rejected-path regression coverage when relevant**:
+   - If ADR or local contract names a blocked path in `@REJECTED`, add or verify at least one test or explicit audit check that would fail if that forbidden path were silently restored.
+
+### 4a. UX Contract Testing (Frontend Components)
+
+For Svelte components with `@UX_STATE`, `@UX_FEEDBACK`, `@UX_RECOVERY` tags:
+
+1. **Parse UX tags**: Read component file and extract all `@UX_*` annotations
+2. **Generate UX tests**: Create tests for each UX state transition
+   ```javascript
+   // Example: Testing @UX_STATE: Idle -> Expanded
+   it('should transition from Idle to Expanded on toggle click', async () => {
+     render(Sidebar);
+     const toggleBtn = screen.getByRole('button', { name: /toggle/i });
+     await fireEvent.click(toggleBtn);
+     expect(screen.getByTestId('sidebar')).toHaveClass('expanded');
+   });
+   ```
+3. **Test `@UX_FEEDBACK`**: Verify visual feedback (toast, shake, color changes)
+4. **Test `@UX_RECOVERY`**: Verify error recovery mechanisms (retry, clear input)
+5. **Use `@UX_TEST` fixtures**: If component has `@UX_TEST` tags, use them as test specifications
+6. **Verify decision memory**: If the UI contract declares `@REJECTED`, ensure browser-visible behavior does not regress into the rejected path.
+
+**UX Test Template:**
+```javascript
+// [DEF:ComponentUXTests:Module]
+// @C: 3
+// @RELATION: VERIFIES -> ../Component.svelte
+// @PURPOSE: Test UX states and transitions
+
+describe('Component UX States', () => {
+  // @UX_STATE: Idle -> {action: click, expected: Active}
+  it('should transition Idle -> Active on click', async () => { ... });
+  
+  // @UX_FEEDBACK: Toast on success
+  it('should show toast on successful action', async () => { ... });
+  
+  // @UX_RECOVERY: Retry on error
+  it('should allow retry on error', async () => { ... });
+});
+// [/DEF:__tests__/test_Component:Module]
+```
+
+### 5. Test Documentation
+
+Create/update documentation in `specs/<feature>/tests/`:
+
+```
+tests/
+├── README.md           # Test strategy and overview
+├── coverage.md         # Coverage matrix and reports
+└── reports/
+    └── YYYY-MM-DD-report.md
+```
+
+Include decision-memory coverage notes when ADR or rejected-path regressions were checked.
+
+### 6. Execute Tests
+
+Run tests and report results:
+
+**Backend:**
+```bash
+cd backend && .venv/bin/python3 -m pytest -v
+```
+
+**Frontend:**
+```bash
+cd frontend && npm run test
+```
+
+### 7. Update Tasks
+
+Mark test tasks as completed in `tasks.md` with:
+- Test file path
+- Coverage achieved
+- Any issues found
+- Whether rejected-path regression checks passed or remain manual audit items
+
+## Output
+
+Generate test execution report:
+
+```markdown
+# Test Report: [FEATURE]
+
+**Date**: [YYYY-MM-DD]
+**Executed by**: Tester Agent
+
+## Coverage Summary
+
+| Module | Tests | Coverage % |
+|--------|-------|------------|
+| ... | ... | ... |
+
+## Test Results
+
+- Total: [X]
+- Passed: [X]
+- Failed: [X]
+- Skipped: [X]
+
+## Semantic Audit Verdict
+
+- Verdict: PASS | FAIL
+- Blocking Violations:
+  - [file path] -> [reason]
+- Decision Memory:
+  - ADRs checked: [...]
+  - Rejected-path regressions: PASS | FAIL
+  - Missing `@RATIONALE` / `@REJECTED`: [...]
+- Notes:
+  - Reject docstring-only semantic pseudo-markup
+  - Reject complexity/contract mismatches
+  - Reject missing belief-state instrumentation for Python Complexity 4/5
+  - Reject silent resurrection of rejected paths
+
+## Issues Found
+
+| Test | Error | Resolution |
+|------|-------|------------|
+| ... | ... | ... |
+
+## Next Steps
+
+- [ ] Fix failed tests
+- [ ] Fix blocking semantic violations before acceptance
+- [ ] Fix decision-memory drift or rejected-path regressions
+- [ ] Add more coverage for [module]
+- [ ] Review TEST_FIXTURE fixtures
+```
+
+## Context for Testing
+
+$ARGUMENTS