ss-tools/specs/025-clean-release-compliance/research.md

# Phase 0 Research: Clean Release Compliance Subsystem Redesign

## Decision 1: The subsystem becomes API/CLI-first and TUI becomes a thin client

**Decision**
Primary release operations are owned by application services and exposed through the CLI and HTTP API first. The TUI is retained only as a thin operator interface that reads state and triggers actions through the same facade.

**Rationale**
The current implementation mixes UI state with preparation, manifest creation and compliance execution. This blocks automation and makes behavior depend on the interface used.

**Alternatives considered**
- Keep TUI as primary orchestrator and add wrappers around it: rejected because it preserves hidden business logic inside the interface.
- Remove TUI entirely: rejected because operators still need an interactive console flow in enterprise environments.

---

## Decision 2: Policy and registry are trusted snapshots, never runtime UI/env payloads

**Decision**
Policy and source registry are resolved by a dedicated read-only resolution service from trusted stores, then frozen into immutable snapshots for each compliance run.

**Rationale**
The redesign explicitly separates trusted and untrusted inputs. Candidate input, artifacts JSON and operator choices are not allowed to define policy contents or final report outcomes.

**Alternatives considered**
- Continue using `.clean-release.yaml` and env bootstrap as policy source: rejected because it violates the new trust model.
- Let the TUI construct policy differently in demo and real mode: rejected because it breaks evidence integrity and reproducibility.
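
A minimal sketch of how a resolved policy could be frozen into a snapshot for one run. The field names (`policy_id`, `content`, `content_hash`), the helper `freeze_policy` and the sample policy keys are hypothetical and not taken from the codebase; the trusted store lookup itself is not shown.

```python
import hashlib
import json
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class CleanPolicySnapshot:
    """Immutable copy of a policy as resolved for one compliance run."""

    policy_id: str
    content: Mapping[str, Any]
    content_hash: str


def freeze_policy(policy_id: str, trusted_content: Mapping[str, Any]) -> CleanPolicySnapshot:
    """Hypothetical helper: detach trusted content and freeze it."""
    # Canonical JSON so identical content always yields the same hash.
    canonical = json.dumps(trusted_content, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Round-trip through JSON to detach from the caller's mutable objects.
    detached = json.loads(canonical)
    # MappingProxyType makes the top level read-only; nested lists stay
    # mutable in this sketch, so the hash is what proves integrity.
    return CleanPolicySnapshot(policy_id, MappingProxyType(detached), digest)
```

Later edits to the source mapping do not reach the snapshot, and the hash gives reports a stable reference to the exact policy content that was evaluated.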

---

## Decision 3: Manifest, report and snapshots are immutable; run history is append-only

**Decision**
`DistributionManifest`, `CleanPolicySnapshot`, `SourceRegistrySnapshot`, `ComplianceReport`, `ApprovalDecision` and `PublicationRecord` are immutable. `ComplianceRun`, `ComplianceStageRun`, `ComplianceViolation` and the audit log are append-only once created; only non-terminal run fields may progress during execution.

**Rationale**
The main value of the subsystem is evidence integrity. Mutable manifest/report records make audit and publication safety unverifiable.

**Alternatives considered**
- Update manifest/report in place: rejected because historical evidence would be lost.
- Allow deleting old runs to keep storage small: rejected because real mode must preserve evidence.
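
The two storage contracts above can be illustrated with an in-memory sketch. `WriteOnceStore` and `AppendOnlyLog` are hypothetical names; real persistence would enforce the same rules at the repository or database level.

```python
from typing import Dict, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class WriteOnceStore(Generic[T]):
    """Immutable records: an id can be written once and never replaced."""

    def __init__(self) -> None:
        self._records: Dict[str, T] = {}

    def add(self, record_id: str, record: T) -> None:
        if record_id in self._records:
            raise ValueError(f"record {record_id!r} already exists and is immutable")
        self._records[record_id] = record

    def get(self, record_id: str) -> T:
        return self._records[record_id]


class AppendOnlyLog(Generic[T]):
    """Append-only history: no update and no delete operation is offered."""

    def __init__(self) -> None:
        self._entries: List[T] = []

    def append(self, entry: T) -> None:
        self._entries.append(entry)

    def entries(self) -> Tuple[T, ...]:
        # Return a tuple so callers cannot mutate the stored history.
        return tuple(self._entries)
```

Under this contract a corrected manifest or report is a new record with a new identifier, so earlier evidence stays retrievable.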

---

## Decision 4: Release lifecycle is modeled as an explicit state machine

**Decision**
The candidate lifecycle is formalized as `DRAFT -> PREPARED -> MANIFEST_BUILT -> CHECK_PENDING -> CHECK_RUNNING -> CHECK_PASSED|CHECK_BLOCKED|CHECK_ERROR -> APPROVED -> PUBLISHED -> REVOKED`, with hard guards on forbidden transitions.

**Rationale**
Current logic spreads status changes across TUI and orchestration code. A formal state machine makes approval/publication gating deterministic and testable.

**Alternatives considered**
- Keep loose status updates per module: rejected because it produces hidden invalid states.
- Collapse all states into a smaller set: rejected because manifest, check and approval stages need separate audit visibility.
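
One way to encode the lifecycle and its guards is a transition table. The state names come from the decision above; the table itself carries assumptions that this document does not settle: only `CHECK_PASSED` may proceed to `APPROVED`, and `CHECK_BLOCKED` and `CHECK_ERROR` have no outgoing transitions in this sketch. `ForbiddenTransition` and `transition` are hypothetical names.

```python
from enum import Enum
from typing import Dict, FrozenSet


class CandidateStatus(Enum):
    DRAFT = "DRAFT"
    PREPARED = "PREPARED"
    MANIFEST_BUILT = "MANIFEST_BUILT"
    CHECK_PENDING = "CHECK_PENDING"
    CHECK_RUNNING = "CHECK_RUNNING"
    CHECK_PASSED = "CHECK_PASSED"
    CHECK_BLOCKED = "CHECK_BLOCKED"
    CHECK_ERROR = "CHECK_ERROR"
    APPROVED = "APPROVED"
    PUBLISHED = "PUBLISHED"
    REVOKED = "REVOKED"


S = CandidateStatus

ALLOWED: Dict[CandidateStatus, FrozenSet[CandidateStatus]] = {
    S.DRAFT: frozenset({S.PREPARED}),
    S.PREPARED: frozenset({S.MANIFEST_BUILT}),
    S.MANIFEST_BUILT: frozenset({S.CHECK_PENDING}),
    S.CHECK_PENDING: frozenset({S.CHECK_RUNNING}),
    S.CHECK_RUNNING: frozenset({S.CHECK_PASSED, S.CHECK_BLOCKED, S.CHECK_ERROR}),
    S.CHECK_PASSED: frozenset({S.APPROVED}),  # assumption: only a passed check can be approved
    S.CHECK_BLOCKED: frozenset(),             # assumption: no recovery path in this sketch
    S.CHECK_ERROR: frozenset(),               # assumption: no recovery path in this sketch
    S.APPROVED: frozenset({S.PUBLISHED}),
    S.PUBLISHED: frozenset({S.REVOKED}),
    S.REVOKED: frozenset(),
}


class ForbiddenTransition(Exception):
    """Raised when a caller requests a transition the table does not allow."""


def transition(current: CandidateStatus, target: CandidateStatus) -> CandidateStatus:
    if target not in ALLOWED[current]:
        raise ForbiddenTransition(f"{current.value} -> {target.value} is not allowed")
    return target
```

Because every status change goes through `transition`, forbidden paths such as publishing an unchecked candidate can be covered by table-driven unit tests.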

---

## Decision 5: Compliance execution is a pluggable stage pipeline integrated with TaskManager

**Decision**
Each compliance run becomes a `TaskManager` task. The run stores lifecycle metadata while stage logs are emitted as task logs or structured sub-events. The pipeline remains pluggable with stage-specific decisions and violations.

**Rationale**
The repository already has mature async task lifecycle and reporting patterns. Reusing them reduces duplicated orchestration infrastructure and aligns with the repository constitution.

**Alternatives considered**
- Keep synchronous orchestrator execution: rejected due to non-blocking API requirements.
- Build a second custom task subsystem inside clean release: rejected as redundant and harder to observe.
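
A sketch of the pluggable pipeline shape, under stated assumptions. The `TaskManager` API is not described in this document, so the `log` callable below only stands in for task log emission. `StageResult`, `ComplianceStage`, `ForbiddenSourceStage`, `run_pipeline`, the `PASS`/`BLOCK`/`ERROR` vocabulary and the choice to run every stage even after a block are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping, Protocol, Sequence, Tuple


@dataclass(frozen=True)
class StageResult:
    stage: str
    decision: str  # "PASS", "BLOCK" or "ERROR" (hypothetical vocabulary)
    violations: Tuple[str, ...] = ()


class ComplianceStage(Protocol):
    name: str

    def run(self, manifest: Mapping[str, Sequence[str]]) -> StageResult: ...


class ForbiddenSourceStage:
    """Example stage: blocks manifests that list a prohibited source."""

    name = "forbidden_sources"

    def __init__(self, prohibited: Sequence[str]) -> None:
        self._prohibited = frozenset(prohibited)

    def run(self, manifest: Mapping[str, Sequence[str]]) -> StageResult:
        hits = tuple(sorted(s for s in manifest.get("sources", ()) if s in self._prohibited))
        return StageResult(self.name, "BLOCK" if hits else "PASS", hits)


def run_pipeline(
    stages: Sequence[ComplianceStage],
    manifest: Mapping[str, Sequence[str]],
    log: Callable[[str], None],
) -> Tuple[str, List[StageResult]]:
    results: List[StageResult] = []
    for stage in stages:
        log(f"stage {stage.name}: started")
        try:
            result = stage.run(manifest)
        except Exception as exc:  # a crashing stage is recorded as an ERROR result
            result = StageResult(stage.name, "ERROR", (str(exc),))
        log(f"stage {stage.name}: {result.decision}")
        results.append(result)
    decisions = {r.decision for r in results}
    if "ERROR" in decisions:
        return "CHECK_ERROR", results
    if "BLOCK" in decisions:
        return "CHECK_BLOCKED", results
    return "CHECK_PASSED", results
```

New checks are added by passing another object with a `name` and a `run` method; the runner itself does not change.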

---

## Decision 6: Interfaces are split into CLI, REST API and thin TUI over one facade

**Decision**
A single `CleanReleaseFacade` exposes use cases for candidate overview, manifest build, compliance run, approval and publication. CLI, API and TUI all call the facade. Headless mode belongs to CLI/API only.

**Rationale**
A facade keeps interface code thin and prevents re-implementing business rules per entrypoint.

**Alternatives considered**
- Let each interface call lower-level services directly: rejected because state validation and DTO assembly would drift.
- Keep a headless branch inside TUI: rejected because headless is not a UI concern.
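
To illustrate the thin-entrypoint idea, the sketch below routes a CLI handler and an API handler through one facade method. Only the name `CleanReleaseFacade` comes from this document; the method `candidate_overview`, the handlers `cli_overview` and `api_overview`, the program name and the in-memory status map are assumptions, and no specific web framework is implied.

```python
import argparse
import json
from typing import Dict, List, Tuple


class CleanReleaseFacade:
    """Single entry point for use cases; validation and DTO assembly live here."""

    def __init__(self, statuses: Dict[str, str]) -> None:
        # In-memory stand-in for the real application services.
        self._statuses = statuses

    def candidate_overview(self, candidate_id: str) -> dict:
        if candidate_id not in self._statuses:
            raise KeyError(candidate_id)
        return {"candidate_id": candidate_id, "status": self._statuses[candidate_id]}


def cli_overview(facade: CleanReleaseFacade, argv: List[str]) -> str:
    """CLI adapter: parses arguments, delegates, serializes."""
    parser = argparse.ArgumentParser(prog="clean-release")
    parser.add_argument("candidate_id")
    args = parser.parse_args(argv)
    return json.dumps(facade.candidate_overview(args.candidate_id), sort_keys=True)


def api_overview(facade: CleanReleaseFacade, candidate_id: str) -> Tuple[int, dict]:
    """API adapter: maps facade results and errors to status codes."""
    try:
        return 200, facade.candidate_overview(candidate_id)
    except KeyError:
        return 404, {"error": "candidate not found"}
```

Both adapters return the same DTO because neither builds it; a TUI screen would call the same method and only render the result.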

---

## Decision 7: Repositories are decomposed by responsibility, even if exposed through one internal facade

**Decision**
Persistence is split by bounded responsibility: candidate, artifacts, manifest, policy, compliance run, report, approval, publication and audit. A convenience facade may exist, but ownership remains explicit.

**Rationale**
The current `CleanReleaseRepository` is too broad for the redesigned evidence model. Explicit repository boundaries make append-only and immutable behavior easier to enforce.

**Alternatives considered**
- Keep one universal repository class: rejected because contracts stay ambiguous.
- Persist everything only through TaskManager: rejected because domain entities need direct retrieval independently of task history.
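
The split can be sketched with one small interface per responsibility and an optional bundle on top. Only two of the listed responsibilities are shown; the protocol names, method names and the `CleanReleaseRepositories` bundle are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Tuple


class CandidateRepository(Protocol):
    """Candidate state may progress, so this contract allows updates."""

    def save_status(self, candidate_id: str, status: str) -> None: ...
    def get_status(self, candidate_id: str) -> Optional[str]: ...


class AuditRepository(Protocol):
    """Audit history is append-only, so this contract has no update or delete."""

    def append(self, event: str) -> None: ...
    def events(self) -> Tuple[str, ...]: ...


class InMemoryCandidates:
    def __init__(self) -> None:
        self._statuses: Dict[str, str] = {}

    def save_status(self, candidate_id: str, status: str) -> None:
        self._statuses[candidate_id] = status

    def get_status(self, candidate_id: str) -> Optional[str]:
        return self._statuses.get(candidate_id)


class InMemoryAudit:
    def __init__(self) -> None:
        self._events: List[str] = []

    def append(self, event: str) -> None:
        self._events.append(event)

    def events(self) -> Tuple[str, ...]:
        return tuple(self._events)


@dataclass(frozen=True)
class CleanReleaseRepositories:
    """Convenience bundle; each repository keeps its own explicit contract."""

    candidates: CandidateRepository
    audit: AuditRepository
```

A service that only records audit events can depend on `AuditRepository` alone, which keeps the narrower contract visible at the call site.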

---

## Decision 8: Demo mode is preserved but isolated by namespace

**Decision**
Demo mode is handled by a dedicated demo service and isolated storage namespace. Demo runs, policies, candidates and reports never share identifiers or history with real mode.

**Rationale**
Demo mode remains useful for operator training, but it must not contaminate real compliance evidence.

**Alternatives considered**
- Simulate demo behavior inside real storage: rejected because it risks false evidence and operator confusion.
- Drop demo mode entirely: rejected because it removes a safe training path.
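
A minimal sketch of namespace isolation, assuming identifiers carry their namespace as a prefix and each mode gets its own store instance. `Namespace`, `NamespacedStore` and the `namespace:kind:sequence` id format are hypothetical.

```python
import itertools
from enum import Enum
from typing import Dict, Optional


class Namespace(Enum):
    REAL = "real"
    DEMO = "demo"


class NamespacedStore:
    """One store per namespace; foreign identifiers are rejected on write."""

    def __init__(self, namespace: Namespace) -> None:
        self.namespace = namespace
        self._records: Dict[str, dict] = {}
        self._sequence = itertools.count(1)

    def new_id(self, kind: str) -> str:
        # The prefix makes the origin of every identifier visible.
        return f"{self.namespace.value}:{kind}:{next(self._sequence)}"

    def put(self, record_id: str, payload: dict) -> None:
        if not record_id.startswith(f"{self.namespace.value}:"):
            raise ValueError(f"{record_id!r} does not belong to {self.namespace.value}")
        self._records[record_id] = payload

    def get(self, record_id: str) -> Optional[dict]:
        return self._records.get(record_id)
```

With this shape a demo run cannot be written into or read from the real store, even if a caller passes its identifier by mistake.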

---

## Decision 9: Migration proceeds incrementally, starting by extracting services out of TUI

**Decision**
Migration starts by extracting build/run logic into new services/facade, then removes env-driven policy injection, then introduces immutable snapshots, then adds CLI/API contracts, and only after that thins the TUI.

**Rationale**
The current codebase already has working routes, models and tests. A big-bang rewrite would create unnecessary integration risk.

**Alternatives considered**
- Rewrite the whole subsystem at once: rejected because it is harder to validate incrementally.
- Patch TUI only: rejected because it does not solve the architectural problem.