ss-tools/specs/027-dataset-llm-orchestration/quickstart.md
2026-03-16 23:11:19 +03:00

# Quickstart: LLM Dataset Orchestration
**Feature**: [LLM Dataset Orchestration](./spec.md)
**Branch**: `027-dataset-llm-orchestration`
This guide validates the end-to-end workflow for dataset review, semantic enrichment, clarification, preview generation, and controlled SQL Lab launch.
---
## 1. Prerequisites
1. Access to a configured Superset environment with:
- at least one dataset,
- at least one dashboard URL containing reusable analytical context,
- permissions sufficient for dataset inspection and SQL Lab session creation.
2. An active LLM provider configured in ss-tools.
3. Optional semantic sources for enrichment testing:
- uploaded spreadsheet dictionary,
- connected tabular dictionary,
- trusted reference Superset dataset.
4. A test user account with permission to create and resume sessions.
5. A second test user account for ownership/visibility guard validation.
---
## 2. Primary End-to-End Happy Path
### Step 1: Start Review Session
- Navigate to the dataset-review workflow entry from the datasets area.
- Start a session using one of:
- a Superset dashboard link with saved filters,
- a direct dataset selection.
- **Verify**:
- a new session is created,
- the session gets a visible readiness state,
- the first recommended action is explicit.
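The readiness states referenced throughout this guide can be thought of as a small state machine. The sketch below is illustrative only: the state names and allowed transitions are assumptions for validation reasoning, not the feature's actual API.

```python
from enum import Enum

class SessionState(Enum):
    CREATED = "created"
    RECOVERING = "recovering"
    REVIEW = "review"
    CLARIFICATION = "clarification"
    RUN_READY = "run_ready"
    RUN_IN_PROGRESS = "run_in_progress"
    COMPLETED = "completed"
    FAILED = "failed"

# Hypothetical forward transitions a session checker could enforce.
ALLOWED = {
    SessionState.CREATED: {SessionState.RECOVERING, SessionState.FAILED},
    SessionState.RECOVERING: {SessionState.REVIEW, SessionState.FAILED},
    SessionState.REVIEW: {SessionState.CLARIFICATION, SessionState.RUN_READY},
    SessionState.CLARIFICATION: {SessionState.REVIEW, SessionState.RUN_READY},
    SessionState.RUN_READY: {SessionState.RUN_IN_PROGRESS},
    SessionState.RUN_IN_PROGRESS: {SessionState.COMPLETED, SessionState.FAILED},
}

def can_transition(src: SessionState, dst: SessionState) -> bool:
    """Return True when dst is a permitted next state for src."""
    return dst in ALLOWED.get(src, set())
```

A validator can use `can_transition` to assert that a freshly created session cannot jump straight to launch.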
### Step 2: Observe Progressive Recovery
- Keep the session open while recovery runs.
- **Verify** progressive updates appear for:
- dataset recognition,
- imported filter recovery,
- template/Jinja variable discovery,
- preliminary semantic-source candidates,
- first-pass business summary.
- **Verify** partial work is shown before the whole pipeline finishes.
### Step 3: Review Automatic Analysis
- Inspect the generated business summary and validation findings.
- **Verify**:
- the summary is readable by an operational stakeholder,
- findings are grouped by severity,
- provenance/confidence markers distinguish confirmed/imported/inferred/AI-draft values,
- the next recommended action changes appropriately.
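The severity grouping above can be sketched as a stable bucketing pass. The severity labels and field names below are assumptions, not the real finding model.

```python
from collections import defaultdict

SEVERITY_ORDER = ["blocking", "warning", "info"]  # assumed levels

def group_findings(findings: list[dict]) -> dict[str, list[dict]]:
    """Bucket findings by severity, keeping a fixed display order."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for f in findings:
        grouped[f["severity"]].append(f)
    # Emit buckets in fixed severity order so the UI is predictable.
    return {s: grouped[s] for s in SEVERITY_ORDER if s in grouped}

findings = [
    {"severity": "info", "message": "verbose_name missing"},
    {"severity": "blocking", "message": "required variable unmapped"},
]
grouped = group_findings(findings)
```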
### Step 4: Apply Semantic Source
- Use **Apply semantic source** and choose:
- spreadsheet dictionary,
- connected dictionary,
- trusted reference dataset.
- **Verify**:
- exact matches are applied as stronger candidates,
- fuzzy matches remain reviewable rather than silently applied,
- semantic conflicts are shown side by side,
- field-level manual overrides remain possible.
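The exact-vs-fuzzy distinction in this step can be sketched with a simple similarity check. The threshold and return labels are illustrative assumptions; the point is that only exact hits qualify as apply-candidates while fuzzy hits stay review-only.

```python
from difflib import SequenceMatcher

def classify_match(dataset_field: str, dictionary_term: str,
                   fuzzy_threshold: float = 0.8) -> str:
    """Return 'exact', 'fuzzy' (reviewable, never auto-applied), or 'none'."""
    a, b = dataset_field.lower(), dictionary_term.lower()
    if a == b:
        return "exact"
    if SequenceMatcher(None, a, b).ratio() >= fuzzy_threshold:
        return "fuzzy"
    return "none"
```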
### Step 5: Confirm Field-Level Semantics
- Manually override one field's `verbose_name` or description.
- Apply another semantic source afterward.
- **Verify**:
- the manual field remains locked,
- imported/generated values do not silently overwrite it,
- provenance changes to manual override.
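The no-silent-overwrite rule can be sketched as a provenance precedence check where a manual value always wins. Provenance labels and the field shape are assumptions for illustration.

```python
PRECEDENCE = {"manual": 3, "confirmed": 2, "imported": 1, "ai_draft": 0}

def apply_candidate(field: dict, candidate_value: str,
                    candidate_provenance: str) -> dict:
    """Apply a semantic candidate unless the field is manually locked."""
    if field.get("provenance") == "manual":
        return field  # locked: imported/generated values never overwrite it
    if PRECEDENCE[candidate_provenance] >= PRECEDENCE.get(field.get("provenance"), -1):
        return {"value": candidate_value, "provenance": candidate_provenance}
    return field
```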
### Step 6: Guided Clarification
- Enter clarification mode from a session with unresolved findings.
- Answer one question using a suggested option.
- Answer another with a custom value.
- Skip one question.
- Mark one for expert review.
- **Verify**:
- only one active question is shown at a time,
- each question includes “why this matters” and current guess,
- answers update readiness/findings/profile state,
- skipped and expert-review items remain visible as unresolved.
### Step 7: Pause and Resume
- Save or pause the session mid-clarification.
- Leave the page and reopen the session.
- **Verify**:
- the session resumes with prior answers intact,
- the current question or next unresolved question is restored,
- manual semantic decisions and pending mappings are preserved.
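Pause/resume reduces to a snapshot surviving a save/reload round trip. The snapshot shape below is illustrative, not the real persistence schema.

```python
import json

def save_session(session: dict) -> str:
    """Serialize the session snapshot for storage."""
    return json.dumps(session, sort_keys=True)

def resume_session(payload: str) -> dict:
    """Restore the snapshot exactly as it was saved."""
    return json.loads(payload)

snapshot = {
    "answers": {"q1": "revenue in EUR"},
    "current_question": "q2",
    "manual_overrides": {"amount": "Gross amount"},
}
restored = resume_session(save_session(snapshot))
```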
### Step 8: Review Mapping and Generate Preview
- Open the mapping review section.
- Approve one warning-level mapping transformation.
- Manually override another transformed mapping value.
- Trigger **Generate SQL Preview**.
- **Verify**:
- all required variables are visible,
- warning approvals are explicit,
- the preview is read-only,
- preview status shows it was compiled by Superset,
- substituted values are visible in the final SQL.
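The preview gate can be sketched as a blocker list: every required template variable must carry a value and every warning-level mapping must be explicitly approved. Field names are assumptions.

```python
def preview_blockers(required_vars: set[str], mappings: dict[str, dict]) -> list[str]:
    """Return human-readable reasons preview generation is blocked."""
    blockers = []
    for var in sorted(required_vars):
        m = mappings.get(var)
        if m is None or m.get("value") in (None, ""):
            blockers.append(f"missing value for required variable '{var}'")
        elif m.get("warning") and not m.get("approved"):
            blockers.append(f"unapproved warning on mapping '{var}'")
    return blockers
```

An empty blocker list is the precondition for requesting the Superset-compiled preview.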
### Step 9: Launch Dataset
- Move the session to `Run Ready`.
- Click **Launch Dataset**.
- **Verify**:
- launch confirmation shows dataset identity, effective filters, parameter values, warnings, and preview status,
- a SQL Lab session reference is returned,
- an audited run context is stored,
- the session moves to run-in-progress or completed state appropriately.
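The audited run context stored at launch can be sketched as a frozen record mirroring the confirmation dialog. Key names are illustrative assumptions, not the actual schema.

```python
from datetime import datetime, timezone

def build_run_context(session: dict, preview_ref: str,
                      sqllab_ref: str) -> dict:
    """Freeze what was launched, with which inputs, and where it ran."""
    return {
        "dataset": session["dataset"],
        "effective_filters": session["filters"],
        "template_params": session["params"],
        "approved_mappings": session["approved_mappings"],
        "preview_reference": preview_ref,
        "sqllab_session_reference": sqllab_ref,
        "launched_at": datetime.now(timezone.utc).isoformat(),
    }
```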
### Step 10: Export Outputs
- Export documentation.
- Export validation findings.
- **Verify**:
- both artifacts are generated,
- artifact metadata or file reference is associated with the session,
- exported output reflects the current reviewed state.
### Step 11: Collaboration and Review

- As User A, add User B as a `reviewer`.
- Access the same session as User B.
- **Verify**:
- User B can view the session state,
- User B can answer clarification questions but cannot approve launch-critical mappings,
- the audit log (if implemented) records which user performed which action.
---
## 3. Negative and Recovery Scenarios
### Scenario A: Invalid Superset Link
- Start a session with a malformed or unsupported link.
- **Verify**:
- intake fails with actionable error messaging,
- no fabricated recovered context is shown,
- the user can correct input in place.
### Scenario B: Partial Filter Recovery
- Use a link where only some filters can be recovered.
- **Verify**:
- recovered filters are shown,
- unrecovered pieces are explicitly marked,
- session enters `recovery_required` or equivalent partial state,
- workflow remains usable.
### Scenario C: Dataset Without Clear Business Meaning
- Use a dataset with weak metadata and no strong trusted semantic matches.
- **Verify**:
- the summary remains minimal but usable,
- the system does not pretend certainty,
- clarification becomes the recommended next step.
### Scenario D: Conflicting Semantic Sources
- Apply two semantic sources that disagree for the same field.
- **Verify**:
- both candidates are shown side by side,
- a recommended source is indicated when confidence differs,
- no silent overwrite occurs,
- conflict remains until explicitly resolved.
### Scenario E: Missing Required Runtime Value
- Leave a required template variable unmapped.
- Attempt preview or launch.
- **Verify**:
- preview or launch is blocked according to gate rules,
- missing values are highlighted specifically,
- recommended next action becomes completion/remediation rather than launch.
### Scenario F: Preview Compilation Failure
- Provide a mapping value known to break Superset-side compilation.
- Trigger preview.
- **Verify**:
- preview moves to `failed` state,
- readable Superset error details are shown,
- launch remains blocked,
- the user can navigate back to the problematic mapping/value.
### Scenario G: Preview Staleness After Input Change
- Successfully generate preview.
- Change an approved mapping or required value.
- **Verify**:
- preview state becomes `stale`,
- launch is blocked until preview is regenerated,
- stale state is visible and not hidden.
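Staleness detection can be sketched by fingerprinting the inputs a preview was compiled from; any later change to mappings or required values invalidates the stored preview. The state labels and snapshot shape are assumptions.

```python
import hashlib
import json

def fingerprint(inputs: dict) -> str:
    """Stable hash of the inputs a preview was compiled from."""
    return hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode()
    ).hexdigest()

def preview_state(preview: dict, current_inputs: dict) -> str:
    """Report 'stale' whenever current inputs no longer match the preview."""
    if preview.get("fingerprint") != fingerprint(current_inputs):
        return "stale"
    return preview.get("status", "missing")
```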
### Scenario H: SQL Lab Launch Failure
- Simulate or trigger SQL Lab session creation failure.
- **Verify**:
- launch result is marked failed,
- the audit record still preserves attempted run context,
- the session remains recoverable,
- no success redirect is shown.
### Scenario I: Cross-User Access Guard
- Try to open or mutate the first user's session from a second user account (without collaborator access).
- **Verify**:
- access is denied,
- no session state leaks to the second user,
- ownership/permission is enforced on view and mutation paths.
---
## 4. UX Invariants to Validate
- [ ] The primary CTA always reflects the current highest-value next step.
- [ ] The launch button stays blocked if:
- [ ] blocking findings remain,
- [ ] required values are missing,
- [ ] warning-level mappings needing approval are unresolved,
- [ ] preview is missing, failed, or stale.
- [ ] Manual semantic overrides are never silently overwritten.
- [ ] Every important semantic value exposes visible provenance.
- [ ] Clarification shows one focused question at a time.
- [ ] Partial recovery preserves usable value and explains what is missing.
- [ ] Preview explicitly indicates it was compiled by Superset.
- [ ] Session resume restores prior state without forcing re-entry.
---
## 5. Suggested Verification by Milestone
### Milestone 1: Sessioned Auto Review
Validate:
- source intake,
- progressive recovery,
- automatic documentation summary,
- typed findings display,
- semantic source application,
- export endpoints.
### Milestone 2: Guided Clarification
Validate:
- clarification question flow,
- answer persistence,
- resume behavior,
- conflict review,
- field-level manual override/lock behavior.
### Milestone 3: Controlled Execution
Validate:
- mapping review,
- explicit warning approvals,
- Superset-side preview,
- preview staleness handling,
- SQL Lab launch,
- audited run context persistence.
---
## 6. Success-Criteria Measurement Hints
These are not implementation metrics by themselves; they are validation hints for pilot runs.
### For [SC-001](./spec.md)
Track how many submitted datasets produce an initial documentation draft without manual reconstruction.
### For [SC-002](./spec.md)
Measure time from session start to first readable summary visible to the user.
### For [SC-003](./spec.md)
Measure the percentage of semantic fields populated from trusted sources before AI-draft fallback.
### For [SC-005](./spec.md)
Measure the percentage of eligible Superset links that produce a non-empty imported filter set usable for review.
### For [SC-007](./spec.md)
Check that launched sessions always persist:
- dataset identity,
- effective filters,
- template params,
- approved mappings,
- preview reference,
- SQL Lab session reference,
- outcome.
### For [SC-008](./spec.md)
Run moderated first-attempt sessions and record whether users complete import → review → clarification (if needed) → preview → launch without facilitator intervention.
---
## 7. Completion Checklist
A Phase 1 design is operationally validated when all are true:
- [ ] Happy-path session can be started and completed.
- [ ] Partial recovery behaves as explicit partial recovery, not silent failure.
- [ ] Clarification is resumable.
- [ ] Semantic conflict review is explicit.
- [ ] Field-level override lock works.
- [ ] Preview is Superset-generated and becomes stale after input mutation.
- [ ] Launch targets SQL Lab only.
- [ ] Export outputs are available.
- [ ] Ownership and guard rails are enforced.