Tasks ready

2026-03-16 23:11:19 +03:00
parent 493a73827a
commit 9cae07a3b4
24 changed files with 10614 additions and 8733 deletions
--- a/specs/027-dataset-llm-orchestration/quickstart.md
+++ b/specs/027-dataset-llm-orchestration/quickstart.md
@@ -0,0 +1,298 @@
+# Quickstart: LLM Dataset Orchestration
+
+**Feature**: [LLM Dataset Orchestration](./spec.md)  
+**Branch**: `027-dataset-llm-orchestration`
+
+This guide validates the end-to-end workflow for dataset review, semantic enrichment, clarification, preview generation, and controlled SQL Lab launch.
+
+---
+
+## 1. Prerequisites
+
+1. Access to a configured Superset environment with:
+   - at least one dataset,
+   - at least one dashboard URL containing reusable analytical context,
+   - permissions sufficient for dataset inspection and SQL Lab session creation.
+2. An active LLM provider configured in ss-tools.
+3. Optional semantic sources for enrichment testing:
+   - uploaded spreadsheet dictionary,
+   - connected tabular dictionary,
+   - trusted reference Superset dataset.
+4. A test user account with permission to create and resume sessions.
+5. A second test user account for ownership/visibility guard validation.
+
+---
+
+## 2. Primary End-to-End Happy Path
+
+### Step 1: Start Review Session
+- Navigate to the dataset-review workflow entry from the datasets area.
+- Start a session using one of:
+  - a Superset dashboard link with saved filters,
+  - a direct dataset selection.
+- **Verify**:
+  - a new session is created,
+  - the session gets a visible readiness state,
+  - the first recommended action is explicit.
+
+### Step 2: Observe Progressive Recovery
+- Keep the session open while recovery runs.
+- **Verify** progressive updates appear for:
+  - dataset recognition,
+  - imported filter recovery,
+  - template/Jinja variable discovery,
+  - preliminary semantic-source candidates,
+  - first-pass business summary.
+- **Verify** partial work is shown before the whole pipeline finishes.
+
+### Step 3: Review Automatic Analysis
+- Inspect the generated business summary and validation findings.
+- **Verify**:
+  - the summary is readable by an operational stakeholder,
+  - findings are grouped by severity,
+  - provenance/confidence markers distinguish confirmed/imported/inferred/AI-draft values,
+  - the next recommended action changes appropriately.
+
+### Step 4: Apply Semantic Source
+- Use **Apply semantic source** and choose:
+  - spreadsheet dictionary,
+  - connected dictionary,
+  - trusted reference dataset.
+- **Verify**:
+  - exact matches are applied as stronger candidates,
+  - fuzzy matches remain reviewable rather than silently applied,
+  - semantic conflicts are shown side by side,
+  - field-level manual overrides remain possible.
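+
+The exact-versus-fuzzy distinction above can be sketched as follows. This is a minimal illustration, not the ss-tools implementation: the function names, the candidate shape, and the `FUZZY_THRESHOLD` value are all assumptions, and the spec may define matching differently.
+
+```python
+from difflib import SequenceMatcher
+
+# Hypothetical threshold; the quickstart does not define one.
+FUZZY_THRESHOLD = 0.8
+
+
+def classify_match(field_name, dictionary_key):
+    """Return 'exact', 'fuzzy', or 'none' for a candidate dictionary key."""
+    a = field_name.strip().lower()
+    b = dictionary_key.strip().lower()
+    if a == b:
+        return "exact"
+    if SequenceMatcher(None, a, b).ratio() >= FUZZY_THRESHOLD:
+        return "fuzzy"
+    return "none"
+
+
+def propose(field_name, dictionary):
+    """Build candidates for one field; fuzzy ones are flagged for review."""
+    candidates = []
+    for key, description in dictionary.items():
+        kind = classify_match(field_name, key)
+        if kind != "none":
+            candidates.append({
+                "source_key": key,
+                "description": description,
+                "match": kind,
+                # Fuzzy matches are never applied silently.
+                "needs_review": kind == "fuzzy",
+            })
+    return candidates
+```
+
+Keeping the match kind on each candidate lets the UI rank exact matches higher while still routing fuzzy ones through review.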
+
+### Step 5: Confirm Field-Level Semantics
+- Manually override one field’s `verbose_name` or description.
+- Apply another semantic source afterward.
+- **Verify**:
+  - the manual field remains locked,
+  - imported/generated values do not silently overwrite it,
+  - provenance changes to manual override.
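+
+The lock behavior being verified here might look like the sketch below. The `FieldSemantics` class, its attribute names, and the provenance labels `manual_override` and `imported` are illustrative assumptions rather than the actual data model.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class FieldSemantics:
+    # Hypothetical model; attribute names are assumptions.
+    name: str
+    verbose_name: str = ""
+    provenance: str = "inferred"
+    locked: bool = False
+
+
+def apply_manual_override(field, verbose_name):
+    """A manual edit sets the value, records provenance, and locks the field."""
+    field.verbose_name = verbose_name
+    field.provenance = "manual_override"
+    field.locked = True
+
+
+def apply_imported_value(field, verbose_name):
+    """Apply a value from a semantic source unless the field is locked.
+
+    Returns True when the value was applied, False when the lock held.
+    """
+    if field.locked:
+        return False
+    field.verbose_name = verbose_name
+    field.provenance = "imported"
+    return True
+```
+
+Returning `False` instead of raising lets the caller keep the rejected value as a reviewable candidate.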
+
+### Step 6: Guided Clarification
+- Enter clarification mode from a session with unresolved findings.
+- Answer one question using a suggested option.
+- Answer another with a custom value.
+- Skip one question.
+- Mark one for expert review.
+- **Verify**:
+  - only one active question is shown at a time,
+  - each question includes “why this matters” and the current guess,
+  - answers update readiness/findings/profile state,
+  - skipped and expert-review items remain visible as unresolved.
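+
+One way to model the "one active question, unresolved items stay visible" rule is sketched below. The `Question` class and the status labels (`open`, `answered`, `skipped`, `expert_review`) are assumed names for illustration only.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class Question:
+    # Hypothetical model; status labels are assumptions.
+    id: str
+    status: str = "open"
+
+
+def active_question(questions):
+    """Return the single question to present now, or None if none is open."""
+    return next((q for q in questions if q.status == "open"), None)
+
+
+def unresolved(questions):
+    """Skipped and expert-review items count as unresolved, like open ones."""
+    return [q for q in questions if q.status in ("open", "skipped", "expert_review")]
+```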
+
+### Step 7: Pause and Resume
+- Save or pause the session mid-clarification.
+- Leave the page and reopen the session.
+- **Verify**:
+  - the session resumes with prior answers intact,
+  - the current question or next unresolved question is restored,
+  - manual semantic decisions and pending mappings are preserved.
+
+### Step 8: Review Mapping and Generate Preview
+- Open the mapping review section.
+- Approve one warning-level mapping transformation.
+- Manually override another transformed mapping value.
+- Trigger **Generate SQL Preview**.
+- **Verify**:
+  - all required variables are visible,
+  - warning approvals are explicit,
+  - the preview is read-only,
+  - preview status shows it was compiled by Superset,
+  - substituted values are visible in the final SQL.
+
+### Step 9: Launch Dataset
+- Move the session to `Run Ready`.
+- Click **Launch Dataset**.
+- **Verify**:
+  - launch confirmation shows dataset identity, effective filters, parameter values, warnings, and preview status,
+  - a SQL Lab session reference is returned,
+  - an audited run context is stored,
+  - the session moves to run-in-progress or completed state appropriately.
+
+### Step 10: Export Outputs
+- Export documentation.
+- Export validation findings.
+- **Verify**:
+  - both artifacts are generated,
+  - artifact metadata or file reference is associated with the session,
+  - exported output reflects the current reviewed state.
+
+### Step 11: Collaboration and Review
+- As User A, add User B as a `reviewer`.
+- Access the same session as User B.
+- **Verify**:
+  - User B can view the session state,
+  - User B can answer clarification questions but cannot approve launch-critical mappings,
+  - the audit log (if implemented) records which user performed which action.
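+
+The reviewer restriction can be expressed as a small permission table. This is a sketch under assumptions: only the `reviewer` role appears in this guide, so the `owner` role and all action names are hypothetical.
+
+```python
+# Assumed role and action names; only `reviewer` comes from the quickstart.
+PERMISSIONS = {
+    "owner": {"view", "answer_clarification", "approve_mapping", "launch"},
+    "reviewer": {"view", "answer_clarification"},
+}
+
+
+def is_allowed(role, action):
+    """Check an action for a role; an unknown or missing role gets nothing."""
+    return action in PERMISSIONS.get(role, set())
+```
+
+Passing `None` as the role models a user without collaborator access, which is the case Scenario I exercises.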
+
+---
+
+## 3. Negative and Recovery Scenarios
+
+### Scenario A: Invalid Superset Link
+- Start a session with a malformed or unsupported link.
+- **Verify**:
+  - intake fails with actionable error messaging,
+  - no fake recovered context is shown,
+  - the user can correct input in place.
+
+### Scenario B: Partial Filter Recovery
+- Use a link where only some filters can be recovered.
+- **Verify**:
+  - recovered filters are shown,
+  - unrecovered pieces are explicitly marked,
+  - session enters `recovery_required` or equivalent partial state,
+  - workflow remains usable.
+
+### Scenario C: Dataset Without Clear Business Meaning
+- Use a dataset with weak metadata and no strong trusted semantic matches.
+- **Verify**:
+  - the summary remains minimal but usable,
+  - the system does not pretend certainty,
+  - clarification becomes the recommended next step.
+
+### Scenario D: Conflicting Semantic Sources
+- Apply two semantic sources that disagree for the same field.
+- **Verify**:
+  - both candidates are shown side by side,
+  - recommended source is visible if confidence differs,
+  - no silent overwrite occurs,
+  - conflict remains until explicitly resolved.
+
+### Scenario E: Missing Required Runtime Value
+- Leave a required template variable unmapped.
+- Attempt preview or launch.
+- **Verify**:
+  - preview or launch is blocked according to gate rules,
+  - missing values are highlighted specifically,
+  - recommended next action becomes completion/remediation rather than launch.
+
+### Scenario F: Preview Compilation Failure
+- Provide a mapping value known to break Superset-side compilation.
+- Trigger preview.
+- **Verify**:
+  - preview moves to `failed` state,
+  - readable Superset error details are shown,
+  - launch remains blocked,
+  - the user can navigate back to the problematic mapping/value.
+
+### Scenario G: Preview Staleness After Input Change
+- Successfully generate preview.
+- Change an approved mapping or required value.
+- **Verify**:
+  - preview state becomes `stale`,
+  - launch is blocked until preview is regenerated,
+  - stale state is visible and not hidden.
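+
+Staleness detection can be sketched by fingerprinting the inputs that fed the preview. The helper names and the `ready` state label are assumptions; `missing`, `failed`, and `stale` follow the wording used in this guide.
+
+```python
+import hashlib
+import json
+
+
+def input_fingerprint(mappings, required_values):
+    """Hash the preview inputs; any change yields a different fingerprint."""
+    payload = json.dumps(
+        {"mappings": mappings, "values": required_values},
+        sort_keys=True,
+    )
+    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
+
+
+def preview_state(preview_fingerprint, current_fingerprint, compiled_ok):
+    """Derive the preview state from the stored and current fingerprints."""
+    if preview_fingerprint is None:
+        return "missing"
+    if not compiled_ok:
+        return "failed"
+    if preview_fingerprint != current_fingerprint:
+        return "stale"
+    return "ready"  # assumed label for a usable preview
+```
+
+Storing the fingerprint next to the preview means the check needs no diffing of individual mappings.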
+
+### Scenario H: SQL Lab Launch Failure
+- Simulate or trigger SQL Lab session creation failure.
+- **Verify**:
+  - launch result is marked failed,
+  - the audit record still preserves attempted run context,
+  - the session remains recoverable,
+  - no success redirect is shown.
+
+### Scenario I: Cross-User Access Guard
+- Try to open or mutate the first user’s session from a second user account (without collaborator access).
+- **Verify**:
+  - access is denied,
+  - no session state leaks to the second user,
+  - ownership/permission is enforced on view and mutation paths.
+
+---
+
+## 4. UX Invariants to Validate
+
+- [ ] The primary CTA always reflects the current highest-value next step.
+- [ ] The launch button stays blocked if:
+  - [ ] blocking findings remain,
+  - [ ] required values are missing,
+  - [ ] warning-level mappings needing approval are unresolved,
+  - [ ] preview is missing, failed, or stale.
+- [ ] Manual semantic overrides are never silently overwritten.
+- [ ] Every important semantic value exposes visible provenance.
+- [ ] Clarification shows one focused question at a time.
+- [ ] Partial recovery preserves usable value and explains what is missing.
+- [ ] Preview explicitly indicates it was compiled by Superset.
+- [ ] Session resume restores prior state without forcing re-entry.
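+
+The launch-blocking conditions in the checklist above could be evaluated by a single gate function. The `SessionSnapshot` class and its field names are hypothetical; only the four blocking conditions come from this guide.
+
+```python
+from dataclasses import dataclass, field
+
+
+@dataclass
+class SessionSnapshot:
+    # Hypothetical field names; the guide only lists the conditions.
+    blocking_findings: int = 0
+    missing_required_values: list = field(default_factory=list)
+    unapproved_warning_mappings: int = 0
+    preview_state: str = "missing"
+
+
+def launch_blockers(session):
+    """Return human-readable reasons why launch is blocked (empty if none)."""
+    reasons = []
+    if session.blocking_findings > 0:
+        reasons.append("blocking findings remain")
+    if session.missing_required_values:
+        names = ", ".join(session.missing_required_values)
+        reasons.append("required values are missing: " + names)
+    if session.unapproved_warning_mappings > 0:
+        reasons.append("warning-level mappings need approval")
+    if session.preview_state in ("missing", "failed", "stale"):
+        reasons.append("preview is " + session.preview_state)
+    return reasons
+```
+
+Returning every reason rather than a single boolean also gives the primary CTA something specific to point at.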
+
+---
+
+## 5. Suggested Verification by Milestone
+
+### Milestone 1: Sessioned Auto Review
+Validate:
+- source intake,
+- progressive recovery,
+- automatic documentation summary,
+- typed findings display,
+- semantic source application,
+- export endpoints.
+
+### Milestone 2: Guided Clarification
+Validate:
+- clarification question flow,
+- answer persistence,
+- resume behavior,
+- conflict review,
+- field-level manual override/lock behavior.
+
+### Milestone 3: Controlled Execution
+Validate:
+- mapping review,
+- explicit warning approvals,
+- Superset-side preview,
+- preview staleness handling,
+- SQL Lab launch,
+- audited run context persistence.
+
+---
+
+## 6. Success-Criteria Measurement Hints
+
+These are not implementation metrics by themselves; they are validation hints for pilot runs.
+
+### For [SC-001](./spec.md)
+Track how many submitted datasets produce an initial documentation draft without manual reconstruction.
+
+### For [SC-002](./spec.md)
+Measure time from session start to first readable summary visible to the user.
+
+### For [SC-003](./spec.md)
+Measure the percentage of semantic fields populated from trusted sources before AI-draft fallback.
+
+### For [SC-005](./spec.md)
+Measure the percentage of eligible Superset links that produce a non-empty imported filter set usable for review.
+
+### For [SC-007](./spec.md)
+Check that launched sessions always persist:
+- dataset identity,
+- effective filters,
+- template params,
+- approved mappings,
+- preview reference,
+- SQL Lab session reference,
+- outcome.
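+
+A completeness check for the persisted run context could look like this sketch. The key names are assumed spellings of the items listed above, and the run context is assumed to be a plain dictionary.
+
+```python
+# Assumed key names mirroring the list above; the real schema may differ.
+REQUIRED_RUN_CONTEXT_KEYS = (
+    "dataset_identity",
+    "effective_filters",
+    "template_params",
+    "approved_mappings",
+    "preview_reference",
+    "sql_lab_session_reference",
+    "outcome",
+)
+
+
+def missing_run_context_keys(run_context):
+    """List required keys that are absent or None.
+
+    Empty collections are accepted: a launch may have no effective filters.
+    """
+    return [
+        key for key in REQUIRED_RUN_CONTEXT_KEYS
+        if run_context.get(key) is None
+    ]
+```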
+
+### For [SC-008](./spec.md)
+Run moderated first-attempt sessions and record whether users complete import → review → clarification (if needed) → preview → launch without facilitator intervention.
+
+---
+
+## 7. Completion Checklist
+
+The Phase 1 design is operationally validated when all of the following are true:
+
+- [ ] Happy-path session can be started and completed.
+- [ ] Partial recovery behaves as explicit partial recovery, not silent failure.
+- [ ] Clarification is resumable.
+- [ ] Semantic conflict review is explicit.
+- [ ] Field-level override lock works.
+- [ ] Preview is Superset-generated and becomes stale after input mutation.
+- [ ] Launch targets SQL Lab only.
+- [ ] Export outputs are available.
+- [ ] Ownership and guard rails are enforced.