# Specification Quality Checklist: LLM Dataset Orchestration

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-16
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## UX Consistency

- [x] Functional requirements fully support the 'Happy Path' in ux_reference.md
- [x] Error handling requirements match the 'Error Experience' in ux_reference.md
- [x] No requirements contradict the defined User Persona or Context

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- Validation completed against [spec.md](../spec.md) and [ux_reference.md](../ux_reference.md).
- Automatic documentation, guided clarification, and Superset-derived dataset execution are all represented as independently testable user journeys.
- Error recovery is aligned between the UX reference and the functional requirements, especially for partial filter import, missing run-time values, and conflicting metadata.
- The specification is ready for the next phase.
*Diff for `specs/027-dataset-llm-orchestration/contracts/api.yaml` (1479 lines) suppressed because it is too large.*

---
# Semantic Module Contracts: LLM Dataset Orchestration

**Feature**: [LLM Dataset Orchestration](../spec.md)
**Branch**: `027-dataset-llm-orchestration`

This document defines the semantic contracts for the core components of the Dataset LLM Orchestration feature, following the [GRACE-Poly Standard](../../../.ai/standards/semantics.md).

---

## 1. Backend Modules

# [DEF:DatasetReviewOrchestrator:Module]
# @COMPLEXITY: 5
# @PURPOSE: Coordinate the full dataset review session lifecycle across intake, recovery, semantic review, clarification, mapping review, preview generation, and launch.
# @LAYER: Domain
# @RELATION: [DEPENDS_ON] ->[DatasetReviewSessionRepository]
# @RELATION: [DEPENDS_ON] ->[SemanticSourceResolver]
# @RELATION: [DEPENDS_ON] ->[ClarificationEngine]
# @RELATION: [DEPENDS_ON] ->[SupersetContextExtractor]
# @RELATION: [DEPENDS_ON] ->[SupersetCompilationAdapter]
# @RELATION: [DEPENDS_ON] ->[TaskManager]
# @PRE: session mutations must execute inside a persisted session boundary scoped to one authenticated user.
# @POST: state transitions are persisted atomically and emit observable progress for long-running steps.
# @SIDE_EFFECT: creates task records, updates session aggregates, triggers upstream Superset calls, persists audit artifacts.
# @DATA_CONTRACT: Input[SessionCommand] -> Output[DatasetReviewSession | CompiledPreview | DatasetRunContext]
# @INVARIANT: Launch is blocked unless the current session has no open blocking findings, all launch-sensitive mappings are approved, and a non-stale Superset-generated compiled preview matches the current input fingerprint.
# @TEST_CONTRACT: start_or_resume_session -> returns persisted session shell with recommended next action
# @TEST_SCENARIO: launch_gate_blocks_stale_preview -> launch rejected when preview fingerprint no longer matches current mapping inputs
# @TEST_EDGE: missing_dataset_ref -> blocking failure
# @TEST_EDGE: stale_preview -> blocking failure
# @TEST_EDGE: sql_lab_launch_failure -> terminal failed launch state with audit record
# @TEST_INVARIANT: launch_gate -> VERIFIED_BY: [launch_gate_blocks_stale_preview]

#### ƒ **start_session**
# @PURPOSE: Initialize a new session from a Superset link or dataset selection and trigger context recovery.
# @PRE: source input is non-empty and the environment is accessible.
# @POST: session exists in persisted storage with intake/recovery state and task linkage when async work is required.
# @SIDE_EFFECT: persists session and may enqueue recovery task.

#### ƒ **apply_semantic_source**
# @PURPOSE: Apply a selected semantic source and update field-level candidate/decision state.
# @PRE: source exists and session is not terminal.
# @POST: semantic field entries and findings reflect selected-source outcomes without overwriting locked manual values.
# @SIDE_EFFECT: updates semantic decisions and conflict findings.

#### ƒ **record_clarification_answer**
# @PURPOSE: Persist one clarification answer and re-evaluate profile, findings, and readiness.
# @PRE: target question belongs to the session's active clarification session.
# @POST: answer is saved before the current-question pointer advances.
# @SIDE_EFFECT: updates clarification and finding state.

#### ƒ **prepare_launch_preview**
# @PURPOSE: Assemble effective execution inputs and trigger Superset-side preview compilation.
# @PRE: all required variables have candidate values or explicitly accepted defaults.
# @POST: returns preview artifact in pending, ready, failed, or stale state.
# @SIDE_EFFECT: persists preview attempt and upstream compilation diagnostics.

#### ƒ **launch_dataset**
# @PURPOSE: Start the approved dataset execution through SQL Lab and persist run context for audit/replay.
# @PRE: session is run-ready and the compiled preview is current.
# @POST: returns persisted run context with SQL Lab session reference and launch outcome.
# @SIDE_EFFECT: creates SQL Lab execution session and audit snapshot.

# [/DEF:DatasetReviewOrchestrator:Module]
---

# [DEF:DatasetReviewSessionRepository:Module]
# @COMPLEXITY: 5
# @PURPOSE: Persist and retrieve dataset review session aggregates, including readiness, findings, semantic decisions, clarification state, previews, and run contexts.
# @LAYER: Domain
# @RELATION: [DEPENDS_ON] ->[DatasetReviewSession]
# @RELATION: [DEPENDS_ON] ->[DatasetProfile]
# @RELATION: [DEPENDS_ON] ->[ValidationFinding]
# @RELATION: [DEPENDS_ON] ->[CompiledPreview]
# @PRE: repository operations execute within authenticated request or task scope.
# @POST: session aggregate reads are structurally consistent and writes preserve ownership and version semantics.
# @SIDE_EFFECT: reads/writes application persistence layer.
# @DATA_CONTRACT: Input[SessionMutation] -> Output[PersistedSessionAggregate]
# @INVARIANT: answers, mapping approvals, preview artifacts, and launch snapshots are never attributed to the wrong user or session.
# @TEST_CONTRACT: save_then_resume -> persisted session can be reopened without losing semantic/manual/clarification state
# @TEST_SCENARIO: resume_session_preserves_manual_overrides -> locked semantic fields remain active after reload
# @TEST_EDGE: foreign_user_access -> rejected
# @TEST_EDGE: missing_session -> not found
# @TEST_EDGE: partial_preview_snapshot -> preserved but not marked launchable
# @TEST_INVARIANT: ownership_scope -> VERIFIED_BY: [foreign_user_access]

#### ƒ **create_session**
# @PURPOSE: Persist initial session shell.

#### ƒ **load_session_detail**
# @PURPOSE: Return the full session aggregate for API/frontend use.

#### ƒ **save_profile_and_findings**
# @PURPOSE: Persist profile and validation state together.

#### ƒ **save_preview**
# @PURPOSE: Persist compiled preview attempt and mark older fingerprints stale.

#### ƒ **save_run_context**
# @PURPOSE: Persist immutable launch audit snapshot.

# [/DEF:DatasetReviewSessionRepository:Module]
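The `ownership_scope` invariant and the `foreign_user_access` / `missing_session` edges can be illustrated with an in-memory sketch. The names (`load_session_detail` over a plain dict store, `SessionAccessError`) are assumptions for illustration; the real repository presumably sits on the application persistence layer.

```python
class SessionAccessError(Exception):
    """Raised when a session is accessed by a user outside its ownership scope."""


def load_session_detail(store: dict, session_id: str, user_id: str) -> dict:
    """Return the session aggregate only for the owner or a registered collaborator."""
    session = store.get(session_id)
    if session is None:
        # missing_session edge: not found
        raise KeyError(f"session {session_id} not found")
    allowed = {session["user_id"],
               *(c["user_id"] for c in session.get("collaborators", []))}
    if user_id not in allowed:
        # foreign_user_access edge: reject rather than leak the aggregate
        raise SessionAccessError("session belongs to another user")
    return session
```

The key property is that rejection happens before any aggregate data leaves the repository boundary.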
---

# [DEF:SemanticSourceResolver:Module]
# @COMPLEXITY: 4
# @PURPOSE: Resolve, rank, and apply semantic metadata candidates from files, connected dictionaries, reference datasets, and AI generation fallback.
# @LAYER: Domain
# @RELATION: [DEPENDS_ON] ->[LLMProviderService]
# @RELATION: [DEPENDS_ON] ->[SemanticSource]
# @RELATION: [DEPENDS_ON] ->[SemanticFieldEntry]
# @RELATION: [DEPENDS_ON] ->[SemanticCandidate]
# @PRE: selected source and target field set must be known.
# @POST: candidate ranking follows the configured confidence hierarchy and unresolved fuzzy matches remain reviewable.
# @SIDE_EFFECT: may create conflict findings and semantic candidate records.
# @INVARIANT: Manual overrides are never silently replaced by imported, inferred, or AI-generated values.
# @TEST_CONTRACT: rank_candidates -> exact dictionary beats reference import beats fuzzy beats AI draft
# @TEST_SCENARIO: manual_lock_survives_reimport -> locked field remains active after another source is applied
# @TEST_EDGE: malformed_source_payload -> failed source application with explanatory finding
# @TEST_EDGE: conflicting_sources -> conflict state preserved for review
# @TEST_EDGE: no_trusted_matches -> AI draft fallback only
# @TEST_INVARIANT: confidence_hierarchy -> VERIFIED_BY: [rank_candidates]

#### ƒ **resolve_from_file**
# @PURPOSE: Normalize uploaded semantic file records into field-level candidates.

#### ƒ **resolve_from_dictionary**
# @PURPOSE: Resolve candidates from connected tabular dictionary sources.

#### ƒ **resolve_from_reference_dataset**
# @PURPOSE: Reuse semantic metadata from trusted Superset datasets.

#### ƒ **rank_candidates**
# @PURPOSE: Apply confidence ordering and determine best candidate per field.

#### ƒ **detect_conflicts**
# @PURPOSE: Mark competing candidate sets that require explicit user review.

#### ƒ **apply_field_decision**
# @PURPOSE: Accept, reject, or manually override a field-level semantic value.

# [/DEF:SemanticSourceResolver:Module]
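The confidence hierarchy behind `rank_candidates` ("exact dictionary beats reference import beats fuzzy beats AI draft") and the manual-override invariant can be sketched together. Tier names, the dict shapes, and the tie-breaking score are illustrative assumptions, not the resolver's actual data contract.

```python
# Lower rank = more trusted, mirroring the documented confidence hierarchy:
# exact dictionary > reference import > fuzzy match > AI draft.
SOURCE_RANK = {"dictionary_exact": 0, "reference_import": 1,
               "fuzzy_match": 2, "ai_draft": 3}


def rank_candidates(candidates: list[dict]) -> list[dict]:
    """Order candidates by source trust tier, then by score within a tier."""
    return sorted(candidates,
                  key=lambda c: (SOURCE_RANK[c["source"]], -c.get("score", 0.0)))


def apply_field_decision(field: dict, candidates: list[dict]) -> dict:
    """Pick the winning value for a field; a locked manual override always survives."""
    if field.get("manual_override") is not None:
        # Manual values are never silently replaced by any imported/AI candidate.
        return {**field, "value": field["manual_override"], "provenance": "manual"}
    ranked = rank_candidates(candidates)
    if not ranked:
        return field
    return {**field, "value": ranked[0]["value"], "provenance": ranked[0]["source"]}
```

Because the override check runs before ranking, re-applying another source (the `manual_lock_survives_reimport` scenario) cannot displace a locked value.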
---

# [DEF:ClarificationEngine:Module]
# @COMPLEXITY: 4
# @PURPOSE: Manage one-question-at-a-time clarification sessions, including prioritization, answer persistence, and readiness impact updates.
# @LAYER: Domain
# @RELATION: [DEPENDS_ON] ->[ClarificationSession]
# @RELATION: [DEPENDS_ON] ->[ClarificationQuestion]
# @RELATION: [DEPENDS_ON] ->[ClarificationAnswer]
# @RELATION: [DEPENDS_ON] ->[ValidationFinding]
# @PRE: target session contains unresolved or contradictory review state.
# @POST: every recorded answer updates the clarification session and associated session state deterministically.
# @SIDE_EFFECT: creates clarification questions, persists answers, updates findings/profile state.
# @INVARIANT: Clarification answers are persisted before the current question pointer or readiness state is advanced.
# @TEST_CONTRACT: next_question_selection -> returns only one highest-priority unresolved question at a time
# @TEST_SCENARIO: save_and_resume_clarification -> reopening session restores current question and prior answers
# @TEST_EDGE: skipped_question -> unresolved topic remains visible
# @TEST_EDGE: expert_review_marked -> topic deferred without false resolution
# @TEST_EDGE: duplicate_answer_submission -> idempotent or rejected deterministically
# @TEST_INVARIANT: single_active_question -> VERIFIED_BY: [next_question_selection]

#### ƒ **start_or_resume**
# @PURPOSE: Open clarification mode on the highest-priority unresolved question.

#### ƒ **build_question_payload**
# @PURPOSE: Return question, why-it-matters text, current guess, and suggested options.

#### ƒ **record_answer**
# @PURPOSE: Persist one answer and compute state impact.

#### ƒ **summarize_progress**
# @PURPOSE: Produce the clarification change summary shown on exit or pause.

# [/DEF:ClarificationEngine:Module]
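The answer-before-advance invariant and the single-active-question contract can be shown with a minimal in-memory `record_answer` sketch. The flat `state` dict is a stand-in for the persisted clarification session, not its real schema.

```python
def record_answer(state: dict, question_id: str, answer: str) -> dict:
    """Persist one answer, then advance the current-question pointer (never before)."""
    if state["current_question"] != question_id:
        # duplicate or out-of-order submission is rejected deterministically
        raise ValueError("answer targets a question that is not the active one")
    state["answers"][question_id] = answer  # persisted before the pointer moves
    remaining = [q for q in state["questions"]
                 if q not in state["answers"] and q not in state["skipped"]]
    # Only one highest-priority unresolved question is ever active at a time.
    state["current_question"] = remaining[0] if remaining else None
    return state
```

Skipped questions stay out of `answers`, so the unresolved topic remains visible, matching the `skipped_question` edge.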
---

# [DEF:SupersetContextExtractor:Module]
# @COMPLEXITY: 4
# @PURPOSE: Recover dataset, dashboard, filter, and runtime-template context from Superset links and related API payloads.
# @LAYER: Infra
# @RELATION: [CALLS] ->[SupersetClient]
# @RELATION: [DEPENDS_ON] ->[ImportedFilter]
# @RELATION: [DEPENDS_ON] ->[TemplateVariable]
# @PRE: Superset link or dataset reference must be parseable enough to resolve an environment-scoped target resource.
# @POST: returns the best available recovered context with explicit provenance and partial-recovery markers when necessary.
# @SIDE_EFFECT: performs upstream Superset API reads.
# @INVARIANT: Partial recovery is surfaced explicitly and never misrepresented as fully confirmed context.
# @TEST_CONTRACT: recover_context_from_link -> output distinguishes URL-derived, native-filter-derived, and unresolved context
# @TEST_SCENARIO: partial_filter_recovery_marks_recovery_required -> session remains usable but not falsely complete
# @TEST_EDGE: unsupported_link_shape -> intake failure with actionable finding
# @TEST_EDGE: dataset_without_filters -> successful dataset recovery with empty imported filter set
# @TEST_EDGE: missing_dashboard_binding -> partial recovery only
# @TEST_INVARIANT: provenance_visibility -> VERIFIED_BY: [recover_context_from_link]

#### ƒ **parse_superset_link**
# @PURPOSE: Extract candidate identifiers and query state from supported Superset URLs.

#### ƒ **recover_imported_filters**
# @PURPOSE: Build imported filter entries from URL state and Superset-side saved context.

#### ƒ **discover_template_variables**
# @PURPOSE: Detect runtime variables and Jinja references from dataset query-bearing fields.

#### ƒ **build_recovery_summary**
# @PURPOSE: Summarize recovered, partial, and unresolved context for session state and UX.

# [/DEF:SupersetContextExtractor:Module]
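A `parse_superset_link` sketch shows how URL-derived context and the explicit unresolved marker might look. The assumed URL shape (`.../dashboard/<id>/`) matches common Superset dashboard links, but the real extractor would support more link shapes and draw on Superset API payloads as well.

```python
from urllib.parse import parse_qs, urlparse


def parse_superset_link(url: str) -> dict:
    """Extract a dashboard id and raw query state from a dashboard-style Superset URL."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    result = {"dashboard_id": None,
              "query_state": parse_qs(parts.query),
              "unresolved": False}
    if "dashboard" in segments:
        idx = segments.index("dashboard")
        if idx + 1 < len(segments) and segments[idx + 1].isdigit():
            result["dashboard_id"] = int(segments[idx + 1])
    if result["dashboard_id"] is None:
        # Partial recovery is surfaced explicitly, never hidden.
        result["unresolved"] = True
    return result
```

Downstream, `build_recovery_summary` would fold markers like `unresolved` into the session's recovery state instead of pretending the context is complete.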
---

# [DEF:SupersetCompilationAdapter:Module]
# @COMPLEXITY: 4
# @PURPOSE: Interact with Superset preview compilation and SQL Lab execution endpoints using the current approved execution context.
# @LAYER: Infra
# @RELATION: [CALLS] ->[SupersetClient]
# @RELATION: [DEPENDS_ON] ->[CompiledPreview]
# @RELATION: [DEPENDS_ON] ->[DatasetRunContext]
# @PRE: effective template params and dataset execution reference are available.
# @POST: preview and launch calls return Superset-originated artifacts or explicit errors.
# @SIDE_EFFECT: performs upstream Superset preview and SQL Lab calls.
# @INVARIANT: The adapter never fabricates compiled SQL locally; preview truth is delegated to Superset only.
# @TEST_CONTRACT: compile_then_launch -> launch uses the same effective input fingerprint verified in preview
# @TEST_SCENARIO: preview_failure_blocks_launch -> no SQL Lab session is created after failed preview
# @TEST_EDGE: compilation_endpoint_error -> failed preview artifact with readable diagnostics
# @TEST_EDGE: sql_lab_creation_error -> failed launch audit state
# @TEST_EDGE: fingerprint_mismatch -> launch rejected
# @TEST_INVARIANT: superset_truth_source -> VERIFIED_BY: [preview_failure_blocks_launch]

#### ƒ **compile_preview**
# @PURPOSE: Request Superset-side compiled SQL preview for the current effective inputs.

#### ƒ **mark_preview_stale**
# @PURPOSE: Invalidate previous preview after mapping or value changes.

#### ƒ **create_sql_lab_session**
# @PURPOSE: Create the canonical audited execution session after all launch gates pass.

# [/DEF:SupersetCompilationAdapter:Module]
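The staleness and fingerprint-mismatch behavior can be sketched without any upstream calls. These function bodies are assumptions for illustration; the real adapter would call Superset and return its artifacts rather than local dicts.

```python
def mark_preview_stale(previews: list[dict], current_fingerprint: str) -> None:
    """Invalidate every ready preview whose fingerprint no longer matches current inputs."""
    for preview in previews:
        if preview["state"] == "ready" and preview["fingerprint"] != current_fingerprint:
            preview["state"] = "stale"


def create_sql_lab_session(preview: dict, effective_fingerprint: str) -> dict:
    """Refuse launch unless a ready preview exists for the exact fingerprint verified earlier."""
    if preview["state"] != "ready" or preview["fingerprint"] != effective_fingerprint:
        # fingerprint_mismatch / preview_failure_blocks_launch: no session is created
        raise RuntimeError("launch rejected: preview missing, failed, or stale")
    return {"sql_lab_session_id": 1, "fingerprint": preview["fingerprint"]}
```

Launch and preview sharing one fingerprint is what makes the `compile_then_launch` contract checkable.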
---

## 2. Frontend Components

<!-- [DEF:DatasetReviewWorkspace:Component] -->
<!-- @COMPLEXITY: 5 -->
<!-- @PURPOSE: Main dataset review workspace coordinating session state, progressive recovery, semantic review, clarification, preview, and launch UX. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[api_module] -->
<!-- @RELATION: [BINDS_TO] ->[assistantChat] -->
<!-- @RELATION: [BINDS_TO] ->[taskDrawer] -->
<!-- @UX_STATE: Empty -> Show source intake with Superset link and dataset-selection entry actions. -->
<!-- @UX_STATE: Importing -> Show progressive recovery milestones as context is assembled. -->
<!-- @UX_STATE: Review -> Show summary, findings, semantic layer, filters, mapping, and next action. -->
<!-- @UX_STATE: Clarification -> Focus the user on one current clarification question while preserving wider session context. -->
<!-- @UX_STATE: Ready -> Show launch summary and unambiguous run-ready signal without hiding warnings. -->
<!-- @UX_FEEDBACK: Main CTA changes by readiness state and reflects current highest-value next action. -->
<!-- @UX_RECOVERY: Users can save, resume, or reopen an unfinished session without losing context. -->
<!-- @UX_REACTIVITY: Uses Svelte runes for session, readiness, preview, and task state derivation. -->
<!-- @INVARIANT: Navigation away from dirty session state must require explicit confirmation. -->
<!-- @TEST_CONTRACT: workspace_state_machine -> one and only one major readiness-driven CTA is primary at a time -->
<!-- @TEST_SCENARIO: resume_preserves_state -> reopening unfinished session restores current panel state and next action -->
<!-- @TEST_EDGE: session_load_failure -> error/recovery UI shown -->
<!-- @TEST_EDGE: stale_preview_after_mapping_change -> preview state flips to stale -->
<!-- @TEST_EDGE: unsaved_navigation -> guarded exit -->
<!-- @TEST_INVARIANT: primary_cta_alignment -> VERIFIED_BY: [workspace_state_machine] -->

#### ƒ **handleSourceSubmit**
<!-- @PURPOSE: Start a session from a Superset link or dataset selection. -->
<!-- @UX_FEEDBACK: Immediate optimistic intake acknowledgement plus recovery progress. -->

#### ƒ **handleResumeSession**
<!-- @PURPOSE: Reopen an existing paused or unfinished session. -->
<!-- @UX_FEEDBACK: Restores the session into the correct readiness-driven panel state. -->

#### ƒ **handleLaunch**
<!-- @PURPOSE: Execute the final launch action once run-ready gates pass. -->
<!-- @UX_STATE: Launching -> Disable CTA and expose progress/result handoff. -->
<!-- @UX_FEEDBACK: Success state links to SQL Lab and audit summary; failure preserves context and recovery path. -->

<!-- [/DEF:DatasetReviewWorkspace:Component] -->
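The `workspace_state_machine` contract ("one and only one readiness-driven CTA is primary at a time") is a pure-logic property, shown here as a language-neutral Python sketch rather than the actual Svelte component; the CTA labels and session flags are hypothetical.

```python
def primary_cta(session: dict) -> str:
    """Derive the single primary call-to-action from session readiness."""
    if not session.get("source_input"):
        return "Import from Superset"       # Empty state
    if session.get("blocking_findings"):
        return "Resolve blocking findings"  # Review state with blockers
    if session.get("open_questions"):
        return "Continue clarification"     # Clarification state
    if not session.get("preview_ready"):
        return "Generate preview"           # Preview missing or stale
    return "Launch dataset"                 # Ready state
```

Because the branches are mutually exclusive and ordered, exactly one CTA is ever primary, which is the invariant `primary_cta_alignment` verifies.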
---

<!-- [DEF:SourceIntakePanel:Component] -->
<!-- @COMPLEXITY: 3 -->
<!-- @PURPOSE: Collect initial dataset source input through Superset link paste or dataset selection entry paths. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[api_module] -->
<!-- @UX_STATE: Idle -> Empty intake form with two clear entry paths. -->
<!-- @UX_STATE: Validating -> Lightweight inline validation feedback. -->
<!-- @UX_STATE: Rejected -> Input error shown with corrective hint. -->
<!-- @UX_FEEDBACK: Recognized links are acknowledged before deeper recovery finishes. -->
<!-- @UX_RECOVERY: Users can correct invalid input in place without resetting the page. -->

#### ƒ **submitSupersetLink**
<!-- @PURPOSE: Validate and submit Superset link input. -->

#### ƒ **submitDatasetSelection**
<!-- @PURPOSE: Submit selected dataset/environment context. -->

<!-- [/DEF:SourceIntakePanel:Component] -->
---

<!-- [DEF:ValidationFindingsPanel:Component] -->
<!-- @COMPLEXITY: 3 -->
<!-- @PURPOSE: Present validation findings grouped by severity with explicit resolution and actionability signals. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[DatasetReviewWorkspace] -->
<!-- @UX_STATE: Blocking -> Blocking findings are visually dominant and block launch flow. -->
<!-- @UX_STATE: Warning -> Warnings remain visible with explicit approval or defer actions. -->
<!-- @UX_STATE: Informational -> Low-priority findings are collapsed or secondary. -->
<!-- @UX_FEEDBACK: Resolving or approving an item updates readiness state immediately. -->
<!-- @UX_RECOVERY: Users can jump from a finding directly to the relevant remediation area. -->

#### ƒ **groupFindingsBySeverity**
<!-- @PURPOSE: Project findings into blocking, warning, and informational groups. -->

#### ƒ **jumpToFindingTarget**
<!-- @PURPOSE: Focus the relevant review section for a selected finding. -->

<!-- [/DEF:ValidationFindingsPanel:Component] -->
---

<!-- [DEF:SemanticLayerReview:Component] -->
<!-- @COMPLEXITY: 3 -->
<!-- @PURPOSE: Review and edit semantic metadata for columns and metrics with provenance and conflict visibility. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[api_module] -->
<!-- @UX_STATE: Normal -> Show current semantic values and provenance badges. -->
<!-- @UX_STATE: Conflicted -> Show side-by-side competing semantic candidates for the same field. -->
<!-- @UX_STATE: Manual -> Show locked manual override and block silent overwrite. -->
<!-- @UX_FEEDBACK: Applying a semantic source or manual override updates provenance and readiness immediately. -->
<!-- @UX_RECOVERY: Users can keep current values, accept recommendations, or review candidates one by one. -->

#### ƒ **applyManualOverride**
<!-- @PURPOSE: Lock a field to a user-provided semantic value. -->
<!-- @UX_FEEDBACK: Field marked as manual override and source-import replacement is disabled. -->

#### ƒ **applyCandidateSelection**
<!-- @PURPOSE: Accept one candidate from conflicting or fuzzy semantic options. -->
<!-- @UX_FEEDBACK: Candidate badge state changes and conflict warning clears when appropriate. -->

<!-- [/DEF:SemanticLayerReview:Component] -->
---

<!-- [DEF:ClarificationDialog:Component] -->
<!-- @COMPLEXITY: 3 -->
<!-- @PURPOSE: One-question-at-a-time clarification surface for unresolved or contradictory dataset meanings. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[api_module] -->
<!-- @RELATION: [BINDS_TO] ->[assistantChat] -->
<!-- @UX_STATE: Question -> Show current question, why-it-matters text, current guess, and selectable answers. -->
<!-- @UX_STATE: Saving -> Disable controls while persisting answer. -->
<!-- @UX_STATE: Completed -> Show clarification summary and impact on readiness. -->
<!-- @UX_FEEDBACK: Each answer updates profile and findings without forcing a full page reload. -->
<!-- @UX_RECOVERY: Users can skip, defer to expert review, pause, and resume later. -->

#### ƒ **submitAnswer**
<!-- @PURPOSE: Save selected or custom clarification answer. -->

#### ƒ **skipQuestion**
<!-- @PURPOSE: Defer the current question while keeping it unresolved. -->

#### ƒ **pauseClarification**
<!-- @PURPOSE: Exit clarification mode without losing prior answers. -->

<!-- [/DEF:ClarificationDialog:Component] -->
---

<!-- [DEF:ExecutionMappingReview:Component] -->
<!-- @COMPLEXITY: 3 -->
<!-- @PURPOSE: Review mappings between imported filters and detected template variables, including transformed values and warning approvals. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[api_module] -->
<!-- @UX_STATE: Incomplete -> Required mapping values still missing. -->
<!-- @UX_STATE: WarningApproval -> Mapping rows require explicit approval before launch. -->
<!-- @UX_STATE: Approved -> All launch-sensitive mappings approved or overridden. -->
<!-- @UX_FEEDBACK: Mapping approvals or manual edits immediately re-evaluate preview staleness and readiness. -->
<!-- @UX_RECOVERY: Users can manually override transformed values instead of approving them as-is. -->

#### ƒ **approveMapping**
<!-- @PURPOSE: Explicitly approve a warning-level value transformation. -->
<!-- @UX_FEEDBACK: Warning cleared and launch checklist refreshed. -->

#### ƒ **overrideMappingValue**
<!-- @PURPOSE: Replace the proposed effective mapping value manually. -->
<!-- @UX_FEEDBACK: Mapping method switches to manual override and prior warning may be cleared or recalculated. -->

<!-- [/DEF:ExecutionMappingReview:Component] -->
---

<!-- [DEF:CompiledSQLPreview:Component] -->
<!-- @COMPLEXITY: 3 -->
<!-- @PURPOSE: Present the exact Superset-generated compiled SQL preview with refresh state and failure diagnostics. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[api_module] -->
<!-- @UX_STATE: Missing -> Prompt user to generate preview. -->
<!-- @UX_STATE: Pending -> Show generation-in-progress feedback. -->
<!-- @UX_STATE: Ready -> Render read-only SQL preview with visible substitutions. -->
<!-- @UX_STATE: Stale -> Mark preview invalid after input changes until regenerated. -->
<!-- @UX_STATE: Error -> Show readable Superset compilation failure details and recovery action. -->
<!-- @UX_FEEDBACK: Refreshing preview updates timestamp and launch readiness cues. -->
<!-- @UX_RECOVERY: Users can navigate directly from preview error to the mapping or value row that caused failure. -->

#### ƒ **requestPreview**
<!-- @PURPOSE: Trigger preview generation for the current effective session inputs. -->

#### ƒ **showPreviewErrorTarget**
<!-- @PURPOSE: Focus remediation target when compilation diagnostics identify a mapping or variable issue. -->

<!-- [/DEF:CompiledSQLPreview:Component] -->
---

<!-- [DEF:LaunchConfirmationPanel:Component] -->
<!-- @COMPLEXITY: 3 -->
<!-- @PURPOSE: Summarize final run context, approvals, warnings, and compiled-preview status before dataset launch. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [BINDS_TO] ->[DatasetReviewWorkspace] -->
<!-- @UX_STATE: Blocked -> Explicitly list missing gates preventing launch. -->
<!-- @UX_STATE: Ready -> Show final reviewed context and confirm action. -->
<!-- @UX_STATE: Submitted -> Show handoff to SQL Lab and audit snapshot reference. -->
<!-- @UX_FEEDBACK: Confirmation view is a contextual run summary, not a generic yes/no modal. -->
<!-- @UX_RECOVERY: When blocked, users can jump back to missing values, mapping approvals, or preview generation. -->

#### ƒ **buildLaunchSummary**
<!-- @PURPOSE: Project the exact run context into the final pre-launch summary. -->

#### ƒ **confirmLaunch**
<!-- @PURPOSE: Submit the run-ready launch request once all gates pass. -->

<!-- [/DEF:LaunchConfirmationPanel:Component] -->
---

## 3. Contract Coverage Notes

The feature requires:

- dedicated semantic resolution contracts instead of hiding source-ranking logic inside orchestration,
- a first-class clarification engine because guided ambiguity resolution is a persisted workflow, not a simple endpoint,
- a Superset extraction boundary distinct from preview/launch behavior,
- UI contracts that cover the UX state machine rather than only the happy path.

These contracts are intended to align directly with:

- [`specs/027-dataset-llm-orchestration/spec.md`](../spec.md)
- [`specs/027-dataset-llm-orchestration/ux_reference.md`](../ux_reference.md)
- [`specs/027-dataset-llm-orchestration/research.md`](../research.md)
- [`specs/027-dataset-llm-orchestration/data-model.md`](../data-model.md)
---
# Data Model: LLM Dataset Orchestration

**Feature**: [LLM Dataset Orchestration](./spec.md)
**Branch**: `027-dataset-llm-orchestration`
**Date**: 2026-03-16

## Overview

This document defines the domain entities, relationships, lifecycle states, and validation rules for the dataset review, semantic enrichment, clarification, preview, and launch workflow described in [`spec.md`](./spec.md) and grounded by the decisions in [`research.md`](./research.md).

The model is intentionally split into:

- **session aggregate** entities for resumable workflow state,
- **semantic/provenance** entities for enrichment and conflict handling,
- **execution** entities for mapping, preview, and launch audit,
- **export** projections for sharing outputs.

---

## 1. Core Aggregate: DatasetReviewSession

### Entity: `SessionCollaborator`

| Field | Type | Required | Description |
|---|---|---:|---|
| `user_id` | string | yes | Collaborating user ID |
| `role` | enum | yes | `viewer`, `reviewer`, `approver` |
| `added_at` | datetime | yes | When the collaborator was added |

### Entity: `DatasetReviewSession`

Represents the top-level resumable workflow container for one dataset review/execution effort.

| Field | Type | Required | Description |
|---|---|---:|---|
| `session_id` | string (UUID) | yes | Stable unique identifier for the review session |
| `user_id` | string | yes | Authenticated user ID of the session owner |
| `collaborators` | list[SessionCollaborator] | no | Shared access and roles |
| `environment_id` | string | yes | Superset environment context |
| `source_kind` | enum | yes | Origin kind: `superset_link`, `dataset_selection` |
| `source_input` | string | yes | Original link or selected dataset reference |
| `dataset_ref` | string | yes | Canonical dataset reference used by the feature |
| `dataset_id` | integer \| null | no | Superset dataset id when resolved |
| `dashboard_id` | integer \| null | no | Superset dashboard id if imported from a dashboard link |
| `readiness_state` | enum | yes | Current workflow readiness state |
| `recommended_action` | enum | yes | Explicit next recommended action |
| `status` | enum | yes | Session lifecycle status |
| `current_phase` | enum | yes | Active workflow phase |
| `active_task_id` | string \| null | no | Linked long-running task if one is active |
| `last_preview_id` | string \| null | no | Most recent preview snapshot |
| `last_run_context_id` | string \| null | no | Most recent launch audit record |
| `created_at` | datetime | yes | Session creation timestamp |
| `updated_at` | datetime | yes | Last mutation timestamp |
| `last_activity_at` | datetime | yes | Last user/system activity timestamp |
| `closed_at` | datetime \| null | no | Terminal close/archive timestamp |

### Validation rules

- `session_id` must be globally unique.
- `source_input` must be non-empty.
- `environment_id` must resolve to a configured environment.
- `readiness_state` and `recommended_action` must always be present.
- `user_id` ownership must be enforced for all mutations, unless collaborator roles allow otherwise.
- `dataset_id` becomes required before preview or launch phases.
- `last_preview_id` must refer to a preview generated from the same session.
|
||||
|
||||
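The session-level rules above can be sketched as a small Python dataclass. Field names follow the table, but the class, its phase check, and the example values are illustrative, not the actual backend model:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetReviewSession:
    """Minimal sketch of the session aggregate; many table fields omitted."""
    session_id: str
    user_id: str
    environment_id: str
    source_kind: str          # 'superset_link' | 'dataset_selection'
    source_input: str
    dataset_ref: str
    current_phase: str = "intake"
    dataset_id: Optional[int] = None

    def __post_init__(self) -> None:
        # Validation rule: source_input must be non-empty.
        if not self.source_input.strip():
            raise ValueError("source_input must be non-empty")

    def can_enter_phase(self, phase: str) -> bool:
        # Validation rule: dataset_id becomes required before preview/launch.
        if phase in ("preview", "launch") and self.dataset_id is None:
            return False
        return True
```

A session created from an unresolved link can be reviewed, but cannot enter `preview` until `dataset_id` is resolved.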
### Enums

#### `SessionStatus`

- `active`
- `paused`
- `completed`
- `archived`
- `cancelled`

#### `SessionPhase`

- `intake`
- `recovery`
- `review`
- `semantic_review`
- `clarification`
- `mapping_review`
- `preview`
- `launch`
- `post_run`

#### `ReadinessState`

- `empty`
- `importing`
- `review_ready`
- `semantic_source_review_needed`
- `clarification_needed`
- `clarification_active`
- `mapping_review_needed`
- `compiled_preview_ready`
- `partially_ready`
- `run_ready`
- `run_in_progress`
- `completed`
- `recovery_required`

#### `RecommendedAction`

- `import_from_superset`
- `review_documentation`
- `apply_semantic_source`
- `start_clarification`
- `answer_next_question`
- `approve_mapping`
- `generate_sql_preview`
- `complete_required_values`
- `launch_dataset`
- `resume_session`
- `export_outputs`
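One way to keep `readiness_state` and `recommended_action` consistent is a single lookup table. The pairing below is an illustrative sketch of the shape, not the definitive product policy:

```python
# Illustrative mapping from readiness_state to the recommended_action the UI
# should surface; the real policy may pair these differently.
RECOMMENDED_ACTION_FOR_STATE = {
    "empty": "import_from_superset",
    "importing": "resume_session",
    "review_ready": "review_documentation",
    "semantic_source_review_needed": "apply_semantic_source",
    "clarification_needed": "start_clarification",
    "clarification_active": "answer_next_question",
    "mapping_review_needed": "approve_mapping",
    "compiled_preview_ready": "launch_dataset",
    "partially_ready": "complete_required_values",
    "run_ready": "launch_dataset",
    "run_in_progress": "resume_session",
    "completed": "export_outputs",
    "recovery_required": "resume_session",
}


def recommended_action(readiness_state: str) -> str:
    # Fall back to resuming the session for unknown/new states.
    return RECOMMENDED_ACTION_FOR_STATE.get(readiness_state, "resume_session")
```

Centralizing the pairing makes the rule that both fields "must always be present" trivially enforceable: the action is derived, never stored inconsistently.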
---

## 2. Dataset Profile and Review State

### Entity: `DatasetProfile`

Consolidated interpretation of dataset meaning, semantics, filters, assumptions, and readiness.

| Field | Type | Required | Description |
|---|---|---:|---|
| `profile_id` | string (UUID) | yes | Unique profile id |
| `session_id` | string | yes | Parent session |
| `dataset_name` | string | yes | Display dataset name |
| `schema_name` | string \| null | no | Schema if available |
| `database_name` | string \| null | no | Database if available |
| `business_summary` | text | yes | Human-readable summary |
| `business_summary_source` | enum | yes | Provenance of summary |
| `description` | text \| null | no | Dataset-level description |
| `dataset_type` | enum \| null | no | `table`, `virtual`, `sqllab_view`, `unknown` |
| `is_sqllab_view` | boolean | yes | Whether dataset is SQL Lab derived |
| `completeness_score` | number \| null | no | Optional normalized completeness score |
| `confidence_state` | enum | yes | Overall confidence posture |
| `has_blocking_findings` | boolean | yes | Derived summary flag |
| `has_warning_findings` | boolean | yes | Derived summary flag |
| `manual_summary_locked` | boolean | yes | Protects user-entered summary |
| `created_at` | datetime | yes | Created timestamp |
| `updated_at` | datetime | yes | Updated timestamp |

### Validation rules

- `business_summary` must always contain a usable string; when evidence is weak it may be skeletal, but never null.
- `manual_summary_locked=true` prevents later automatic overwrites.
- `session_id` must be unique if only one active profile snapshot is stored per session, or versioned if snapshots are retained.
- `confidence_state` must reflect the highest unresolved-risk posture, not just optimistic confidence.

#### `BusinessSummarySource`

- `confirmed`
- `imported`
- `inferred`
- `ai_draft`
- `manual_override`

#### `ConfidenceState`

- `confirmed`
- `mostly_confirmed`
- `mixed`
- `low_confidence`
- `unresolved`
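The rule that `confidence_state` must reflect the highest unresolved-risk posture can be sketched as a worst-case fold over per-field confidence classes. The `_RISK` ordering and the class-to-state thresholds here are assumptions for illustration:

```python
# Higher number = more unresolved risk; 'manual_override' is treated as
# confirmed-by-a-human for this sketch.
_RISK = {"confirmed": 0, "manual_override": 0, "imported": 1, "inferred": 2, "ai_draft": 3, "unresolved": 4}


def profile_confidence(field_confidence_classes: list) -> str:
    """Derive the profile-level posture from the worst per-field class."""
    if not field_confidence_classes:
        return "unresolved"
    worst = max(_RISK.get(c, 4) for c in field_confidence_classes)
    if worst == 0:
        return "confirmed"
    if worst == 1:
        return "mostly_confirmed"
    if worst == 2:
        return "mixed"
    if worst == 3:
        return "low_confidence"
    return "unresolved"
```

A single `ai_draft` field is enough to pull the whole profile down to `low_confidence`, which is exactly the "not just optimistic confidence" requirement.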
---

## 3. Validation Findings

### Entity: `ValidationFinding`

Represents a blocking issue, warning, or informational observation.

| Field | Type | Required | Description |
|---|---|---:|---|
| `finding_id` | string (UUID) | yes | Unique finding id |
| `session_id` | string | yes | Parent session |
| `area` | enum | yes | Affected domain area |
| `severity` | enum | yes | `blocking`, `warning`, `informational` |
| `code` | string | yes | Stable machine-readable finding code |
| `title` | string | yes | Short label |
| `message` | text | yes | Actionable human-readable explanation |
| `resolution_state` | enum | yes | Current resolution status |
| `resolution_note` | text \| null | no | Optional explanation or approval note |
| `caused_by_ref` | string \| null | no | Related field/filter/mapping/question id |
| `created_at` | datetime | yes | Creation timestamp |
| `resolved_at` | datetime \| null | no | Resolution timestamp |

### Validation rules

- `severity` must be one of the allowed values.
- Moving to `resolution_state=resolved` or `approved` requires either a system resolution event or an explicit user action.
- Launch is blocked while any open `blocking` finding remains.
- `warning` findings tied to mapping transformations require explicit approval before launch if marked launch-sensitive.

#### `FindingArea`

- `source_intake`
- `dataset_profile`
- `semantic_enrichment`
- `clarification`
- `filter_recovery`
- `template_mapping`
- `compiled_preview`
- `launch`
- `audit`

#### `ResolutionState`

- `open`
- `resolved`
- `approved`
- `skipped`
- `deferred`
- `expert_review`
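A minimal sketch of the launch-gating rules above, treating findings as plain dicts. The `launch_sensitive` flag is an assumed marker, since the table does not define a dedicated field for it:

```python
def launch_blockers(findings: list) -> list:
    """Return the findings that currently block launch: open blocking findings,
    plus launch-sensitive warnings that were never resolved or approved."""
    blockers = []
    for f in findings:
        if f["severity"] == "blocking" and f["resolution_state"] == "open":
            blockers.append(f)
        elif (
            f["severity"] == "warning"
            and f.get("launch_sensitive", False)
            and f["resolution_state"] not in ("approved", "resolved")
        ):
            blockers.append(f)
    return blockers
```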
---

## 4. Semantic Source and Field Decisions

### Entity: `SemanticSource`

Represents a trusted or candidate source of semantic metadata.

| Field | Type | Required | Description |
|---|---|---:|---|
| `source_id` | string (UUID) | yes | Unique source id |
| `session_id` | string | yes | Parent session |
| `source_type` | enum | yes | Origin kind |
| `source_ref` | string | yes | External reference, dataset ref, or uploaded artifact ref |
| `source_version` | string | yes | Version/snapshot for propagation tracking |
| `display_name` | string | yes | Human-readable source name |
| `trust_level` | enum | yes | Source trust tier |
| `schema_overlap_score` | number \| null | no | Optional overlap signal |
| `status` | enum | yes | Availability/applicability status |
| `created_at` | datetime | yes | Creation timestamp |

#### `SemanticSourceType`

- `uploaded_file`
- `connected_dictionary`
- `reference_dataset`
- `neighbor_dataset`
- `ai_generated`

#### `TrustLevel`

- `trusted`
- `recommended`
- `candidate`
- `generated`

#### `SemanticSourceStatus`

- `available`
- `selected`
- `applied`
- `rejected`
- `partial`
- `failed`

---
### Entity: `SemanticFieldEntry`

Canonical semantic state for one dataset field or metric.

| Field | Type | Required | Description |
|---|---|---:|---|
| `field_id` | string (UUID) | yes | Unique field semantic id |
| `session_id` | string | yes | Parent session |
| `field_name` | string | yes | Physical field/metric name |
| `field_kind` | enum | yes | `column`, `metric`, `filter_dimension`, `parameter` |
| `verbose_name` | string \| null | no | Display label |
| `description` | text \| null | no | Human-readable description |
| `display_format` | string \| null | no | Formatting metadata such as a d3 format string |
| `provenance` | enum | yes | Final chosen source class |
| `source_id` | string \| null | no | Winning source |
| `confidence_rank` | integer \| null | no | Final applied ranking |
| `is_locked` | boolean | yes | Manual override protection |
| `has_conflict` | boolean | yes | Whether competing candidates exist |
| `needs_review` | boolean | yes | Whether user review is still needed |
| `last_changed_by` | enum | yes | `system`, `user`, `agent` |
| `user_feedback` | enum \| null | no | User feedback: `up`, `down`, or null when none given |
| `created_at` | datetime | yes | Creation timestamp |
| `updated_at` | datetime | yes | Updated timestamp |

### Validation rules

- `field_name` must be unique per `session_id + field_kind`.
- `is_locked=true` prevents automatic overwrite.
- `provenance=manual_override` implies `is_locked=true`.
- `has_conflict=true` requires at least one competing candidate record.
- Fuzzily matched or inferred values must keep `needs_review=true` until confirmed, when policy requires explicit review.

#### `FieldKind`

- `column`
- `metric`
- `filter_dimension`
- `parameter`

#### `FieldProvenance`

- `dictionary_exact`
- `reference_imported`
- `fuzzy_inferred`
- `ai_generated`
- `manual_override`
- `unresolved`
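The lock semantics can be sketched as a guard around candidate application. Entries and candidates are plain dicts here, and the helper is illustrative, not the real service code:

```python
def apply_candidate(entry: dict, candidate: dict, changed_by: str = "system") -> dict:
    """Apply a semantic candidate to a field entry unless the manual-intent
    invariant forbids it: locked entries are never overwritten automatically."""
    if entry.get("is_locked") and changed_by != "user":
        # The candidate may still be recorded elsewhere; the value is untouched.
        return entry
    updated = dict(entry)
    updated["verbose_name"] = candidate.get("proposed_verbose_name")
    updated["description"] = candidate.get("proposed_description")
    updated["last_changed_by"] = changed_by
    if changed_by == "user":
        # Rule: provenance=manual_override implies is_locked=true.
        updated["is_locked"] = True
        updated["provenance"] = "manual_override"
    return updated
```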
---

### Entity: `SemanticCandidate`

Stores competing candidate values before or alongside the final field decision.

| Field | Type | Required | Description |
|---|---|---:|---|
| `candidate_id` | string (UUID) | yes | Unique candidate id |
| `field_id` | string | yes | Parent semantic field |
| `source_id` | string \| null | no | Candidate source |
| `candidate_rank` | integer | yes | Lower is stronger |
| `match_type` | enum | yes | `exact`, `reference`, `fuzzy`, `generated` |
| `confidence_score` | number | yes | Normalized score |
| `proposed_verbose_name` | string \| null | no | Candidate verbose name |
| `proposed_description` | text \| null | no | Candidate description |
| `proposed_display_format` | string \| null | no | Candidate display format |
| `status` | enum | yes | Candidate lifecycle |
| `created_at` | datetime | yes | Creation timestamp |

#### `CandidateMatchType`

- `exact`
- `reference`
- `fuzzy`
- `generated`

#### `CandidateStatus`

- `proposed`
- `accepted`
- `rejected`
- `superseded`

---
## 5. Imported Filters and Runtime Variables

### Entity: `ImportedFilter`

Represents one recovered or user-supplied filter value.

| Field | Type | Required | Description |
|---|---|---:|---|
| `filter_id` | string (UUID) | yes | Unique filter id |
| `session_id` | string | yes | Parent session |
| `filter_name` | string | yes | Source filter name |
| `display_name` | string \| null | no | User-facing label |
| `raw_value` | json | yes | Original recovered value |
| `normalized_value` | json \| null | no | Optional transformed value |
| `source` | enum | yes | Origin of the filter |
| `confidence_state` | enum | yes | Confidence/provenance class |
| `requires_confirmation` | boolean | yes | Whether explicit review is needed |
| `recovery_status` | enum | yes | Recovery completeness |
| `notes` | text \| null | no | Recovery explanation |
| `created_at` | datetime | yes | Creation timestamp |
| `updated_at` | datetime | yes | Updated timestamp |

#### `FilterSource`

- `superset_native`
- `superset_url`
- `manual`
- `inferred`

#### `FilterConfidenceState`

- `confirmed`
- `imported`
- `inferred`
- `ai_draft`
- `unresolved`

#### `FilterRecoveryStatus`

- `recovered`
- `partial`
- `missing`
- `conflicted`
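One plausible derivation of `requires_confirmation` from the two enums above, shown as a sketch; the actual policy may weigh sources differently:

```python
def requires_confirmation(confidence_state: str, recovery_status: str) -> bool:
    """Anything not fully confirmed or cleanly recovered must be explicitly
    reviewed before it can feed preview/launch (illustrative policy)."""
    risky_confidence = confidence_state in ("inferred", "ai_draft", "unresolved")
    risky_recovery = recovery_status in ("partial", "missing", "conflicted")
    return risky_confidence or risky_recovery
```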
---

### Entity: `TemplateVariable`

Represents a runtime variable discovered in dataset execution logic.

| Field | Type | Required | Description |
|---|---|---:|---|
| `variable_id` | string (UUID) | yes | Unique variable id |
| `session_id` | string | yes | Parent session |
| `variable_name` | string | yes | Canonical runtime variable name |
| `expression_source` | text | yes | Raw expression or snippet where the variable was found |
| `variable_kind` | enum | yes | Detected variable class |
| `is_required` | boolean | yes | Whether launch requires a mapped value |
| `default_value` | json \| null | no | Optional default |
| `mapping_status` | enum | yes | Current mapping state |
| `created_at` | datetime | yes | Creation timestamp |
| `updated_at` | datetime | yes | Updated timestamp |

#### `VariableKind`

- `native_filter`
- `parameter`
- `derived`
- `unknown`

#### `MappingStatus`

- `unmapped`
- `proposed`
- `approved`
- `overridden`
- `invalid`
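Superset virtual datasets commonly use Jinja-style templating, so variable discovery can be approximated with a regex sketch like the one below. Real extraction should use the template engine itself, since macros and function calls are not handled by this naive pattern:

```python
import re

# Naive sketch: captures bare `{{ name }}` occurrences only. Macro calls such
# as `{{ filter_values('region') }}` would need real template parsing.
_JINJA_VAR = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)")


def discover_variables(expression_source: str) -> list:
    """Return unique variable names in order of first appearance."""
    seen, names = set(), []
    for name in _JINJA_VAR.findall(expression_source):
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names
```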
---

## 6. Mapping Review and Warning Approvals

### Entity: `ExecutionMapping`

Represents the mapping between a recovered filter and a runtime variable.

| Field | Type | Required | Description |
|---|---|---:|---|
| `mapping_id` | string (UUID) | yes | Unique mapping id |
| `session_id` | string | yes | Parent session |
| `filter_id` | string | yes | Source imported filter |
| `variable_id` | string | yes | Target template variable |
| `mapping_method` | enum | yes | How the mapping was produced |
| `raw_input_value` | json | yes | Original input |
| `effective_value` | json \| null | no | Value to send to preview/launch |
| `transformation_note` | text \| null | no | Explanation of normalization |
| `warning_level` | enum \| null | no | Warning classification if the transformation is risky |
| `requires_explicit_approval` | boolean | yes | Whether the launch gate applies |
| `approval_state` | enum | yes | Approval lifecycle |
| `approved_by_user_id` | string \| null | no | Approver if approved |
| `approved_at` | datetime \| null | no | Approval timestamp |
| `created_at` | datetime | yes | Creation timestamp |
| `updated_at` | datetime | yes | Updated timestamp |

### Validation rules

- `filter_id + variable_id` must be unique per session unless versioning is used.
- `requires_explicit_approval=true` implies launch is blocked while `approval_state != approved`.
- `effective_value` is required before preview when the variable is required.
- A user override must set `mapping_method=manual_override`.

#### `MappingMethod`

- `direct_match`
- `heuristic_match`
- `semantic_match`
- `manual_override`

#### `MappingWarningLevel`

- `low`
- `medium`
- `high`

#### `ApprovalState`

- `pending`
- `approved`
- `rejected`
- `not_required`
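The approval gate in the validation rules can be expressed as a small predicate over a mapping record (dict-shaped for the sketch):

```python
def mapping_blocks_launch(mapping: dict) -> bool:
    """A mapping blocks launch while explicit approval is required but absent."""
    return (
        bool(mapping.get("requires_explicit_approval"))
        and mapping.get("approval_state") != "approved"
    )
```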
---

## 7. Clarification Workflow

### Entity: `ClarificationSession`

Stores resumable clarification flow state for one review session.

| Field | Type | Required | Description |
|---|---|---:|---|
| `clarification_session_id` | string (UUID) | yes | Unique clarification session id |
| `session_id` | string | yes | Parent review session |
| `status` | enum | yes | Clarification lifecycle |
| `current_question_id` | string \| null | no | Current active question |
| `resolved_count` | integer | yes | Count of answered/resolved items |
| `remaining_count` | integer | yes | Count of unresolved items |
| `summary_delta` | text \| null | no | Human-readable change summary |
| `started_at` | datetime | yes | Start time |
| `updated_at` | datetime | yes | Last update |
| `completed_at` | datetime \| null | no | End time |

#### `ClarificationStatus`

- `pending`
- `active`
- `paused`
- `completed`
- `cancelled`

---
### Entity: `ClarificationQuestion`

Represents one focused question in the clarification flow.

| Field | Type | Required | Description |
|---|---|---:|---|
| `question_id` | string (UUID) | yes | Unique question id |
| `clarification_session_id` | string | yes | Parent clarification session |
| `topic_ref` | string | yes | Related field/finding/mapping id |
| `question_text` | text | yes | Focused question |
| `why_it_matters` | text | yes | Business significance explanation |
| `current_guess` | text \| null | no | Best guess if available |
| `priority` | integer | yes | Order score |
| `state` | enum | yes | Question lifecycle |
| `created_at` | datetime | yes | Creation timestamp |
| `updated_at` | datetime | yes | Updated timestamp |

#### `QuestionState`

- `open`
- `answered`
- `skipped`
- `expert_review`
- `superseded`

---
### Entity: `ClarificationOption`

Suggested selectable answer option for a question.

| Field | Type | Required | Description |
|---|---|---:|---|
| `option_id` | string (UUID) | yes | Unique option id |
| `question_id` | string | yes | Parent question |
| `label` | string | yes | UI label |
| `value` | string | yes | Stored answer payload |
| `is_recommended` | boolean | yes | Whether this is the recommended option |
| `display_order` | integer | yes | UI ordering |

---
### Entity: `ClarificationAnswer`

Stores the user response to one clarification question.

| Field | Type | Required | Description |
|---|---|---:|---|
| `answer_id` | string (UUID) | yes | Unique answer id |
| `question_id` | string | yes | Parent question |
| `answer_kind` | enum | yes | How the user responded |
| `answer_value` | text \| null | no | Selected/custom answer |
| `answered_by_user_id` | string | yes | Responding user |
| `impact_summary` | text \| null | no | Optional summary of resulting state changes |
| `created_at` | datetime | yes | Answer timestamp |

#### `AnswerKind`

- `selected`
- `custom`
- `skipped`
- `expert_review`

### Validation rules

- Each active question may have at most one current answer.
- `custom` answers require a non-empty `answer_value`.
- `selected` answers must correspond to a valid option or normalized payload.
- `expert_review` leaves the related topic unresolved but marks it as intentionally deferred.
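The answer rules above can be sketched as a single validity predicate; option payloads are plain strings here for illustration:

```python
def is_valid_answer(answer_kind: str, answer_value, options: list) -> bool:
    """Check one answer against the validation rules above."""
    if answer_kind == "custom":
        # Custom answers require a non-empty answer_value.
        return bool(answer_value and answer_value.strip())
    if answer_kind == "selected":
        # Selected answers must correspond to a valid option payload.
        return answer_value in options
    # 'skipped' and 'expert_review' carry no payload and are always accepted.
    return answer_kind in ("skipped", "expert_review")
```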
---

## 8. Preview and Launch Audit

### Entity: `CompiledPreview`

Stores the exact compiled SQL preview returned by Superset.

| Field | Type | Required | Description |
|---|---|---:|---|
| `preview_id` | string (UUID) | yes | Unique preview id |
| `session_id` | string | yes | Parent session |
| `preview_status` | enum | yes | Preview lifecycle state |
| `compiled_sql` | text \| null | no | Exact compiled SQL if successful |
| `preview_fingerprint` | string | yes | Snapshot hash of the mapping/inputs used |
| `compiled_by` | enum | yes | Must be `superset` |
| `error_code` | string \| null | no | Optional failure code |
| `error_details` | text \| null | no | Readable preview error |
| `compiled_at` | datetime \| null | no | Successful compile timestamp |
| `created_at` | datetime | yes | Record creation timestamp |

### Validation rules

- `compiled_by` must be `superset`.
- `compiled_sql` is required when `preview_status=ready`.
- `compiled_sql` must be null when `preview_status=failed`, unless partial diagnostics are intentionally stored elsewhere.
- `preview_fingerprint` must be compared against the current session inputs before launch.
- Launch requires `preview_status=ready` and a matching current fingerprint.

#### `PreviewStatus`

- `pending`
- `ready`
- `failed`
- `stale`
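A sketch of how `preview_fingerprint` might be computed and checked, assuming a stable hash over canonical JSON of the effective inputs; the exact payload shape is an assumption:

```python
import hashlib
import json


def preview_fingerprint(effective_filters: dict, template_params: dict) -> str:
    """Stable hash over the inputs a preview was compiled from. Canonical JSON
    (sorted keys, fixed separators) makes the hash key-order independent."""
    payload = json.dumps(
        {"filters": effective_filters, "params": template_params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def preview_is_stale(stored_fingerprint: str, effective_filters: dict, template_params: dict) -> bool:
    """True when current inputs no longer match what the preview was built from."""
    return stored_fingerprint != preview_fingerprint(effective_filters, template_params)
```

Recomputing the fingerprint just before launch is what enforces the "matching current fingerprint" rule without trusting any cached state.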
---

### Entity: `DatasetRunContext`

Audited execution snapshot created at launch.

| Field | Type | Required | Description |
|---|---|---:|---|
| `run_context_id` | string (UUID) | yes | Unique run context id |
| `session_id` | string | yes | Parent review session |
| `dataset_ref` | string | yes | Canonical dataset identity |
| `environment_id` | string | yes | Execution environment |
| `preview_id` | string | yes | Bound compiled preview |
| `sql_lab_session_ref` | string | yes | Canonical SQL Lab reference |
| `effective_filters` | json | yes | Final filter payload |
| `template_params` | json | yes | Final template parameter object |
| `approved_mapping_ids` | json array | yes | Explicit approvals used for launch |
| `semantic_decision_refs` | json array | yes | Applied semantic decision references |
| `open_warning_refs` | json array | yes | Warnings that remained visible at launch |
| `launch_status` | enum | yes | Launch outcome |
| `launch_error` | text \| null | no | Error if launch failed |
| `created_at` | datetime | yes | Launch record timestamp |

### Validation rules

- `preview_id` must reference a `CompiledPreview` with `ready` status.
- `sql_lab_session_ref` is mandatory for a successful launch.
- `effective_filters` and `template_params` must match the preview fingerprint used.
- `launch_status=started` or `success` requires a non-empty SQL Lab reference.

#### `LaunchStatus`

- `started`
- `success`
- `failed`

---
## 9. Export Projections

### Entity: `ExportArtifact`

Tracks generated exports for sharing documentation and validation outputs.

| Field | Type | Required | Description |
|---|---|---:|---|
| `artifact_id` | string (UUID) | yes | Unique artifact id |
| `session_id` | string | yes | Parent session |
| `artifact_type` | enum | yes | Export type |
| `format` | enum | yes | File/output format |
| `storage_ref` | string | yes | Storage/file reference |
| `created_by_user_id` | string | yes | Requesting user |
| `created_at` | datetime | yes | Artifact creation time |

#### `ArtifactType`

- `documentation`
- `validation_report`
- `run_summary`

#### `ArtifactFormat`

- `json`
- `markdown`
- `csv`
- `pdf`

---
## 10. Relationships

### One-to-one / aggregate-root relationships

- `DatasetReviewSession` → `DatasetProfile` (current active profile view)
- `DatasetReviewSession` → `ClarificationSession` (current or latest)
- `DatasetReviewSession` → `CompiledPreview` (latest/current preview)
- `DatasetReviewSession` → `DatasetRunContext` (latest/current launch audit)

### One-to-many relationships

- `DatasetReviewSession` → many `ValidationFinding`
- `DatasetReviewSession` → many `SemanticSource`
- `DatasetReviewSession` → many `SemanticFieldEntry`
- `SemanticFieldEntry` → many `SemanticCandidate`
- `DatasetReviewSession` → many `ImportedFilter`
- `DatasetReviewSession` → many `TemplateVariable`
- `DatasetReviewSession` → many `ExecutionMapping`
- `ClarificationSession` → many `ClarificationQuestion`
- `ClarificationQuestion` → many `ClarificationOption`
- `ClarificationQuestion` → zero/one current `ClarificationAnswer`
- `DatasetReviewSession` → many `ExportArtifact`
- `DatasetReviewSession` → many `SessionEvent`

---
## 11. Derived Rules and Invariants

### Run readiness invariant

A session is `run_ready` only if:

- no open blocking findings remain,
- all required template variables have approved/effective mappings,
- all launch-sensitive mapping warnings have been explicitly approved,
- a non-stale `CompiledPreview` exists for the current fingerprint.
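The four conditions translate directly into a predicate. The flattened boolean inputs below are an illustrative simplification of the underlying entities:

```python
def is_run_ready(
    open_blocking_findings: int,
    required_variables_mapped: bool,
    sensitive_warnings_approved: bool,
    preview_status: str,
    preview_matches_current_fingerprint: bool,
) -> bool:
    # Each argument mirrors one bullet of the run-readiness invariant.
    return (
        open_blocking_findings == 0
        and required_variables_mapped
        and sensitive_warnings_approved
        and preview_status == "ready"
        and preview_matches_current_fingerprint
    )
```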
### Manual intent invariant

If a field is manually overridden:

- `SemanticFieldEntry.is_locked = true`
- `SemanticFieldEntry.provenance = manual_override`
- later imports or inferred candidates may be recorded, but cannot replace the active value automatically.

### Progressive recovery invariant

Partial Superset recovery must preserve usable state:

- imported filters may be `partial`,
- unresolved variables may remain `unmapped`,
- findings must explain what is still missing,
- the session remains resumable.

### Clarification persistence invariant

Clarification answers must be persisted before:

- finding severity is downgraded,
- profile state is updated,
- the current question pointer advances.

### Preview truth invariant

The compiled preview must be:

- generated by Superset,
- tied to the exact current effective inputs,
- treated as invalid if mappings/values change afterward.

---
## 12. Migration & Evolution Strategy

- **Baseline**: The initial implementation (Milestone 1) will include the core session and profile entities.
- **Incremental Growth**: Subsequent milestones will add clarification, mapping, and launch audit entities via standard SQLAlchemy migrations.
- **Compatibility**: The `DatasetReviewSession` aggregate root will remain the stable entry point for all sub-entities to ensure forward compatibility with saved user state.
## 13. Suggested Backend DTO Grouping

The future API and persistence layers should group models roughly as follows:

### Session DTOs

- `SessionSummary`
- `SessionDetail`
- `SessionListItem`

### Review DTOs

- `DatasetProfileDto`
- `ValidationFindingDto`
- `ReadinessChecklistDto`

### Semantic DTOs

- `SemanticSourceDto`
- `SemanticFieldEntryDto`
- `SemanticCandidateDto`

### Clarification DTOs

- `ClarificationSessionDto`
- `ClarificationQuestionDto`
- `ClarificationAnswerRequest`

### Execution DTOs

- `ImportedFilterDto`
- `TemplateVariableDto`
- `ExecutionMappingDto`
- `CompiledPreviewDto`
- `LaunchSummaryDto`

### Export DTOs

- `ExportArtifactDto`

---
## 14. Open Modeling Notes Resolved

The Phase 0 research questions are considered resolved for design purposes:

- SQL preview is modeled as a first-class persisted artifact.
- SQL Lab is modeled as the only canonical launch target.
- Semantic resolution and clarification are modeled as separate domain boundaries.
- Field-level overrides and mapping approvals are first-class entities.
- Session persistence is separate from task execution state.

This model is ready to drive:

- [`contracts/modules.md`](./contracts/modules.md)
- [`contracts/api.yaml`](./contracts/api.yaml)
- [`quickstart.md`](./quickstart.md)
---

`specs/027-dataset-llm-orchestration/plan.md` (new file, 320 lines)
# Implementation Plan: LLM Dataset Orchestration
|
||||
|
||||
**Branch**: `027-dataset-llm-orchestration` | **Date**: 2026-03-16 | **Spec**: `/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/spec.md`
|
||||
**Input**: Feature specification from `/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/spec.md`
|
||||
|
||||
**Note**: This template is filled in by the `/speckit.plan` command. See `/home/busya/dev/ss-tools/.specify/templates/plan-template.md` for the execution workflow.
|
||||
|
||||
## Summary
|
||||
|
||||
Deliver a dataset-centered orchestration flow that lets users start from a Superset link or dataset selection, recover analytical context, enrich semantics from trusted sources before AI generation, resolve ambiguity through guided clarification, generate a Superset-side compiled SQL preview, and launch an audited SQL Lab execution only when readiness gates pass.
|
||||
|
||||
The implementation will extend the existing FastAPI + SvelteKit architecture rather than creating a parallel subsystem. Backend work will add a persisted review-session domain, orchestration services for semantic recovery and clarification, Superset adapters for context extraction and SQL Lab execution, and explicit APIs for mapping approvals and field-level semantic overrides. Frontend work will add a dedicated dataset review workspace with progressive recovery, semantic-source review, one-question-at-a-time clarification, mapping approval controls, compiled SQL preview, and resumable session state.

## Technical Context

**Language/Version**: Python 3.9+ backend, Node.js 18+ frontend, Svelte 5 / SvelteKit frontend runtime
**Primary Dependencies**: FastAPI, SQLAlchemy, Pydantic, existing `TaskManager`, existing `SupersetClient`, existing LLM provider stack, SvelteKit, Tailwind CSS, frontend `requestApi`/`fetchApi` wrappers
**Storage**: Existing application databases for persistent session/domain entities; existing tasks database for async execution metadata; filesystem for optional uploaded semantic sources/artifacts
**Testing**: pytest for backend unit/integration/API tests; Vitest for frontend component/store/API-wrapper tests
**Target Platform**: Linux-hosted FastAPI + Svelte web application integrated with Superset
**Project Type**: Web application with backend API and frontend SPA
**Performance Goals**:

- Initial summary generation: < 30s (progressive recovery visible within < 5s)
- Preview compilation: < 10s
- Session load/resume: < 2s
- SC-002 target: first readable summary under 5 minutes for complex datasets.

**Constraints**: Launch must remain blocked without a successful Superset-side compiled preview; long-running recovery/enrichment/preview work must be asynchronous and observable; frontend must use existing API wrappers instead of native fetch; manual semantic overrides must never be silently overwritten; auditability and provenance are prioritized over raw throughput
**Scale/Scope**: One end-to-end feature spanning dataset intake, session persistence, semantic enrichment, clarification, mapping approval, preview, and launch; multiple new backend services/APIs plus a new multi-state frontend workspace

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

### Pre-Research Gate Assessment

1. **Semantic protocol compliance — PASS WITH REQUIRED PHASE 1 EXPANSION**
   - New backend orchestration and persistence modules must follow `/home/busya/dev/ss-tools/.ai/standards/semantics.md`.
   - Existing draft contracts are incomplete for the feature scope; Phase 1 must add explicit contracts for semantic-source resolution, clarification lifecycle, Superset context extraction, session persistence, and missing UI states.
   - Complexity 4/5 Python modules must explicitly define `logger.reason()` / `logger.reflect()` paths; Complexity 5 boundaries must use `belief_scope`.

2. **Complexity-driven contract coverage — PASS WITH GAPS TO CLOSE**
   - The core orchestration boundary is Complexity 5 because it gates launch, audit, state transitions, and cross-service consistency.
   - Semantic source resolution, clarification workflow, mapping approval state, and session persistence each require explicit contracts instead of being hidden inside one orchestrator.
   - UI contracts must map to the UX state machine, especially `Empty`, `Importing`, `Review Ready`, `Semantic Source Review Needed`, `Clarification Active`, `Mapping Review Needed`, `Compiled Preview Ready`, `Run Ready`, `Run In Progress`, `Completed`, and `Recovery Required`.

3. **UX-state compatibility — PASS**
   - The architecture can support the UX reference because:
     - recovery can be progressive and asynchronous,
     - clarification can be session-backed and resumed,
     - preview generation can be represented as a stateful asynchronous action,
     - launch remains a gated terminal action.
   - If Phase 0 research later shows Superset cannot provide reliable compilation preview or SQL Lab execution hooks compatible with the required interaction model, planning must stop and the UX contract must be renegotiated.

4. **Async boundaries — PASS**
   - Long-running work already fits the repository constitution through `TaskManager`.
   - Session start, deep context recovery, semantic enrichment from external sources, preview generation, and launch-hand-off side effects should be dispatched as tasks or internally asynchronous service steps with observable state changes.

5. **Frontend API-wrapper rules — PASS**
   - Existing frontend uses `/home/busya/dev/ss-tools/frontend/src/lib/api.js` wrappers.
   - New frontend work must use `requestApi`, `fetchApi`, `postApi`, or wrapper modules only; native `fetch` remains forbidden.

6. **RBAC/security constraints — PASS WITH DESIGN REQUIREMENT**
   - New endpoints must use existing auth and permission dependencies.
   - New orchestration actions need explicit permission modeling for reading sessions, editing semantic mappings, answering clarification prompts, generating previews, and launching runs.
   - Session data must remain self-scoped/auditable and must not permit cross-user mutation without explicit policy.
   - **Action**: Add `DATASET_REVIEW_*` permissions to `backend/src/scripts/seed_permissions.py`.

7. **Security & Threat Model — PASS**
   - Session isolation: every session record is strictly bound to `user_id`; query filters must include an owner check.
   - Audit trail: `DatasetRunContext` is immutable after launch.
   - Credential handling: reuse the existing `SupersetClient` encrypted configuration.
   - **Action**: API endpoints must use `Depends(get_current_user)` and explicit permission checks.

8. **Belief-state/logging constraints — PASS WITH REQUIRED APPLICATION**
   - Complexity 4/5 Python orchestration modules will require meaningful `logger.reason()` and `logger.reflect()` traces, and Complexity 5 boundaries will additionally require `belief_scope`, around state transitions, preview validation, warning approvals, and launch gating.
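A minimal, self-contained sketch of what such a traced launch gate could look like. The real `belief_scope` context manager and `logger.reason()` / `logger.reflect()` come from the repository's semantics standard and are not shown here; the stand-ins below are assumptions, present only so the example runs:

```python
from contextlib import contextmanager

# Stand-in stubs (assumption): the real implementations live in the repo's
# semantics standard; these only mimic the expected call shape.
class ReasoningLogger:
    def __init__(self):
        self.trace = []

    def reason(self, msg: str) -> None:    # records why a decision is being made
        self.trace.append(("reason", msg))

    def reflect(self, msg: str) -> None:   # records what an outcome implies
        self.trace.append(("reflect", msg))

logger = ReasoningLogger()

@contextmanager
def belief_scope(name: str):
    logger.reason(f"entering belief scope: {name}")
    try:
        yield
    finally:
        logger.reflect(f"leaving belief scope: {name}")

def gate_launch(preview_status: str, blocking_findings: int) -> bool:
    """Complexity-5 style launch gate with a traced state transition."""
    with belief_scope("launch_gating"):
        logger.reason(f"preview={preview_status}, blocking={blocking_findings}")
        allowed = preview_status == "success" and blocking_findings == 0
        logger.reflect("launch allowed" if allowed else "launch blocked")
        return allowed
```

The point is only the shape: every gating decision leaves a reason/reflect pair in the trace, so audits can replay why a launch was allowed or blocked.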

### Post-Design Gate Assessment

1. **Semantic protocol compliance — PASS**
   - All modules in `contracts/modules.md` follow the complexity-driven metadata requirements.
   - Relation syntax matches the canonical `@RELATION: [PREDICATE] ->[TARGET_ID]` format.
   - Python modules (Complexity 4/5) explicitly specify `logger.reason()` and `belief_scope` requirements in their contracts.

2. **API Schema Completeness — PASS**
   - `contracts/api.yaml` provides a fully typed OpenAPI 3.0.3 specification.
   - Every session lifecycle, semantic review, and execution gate is covered by a typed endpoint.

3. **UX-Technical Alignment — PASS**
   - Design supports the WYSIWWR principle via `SupersetCompilationAdapter`.
   - Fallback strategies for missing preview or SQL Lab hooks are defined in `research.md`.

### Final Gate Result

**PASS** - The implementation plan and design artifacts are constitution-compliant and ready for task breakdown.

## Project Structure

### Documentation (this feature)

```text
/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   ├── api.yaml
│   └── modules.md
└── tasks.md
```

### Source Code (repository root)

```text
/home/busya/dev/ss-tools/backend/
├── src/
│   ├── api/
│   │   └── routes/
│   ├── core/
│   ├── models/
│   ├── schemas/
│   └── services/

/home/busya/dev/ss-tools/frontend/
├── src/
│   ├── lib/
│   │   ├── api/
│   │   ├── components/
│   │   ├── i18n/
│   │   └── stores/
│   └── routes/

/home/busya/dev/ss-tools/backend/src/api/routes/__tests__/
/home/busya/dev/ss-tools/backend/src/services/__tests__/
/home/busya/dev/ss-tools/frontend/src/lib/**/__tests__/
/home/busya/dev/ss-tools/frontend/src/routes/**/__tests__/
```

**Structure Decision**: Use the repository’s existing web-application split. Backend implementation belongs under `/home/busya/dev/ss-tools/backend/src/{models,schemas,services,api/routes}`. Frontend implementation belongs under `/home/busya/dev/ss-tools/frontend/src/{routes,lib/components,lib/api,lib/stores}`. Tests will stay adjacent to their current backend/frontend conventions.

## Semantic Contract Guidance

> Use this section to drive Phase 1 artifacts, especially `contracts/modules.md`.

### Planned Critical/High-Value Modules

- `DatasetReviewOrchestrator` — `@COMPLEXITY: 5`
- `SemanticSourceResolver` — `@COMPLEXITY: 4`
- `ClarificationEngine` — `@COMPLEXITY: 4`
- `SupersetContextExtractor` — `@COMPLEXITY: 4`
- `SupersetCompilationAdapter` — `@COMPLEXITY: 4`
- `DatasetReviewSessionRepository` or equivalent persistence boundary — `@COMPLEXITY: 5`
- `DatasetReviewWorkspace` — `@COMPLEXITY: 5`
- `SourceIntakePanel` — `@COMPLEXITY: 3`
- `ValidationFindingsPanel` — `@COMPLEXITY: 3`
- `SemanticLayerReview` — `@COMPLEXITY: 3`
- `ClarificationDialog` — `@COMPLEXITY: 3`
- `ExecutionMappingReview` — `@COMPLEXITY: 3`
- `CompiledSQLPreview` — `@COMPLEXITY: 3`
- `LaunchConfirmationPanel` — `@COMPLEXITY: 3`

### Required Semantic Rules

- Use `@COMPLEXITY` or `@C:` as the primary rule source.
- Match contract density to complexity:
  - Complexity 1: anchors only, `@PURPOSE` optional
  - Complexity 2: `@PURPOSE`
  - Complexity 3: `@PURPOSE`, `@RELATION`; UI also `@UX_STATE`
  - Complexity 4: `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`; Python also a meaningful `logger.reason()` / `logger.reflect()` path
  - Complexity 5: level 4 + `@DATA_CONTRACT`, `@INVARIANT`; Python also `belief_scope`; UI also `@UX_FEEDBACK`, `@UX_RECOVERY`, `@UX_REACTIVITY`
- Write relations only in canonical form: `@RELATION: [PREDICATE] ->[TARGET_ID]`
- If any relation target, DTO, or contract dependency is unknown, emit `[NEED_CONTEXT: target]` instead of inventing placeholders.
- Preserve medium-appropriate anchor/comment syntax for Python, Svelte markup, and Svelte script contexts.

## Phase 0: Research Agenda

### Open Questions Requiring Resolution

1. How to reliably extract saved native filters from supported Superset links and versions.
2. How to discover dataset runtime template variables and Jinja placeholders using available Superset APIs and dataset payloads.
3. How to perform a safe Superset-side compiled SQL preview compatible with the current deployment/version.
4. How to create or bind a SQL Lab execution session as the canonical audited launch target.
5. How to model semantic source ranking, fuzzy match review, conflict detection, and provenance without collapsing into an orchestration god-object.
6. How to persist resumable clarification and review sessions using the current database stack.
7. How to design typed API contracts that support field-level semantic operations, mapping approval flow, and session lifecycle operations.
8. How to degrade gracefully when Superset import/preview or LLM enrichment only partially succeeds.
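For question 2, a deliberately naive starting point is to scan the dataset SQL for `{{ ... }}` expressions. This sketch is an assumption, not the planned implementation: real discovery would also need the Superset dataset payload and a proper Jinja parse, since it misses `{% ... %}` blocks and macro calls such as `{{ filter_values('col') }}`:

```python
import re

# Capture the leading identifier inside {{ ... }} expressions; a bare-bones
# heuristic that a real implementation would replace with jinja2's parser.
PLACEHOLDER = re.compile(r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)")

def discover_template_variables(sql: str) -> set:
    """Return candidate Jinja template variable names found in the SQL text."""
    return set(PLACEHOLDER.findall(sql))

sql = """
SELECT * FROM sales
WHERE region = '{{ region }}'
  AND dt >= '{{ start_date }}' AND dt < '{{ start_date }}'
"""
print(sorted(discover_template_variables(sql)))  # ['region', 'start_date']
```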

### Required Research Outputs

Research must produce explicit decisions for:

- Superset link parsing and recovery strategy
- Superset compilation/SQL Lab integration approach
- Semantic source resolution architecture
- Clarification session persistence model
- Session persistence/audit model
- API schema granularity and endpoint set
- Test strategy for Superset-dependent and LLM-dependent flows
- Delivery milestones for incremental rollout

## Phase 1: Design Focus

Phase 1 must generate:

- typed domain entities and DTOs in `/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/data-model.md`
- expanded semantic contracts in `/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/contracts/modules.md`
- typed OpenAPI schemas and missing endpoints in `/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/contracts/api.yaml`
- execution and validation guide in `/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/quickstart.md`

Phase 1 must specifically close the current gaps around:

- field-level semantic operations,
- clarification engine responsibilities,
- mapping approval endpoints,
- session lifecycle APIs,
- exportable outputs,
- error-path validation scenarios,
- alignment between UX states and UI contracts.

## Delivery Milestones

| Milestone | FR Coverage | Scope | User Value |
|-----------|-------------|-------|------------|
| M1: Sessioned Auto Review | FR-001 to FR-011, FR-035, FR-037 | Source intake, dataset review session, initial profile, findings, provenance, semantic-source application, export of review outputs | Users get immediate documentation, validation, and trusted-source enrichment without manual reconstruction |
| M2: Guided Clarification | FR-012 to FR-020, FR-036, FR-038, FR-039, FR-040 | Clarification engine, resumable questions, question templates/eval, field-level semantic overrides, conflict review, progress persistence | Users can resolve ambiguity safely and preserve manual intent |
| M3: Controlled Execution | FR-021 to FR-034 | Filter extraction, template-variable mapping, warning approvals, compiled preview, SQL Lab launch, manual export path, audited run context | Users can move from recovered context to reproducible execution with clear readiness gates |

## RBAC Model

| Permission | Description | Target Role(s) |
|------------|-------------|----------------|
| `dataset:session:read` | View own review sessions | Analytics Engineer, BI Engineer, Data Steward |
| `dataset:session:manage` | Edit mappings, answer questions, override semantics | Analytics Engineer, BI Engineer |
| `dataset:session:approve` | Approve warning-level mappings | Senior Analytics Engineer, Data Steward |
| `dataset:execution:preview` | Trigger Superset SQL compilation preview | Analytics Engineer, BI Engineer |
| `dataset:execution:launch` | Create SQL Lab session in target environment | Analytics Engineer, BI Engineer |
| `dataset:execution:launch_prod` | Launch in Production-staged environment | Senior Analytics Engineer |
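The permission-to-role mapping above can be expressed as a plain lookup. This is an illustrative sketch only: the permission keys mirror the table, but the checker shape and role handling are assumptions, not the repository's actual auth code:

```python
# Permission keys taken from the RBAC table; checker shape is hypothetical.
PERMISSION_ROLES = {
    "dataset:session:read": {"Analytics Engineer", "BI Engineer", "Data Steward"},
    "dataset:session:manage": {"Analytics Engineer", "BI Engineer"},
    "dataset:session:approve": {"Senior Analytics Engineer", "Data Steward"},
    "dataset:execution:preview": {"Analytics Engineer", "BI Engineer"},
    "dataset:execution:launch": {"Analytics Engineer", "BI Engineer"},
    "dataset:execution:launch_prod": {"Senior Analytics Engineer"},
}

def has_permission(user_roles: set, permission: str) -> bool:
    """True if any of the user's roles grants the requested permission."""
    return bool(user_roles & PERMISSION_ROLES.get(permission, set()))
```

In the real endpoints this check would sit behind `Depends(get_current_user)` together with the per-session owner check, so role grants never bypass session ownership.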

## Integration Points

### Service Reuse (Critical)

- **Superset Interaction**: Use existing `backend/src/core/superset_client.py` (do not duplicate HTTP clients).
- **LLM Interaction**: Use existing `backend/src/services/llm_provider.py` via `LLMProviderService`.
- **Notifications**: Integrate with `NotificationService` for launch outcomes and preview readiness.
- **i18n**: Use existing `frontend/src/lib/i18n/` for all user-facing strings in the review workspace.

## Rollout & Monitoring

### Feature Flags

- `ff_dataset_auto_review`: Enables basic documentation and intake.
- `ff_dataset_clarification`: Enables guided dialogue mode.
- `ff_dataset_execution`: Enables preview and launch capabilities.

### Metrics & Alerting

- **Metrics**: Session completion rate, time-to-first-summary, preview failure rate (Superset compilation errors vs connection errors), clarification engagement.
- **Alerting**: High rate of `503` Superset API failures; persistent LLM provider timeouts (> 30s); unauthorized cross-session access attempts.
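The three flags above form a natural hierarchy: clarification and execution only make sense once auto review is on. A hedged sketch of that gating, assuming a simple dictionary-backed flag store (the real lookup mechanism is not specified here):

```python
# Flag names come from the Feature Flags list above; storage is an assumption.
FEATURE_FLAGS = {
    "ff_dataset_auto_review": True,
    "ff_dataset_clarification": True,
    "ff_dataset_execution": False,  # e.g. execution held back during early rollout
}

def is_enabled(flag: str) -> bool:
    return FEATURE_FLAGS.get(flag, False)

def available_capabilities() -> list:
    """Capabilities exposed to the workspace, respecting the flag hierarchy."""
    caps = []
    if is_enabled("ff_dataset_auto_review"):
        caps.append("intake+documentation")
        if is_enabled("ff_dataset_clarification"):
            caps.append("guided clarification")
        if is_enabled("ff_dataset_execution"):
            caps.append("preview+launch")
    return caps
```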

## Implementation Sequencing

### Backend First

1. Add persistent review-session domain model and schemas.
2. Add orchestration services and Superset adapters.
3. Add typed API endpoints and explicit RBAC.
4. Add task/event integration and audit persistence.
5. Add backend tests for session lifecycle, preview gating, launch gating, and degradation paths.

### Frontend Next

1. Add dataset review route/workspace shell and session loading.
2. Add source-intake, summary, findings, and semantic review panels.
3. Add clarification dialog and mapping approval UI.
4. Add compiled preview and launch confirmation UI.
5. Add frontend tests for state transitions, wrappers, and critical UX invariants.

### Integration/Hardening

1. Validate Superset version compatibility against a real/staged environment.
2. Verify progressive session recovery and resume flows.
3. Verify audit replay/run-context capture.
4. Measure success-criteria instrumentation feasibility.

## Testing Strategy

### Backend

- **Unit tests** for semantic ranking, provenance/conflict rules, clarification prioritization, preview gating, and launch guards.
- **Integration tests** for session persistence, Superset adapter behavior, SQL preview orchestration, and SQL Lab launch orchestration with mocked upstream responses.
- **API contract tests** for typed response schemas, RBAC enforcement, mapping approval operations, field-level semantic edits, export operations, and session lifecycle.

### Frontend

- **Unit/component tests** for state-driven UI contracts, provenance rendering, one-question clarification, mapping approval flow, stale preview handling, and launch gating visuals.
- **Integration-style route tests** for resume flows, progressive loading, and error recovery states.

### External Dependency Strategy

- Mock Superset APIs for CI determinism.
- Use stable fixtures/snapshots for LLM-produced structured outputs.
- Treat provider/transport failure as explicit degraded states rather than semantic failure.
- Include replayable fixtures for imported filters, template variables, conflict cases, and compilation errors.
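A sketch of the "mock Superset APIs for CI determinism" strategy using stdlib `unittest.mock`. The service class and the `compile_sql` method name are hypothetical stand-ins, not the real `SupersetClient` surface:

```python
from unittest.mock import Mock

class PreviewService:
    """Hypothetical preview orchestrator wrapping an injected Superset client."""

    def __init__(self, superset_client):
        self.client = superset_client

    def compile_preview(self, dataset_id: int, params: dict) -> dict:
        result = self.client.compile_sql(dataset_id, params)
        status = "success" if result.get("sql") else "failed"
        return {"status": status, "sql": result.get("sql", "")}

# Deterministic test double instead of a live Superset instance.
client = Mock()
client.compile_sql.return_value = {"sql": "SELECT 1"}
svc = PreviewService(client)

assert svc.compile_preview(42, {"region": "EU"})["status"] == "success"
client.compile_sql.assert_called_once_with(42, {"region": "EU"})

# A compilation error from Superset maps to an explicit degraded state.
client.compile_sql.return_value = {"error": "bad template"}
assert svc.compile_preview(42, {})["status"] == "failed"
```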

## Risks & Mitigations

| Risk | Why It Matters | Mitigation |
|------|----------------|------------|
| Superset version lacks a stable compiled-preview endpoint | FR-029 and WYSIWWR depend on native Superset-side compilation | Resolve in Phase 0; if unsupported, stop and renegotiate UX/feature scope before implementation |
| Superset link/native filter formats differ across installations | Could make import brittle or partial | Design recovery as best-effort with explicit provenance and recovery-required state |
| SQL Lab launch handoff is inconsistent across environments | FR-032 requires canonical audited launch target | Research version-compatible creation strategy and define fallback as blocked, not silent substitution |
| Semantic resolution logic becomes an orchestration god-object | Hurts maintainability and contract traceability | Separate `SemanticSourceResolver`, `ClarificationEngine`, and Superset extraction responsibilities |
| Fuzzy matching creates too many false positives | Undermines trust and increases approval burden | Keep explicit confidence hierarchy, review-required fuzzy matches, and field-level selective application |
| LLM/provider outages interrupt review quality | Could block non-critical enrichment | Degrade to partial review state with preserved trusted-source results and explicit next action |
| Session lifecycle becomes hard to resume safely | FR-019 and FR-036 require resumability | Persist answers, approvals, and current recommended action as first-class session state |

## Post-Design Re-Check Criteria

After Phase 1 artifacts are produced, re-check:

- semantic protocol coverage against all planned modules/components,
- UX-state coverage against `/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/ux_reference.md`,
- explicit API support for field-level semantic actions, mapping approval, exports, and session lifecycle,
- belief-state/logging expectations for Complexity 4/5 Python modules,
- typed schemas sufficient for backend/frontend parallel implementation,
- quickstart coverage of happy path plus critical negative/recovery paths.

## Complexity Tracking

> **Fill ONLY if Constitution Check has violations that must be justified**

No justified constitution violations at planning time.
298
specs/027-dataset-llm-orchestration/quickstart.md
Normal file
@@ -0,0 +1,298 @@

# Quickstart: LLM Dataset Orchestration

**Feature**: [LLM Dataset Orchestration](./spec.md)
**Branch**: `027-dataset-llm-orchestration`

This guide validates the end-to-end workflow for dataset review, semantic enrichment, clarification, preview generation, and controlled SQL Lab launch.

---

## 1. Prerequisites

1. Access to a configured Superset environment with:
   - at least one dataset,
   - at least one dashboard URL containing reusable analytical context,
   - permissions sufficient for dataset inspection and SQL Lab session creation.
2. An active LLM provider configured in ss-tools.
3. Optional semantic sources for enrichment testing:
   - uploaded spreadsheet dictionary,
   - connected tabular dictionary,
   - trusted reference Superset dataset.
4. A test user account with permission to create and resume sessions.
5. A second test user account for ownership/visibility guard validation.

---

## 2. Primary End-to-End Happy Path

### Step 1: Start Review Session

- Navigate to the dataset-review workflow entry from the datasets area.
- Start a session using one of:
  - a Superset dashboard link with saved filters,
  - a direct dataset selection.
- **Verify**:
  - a new session is created,
  - the session gets a visible readiness state,
  - the first recommended action is explicit.

### Step 2: Observe Progressive Recovery

- Keep the session open while recovery runs.
- **Verify** progressive updates appear for:
  - dataset recognition,
  - imported filter recovery,
  - template/Jinja variable discovery,
  - preliminary semantic-source candidates,
  - first-pass business summary.
- **Verify** partial work is shown before the whole pipeline finishes.

### Step 3: Review Automatic Analysis

- Inspect the generated business summary and validation findings.
- **Verify**:
  - the summary is readable by an operational stakeholder,
  - findings are grouped by severity,
  - provenance/confidence markers distinguish confirmed/imported/inferred/AI-draft values,
  - the next recommended action changes appropriately.

### Step 4: Apply Semantic Source

- Use **Apply semantic source** and choose:
  - spreadsheet dictionary,
  - connected dictionary,
  - trusted reference dataset.
- **Verify**:
  - exact matches are applied as stronger candidates,
  - fuzzy matches remain reviewable rather than silently applied,
  - semantic conflicts are shown side by side,
  - field-level manual overrides remain possible.

### Step 5: Confirm Field-Level Semantics

- Manually override one field’s `verbose_name` or description.
- Apply another semantic source afterward.
- **Verify**:
  - the manual field remains locked,
  - imported/generated values do not silently overwrite it,
  - provenance changes to manual override.

### Step 6: Guided Clarification

- Enter clarification mode from a session with unresolved findings.
- Answer one question using a suggested option.
- Answer another with a custom value.
- Skip one question.
- Mark one for expert review.
- **Verify**:
  - only one active question is shown at a time,
  - each question includes “why this matters” and the current guess,
  - answers update readiness/findings/profile state,
  - skipped and expert-review items remain visible as unresolved.

### Step 7: Pause and Resume

- Save or pause the session mid-clarification.
- Leave the page and reopen the session.
- **Verify**:
  - the session resumes with prior answers intact,
  - the current question or next unresolved question is restored,
  - manual semantic decisions and pending mappings are preserved.

### Step 8: Review Mapping and Generate Preview

- Open the mapping review section.
- Approve one warning-level mapping transformation.
- Manually override another transformed mapping value.
- Trigger **Generate SQL Preview**.
- **Verify**:
  - all required variables are visible,
  - warning approvals are explicit,
  - the preview is read-only,
  - preview status shows it was compiled by Superset,
  - substituted values are visible in the final SQL.

### Step 9: Launch Dataset

- Move the session to `Run Ready`.
- Click **Launch Dataset**.
- **Verify**:
  - launch confirmation shows dataset identity, effective filters, parameter values, warnings, and preview status,
  - a SQL Lab session reference is returned,
  - an audited run context is stored,
  - the session moves to run-in-progress or completed state appropriately.

### Step 10: Export Outputs

- Export documentation.
- Export validation findings.
- **Verify**:
  - both artifacts are generated,
  - artifact metadata or file reference is associated with the session,
  - exported output reflects the current reviewed state.

### Step 11: Collaboration and Review

- As User A, add User B as a `reviewer`.
- Access the same session as User B.
- **Verify**:
  - User B can view the session state,
  - User B can answer clarification questions but cannot approve launch-critical mappings,
  - the audit log (if implemented) records which user performed which action.

---

## 3. Negative and Recovery Scenarios

### Scenario A: Invalid Superset Link

- Start a session with a malformed or unsupported link.
- **Verify**:
  - intake fails with actionable error messaging,
  - no fake recovered context is shown,
  - the user can correct input in place.

### Scenario B: Partial Filter Recovery

- Use a link where only some filters can be recovered.
- **Verify**:
  - recovered filters are shown,
  - unrecovered pieces are explicitly marked,
  - session enters `recovery_required` or equivalent partial state,
  - workflow remains usable.

### Scenario C: Dataset Without Clear Business Meaning

- Use a dataset with weak metadata and no strong trusted semantic matches.
- **Verify**:
  - the summary remains minimal but usable,
  - the system does not pretend certainty,
  - clarification becomes the recommended next step.

### Scenario D: Conflicting Semantic Sources

- Apply two semantic sources that disagree for the same field.
- **Verify**:
  - both candidates are shown side by side,
  - recommended source is visible if confidence differs,
  - no silent overwrite occurs,
  - conflict remains until explicitly resolved.

### Scenario E: Missing Required Runtime Value

- Leave a required template variable unmapped.
- Attempt preview or launch.
- **Verify**:
  - preview or launch is blocked according to gate rules,
  - missing values are highlighted specifically,
  - recommended next action becomes completion/remediation rather than launch.

### Scenario F: Preview Compilation Failure

- Provide a mapping value known to break Superset-side compilation.
- Trigger preview.
- **Verify**:
  - preview moves to `failed` state,
  - readable Superset error details are shown,
  - launch remains blocked,
  - the user can navigate back to the problematic mapping/value.

### Scenario G: Preview Staleness After Input Change

- Successfully generate preview.
- Change an approved mapping or required value.
- **Verify**:
  - preview state becomes `stale`,
  - launch is blocked until preview is regenerated,
  - stale state is visible and not hidden.

### Scenario H: SQL Lab Launch Failure

- Simulate or trigger SQL Lab session creation failure.
- **Verify**:
  - launch result is marked failed,
  - the audit record still preserves attempted run context,
  - the session remains recoverable,
  - no success redirect is shown.

### Scenario I: Cross-User Access Guard

- Try to open or mutate the first user’s session from a second user account (without collaborator access).
- **Verify**:
  - access is denied,
  - no session state leaks to the second user,
  - ownership/permission is enforced on view and mutation paths.

---

## 4. UX Invariants to Validate

- [ ] The primary CTA always reflects the current highest-value next step.
- [ ] The launch button stays blocked if:
  - [ ] blocking findings remain,
  - [ ] required values are missing,
  - [ ] warning-level mappings needing approval are unresolved,
  - [ ] preview is missing, failed, or stale.
- [ ] Manual semantic overrides are never silently overwritten.
- [ ] Every important semantic value exposes visible provenance.
- [ ] Clarification shows one focused question at a time.
- [ ] Partial recovery preserves usable value and explains what is missing.
- [ ] Preview explicitly indicates it was compiled by Superset.
- [ ] Session resume restores prior state without forcing re-entry.

---
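The launch-blocking conditions above, together with the staleness rule from Scenario G, reduce to a small pure function over session state. This sketch uses illustrative field names; it is a validation aid, not the backend's actual model:

```python
from dataclasses import dataclass

@dataclass
class SessionState:
    """Illustrative subset of review-session state relevant to the launch gate."""
    blocking_findings: int = 0
    missing_required_values: int = 0
    unapproved_warnings: int = 0
    preview_status: str = "missing"  # missing | success | failed | stale

    def mutate_inputs(self) -> None:
        # Scenario G: any change to approved mappings or required values
        # invalidates a previously successful preview.
        if self.preview_status == "success":
            self.preview_status = "stale"

    @property
    def launch_allowed(self) -> bool:
        return (
            self.blocking_findings == 0
            and self.missing_required_values == 0
            and self.unapproved_warnings == 0
            and self.preview_status == "success"
        )

s = SessionState(preview_status="success")
assert s.launch_allowed
s.mutate_inputs()  # input change after a good preview
assert s.preview_status == "stale" and not s.launch_allowed
```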

## 5. Suggested Verification by Milestone

### Milestone 1: Sessioned Auto Review

Validate:

- source intake,
- progressive recovery,
- automatic documentation summary,
- typed findings display,
- semantic source application,
- export endpoints.

### Milestone 2: Guided Clarification

Validate:

- clarification question flow,
- answer persistence,
- resume behavior,
- conflict review,
- field-level manual override/lock behavior.

### Milestone 3: Controlled Execution

Validate:

- mapping review,
- explicit warning approvals,
- Superset-side preview,
- preview staleness handling,
- SQL Lab launch,
- audited run context persistence.

---

## 6. Success-Criteria Measurement Hints

These are not implementation metrics by themselves; they are validation hints for pilot runs.

### For [SC-001](./spec.md)

Track how many submitted datasets produce an initial documentation draft without manual reconstruction.

### For [SC-002](./spec.md)

Measure time from session start to first readable summary visible to the user.

### For [SC-003](./spec.md)

Measure the percentage of semantic fields populated from trusted sources before AI-draft fallback.

### For [SC-005](./spec.md)

Measure the percentage of eligible Superset links that produce a non-empty imported filter set usable for review.

### For [SC-007](./spec.md)

Check that launched sessions always persist:

- dataset identity,
- effective filters,
- template params,
- approved mappings,
- preview reference,
- SQL Lab session reference,
- outcome.

### For [SC-008](./spec.md)

Run moderated first-attempt sessions and record whether users complete import → review → clarification (if needed) → preview → launch without facilitator intervention.

---

## 7. Completion Checklist

A Phase 1 design is operationally validated when all are true:

- [ ] Happy-path session can be started and completed.
- [ ] Partial recovery behaves as explicit partial recovery, not silent failure.
- [ ] Clarification is resumable.
- [ ] Semantic conflict review is explicit.
- [ ] Field-level override lock works.
- [ ] Preview is Superset-generated and becomes stale after input mutation.
- [ ] Launch targets SQL Lab only.
- [ ] Export outputs are available.
- [ ] Ownership and guard rails are enforced.
506	specs/027-dataset-llm-orchestration/research.md	Normal file
@@ -0,0 +1,506 @@
# Research: LLM Dataset Orchestration

**Feature**: [LLM Dataset Orchestration](./spec.md)
**Branch**: `027-dataset-llm-orchestration`
**Date**: 2026-03-16
## Scope

This document resolves the Phase 0 technical unknowns identified in [`specs/027-dataset-llm-orchestration/plan.md`](./plan.md) and establishes implementation decisions for:

- Superset link parsing and context recovery
- runtime variable and Jinja discovery
- compiled SQL preview strategy
- SQL Lab launch handoff
- semantic source resolution architecture
- clarification session architecture
- typed API scope
- persistence, testing, and rollout strategy

---
## 1. Superset Link Intake and Native Filter Recovery

### Decision
Use a dedicated backend adapter, `SupersetContextExtractor`, to parse supported Superset URLs and recover context in two layers:

1. **URL-level recovery** from dashboard, dataset, and query parameters in the provided link
2. **Superset API enrichment** using the resolved dashboard/dataset identifiers to recover saved filter context and related dataset metadata

The recovered filter set will be stored as an `ImportedFilterSet` with provenance per value:

- `superset_native`
- `superset_url`
- `manual`
- `inferred`

If only partial filter recovery is possible, the session enters `recovery_required` rather than failing closed.
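The two-layer merge could look roughly like this. This is a minimal sketch, not the final contract: the merge precedence (URL-level context wins over API enrichment) and the `expected_columns` argument are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ImportedFilter:
    column: str
    value: object
    provenance: str  # "superset_native" | "superset_url" | "manual" | "inferred"

@dataclass
class ImportedFilterSet:
    filters: list = field(default_factory=list)
    status: str = "complete"  # or "recovery_required"

def merge_recovered_context(url_filters, api_filters, expected_columns):
    """Layer 1 (URL) is captured first; layer 2 (API) fills gaps without
    overwriting URL-level values. Missing columns trigger partial recovery."""
    merged = {f.column: f for f in api_filters}
    merged.update({f.column: f for f in url_filters})  # URL-level context wins
    result = ImportedFilterSet(filters=list(merged.values()))
    if any(col not in merged for col in expected_columns):
        result.status = "recovery_required"  # partial, not failed closed
    return result
```

The key behavior is the last branch: an incomplete merge degrades to `recovery_required` instead of raising.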
### Rationale
The UX requires recovery to feel progressive rather than brittle. A pure URL parser would be too fragile and version-specific. A pure API lookup would miss context encoded in the link itself. A two-layer strategy preserves value even when one source is incomplete.

This also matches the spec's requirement that partial recovery remain usable and explicitly explained.

### Alternatives considered
- **Parse only URL query parameters**
  Rejected because it is too dependent on frontend link shape and misses server-side saved context.
- **Resolve everything only through the dashboard API after identifying the dashboard**
  Rejected because some context lives directly in the URL and should be captured before API reconciliation.
- **Block session start if any filter cannot be recovered**
  Rejected because it violates the UX principle of progressive value and FR-025.

---
## 2. Runtime Template Variable and Jinja Discovery

### Decision
Detect runtime execution variables using a hybrid discovery pipeline:

1. Load dataset detail from Superset via the existing [`backend/src/core/superset_client.py`](backend/src/core/superset_client.py)
2. Extract `sql`, `is_sqllab_view`, metric expressions, and other query-bearing fields where available
3. Apply backend parsing rules for:
   - `{{ ... }}`
   - `filter_values('...')`
   - common template parameter references
   - Jinja variables in SQL text and metric expressions
4. Normalize discovered variables into typed `TemplateVariable` entries with:
   - variable name
   - source expression
   - inferred kind (`native_filter`, `parameter`, `derived`, `unknown`)
   - required/optional status
   - mapping status

The parser will not execute Jinja locally. It only discovers variables and structure.
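Discovery without execution can be sketched as plain text analysis. The two patterns below are illustrative only; the real parser will cover more query-bearing fields and template constructs.

```python
import re
from dataclasses import dataclass

@dataclass
class TemplateVariable:
    name: str
    source_expression: str
    kind: str  # "native_filter" | "parameter" | "derived" | "unknown"

FILTER_VALUES_RE = re.compile(r"filter_values\(\s*'([^']+)'\s*\)")
JINJA_VAR_RE = re.compile(r"{{\s*([a-zA-Z_][a-zA-Z0-9_]*)")

def discover_variables(sql_text: str) -> list:
    """Discover template variables in SQL text. No Jinja is executed."""
    found = {}
    # filter_values('col') references map to native filters
    for m in FILTER_VALUES_RE.finditer(sql_text):
        found[m.group(1)] = TemplateVariable(m.group(1), m.group(0), "native_filter")
    # remaining bare {{ name }} references are treated as parameters
    for m in JINJA_VAR_RE.finditer(sql_text):
        name = m.group(1)
        if name not in found and name != "filter_values":
            found[name] = TemplateVariable(name, m.group(0), "parameter")
    return list(found.values())
```

Because discovery is deterministic string analysis, the same SQL text always yields the same variable set, which keeps the result auditable.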
### Rationale
The UX and specification both require execution transparency, but also require Superset to remain the source of truth for final SQL compilation. Discovery is safe locally; execution is not. Separating variable discovery from compilation preserves WYSIWWR and avoids local behavior drift.

### Alternatives considered
- **Use only string regex over SQL text**
  Rejected because it is simple but insufficient for richer query-bearing payloads.
- **Execute local Jinja rendering to discover required params**
  Rejected because it conflicts with the UX principle that only Superset defines final execution.
- **Rely on LLM to infer variables from SQL text**
  Rejected because variable discovery must be deterministic and auditable.

---
## 3. Compiled SQL Preview Strategy

### Decision
Compiled preview will be generated only through a dedicated Superset-side compilation call path managed by `SupersetCompilationAdapter`. The adapter will:

1. receive the candidate run context,
2. submit it to a Superset-compatible preview/compilation flow,
3. store the exact returned compiled SQL plus metadata,
4. mark preview state as `ready`, `pending`, `failed`, or `stale`.

Launch remains blocked unless preview state is `ready` and bound to the current mapping/version snapshot.
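The binding rule above can be expressed as a fingerprint check: a preview is `ready` only while its fingerprint matches the current run-context snapshot. This is a sketch; the field names and hashing choice are assumptions for illustration.

```python
import hashlib
import json

def context_fingerprint(run_context: dict) -> str:
    """Stable hash of the run context used to bind a preview to its inputs."""
    canonical = json.dumps(run_context, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def preview_state(preview, run_context: dict) -> str:
    """Derive the preview state from the stored preview and current context."""
    if preview is None:
        return "pending"
    if preview.get("error"):
        return "failed"
    if preview["fingerprint"] != context_fingerprint(run_context):
        return "stale"  # any input mutation invalidates the preview
    return "ready"
```

Any edit to filters, mappings, or parameters changes the fingerprint, so staleness falls out automatically rather than requiring explicit invalidation calls.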
#### Superset Version Compatibility Matrix
The system will explicitly check `SupersetClient.get_version()` during intake to determine available features:

- **v3.x+**: Supports all features (context recovery, native filter import, SQL compilation preview, SQL Lab launch).
- **v2.x**: Supports best-effort context recovery; SQL compilation preview may require fallback to Manual Launch.
- **Legacy/Unknown**: Automatic recovery disabled; session enters `recovery_required` mode immediately.

#### Fallback Strategy (Preview Availability < 100%)
If Superset-side compilation is unavailable (e.g., version mismatch or unsupported dataset type):

- The system will allow **Manual Launch Approval** by a user with `dataset:session:approve` permission.
- The UI will explicitly display: "Native Preview Unavailable - Manual Validation Required."
- The audit record will mark the run as "Launched without Compiled Preview."

### Rationale
This is a direct requirement from FR-029 and the UX reference's WYSIWWR principle. The compiled preview must be the same source of truth as the real execution path. Even a "close enough" local approximation would undermine trust and violate the spec.

### Alternatives considered
- **Local best-effort Jinja compilation with warning banner**
  Rejected because the spec explicitly forbids fallback to unverified local approximation.
- **No preview, only launch-time validation**
  Rejected because it breaks both UX and explicit launch gate requirements.
- **Preview generated by the LLM**
  Rejected because the LLM must not write or edit SQL directly.

---
## 4. SQL Lab Launch as Canonical Audited Target

### Decision
Treat Superset SQL Lab as the primary canonical launch target for this feature. `DatasetReviewOrchestrator.launch_dataset` will:

1. verify preview readiness,
2. verify required-value completeness,
3. verify warning approvals,
4. create or bind a SQL Lab execution session in Superset,
5. persist a `DatasetRunContext` snapshot with session reference, approved mappings, semantic decisions, and preview fingerprint.
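Steps 1 to 3 form a launch gate that must pass before any Superset call is made. A minimal sketch, assuming a dict-shaped session and a generic exception type; the real orchestrator will use domain-specific errors:

```python
class LaunchBlocked(Exception):
    """Raised when a launch precondition fails; nothing is sent to Superset."""

def launch_gate(session: dict) -> None:
    # 1. compiled preview must be ready and bound to the current context
    if session.get("preview_state") != "ready":
        raise LaunchBlocked("compiled preview is not ready")
    # 2. every required value must be filled in
    missing = [k for k, v in session.get("required_values", {}).items() if v is None]
    if missing:
        raise LaunchBlocked(f"missing required values: {missing}")
    # 3. every warning-level mapping needs explicit user approval
    unapproved = [w for w in session.get("warnings", []) if not w.get("approved")]
    if unapproved:
        raise LaunchBlocked("unapproved mapping warnings remain")
```

Only after the gate passes does the orchestrator create the SQL Lab session and persist the `DatasetRunContext` snapshot (steps 4 and 5).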
#### Fallback path: Export Prepared Context
If SQL Lab session creation is unavailable:

- Provide an **Export Prepared Run Context (JSON/SQL)** action.
- This allows users to manually take the compiled SQL (if available) or the effective parameter payload to Superset.
- The system still records the `DatasetRunContext` as "Exported for Manual Launch" to preserve audit integrity.

### Rationale
The specification explicitly names SQL Lab as canonical. The system must preserve auditability and replay, not merely trigger execution. Persisting the SQL Lab reference and run context as a single audited boundary supports both operational trust and later reopening.

### Alternatives considered
- **Run the query outside SQL Lab through another Superset endpoint**
  Rejected because it violates the canonical target clarified in the spec.
- **Redirect the user to a generic analytical view with params**
  Rejected because it is weaker for audit and replay.
- **Allow fallback execution when SQL Lab creation fails**
  Rejected because the feature's safety model depends on the audited target.

---
## 5. Semantic Enrichment Architecture

### Decision
Split semantic enrichment into a dedicated `SemanticSourceResolver` domain module instead of embedding all logic inside `DatasetReviewOrchestrator`.

Planned responsibilities:

- `resolve_from_file`
- `resolve_from_dictionary`
- `resolve_from_reference_dataset`
- `rank_candidates`
- `detect_conflicts`
- `apply_field_decision`

Candidate ranking will follow the required confidence hierarchy:

1. exact dictionary/file match
2. trusted reference dataset match
3. fuzzy semantic match
4. AI-generated draft

Manual overrides are represented as a lockable terminal provenance state and cannot be overwritten implicitly.
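The hierarchy and the lock rule can be sketched as a sort key plus a guard. The tier names and dict shape below are illustrative assumptions; only the ordering and the "locked manual override is terminal" rule come from the decision above.

```python
# Lower tier number wins; within a tier, higher match score wins.
TIER_ORDER = {
    "exact_dictionary": 0,
    "reference_dataset": 1,
    "fuzzy_match": 2,
    "ai_draft": 3,
}

def rank_candidates(candidates):
    """Return candidates best-first per the confidence hierarchy."""
    return sorted(candidates, key=lambda c: (TIER_ORDER[c["tier"]], -c["score"]))

def apply_field_decision(current, candidate):
    """A locked manual override is a terminal state and is never replaced implicitly."""
    if current and current.get("provenance") == "manual" and current.get("locked"):
        return current
    return candidate
```

Even a perfect-score AI draft cannot displace a locked manual value; unlocking is an explicit user action handled elsewhere.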
### Rationale
The feature's semantic logic is too substantial to remain an untyped orchestrator subroutine. The reviewer feedback correctly identified this as a likely god-object risk. A dedicated resolver improves testability, contract clarity, and future extensibility.

### Alternatives considered
- **Keep enrichment inside the orchestrator only**
  Rejected because it centralizes too many responsibilities in a single Complexity 5 module.
- **Push matching into the frontend only**
  Rejected because provenance, confidence, and audit decisions must be persisted server-side.
- **Use only AI generation when dictionaries are absent**
  Rejected because "reuse before invention" is a core UX principle.

---
## 6. Clarification Session Architecture

### Decision
Create a dedicated `ClarificationEngine` service with persistent `ClarificationSession` state. The engine will manage:

- unresolved-item prioritization,
- one-question-at-a-time question generation,
- best-guess suggestion packaging,
- answer recording,
- progress summarization,
- resume-from-last-open-item behavior.

Answers are persisted before session state advancement.
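The persist-before-advance and resume-from-last-open-item rules can be sketched as follows. Storage is an in-memory dict here purely for illustration; the real engine persists `ClarificationAnswer` records first, then advances session state.

```python
class ClarificationSession:
    """Minimal sketch of resumable one-question-at-a-time state."""

    def __init__(self, questions):
        self.questions = list(questions)
        self.answers = {}   # question -> answer; stands in for persisted records
        self.cursor = 0

    def current_question(self):
        """Return the first unanswered question, or None when all are resolved."""
        while self.cursor < len(self.questions):
            q = self.questions[self.cursor]
            if q not in self.answers:
                return q
            self.cursor += 1
        return None

    def record_answer(self, question, answer):
        self.answers[question] = answer  # persist the answer first...
        self.cursor += 1                 # ...then advance session state

    def resume(self):
        """Resume from the last open item, skipping already-answered questions."""
        self.cursor = 0
        return self.current_question()
```

Because answered questions live in durable state rather than in the conversation transcript, a user who leaves mid-dialogue lands back on the exact open item.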
#### LLM Quality & Feedback Mechanism
To maintain high trust in AI-generated summaries, match recommendations, and questions:

- **User Feedback**: UI will include inline feedback (👍/👎) for all AI-drafted content.
- **Audit Logging**: Feedback is persisted in `SemanticFieldDecision` and `ClarificationAnswer` records for prompt-engineering improvement.
- **Confidence Triage**: Low-confidence LLM outputs (score < 0.7) will be automatically marked for "expert review" rather than presented as confirmed suggestions.

#### Question Generation Strategy
Question templates and logic will be developed in Milestone 2:

- **Evaluation Criteria**: Questions must be judged on relevancy, business-impact explanation, and actionability of suggestions.
- **Templates**: Standardized prompts for field ambiguity, filter conflict, and missing run-time values.
- **Human-in-the-loop**: High-risk semantic conflicts will prioritize "expert review" options.

### Rationale
Guided clarification is not a simple endpoint action; it is a stateful interaction model with resumability and audit expectations. Treating it as a first-class module aligns with FR-012 to FR-020 and supports deterministic behavior rather than ad hoc agent chat state.

### Alternatives considered
- **Use assistant conversation history as the only clarification state**
  Rejected because feature state and conversational transcript are not the same thing.
- **Generate all questions at once**
  Rejected because the UX explicitly requires one question at a time.
- **Do not persist skipped questions separately**
  Rejected because resume and expert-review flows need explicit unresolved-state semantics.

---
## 7. API Contract Granularity

### Decision
Expand the API beyond the current session/preview/launch skeleton into a typed session contract set with explicit lifecycle and field-level operations.

Required endpoint groups:

1. **Session lifecycle**
   - `POST /sessions`
   - `GET /sessions`
   - `GET /sessions/{id}`
   - `PATCH /sessions/{id}`
   - `DELETE /sessions/{id}`

2. **Semantic source and field-level semantic actions**
   - `POST /sessions/{id}/semantic-source`
   - `PATCH /sessions/{id}/fields/{field_id}/semantic`
   - `POST /sessions/{id}/fields/{field_id}/lock`
   - `POST /sessions/{id}/fields/{field_id}/unlock`

3. **Clarification**
   - `GET /sessions/{id}/clarification`
   - `POST /sessions/{id}/clarification/answers`
   - `POST /sessions/{id}/clarification/resume`

4. **Mapping review**
   - `GET /sessions/{id}/mappings`
   - `PATCH /sessions/{id}/mappings/{mapping_id}`
   - `POST /sessions/{id}/mappings/{mapping_id}/approve`

5. **Preview and launch**
   - `POST /sessions/{id}/preview`
   - `POST /sessions/{id}/launch`

6. **Exports**
   - `GET /sessions/{id}/exports/documentation`
   - `GET /sessions/{id}/exports/validation`

Core schemas must be fully typed, especially:

- `ValidationFinding`
- `ImportedFilter`
- `TemplateVariable`
- `MappingDecision`
- `SemanticFieldEntry`
- `ClarificationQuestion`
- `DatasetRunContextSummary`
- `CompiledPreview`
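To illustrate the intended level of typing, here are Python shapes for two of the schemas named above. The authoritative definitions belong in `api.yaml`; every field name below is an assumption chosen to show granularity, not the final contract.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TemplateVariable:
    """Illustrative typed shape for a discovered runtime variable."""
    name: str
    source_expression: str
    kind: str             # "native_filter" | "parameter" | "derived" | "unknown"
    required: bool
    mapping_status: str   # e.g. "unmapped" | "mapped" | "approved"

@dataclass
class CompiledPreview:
    """Illustrative typed shape for a Superset-side compiled preview."""
    compiled_sql: str
    state: str            # "ready" | "pending" | "failed" | "stale"
    fingerprint: str
    error: Optional[str] = None
```

Fully enumerated fields like these, mirrored in the OpenAPI schema, are what lets frontend and backend work proceed in parallel without contract drift.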
### Rationale
The existing API draft is structurally correct but too loose for parallel backend/frontend work. Typed schemas are a blocker for safe implementation, especially given mapping approvals, field-level overrides, and resumable state.

### Alternatives considered
- **Minimal object blobs in `SessionDetail`**
  Rejected because they prevent stable frontend typing and contract validation.
- **Expose only session-level mutations**
  Rejected because FR-040 and the mapping approval UX require field-level actions.
- **Defer exports until later**
  Rejected because FR-035 requires exportable outputs as part of the feature.

---
## 8. Persistence Model

### Decision
Persist the dataset review workflow as first-class application entities rather than storing everything in ad hoc task payloads.

Persistence boundaries will include:

- `DatasetReviewSession`
- `DatasetProfileSnapshot`
- `ValidationFindingRecord`
- `SemanticFieldDecision`
- `ImportedFilterRecord`
- `TemplateVariableRecord`
- `MappingApprovalRecord`
- `ClarificationSessionRecord`
- `ClarificationAnswerRecord`
- `CompiledPreviewRecord`
- `DatasetRunContextRecord`

Long-running task execution remains in the existing task system, but tasks reference session identifiers rather than replacing session state.

### Rationale
The current task system is good for observability and async work, but not sufficient as the canonical store for resumable feature state. The feature requires reopening, editing, exporting, and replaying a session independently of the task lifecycle.

### Alternatives considered
- **Store all workflow state inside task params/results**
  Rejected because tasks are execution records, not editable session aggregates.
- **Use filesystem-only session snapshots**
  Rejected because this weakens structured querying and RBAC alignment.
- **Persist only the final run context**
  Rejected because clarification and review resume need mid-flow state.

---
## 9. Async and Observability Strategy

### Decision
Use `TaskManager` for expensive or multi-step operations:

- initial deep context recovery,
- semantic enrichment imports over large sources,
- preview generation when upstream latency is non-trivial,
- launch handoff if it includes multi-step Superset orchestration.

Short synchronous operations may still exist for simple field edits and approvals, but session state transitions must remain observable and reflected in the session model.

### Rationale
This aligns with the repository constitution and the existing operational model. It also supports progressive UX milestones and live readiness changes.

### Alternatives considered
- **Make all session operations synchronous**
  Rejected because upstream Superset and enrichment work may exceed acceptable UI latency.
- **Use frontend polling only, with no task integration**
  Rejected because the repository already has stronger task observability patterns.

---
## 10. Frontend Integration Pattern

### Decision
Implement the feature on top of the existing [`frontend/src/lib/api.js`](frontend/src/lib/api.js) wrapper conventions and Svelte 5 rune-based patterns. Add dedicated wrapper functions under the frontend API layer for dataset orchestration rather than embedding endpoint strings inside components.

The main route will be a dedicated review workspace separate from the current [`frontend/src/routes/datasets/+page.svelte`](frontend/src/routes/datasets/+page.svelte), which remains the dataset hub/list page.

### Rationale
The current dataset page already serves as a listing and bulk-operation view. The new orchestration flow is a multi-state workspace and should not overload the existing hub component. Using wrapper functions preserves constitution compliance and testability.

### Alternatives considered
- **Extend the existing dataset hub page for the whole flow**
  Rejected because it would overload a currently unrelated list page with a large state machine.
- **Call endpoints directly with native `fetch` in new components**
  Rejected by the constitution and the existing frontend architecture.
- **Build the whole flow inside assistant chat only**
  Rejected because the UX reference defines a structured workspace, not chat-only interaction.

---
## 11. Testing Strategy

### Decision
Use a three-layer testing strategy:

#### Backend
- **Unit tests**
  - semantic ranking/conflict rules,
  - clarification prioritization,
  - mapping approval guards,
  - preview stale/ready rules,
  - launch blocking conditions.

- **Integration tests**
  - session persistence,
  - Superset context extraction with fixtures,
  - compiled preview orchestration,
  - SQL Lab launch handoff,
  - export generation.

- **API contract tests**
  - typed schema responses,
  - RBAC guards,
  - session lifecycle endpoints,
  - field-level semantic operations,
  - mapping approval endpoints.

#### Frontend
- **Component tests**
  - source intake states,
  - validation triage,
  - semantic conflict view,
  - clarification dialog,
  - mapping review,
  - compiled preview stale/error behavior,
  - launch summary gating.

- **Route integration tests**
  - progressive loading,
  - save/resume flows,
  - recovery-required state,
  - warning approval requirements.

#### External dependency handling
- mock Superset endpoints with version-shaped fixtures,
- mock LLM structured outputs deterministically,
- verify degraded paths explicitly instead of pretending success.
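A version-shaped fixture might look like this minimal fake client, paired with the intake rule from the compatibility matrix in section 3. The method names and the `intake_mode` helper are illustrative assumptions, not the real `SupersetClient` API.

```python
class FakeSupersetClient:
    """Deterministic stand-in for Superset, shaped per target version."""

    def __init__(self, version, datasets=None, preview_supported=True):
        self.version = version
        self.datasets = datasets or {}
        self.preview_supported = preview_supported

    def get_version(self):
        return self.version

    def get_dataset(self, dataset_id):
        if dataset_id not in self.datasets:
            raise KeyError(f"dataset {dataset_id} not found")
        return self.datasets[dataset_id]

    def compile_preview(self, run_context):
        if not self.preview_supported:
            raise RuntimeError("preview unsupported on this version")
        return {"sql": "SELECT 1", "state": "ready"}

def intake_mode(client):
    """Degraded-path rule from the section 3 compatibility matrix."""
    try:
        major = int(client.get_version().split(".")[0])
    except (ValueError, AttributeError):
        return "recovery_required"  # legacy/unknown version string
    if major >= 3:
        return "full"
    if major == 2:
        return "best_effort"
    return "recovery_required"
```

Fixtures like this keep degraded-path tests explicit: a v2-shaped client exercises the best-effort branch instead of pretending the happy path succeeded.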
### Rationale
The feature blends deterministic parsing, stateful orchestration, and external-service dependencies. It needs more than happy-path smoke tests.

### Alternatives considered
- **Rely on the manual quickstart only**
  Rejected because the feature has too many state transitions and edge cases.
- **Mock everything at the unit level only**
  Rejected because contract drift with Superset and frontend schemas would go undetected.
- **Run live LLM/Superset tests in CI**
  Rejected because determinism and cost would be poor.

---
## 12. Incremental Delivery Strategy

### Decision
Implement in three milestones:

### Milestone 1 — Sessioned Auto Review
Includes:

- source intake,
- session persistence,
- profile/finding generation,
- semantic-source application,
- exportable documentation/validation outputs.

### Milestone 2 — Guided Clarification
Includes:

- clarification engine,
- resumable dialogue state,
- conflict review,
- field-level semantic overrides/locks.

### Milestone 3 — Controlled Execution
Includes:

- imported filter recovery,
- template-variable mapping,
- mapping approval workflow,
- compiled SQL preview,
- SQL Lab launch,
- audited run context.

### Rationale
This delivers user value earlier and reduces risk by separating understanding from execution. It also directly addresses the reviewer feedback that the feature should not be treated as a monolithic drop.

### Alternatives considered
- **Single all-at-once delivery**
  Rejected due to risk concentration and hard-to-demonstrate progress.
- **Execution-first milestone**
  Rejected because trust, semantics, and readiness are prerequisites for safe launch.

---
## 13. Risk Resolution Outcomes

### Decision
Treat the following as non-negotiable gates:

- no Superset-side preview support → stop execution-scope planning
- no reliable SQL Lab session handoff → block launch design
- no typed API schemas → block parallel frontend/backend implementation
- no explicit semantic/clarification modules → block Phase 1 contract completion

### Rationale
These are architecture-shaping constraints, not minor implementation details.

### Alternatives considered
- **Continue with placeholders and fill gaps during coding**
  Rejected because the feature is too stateful and contract-heavy for speculative implementation.

---
## 14. Review Feedback Incorporated

### Decision
The next design phase must explicitly address the previously identified review gaps by:

- adding `SemanticSourceResolver`,
- adding `ClarificationEngine`,
- expanding `SupersetCompilationAdapter` into a broader extraction + compilation boundary, or adding `SupersetContextExtractor`,
- fully typing `api.yaml`,
- adding mapping and field-level endpoints,
- adding session lifecycle endpoints,
- adding error-path coverage to `quickstart.md`,
- preserving delivery milestones in the plan.

### Rationale
These gaps are valid and materially affect implementation readiness. Folding them into Phase 1 keeps the planning workflow honest and actionable.

### Alternatives considered
- **Ignore review findings until code review**
  Rejected because several issues are architectural, not cosmetic.

---
## Final Phase 0 Result

All identified `NEEDS CLARIFICATION` items from the plan are resolved at the planning level.

The implementation should proceed into Phase 1 with:

- an expanded typed data model,
- expanded contracts,
- an expanded API surface,
- negative-path quickstart scenarios,
- a post-design constitution re-check.
170	specs/027-dataset-llm-orchestration/spec.md	Normal file
@@ -0,0 +1,170 @@
# Feature Specification: LLM Dataset Orchestration

**Feature Branch**: `027-dataset-llm-orchestration`
**Created**: 2026-03-16
**Status**: Draft
**Input**: User description: "I want to work out a mechanism for LLM documentation and validation of datasets, both in automatic mode and in a dialogue mode with an agent that clarifies attributes and other implicit details. We also need a mechanism for launching datasets on the Superset side, with support for Jinja templates. Ideally, the user should feed in a link from Superset with saved native filters, and ss-tools should extract all the filters and assemble them for the dataset."
## Clarifications

### Session 2026-03-16

- Q: Which execution target should be canonical for approved dataset launch? → A: A Superset SQL Lab session is the canonical audited launch target.
- Q: What user action should be required to clear mapping warnings before launch? → A: Any mapping warning requires explicit user approval, but manual edit is optional.
- Q: What should happen if Superset-side SQL compilation is unavailable before launch? → A: Launch stays blocked until a Superset-side compiled preview succeeds.
## User Scenarios & Testing *(mandatory)*

### User Story 1 - Recover, enrich, and explain dataset context automatically (Priority: P1)

A data engineer or analytics engineer submits a dataset or a Superset link and immediately receives a readable explanation of what the dataset is, which filters were recovered, which semantic labels were reused from trusted sources, and what still needs review.

**Why this priority**: The first user need is fast understanding with minimal reinvention. Without an immediate and trustworthy first-pass interpretation, neither clarification nor execution provides value.

**Independent Test**: Can be fully tested by submitting a dataset with partial metadata or a Superset link with saved filters and verifying that the system produces a business-readable summary, distinguishes source confidence, searches trusted semantic sources before generating new labels, and shows the next recommended action without requiring manual dialogue.

**Acceptance Scenarios**:

1. **Given** a dataset with partial technical metadata, **When** the user starts automatic review, **Then** the system generates a business-readable documentation draft, groups known and unresolved attributes, and presents a current readiness state.
2. **Given** a valid Superset link with reusable saved native filters, **When** the user imports it, **Then** the system recovers the available filter context and presents imported values separately from inferred or user-provided values.
3. **Given** connected dictionaries, spreadsheet sources, or trusted reference datasets are available, **When** automatic review runs, **Then** the system attempts semantic enrichment from those sources before creating AI-generated labels from scratch.
4. **Given** multiple semantic candidates exist for a field, **When** the first summary is shown, **Then** the system clearly indicates the provenance and confidence level of the chosen or suggested semantic value.

---
### User Story 2 - Resolve ambiguities through guided clarification (Priority: P2)

A data steward, analytics engineer, or domain expert works with an agent to resolve ambiguous business meanings, conflicting metadata, conflicting semantic sources, and missing run-time values one issue at a time.

**Why this priority**: Real datasets often contain implicit semantics that cannot be derived safely from source metadata alone. Guided clarification converts uncertainty into auditable decisions.

**Independent Test**: Can be fully tested by opening clarification mode for a dataset with ambiguous attributes or conflicting semantic sources and verifying that the system asks focused questions, explains why each question matters, stores answers, and updates readiness and validation outcomes in real time.

**Acceptance Scenarios**:

1. **Given** a dataset has blocking ambiguities, **When** the user starts guided clarification, **Then** the system asks one focused question at a time and explains the significance of the question in business terms.
2. **Given** the system already has a current guess for an unresolved attribute, **When** the question is shown, **Then** the system presents that guess along with selectable answers, a custom-answer option, and a skip option.
3. **Given** semantic source reuse is likely, **When** the system detects a strong match with a trusted dictionary or reference dataset, **Then** the agent can proactively suggest that source as the preferred basis for semantic enrichment.
4. **Given** fuzzy semantic matches were found from a selected dictionary or dataset, **When** the system presents them, **Then** the user can approve them in bulk, review them individually, or keep only exact matches.
5. **Given** the user confirms or edits an answer, **When** the response is saved, **Then** the system updates the dataset profile, validation findings, and readiness state without losing prior context.
6. **Given** the user exits clarification before all issues are resolved, **When** the session is saved, **Then** the system preserves answered questions, unresolved questions, and the current recommended next action.

---
### User Story 3 - Prepare and launch a controlled dataset run (Priority: P3)

A BI engineer reviews the assembled run context, verifies filters and placeholders, understands any remaining warnings, reviews the compiled SQL preview, and launches the dataset with confidence that the execution can be reproduced later.

**Why this priority**: Execution is the final high-value outcome, but it must feel controlled and auditable rather than opaque.

**Independent Test**: Can be fully tested by preparing a dataset run from imported or manually confirmed filter context and verifying that the system blocks missing required values, blocks launch when preview approval conditions are unmet, allows review and editing, and records the exact run context used.

**Acceptance Scenarios**:

1. **Given** an assembled dataset context contains required filters and placeholders, **When** the user opens run preparation, **Then** the system shows the effective filters, unresolved assumptions, semantic provenance signals, and current run readiness in one place.
2. **Given** required values are still missing, **When** the user attempts to launch, **Then** the system blocks launch and highlights the specific values that must be completed.
3. **Given** warning-level mapping transformations are present, **When** the user reviews run preparation, **Then** the system requires explicit approval for each warning before launch while still allowing optional manual edits.
4. **Given** the Superset-side SQL compilation preview is unavailable or fails, **When** the user attempts to launch, **Then** the system blocks launch until a successful compiled preview is available.
5. **Given** the dataset is run-ready, **When** the user confirms launch, **Then** the system creates or starts a Superset SQL Lab session as the canonical execution target and records the dataset identity, effective filters, parameter values, outstanding warnings, and execution outcome for later audit or replay.

---
### Edge Cases

- What happens when a dataset has enough structural metadata to document technically but not enough business context to explain its meaning?
- How does the system handle a Superset link that identifies the dataset but contains no reusable native filters?
- What happens when imported filters conflict with previously saved defaults or with the dataset’s documented business meaning?
- How does the system handle parameterized placeholders that exist in the run context but do not yet have values?
- What happens when a user skips clarification questions and proceeds with warnings?
- How does the system present cases where one attribute is confirmed by a user, inferred from metadata, and contradicted by imported filter context?
- What happens when a user leaves during clarification or run preparation and returns later?
- What happens when a semantic label exists in a spreadsheet dictionary, a reference dataset, and an AI proposal with different values?
- How does the system handle fuzzy semantic matches where source and target names are similar in meaning but not identical in form?
- What happens when a user manually edits a semantic value and a higher-confidence imported source becomes available later?
## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: The system MUST allow users to start dataset review and execution preparation from the frontend workspace by selecting a dataset source or providing a Superset link.
- **FR-002**: The system MUST generate an initial dataset profile that distinguishes confirmed metadata, inferred metadata, imported metadata, unresolved metadata, and AI-draft metadata where applicable.
- **FR-003**: The system MUST produce human-readable dataset documentation that explains the dataset purpose, business meaning, major attributes, filters, and known limitations in language suitable for operational stakeholders.
- **FR-004**: The system MUST assign and display a current readiness state for the dataset review so users can immediately understand whether the dataset is review-ready, semantic-source-review-needed, clarification-needed, partially ready, or run-ready.
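For illustration only (the names below are assumptions, not part of the specification), the FR-004 readiness states form a small closed vocabulary that a sketch can capture as an enum:

```python
from enum import Enum

class ReadinessState(str, Enum):
    """Illustrative vocabulary for the FR-004 readiness states."""
    REVIEW_READY = "review-ready"
    SEMANTIC_SOURCE_REVIEW_NEEDED = "semantic-source-review-needed"
    CLARIFICATION_NEEDED = "clarification-needed"
    PARTIALLY_READY = "partially-ready"
    RUN_READY = "run-ready"

# A stored state string round-trips back to the enum member.
print(ReadinessState("clarification-needed").name)  # → CLARIFICATION_NEEDED
```

Keeping the states in one closed set is what makes "assign and display a current readiness state" testable rather than free-form.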
- **FR-005**: The system MUST validate dataset completeness and consistency across attributes, business semantics, semantic enrichment sources, filters, assumptions, and execution readiness.
- **FR-006**: The system MUST classify validation findings into blocking issues, warnings, and informational findings.
- **FR-007**: The system MUST allow users to inspect the provenance of important dataset values, including whether each value was confirmed from a connected dictionary, imported from a trusted dataset, inferred from fuzzy matching, generated as an AI draft, manually edited by a user, or still unresolved.
- **FR-008**: The system MUST search connected semantic sources during automatic review, including supported external dictionaries and trusted reference datasets, before creating AI-generated semantic values from scratch.
- **FR-009**: The system MUST support semantic enrichment for at least `verbose_name`, `description`, and display formatting metadata for dataset fields and metrics when such metadata is available from a trusted source.
- **FR-010**: The system MUST apply a visible confidence hierarchy to semantic enrichment candidates in this order: exact dictionary/file match, trusted reference dataset match, fuzzy semantic match, AI-generated draft.
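A minimal, non-normative sketch of the FR-010 ordering, assuming illustrative Python names: the hierarchy becomes a comparable tier, and candidate selection is just "take the strongest tier available".

```python
from enum import IntEnum

class SemanticConfidence(IntEnum):
    """Lower value = stronger source, per the FR-010 hierarchy."""
    EXACT_DICTIONARY = 1   # exact dictionary/file match
    TRUSTED_DATASET = 2    # trusted reference dataset match
    FUZZY_MATCH = 3        # fuzzy semantic match
    AI_DRAFT = 4           # AI-generated draft

def pick_preferred(candidates: list[dict]) -> dict:
    """Return the candidate from the strongest source tier."""
    return min(candidates, key=lambda c: c["confidence"])

candidates = [
    {"value": "Region", "confidence": SemanticConfidence.AI_DRAFT},
    {"value": "Sales Region", "confidence": SemanticConfidence.TRUSTED_DATASET},
]
print(pick_preferred(candidates)["value"])  # → Sales Region
```

Because the tier is an explicit value on each candidate, the hierarchy stays visible in the UI rather than being an invisible side effect of ordering.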
- **FR-011**: The system MUST allow users to choose and apply a semantic source from the frontend workspace using supported source types, including uploaded files, connected tabular dictionaries, and existing trusted Superset datasets.
- **FR-012**: The system MUST allow users to start a guided clarification flow for unresolved or contradictory dataset details.
- **FR-013**: The guided clarification flow MUST present one focused question at a time rather than an unstructured list of unresolved items.
- **FR-014**: Each clarification question MUST explain why the answer matters and, when available, show the system’s current best guess.
- **FR-015**: The system MUST allow the user to answer with a suggested option, provide a custom answer, skip the question, or mark the item for later expert review.
- **FR-016**: The system MUST allow the agent to proactively recommend a semantic source when schema overlap or semantic similarity with a trusted source is strong enough to justify reuse.
- **FR-017**: The system MUST distinguish exact semantic matches from fuzzy semantic matches and MUST require user review before fuzzy matches are applied.
- **FR-018**: The system MUST preserve answers provided during clarification and immediately update the dataset profile, validation findings, and readiness state when those answers affect review outcomes.
- **FR-019**: The system MUST allow users to pause and resume a clarification session without losing prior answers, unresolved items, or progress state.
- **FR-020**: The system MUST summarize what changed when a clarification session ends, including resolved ambiguities, remaining ambiguities, and impact on run readiness.
- **FR-021**: (Consolidated with FR-001)
- **FR-022**: The system MUST extract reusable saved native filters from a provided Superset link whenever such filters are present and accessible.
- **FR-023**: The system MUST detect and expose runtime template variables referenced by the dataset execution logic so they can be mapped from imported or user-provided filter values.
- **FR-024**: The system MUST present extracted filters with their current value, source, confidence state, and whether user confirmation is required.
- **FR-025**: The system MUST preserve partially recovered values when a Superset import is incomplete and MUST explain which parts were recovered successfully and which still require manual or guided completion.
- **FR-026**: The system MUST support dataset execution contexts that include parameterized placeholders so users can complete required run-time values before launch.
- **FR-027**: The system MUST provide a dedicated pre-run review that presents the effective dataset identity, selected filters, required placeholders, unresolved assumptions, and current warnings in one place before launch.
- **FR-028**: The system MUST require explicit user approval for each warning-level mapping transformation before launch, while allowing the user to manually edit the mapped value instead of approving it.
- **FR-029**: The system MUST require a successful Superset-side compiled SQL preview before launch and MUST keep launch blocked if the preview is unavailable or compilation fails.
- **FR-030**: The system MUST prevent dataset launch when required values, required execution attributes, required warning approvals, or a required compiled preview are missing and MUST explain what must be completed.
- **FR-031**: The system MUST allow users to review and adjust the assembled filter set before starting a dataset run.
- **FR-032**: The system MUST use a Superset SQL Lab session as the canonical audited execution target for approved dataset launch.
- **FR-033**: The system MUST record the dataset run context, including dataset identity, selected filters, parameter values, unresolved assumptions, the associated SQL Lab session reference, mapping approvals, semantic-source decisions, and execution outcome, so that users can audit or repeat the run later.
- **FR-034**: The system MUST support a workflow where automatic review, semantic enrichment, guided clarification, and dataset execution can be used independently or in sequence on the same dataset.
- **FR-035**: The system MUST provide exportable outputs for dataset documentation and validation results so users can share them outside the immediate workflow.
- **FR-036**: The system MUST preserve a usable frontend session state when a user stops mid-flow so they can resume review, clarification, semantic enrichment review, or run preparation without reconstructing prior work.
- **FR-037**: The system MUST make the recommended next action explicit at each major state of the workflow.
- **FR-038**: The system MUST provide side-by-side comparison when multiple semantic sources disagree for the same field and MUST NOT silently overwrite a user-entered value with imported or AI-generated metadata.
- **FR-039**: The system MUST preserve manual semantic overrides unless the user explicitly replaces them.
- **FR-040**: The system MUST allow users to apply semantic enrichment selectively at field level rather than only as an all-or-nothing operation.
- **FR-041**: The system MUST provide an inline feedback mechanism (thumbs up/down) for AI-generated content to support continuous improvement of semantic matching and summarization.
- **FR-042**: The system MUST support multi-user collaboration on review sessions, allowing owners to invite collaborators with specific roles (viewer, reviewer, approver).
- **FR-043**: The system MUST provide batch approval actions for mapping warnings and fuzzy semantic matches to reduce manual effort for experienced users.
- **FR-044**: The system MUST capture and persist a structured event log of all session-related actions (e.g., source intake, answer submission, approval, launch) to support audit, replay, and collaboration visibility.
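As a non-normative sketch of the FR-044 structured event log (field names and event-type strings are assumptions), each action becomes an immutable timestamped record:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class SessionEvent:
    """One structured entry in the FR-044 event log (shape illustrative)."""
    session_id: str
    event_type: str   # e.g. "source_intake", "answer_submitted", "launch"
    actor: str
    payload: dict
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

log: list[SessionEvent] = []
log.append(SessionEvent("sess-1", "source_intake", "bi_engineer",
                        {"source": "superset_link"}))
log.append(SessionEvent("sess-1", "launch", "bi_engineer", {"outcome": "ok"}))
print([e.event_type for e in log])  # → ['source_intake', 'launch']
```

Frozen, append-only records are what make the replay and collaboration-visibility goals cheap: the log is never edited, only extended.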

### Key Entities *(include if feature involves data)*

- **Dataset Profile**: The consolidated representation of a dataset, including business purpose, attributes, filters, assumptions, readiness state, validation state, provenance of each important fact, and semantic enrichment status.
- **Validation Finding**: A blocking issue, warning, or informational observation raised during dataset review, including severity, explanation, affected area, and resolution state.
- **Clarification Session**: A resumable interaction record that stores unresolved questions, user answers, system guesses, expert-review flags, and remaining ambiguities for a dataset.
- **Semantic Source**: A reusable origin of semantic metadata, such as an uploaded file, connected tabular dictionary, or trusted reference dataset, used to enrich field- and metric-level business meaning.
- **Semantic Mapping Decision**: A recorded choice about which semantic source or proposed value was accepted, rejected, edited, or left unresolved for a field or metric.
- **Imported Filter Set**: The collection of reusable filters extracted from a Superset link, including source context, mapped dataset fields, current values, confidence state, and confirmation status.
- **Dataset Run Context**: The execution-ready snapshot of dataset inputs, selected filters, parameterized placeholders, unresolved assumptions, warnings, mapping approvals, semantic-source decisions, the associated SQL Lab session reference, and launch outcome used for auditing or replay.
- **Readiness State**: The current workflow status that tells the user whether the dataset is still being recovered, ready for review, needs semantic-source review, needs clarification, is partially ready, or is ready to run.
## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: At least 90% of datasets submitted with standard source metadata produce an initial documentation draft without requiring manual reconstruction from scratch.
- **SC-002**: Users can reach a first readable validation and documentation summary for a newly submitted dataset in under 5 minutes for the primary workflow.
- **SC-003**: At least 70% of eligible semantic fields are populated from trusted external dictionaries or trusted reference datasets before AI-generated drafting is needed.
- **SC-004**: At least 85% of clarification questions shown in guided mode are judged by pilot users as relevant and helpful to resolving ambiguity (measured via the built-in feedback mechanism).
- **SC-005**: At least 80% of Superset links containing reusable saved native filters result in an imported filter set that users can review without rebuilding the context manually.
- **SC-006**: At least 85% of pilot users correctly identify which values are confirmed versus imported versus inferred versus AI-generated during moderated usability review.
- **SC-007**: At least 90% of dataset runs started from an imported or clarified context include a complete recorded run context that can be reopened later.
- **SC-008**: Pilot users successfully complete the end-to-end flow of import, review, semantic enrichment, clarification, and launch on their first attempt in at least 75% of observed sessions.
- **SC-009**: Support requests caused by missing or unclear dataset attributes decrease by at least 40% within the target pilot group after adoption.
## Assumptions

- Users already have permission to access the datasets and Superset artifacts they submit to ss-tools.
- Saved native filters embedded in a Superset link are considered the preferred reusable source of analytical context when available.
- Users need both self-service automation and a guided conversational path because dataset semantics are often incomplete, implicit, conflicting, or distributed across multiple semantic sources.
- The feature is intended for internal operational use where clarity, traceability, semantic consistency, and repeatable execution are more important than raw execution speed.
- Exportable documentation and validation outputs are required for collaboration, review, and audit use cases.
- Users may choose to proceed with warnings, but not with missing required execution inputs, missing required mapping approvals, or a missing required compiled preview.
- Superset SQL Lab session creation is the canonical audited launch path for approved execution.
- Warning-level mapping transformations require explicit user approval before launch, while manual correction remains optional.
- Launch requires a successful Superset-side compiled preview and cannot fall back to an unverified local approximation.
- Trusted semantic sources already exist or can be introduced incrementally through frontend-managed files, connected dictionaries, or reference datasets without requiring organizations to discard existing semantic workflows.
109
specs/027-dataset-llm-orchestration/tasks.md
Normal file
@@ -0,0 +1,109 @@

# Tasks: LLM Dataset Orchestration

**Feature Branch**: `027-dataset-llm-orchestration`
**Implementation Plan**: [`specs/027-dataset-llm-orchestration/plan.md`](/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/plan.md)

---

## Phase 1: Setup

- [ ] T001 Initialize backend service directory structure for `dataset_review` in `backend/src/services/dataset_review/`
- [ ] T002 Initialize frontend component directory for `dataset-review` in `frontend/src/lib/components/dataset-review/`
- [ ] T003 Register `ff_dataset_auto_review`, `ff_dataset_clarification`, and `ff_dataset_execution` feature flags in configuration
- [ ] T004 [P] Seed new `DATASET_REVIEW_*` permissions in `backend/src/scripts/seed_permissions.py`

---

## Phase 2: Foundational Layer

- [ ] T005 [P] Implement core SQLAlchemy models for session, profile, and findings in `backend/src/models/dataset_review.py`
- [ ] T006 [P] Implement semantic, mapping, and clarification models in `backend/src/models/dataset_review.py`
- [ ] T007 [P] Implement preview and launch audit models in `backend/src/models/dataset_review.py`
- [ ] T008 [P] Implement `DatasetReviewSessionRepository` (CRITICAL: C5, PRE: auth scope, POST: consistent aggregates, INVARIANTS: ownership scope) in `backend/src/services/dataset_review/repositories/session_repository.py`
- [ ] T009 [P] Create Pydantic schemas for session summary and detail in `backend/src/schemas/dataset_review.py`
- [ ] T010 [P] Create Svelte store for session management in `frontend/src/lib/stores/datasetReviewSession.js`
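T008's ownership-scope invariant (every read and write bounded to the authenticated owner) can be sketched with plain Python; the real repository wraps SQLAlchemy, and the storage and method names here are assumptions:

```python
# Sketch of the T008 ownership invariant: sessions never leak across users.
class DatasetReviewSessionRepository:
    def __init__(self):
        self._sessions = {}  # session_id -> {"owner": ..., "state": ...}

    def save(self, owner: str, session_id: str, state: dict) -> None:
        self._sessions[session_id] = {"owner": owner, "state": state}

    def get(self, owner: str, session_id: str) -> dict:
        record = self._sessions.get(session_id)
        if record is None or record["owner"] != owner:
            # Out-of-scope reads behave exactly like "not found".
            raise KeyError(session_id)
        return record["state"]

repo = DatasetReviewSessionRepository()
repo.save("alice", "s1", {"readiness": "review-ready"})
print(repo.get("alice", "s1")["readiness"])  # → review-ready
```

Treating out-of-scope access as "not found" (rather than "forbidden") avoids leaking session existence to non-owners.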

---

## Phase 3: User Story 1 — Automatic Review (P1)

**Goal**: Submission of a link or dataset produces an immediate readable summary and semantic enrichment from trusted sources.

**Independent Test**: Submit a Superset link; verify the session is created, the summary generated, and findings populated without manual intervention.

- [ ] T011 [P] [US1] Implement `StartSessionRequest` and lifecycle endpoints in `backend/src/api/routes/dataset_review.py`
- [ ] T012 [US1] Implement `DatasetReviewOrchestrator.start_session` (CRITICAL: C5, PRE: non-empty input, POST: enqueued recovery, BELIEF: uses `belief_scope`) in `backend/src/services/dataset_review/orchestrator.py`
- [ ] T013 [P] [US1] Implement `SupersetContextExtractor.parse_superset_link` (CRITICAL: C4, PRE: parseable link, POST: resolved target, REL: uses `SupersetClient`) in `backend/src/core/utils/superset_context_extractor.py`
- [ ] T014 [US1] Implement `SemanticSourceResolver.resolve_from_dictionary` (CRITICAL: C4, PRE: source exists, POST: confidence-ranked candidates) in `backend/src/services/dataset_review/semantic_resolver.py`
- [ ] T015 [US1] Implement documentation and validation export endpoints (JSON/Markdown) in `backend/src/api/routes/dataset_review.py`
- [ ] T016 [P] [US1] Implement `SourceIntakePanel` (C3, UX_STATE: Idle/Validating/Rejected) in `frontend/src/lib/components/dataset-review/SourceIntakePanel.svelte`
- [ ] T017 [P] [US1] Implement `ValidationFindingsPanel` (C3, UX_STATE: Blocking/Warning/Info) in `frontend/src/lib/components/dataset-review/ValidationFindingsPanel.svelte`
- [ ] T018 [US1] Create main `DatasetReviewWorkspace` (CRITICAL: C5, UX_STATE: Empty/Importing/Review) in `frontend/src/routes/datasets/review/[id]/+page.svelte`
- [ ] T019 [US1] Verify implementation matches ux_reference.md (Happy Path & Errors)
- [ ] T020 [US1] Acceptance: semantic audit & algorithm emulation performed by the tester
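T013's link resolution step can be sketched with stdlib URL parsing. Superset URL shapes vary by version and deployment, so the two patterns below are illustrative assumptions, not an exhaustive extractor:

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

def parse_superset_link(link: str) -> Optional[dict]:
    """Resolve a Superset URL to a review target (sketch, not exhaustive)."""
    url = urlparse(link)
    qs = parse_qs(url.query)
    # Explore-style links: /explore/?datasource_id=12&datasource_type=table
    if "datasource_id" in qs:
        return {"kind": qs.get("datasource_type", ["table"])[0],
                "id": int(qs["datasource_id"][0])}
    # Dashboard-style links: /superset/dashboard/42/
    m = re.search(r"/dashboard/(\d+)", url.path)
    if m:
        return {"kind": "dashboard", "id": int(m.group(1))}
    return None  # unrecognized shape -> caller rejects the intake

print(parse_superset_link(
    "https://bi.example.com/explore/?datasource_id=12&datasource_type=table"))
# → {'kind': 'table', 'id': 12}
```

Returning `None` for unrecognized shapes maps cleanly onto the `SourceIntakePanel` Rejected state in T016.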

---

## Phase 4: User Story 2 — Guided Clarification (P2)

**Goal**: Resolve ambiguities and conflicting metadata through a one-question-at-a-time dialogue.

**Independent Test**: Open a session with unresolved findings; answer questions one by one and verify the readiness state updates in real time.

- [ ] T021 [P] [US2] Implement `ClarificationEngine.build_question_payload` (CRITICAL: C4, PRE: unresolved state, POST: prioritized question) in `backend/src/services/dataset_review/clarification_engine.py`
- [ ] T022 [US2] Implement `ClarificationEngine.record_answer` (CRITICAL: C4, PRE: question active, POST: answer persisted before state advance) in `backend/src/services/dataset_review/clarification_engine.py`
- [ ] T023 [P] [US2] Implement field-level semantic override and lock endpoints in `backend/src/api/routes/dataset_review.py`
- [ ] T024 [US2] Implement `SemanticLayerReview` component (C3, UX_STATE: Conflicted/Manual) in `frontend/src/lib/components/dataset-review/SemanticLayerReview.svelte`
- [ ] T025 [P] [US2] Implement `ClarificationDialog` (C3, UX_STATE: Question/Saving/Completed, REL: binds to `assistantChat`) in `frontend/src/lib/components/dataset-review/ClarificationDialog.svelte`
- [ ] T026 [US2] Implement LLM feedback (👍/👎) storage and UI handlers in `backend/src/api/routes/dataset_review.py`
- [ ] T027 [US2] Verify implementation matches ux_reference.md (Happy Path & Errors)
- [ ] T028 [US2] Acceptance: semantic audit & algorithm emulation performed by the tester
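The "one prioritized question" postcondition of T021 can be sketched as picking the single highest-severity open item; the severity ranking and payload keys are assumptions for illustration:

```python
from typing import Optional

# Assumed ordering: blocking items are always asked about first.
SEVERITY_RANK = {"blocking": 0, "warning": 1, "info": 2}

def build_question_payload(unresolved: list[dict]) -> Optional[dict]:
    """Return one focused question for the highest-severity open item."""
    open_items = [u for u in unresolved if not u.get("resolved")]
    if not open_items:
        return None  # nothing left to clarify
    item = min(open_items, key=lambda u: SEVERITY_RANK[u["severity"]])
    return {
        "question": f"What does '{item['field']}' mean in business terms?",
        "why_it_matters": item["impact"],   # FR-014: explain the stakes
        "best_guess": item.get("guess"),    # FR-014: show current guess
    }

payload = build_question_payload([
    {"field": "region_cd", "severity": "warning", "impact": "affects filters"},
    {"field": "amt", "severity": "blocking", "impact": "blocks run readiness"},
])
print(payload["question"])  # → What does 'amt' mean in business terms?
```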

---

## Phase 5: User Story 3 — Controlled Execution (P3)

**Goal**: Review mappings, generate a Superset-side preview, and launch an audited SQL Lab execution.

**Independent Test**: Map filters to variables; trigger the preview; verify launch is blocked until the preview succeeds; verify SQL Lab session creation.

- [ ] T029 [P] [US3] Implement `SupersetContextExtractor.recover_imported_filters` and variable discovery in `backend/src/core/utils/superset_context_extractor.py`
- [ ] T030 [US3] Implement `SupersetCompilationAdapter.compile_preview` (CRITICAL: C4, PRE: effective inputs available, POST: Superset-compiled SQL only) in `backend/src/core/utils/superset_compilation_adapter.py`
- [ ] T031 [US3] Implement `DatasetReviewOrchestrator.launch_dataset` (CRITICAL: C5, PRE: run-ready + preview match, POST: audited run context) in `backend/src/services/dataset_review/orchestrator.py`
- [ ] T032 [P] [US3] Implement mapping approval and preview trigger endpoints in `backend/src/api/routes/dataset_review.py`
- [ ] T033 [P] [US3] Implement `ExecutionMappingReview` component (C3, UX_STATE: WarningApproval/Approved) in `frontend/src/lib/components/dataset-review/ExecutionMappingReview.svelte`
- [ ] T034 [P] [US3] Implement `CompiledSQLPreview` component (C3, UX_STATE: Ready/Stale/Error) in `frontend/src/lib/components/dataset-review/CompiledSQLPreview.svelte`
- [ ] T035 [US3] Implement `LaunchConfirmationPanel` (C3, UX_STATE: Blocked/Ready/Submitted) in `frontend/src/lib/components/dataset-review/LaunchConfirmationPanel.svelte`
- [ ] T036 [US3] Verify implementation matches ux_reference.md (Happy Path & Errors)
- [ ] T037 [US3] Acceptance: semantic audit & algorithm emulation performed by the tester
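T031's launch precondition ("run-ready + preview match") amounts to a pure gate function; a sketch with assumed context field names shows how the blocked state stays explainable to the user:

```python
# Launch stays blocked until required values, warning approvals, and a
# successful compiled preview all exist. Field names are illustrative.
def launch_blockers(ctx: dict) -> list[str]:
    blockers = []
    if ctx.get("missing_required_values"):
        blockers.append("required values missing: "
                        + ", ".join(ctx["missing_required_values"]))
    if ctx.get("unapproved_warnings"):
        blockers.append("warning approvals pending")
    if not ctx.get("compiled_preview_ok"):
        blockers.append("Superset compiled SQL preview not available")
    return blockers  # empty list == run-ready

ready = {"missing_required_values": [], "unapproved_warnings": [],
         "compiled_preview_ok": True}
print(launch_blockers(ready))  # → []
print(launch_blockers({"compiled_preview_ok": False}))
# → ['Superset compiled SQL preview not available']
```

Returning the reasons (not just a boolean) is what lets the `LaunchConfirmationPanel` explain *why* launch is blocked, as FR-030 requires.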

---

## Final Phase: Polish & Security

- [ ] T038 Implement `SessionEvent` logger and persistence logic in `backend/src/services/dataset_review/event_logger.py`
- [ ] T039 Implement automatic version propagation logic for updated `SemanticSource` entities
- [ ] T040 Add batch approval API and UI actions for mappings and semantics
- [ ] T041 Add integration tests for the Superset version compatibility matrix in `backend/tests/services/dataset_review/test_superset_matrix.py`
- [ ] T042 Final audit of RBAC enforcement across all session-mutation endpoints
- [ ] T043 Verify i18n coverage for all user-facing strings in `frontend/src/lib/i18n/`

---

## Dependencies & Strategy

### Story Completion Order

1. **Foundation** (blocking: T005-T010)
2. **User Story 1** (blocking for US2 and US3)
3. **User Story 2** (can be implemented in parallel with parts of US3, but requires US1 findings)
4. **User Story 3** (final terminal action)

### Parallel Execution Opportunities

- T011, T013, T016 (API, parser, UI setup) can run simultaneously once T001-T010 are done.
- T021 and T025 (clarification backend/frontend) can run in parallel.
- T030 and T034 (preview backend/frontend) can run in parallel.

### Implementation Strategy

- **MVP First**: Implement US1 with hardcoded trusted sources to prove the session/summary lifecycle.
- **Incremental Delivery**: Release US1 for documentation value, then US2 for metadata cleanup, and finally US3 for execution.
- **WYSIWWR Guard**: T030 must never be compromised; if the Superset API fails, the implementation must prioritize the "Manual Launch" fallback defined in research.
720
specs/027-dataset-llm-orchestration/ux_reference.md
Normal file
@@ -0,0 +1,720 @@

# UX Reference: LLM Dataset Orchestration

**Feature Branch**: `027-dataset-llm-orchestration`
**Created**: 2026-03-16
**Status**: Draft

## 1. User Persona & Context

* **Primary user**: Analytics engineer or BI engineer who needs to quickly understand, validate, parameterize, enrich, and run a dataset that may have incomplete business context.
* **Secondary user**: Data steward or domain expert who helps confirm meanings, resolve ambiguities, and approve the documented interpretation of a dataset.
* **What is the user trying to achieve?**: Convert a raw dataset or a Superset-derived analytical context into something understandable, trustworthy, semantically enriched, and runnable without manually reverse-engineering filters, semantics, and hidden assumptions.
* **Mindset**: The user is usually unsure about part of the dataset. They want speed, but they do not want “magic” that hides uncertainty. They need confidence, traceability, the ability to intervene, and reuse of existing semantic assets rather than endless redefinition.
* **Context of use**:
  * Reviewing a dataset before reuse in analysis.
  * Preparing a dataset for migration or operational execution.
  * Importing an existing analytical context from Superset instead of rebuilding it manually.
  * Reusing semantic metadata from Excel or database dictionaries.
  * Inheriting semantic layer settings from neighboring or master datasets in Superset.
  * Collaborating with another person who knows the business meaning better than the technical owner.
## 2. UX Principles

* **Expose certainty, do not fake certainty**: The system must always distinguish confirmed facts, inferred facts, imported facts, unresolved facts, and AI drafts.
* **Guide, then get out of the way**: The product should proactively suggest next actions but should not force the user into a rigid wizard if they already know what they want to do.
* **Progress over perfection**: A user should be able to get partial value immediately, save progress, and return later.
* **One ambiguity at a time**: In dialogue mode, the user should never feel interrogated by a wall of questions.
* **Execution must feel safe**: Before launch, the user should clearly understand what will run, with which filters, with which unresolved assumptions.
* **Superset import should feel like recovery, not parsing**: The user expectation is not “we decoded a link”, but “we recovered the analysis context I had in Superset.”
* **What You See Is What Will Run (WYSIWWR)**: Before any launch, the system must show the final compiled SQL query exactly as it will be sent for execution, with all template substitutions already resolved.
* **Single Source of Truth for Execution**: The LLM never writes or edits SQL directly. The LLM only helps interpret business meaning and map available filter values into execution parameters. Jinja compilation and final SQL generation are always delegated to the native Superset execution API so the preview and the real run stay aligned.
* **Reuse before invention**: The system should prefer trusted semantic sources before generating new names or descriptions from scratch.
* **Confidence hierarchy must stay visible**: Semantic enrichment should follow a clear source priority: exact match from dictionary, inherited match from reference dataset, fuzzy semantic match, and only then AI-generated draft.
* **Manual intent wins**: The system must never silently overwrite a user’s manual semantic edits with imported or generated metadata.
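The WYSIWWR and Single-Source-of-Truth principles above imply that the client never renders Jinja locally; it always asks Superset for the compiled SQL, so the preview and the real run cannot diverge. A sketch of that delegation, where `compile_sql` is a hypothetical client method standing in for whatever Superset-side compilation call the deployment exposes:

```python
class PreviewNotAvailable(Exception):
    """Launch must stay blocked while the preview cannot be produced."""

def compiled_preview(superset_client, dataset_id: int, params: dict) -> str:
    # Delegate Jinja rendering to Superset (WYSIWWR); never render locally.
    # `compile_sql` is a hypothetical method, not a documented Superset API.
    result = superset_client.compile_sql(dataset_id=dataset_id, params=params)
    if not result.get("sql"):
        raise PreviewNotAvailable("Superset returned no compiled SQL")
    return result["sql"]

class FakeSupersetClient:
    """Stand-in for the real client, for illustration only."""
    def compile_sql(self, dataset_id, params):
        return {"sql": "SELECT * FROM sales WHERE region = 'EU'"}

print(compiled_preview(FakeSupersetClient(), 12, {"region": "EU"}))
# → SELECT * FROM sales WHERE region = 'EU'
```

Raising instead of returning an approximation is the point: there is no local fallback SQL, so the UI can only ever show what Superset will actually run.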

## 3. Core Product Modes

### Mode A: Automatic Review

User submits a dataset or imported analytical context and immediately receives:
* documentation draft,
* validation findings,
* filter/context extraction result,
* semantic enrichment candidates for columns and metrics,
* recommended next action.

This mode is for speed and low-friction first-pass understanding.

Automatic review is not limited to generating names and descriptions from scratch. During first-pass analysis, the system actively searches connected semantic sources:
* external dictionaries from database tables or uploaded spreadsheet files,
* other reference datasets in Superset,
* neighboring datasets that reuse the same physical tables or overlapping schema,
* LLM-driven fuzzy semantic matching when exact reuse is not possible.

The semantic confidence hierarchy is explicit:

1. **Confirmed** — exact match from a connected dictionary or file.
2. **Imported** — reused match from a trusted reference dataset.
3. **Inferred** — fuzzy or semantic match proposed through LLM-assisted comparison.
4. **AI Draft** — generated by the LLM from scratch when no stronger source exists.

This mode should feel like the system is recovering and inheriting existing semantic knowledge before inventing anything new.

### Mode B: Guided Clarification

User enters a focused interaction with the agent to resolve unresolved attributes, missing filter meanings, inconsistent business semantics, conflicting semantic sources, or run-time gaps.

This mode is for confidence-building and resolving uncertainty.

### Mode C: Run Preparation

User reviews the assembled run context, edits values where needed, confirms assumptions, inspects the compiled SQL preview, and launches the dataset only when the context is good enough.

This mode is for controlled execution.

## 4. Primary Happy Path

### High-Level Story

The user opens ss-tools because they have a dataset they need to understand and run, but they do not fully trust the metadata. They paste a Superset link or select a dataset source in the web interface. In seconds, the workspace fills with a structured interpretation: what the dataset appears to be, which filters were recovered, which Jinja-driven variables exist in the dataset, which semantic labels were inherited from trusted sources, what is already known, and what is still uncertain. The user scans a short human-readable summary, adjusts the business meaning manually if needed, approves a few semantic and filter mappings, resolves only the remaining ambiguities through a short guided dialogue, and reaches a “Run Ready” state after reviewing the final SQL compiled by Superset itself. Launch feels deliberate and safe because the interface shows exactly what will be used, how imported filters map to runtime variables, and where each semantic label came from.
|
||||
### Detailed Step-by-Step Journey
|
||||
|
||||
#### Step 1: Entry

The user lands on an empty “Dataset Review Workspace”.

The screen offers two clear entry paths:

* **Paste Superset Link**
* **Select Dataset Source**

The user should instantly understand that both paths lead to the same outcome: a documented, semantically enriched, and runnable dataset context.

**Desired feeling**: “I know where to start.”

#### Step 2: Source Intake

The user pastes a Superset link.

The system immediately validates the input shape and responds optimistically:

* link recognized,
* source identified,
* import started.

The system should avoid blocking the user with technical checks unless the import is impossible.

**Desired feeling**: “The system understood what I gave it.”

#### Step 3: Context Recovery

The system assembles the first-pass interpretation:

* dataset identity,
* imported native filters,
* obvious dimensions/measures,
* initial business summary,
* unresolved items,
* discovered Jinja variables used by the dataset,
* candidate semantic sources for columns and metrics.

Context recovery is not limited to decoding the Superset link. The system also inspects the dataset through the Superset-side API to detect all available runtime template variables referenced inside the dataset query logic, for example variables used in expressions like `{{ filter_values('region') }}`.
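
Variable detection can be sketched as a scan over the dataset's templated SQL. This is an illustrative sketch only: `filter_values` and `url_param` are real Superset template helpers, but the detection logic and the result shape here are assumptions, not the product's actual implementation.

```python
import re

# Hypothetical sketch: pull runtime template variables out of a dataset's
# Jinja-templated SQL by matching {{ helper('name') }} call sites.
TEMPLATE_CALL = re.compile(
    r"\{\{\s*(filter_values|url_param)\(\s*['\"]([\w.]+)['\"]"
)

def detect_template_variables(sql: str) -> list[dict]:
    """Return each templated variable with the helper that consumes it."""
    return [
        {"variable": name, "helper": helper}
        for helper, name in TEMPLATE_CALL.findall(sql)
    ]

sql = """
SELECT region, SUM(revenue)
FROM sales
WHERE region IN ({{ filter_values('region') }})
  AND report_date >= '{{ url_param('start_date') }}'
GROUP BY region
"""
print(detect_template_variables(sql))
# → [{'variable': 'region', 'helper': 'filter_values'},
#    {'variable': 'start_date', 'helper': 'url_param'}]
```

A real implementation would rely on Superset's own template processor rather than a regex, but the output contract is the point: a list of variables the run preparation step must eventually fill.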

In parallel, ss-tools gathers semantic metadata in the background from neighboring or reference datasets, especially those using the same physical tables, overlapping schema, or known business lineage. This gives the system an immediate base for suggesting `verbose_name`, `description`, and `d3format` values before asking the user to define them manually.

Instead of showing a spinner for too long, the interface should reveal results progressively as they become available:

* dataset recognized,
* saved native filters recovered from the link,
* dataset template variables detected from the dataset body,
* nearby or master datasets identified as semantic candidates,
* dictionary or spreadsheet matches found,
* preliminary mapping candidates suggested between filter inputs and template variables,
* preliminary semantic matches suggested for columns and metrics.

**Desired feeling**: “I’m already getting value before everything is finished.”

#### Step 4: First Readable Summary

The user sees a compact summary card:

* what this dataset appears to represent,
* what period/scope/segments are implied,
* what filters were recovered,
* whether execution is currently possible.

This summary is the anchor of trust. It must be short, business-readable, and immediately useful.

The summary is editable. If the user sees that the generated business meaning is incorrect or incomplete, they can use **[Edit]** to manually correct the summary without starting a long clarification dialogue.

**Desired feeling**: “I can explain this dataset to someone else already, and I can quickly fix the explanation if it is wrong.”

#### Step 5: Validation Triage

The system groups findings into:

* **Blocking**
* **Needs Attention**
* **Informational**

The user does not need to read everything. They need to know what is stopping them from running, what is risky, and what can be reviewed later.

**Desired feeling**: “I know what matters right now.”

#### Step 6: Clarification Decision

If ambiguities remain, the product presents an explicit choice:

* **Fix now with agent**
* **Continue with current assumptions**
* **Save and return later**

This is a critical UX moment. The user must feel in control rather than forced into a mandatory workflow.

**Desired feeling**: “I decide how much rigor I need right now.”

#### Step 7: Guided Clarification

If the user chooses clarification, the workspace switches into a focused dialogue mode.

The agent asks one question at a time, each with:

* why this matters,
* what the current guess is,
* quick-select answers when possible,
* an option to skip,
* an option to say “I don’t know”.

Each answer updates the dataset profile in real time.

**Desired feeling**: “This is helping me resolve uncertainty, not making me fill a form.”

#### Step 8: Run Readiness Review

When blocking issues are resolved, the system returns to a run-preparation state with:

* selected filters,
* placeholder values,
* unresolved warnings,
* final business summary,
* provenance labels for each key value,
* visible mapping between imported filters and detected Jinja template variables,
* semantic provenance for important columns and metrics,
* a preview of the final compiled SQL returned by Superset.

This step contains the critical **Smart Mapping** stage. The system uses the LLM to propose a mapping between the filter values recovered from the Superset link and the Jinja variables discovered in the dataset. The LLM does not generate SQL. It only assembles or suggests the parameter payload used for execution, such as the effective template parameter object.
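
The assembly half of this stage can be sketched as a pure function. All field names here (`source_value`, `target_variable`, `transformed_value`) are hypothetical illustrations of the "effective template parameter object", not a defined schema; the point is that the mapping output is a plain parameter dict plus a list of warnings that keep launch blocked until approved.

```python
# Illustrative sketch (field names hypothetical): turn an approved
# filter-to-variable mapping into the template parameter payload used for
# execution. The LLM only proposes the mapping; this step assembles values
# and flags any normalization so the user must review it before launch.
def build_template_params(mappings: list[dict]) -> tuple[dict, list[str]]:
    params, warnings = {}, []
    for m in mappings:
        value = m.get("transformed_value", m["source_value"])
        if "transformed_value" in m and m["transformed_value"] != m["source_value"]:
            warnings.append(
                f"{m['target_variable']}: {m['source_value']} → {m['transformed_value']}"
            )
        params[m["target_variable"]] = value
    return params, warnings

mappings = [
    {"source_filter": "Region", "source_value": "Europe",
     "target_variable": "region", "transformed_value": "EU"},
    {"source_filter": "Start date", "source_value": "2026-01-01",
     "target_variable": "start_date"},
]
params, warnings = build_template_params(mappings)
# params   → {'region': 'EU', 'start_date': '2026-01-01'}
# warnings → ['region: Europe → EU']  (launch stays blocked until approved)
```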

The user can review each mapping explicitly:

* source filter,
* target Jinja variable,
* transformed value if normalization was required,
* confidence state,
* warning state,
* manual override.

Semantic review also remains visible here. Users can inspect where key `verbose_name`, `description`, and `d3format` values came from and whether they were confirmed from a dictionary, imported from a reference dataset, inferred from fuzzy matching, or generated as AI drafts.

Before launch, ss-tools performs a **Dry Run via Superset API**. The backend sends the assembled execution parameters to Superset for safe server-side compilation of the query without triggering the real dataset run. The result is shown as the **Compiled Query Preview**.

The **Compiled Query Preview** is a read-only SQL block that shows the final SQL with Jinja substitutions already resolved by Superset. Substituted values should be visibly highlighted so users can quickly inspect what changed.
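
One way to meet the highlighting requirement is to split the Superset-compiled SQL into substituted and unchanged fragments. A minimal sketch, assuming the parameter values are known client-side; it marks only the first occurrence of each value and is not the product's actual renderer:

```python
# Sketch: mark which fragments of the compiled SQL came from runtime
# substitution so the preview can highlight them. Superset returns the
# compiled SQL; this only locates the substituted values within it.
def highlight_substitutions(compiled_sql: str, params: dict) -> list[tuple[str, bool]]:
    """Split SQL into (fragment, was_substituted) pairs for rendering."""
    segments = [(compiled_sql, False)]
    for value in map(str, params.values()):
        next_segments = []
        for text, marked in segments:
            if marked or value not in text:
                next_segments.append((text, marked))
                continue
            head, _, tail = text.partition(value)  # first occurrence only
            if head:
                next_segments.append((head, False))
            next_segments.append((value, True))
            if tail:
                next_segments.append((tail, False))
        segments = next_segments
    return segments

sql = "SELECT * FROM sales WHERE region = 'EU' AND d >= '2026-01-01'"
parts = highlight_substitutions(sql, {"region": "EU", "start_date": "2026-01-01"})
```

Joining the fragments reproduces the original SQL exactly, so the preview stays byte-identical to what Superset compiled while the UI renders the `True` fragments highlighted.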

If smart mapping introduced warnings, for example a value normalization such as `Europe → EU`, the launch button stays blocked until the user explicitly approves the mapping or edits it manually. The user must never run a query whose effective substitutions are still ambiguous.

Before launch, the user should be able to inspect the full context in one place.

**Desired feeling**: “I know exactly what will run, and I trust that this preview matches the real execution.”

#### Step 9: Launch

The user presses **Launch Dataset**.

The final confirmation is not a generic “Are you sure?” modal. It is a run summary:

* dataset,
* effective filters,
* variable inputs,
* warnings still open,
* compiled SQL preview status,
* semantic source summary for important fields,
* what will be recorded for audit.

“Launch” has a concrete execution meaning. Depending on the selected path, ss-tools either:

* sends the prepared execution payload to Superset SQL Lab for execution, or
* redirects the user into a ready-to-run Superset analytical view with the assembled execution context already applied.

In both cases, the user expectation is the same: the execution uses the exact compiled query and runtime parameters they already reviewed.

**Desired feeling**: “This run is controlled, reproducible, and uses the exact query I approved.”

#### Step 10: Post-Run Feedback

After launch, the system confirms:

* run started or completed,
* context saved,
* documentation linked,
* validation snapshot preserved,
* compiled query version associated with the run,
* execution handoff target available,
* semantic mapping decisions preserved for reuse.

The post-run state should provide useful artifacts, such as:

* a link to the created Superset execution session,
* a preview of the first rows of returned data directly in ss-tools when available,
* or an updated saved dataset context that can be reopened and reused later.

The user can reopen the run later and understand the exact state used.

**Desired feeling**: “I can trust this later, not just right now.”

## 5. End-to-End Interaction Model

## 5.1 Main Workspace Structure

**Screen/Component**: Dataset Review Workspace

**Layout**: Adaptive three-column workspace (the center column holds two stacked sections).

### Left Column: Source & Session

* dataset source card,
* Superset import status,
* session state,
* save/resume controls,
* recent actions timeline.

### Center Column: Meaning & Validation

* generated business summary,
* manual override with **[Edit]** for the generated summary and business interpretation,
* documentation draft preview,
* validation findings grouped by severity,
* confidence markers,
* unresolved assumptions.

### Center Column: Columns & Metrics

* semantic layer table for columns and metrics,
* visible values for `verbose_name`, `description`, and formatting metadata where available,
* provenance badges for every semantically enriched field, such as `[ 📄 dict.xlsx ]`, `[ 📊 Dataset: Master Sales ]`, or `[ ✨ AI Guessed ]`,
* side-by-side conflict view when multiple semantic sources disagree,
* **Apply semantic source...** action that opens source selection for file, database dictionary, or existing Superset datasets,
* manual per-field override so the user can keep, replace, or rewrite semantic metadata.

### Right Column: Filters & Execution

* imported filters,
* parameter placeholders,
* **Jinja Template Mapping** block with visible mapping between source filters and detected dataset variables,
* runtime values,
* **Compiled SQL Preview** block or action to open the compiled query returned by the Superset API,
* readiness checklist,
* primary CTA.

This structure matters because the user mentally works across four questions:

1. What is this?
2. Can I trust its meaning?
3. Can I trust what will run?
4. Can I run it?

## 5.2 Primary CTAs by State

The main CTA should change based on readiness:

* **Empty** → `Import from Superset`
* **Intake complete** → `Review Documentation`
* **Semantic source available** → `Apply Semantic Source`
* **Ambiguities present** → `Start Clarification`
* **Mapping warnings present** → `Approve Mapping`
* **Compilation preview missing** → `Generate SQL Preview`
* **Blocking values missing** → `Complete Required Values`
* **Run-ready** → `Launch Dataset`

The product should never make the user guess what the next best action is.
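
The state-to-CTA rule above is essentially a lookup table. A sketch with illustrative state identifiers (the keys are assumptions, not committed names):

```python
# Minimal sketch of the CTA-by-state rule from this section.
PRIMARY_CTA = {
    "empty": "Import from Superset",
    "intake_complete": "Review Documentation",
    "semantic_source_available": "Apply Semantic Source",
    "ambiguities_present": "Start Clarification",
    "mapping_warnings_present": "Approve Mapping",
    "compilation_preview_missing": "Generate SQL Preview",
    "blocking_values_missing": "Complete Required Values",
    "run_ready": "Launch Dataset",
}

def primary_cta(state: str) -> str:
    # Fail loudly on unknown states: the user must never see a stale CTA.
    return PRIMARY_CTA[state]
```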

## 5.3 Information Hierarchy

At any moment, the most visible information should be:

1. current readiness state,
2. blocking problems,
3. imported/recovered context,
4. mapping status between recovered filters and runtime variables,
5. semantic source confidence for key fields,
6. business explanation,
7. compiled SQL preview status,
8. detailed metadata.

Raw detail is valuable, but it should never compete visually with the answer to “Can I proceed?”

## 6. Dialogue UX: Agent Interaction Design

## 6.1 Conversation Pattern

The agent interaction is not a chat for general brainstorming. It is a structured clarification assistant.

Each prompt should contain:

* **Question**
* **Why this matters**
* **Current system guess**
* **Suggested answers**
* **Optional free-form input**
* **Skip for now**

Example interaction:

```text
Question 2 of 5
What does the "region_scope" filter represent in this dataset?

Why this matters:
This value changes how the final aggregation is interpreted.

Current guess:
It appears to mean the reporting region, not the customer region.

Choose one:
[1] Reporting region
[2] Customer region
[3] Both depending on use case
[4] I’m not sure
[5] Enter custom answer
```

This keeps the agent focused, useful, and fast.

## 6.2 Agent-Led Semantic Source Suggestion

The agent may proactively suggest a semantic source when the schema strongly resembles an existing reference.

Example interaction:

```text
Question: Semantic Layer Source

I noticed that 80% of the columns in this dataset (user_id, region, revenue) match the existing "Core_Users_Master" dataset.

Why this matters:
Reusing existing metadata keeps verbose names, descriptions, and d3formats consistent across dashboards.

How would you like to populate the semantic layer?
[1] Copy from "Core_Users_Master" dataset (Recommended)
[2] Upload an Excel (.xlsx) or DB dictionary
[3] Let AI generate them from scratch
[4] Skip and leave as database names
```

This should feel like a smart reuse recommendation, not a forced detour.

## 6.3 Fuzzy Matching Confirmation Pattern

When the user chooses an external dictionary and exact matches are incomplete, the agent should summarize the result clearly before applying it.

Example:

```text
I matched 15 columns exactly from the selected dictionary.
I also found 3 likely semantic matches that need confirmation.

Please review:
- reg_code → region
- rev_total → revenue
- usr_identifier → user_id

How would you like to proceed?
[1] Accept all suggested semantic matches
[2] Review one by one
[3] Ignore fuzzy matches and keep exact ones only
```

The user must understand which matches are exact, which are semantic guesses, and which remain unresolved.
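
A minimal version of this exact-then-fuzzy split can be sketched with stdlib `difflib` as a stand-in for whatever matcher the product actually uses. Note the limitation the sketch exposes: string similarity catches near-spellings such as `reg_code → region`, but a purely semantic pair like `rev_total → revenue` scores too low on string distance and stays unresolved — exactly the class of match that needs the semantic confirmation flow described above.

```python
import difflib

# Illustrative sketch of the exact-then-fuzzy split using stdlib difflib.
def match_columns(dataset_cols, dictionary_cols, cutoff=0.5):
    exact = [c for c in dataset_cols if c in dictionary_cols]
    fuzzy, unresolved = {}, []
    for col in dataset_cols:
        if col in dictionary_cols:
            continue
        close = difflib.get_close_matches(col, dictionary_cols, n=1, cutoff=cutoff)
        if close:
            fuzzy[col] = close[0]  # a guess: must be confirmed by the user
        else:
            unresolved.append(col)  # candidate for semantic (LLM) matching
    return exact, fuzzy, unresolved

exact, fuzzy, unresolved = match_columns(
    ["user_id", "reg_code", "rev_total"],
    ["user_id", "region", "revenue"],
)
# exact → ['user_id']; fuzzy → {'reg_code': 'region'}; unresolved → ['rev_total']
```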

## 6.4 Agent Tone

The agent should sound:

* precise,
* calm,
* operational,
* non-judgmental.

It should never imply the user made a mistake when data is ambiguous. Ambiguity is treated as a normal property of datasets.

## 6.5 Dialogue Controls

The user must be able to:

* skip a question,
* save and exit,
* review previous answers,
* revise a prior answer,
* mark an item as “needs expert review”.

These controls are critical for real-world data workflows.

## 6.6 Dialogue Exit Conditions

The user can leave dialogue mode when:

* all blocking ambiguities are resolved,
* the user chooses to continue with warnings,
* the session is saved for later,
* no further useful clarification can be generated.

The agent must explicitly summarize what changed before exit:

* resolved items,
* still unresolved items,
* effect on run readiness.

## 7. State Model

### State 1: Empty

No dataset loaded. Clear entry choices.

### State 2: Importing

Progressive loading with visible milestones.

### State 3: Review Ready

Documentation and validation visible. User can understand the dataset immediately.

### State 4: Semantic Source Review Needed

The system found reusable semantic sources, but the user still needs to choose, approve, or reject them.

### State 5: Clarification Needed

There are meaningful unresolved items. Product suggests dialogue mode.

### State 6: Clarification Active

One-question-at-a-time guided flow.

### State 7: Mapping Review Needed

Recovered filters and detected Jinja variables exist, but the mapping still requires approval, correction, or completion.

### State 8: Compiled Preview Ready

Superset has compiled the current parameter set, and the user can inspect the exact SQL that would run.

### State 9: Partially Ready

No blockers, but warnings remain.

### State 10: Run Ready

Everything required for launch is complete.

### State 11: Run In Progress

Execution feedback and status tracking.

### State 12: Completed

Run outcome and saved context available.

### State 13: Recovery Required

Import, mapping, semantic enrichment, or compilation was partial; manual or guided recovery needed.
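
For implementers, the thirteen states can be captured as a simple enumeration; the names below are assumptions, not a committed API. Per the state definitions, Partially Ready has no blockers (only open warnings), so launch is possible in it as well as in Run Ready.

```python
from enum import Enum, auto

# Illustrative only: the thirteen workspace states as an enum, with one
# helper answering the question the UI must always answer first.
class WorkspaceState(Enum):
    EMPTY = auto()
    IMPORTING = auto()
    REVIEW_READY = auto()
    SEMANTIC_SOURCE_REVIEW_NEEDED = auto()
    CLARIFICATION_NEEDED = auto()
    CLARIFICATION_ACTIVE = auto()
    MAPPING_REVIEW_NEEDED = auto()
    COMPILED_PREVIEW_READY = auto()
    PARTIALLY_READY = auto()
    RUN_READY = auto()
    RUN_IN_PROGRESS = auto()
    COMPLETED = auto()
    RECOVERY_REQUIRED = auto()

def launch_allowed(state: WorkspaceState) -> bool:
    # Partially Ready has no blockers (warnings may remain open),
    # so both of these states permit launch.
    return state in (WorkspaceState.PARTIALLY_READY, WorkspaceState.RUN_READY)
```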
## 8. Key User Decisions

The UX must support these decisions explicitly:

* Is this imported context trustworthy enough?
* Which semantic source should define `verbose_name`, `description`, and `d3format`?
* Do I want to reuse a master dataset or apply a spreadsheet/database dictionary?
* Should I accept fuzzy semantic matches or only exact ones?
* Do I need clarification now or can I continue?
* Are the filters correct as imported?
* Which source filter should map to which Jinja variable?
* Is the transformed value acceptable if normalization was applied?
* Which values are confirmed versus guessed?
* Does the compiled SQL match my intent?
* Is the dataset safe enough to run?
* Do I want to save current progress and come back later?

If the interface does not make these decisions visible, the user will feel lost even if the feature is technically correct.

## 9. UI Layout & Flow

**Screen**: Dataset Review Workspace

* **Top Bar**:
  * Source badge
  * Dataset name
  * Readiness status pill
  * Save session
  * Export summary

* **Hero Summary Block**:
  * “What this dataset is”
  * “What is ready”
  * “What still needs attention”
  * Primary CTA
  * **[Edit]** action for manual correction

* **Tabs or Sections**:
  * Overview
  * Documentation
  * Semantic Layer
  * Validation
  * Filters
  * Mapping
  * SQL Preview
  * Clarification History
  * Run History

* **Right Rail**:
  * readiness checklist,
  * semantic source status,
  * missing required values,
  * mapping warnings,
  * SQL preview status,
  * launch button.

## 10. Micro-Interactions

* Imported filters should animate into the panel one by one as they are recovered.
* Detected Jinja variables should appear as a second wave of recovered context so the user understands execution awareness is expanding.
* Detected semantic source candidates should appear as a third wave, with confidence labels and provenance badges.
* Every clarified answer should immediately remove or downgrade a validation finding where relevant.
* Provenance badges should update live:
  * Confirmed
  * Imported
  * Inferred
  * AI Draft
  * Mapped
  * Needs Review
* The primary CTA should change smoothly, not abruptly, as the state progresses.
* When launch becomes available, the interface should celebrate readiness subtly but should not hide remaining warnings.
* Value transformations proposed by mapping should be visually diffed so the user can spot changes like `Europe → EU` instantly.
* The compiled SQL preview should visibly refresh when mapping or parameter values change.
* Manual semantic overrides should visually lock the affected field so later imports do not silently replace it.

## 11. Error Experience

**Philosophy**: Never show a dead end. Every error state must preserve recovered value, explain what failed, and show the nearest path forward.

### Scenario A: Superset Link Recognized, Filter Extraction Partial

* **User Action**: Pastes a valid Superset link with partially recoverable filter state.
* **System Response**:
  * “We recovered the dataset and 3 filters, but 2 saved filters need manual review.”
  * Missing or low-confidence filters are listed explicitly.
  * The system still opens the workspace with partial value.
* **Recovery**:
  * review recovered filters,
  * add missing ones manually,
  * ask the agent to help reconstruct intent.

### Scenario B: No Clear Business Meaning Can Be Inferred

* **User Action**: Submits a technically valid dataset with poor metadata.
* **System Response**:
  * “We could identify the structure of this dataset, but not its business meaning.”
  * Documentation remains skeletal but usable.
  * Clarification becomes the obvious next step.
* **Recovery**:
  * launch dialogue mode,
  * invite domain expert input,
  * save draft and resume later.

### Scenario C: Required Run-Time Values Missing

* **User Action**: Tries to launch with incomplete placeholders.
* **System Response**:
  * launch blocked,
  * missing values highlighted in-place,
  * concise summary of what is required.
* **Recovery**:
  * fill values inline,
  * return to review,
  * or save incomplete context.

### Scenario D: Conflicting Meanings Across Sources

* **User Action**: Reviews a dataset where imported filter context and documented semantics conflict.
* **System Response**:
  * both candidate meanings are shown side-by-side,
  * neither is silently chosen if confidence is low,
  * the conflict is framed as a decision, not a failure.
* **Recovery**:
  * user confirms one meaning,
  * leaves the item unresolved,
  * or marks it for expert review.

### Scenario E: User Leaves Mid-Flow

* **User Action**: Closes the session before clarification or run prep is complete.
* **System Response**:
  * autosave or explicit save confirmation,
  * summary of current progress,
  * preserved unresolved items.
* **Recovery**:
  * resume from the last state without repeating prior answers.

### Scenario F: Superset API Compilation Failed

* **User Action**: The mapped runtime values are sent for Jinja compilation, but Superset returns a compilation error.
* **System Response**:
  * the **Compiled SQL Preview** switches into an error state instead of pretending a preview is available,
  * the problematic variable or mapping row is highlighted,
  * the original compilation error returned by Superset is shown in readable form,
  * launch remains blocked until the issue is resolved.
* **Recovery**:
  * user manually edits the mapped value,
  * user changes the filter-to-template mapping,
  * or user asks the agent to help normalize the value format and then regenerates the preview.

### Scenario G: Semantic Sources Conflict

* **User Action**: A column has one value from a spreadsheet dictionary, a different value from a reference dataset, and a third AI-generated proposal.
* **System Response**:
  * the interface shows a side-by-side comparison instead of silently choosing one,
  * the higher-priority source is highlighted as recommended,
  * the conflict is marked as a warning if user input would be changed.
* **Recovery**:
  * user selects one source,
  * user keeps the current manual value,
  * or user applies the recommended higher-confidence source field by field.

## 12. UX for Trust & Transparency

Trust is central to this feature.

The interface must visibly answer:

* Where did this value come from?
* Did the system infer this or did the user confirm it?
* Which runtime variable will receive this value?
* Was the final SQL preview compiled by Superset or just estimated locally?
* Did this semantic label come from a dictionary, another dataset, a fuzzy match, or AI generation?
* What is still unknown?
* What will happen if I proceed anyway?

Recommended trust markers:

* provenance badge on important fields,
* confidence labels for imported or inferred data,
* mapping approval status,
* “compiled by Superset” status on the SQL preview,
* “last changed by” and “changed in clarification” notes,
* “used in run” markers for final execution inputs.

Conflict rule:

* The system must never silently overwrite user-entered semantic values with data from a dictionary, another dataset, or AI generation.
* If multiple sources disagree, the interface shows them side by side and either:
  * asks the user to choose, or
  * recommends the highest-priority source while clearly marking the recommendation as a warning until approved.
* Manual user input remains the most sensitive value and must be preserved unless the user explicitly replaces it.
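
The conflict rule can be sketched as a precedence function. Source names and priority order below are illustrative assumptions; the invariants are the document's: manual input always wins and is never flagged, while any disagreement among automatic sources keeps the recommended value flagged until approved.

```python
# Sketch of the conflict rule (source names illustrative). Manual input
# always wins; other sources are only recommendations until approved.
SOURCE_PRIORITY = ["manual", "dictionary", "reference_dataset", "fuzzy_match", "ai_draft"]

def resolve_field(candidates: dict) -> tuple[str, str, bool]:
    """Return (value, source, needs_approval) for one semantic field.

    candidates maps a source name to its proposed value.
    """
    if "manual" in candidates:
        # User-entered values are never silently overwritten or flagged.
        return candidates["manual"], "manual", False
    for source in SOURCE_PRIORITY:
        if source in candidates:
            # If automatic sources disagree, recommend the highest-priority
            # one but keep it flagged as a warning until the user approves.
            conflicted = len(set(candidates.values())) > 1
            return candidates[source], source, conflicted
    raise KeyError("no candidate sources")
```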

## 13. UX for Collaboration

This workflow often spans more than one person.

The UX should support:

* sharing the documentation draft,
* handing a clarification session to a domain expert,
* preserving unresolved questions explicitly,
* recording who confirmed which meaning,
* sharing the reviewed compiled SQL preview as part of execution approval,
* sharing which semantic sources were applied and why.

The user should be able to leave behind a state that another person can understand in under a minute.

## 14. Tone & Voice

* **Style**: Concise, trustworthy, operational, and transparent.
* **System behavior language**:
  * Prefer: “Recovered”, “Confirmed”, “Imported”, “Inferred”, “AI Draft”, “Needs review”, “Ready to run”, “Compiled by Superset”
  * Avoid: “Magic”, “Solved”, “Guaranteed”, “Auto-fixed”
* **Terminology**:
  * Use “dataset”, “clarification”, “validation finding”, “run context”, “imported filter”, “Jinja variable”, “template mapping”, “compiled SQL preview”, “semantic source”, “provenance”, “assumption”, “confidence”.
  * Avoid overly technical wording in primary UX surfaces when a business-readable phrase exists.

## 15. UX Success Signals

The UX is working if users can, with minimal hesitation:

* understand what dataset they are dealing with,
* see what was recovered from Superset,
* see which Jinja variables were discovered for runtime execution,
* understand which semantic source supplied each important field,
* reuse existing semantic assets before accepting AI guesses,
* tell which values are trustworthy,
* review and approve filter-to-template mapping without confusion,
* inspect the final compiled SQL before launch,
* resolve only the ambiguities that matter,
* reach a clear run/no-run decision,
* reopen the same context later without confusion.