Tasks ready
109
specs/027-dataset-llm-orchestration/tasks.md
Normal file
@@ -0,0 +1,109 @@
# Tasks: LLM Dataset Orchestration

**Feature Branch**: `027-dataset-llm-orchestration`
**Implementation Plan**: [`specs/027-dataset-llm-orchestration/plan.md`](/home/busya/dev/ss-tools/specs/027-dataset-llm-orchestration/plan.md)

---

## Phase 1: Setup

- [ ] T001 Initialize backend service directory structure for `dataset_review` in `backend/src/services/dataset_review/`
- [ ] T002 Initialize frontend component directory for `dataset-review` in `frontend/src/lib/components/dataset-review/`
- [ ] T003 Register `ff_dataset_auto_review`, `ff_dataset_clarification`, and `ff_dataset_execution` feature flags in configuration
- [ ] T004 [P] Seed new `DATASET_REVIEW_*` permissions in `backend/src/scripts/seed_permissions.py`
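
As an illustration of T003, the three flags could be registered roughly as follows. This is a minimal sketch, not the project's configuration layer: the `DatasetReviewFlags` class, the `load_flags` helper, the upper-cased environment variable names, and the default of `False` are all assumptions made for the example. Only the flag names come from the task list.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional


@dataclass(frozen=True)
class DatasetReviewFlags:
    """Hypothetical flag container; only the field names come from T003."""
    ff_dataset_auto_review: bool = False   # default is an assumption
    ff_dataset_clarification: bool = False
    ff_dataset_execution: bool = False


def load_flags(env: Optional[Mapping[str, str]] = None) -> DatasetReviewFlags:
    """Read each flag from an upper-cased environment variable (assumed naming)."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}
    values = {}
    for f in fields(DatasetReviewFlags):
        raw = env.get(f.name.upper())
        if raw is not None:
            values[f.name] = raw.strip().lower() in truthy
    return DatasetReviewFlags(**values)


flags = load_flags({"FF_DATASET_AUTO_REVIEW": "true"})
print(flags)
```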

---

## Phase 2: Foundational Layer

- [ ] T005 [P] Implement Core SQLAlchemy models for session, profile, and findings in `backend/src/models/dataset_review.py`
- [ ] T006 [P] Implement Semantic, Mapping, and Clarification models in `backend/src/models/dataset_review.py`
- [ ] T007 [P] Implement Preview and Launch Audit models in `backend/src/models/dataset_review.py`
- [ ] T008 [P] Implement `DatasetReviewSessionRepository` (CRITICAL: C5, PRE: auth scope, POST: consistent aggregates, INVARIANTS: ownership scope) in `backend/src/services/dataset_review/repositories/session_repository.py`
- [ ] T009 [P] Create Pydantic schemas for Session Summary and Detail in `backend/src/schemas/dataset_review.py`
- [ ] T010 [P] Create Svelte store for session management in `frontend/src/lib/stores/datasetReviewSession.js`
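
The ownership-scope invariant named in T008 can be sketched with an in-memory stand-in. This is only an illustration under stated assumptions: the real repository is expected to sit on the SQLAlchemy models from T005-T007, while the `ReviewSession` fields, the `InMemorySessionRepository` class, and the choice to return `None` for sessions owned by someone else are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ReviewSession:
    """Hypothetical aggregate; field names are illustrative only."""
    session_id: str
    owner_id: str
    status: str = "empty"
    findings: List[str] = field(default_factory=list)


class InMemorySessionRepository:
    """Dictionary-backed stand-in for a database-backed repository."""

    def __init__(self) -> None:
        self._sessions: Dict[str, ReviewSession] = {}

    def save(self, session: ReviewSession, acting_user_id: str) -> None:
        # PRE: the caller must act inside the owner's scope.
        if session.owner_id != acting_user_id:
            raise PermissionError("session is outside the caller's ownership scope")
        self._sessions[session.session_id] = session

    def get(self, session_id: str, acting_user_id: str) -> Optional[ReviewSession]:
        # INVARIANT: a session is never returned to a non-owner
        # (assumed policy: hide it rather than raise).
        session = self._sessions.get(session_id)
        if session is None or session.owner_id != acting_user_id:
            return None
        return session


repo = InMemorySessionRepository()
repo.save(ReviewSession("s1", owner_id="alice"), acting_user_id="alice")
print(repo.get("s1", "alice"), repo.get("s1", "bob"))
```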

---

## Phase 3: User Story 1 — Automatic Review (P1)

**Goal**: Submitting a link or dataset produces an immediate, readable summary and semantic enrichment from trusted sources.

**Independent Test**: Submit a Superset link; verify that a session is created, a summary is generated, and findings are populated without manual intervention.

- [ ] T011 [P] [US1] Implement `StartSessionRequest` and lifecycle endpoints in `backend/src/api/routes/dataset_review.py`
- [ ] T012 [US1] Implement `DatasetReviewOrchestrator.start_session` (CRITICAL: C5, PRE: non-empty input, POST: enqueued recovery, BELIEF: uses `belief_scope`) in `backend/src/services/dataset_review/orchestrator.py`
- [ ] T013 [P] [US1] Implement `SupersetContextExtractor.parse_superset_link` (CRITICAL: C4, PRE: parseable link, POST: resolved target, REL: uses `SupersetClient`) in `backend/src/core/utils/superset_context_extractor.py`
- [ ] T014 [US1] Implement `SemanticSourceResolver.resolve_from_dictionary` (CRITICAL: C4, PRE: source exists, POST: confidence-ranked candidates) in `backend/src/services/dataset_review/semantic_resolver.py`
- [ ] T015 [US1] Implement Documentation and Validation export endpoints (JSON/Markdown) in `backend/src/api/routes/dataset_review.py`
- [ ] T016 [P] [US1] Implement `SourceIntakePanel` (C3, UX_STATE: Idle/Validating/Rejected) in `frontend/src/lib/components/dataset-review/SourceIntakePanel.svelte`
- [ ] T017 [P] [US1] Implement `ValidationFindingsPanel` (C3, UX_STATE: Blocking/Warning/Info) in `frontend/src/lib/components/dataset-review/ValidationFindingsPanel.svelte`
- [ ] T018 [US1] Create main `DatasetReviewWorkspace` (CRITICAL: C5, UX_STATE: Empty/Importing/Review) in `frontend/src/routes/datasets/review/[id]/+page.svelte`
- [ ] T019 [US1] Verify implementation matches ux_reference.md (Happy Path & Errors)
- [ ] T020 [US1] Acceptance: Perform semantic audit & algorithm emulation by Tester
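
The PRE/POST contract on T013 (parseable link in, resolved target out) might look like the sketch below. It is an assumption-heavy illustration: the `SupersetTarget` type and the recognised URL shapes (a `dashboard/<id-or-slug>` path segment, and `slice_id` or `datasource_id` query parameters on an `explore` path) are chosen for the example and are not taken from the plan. The real extractor is expected to resolve targets through `SupersetClient`, which is not modelled here.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlparse


@dataclass(frozen=True)
class SupersetTarget:
    """Hypothetical result type: what the link points at and its reference."""
    kind: str  # "dashboard", "chart", or "dataset"
    ref: str


def parse_superset_link(link: str) -> SupersetTarget:
    parsed = urlparse(link.strip())
    # PRE: the input must be a parseable http(s) link.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a parseable link: {link!r}")

    parts = [p for p in parsed.path.split("/") if p]
    query = parse_qs(parsed.query)

    # Assumed URL shape: .../dashboard/<id-or-slug>/
    if "dashboard" in parts:
        idx = parts.index("dashboard")
        if idx + 1 < len(parts):
            return SupersetTarget("dashboard", parts[idx + 1])

    # Assumed URL shape: .../explore/?slice_id=N or ?datasource_id=N
    if "explore" in parts:
        if "slice_id" in query:
            return SupersetTarget("chart", query["slice_id"][0])
        if "datasource_id" in query:
            return SupersetTarget("dataset", query["datasource_id"][0])

    # POST: either a resolved target is returned or the link is rejected.
    raise ValueError(f"link does not reference a known Superset object: {link!r}")


print(parse_superset_link("https://bi.example.com/superset/dashboard/42/"))
```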

---

## Phase 4: User Story 2 — Guided Clarification (P2)

**Goal**: Resolve ambiguities and conflicting metadata through a one-question-at-a-time dialogue.

**Independent Test**: Open a session with unresolved findings; answer questions one by one and verify that the readiness state updates in real time.

- [ ] T021 [P] [US2] Implement `ClarificationEngine.build_question_payload` (CRITICAL: C4, PRE: unresolved state, POST: prioritized question) in `backend/src/services/dataset_review/clarification_engine.py`
- [ ] T022 [US2] Implement `ClarificationEngine.record_answer` (CRITICAL: C4, PRE: question active, POST: answer persisted before state advance) in `backend/src/services/dataset_review/clarification_engine.py`
- [ ] T023 [P] [US2] Implement field-level semantic override and lock endpoints in `backend/src/api/routes/dataset_review.py`
- [ ] T024 [US2] Implement `SemanticLayerReview` component (C3, UX_STATE: Conflicted/Manual) in `frontend/src/lib/components/dataset-review/SemanticLayerReview.svelte`
- [ ] T025 [P] [US2] Implement `ClarificationDialog` (C3, UX_STATE: Question/Saving/Completed, REL: binds to `assistantChat`) in `frontend/src/lib/components/dataset-review/ClarificationDialog.svelte`
- [ ] T026 [US2] Implement LLM feedback (👍/👎) storage and UI handlers in `backend/src/api/routes/dataset_review.py`
- [ ] T027 [US2] Verify implementation matches ux_reference.md (Happy Path & Errors)
- [ ] T028 [US2] Acceptance: Perform semantic audit & algorithm emulation by Tester
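
The contracts on T021 and T022 (one prioritized question at a time, answer persisted before the state advances) can be sketched as two plain functions. Everything here is hypothetical scaffolding: the `Question` and `ClarificationState` types, the "lower number means higher priority" rule, and the `persisted` dictionary standing in for database storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Question:
    question_id: str
    text: str
    priority: int  # assumption: lower value is asked first
    answer: Optional[str] = None


@dataclass
class ClarificationState:
    questions: List[Question]
    active_id: Optional[str] = None
    persisted: Dict[str, str] = field(default_factory=dict)  # stand-in for the DB


def build_question_payload(state: ClarificationState) -> Optional[Question]:
    """Pick the single highest-priority unresolved question, if any."""
    unresolved = [q for q in state.questions if q.answer is None]
    if not unresolved:
        state.active_id = None
        return None
    nxt = min(unresolved, key=lambda q: q.priority)
    state.active_id = nxt.question_id
    return nxt


def record_answer(state: ClarificationState, question_id: str, answer: str) -> Optional[Question]:
    # PRE: only the currently active question may be answered.
    if state.active_id != question_id:
        raise ValueError(f"question {question_id!r} is not active")
    # POST: persist first, then advance the in-memory state.
    state.persisted[question_id] = answer
    for q in state.questions:
        if q.question_id == question_id:
            q.answer = answer
    return build_question_payload(state)


state = ClarificationState([Question("q1", "Which currency?", 2), Question("q2", "Which region column?", 1)])
first = build_question_payload(state)
print(first.question_id, record_answer(state, first.question_id, "region_code"))
```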

---

## Phase 5: User Story 3 — Controlled Execution (P3)

**Goal**: Review mappings, generate a Superset-side preview, and launch an audited SQL Lab execution.

**Independent Test**: Map filters to variables; trigger a preview; verify that launch is blocked until the preview succeeds; verify SQL Lab session creation.

- [ ] T029 [P] [US3] Implement `SupersetContextExtractor.recover_imported_filters` and variable discovery in `backend/src/core/utils/superset_context_extractor.py`
- [ ] T030 [US3] Implement `SupersetCompilationAdapter.compile_preview` (CRITICAL: C4, PRE: effective inputs available, POST: Superset-compiled SQL only) in `backend/src/core/utils/superset_compilation_adapter.py`
- [ ] T031 [US3] Implement `DatasetReviewOrchestrator.launch_dataset` (CRITICAL: C5, PRE: run-ready + preview match, POST: audited run context) in `backend/src/services/dataset_review/orchestrator.py`
- [ ] T032 [P] [US3] Implement mapping approval and preview trigger endpoints in `backend/src/api/routes/dataset_review.py`
- [ ] T033 [P] [US3] Implement `ExecutionMappingReview` component (C3, UX_STATE: WarningApproval/Approved) in `frontend/src/lib/components/dataset-review/ExecutionMappingReview.svelte`
- [ ] T034 [P] [US3] Implement `CompiledSQLPreview` component (C3, UX_STATE: Ready/Stale/Error) in `frontend/src/lib/components/dataset-review/CompiledSQLPreview.svelte`
- [ ] T035 [US3] Implement `LaunchConfirmationPanel` (C3, UX_STATE: Blocked/Ready/Submitted) in `frontend/src/lib/components/dataset-review/LaunchConfirmationPanel.svelte`
- [ ] T036 [US3] Verify implementation matches ux_reference.md (Happy Path & Errors)
- [ ] T037 [US3] Acceptance: Perform semantic audit & algorithm emulation by Tester
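
The "run-ready + preview match" precondition on T031 can be illustrated with a small gate that refuses to launch unless a successful preview exists for exactly the current inputs. This is a sketch under assumptions: the `Preview` type, the SHA-256 fingerprint over JSON-serialised inputs, and the `launch_block_reason` helper are invented for the example. The status values mirror the Ready/Stale/Error states listed for `CompiledSQLPreview`.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, Optional


def fingerprint(inputs: Dict[str, str]) -> str:
    """Stable digest of the effective inputs (assumed matching strategy)."""
    payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class Preview:
    status: str              # "ready", "stale", or "error"
    inputs_fingerprint: str  # fingerprint of the inputs the SQL was compiled from
    compiled_sql: str        # SQL as compiled on the Superset side


def launch_block_reason(preview: Optional[Preview], current_inputs: Dict[str, str]) -> Optional[str]:
    """Return why launch is blocked, or None when launch may proceed."""
    if preview is None:
        return "no preview has been generated"
    if preview.status != "ready":
        return f"preview is {preview.status}"
    if preview.inputs_fingerprint != fingerprint(current_inputs):
        return "inputs changed since the preview was compiled"
    return None


inputs = {"region": "EU", "year": "2024"}
preview = Preview("ready", fingerprint(inputs), "SELECT 1")
print(launch_block_reason(preview, inputs))
print(launch_block_reason(preview, {**inputs, "year": "2025"}))
```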

---

## Final Phase: Polish & Security

- [ ] T038 Implement `SessionEvent` logger and persistence logic in `backend/src/services/dataset_review/event_logger.py`
- [ ] T039 Implement automatic version propagation logic for updated `SemanticSource` entities
- [ ] T040 Add batch approval API and UI actions for mapping/semantics
- [ ] T041 Add integration tests for Superset version compatibility matrix in `backend/tests/services/dataset_review/test_superset_matrix.py`
- [ ] T042 Final audit of RBAC enforcement across all session-mutation endpoints
- [ ] T043 Verify i18n coverage for all user-facing strings in `frontend/src/lib/i18n/`

---

## Dependencies & Strategy

### Story Completion Order

1. **Foundation** (Blocking: T005-T010)
2. **User Story 1** (Blocking for US2 and US3)
3. **User Story 2** (Can be implemented in parallel with parts of US3, but requires US1 findings)
4. **User Story 3** (Terminal action)

### Parallel Execution Opportunities

- T011, T013, T016 (API, Parser, UI Setup) can run simultaneously once T001-T010 are done.
- T021 and T025 (Clarification Backend/Frontend) can run in parallel.
- T030 and T034 (Preview Backend/Frontend) can run in parallel.
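
The ordering constraints above can be checked mechanically. The sketch below uses the standard-library `graphlib` module to group phases into waves that may run in parallel. The phase-level graph is a simplification: the edge from setup to foundation is inferred from the phase numbering rather than stated outright, and task-level dependencies inside each story are not modelled.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each key depends on the phases in its value set.
PHASE_DEPENDENCIES: Dict[str, Set[str]] = {
    "foundation": {"setup"},  # assumption: Phase 1 precedes Phase 2
    "us1": {"foundation"},    # Foundation is blocking
    "us2": {"us1"},           # US2 requires US1 findings
    "us3": {"us1"},           # US1 is blocking for US3
}


def execution_waves(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Group nodes into successive waves; nodes within a wave are independent."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(execution_waves(PHASE_DEPENDENCIES))
```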

### Implementation Strategy

- **MVP First**: Implement US1 with hardcoded trusted sources to prove the session/summary lifecycle.
- **Incremental Delivery**: Release US1 for documentation value, then US2 for metadata cleanup, and finally US3 for execution.
- **WYSIWWR Guard**: T030 must never be compromised; if the Superset API fails, the implementation must prioritize the "Manual Launch" fallback defined in research.