Tasks: LLM Table Translation Service

Feature Branch: 028-llm-datasource-supeset
Input: Design documents from /specs/028-llm-datasource-supeset/
Prerequisites: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/modules.md

Tests: Test tasks are included for all C4/C5 backend contracts, new API endpoints, and Svelte components with @UX_STATE contracts. Test work traces to contract @PRE/@POST guarantees and spec acceptance scenarios.

Organization: Tasks are grouped by user story to enable independent implementation and testing of each story.

Format: [ID] [P?] [Story] Description

  • [P]: Can run in parallel (different files, no dependencies)
  • [Story]: Which user story this task belongs to (e.g., US1, US5)
  • Include exact file paths in descriptions

Phase 1: Setup (Shared Infrastructure)

Purpose: Create plugin directory structure and register the new route module in the lazy-import registry.

  • T001 Create translation plugin directory structure: backend/src/plugins/translate/__init__.py, backend/src/plugins/translate/plugin.py (empty skeleton), plus backend/src/plugins/translate/__tests__/__init__.py
  • T002 Register translate route module in backend/src/api/routes/__init__.py — add "translate" to __all__ list inside [DEF:Route_Group_Contracts:Block]

Phase 2: Foundational (Blocking Prerequisites)

Purpose: ORM models, Pydantic schemas, plugin boilerplate, route skeleton, and database migration. ALL user stories depend on these artifacts.

⚠️ CRITICAL: No user story work can begin until this phase is complete.

ORM Models

  • T003 [P] Create all SQLAlchemy ORM models in backend/src/models/translate.py: TranslationJob, TranslationRun, TranslationBatch, TranslationRecord, TranslationEvent, TranslationPreviewSession, TranslationPreviewRecord, TerminologyDictionary, DictionaryEntry, TranslationSchedule, TranslationJobDictionary, MetricSnapshot. Follow patterns from backend/src/models/llm.py (UUID PKs, generate_uuid, Base inheritance, JSON columns, UniqueConstraint, indexes, timezone-aware DateTime with callable defaults). Include source_term_normalized column on DictionaryEntry with unique constraint for case-insensitive matching.
  • T004 [P] Create Pydantic v2 request/response schemas in backend/src/schemas/translate.py: TranslateJobCreate, TranslateJobUpdate, TranslateJobResponse, DictionaryCreate, DictionaryImport, DictionaryResponse, TermCorrectionSubmit, ScheduleConfig, TranslationRunResponse, TranslationPreviewResponse (with PreviewRow), MetricsResponse. Follow existing backend/src/schemas/ patterns (use BaseModel, Field with defaults/validation)
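The model conventions named in T003 can be sketched with plain helpers (the helper names are illustrative, not the actual backend/src/models code):

```python
import uuid
from datetime import datetime, timezone

def generate_uuid() -> str:
    """Callable default for UUID primary keys (stored as strings)."""
    return str(uuid.uuid4())

def utc_now() -> datetime:
    """Timezone-aware DateTime default; passed to the column as a callable,
    never called at class-definition time."""
    return datetime.now(timezone.utc)

def normalize_source_term(term: str) -> str:
    """Value stored in DictionaryEntry.source_term_normalized: casefolded and
    whitespace-collapsed, so the unique constraint matches case-insensitively."""
    return " ".join(term.split()).casefold()
```

Writing the normalized value at insert/update time lets the database-level unique constraint do the case-insensitive duplicate detection instead of application code.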

Plugin Skeleton

  • T005 Create TranslatePlugin class in backend/src/plugins/translate/plugin.py inheriting from PluginBase. Implement id, name, description properties. Wire @RELATION INHERITS -> [PluginBase:Class] in contract header. (RATIONALE: separate plugin avoids bloating LLMAnalysisPlugin beyond fractal limit; REJECTED: extending LLMAnalysisPlugin would conflate domains)

Route Skeleton

  • T006 Create backend/src/api/routes/translate.py with FastAPI APIRouter (prefix=/api/translate, tags=["translate"]). Define all endpoint stubs with pass bodies for now: CRUD jobs, CRUD dictionaries, preview trigger, run trigger, retry, schedule CRUD, run history, metrics, correction submission, dictionary import. Attach Depends(require_permission(...)) annotations. Register router in backend/src/app.py alongside existing routers.

Database Migration

  • T007 Generate Alembic migration for all translate_* tables backing the T003 models: translation_jobs, translation_runs, translation_batches, translation_records, translation_events, translation_preview_sessions, translation_preview_records, terminology_dictionaries, dictionary_entries, translation_schedules, translation_job_dictionaries, metric_snapshots. Run cd backend && alembic revision --autogenerate -m "add translation tables" and alembic upgrade head; verify the generated migration covers every model from T003.

RBAC Registration

  • T008 Register 13 permission strings in the RBAC seed/permission store: translate.job.view, translate.job.create, translate.job.edit, translate.job.delete, translate.job.execute, translate.dictionary.view, translate.dictionary.create, translate.dictionary.edit, translate.dictionary.delete, translate.schedule.view, translate.schedule.manage, translate.history.view, translate.metrics.view. Ensure admin role gets all; analyst role gets translate.job.view, translate.job.execute, translate.dictionary.view, translate.history.view. Update role seeding script if needed.
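A minimal registry sketch for T008 (the registry structure and has_permission helper are assumptions; only the permission strings and role grants come from the task):

```python
# The 13 permission strings registered by T008.
PERMISSIONS = [
    "translate.job.view", "translate.job.create", "translate.job.edit",
    "translate.job.delete", "translate.job.execute",
    "translate.dictionary.view", "translate.dictionary.create",
    "translate.dictionary.edit", "translate.dictionary.delete",
    "translate.schedule.view", "translate.schedule.manage",
    "translate.history.view", "translate.metrics.view",
]

ROLE_GRANTS = {
    "admin": set(PERMISSIONS),  # admin receives every translate.* permission
    "analyst": {
        "translate.job.view", "translate.job.execute",
        "translate.dictionary.view", "translate.history.view",
    },
}

def has_permission(role: str, permission: str) -> bool:
    """Minimal check mirroring require_permission() semantics (assumed)."""
    return permission in ROLE_GRANTS.get(role, set())
```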

Checkpoint: Foundation ready — models, schemas, plugin, routes, migration, and RBAC all in place. User story implementation can now begin.


Phase 3: User Story 1 — Configure Translation Job (Priority: P1) 🎯 MVP

Goal: User can create, edit, delete, and list translation jobs with datasource selection, column mapping, key columns, target table configuration, LLM settings, and dictionary attachment.

Independent Test: Open Configuration form → select Superset datasource → pick translation/context/key columns → specify target table → save → verify job appears in list with correct settings.

Backend — Job CRUD

  • T009 [P] [US1] Implement job CRUD service in backend/src/plugins/translate/plugin.py as methods on the TranslatePlugin class: create_job(), update_job(), delete_job(), get_job(), list_jobs(), duplicate_job(). Validate column existence via SupersetClient on create/update (FR-001, FR-002, FR-006). Enforce composite key support (FR-004). Detect virtual columns and warn (US1 acceptance scenario 5).
  • T010 [US1] Implement /api/translate/jobs endpoints in backend/src/api/routes/translate.py: POST / (create), GET / (list), GET /{job_id} (get), PUT /{job_id} (update), DELETE /{job_id} (delete), POST /{job_id}/duplicate (duplicate — FR-021). Inject Depends(require_permission("translate.job.*")) per operation.
  • T011 [US1] Implement /api/translate/datasources/{datasource_id}/columns endpoint that queries Superset for column metadata (name, type, is_physical flag) and the database dialect (backend/engine) from the connection configuration. Returns column list AND database_dialect field for the frontend. Cache dialect on TranslationJob.database_dialect at save time. Reject unsupported dialects at configuration time (FR-002, dialect detection).
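Dialect detection for T011 can be sketched as URI-scheme parsing on the Superset connection string (the scheme parsing, alias table, and supported-dialect set are assumptions for illustration):

```python
SUPPORTED_DIALECTS = {"postgresql", "clickhouse"}  # Greenplum connections report postgresql

def resolve_dialect(sqlalchemy_uri: str) -> str:
    """Return the dialect cached on TranslationJob.database_dialect at save
    time, or raise for dialects rejected at configuration time."""
    scheme = sqlalchemy_uri.split("://", 1)[0].split("+", 1)[0].lower()
    dialect = {"postgres": "postgresql"}.get(scheme, scheme)
    if dialect not in SUPPORTED_DIALECTS:
        raise ValueError(f"unsupported dialect: {dialect}")
    return dialect
```

Caching the result on the job at save time means the executor and SQL generator never re-derive it per run.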

Frontend — Job Config UI

  • T012 [P] [US1] Create TranslateApiClient module in frontend/src/lib/api/translate.js: fetchJobs(), createJob(), updateJob(), deleteJob(), duplicateJob(), fetchDatasourceColumns(). Use existing requestApi/fetchApi wrapper pattern.
  • T013 [US1] Create TranslationJobList SvelteKit page in frontend/src/routes/translate/+page.svelte: list all jobs with name, datasource, status/schedule indicators, create button, duplicate action. @UX_STATE: idle, loading, empty, populated, error.
  • T014 [US1] Create TranslationJobConfig SvelteKit page in frontend/src/routes/translate/[id]/+page.svelte: datasource dropdown → column selectors (translation column, context columns, key columns with [+ Add key] for composite), target table/column inputs, LLM provider selector, target language, batch size, prompt template editor, dictionary attachment (multi-select with priority ordering). @UX_STATE: idle, loading, configured, saving, validation_error, datasource_unavailable. @UX_REACTIVITY: column list $derived from datasource selection.

Verification — US1

  • T015 [US1] Write pytest integration tests for job CRUD API in backend/tests/test_translate_jobs.py: test create with valid config, create with missing translation column (expect 422), create with virtual key column (expect warning), update job, delete job, duplicate job. Mock SupersetClient for column metadata.
  • T016 [US1] Verify US1 acceptance scenarios against specs/028-llm-datasource-supeset/spec.md User Story 1 (5 scenarios). Run cd backend && pytest backend/tests/test_translate_jobs.py -v.

Checkpoint: Job CRUD fully functional — user can create, edit, list, and duplicate translation jobs with validated column mappings.


Phase 4: User Story 5 — Terminology Dictionary Management (Priority: P2)

Goal: User can create, edit, delete dictionaries; add terms inline; import CSV/TSV with duplicate detection; attach dictionaries to jobs with priority ordering.

Independent Test: Create dictionary with 5 terms → import CSV with 50 terms → verify duplicates flagged → attach dictionary to job → verify dictionary appears in job config.

Backend — Dictionary CRUD + Import

  • T017 [P] [US5] Implement DictionaryManager class in backend/src/plugins/translate/dictionary.py: create_dictionary(), update_dictionary(), delete_dictionary(), get_dictionary(), list_dictionaries(), add_entry(), edit_entry(), delete_entry(), clear_entries(). Enforce unique source_term per dictionary with conflict resolution (FR-026). Prevent deletion if attached to active/scheduled jobs (FR-030). @COMPLEXITY 4 — instrument with belief_scope/reason/reflect markers at mutation boundaries. (RATIONALE: C4 warranted because dictionary CRUD is stateful and must enforce referential integrity on deletion; REJECTED: pure C3 CRUD without state guards would allow orphaned job-dictionary links)
  • T018 [US5] Implement CSV/TSV import in DictionaryManager: parse uploaded content, detect delimiter, create DictionaryEntry rows, preview with duplicate detection, return parse errors with line numbers for malformed rows (FR-025). Add DictionaryImport schema validation.
  • T019 [US5] Implement /api/translate/dictionaries endpoints in backend/src/api/routes/translate.py: POST / (create), GET / (list), GET /{dict_id} (get with entries), PUT /{dict_id} (update), DELETE /{dict_id} (delete — blocked if attached), POST /{dict_id}/entries (add entry), PUT /{dict_id}/entries/{entry_id} (edit), DELETE /{dict_id}/entries/{entry_id} (delete), POST /{dict_id}/import (CSV/TSV import with preview).
  • T020 [US5] Implement per-batch dictionary filtering logic in DictionaryManager.filter_for_batch(source_texts: list[str]) -> list[dict]: scan batch texts for substrings matching dictionary source_term values; return matched entries in priority order across all attached dictionaries (FR-044). This is consumed by US2 (preview) and US3 (executor).
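The per-batch filter in T020 reduces to a substring scan in priority order; a minimal sketch with illustrative data shapes (the real method operates on ORM entities):

```python
def filter_for_batch(source_texts, dictionaries):
    """dictionaries: list of (priority, entries) pairs, where entries maps
    source_term -> target_translation. Scan the batch texts for substring
    matches and return matched entries in priority order across all
    attached dictionaries (FR-044)."""
    haystack = "\n".join(source_texts).casefold()
    matched = []
    for priority, entries in sorted(dictionaries, key=lambda pair: pair[0]):
        for term, translation in entries.items():
            if term.casefold() in haystack:
                matched.append({"source_term": term,
                                "target_translation": translation,
                                "priority": priority})
    return matched
```

Filtering per batch keeps the glossary section of the prompt small: only terms that actually occur in the batch are sent to the LLM.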

Frontend — Dictionary UI

  • T021 [P] [US5] Add dictionary API methods to frontend/src/lib/api/translate.js: fetchDictionaries(), createDictionary(), updateDictionary(), deleteDictionary(), fetchDictionaryEntries(), addEntry(), editEntry(), deleteEntry(), importDictionary().
  • T022 [US5] Create DictionaryList SvelteKit page in frontend/src/routes/translate/dictionaries/+page.svelte: list dictionaries with name, language, term count, attached job count, create/delete actions. @UX_STATE: idle, loading, empty, populated, delete_blocked.
  • T023 [US5] Create DictionaryEditor SvelteKit page in frontend/src/routes/translate/dictionaries/[id]/+page.svelte: inline term editor (source_term → target_translation), add/delete rows, CSV/TSV import with conflict preview, export. @UX_STATE: idle, loading, editing, importing, import_preview, import_conflict, saving. @UX_FEEDBACK: import preview with duplicate flags; toast on save.

Verification — US5

  • T024 [US5] Write pytest tests for DictionaryManager in backend/src/plugins/translate/__tests__/test_dictionary.py: test create/update/delete, add entry with duplicate detection (expect conflict), import CSV with valid/invalid rows, delete dictionary blocked by active job, per-batch filtering returns matched terms.
  • T025 [US5] Verify US5 acceptance scenarios against spec User Story 5 (6 scenarios). Run cd backend && pytest backend/src/plugins/translate/__tests__/test_dictionary.py -v.

Checkpoint: Dictionary management fully functional — CRUD, import, filtering, and job attachment all work.


Phase 5: User Story 2 — Preview Translated Output (Priority: P2)

Goal: User triggers preview on a saved job → system fetches sample rows → sends to LLM with context + dictionary → displays source/context/translation side-by-side → user approves/edits/rejects → preview state saved for execution gate.

Independent Test: Create job + dictionary → click Preview → verify 10 rows shown with LLM translations → approve 8, edit 1, reject 1 → verify state preserved.

Backend — Preview Engine

  • T026 [US2] Implement TranslationPreview class in backend/src/plugins/translate/preview.py: preview_rows(job_id, sample_size). Fetch source rows from Superset via SupersetClient; construct LLM prompt using LLMProviderService + llm_prompt_templates.render_prompt() + DictionaryManager.filter_for_batch(); call LLM; return PreviewRow list. @COMPLEXITY 4 — instrument with belief_scope/reason/reflect at LLM call boundaries. (RATIONALE: C4 because preview is stateful (approve/edit/reject lifecycle) and calls external LLM API with side effects; REJECTED: making preview purely read-only without approval state would degrade UX by losing user decisions between preview and execution)
  • T027 [US2] Implement token count and cost estimation in preview response: compute estimated tokens from sample → extrapolate to full dataset row count → apply provider pricing → return estimated_total_rows, estimated_tokens, estimated_cost in TranslationPreviewResponse (FR-014).
  • T028 [US2] Implement preview quality gate: create persistent TranslationPreviewSession and TranslationPreviewRecord rows with config_hash and dict_snapshot_hash. Preview approval gates full execution: rejected preview sample rows are excluded from the full run, while rows never shown in the preview are processed normally (the preview is a quality gate, not an allow-list).
  • T029 [US2] Implement /api/translate/jobs/{job_id}/preview endpoint: POST triggers preview, returns preview rows with status=pending. Add PUT /api/translate/jobs/{job_id}/preview/rows/{row_key} for approve/edit/reject actions. Add POST /api/translate/jobs/{job_id}/preview/approve-all for bulk approve.
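The cost extrapolation in T027 is simple arithmetic; a sketch assuming a flat per-1k-token price (real providers typically price prompt and completion tokens separately, which the actual implementation must account for):

```python
def estimate_cost(sample_tokens: int, sample_rows: int,
                  total_rows: int, price_per_1k_tokens: float) -> dict:
    """Extrapolate sample token usage to the full dataset and apply
    provider pricing (FR-014)."""
    if sample_rows <= 0:
        raise ValueError("preview sample is empty")
    tokens_per_row = sample_tokens / sample_rows
    estimated_tokens = round(tokens_per_row * total_rows)
    estimated_cost = estimated_tokens / 1000 * price_per_1k_tokens
    return {"estimated_total_rows": total_rows,
            "estimated_tokens": estimated_tokens,
            "estimated_cost": round(estimated_cost, 4)}
```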

Frontend — Preview UI

  • T030 [P] [US2] Add preview API methods to frontend/src/lib/api/translate.js: fetchPreview(), approveRow(), editRow(), rejectRow(), approveAll().
  • T031 [US2] Create TranslationPreview component in frontend/src/lib/components/translate/TranslationPreview.svelte: side-by-side table (source, context, LLM translation), approve/edit/reject buttons per row, bulk approve, cost estimate card before full run, row limit input. @UX_STATE: idle, loading, preview_loaded, preview_error, retrying. @UX_FEEDBACK: spinner during LLM call; visual distinction for LLM-generated vs user-edited values; cost estimate reactivity. @UX_RECOVERY: retry preview button; individual row re-translate.
  • T032 [US2] Integrate TranslationPreview into TranslationJobConfig page (frontend/src/routes/translate/[id]/+page.svelte) as a tab or collapsible section that appears after job is saved.

Verification — US2

  • T033 [US2] Write pytest tests for preview in backend/src/plugins/translate/__tests__/test_preview.py: test preview with valid job, preview with dictionary (verify glossary terms in prompt), preview row approve/edit/reject state transitions, cost estimation accuracy. Mock LLM provider responses.
  • T034 [US2] Write vitest component test for TranslationPreview in frontend/src/lib/components/translate/__tests__/TranslationPreview.test.js: test rendering of preview rows, approve/reject/edit interactions, bulk approve behavior. Mock API client.
  • T035 [US2] Verify US2 acceptance scenarios against spec User Story 2 (5 scenarios). Run cd backend && pytest backend/src/plugins/translate/__tests__/test_preview.py -v && cd frontend && npm run test -- --run.

Checkpoint: Preview flows complete — LLM translation with context + dictionary, approve/edit/reject lifecycle, cost estimation.


Phase 6: User Story 3 — Execute Translation & Insert Results (Priority: P3)

Goal: User triggers full batch execution → system processes rows in batches → generates INSERT SQL → user copies to SQL Lab or auto-executes → failed batches retryable.

Independent Test: Create job → preview + approve → execute → verify INSERT SQL generated with correct key columns → execute in SQL Lab → verify rows in target table.

Backend — Executor + SQL Generator + Orchestrator

  • T036 [US3] Implement SQLGenerator class in backend/src/plugins/translate/sql_generator.py: generate_insert(records: list[TranslationRecord], job: TranslationJob) -> str. Detect dialect from job.database_dialect (cached from Superset connection at save time). Produce safe dialect-appropriate SQL: for PostgreSQL/Greenplum — INSERT INTO "target_table" ("key_cols"..., "target_col") VALUES (...) with quoted identifiers; support upsert_strategy: insert (plain INSERT), skip_existing (ON CONFLICT DO NOTHING), overwrite (ON CONFLICT DO UPDATE). For ClickHouse — INSERT INTO target_table (key_cols..., target_col) VALUES (...); skip_existing warns user (not natively supported); overwrite documented limitation. @COMPLEXITY 3. (RATIONALE: dialect-aware because Superset connections may use ClickHouse or PostgreSQL; REJECTED: PostgreSQL-only would break ClickHouse users; raw identifier interpolation rejected)
  • T037 [US3] Implement TranslationExecutor class in backend/src/plugins/translate/executor.py: execute_run(run: TranslationRun, job: TranslationJob). Fetch all source rows from Superset; split into batches; for each batch: call DictionaryManager.filter_for_batch(), construct prompt via LLMProviderService, call LLM, create TranslationRecord rows with status translated/failed/skipped; handle batch-level retry on LLM failure (FR-015); skip NULL translation values (FR-016); reject NULL key values (FR-017); update run statistics. @COMPLEXITY 4 — instrument with belief_scope/reason/reflect at batch boundaries and error paths.
  • T038 [US3] Implement TranslationOrchestrator class in backend/src/plugins/translate/orchestrator.py: start_run(job_id, trigger_type). Validate preconditions (job config valid, datasource accessible, LLM provider reachable); create TranslationRun with status running and config/dict snapshots (FR-019, FR-029); dispatch to TranslationExecutor; on completion call SQLGenerator; record TranslationEvent rows via TranslationEventLog (FR-046); enforce state transitions: pending → running → (completed | partial | failed) — no skipping. @COMPLEXITY 5 — full @PRE/@POST/@DATA_CONTRACT/@INVARIANT enforcement with @RATIONALE/@REJECTED. (RATIONALE: central coordinator is C5 because preview, execution, event logging, and retry share run state and must coordinate within a single transaction boundary; REJECTED: distributed actor model would introduce eventual-consistency challenges for status tracking at current scale)
  • T039 [US3] Implement TranslationEventLog class in backend/src/plugins/translate/events.py: log_event(run_id, job_id, event_type, payload). Create immutable TranslationEvent row. query_events(job_id, filters) for audit/dashboard. prune_expired() for 90-day retention enforcement (FR-049) — scheduled via APScheduler cleanup job. @COMPLEXITY 5. @INVARIANT: every run must have exactly one run_started and one terminal event. (RATIONALE: C5 warranted because event log is single source of truth for observability, metrics, and audit; REJECTED: stdout-only logging lacks structured payload integrity and cannot enforce terminal-event invariant)
  • T040 [US3] Implement execution endpoints in backend/src/api/routes/translate.py: POST /api/translate/jobs/{job_id}/runs (trigger manual run — creates run, dispatches orchestrator which translates AND submits to Superset API), GET /api/translate/runs/{run_id} (status + statistics + insert_status + superset_query_id), GET /api/translate/runs/{run_id}/records (paginated TranslationRecord list), POST /api/translate/runs/{run_id}/retry (retry failed batches only — FR-015), POST /api/translate/runs/{run_id}/retry-insert (retry Superset insert only without re-translating). Inject Depends(require_permission("translate.job.execute")).
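The dialect- and strategy-aware generation in T036 can be sketched as string assembly with per-dialect identifier quoting. This is a sketch only: the real generator should prefer driver-side parameter binding where possible and must surface the ClickHouse skip_existing warning to the user.

```python
def _ident(name: str, dialect: str) -> str:
    """Quote an identifier per dialect: double quotes for PostgreSQL/Greenplum,
    backticks for ClickHouse."""
    if dialect == "clickhouse":
        return "`" + name.replace("`", "``") + "`"
    return '"' + name.replace('"', '""') + '"'

def _literal(value) -> str:
    """Render a value as a SQL literal; strings are single-quote escaped."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def generate_insert(table, key_cols, target_col, rows,
                    dialect="postgresql", upsert_strategy="insert"):
    cols = list(key_cols) + [target_col]
    col_list = ", ".join(_ident(c, dialect) for c in cols)
    values = ",\n".join(
        "(" + ", ".join(_literal(row[c]) for c in cols) + ")" for row in rows)
    sql = f"INSERT INTO {_ident(table, dialect)} ({col_list}) VALUES\n{values}"
    if dialect != "clickhouse":  # ON CONFLICT is PostgreSQL/Greenplum only
        conflict_keys = ", ".join(_ident(c, dialect) for c in key_cols)
        if upsert_strategy == "skip_existing":
            sql += f"\nON CONFLICT ({conflict_keys}) DO NOTHING"
        elif upsert_strategy == "overwrite":
            tgt = _ident(target_col, dialect)
            sql += (f"\nON CONFLICT ({conflict_keys}) "
                    f"DO UPDATE SET {tgt} = EXCLUDED.{tgt}")
    return sql + ";"
```

Quoting every identifier and escaping every literal is what makes the "raw identifier interpolation rejected" decision in T036 concrete.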

Frontend — Execution UI

  • T041 [P] [US3] Add execution API methods to frontend/src/lib/api/translate.js: triggerRun(), fetchRunStatus(), fetchRunRecords(), retryFailedBatches().
  • T042 [US3] Create TranslationRunProgress component in frontend/src/lib/components/translate/TranslationRunProgress.svelte: live progress bar (WebSocket-driven from TaskWebSocket), batch counter (N/M), success/failure/skip counts, cancel button. @UX_STATE: idle, running, pausing, cancelled, completed, partial, failed. @UX_FEEDBACK: progress percentage $derived from translated/total; real-time counts. @UX_RECOVERY: retry failed batches button; cancel run; download skipped rows.
  • T043 [US3] Create TranslationRunResult component in frontend/src/lib/components/translate/TranslationRunResult.svelte: completion summary (rows translated/failed/skipped, token count, cost, insert_status), Superset execution reference with status badge, generated SQL block for audit/debugging (collapsed by default), retry-insert button. @UX_STATE: completed, partial, failed, insert_failed. @UX_FEEDBACK: Superset execution status badge; SQL block for audit.
  • T044 [US3] Integrate TranslationRunProgress and TranslationRunResult into TranslationJobConfig page as the "Run" tab/section.

Verification — US3

  • T045 [US3] Write pytest tests for SQLGenerator in backend/src/plugins/translate/__tests__/test_sql_generator.py: test INSERT with single key, composite key — for PostgreSQL dialect AND ClickHouse dialect. Test PostgreSQL UPSERT (ON CONFLICT DO NOTHING, ON CONFLICT DO UPDATE). Test ClickHouse plain INSERT and skip_existing warning. Test NULL key rejection, NULL translation value skipping, identifier quoting per dialect, injection safety. Validate SQL syntax correctness against each dialect.
  • T046 [US3] Write pytest tests for executor + orchestrator in backend/src/plugins/translate/__tests__/test_orchestrator.py: test full run lifecycle (pending→running→completed), partial failure (one batch fails, rest succeed), batch retry, event log invariants, NULL handling. Mock LLM provider and SupersetClient.
  • T047 [US3] Verify US3 acceptance scenarios against spec User Story 3 (5 scenarios). Run cd backend && pytest backend/src/plugins/translate/__tests__/test_orchestrator.py backend/src/plugins/translate/__tests__/test_sql_generator.py -v.

Checkpoint: Execution pipeline complete — batch processing, INSERT generation, retry, event logging. User can translate data and insert into target table.


Phase 7: User Story 6 — Feedback Loop (Correct → Dictionary) (Priority: P3)

Goal: In run results, user selects incorrect translation → submits correction to dictionary → dictionary updated with origin tracking → next run uses corrected term.

Independent Test: Complete a run → find incorrect translation → open correction popup → submit to dictionary → re-run preview → verify corrected term used.

Backend — Correction Submission

  • T048 [US6] Implement correction submission endpoint in backend/src/api/routes/translate.py: POST /api/translate/corrections accepting TermCorrectionSubmit body. Validate target language match between dictionary and job (FR language validation edge case); detect existing entry conflict → return conflict response (FR-032); create DictionaryEntry with origin tracking (origin_run_id, origin_row_key, origin_user_id) per FR-033. Inject Depends(require_permission("translate.dictionary.edit")).
  • T049 [US6] Implement bulk correction endpoint: POST /api/translate/corrections/bulk accepting array of TermCorrectionSubmit objects (FR-034). Process atomically — if any conflict is detected, return all conflicts for user resolution before partial apply.

Frontend — Correction UI

  • T050 [P] [US6] Add correction API methods to frontend/src/lib/api/translate.js: submitCorrection(), submitBulkCorrections().
  • T051 [US6] Create TermCorrectionPopup component in frontend/src/lib/components/translate/TermCorrectionPopup.svelte: text selection on source term and incorrect target translation → popup with source term (pre-filled from source column), incorrect target translation (pre-filled from selection), corrected target translation input, dictionary selector dropdown (filtered by target language), submit button, conflict dialog (overwrite/keep existing/cancel). @UX_STATE: closed, selecting, editing, submitting, conflict_detected, submitted. @UX_FEEDBACK: "Added to Dictionary" badge on corrected row.
  • T052 [US6] Create BulkCorrectionSidebar component in frontend/src/lib/components/translate/BulkCorrectionSidebar.svelte: sidebar collecting selected terms across rows, per-term correction inputs, submit all to dictionary. @UX_STATE: closed, collecting, reviewing, submitting, submitted. @UX_REACTIVITY: selected terms list $state.
  • T053 [US6] Integrate feedback-loop components into TranslationRunResult (T043) — add selection highlight behavior and correction triggers.

Verification — US6

  • T054 [US6] Write pytest tests for correction endpoints in backend/tests/test_translate_corrections.py: test single correction, bulk correction, conflict detection (existing term), cross-language rejection, origin tracking fields populated. Verify corrected term appears in next preview's dictionary filter.
  • T055 [US6] Verify US6 acceptance scenarios against spec User Story 6 (5 scenarios). Run cd backend && pytest backend/tests/test_translate_corrections.py -v.

Checkpoint: Feedback loop complete — corrections flow from results → dictionary → next run.


Phase 8: User Story 7 — Schedule Translation Jobs (Priority: P3)

Goal: User configures schedule → system triggers runs → new-key-only translation → optional auto-INSERT → failure notification → pause/resume.

Independent Test: Configure schedule (every 5 min for test) → wait for trigger → verify new TranslationRun created → verify only new keys translated → disable schedule → verify no more triggers.

Backend — Schedule Management + Trigger Dispatch

  • T056 [US7] Implement TranslationScheduler class in backend/src/plugins/translate/scheduler.py: create_schedule(), update_schedule(), delete_schedule(), enable_schedule(), disable_schedule(), get_next_executions(schedule, n=3) (FR-036). Register schedule with existing SchedulerService via add_job() with cron/interval/date trigger. @COMPLEXITY 4 — instrument with belief_scope/reason/reflect. (RATIONALE: C4 because schedule management is stateful with APScheduler integration, concurrency policy enforcement, and trigger dispatch side effects)
  • T057 [US7] Implement schedule trigger handler: _execute_scheduled_translation(job_id). Enforce concurrency policy: check if the previous run for the same job is still running → skip (log + event) or queue (start after the previous run completes) per FR-039. If proceeding: create new TranslationRun with trigger_type=scheduled; fetch source rows; apply new-key-only filter (FR-045) — compare current key values against previous successful run's key values; dispatch to TranslationOrchestrator. On failure, send notification via NotificationService (FR-041, FR-048). Schedule remains enabled for next trigger (US7 acceptance scenario 6).
  • T058 [US7] Implement Superset SQL Lab API submission for all runs: create SupersetSqlLabExecutor class in backend/src/plugins/translate/superset_executor.py. Submit generated SQL to /api/v1/sqllab/execute/, poll execution status, update TranslationRun.insert_status, superset_query_id, rows_affected, error fields. For scheduled runs, this happens automatically; for manual runs, this happens on user trigger. Record insert_submitted/insert_succeeded/insert_failed events.
  • T059 [US7] Implement schedule endpoints in backend/src/api/routes/translate.py: PUT /api/translate/jobs/{job_id}/schedule (create/update), DELETE /api/translate/jobs/{job_id}/schedule (remove), POST /api/translate/jobs/{job_id}/schedule/enable (FR-040), POST /api/translate/jobs/{job_id}/schedule/disable (FR-040). Inject Depends(require_permission("translate.schedule.manage")). Add schedule warning when editing job with active schedule (FR-042).
  • T060 [US7] Extend SchedulerService.load_schedules() in backend/src/core/scheduler.py to discover and register active TranslationSchedule rows alongside existing backup schedules (R4).
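The new-key-only filter in T057 is a set difference over composite keys; a sketch with illustrative row and key-set shapes:

```python
def filter_new_key_rows(current_rows, previous_run_keys, key_cols):
    """Incremental selection per FR-045: keep only rows whose composite key
    did not appear in the last successful run. previous_run_keys is a set of
    key tuples built from that run's TranslationRecord rows."""
    def composite_key(row):
        return tuple(row[c] for c in key_cols)
    return [row for row in current_rows
            if composite_key(row) not in previous_run_keys]
```

Building the previous-run key set once and probing it per row keeps the filter O(n) in the current row count.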

Frontend — Schedule UI

  • T061 [P] [US7] Add schedule API methods to frontend/src/lib/api/translate.js: updateSchedule(), deleteSchedule(), enableSchedule(), disableSchedule().
  • T062 [US7] Create ScheduleConfig component in frontend/src/lib/components/translate/ScheduleConfig.svelte: type selector (cron/interval/once), cron expression input with validation, interval input, timezone selector, run-at datetime picker, next-3-executions preview (with timezone), concurrency policy selector (skip/queue), enable/disable toggle with status indicator. Warns if no prior successful manual run exists. @UX_STATE: idle, editing, validating, enabled, disabled, no_prior_run_warning. @UX_REACTIVITY: next execution times $derived from schedule config with timezone display.
  • T063 [US7] Integrate ScheduleConfig into TranslationJobConfig page as the "Schedule" tab.

Verification — US7

  • T064 [US7] Write pytest tests for scheduler in backend/src/plugins/translate/__tests__/test_scheduler.py: test schedule CRUD, cron expression validation, next-N-executions calculation, trigger dispatch with skip/queue concurrency, new-key-only filter (verify only unseen keys processed), auto-INSERT execution, failure notification, pause/resume, load on SchedulerService start.
  • T065 [US7] Verify US7 acceptance scenarios against spec User Story 7 (8 scenarios). Run cd backend && pytest backend/src/plugins/translate/__tests__/test_scheduler.py -v.

Checkpoint: Scheduling complete — jobs can run automatically on schedule with new-key-only incremental translation and failure recovery.


Phase 9: User Story 4 — Translation History & Audit Trail (Priority: P4)

Goal: User views past runs with filterable list; inspects run details (config snapshot, prompt, translations, INSERT SQL); sees edit marks; duplicates job. Admin views metrics dashboard.

Independent Test: Run several translations → open history → filter by datasource → click run → verify config snapshot, prompt, translations with edit marks, INSERT SQL all shown.

Backend — History + Metrics Endpoints

  • T066 [US4] Implement history endpoints in backend/src/api/routes/translate.py: GET /api/translate/runs (list with filters: job_id, datasource_id, target_table, status, date_from, date_to, pagination per FR-020), GET /api/translate/runs/{run_id} (detail with config_snapshot, prompt_used, records with llm_translation and user_edit fields visible — FR showing original vs user-edited).
  • T067 [US4] Implement TranslationMetrics class in backend/src/plugins/translate/metrics.py: get_job_metrics(job_id) -> MetricsResponse. Aggregate from TranslationEvent table: total runs, success/failure counts, cumulative tokens, cumulative cost, average batch latency (FR-047). @COMPLEXITY 3.
  • T068 [US4] Implement metrics endpoint: GET /api/translate/jobs/{job_id}/metrics. Inject Depends(require_permission("translate.history.view")).
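The aggregation behind get_job_metrics() (T067) folds event rows into the MetricsResponse fields; the event dict shape and event-type names here are illustrative:

```python
def aggregate_job_metrics(events):
    """Fold TranslationEvent rows into the metrics named in FR-047:
    run counts, success/failure, cumulative tokens/cost, average batch latency."""
    runs = [e for e in events if e["event_type"] == "run_started"]
    completed = [e for e in events if e["event_type"] == "run_completed"]
    failed = [e for e in events if e["event_type"] == "run_failed"]
    batches = [e for e in events if e["event_type"] == "batch_completed"]
    latencies = [e["payload"]["latency_ms"] for e in batches]
    return {
        "total_runs": len(runs),
        "succeeded": len(completed),
        "failed": len(failed),
        "total_tokens": sum(e["payload"].get("tokens", 0) for e in events),
        "total_cost": sum(e["payload"].get("cost", 0.0) for e in events),
        "avg_batch_latency_ms": (sum(latencies) / len(latencies)) if latencies else 0.0,
    }
```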

Frontend — History + Metrics UI

  • T069 [P] [US4] Add history API methods to frontend/src/lib/api/translate.js: fetchRunHistory(), fetchRunDetail(), fetchJobMetrics().
  • T070 [US4] Create TranslationHistory SvelteKit page in frontend/src/routes/translate/history/+page.svelte: filterable table (datasource, target table, row count, status, date, user), click-to-expand detail with config snapshot, prompt, translation rows with edit marks, INSERT SQL. @UX_STATE: idle, loading, empty, populated, detail_open. @UX_REACTIVITY: filtered list $derived from filters.
  • T071 [US4] Create admin metrics dashboard section (integrated into existing admin pages or standalone) displaying per-job metrics: run counts, success/failure ratio, cumulative tokens, cumulative cost, average latency. Use MetricsResponse schema.

Verification — US4

  • T072 [US4] Write pytest tests for history + metrics in backend/tests/test_translate_history.py: test run list with filters, run detail with snapshots, metrics aggregation accuracy, TranslationEvent queryability.
  • T073 [US4] Verify US4 acceptance scenarios against spec User Story 4 (4 scenarios). Run cd backend && pytest backend/tests/test_translate_history.py -v.

Checkpoint: History and audit complete — all runs traceable, metrics dashboard populated.


Phase 10: Polish & Cross-Cutting Concerns

Purpose: Retention enforcement, notification wiring, semantic audit, quickstart validation, and rejected-path regression protection.

  • T074 [P] Implement 90-day retention pruning in TranslationEventLog.prune_expired(): run as APScheduler daily cleanup job. BEFORE pruning events/records: persist cumulative metrics as MetricSnapshot row (tokens, cost, run counts). Then prune TranslationRecord, TranslationPreviewRecord, TranslationEvent, and insert_sql/config_snapshot fields older than 90 days. Preserve TranslationRun metadata, MetricSnapshot rows, and superset_query_id. Verify metrics remain accurate post-prune (SC-014). (RATIONALE: metric snapshots prevent cumulative data loss from event pruning; REJECTED: indefinite retention would violate storage constraints)
  • T075 [P] Wire scheduled-run failure notification: ensure TranslationScheduler trigger handler calls NotificationService.send() when a scheduled run fails (FR-041, FR-048). Test with mock notification provider.
  • T076 [P] Instrument remaining C4/C5 Python flows with belief_scope/reason/reflect/explore markers where missing: TranslationOrchestrator.start_run() (entry/exit), TranslationExecutor.execute_run() (batch boundaries + error paths), DictionaryManager mutation boundaries, TranslationScheduler trigger dispatch. Verify via axiom_semantic_validation belief-runtime audit.
  • T077 Run full semantic audit via axiom MCP tools:
    • axiom_semantic_validation audit_contracts --file_path backend/src/plugins/translate/ — verify all [DEF] anchors are closed, @RELATION targets resolve, no orphan contracts, C4+ contracts have required tag density
    • axiom_semantic_validation audit_belief_protocol --file_path backend/src/plugins/translate/ — verify @RATIONALE/@REJECTED present on all C5 contracts
    • axiom_semantic_validation audit_belief_runtime --file_path backend/src/plugins/translate/ — verify belief_scope/reason/reflect/explore markers exist in all C4+ module bodies
    • axiom_semantic_validation impact_analysis --contract_id TranslationOrchestrator:Class — verify no rejected path is accidentally re-enabled
  • T078 Run quickstart validation: follow specs/028-llm-datasource-supeset/quickstart.md end-to-end — create dictionary → create job → preview → execute → verify INSERT SQL → submit correction → schedule → view history → verify metrics. Run cd backend && pytest -v, cd frontend && npm run test -- --run, cd backend && ruff check src/plugins/translate/ src/api/routes/translate.py src/models/translate.py src/schemas/translate.py.
  • T079 Rejected-path regression guard — add test cases protecting explicitly rejected paths:
    • backend/src/plugins/translate/__tests__/test_orchestrator.py: snapshot isolation — changing job config mid-run does NOT invalidate the running TranslationRun.
    • backend/src/plugins/translate/__tests__/test_sql_generator.py: UPDATE statements are never generated (only INSERT/UPSERT per PostgreSQL dialect).
    • backend/src/plugins/translate/__tests__/test_dictionary.py: duplicate source_term entries cannot coexist (UniqueConstraint enforced) and conflict resolution only offers overwrite/keep-existing.
    • backend/src/plugins/translate/__tests__/test_retention.py: metric snapshots are persisted before event pruning and cumulative metrics remain accurate post-prune.
  • T080 [P] Implement cancel run endpoint: POST /api/translate/runs/{run_id}/cancel in backend/src/api/routes/translate.py. Set translation_status=cancelled, mark in-progress batches as failed, do NOT submit INSERT SQL. Emit run_cancelled event. Inject Depends(require_permission("translate.job.execute")).
  • T081 [P] Implement download skipped rows endpoint: GET /api/translate/runs/{run_id}/skipped.csv returning CSV of rows skipped due to NULL keys or translation failures. Use key_hash for efficient lookup.
  • T082 [P] Compute key_hash for TranslationRecord and TranslationPreviewRecord: hash(canonical_json(key_values)) at creation time. Add config_hash for TranslationRun and TranslationPreviewSession: hash of effective config (columns, keys, target, prompt, dictionaries). Use for idempotency checks, new-key-only filtering, and stale preview detection.
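The key_hash/config_hash computation in T082 reduces to a digest over a canonical JSON serialization, so that dict insertion order never changes the hash. A minimal sketch (the helper name and the default=str fallback for non-JSON types are assumptions):

```python
import hashlib
import json


def canonical_hash(payload: dict) -> str:
    """Stable SHA-256 over a JSON-serializable dict.

    sort_keys plus fixed separators make the digest independent of dict
    insertion order, so identical key_values always yield the same key_hash.
    """
    canonical = json.dumps(
        payload,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        default=str,  # fallback for dates/UUIDs in config snapshots
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Usage would look like `record.key_hash = canonical_hash(record.key_values)` and, for stale-preview detection, comparing `canonical_hash(effective_config)` against the hash stored on the TranslationPreviewSession.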

Dependencies & Execution Order

Phase Dependencies

  • Setup (Phase 1): No dependencies — can start immediately
  • Foundational (Phase 2): Depends on Setup — BLOCKS all user stories
  • US1 (Phase 3): Depends on Foundational — no dependencies on other stories. Recommended start after Foundational.
  • US5 (Phase 4): Depends on Foundational — can run in parallel with US1. Dictionary filtering (T020) will be consumed later by US2/US3 but is self-contained.
  • US2 (Phase 5): Depends on US1 (needs saved job) + US5 (needs dictionary filtering). Can start integration once US1 backend is stable.
  • US3 (Phase 6): Depends on US1 (needs job config) + US2 (preview decisions feed executor). Sequential after US2.
  • US6 (Phase 7): Depends on US3 (needs run results) + US5 (needs dictionary). Can run in parallel with US7 after US3.
  • US7 (Phase 8): Depends on US1 (needs job) + US3 (needs execution pipeline). Can run in parallel with US6 after US3.
  • US4 (Phase 9): Depends on US3 (needs run records). Can run in parallel with US6/US7 after US3.
  • Polish (Phase 10): Depends on all desired user stories being complete.

Parallel Opportunities

| Phase | Parallel Tasks | Notes |
|-------|----------------|-------|
| 1 | — | Sequential (only 2 tasks) |
| 2 | T003 ∥ T004 | Models + Schemas in parallel |
| 3 (US1) | T009 ∥ T012 | Backend CRUD ∥ API client |
| 4 (US5) | T017 ∥ T021 | DictionaryManager ∥ API client |
| 5 (US2) | T030 ∥ T031 | API client ∥ Preview component |
| 6 (US3) | T036 ∥ T041 | SQLGenerator ∥ API client |
| 7 (US6) | T050 ∥ T051 ∥ T052 | API client ∥ Popup ∥ Sidebar |
| 8 (US7) | T061 ∥ T062 | API client ∥ ScheduleConfig |
| 9 (US4) | T069 ∥ T070 | API client ∥ History page |
| 10 | T074 ∥ T075 ∥ T076 | Retention, notifications, belief instrumentation |

Cross-Story Parallelism

After Foundational (Phase 2):

  • US1 and US5 can proceed in parallel by different developers
  • After US3 completes: US6, US7, and US4 can proceed in parallel

Implementation Strategy

MVP First (US1 Only)

  1. Phase 1 + Phase 2 → Foundation
  2. Phase 3 (US1) → Job configuration CRUD
  3. STOP and VALIDATE: User can create, list, edit, delete translation jobs
  4. Deploy/demo — partial value (configuration ready, no translation yet)

Minimum Viable Feature (US1 + US5 + US2 + US3)

  1. Foundation → US1 + US5 (parallel) → US2 → US3
  2. STOP and VALIDATE: End-to-end translation flow works: configure → preview → execute → INSERT
  3. This is the core feature — all remaining stories add automation (US7), quality improvement (US6), and visibility (US4)

Full Feature (All Stories)

  1. MVP → US6 + US7 + US4 (parallel after US3) → Polish
  2. Scheduled automation, feedback loop, and audit trail all functional

Notes

  • All file paths reference the actual repository structure (backend/src/, frontend/src/).
  • @COMPLEXITY 4/5 backend contracts require belief_scope/reason/reflect markers — verified in T076.
  • @RATIONALE/@REJECTED tags appear only in C5 contracts (TranslationOrchestrator, TranslationEventLog) per INV_7.
  • Rejected paths are explicitly protected by regression tests in T079.
  • [NEED_CONTEXT] markers: none — all contract targets resolve to existing or planned modules within this feature.
  • The existing LLMProviderService, SupersetClient, SchedulerService, NotificationService, and TaskWebSocket contracts are reused without modification.
  • Quickstart.md (T078) serves as the human-verifiable acceptance test for the full feature.