Tasks: LLM Table Translation Service
Feature Branch: 028-llm-datasource-supeset
Input: Design documents from /specs/028-llm-datasource-supeset/
Prerequisites: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/modules.md
Tests: Test tasks are included for all C4/C5 backend contracts, new API endpoints, and Svelte components with @UX_STATE contracts. Test work traces to contract @PRE/@POST guarantees and spec acceptance scenarios.
Organization: Tasks are grouped by user story to enable independent implementation and testing of each story.
Format: [ID] [P?] [Story] Description
- [P]: Can run in parallel (different files, no dependencies)
- [Story]: Which user story this task belongs to (e.g., US1, US5)
- Include exact file paths in descriptions
Phase 1: Setup (Shared Infrastructure)
Purpose: Create plugin directory structure and register the new route module in the lazy-import registry.
- T001 Create the translation plugin directory structure: `backend/src/plugins/translate/__init__.py`, `backend/src/plugins/translate/plugin.py` (empty skeleton), plus `backend/src/plugins/translate/__tests__/__init__.py`
- T002 Register the `translate` route module in `backend/src/api/routes/__init__.py` — add `"translate"` to the `__all__` list inside `[DEF:Route_Group_Contracts:Block]`
Phase 2: Foundational (Blocking Prerequisites)
Purpose: ORM models, Pydantic schemas, plugin boilerplate, route skeleton, and database migration. ALL user stories depend on these artifacts.
⚠️ CRITICAL: No user story work can begin until this phase is complete.
ORM Models
- T003 [P] Create all SQLAlchemy ORM models in `backend/src/models/translate.py`: `TranslationJob`, `TranslationRun`, `TranslationBatch`, `TranslationRecord`, `TranslationEvent`, `TranslationPreviewSession`, `TranslationPreviewRecord`, `TerminologyDictionary`, `DictionaryEntry`, `TranslationSchedule`, `TranslationJobDictionary`, `MetricSnapshot`. Follow patterns from `backend/src/models/llm.py` (UUID PKs, `generate_uuid`, `Base` inheritance, JSON columns, `UniqueConstraint`, indexes, timezone-aware `DateTime` with callable defaults). Include a `source_term_normalized` column on `DictionaryEntry` with a unique constraint for case-insensitive matching.
- T004 [P] Create Pydantic v2 request/response schemas in `backend/src/schemas/translate.py`: `TranslateJobCreate`, `TranslateJobUpdate`, `TranslateJobResponse`, `DictionaryCreate`, `DictionaryImport`, `DictionaryResponse`, `TermCorrectionSubmit`, `ScheduleConfig`, `TranslationRunResponse`, `TranslationPreviewResponse` (with `PreviewRow`), `MetricsResponse`. Follow the existing `backend/src/schemas/` patterns (use `BaseModel` and `Field` with defaults/validation).
Plugin Skeleton
- T005 Create a `TranslatePlugin` class in `backend/src/plugins/translate/plugin.py` inheriting from `PluginBase`. Implement the `id`, `name`, and `description` properties. Wire `@RELATION INHERITS -> [PluginBase:Class]` in the contract header. (RATIONALE: a separate plugin avoids bloating `LLMAnalysisPlugin` beyond the fractal limit; REJECTED: extending `LLMAnalysisPlugin` would conflate domains)
Route Skeleton
- T006 Create `backend/src/api/routes/translate.py` with a FastAPI `APIRouter` (prefix `/api/translate`, `tags=["translate"]`). Define all endpoint stubs with `pass` bodies for now: CRUD jobs, CRUD dictionaries, preview trigger, run trigger, retry, schedule CRUD, run history, metrics, correction submission, dictionary import. Attach `Depends(require_permission(...))` annotations. Register the router in `backend/src/app.py` alongside the existing routers.
Database Migration
- T007 Generate an Alembic migration for all translate-domain tables: `translation_jobs`, `translation_runs`, `translation_records`, `translation_events`, `terminology_dictionaries`, `dictionary_entries`, `translation_schedules`, `translation_job_dictionaries`. Run `cd backend && alembic revision --autogenerate -m "add translation tables"` and `alembic upgrade head`.
RBAC Registration
- T008 Register 13 permission strings in the RBAC seed/permission store: `translate.job.view`, `translate.job.create`, `translate.job.edit`, `translate.job.delete`, `translate.job.execute`, `translate.dictionary.view`, `translate.dictionary.create`, `translate.dictionary.edit`, `translate.dictionary.delete`, `translate.schedule.view`, `translate.schedule.manage`, `translate.history.view`, `translate.metrics.view`. Ensure the admin role gets all of them; the analyst role gets `translate.job.view`, `translate.job.execute`, `translate.dictionary.view`, and `translate.history.view`. Update the role-seeding script if needed.
Checkpoint: Foundation ready — models, schemas, plugin, routes, migration, and RBAC all in place. User story implementation can now begin.
Phase 3: User Story 1 — Configure Translation Job (Priority: P1) 🎯 MVP
Goal: User can create, edit, delete, and list translation jobs with datasource selection, column mapping, key columns, target table configuration, LLM settings, and dictionary attachment.
Independent Test: Open Configuration form → select Superset datasource → pick translation/context/key columns → specify target table → save → verify job appears in list with correct settings.
Backend — Job CRUD
- T009 [P] [US1] Implement the job CRUD service in `backend/src/plugins/translate/plugin.py` as methods on the `TranslatePlugin` class: `create_job()`, `update_job()`, `delete_job()`, `get_job()`, `list_jobs()`, `duplicate_job()`. Validate column existence via `SupersetClient` on create/update (FR-001, FR-002, FR-006). Enforce composite key support (FR-004). Detect virtual columns and warn (US1 acceptance scenario 5).
- T010 [US1] Implement the `/api/translate/jobs` endpoints in `backend/src/api/routes/translate.py`: `POST /` (create), `GET /` (list), `GET /{job_id}` (get), `PUT /{job_id}` (update), `DELETE /{job_id}` (delete), `POST /{job_id}/duplicate` (duplicate — FR-021). Inject `Depends(require_permission("translate.job.*"))` per operation.
- T011 [US1] Implement a `/api/translate/datasources/{datasource_id}/columns` endpoint that queries Superset for column metadata (name, type, is_physical flag) and the database dialect (backend/engine) from the connection configuration. Return the column list AND a `database_dialect` field for the frontend. Cache the dialect on `TranslationJob.database_dialect` at save time. Reject unsupported dialects at configuration time (FR-002, dialect detection).
Frontend — Job Config UI
- T012 [P] [US1] Create a `TranslateApiClient` module in `frontend/src/lib/api/translate.js`: `fetchJobs()`, `createJob()`, `updateJob()`, `deleteJob()`, `duplicateJob()`, `fetchDatasourceColumns()`. Use the existing `requestApi`/`fetchApi` wrapper pattern.
- T013 [US1] Create a `TranslationJobList` SvelteKit page in `frontend/src/routes/translate/+page.svelte`: list all jobs with name, datasource, status/schedule indicators, a create button, and a duplicate action. @UX_STATE: idle, loading, empty, populated, error.
- T014 [US1] Create a `TranslationJobConfig` SvelteKit page in `frontend/src/routes/translate/[id]/+page.svelte`: datasource dropdown → column selectors (translation column, context columns, key columns with [+ Add key] for composite keys), target table/column inputs, LLM provider selector, target language, batch size, prompt template editor, dictionary attachment (multi-select with priority ordering). @UX_STATE: idle, loading, configured, saving, validation_error, datasource_unavailable. @UX_REACTIVITY: column list `$derived` from datasource selection.
Verification — US1
- T015 [US1] Write pytest integration tests for the job CRUD API in `backend/tests/test_translate_jobs.py`: test create with valid config, create with missing translation column (expect 422), create with a virtual key column (expect warning), update job, delete job, duplicate job. Mock `SupersetClient` for column metadata.
- T016 [US1] Verify US1 acceptance scenarios against `specs/028-llm-datasource-supeset/spec.md` User Story 1 (5 scenarios). Run `cd backend && pytest backend/tests/test_translate_jobs.py -v`.
Checkpoint: Job CRUD fully functional — user can create, edit, list, and duplicate translation jobs with validated column mappings.
Phase 4: User Story 5 — Terminology Dictionary Management (Priority: P2)
Goal: User can create, edit, delete dictionaries; add terms inline; import CSV/TSV with duplicate detection; attach dictionaries to jobs with priority ordering.
Independent Test: Create dictionary with 5 terms → import CSV with 50 terms → verify duplicates flagged → attach dictionary to job → verify dictionary appears in job config.
Backend — Dictionary CRUD + Import
- T017 [P] [US5] Implement a `DictionaryManager` class in `backend/src/plugins/translate/dictionary.py`: `create_dictionary()`, `update_dictionary()`, `delete_dictionary()`, `get_dictionary()`, `list_dictionaries()`, `add_entry()`, `edit_entry()`, `delete_entry()`, `clear_entries()`. Enforce a unique `source_term` per dictionary with conflict resolution (FR-026). Prevent deletion if attached to active/scheduled jobs (FR-030). @COMPLEXITY 4 — instrument with `belief_scope`/`reason`/`reflect` markers at mutation boundaries. (RATIONALE: C4 warranted because dictionary CRUD is stateful and must enforce referential integrity on deletion; REJECTED: pure C3 CRUD without state guards would allow orphaned job-dictionary links)
- T018 [US5] Implement CSV/TSV import in `DictionaryManager`: parse the uploaded content, detect the delimiter, create `DictionaryEntry` rows, preview with duplicate detection, and return parse errors with line numbers for malformed rows (FR-025). Add `DictionaryImport` schema validation.
- T019 [US5] Implement the `/api/translate/dictionaries` endpoints in `backend/src/api/routes/translate.py`: `POST /` (create), `GET /` (list), `GET /{dict_id}` (get with entries), `PUT /{dict_id}` (update), `DELETE /{dict_id}` (delete — blocked if attached), `POST /{dict_id}/entries` (add entry), `PUT /{dict_id}/entries/{entry_id}` (edit), `DELETE /{dict_id}/entries/{entry_id}` (delete), `POST /{dict_id}/import` (CSV/TSV import with preview).
- T020 [US5] Implement per-batch dictionary filtering logic in `DictionaryManager.filter_for_batch(source_texts: list[str]) -> list[dict]`: scan batch texts for substrings matching dictionary `source_term` values; return matched entries in priority order across all attached dictionaries (FR-044). This is consumed by US2 (preview) and US3 (executor).
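The per-batch filtering in T020 can be sketched as a plain function. The `(priority, entries)` tuple shape and the "lower number wins" ordering are assumptions for illustration, not the project's actual data model:

```python
# Substring-match each dictionary entry against the batch's source texts and
# return hits ordered by dictionary priority, so only relevant glossary terms
# are injected into the LLM prompt.
def filter_for_batch(
    source_texts: list[str],
    dictionaries: list[tuple[int, list[dict]]],  # assumed (priority, entries) pairs
) -> list[dict]:
    """Return dictionary entries whose source_term occurs in any batch text."""
    haystack = "\n".join(source_texts).lower()
    matched = []
    # Assumption: lower priority number = higher precedence.
    for priority, entries in sorted(dictionaries, key=lambda d: d[0]):
        for entry in entries:
            if entry["source_term"].lower() in haystack:
                matched.append(entry)
    return matched


glossary = [
    (1, [{"source_term": "GPU", "target_translation": "ГПУ"}]),
    (2, [{"source_term": "cache", "target_translation": "кэш"},
         {"source_term": "unused", "target_translation": "-"}]),
]
hits = filter_for_batch(["The GPU cache is warm"], glossary)
# hits contains the GPU and cache entries in priority order; "unused" is dropped
```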
Frontend — Dictionary UI
- T021 [P] [US5] Add dictionary API methods to `frontend/src/lib/api/translate.js`: `fetchDictionaries()`, `createDictionary()`, `updateDictionary()`, `deleteDictionary()`, `fetchDictionaryEntries()`, `addEntry()`, `editEntry()`, `deleteEntry()`, `importDictionary()`.
- T022 [US5] Create a `DictionaryList` SvelteKit page in `frontend/src/routes/translate/dictionaries/+page.svelte`: list dictionaries with name, language, term count, attached job count, and create/delete actions. @UX_STATE: idle, loading, empty, populated, delete_blocked.
- T023 [US5] Create a `DictionaryEditor` SvelteKit page in `frontend/src/routes/translate/dictionaries/[id]/+page.svelte`: inline term editor (source_term → target_translation), add/delete rows, CSV/TSV import with conflict preview, export. @UX_STATE: idle, loading, editing, importing, import_preview, import_conflict, saving. @UX_FEEDBACK: import preview with duplicate flags; toast on save.
Verification — US5
- T024 [US5] Write pytest tests for `DictionaryManager` in `backend/src/plugins/translate/__tests__/test_dictionary.py`: test create/update/delete, add entry with duplicate detection (expect conflict), import CSV with valid/invalid rows, delete dictionary blocked by an active job, per-batch filtering returns matched terms.
- T025 [US5] Verify US5 acceptance scenarios against spec User Story 5 (6 scenarios). Run `cd backend && pytest backend/src/plugins/translate/__tests__/test_dictionary.py -v`.
Checkpoint: Dictionary management fully functional — CRUD, import, filtering, and job attachment all work.
Phase 5: User Story 2 — Preview Translated Output (Priority: P2)
Goal: User triggers preview on a saved job → system fetches sample rows → sends to LLM with context + dictionary → displays source/context/translation side-by-side → user approves/edits/rejects → preview state saved for execution gate.
Independent Test: Create job + dictionary → click Preview → verify 10 rows shown with LLM translations → approve 8, edit 1, reject 1 → verify state preserved.
Backend — Preview Engine
- T026 [US2] Implement a `TranslationPreview` class in `backend/src/plugins/translate/preview.py`: `preview_rows(job_id, sample_size)`. Fetch source rows from Superset via `SupersetClient`; construct the LLM prompt using `LLMProviderService` + `llm_prompt_templates.render_prompt()` + `DictionaryManager.filter_for_batch()`; call the LLM; return a `PreviewRow` list. @COMPLEXITY 4 — instrument with `belief_scope`/`reason`/`reflect` at LLM call boundaries. (RATIONALE: C4 because preview is stateful (approve/edit/reject lifecycle) and calls an external LLM API with side effects; REJECTED: making preview purely read-only without approval state would degrade UX by losing user decisions between preview and execution)
- T027 [US2] Implement token count and cost estimation in the preview response: compute estimated tokens from the sample → extrapolate to the full dataset row count → apply provider pricing → return `estimated_total_rows`, `estimated_tokens`, `estimated_cost` in `TranslationPreviewResponse` (FR-014).
- T028 [US2] Implement the preview quality gate: create persistent `TranslationPreviewSession` and `TranslationPreviewRecord` rows with `config_hash` and `dict_snapshot_hash`. Preview acceptance gates full execution; rejected preview sample rows are excluded from the full run. Preview is a quality gate — unseen rows are processed normally in the full run.
- T029 [US2] Implement the `/api/translate/jobs/{job_id}/preview` endpoint: `POST` triggers a preview and returns preview rows with `status=pending`. Add `PUT /api/translate/jobs/{job_id}/preview/rows/{row_key}` for approve/edit/reject actions. Add `POST /api/translate/jobs/{job_id}/preview/approve-all` for bulk approve.
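The extrapolation arithmetic behind T027 can be sketched in a few lines. The per-1k-token pricing unit is an illustrative assumption; real pricing comes from the configured provider via `LLMProviderService`:

```python
# Extrapolate the sample's token usage to the full dataset and price it:
# tokens/row from the sample, times total rows, times price per 1k tokens.
def estimate_cost(
    sample_tokens: int,        # tokens consumed translating the sample
    sample_rows: int,          # rows in the preview sample
    total_rows: int,           # full dataset row count
    price_per_1k_tokens: float,  # assumed billing unit
) -> dict:
    """Return the three estimate fields of TranslationPreviewResponse."""
    tokens_per_row = sample_tokens / sample_rows
    estimated_tokens = round(tokens_per_row * total_rows)
    return {
        "estimated_total_rows": total_rows,
        "estimated_tokens": estimated_tokens,
        "estimated_cost": round(estimated_tokens / 1000 * price_per_1k_tokens, 4),
    }


est = estimate_cost(sample_tokens=2_500, sample_rows=10, total_rows=40_000,
                    price_per_1k_tokens=0.002)
# 250 tokens/row over 40,000 rows → 10,000,000 tokens → cost 20.0
```

Note the estimate inherits the sample's bias: if sample rows are shorter than average, the full-run cost will be underestimated.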
Frontend — Preview UI
- T030 [P] [US2] Add preview API methods to `frontend/src/lib/api/translate.js`: `fetchPreview()`, `approveRow()`, `editRow()`, `rejectRow()`, `approveAll()`.
- T031 [US2] Create a `TranslationPreview` component in `frontend/src/lib/components/translate/TranslationPreview.svelte`: side-by-side table (source, context, LLM translation), approve/edit/reject buttons per row, bulk approve, cost estimate card before the full run, row limit input. @UX_STATE: idle, loading, preview_loaded, preview_error, retrying. @UX_FEEDBACK: spinner during the LLM call; visual distinction between LLM-generated and user-edited values; cost estimate reactivity. @UX_RECOVERY: retry-preview button; individual row re-translate.
- T032 [US2] Integrate `TranslationPreview` into the `TranslationJobConfig` page (`frontend/src/routes/translate/[id]/+page.svelte`) as a tab or collapsible section that appears after the job is saved.
Verification — US2
- T033 [US2] Write pytest tests for preview in `backend/src/plugins/translate/__tests__/test_preview.py`: test preview with a valid job, preview with a dictionary (verify glossary terms appear in the prompt), preview row approve/edit/reject state transitions, cost estimation accuracy. Mock LLM provider responses.
- T034 [US2] Write a vitest component test for `TranslationPreview` in `frontend/src/lib/components/translate/__tests__/TranslationPreview.test.js`: test rendering of preview rows, approve/reject/edit interactions, bulk approve behavior. Mock the API client.
- T035 [US2] Verify US2 acceptance scenarios against spec User Story 2 (5 scenarios). Run `cd backend && pytest backend/src/plugins/translate/__tests__/test_preview.py -v && cd frontend && npm run test -- --run`.
Checkpoint: Preview flows complete — LLM translation with context + dictionary, approve/edit/reject lifecycle, cost estimation.
Phase 6: User Story 3 — Execute Translation & Insert Results (Priority: P3)
Goal: User triggers full batch execution → system processes rows in batches → generates INSERT SQL → user copies to SQL Lab or auto-executes → failed batches retryable.
Independent Test: Create job → preview + approve → execute → verify INSERT SQL generated with correct key columns → execute in SQL Lab → verify rows in target table.
Backend — Executor + SQL Generator + Orchestrator
- T036 [US3] Implement a `SQLGenerator` class in `backend/src/plugins/translate/sql_generator.py`: `generate_insert(records: list[TranslationRecord], job: TranslationJob) -> str`. Detect the dialect from `job.database_dialect` (cached from the Superset connection at save time). Produce safe, dialect-appropriate SQL. For PostgreSQL/Greenplum — `INSERT INTO "target_table" ("key_cols"..., "target_col") VALUES (...)` with quoted identifiers; support `upsert_strategy`: `insert` (plain INSERT), `skip_existing` (ON CONFLICT DO NOTHING), `overwrite` (ON CONFLICT DO UPDATE). For ClickHouse — `INSERT INTO target_table (key_cols..., target_col) VALUES (...)`; `skip_existing` warns the user (not natively supported); `overwrite` is a documented limitation. @COMPLEXITY 3. (RATIONALE: dialect-aware because Superset connections may use ClickHouse or PostgreSQL; REJECTED: PostgreSQL-only would break ClickHouse users; raw identifier interpolation rejected)
- T037 [US3] Implement a `TranslationExecutor` class in `backend/src/plugins/translate/executor.py`: `execute_run(run: TranslationRun, job: TranslationJob)`. Fetch all source rows from Superset; split into batches; for each batch: call `DictionaryManager.filter_for_batch()`, construct the prompt via `LLMProviderService`, call the LLM, and create `TranslationRecord` rows with status `translated`/`failed`/`skipped`; handle batch-level retry on LLM failure (FR-015); skip NULL translation values (FR-016); reject NULL key values (FR-017); update run statistics. @COMPLEXITY 4 — instrument with `belief_scope`/`reason`/`reflect` at batch boundaries and error paths.
- T038 [US3] Implement a `TranslationOrchestrator` class in `backend/src/plugins/translate/orchestrator.py`: `start_run(job_id, trigger_type)`. Validate preconditions (job config valid, datasource accessible, LLM provider reachable); create a `TranslationRun` with status `running` and config/dict snapshots (FR-019, FR-029); dispatch to `TranslationExecutor`; on completion call `SQLGenerator`; record `TranslationEvent` rows via `TranslationEventLog` (FR-046); enforce state transitions: pending → running → (completed | partial | failed) — no skipping. @COMPLEXITY 5 — full @PRE/@POST/@DATA_CONTRACT/@INVARIANT enforcement with @RATIONALE/@REJECTED. (RATIONALE: the central coordinator is C5 because preview, execution, event logging, and retry share run state and must coordinate within a single transaction boundary; REJECTED: a distributed actor model would introduce eventual-consistency challenges for status tracking at the current scale)
- T039 [US3] Implement a `TranslationEventLog` class in `backend/src/plugins/translate/events.py`: `log_event(run_id, job_id, event_type, payload)` creates an immutable `TranslationEvent` row; `query_events(job_id, filters)` serves audit/dashboard queries; `prune_expired()` enforces 90-day retention (FR-049) — scheduled via an APScheduler cleanup job. @COMPLEXITY 5 — @INVARIANT: every run must have exactly one `run_started` and one terminal event. (RATIONALE: C5 warranted because the event log is the single source of truth for observability, metrics, and audit; REJECTED: stdout-only logging lacks structured payload integrity and cannot enforce the terminal-event invariant)
- T040 [US3] Implement execution endpoints in `backend/src/api/routes/translate.py`: `POST /api/translate/jobs/{job_id}/runs` (trigger a manual run — creates the run and dispatches the orchestrator, which translates AND submits to the Superset API), `GET /api/translate/runs/{run_id}` (status + statistics + insert_status + superset_query_id), `GET /api/translate/runs/{run_id}/records` (paginated `TranslationRecord` list), `POST /api/translate/runs/{run_id}/retry` (retry failed batches only — FR-015), `POST /api/translate/runs/{run_id}/retry-insert` (retry the Superset insert only, without re-translating). Inject `Depends(require_permission("translate.job.execute"))`.
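The dialect branching T036 describes can be sketched as follows. This is a simplified stand-in, assuming list-of-dict rows and string-based value quoting; the real generator works over `TranslationRecord` and `TranslationJob` objects:

```python
# Dialect-aware INSERT generation: quoted identifiers plus ON CONFLICT for
# PostgreSQL/Greenplum, plain INSERT for ClickHouse. Identifier and literal
# quoting guard against injection from data values.
def generate_insert(
    dialect: str,                 # "postgresql" | "clickhouse"
    table: str,
    key_cols: list[str],
    target_col: str,
    rows: list[dict],
    upsert_strategy: str = "insert",
) -> str:
    def quote_ident(name: str) -> str:
        # Double-quote identifiers for PostgreSQL; ClickHouse gets bare names here.
        return '"%s"' % name.replace('"', '""') if dialect == "postgresql" else name

    def quote_value(value) -> str:
        return "NULL" if value is None else "'%s'" % str(value).replace("'", "''")

    cols = key_cols + [target_col]
    col_list = ", ".join(quote_ident(c) for c in cols)
    values = ", ".join(
        "(%s)" % ", ".join(quote_value(row[c]) for c in cols) for row in rows
    )
    sql = f"INSERT INTO {quote_ident(table)} ({col_list}) VALUES {values}"

    if dialect == "postgresql" and upsert_strategy == "skip_existing":
        sql += " ON CONFLICT (%s) DO NOTHING" % ", ".join(quote_ident(c) for c in key_cols)
    elif dialect == "postgresql" and upsert_strategy == "overwrite":
        sql += " ON CONFLICT (%s) DO UPDATE SET %s = EXCLUDED.%s" % (
            ", ".join(quote_ident(c) for c in key_cols),
            quote_ident(target_col), quote_ident(target_col),
        )
    return sql + ";"


sql = generate_insert("postgresql", "products_ru", ["id"], "name_ru",
                      [{"id": 1, "name_ru": "Ноутбук"}], "skip_existing")
# → INSERT INTO "products_ru" ("id", "name_ru") VALUES ('1', 'Ноутбук')
#   ON CONFLICT ("id") DO NOTHING;
```

Note that `ON CONFLICT` requires a matching unique index on the key columns in the target table; the ClickHouse branch simply omits the clause, which is why `skip_existing` only warns there.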
Frontend — Execution UI
- T041 [P] [US3] Add execution API methods to `frontend/src/lib/api/translate.js`: `triggerRun()`, `fetchRunStatus()`, `fetchRunRecords()`, `retryFailedBatches()`.
- T042 [US3] Create a `TranslationRunProgress` component in `frontend/src/lib/components/translate/TranslationRunProgress.svelte`: live progress bar (WebSocket-driven from `TaskWebSocket`), batch counter (N/M), success/failure/skip counts, cancel button. @UX_STATE: idle, running, pausing, cancelled, completed, partial, failed. @UX_FEEDBACK: progress percentage `$derived` from translated/total; real-time counts. @UX_RECOVERY: retry-failed-batches button; cancel run; download skipped rows.
- T043 [US3] Create a `TranslationRunResult` component in `frontend/src/lib/components/translate/TranslationRunResult.svelte`: completion summary (rows translated/failed/skipped, token count, cost, insert_status), Superset execution reference with status badge, generated SQL block for audit/debugging (collapsed by default), retry-insert button. @UX_STATE: completed, partial, failed, insert_failed. @UX_FEEDBACK: Superset execution status badge; SQL block for audit.
- T044 [US3] Integrate `TranslationRunProgress` and `TranslationRunResult` into the `TranslationJobConfig` page as the "Run" tab/section.
Verification — US3
- T045 [US3] Write pytest tests for `SQLGenerator` in `backend/src/plugins/translate/__tests__/test_sql_generator.py`: test INSERT with a single key and with a composite key, for the PostgreSQL dialect AND the ClickHouse dialect. Test PostgreSQL UPSERT (ON CONFLICT DO NOTHING, ON CONFLICT DO UPDATE). Test ClickHouse plain INSERT and the skip_existing warning. Test NULL key rejection, NULL translation value skipping, identifier quoting per dialect, and injection safety. Validate SQL syntax correctness against each dialect.
- T046 [US3] Write pytest tests for the executor + orchestrator in `backend/src/plugins/translate/__tests__/test_orchestrator.py`: test the full run lifecycle (pending → running → completed), partial failure (one batch fails, the rest succeed), batch retry, event log invariants, NULL handling. Mock the LLM provider and `SupersetClient`.
- T047 [US3] Verify US3 acceptance scenarios against spec User Story 3 (5 scenarios). Run `cd backend && pytest backend/src/plugins/translate/__tests__/test_orchestrator.py backend/src/plugins/translate/__tests__/test_sql_generator.py -v`.
Checkpoint: Execution pipeline complete — batch processing, INSERT generation, retry, event logging. User can translate data and insert into target table.
Phase 7: User Story 6 — Feedback Loop (Correct → Dictionary) (Priority: P3)
Goal: In run results, user selects incorrect translation → submits correction to dictionary → dictionary updated with origin tracking → next run uses corrected term.
Independent Test: Complete a run → find incorrect translation → open correction popup → submit to dictionary → re-run preview → verify corrected term used.
Backend — Correction Submission
- T048 [US6] Implement the correction submission endpoint in `backend/src/api/routes/translate.py`: `POST /api/translate/corrections` accepting a `TermCorrectionSubmit` body. Validate the target-language match between dictionary and job (FR language-validation edge case); detect an existing-entry conflict → return a conflict response (FR-032); create a `DictionaryEntry` with origin tracking (`origin_run_id`, `origin_row_key`, `origin_user_id`) per FR-033. Inject `Depends(require_permission("translate.dictionary.edit"))`.
- T049 [US6] Implement a bulk correction endpoint: `POST /api/translate/corrections/bulk` accepting an array of `TermCorrectionSubmit` objects (FR-034). Process atomically — if any conflict is detected, return all conflicts for user resolution before any partial apply.
Frontend — Correction UI
- T050 [P] [US6] Add correction API methods to `frontend/src/lib/api/translate.js`: `submitCorrection()`, `submitBulkCorrections()`.
- T051 [US6] Create a `TermCorrectionPopup` component in `frontend/src/lib/components/translate/TermCorrectionPopup.svelte`: text selection on the source term and the incorrect target translation → popup with source term (pre-filled from the source column), incorrect target translation (pre-filled from the selection), corrected target translation input, dictionary selector dropdown (filtered by target language), submit button, and conflict dialog (overwrite/keep existing/cancel). @UX_STATE: closed, selecting, editing, submitting, conflict_detected, submitted. @UX_FEEDBACK: "Added to Dictionary" badge on the corrected row.
- T052 [US6] Create a `BulkCorrectionSidebar` component in `frontend/src/lib/components/translate/BulkCorrectionSidebar.svelte`: a sidebar collecting selected terms across rows, per-term correction inputs, and submit-all to dictionary. @UX_STATE: closed, collecting, reviewing, submitting, submitted. @UX_REACTIVITY: selected terms list `$state`.
- T053 [US6] Integrate the feedback-loop components into `TranslationRunResult` (T043) — add selection-highlight behavior and correction triggers.
Verification — US6
- T054 [US6] Write pytest tests for the correction endpoints in `backend/tests/test_translate_corrections.py`: test single correction, bulk correction, conflict detection (existing term), cross-language rejection, and origin-tracking fields populated. Verify the corrected term appears in the next preview's dictionary filter.
- T055 [US6] Verify US6 acceptance scenarios against spec User Story 6 (5 scenarios). Run `cd backend && pytest backend/tests/test_translate_corrections.py -v`.
Checkpoint: Feedback loop complete — corrections flow from results → dictionary → next run.
Phase 8: User Story 7 — Schedule Translation Jobs (Priority: P3)
Goal: User configures schedule → system triggers runs → new-key-only translation → optional auto-INSERT → failure notification → pause/resume.
Independent Test: Configure schedule (every 5 min for test) → wait for trigger → verify new TranslationRun created → verify only new keys translated → disable schedule → verify no more triggers.
Backend — Schedule Management + Trigger Dispatch
- T056 [US7] Implement a `TranslationScheduler` class in `backend/src/plugins/translate/scheduler.py`: `create_schedule()`, `update_schedule()`, `delete_schedule()`, `enable_schedule()`, `disable_schedule()`, `get_next_executions(schedule, n=3)` (FR-036). Register the schedule with the existing `SchedulerService` via `add_job()` with a cron/interval/date trigger. @COMPLEXITY 4 — instrument with `belief_scope`/`reason`/`reflect`. (RATIONALE: C4 because schedule management is stateful, with APScheduler integration, concurrency-policy enforcement, and trigger-dispatch side effects)
- T057 [US7] Implement the schedule trigger handler `_execute_scheduled_translation(job_id)`. Enforce the concurrency policy: check whether a previous run for the same job is still `running` → `skip` (log + event) or `queue` (start after the previous completes) per FR-039. If proceeding: create a new `TranslationRun` with `trigger_type=scheduled`; fetch source rows; apply the new-key-only filter (FR-045) — compare current key values against the previous successful run's key values; dispatch to `TranslationOrchestrator`. On failure, send a notification via `NotificationService` (FR-041, FR-048). The schedule remains enabled for the next trigger (US7 acceptance scenario 6).
- T058 [US7] Implement Superset SQL Lab API submission for all runs: create a `SupersetSqlLabExecutor` class in `backend/src/plugins/translate/superset_executor.py`. Submit generated SQL to `/api/v1/sqllab/execute/`, poll execution status, and update `TranslationRun.insert_status`, `superset_query_id`, `rows_affected`, and the error fields. For scheduled runs this happens automatically; for manual runs it happens on user trigger. Record `insert_submitted`/`insert_succeeded`/`insert_failed` events.
- T059 [US7] Implement schedule endpoints in `backend/src/api/routes/translate.py`: `PUT /api/translate/jobs/{job_id}/schedule` (create/update), `DELETE /api/translate/jobs/{job_id}/schedule` (remove), `POST /api/translate/jobs/{job_id}/schedule/enable` (FR-040), `POST /api/translate/jobs/{job_id}/schedule/disable` (FR-040). Inject `Depends(require_permission("translate.schedule.manage"))`. Add a schedule warning when editing a job with an active schedule (FR-042).
- T060 [US7] Extend `SchedulerService.load_schedules()` in `backend/src/core/scheduler.py` to discover and register active `TranslationSchedule` rows alongside the existing backup schedules (R4).
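The new-key-only filter in T057 reduces to a set-difference over composite key tuples. This sketch assumes rows arrive as dicts and that the previous run's keys have already been loaded into a set:

```python
# Keep only rows whose composite key tuple was not seen in the previous
# successful run, so scheduled runs translate new rows incrementally.
def filter_new_keys(
    current_rows: list[dict],
    previous_keys: set[tuple],
    key_cols: list[str],
) -> list[dict]:
    """Return rows whose composite key is absent from the previous run."""
    def key_of(row: dict) -> tuple:
        return tuple(row[c] for c in key_cols)

    return [row for row in current_rows if key_of(row) not in previous_keys]


rows = [{"id": 1, "lang": "en"}, {"id": 2, "lang": "en"}, {"id": 3, "lang": "en"}]
seen = {(1, "en"), (2, "en")}
new_rows = filter_new_keys(rows, seen, ["id", "lang"])
# only the id=3 row remains for translation
```

At larger scale the same comparison could be pushed into SQL instead of loading all previous keys into memory; the set-based form keeps the sketch simple.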
Frontend — Schedule UI
- T061 [P] [US7] Add schedule API methods to `frontend/src/lib/api/translate.js`: `updateSchedule()`, `deleteSchedule()`, `enableSchedule()`, `disableSchedule()`.
- T062 [US7] Create a `ScheduleConfig` component in `frontend/src/lib/components/translate/ScheduleConfig.svelte`: type selector (cron/interval/once), cron expression input with validation, interval input, timezone selector, run-at datetime picker, next-3-executions preview (with timezone), concurrency policy selector (skip/queue), and an enable/disable toggle with status indicator. Warn if no prior successful manual run exists. @UX_STATE: idle, editing, validating, enabled, disabled, no_prior_run_warning. @UX_REACTIVITY: next execution times `$derived` from the schedule config, with timezone display.
- T063 [US7] Integrate `ScheduleConfig` into the `TranslationJobConfig` page as the "Schedule" tab.
Verification — US7
- T064 [US7] Write pytest tests for the scheduler in `backend/src/plugins/translate/__tests__/test_scheduler.py`: test schedule CRUD, cron expression validation, next-N-executions calculation, trigger dispatch with skip/queue concurrency, the new-key-only filter (verify only unseen keys are processed), auto-INSERT execution, failure notification, pause/resume, and loading on `SchedulerService` start.
- T065 [US7] Verify US7 acceptance scenarios against spec User Story 7 (8 scenarios). Run `cd backend && pytest backend/src/plugins/translate/__tests__/test_scheduler.py -v`.
Checkpoint: Scheduling complete — jobs can run automatically on schedule with new-key-only incremental translation and failure recovery.
Phase 9: User Story 4 — Translation History & Audit Trail (Priority: P4)
Goal: User views past runs with filterable list; inspects run details (config snapshot, prompt, translations, INSERT SQL); sees edit marks; duplicates job. Admin views metrics dashboard.
Independent Test: Run several translations → open history → filter by datasource → click run → verify config snapshot, prompt, translations with edit marks, INSERT SQL all shown.
Backend — History + Metrics Endpoints
- T066 [US4] Implement history endpoints in `backend/src/api/routes/translate.py`: `GET /api/translate/runs` (list with filters: `job_id`, `datasource_id`, `target_table`, `status`, `date_from`, `date_to`, with pagination per FR-020), `GET /api/translate/runs/{run_id}` (detail with `config_snapshot`, `prompt_used`, and `records` exposing the `llm_translation` and `user_edit` fields — per the FR distinguishing original from user-edited values).
- T067 [US4] Implement a `TranslationMetrics` class in `backend/src/plugins/translate/metrics.py`: `get_job_metrics(job_id) -> MetricsResponse`. Aggregate from the `TranslationEvent` table: total runs, success/failure counts, cumulative tokens, cumulative cost, average batch latency (FR-047). @COMPLEXITY 3.
- T068 [US4] Implement the metrics endpoint `GET /api/translate/jobs/{job_id}/metrics`. Inject `Depends(require_permission("translate.history.view"))`.
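The aggregation in T067 can be sketched as a fold over a job's event stream. The event-type names and payload fields here are assumptions about the `TranslationEvent` shape, chosen to match the invariants named in T039:

```python
# Fold a job's event stream into MetricsResponse-style totals: run counts from
# run_started/run_completed/run_failed events, tokens/cost/latency from
# batch_completed payloads.
def aggregate_metrics(events: list[dict]) -> dict:
    totals = {"total_runs": 0, "succeeded": 0, "failed": 0,
              "cumulative_tokens": 0, "cumulative_cost": 0.0,
              "avg_batch_latency_ms": 0.0}
    latencies = []
    for ev in events:
        kind, payload = ev["event_type"], ev.get("payload", {})
        if kind == "run_started":
            totals["total_runs"] += 1
        elif kind == "run_completed":
            totals["succeeded"] += 1
        elif kind == "run_failed":
            totals["failed"] += 1
        elif kind == "batch_completed":
            totals["cumulative_tokens"] += payload.get("tokens", 0)
            totals["cumulative_cost"] += payload.get("cost", 0.0)
            latencies.append(payload.get("latency_ms", 0))
    if latencies:
        totals["avg_batch_latency_ms"] = sum(latencies) / len(latencies)
    return totals


m = aggregate_metrics([
    {"event_type": "run_started"},
    {"event_type": "batch_completed", "payload": {"tokens": 900, "cost": 0.02, "latency_ms": 1200}},
    {"event_type": "batch_completed", "payload": {"tokens": 1100, "cost": 0.03, "latency_ms": 800}},
    {"event_type": "run_completed"},
])
# 1 run, 2000 cumulative tokens, average batch latency 1000 ms
```

In production the per-event loop would be replaced with SQL aggregates over `TranslationEvent`; the 90-day pruning in T074 is why `MetricSnapshot` rows must be folded in as well.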
Frontend — History + Metrics UI
- T069 [P] [US4] Add history API methods to `frontend/src/lib/api/translate.js`: `fetchRunHistory()`, `fetchRunDetail()`, `fetchJobMetrics()`.
- T070 [US4] Create a `TranslationHistory` SvelteKit page in `frontend/src/routes/translate/history/+page.svelte`: filterable table (datasource, target table, row count, status, date, user), click-to-expand detail with config snapshot, prompt, translation rows with edit marks, and INSERT SQL. @UX_STATE: idle, loading, empty, populated, detail_open. @UX_REACTIVITY: filtered list `$derived` from the filters.
- T071 [US4] Create an admin metrics dashboard section (integrated into existing admin pages or standalone) displaying per-job metrics: run counts, success/failure ratio, cumulative tokens, cumulative cost, average latency. Use the `MetricsResponse` schema.
Verification — US4
- T072 [US4] Write pytest tests for history + metrics in `backend/tests/test_translate_history.py`: test the run list with filters, run detail with snapshots, metrics aggregation accuracy, and `TranslationEvent` queryability.
- T073 [US4] Verify US4 acceptance scenarios against spec User Story 4 (4 scenarios). Run `cd backend && pytest backend/tests/test_translate_history.py -v`.
Checkpoint: History and audit complete — all runs traceable, metrics dashboard populated.
Phase 10: Polish & Cross-Cutting Concerns
Purpose: Retention enforcement, notification wiring, semantic audit, quickstart validation, and rejected-path regression protection.
- T074 [P] Implement 90-day retention pruning in `TranslationEventLog.prune_expired()`: run as an APScheduler daily cleanup job. BEFORE pruning events/records: persist cumulative metrics as a `MetricSnapshot` row (tokens, cost, run counts). Then prune `TranslationRecord`, `TranslationPreviewRecord`, `TranslationEvent`, and `insert_sql`/`config_snapshot` fields older than 90 days. Preserve `TranslationRun` metadata, `MetricSnapshot` rows, and `superset_query_id`. Verify metrics remain accurate post-prune (SC-014). (RATIONALE: metric snapshots prevent cumulative data loss from event pruning; REJECTED: indefinite retention would violate storage constraints)
- T075 [P] Wire scheduled-run failure notification: ensure the `TranslationScheduler` trigger handler calls `NotificationService.send()` when a scheduled run fails (FR-041, FR-048). Test with a mock notification provider.
- T076 [P] Instrument remaining C4/C5 Python flows with `belief_scope`/`reason`/`reflect`/`explore` markers where missing: `TranslationOrchestrator.start_run()` (entry/exit), `TranslationExecutor.execute_run()` (batch boundaries + error paths), `DictionaryManager` mutation boundaries, `TranslationScheduler` trigger dispatch. Verify via the `axiom_semantic_validation` belief-runtime audit.
- T077 Run full semantic audit via axiom MCP tools:
  - `axiom_semantic_validation audit_contracts --file_path backend/src/plugins/translate/` — verify all `[DEF]` anchors are closed, `@RELATION` targets resolve, no orphan contracts, and C4+ contracts have the required tag density
  - `axiom_semantic_validation audit_belief_protocol --file_path backend/src/plugins/translate/` — verify `@RATIONALE`/`@REJECTED` are present on all C5 contracts
  - `axiom_semantic_validation audit_belief_runtime --file_path backend/src/plugins/translate/` — verify `belief_scope`/`reason`/`reflect`/`explore` markers exist in all C4+ module bodies
  - `axiom_semantic_validation impact_analysis --contract_id TranslationOrchestrator:Class` — verify no rejected path is accidentally re-enabled
- T078 Run quickstart validation: follow `specs/028-llm-datasource-supeset/quickstart.md` end-to-end — create dictionary → create job → preview → execute → verify INSERT SQL → submit correction → schedule → view history → verify metrics. Run `cd backend && pytest -v`, `cd frontend && npm run test -- --run`, `cd backend && ruff check src/plugins/translate/ src/api/routes/translate.py src/models/translate.py src/schemas/translate.py`.
- T079 Rejected-path regression guard: add a test case in `backend/src/plugins/translate/__tests__/test_orchestrator.py` verifying snapshot isolation — changing job config mid-run does NOT invalidate the running `TranslationRun`. Add a test case in `backend/src/plugins/translate/__tests__/test_sql_generator.py` verifying that UPDATE statements are never generated (only INSERT/UPSERT per PostgreSQL dialect). Add a test case in `backend/src/plugins/translate/__tests__/test_dictionary.py` verifying that duplicate `source_term` entries cannot coexist (`UniqueConstraint` enforced) and conflict resolution only offers overwrite/keep-existing. Add a test case in `backend/src/plugins/translate/__tests__/test_retention.py` verifying metric snapshots are persisted before event pruning and cumulative metrics remain accurate post-prune.
- T080 [P] Implement cancel run endpoint: `POST /api/translate/runs/{run_id}/cancel` in `backend/src/api/routes/translate.py`. Set `translation_status=cancelled`, mark in-progress batches as failed, do NOT submit INSERT SQL. Emit a `run_cancelled` event. Inject `Depends(require_permission("translate.job.execute"))`.
- T081 [P] Implement download skipped rows endpoint: `GET /api/translate/runs/{run_id}/skipped.csv` returning a CSV of rows skipped due to NULL keys or translation failures. Use `key_hash` for efficient lookup.
- T082 [P] Compute `key_hash` for `TranslationRecord` and `TranslationPreviewRecord`: `hash(canonical_json(key_values))` at creation time. Add `config_hash` for `TranslationRun` and `TranslationPreviewSession`: hash of effective config (columns, keys, target, prompt, dictionaries). Use for idempotency checks, new-key-only filtering, and stale preview detection.
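The `key_hash`/`config_hash` scheme in T082 hinges on hashing a canonical JSON form, so that key order and whitespace never yield different hashes for the same logical value. A minimal sketch — the helper names `canonical_json` and `stable_hash` are illustrative, not existing project code:

```python
import hashlib
import json

def canonical_json(value) -> str:
    # Sorted keys + compact separators make the serialization
    # independent of dict insertion order and formatting.
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

def stable_hash(value) -> str:
    # SHA-256 over the canonical form: stable across processes and
    # Python versions, unlike the builtin hash().
    return hashlib.sha256(canonical_json(value).encode("utf-8")).hexdigest()

# Same key_values in a different order hash identically, which is
# what idempotency checks and new-key-only filtering rely on.
a = stable_hash({"id": 42, "region": "KR"})
b = stable_hash({"region": "KR", "id": 42})
```

Note that Python's builtin `hash()` is salted per process (`PYTHONHASHSEED`), so a digest-based helper like this is required for hashes persisted to the database.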
Dependencies & Execution Order
Phase Dependencies
- Setup (Phase 1): No dependencies — can start immediately
- Foundational (Phase 2): Depends on Setup — BLOCKS all user stories
- US1 (Phase 3): Depends on Foundational — no dependencies on other stories. Recommended start after Foundational.
- US5 (Phase 4): Depends on Foundational — can run in parallel with US1. Dictionary filtering (T020) will be consumed later by US2/US3 but is self-contained.
- US2 (Phase 5): Depends on US1 (needs saved job) + US5 (needs dictionary filtering). Can start integration once US1 backend is stable.
- US3 (Phase 6): Depends on US1 (needs job config) + US2 (preview decisions feed executor). Sequential after US2.
- US6 (Phase 7): Depends on US3 (needs run results) + US5 (needs dictionary). Can run in parallel with US7 after US3.
- US7 (Phase 8): Depends on US1 (needs job) + US3 (needs execution pipeline). Can run in parallel with US6 after US3.
- US4 (Phase 9): Depends on US3 (needs run records). Can run in parallel with US6/US7 after US3.
- Polish (Phase 10): Depends on all desired user stories being complete.
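The phase bullets above form a dependency graph, and a planned execution order can be sanity-checked mechanically. A small sketch, where the `DEPS` map simply mirrors the bullets (a Kahn-style topological sort, not project code):

```python
# Phase dependency graph transcribed from the bullets above
# (phase -> set of prerequisite phases).
DEPS = {
    "Setup": set(),
    "Foundational": {"Setup"},
    "US1": {"Foundational"},
    "US5": {"Foundational"},
    "US2": {"US1", "US5"},
    "US3": {"US1", "US2"},
    "US6": {"US3", "US5"},
    "US7": {"US1", "US3"},
    "US4": {"US3"},
    "Polish": {"US1", "US2", "US3", "US4", "US5", "US6", "US7"},
}

def topo_order(deps: dict[str, set[str]]) -> list[str]:
    """Kahn-style topological sort; raises ValueError on a cycle.

    Phases emitted in the same pass (e.g. US1 and US5) have no
    mutual dependency and can run in parallel.
    """
    remaining = {k: set(v) for k, v in deps.items()}
    order: list[str] = []
    while remaining:
        ready = sorted(k for k, v in remaining.items() if not v)
        if not ready:
            raise ValueError("cycle in phase dependencies")
        for k in ready:
            order.append(k)
            del remaining[k]
        for v in remaining.values():
            v.difference_update(ready)
    return order

order = topo_order(DEPS)
```

Each batch of simultaneously "ready" phases corresponds to a row of the parallel-opportunities table that follows.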
Parallel Opportunities
| Phase | Parallel Tasks | Notes |
|---|---|---|
| 1 | — | Sequential (only 2 tasks) |
| 2 | T003 ∥ T004 | Models + Schemas in parallel |
| 3 (US1) | T009 ∥ T012 | Backend CRUD ∥ API client |
| 4 (US5) | T017 ∥ T021 | DictionaryManager ∥ API client |
| 5 (US2) | T030 ∥ T031 | API client ∥ Preview component |
| 6 (US3) | T036 ∥ T041 | SQLGenerator ∥ API client |
| 7 (US6) | T050 ∥ T051 ∥ T052 | API client ∥ Popup ∥ Sidebar |
| 8 (US7) | T061 ∥ T062 | API client ∥ ScheduleConfig |
| 9 (US4) | T069 ∥ T070 | API client ∥ History page |
| 10 | T074 ∥ T075 ∥ T076 | Retention, notifications, belief instrumentation |
Cross-Story Parallelism
After Foundational (Phase 2):
- US1 and US5 can proceed in parallel by different developers
- After US3 completes: US6, US7, and US4 can proceed in parallel
Implementation Strategy
MVP First (US1 Only)
- Phase 1 + Phase 2 → Foundation
- Phase 3 (US1) → Job configuration CRUD
- STOP and VALIDATE: User can create, list, edit, delete translation jobs
- Deploy/demo — partial value (configuration ready, no translation yet)
Minimum Viable Feature (US1 + US5 + US2 + US3)
- Foundation → US1 + US5 (parallel) → US2 → US3
- STOP and VALIDATE: End-to-end translation flow works: configure → preview → execute → INSERT
- This is the core feature — all remaining stories add automation (US7), quality improvement (US6), and visibility (US4)
Full Feature (All Stories)
- MVP → US6 + US7 + US4 (parallel after US3) → Polish
- Scheduled automation, feedback loop, and audit trail all functional
Notes
- All file paths reference the actual repository structure (`backend/src/`, `frontend/src/`).
- `@COMPLEXITY 4/5` backend contracts require `belief_scope`/`reason`/`reflect` markers — verified in T076.
- `@RATIONALE`/`@REJECTED` tags appear only in C5 contracts (`TranslationOrchestrator`, `TranslationEventLog`) per INV_7.
- Rejected paths are explicitly protected by regression tests in T079.
- `[NEED_CONTEXT]` markers: none — all contract targets resolve to existing or planned modules within this feature.
- The existing `LLMProviderService`, `SupersetClient`, `SchedulerService`, `NotificationService`, and `TaskWebSocket` contracts are reused without modification.
- Quickstart.md (T078) serves as the human-verifiable acceptance test for the full feature.