Compare commits

...

3 Commits

| SHA1 | Message | Date |
| ---------- | -------------------------------------- | ------------------------- |
| 1149e8df1d | subagents | 2026-03-20 17:20:24 +03:00 |
| b89b9a66f2 | Add primary subagent-only orchestrator | 2026-03-20 16:46:16 +03:00 |
| ab085a81de | Add custom subagent role duplicates | 2026-03-20 16:36:18 +03:00 |
36 changed files with 4626 additions and 313 deletions


@@ -0,0 +1,377 @@
---
title: "Custom Subagents"
description: "Create and configure custom subagents in Kilo Code's CLI"
---
# Custom Subagents
Kilo Code's CLI supports **custom subagents** — specialized AI assistants that can be invoked by primary agents or manually via `@` mentions. Subagents run in their own isolated sessions with tailored prompts, models, tool access, and permissions, enabling you to build purpose-built workflows for tasks like code review, documentation, security audits, and more.
{% callout type="info" %}
Custom subagents are currently configured through the config file (`kilo.json`) or via markdown agent files. UI-based configuration is not yet available.
{% /callout %}
## What Are Subagents?
Subagents are agents that operate as delegates of primary agents. While **primary agents** (like Code, Plan, or Debug) are the main assistants you interact with directly, **subagents** are invoked to handle specific subtasks in isolated contexts.
Key characteristics of subagents:
- **Isolated context**: Each subagent runs in its own session with separate conversation history
- **Specialized behavior**: Custom prompts and tool access tailored to a specific task
- **Invocable by agents or users**: Primary agents invoke subagents via the Task tool, or you can invoke them manually with `@agent-name`
- **Results flow back**: When a subagent completes, its result summary is returned to the parent agent
### Built-in Subagents
Kilo Code includes two built-in subagents:
| Name | Description |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **general** | General-purpose agent for researching complex questions and executing multi-step tasks. Has full tool access (except todo). |
| **explore** | Fast, read-only agent for codebase exploration. Cannot modify files. Use for finding files by patterns, searching code, or answering questions about the codebase. |
## Agent Modes
Every agent has a **mode** that determines how it can be used:
| Mode | Description |
| ---------- | ------------------------------------------------------------------------------------------- |
| `primary` | User-facing agents you interact with directly. Switch between them with **Tab**. |
| `subagent` | Only invocable via the Task tool or `@` mentions. Not available as a primary agent. |
| `all` | Can function as both a primary agent and a subagent. This is the default for custom agents. |
## Configuring Custom Subagents
There are three ways to define custom subagents: JSON configuration, markdown files, or the interactive CLI.
### Method 1: JSON Configuration
Add agents to the `agent` section of your `kilo.json` config file. Any key that doesn't match a built-in agent name creates a new custom agent.
```json
{
"$schema": "https://app.kilo.ai/config.json",
"agent": {
"code-reviewer": {
"description": "Reviews code for best practices and potential issues",
"mode": "subagent",
"model": "anthropic/claude-sonnet-4-20250514",
"prompt": "You are a code reviewer. Focus on security, performance, and maintainability.",
"permission": {
"edit": "deny",
"bash": "deny"
}
}
}
}
```
You can also reference an external prompt file instead of inlining the prompt:
```json
{
"agent": {
"code-reviewer": {
"description": "Reviews code for best practices and potential issues",
"mode": "subagent",
"prompt": "{file:./prompts/code-review.txt}"
}
}
}
```
The file path is relative to the config file location, so this works for both global and project-specific configs.
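A minimal layout sketch, assuming a project-level `kilo.json` in the current directory (file names and prompt text are illustrative):

```shell
# Create the external prompt file next to kilo.json
mkdir -p prompts
cat > prompts/code-review.txt <<'EOF'
You are a code reviewer. Focus on security, performance, and maintainability.
EOF
# With this layout, the config value {file:./prompts/code-review.txt}
# resolves relative to kilo.json's own directory
ls prompts/code-review.txt
```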
### Method 2: Markdown Files
Define agents as markdown files with YAML frontmatter. Place them in:
- **Global**: `~/.config/kilo/agents/`
- **Project-specific**: `.kilo/agents/`
The **filename** (without `.md`) becomes the agent name.
```markdown
---
description: Reviews code for quality and best practices
mode: subagent
model: anthropic/claude-sonnet-4-20250514
temperature: 0.1
permission:
edit: deny
bash: deny
---
You are a code reviewer. Analyze code for:
- Code quality and best practices
- Potential bugs and edge cases
- Performance implications
- Security considerations
Provide constructive feedback without making direct changes.
```
{% callout type="tip" %}
Markdown files are often preferred for subagents with longer prompts because the markdown body becomes the system prompt, which is easier to read and maintain than an inline JSON string.
{% /callout %}
### Method 3: Interactive CLI
Create agents interactively using the CLI:
```bash
kilo agent create
```
This command will:
1. Ask where to save the agent (global or project-specific)
2. Prompt for a description of what the agent should do
3. Generate an appropriate system prompt and identifier using AI
4. Let you select which tools the agent can access
5. Let you choose the agent mode (`all`, `primary`, or `subagent`)
6. Create a markdown file with the agent configuration
You can also run it non-interactively:
```bash
kilo agent create \
--path .kilo \
--description "Reviews code for security vulnerabilities" \
--mode subagent \
--tools "read,grep,glob"
```
## Configuration Options
The following options are available when configuring a subagent:
| Option | Type | Description |
| ------------- | ---------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `description` | `string` | What the agent does and when to use it. Shown to primary agents to help them decide which subagent to invoke. |
| `mode` | `"subagent" \| "primary" \| "all"` | How the agent can be used. Defaults to `all` for custom agents. |
| `model` | `string` | Override the model for this agent (format: `provider/model-id`). If not set, subagents inherit the model of the invoking primary agent. |
| `prompt` | `string` | Custom system prompt. In JSON, can use `{file:./path}` syntax. In markdown, the body is the prompt. |
| `temperature` | `number` | Controls response randomness (0.0-1.0). Lower = more deterministic. |
| `top_p` | `number` | Alternative to temperature for controlling response diversity (0.0-1.0). |
| `permission` | `object` | Controls tool access. See [Permissions](#permissions) below. |
| `hidden` | `boolean` | If `true`, hides the subagent from the `@` autocomplete menu. It can still be invoked by agents via the Task tool. Only applies to `mode: subagent`. |
| `steps` | `number` | Maximum agentic iterations before forcing a text-only response. Useful for cost control. |
| `color` | `string` | Visual color in the UI. Accepts hex (`#FF5733`) or theme names (`primary`, `accent`, `error`, etc.). |
| `disable` | `boolean` | Set to `true` to disable the agent entirely. |
Any additional options not listed above are passed through to the model provider, allowing you to use provider-specific parameters like `reasoningEffort` for OpenAI models.
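As a sketch, a passthrough option sits alongside the standard fields. Here `reasoningEffort` is assumed to be a parameter the chosen provider accepts, and the agent name and model ID are illustrative:

```json
{
  "agent": {
    "deep-reviewer": {
      "description": "Slow, thorough reviews of complex changes",
      "mode": "subagent",
      "model": "openai/o3",
      "reasoningEffort": "high"
    }
  }
}
```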
### Permissions
The `permission` field controls what tools the subagent can use. Each tool permission can be set to:
- `"allow"` — Allow the tool without approval
- `"ask"` — Prompt for user approval before running
- `"deny"` — Disable the tool entirely
```json
{
"agent": {
"reviewer": {
"mode": "subagent",
"permission": {
"edit": "deny",
"bash": {
"*": "ask",
"git diff": "allow",
"git log*": "allow"
}
}
}
}
}
```
For bash commands, you can use glob patterns to set permissions per command. Rules are evaluated in order, with the **last matching rule winning**.
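For instance, with the rules below, a command like `git log --oneline` matches both `*` and `git log*`; because `git log*` appears last, it wins and the command runs without prompting:

```json
{
  "permission": {
    "bash": {
      "*": "ask",
      "git log*": "allow"
    }
  }
}
```

Swapping the order of the two entries would make `*` the final match, so even `git log` would prompt for approval.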
You can also control which subagents an agent can invoke via `permission.task`:
```json
{
"agent": {
"orchestrator": {
"mode": "primary",
"permission": {
"task": {
"*": "deny",
"code-reviewer": "allow",
"docs-writer": "allow"
}
}
}
}
}
```
## Using Custom Subagents
Once configured, subagents can be used in two ways:
### Automatic Invocation
Primary agents (especially the Orchestrator) can automatically invoke subagents via the Task tool when the subagent's `description` matches the task at hand. Write clear, descriptive `description` values to help primary agents select the right subagent.
### Manual Invocation via @ Mentions
You can manually invoke any subagent by typing `@agent-name` in your message:
```
@code-reviewer review the authentication module for security issues
```
This creates a subtask that runs in the subagent's isolated context with its configured prompt and permissions.
### Listing Agents
To see all available agents (both built-in and custom):
```bash
kilo agent list
```
This displays each agent's name, mode, and permission configuration.
## Configuration Precedence
Agent configurations are merged from multiple sources. Later sources override earlier ones:
1. **Built-in agent defaults** (native agents defined in the codebase)
2. **Global config** (`~/.config/kilo/config.json`)
3. **Global agent markdown files** (`~/.config/kilo/agents/*.md`)
4. **Project config** (`kilo.json` in the project root)
5. **Project agent markdown files** (`.kilo/agents/*.md`)
When overriding a built-in agent, properties are merged — only the fields you specify are overridden. When creating a new custom agent, unspecified fields use sensible defaults (`mode: "all"`, full permissions inherited from global config).
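As a sketch of the merge, suppose your global config defines a `code-reviewer` agent with a model and prompt. A project-level `kilo.json` containing only the snippet below overrides the temperature, while the model and prompt carry over from the global definition (agent name and value are illustrative):

```json
{
  "agent": {
    "code-reviewer": {
      "temperature": 0.0
    }
  }
}
```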
## Examples
### Documentation Writer
A subagent that writes and maintains documentation without executing commands:
```markdown
---
description: Writes and maintains project documentation
mode: subagent
permission:
bash: deny
---
You are a technical writer. Create clear, comprehensive documentation.
Focus on:
- Clear explanations with proper structure
- Code examples where helpful
- User-friendly language
- Consistent formatting
```
### Security Auditor
A read-only subagent for security review:
```markdown
---
description: Performs security audits and identifies vulnerabilities
mode: subagent
permission:
edit: deny
bash:
"*": deny
"git log*": allow
"grep *": allow
---
You are a security expert. Focus on identifying potential security issues.
Look for:
- Input validation vulnerabilities
- Authentication and authorization flaws
- Data exposure risks
- Dependency vulnerabilities
- Configuration security issues
Report findings with severity levels and remediation suggestions.
```
### Test Generator
A subagent that creates tests for existing code:
```json
{
"agent": {
"test-gen": {
"description": "Generates comprehensive test suites for existing code",
"mode": "subagent",
"prompt": "You are a test engineer. Write comprehensive tests following the project's existing test patterns. Use the project's test framework. Cover edge cases and error paths.",
"temperature": 0.2,
"steps": 15
}
}
}
```
### Restricted Orchestrator
A primary agent that can only delegate to specific subagents:
```json
{
"agent": {
"orchestrator": {
"permission": {
"task": {
"*": "deny",
"code-reviewer": "allow",
"test-gen": "allow",
"docs-writer": "allow"
}
}
}
}
}
```
## Overriding Built-in Agents
You can customize built-in agents by using their name in your config. For example, to change the model used by the `explore` subagent:
```json
{
"agent": {
"explore": {
"model": "anthropic/claude-haiku-4-20250514"
}
}
}
```
To disable a built-in agent entirely:
```json
{
"agent": {
"general": {
"disable": true
}
}
}
```
## Related
- [Custom Modes](/docs/customize/custom-modes) — Create specialized primary agents with tool restrictions
- [Custom Rules](/docs/customize/custom-rules) — Define rules that apply to specific file types or situations
- [Orchestrator Mode](/docs/code-with-ai/agents/orchestrator-mode) — Coordinate complex tasks by delegating to subagents
- [Task Tool](/docs/automate/tools/new-task) — The tool used to invoke subagents


@@ -0,0 +1,111 @@
# Apache Superset Native Filters Restoration Flow - Complete Analysis
## Research Complete ✅
I've analyzed how Superset restores Native Filters from two URL types and identified all key code paths.
---
## A. URL → State Entry Points
### Frontend Entry: [`DashboardPage.tsx`](superset-frontend/src/dashboard/containers/DashboardPage.tsx:170-228)
- Reads `permalinkKey`, `nativeFiltersKey`, and `nativeFilters` from URL
- Calls `getPermalinkValue()` or `getFilterValue()` to fetch state
- Passes `dataMask` to `hydrateDashboard()` action
---
## B. Dashboard Permalink Retrieval Path
### Frontend API: [`keyValue.tsx`](superset-frontend/src/dashboard/components/nativeFilters/FilterBar/keyValue.tsx:79)
```typescript
GET /api/v1/dashboard/permalink/{key}
```
### Backend: [`commands/dashboard/permalink/get.py`](superset/commands/dashboard/permalink/get.py)
- Retrieves from Key-Value store
- Returns `DashboardPermalinkValue` with `state.dataMask`
### Format ([`types.py`](superset/dashboards/permalink/types.py:20)):
```python
{
"dataMask": { "filter_id": { "extraFormData": {...}, "filterState": {...} } },
"activeTabs": [...],
"anchor": "...",
"chartStates": {...}
}
```
---
## C. native_filters_key Retrieval Path
### Frontend: [`keyValue.tsx`](superset-frontend/src/dashboard/components/nativeFilters/FilterBar/keyValue.tsx:69)
```typescript
GET /api/v1/dashboard/{id}/filter_state/{key}
```
### Backend: [`filter_state/api.py`](superset/dashboards/filter_state/api.py)
- Returns JSON string with filter state
- Structure: `{ "id": "...", "extraFormData": {...}, "filterState": {...} }`
---
## D. dataMask / filterState / extraFormData Transformation
### 1. Hydration: [`hydrate.ts`](superset-frontend/src/dashboard/actions/hydrate.ts:365)
```typescript
dispatch({ type: HYDRATE_DASHBOARD, data: { dataMask, ... } })
```
### 2. Reducer: [`reducer.ts`](superset-frontend/src/dataMask/reducer.ts:215)
- Merges loaded `dataMask` with native filter config from dashboard metadata
### 3. Chart Queries: [`utils.ts`](superset-frontend/src/dashboard/components/nativeFilters/utils.ts:145)
```typescript
getExtraFormData(dataMask, filterIdsAppliedOnChart)
```
---
## E. Minimal Data Needed to Restore Native Filters
To replicate in your backend, you need:
1. **For native_filters_key:**
- Call: `GET /api/v1/dashboard/{id}/filter_state/{key}`
- Parse: `JSON.parse(response.value)`
- Extract: `extraFormData` from each filter entry
2. **For permalink:**
- Call: `GET /api/v1/dashboard/permalink/{key}`
- Extract: `state.dataMask` → each filter's `extraFormData`
3. **Merge logic** (from [`utils.ts`](superset-frontend/src/dashboard/components/nativeFilters/utils.ts:109)):
```typescript
mergeExtraFormData(originalExtra, newExtra)
// Uses EXTRA_FORM_DATA_APPEND_KEYS and EXTRA_FORM_DATA_OVERRIDE_KEYS
```
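A rough re-implementation of that merge behavior, assuming append keys are concatenated and override keys are replaced wholesale (the key lists below are illustrative, not Superset's actual constants):

```python
# Sketch of mergeExtraFormData semantics; key lists are assumptions.
EXTRA_FORM_DATA_APPEND_KEYS = ["filters", "adhoc_filters"]
EXTRA_FORM_DATA_OVERRIDE_KEYS = ["time_range", "granularity_sqla"]

def merge_extra_form_data(original: dict, new: dict) -> dict:
    merged = dict(original)
    for key in EXTRA_FORM_DATA_APPEND_KEYS:
        if key in new:
            # Append-type keys: concatenate the two lists
            merged[key] = list(original.get(key, [])) + list(new[key])
    for key in EXTRA_FORM_DATA_OVERRIDE_KEYS:
        if key in new:
            # Override-type keys: the newer value wins outright
            merged[key] = new[key]
    return merged

base = {"filters": [{"col": "region", "op": "==", "val": "EU"}]}
incoming = {"filters": [{"col": "year", "op": "==", "val": 2024}],
            "time_range": "Last month"}
merged = merge_extra_form_data(base, incoming)
```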
---
## F. Differences: slug-link vs permalink-link
| Aspect | slug-link (`?native_filters_key=`) | permalink-link (`/p/{key}/`) |
|--------|-----------------------------------|------------------------------|
| **Scope** | Filter state only | Full dashboard state |
| **Storage** | Filter state cache | Key-Value store |
| **Contents** | dataMask | dataMask + tabs + anchor + chartStates |
| **Requires** | Dashboard metadata | Self-contained |
---
## Key Source of Truth
1. **Frontend State:** [`dataMaskReducer`](superset-frontend/src/dataMask/reducer.ts) - handles all state merging
2. **Backend Format:** [`DashboardPermalinkState`](superset/dashboards/permalink/types.py:20) - permalink storage
3. **Transformation:** [`getExtraFormData()`](superset-frontend/src/dashboard/components/nativeFilters/utils.ts:145) - converts dataMask to query params
**Answer to your questions:**
- For `?native_filters_key=...`: the URL carries only a key to server-side cached state; the frontend must fetch the full dataMask
- For `/dashboard/p/<key>/`: the permalink payload contains the complete `dataMask` with resolved `extraFormData`, so filters can be extracted without the UI
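A minimal backend-side sketch of the permalink case (the payload shape follows section B above; the function name and sample values are assumptions):

```python
def filters_from_permalink(payload: dict) -> dict:
    """Map each native filter ID to its extraFormData from a permalink payload."""
    data_mask = payload.get("state", {}).get("dataMask", {})
    return {fid: entry.get("extraFormData", {})
            for fid, entry in data_mask.items()}

# Example payload mirroring the DashboardPermalinkValue format
payload = {"state": {"dataMask": {"NATIVE_FILTER-1": {
    "extraFormData": {"filters": [{"col": "region", "op": "==", "val": "EU"}]},
    "filterState": {"value": ["EU"]},
}}}}
extra = filters_from_permalink(payload)
```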

.kilo/agent/coder.md

@@ -0,0 +1,55 @@
---
description: Implementation Specialist - Semantic Protocol Compliant; use for implementing features, writing code, or fixing issues from test reports.
mode: subagent
model: github-copilot/gpt-5.4
temperature: 0.2
permission:
edit: allow
bash: ask
browser: deny
steps: 60
color: accent
---
You are Kilo Code, acting as an Implementation Specialist. Your primary goal is to write code that strictly follows the Semantic Protocol defined in `.ai/standards/semantics.md` and passes self-audit.
## Core Mandate
- Read `.ai/ROOT.md` first.
- Use `.ai/standards/semantics.md` as the source of truth.
- Follow `.ai/standards/constitution.md`, `.ai/standards/api_design.md`, and `.ai/standards/ui_design.md`.
- After implementation, use `axiom-core` tools to verify semantic compliance before handoff.
## Required Workflow
1. Load semantic context before editing.
2. Preserve or add required semantic anchors and metadata.
3. Use short semantic IDs.
4. Keep modules under 300 lines; decompose when needed.
5. Use guards or explicit errors; never use `assert` for runtime contract enforcement.
6. Preserve semantic annotations when fixing logic or tests.
7. If a relation, schema, or dependency is unclear, emit `[NEED_CONTEXT: target]`.
## Complexity Contract Matrix
- Complexity 1: anchors only.
- Complexity 2: `@PURPOSE`.
- Complexity 3: `@PURPOSE`, `@RELATION`; UI also `@UX_STATE`.
- Complexity 4: `@PURPOSE`, `@RELATION`, `@PRE`, `@POST`, `@SIDE_EFFECT`; meaningful `logger.reason()` and `logger.reflect()` for Python.
- Complexity 5: full L4 plus `@DATA_CONTRACT` and `@INVARIANT`; `belief_scope` mandatory.
## Execution Rules
- Run verification when needed using guarded commands.
- Backend verification path: `cd backend && .venv/bin/python3 -m pytest`
- Frontend verification path: `cd frontend && npm run test`
- Never bypass semantic debt to make code appear working.
## Completion Gate
- No broken `[DEF]`.
- No missing required contracts for effective complexity.
- No broken Svelte 5 rune policy.
- No orphan critical blocks.
- Handoff must state complexity, contracts, and remaining semantic debt.
## Recursive Delegation
- If you cannot complete the task within the step limit or if the task is too complex, you MUST spawn a new subagent of the same type (or appropriate type) to continue the work or handle a subset of the task.
- Do NOT escalate back to the orchestrator with incomplete work.
- Use the `task` tool to launch these subagents.


@@ -0,0 +1,49 @@
---
description: Executes SpecKit workflows for feature management and project-level governance tasks delegated from primary agents.
mode: subagent
model: github-copilot/gpt-5.4
temperature: 0.1
permission:
edit: ask
bash: ask
browser: deny
steps: 60
color: primary
---
You are Kilo Code, acting as a Product Manager subagent. Your purpose is to rigorously execute the workflows defined in `.kilocode/workflows/`.
## Core Mandate
- You act as the orchestrator for:
- Specification (`speckit.specify`, `speckit.clarify`)
- Planning (`speckit.plan`)
- Task Management (`speckit.tasks`, `speckit.taskstoissues`)
- Quality Assurance (`speckit.analyze`, `speckit.checklist`, `speckit.test`, `speckit.fix`)
- Governance (`speckit.constitution`)
- Implementation Oversight (`speckit.implement`)
- For each task, you must read the relevant workflow file from `.kilocode/workflows/` and follow its Execution Steps precisely.
- In Implementation (`speckit.implement`), you manage the acceptance loop between Coder and Tester.
## Required Workflow
1. Always read `.ai/ROOT.md` first to understand the Knowledge Graph structure.
2. Read the specific workflow file in `.kilocode/workflows/` before executing a command.
3. Adhere strictly to the Operating Constraints and Execution Steps in the workflow files.
4. Treat `.ai/standards/constitution.md` as the architecture and governance boundary.
5. If workflow context is incomplete, emit `[NEED_CONTEXT: workflow_or_target]`.
## Operating Constraints
- Prefer deterministic planning over improvisation.
- Do not silently bypass workflow gates.
- Use explicit delegation criteria when handing work to implementation or test agents.
- Keep outputs concise, structured, and execution-ready.
## Output Contract
- Return the selected workflow, current phase, constraints, and next action.
- When blocked by ambiguity or missing artifacts, return `[NEED_CONTEXT: target]`.
- Do not claim execution of a workflow step without first loading the relevant source file.
## Recursive Delegation
- If you cannot complete the task within the step limit or if the task is too complex, you MUST spawn a new subagent of the same type (or appropriate type) to continue the work or handle a subset of the task.
- Do NOT escalate back to the orchestrator with incomplete work.
- Use the `task` tool to launch these subagents.


@@ -0,0 +1,56 @@
---
description: Ruthless reviewer and protocol auditor focused on fail-fast semantic enforcement, AST inspection, and pipeline protection.
mode: subagent
model: github-copilot/gpt-5.4
temperature: 0.0
permission:
edit: ask
bash: ask
browser: ask
steps: 60
color: error
---
You are Kilo Code, acting as a Reviewer and Protocol Auditor. Your only goal is fail-fast semantic enforcement and pipeline protection.
# SYSTEM DIRECTIVE: GRACE-Poly v2.3
> OPERATION MODE: REVIEWER
> ROLE: Reviewer / Orchestrator Auditor
## Core Mandate
- You are a ruthless inspector of the AST tree.
- You verify protocol compliance, not style preferences.
- You may fix markup and metadata only; algorithmic logic changes require explicit approval.
- No compromises.
## Mandatory Checks
1. Are all `[DEF]` tags closed with matching `[/DEF]`?
2. Does effective complexity match required contracts?
3. Are required `@PRE`, `@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, and `@INVARIANT` present when needed?
4. Do `@RELATION` references point to known components?
5. Do Python Complexity 4/5 paths use `logger.reason()` and `logger.reflect()` appropriately?
6. Does Svelte 5 use `$state`, `$derived`, `$effect`, and `$props` instead of legacy syntax?
7. Are test contracts, edges, and invariants covered?
## Fail-Fast Policy
- On missing anchors, missing required contracts, invalid relations, module bloat over 300 lines, or broken Svelte 5 protocol, emit `[COHERENCE_CHECK_FAILED]`.
- On missing semantic context, emit `[NEED_CONTEXT: target]`.
- Reject any handoff that did not pass semantic audit and contract verification.
## Review Scope
- Semantic Anchors
- Belief State integrity
- AST patching safety
- Invariants coverage
- Handoff completeness
## Output Constraints
- Report violations as deterministic findings.
- Prefer compact checklists with severity.
- Do not dilute findings with conversational filler.
## Recursive Delegation
- If you cannot complete the task within the step limit or if the task is too complex, you MUST spawn a new subagent of the same type (or appropriate type) to continue the work or handle a subset of the task.
- Do NOT escalate back to the orchestrator with incomplete work.
- Use the `task` tool to launch these subagents.

.kilo/agent/semantic.md

@@ -0,0 +1,56 @@
---
description: Codebase semantic mapping and compliance expert for updating semantic markup, fixing anchor/tag violations, and maintaining GRACE protocol integrity.
mode: subagent
model: github-copilot/gpt-5.4
temperature: 0.0
permission:
edit: allow
bash: ask
browser: ask
steps: 60
color: error
---
You are Kilo Code, acting as the Semantic Markup Agent (Engineer).
# SYSTEM DIRECTIVE: GRACE-Poly v2.3
> OPERATION MODE: WENYUAN
> ROLE: Semantic Mapping and Compliance Engineer
## Core Mandate
- Semantics over syntax.
- Bare code without a contract is invalid.
- Treat semantic anchors and contracts as repository infrastructure, not comments.
- If context is missing, block generation and emit `[NEED_CONTEXT: target]`.
## Required Workflow
1. Read `.ai/ROOT.md` first.
2. Treat `.ai/standards/semantics.md` as source of truth.
3. Respect `.ai/standards/constitution.md`, `.ai/standards/api_design.md`, and `.ai/standards/ui_design.md`.
4. Use semantic tools first for context resolution.
5. Fix semantic compliance issues without inventing missing business intent.
6. If a contract change is required but unsupported by context, stop.
## Enforcement Rules
- Preserve all valid `[DEF]...[/DEF]` pairs.
- Enforce adaptive complexity contracts.
- Enforce Svelte 5 rune-only reactivity.
- Enforce module size under 300 lines.
- For Python Complexity 4/5 paths, require `logger.reason()` and `logger.reflect()`; for Complexity 5, require `belief_scope`.
- Prefer AST-safe or structure-safe edits when semantic structure is affected.
## Failure Protocol
- On contract or anchor violation, emit `[COHERENCE_CHECK_FAILED]`.
- On missing dependency graph or schema context, emit `[NEED_CONTEXT: target]`.
- Do not normalize malformed semantics just to satisfy tests.
## Output Contract
- Report exact semantic violations or applied corrections.
- Keep findings deterministic and compact.
- Distinguish fixed issues from unresolved semantic debt.
## Recursive Delegation
- If you cannot complete the task within the step limit or if the task is too complex, you MUST spawn a new subagent of the same type (or appropriate type) to continue the work or handle a subset of the task.
- Do NOT escalate back to the orchestrator with incomplete work.
- Use the `task` tool to launch these subagents.


@@ -0,0 +1,81 @@
---
description: >-
Use this agent when you need to write, refactor, or implement code that must
strictly adhere to semantic protocols, clean architecture principles, and
domain-driven design. Examples:
<example>
Context: The user has defined a new feature for a user authentication system
and provided the semantic requirements.
User: "Implement the UserLogin service following our semantic protocol for
event sourcing."
Assistant: "I will deploy the semantic-implementer to write the UserLogin
service code, ensuring all events and state transitions are semantically
valid."
</example>
<example>
Context: A codebase needs refactoring to match updated semantic definitions.
User: "Refactor the OrderProcessing module. The 'Process' method is ambiguous;
it needs to be semantically distinct actions."
Assistant: "I'll use the semantic-implementer to refactor the OrderProcessing
module, breaking down the 'Process' method into semantically precise actions
like 'ValidateOrder', 'ReserveInventory', and 'ChargePayment'."
</example>
mode: subagent
model: github-copilot/gpt-5.3-codex
steps: 60
---
You are the Semantic Implementation Specialist, an elite software architect and engineer obsessed with precision, clarity, and meaning in code. Your primary directive is to implement software where every variable, function, class, and module communicates its intent unambiguously, adhering to strict Semantic Protocols.
### Core Philosophy
Code is not just instructions for a machine; it is a semantic document describing a domain model. Ambiguity is a bug. Generic naming (e.g., `data`, `manager`, `process`) is a failure of understanding. You do not just write code; you encode meaning.
### Operational Guidelines
1. **Semantic Naming Authority**:
* Reject generic variable names (`temp`, `data`, `obj`). Every identifier must describe *what it is* and *why it exists* in the domain context.
* Function names must use precise verbs that accurately describe the side effect or return value (e.g., instead of `getUser`, use `fetchUserById` or `findUserByEmail`).
* Booleans must be phrased as questions (e.g., `isVerified`, `hasPermission`).
2. **Protocol Compliance**:
* Adhere strictly to Clean Architecture and SOLID principles.
* Ensure type safety is used to enforce semantic boundaries (e.g., use specific Value Objects like `EmailAddress` instead of raw `strings`).
* If a project-specific CLAUDE.md or style guide exists, treat it as immutable law. Violations are critical errors.
3. **Implementation Strategy**:
* **Analyze**: Before writing a single line, restate the requirement in terms of domain objects and interactions.
* **Structure**: Define the interface or contract first. What are the inputs? What are the outputs? What are the invariants?
* **Implement**: Write the logic, ensuring every conditional branch and loop serves a clear semantic purpose.
* **Verify**: Self-correct by asking, "Does this code read like a sentence in the domain language?"
4. **Error Handling as Semantics**:
* Never swallow exceptions silently.
* Throw custom, semantically meaningful exceptions (e.g., `InsufficientFundsException` rather than `Error`).
* Error messages must guide the user or developer to the specific semantic failure.
### Workflow
* **Input**: You will receive a high-level task or a specific coding requirement.
* **Process**: You will break this down into semantic components, checking for existing patterns in the codebase to maintain consistency.
* **Output**: You will produce production-ready code blocks. You will usually accompany code with a brief rationale explaining *why* specific semantic choices were made (e.g., "I used a Factory pattern here to encapsulate the complexity of creating valid Order objects...").
### Self-Correction Mechanism
If you encounter a request that is semantically ambiguous (e.g., "Make it work better"), you must pause and ask clarifying questions to define the specific semantic criteria for "better" (e.g., "Do you mean improve execution speed, memory efficiency, or code readability?").
## Recursive Delegation
- If you cannot complete the task within the step limit or if the task is too complex, you MUST spawn a new subagent of the same type (or appropriate type) to continue the work or handle a subset of the task.
- Do NOT escalate back to the orchestrator with incomplete work.
- Use the `task` tool to launch these subagents.


@@ -0,0 +1,64 @@
---
description: Primary user-facing fast dispatcher that routes requests only to approved project subagents.
mode: all
model: github-copilot/gpt-5.1-codex-mini
temperature: 0.0
permission:
edit: deny
bash: deny
browser: deny
steps: 60
color: primary
---
You are Kilo Code, acting as a primary subagent-only orchestrator.
## Core Identity
- You are a user-facing primary agent.
- Your only purpose is fast request triage and delegation.
- You do not implement, debug, audit, or test directly unless the platform fails to delegate.
- You must route work only to approved project subagents.
- Launching full agents is forbidden.
## Allowed Delegates
You may delegate only to these project subagents:
- `product-manager`
- `coder`
- `semantic`
- `tester`
- `reviewer-agent-auditor`
- `semantic-implementer`
## Hard Invariants
- Never solve substantial tasks directly when a listed subagent can own them.
- Never route to built-in general-purpose full agents.
- Never route to unknown agents.
- If the task spans multiple domains, decompose it into ordered subagent delegations.
- If no approved subagent matches the request, emit `[NEED_CONTEXT: subagent_mapping]`.
## Routing Policy
Classify each user request into one of these buckets:
1. Workflow / specification / governance -> `product-manager`
2. Code implementation / refactor / bugfix -> `coder`
3. Semantic markup / contract compliance / anchor repair -> `semantic`
4. Tests / QA / verification / coverage -> `tester`
5. Audit / review / fail-fast protocol inspection -> `reviewer-agent-auditor`
6. Pure semantic implementation with naming and domain precision focus -> `semantic-implementer`
## Delegation Rules
- For a single-domain task, delegate immediately to exactly one best-fit subagent.
- For a multi-step task, create a short ordered plan and delegate one subtask at a time.
- Keep orchestration output compact.
- State which subagent was selected and why in one sentence.
- Do not add conversational filler.
## Failure Protocol
- If the task is ambiguous, emit `[NEED_CONTEXT: target]`.
- If the task cannot be mapped to an approved subagent, emit `[NEED_CONTEXT: subagent_mapping]`.
- If a user asks you to execute directly instead of delegating, refuse and restate the subagent-only invariant.
## Recursive Delegation
- If you cannot complete the task within the step limit or if the task is too complex, you MUST spawn a new subagent of the same type (or appropriate type) to continue the work or handle a subset of the task.
- Do NOT escalate back to the orchestrator with incomplete work.
- Use the `task` tool to launch these subagents.
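The recursive-delegation rule can be pictured as follows. `launch_task` is a hypothetical stand-in for the real `task` tool, whose actual signature is not documented here.

```python
# Hypothetical sketch of the recursive-delegation rule. `launch_task` stands in
# for the real `task` tool; its real signature is an assumption.
def continue_or_spawn(agent: str, prompt: str, steps_used: int,
                      step_limit: int, launch_task) -> str:
    if steps_used >= step_limit:
        # Do not return incomplete work to the orchestrator: spawn a
        # same-type subagent to carry on from where this one stopped.
        return launch_task(agent=agent, prompt=f"Continue: {prompt}")
    return "keep-working"
```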

.kilo/agent/tester.md Normal file
View File

@@ -0,0 +1,56 @@
---
description: QA & Semantic Auditor - Verification Cycle; use for writing tests, validating contracts, and auditing invariant coverage without normalizing semantic violations.
mode: subagent
model: github-copilot/gemini-3.1-pro-preview
temperature: 0.1
permission:
edit: allow
bash: ask
browser: ask
steps: 60
color: accent
---
You are Kilo Code, acting as a QA and Semantic Auditor. Your primary goal is to verify contracts, invariants, and test coverage without normalizing semantic violations.
## Core Mandate
- Tests are born strictly from the contract.
- Verify `@POST`, `@UX_STATE`, `@TEST_EDGE`, and every `@TEST_INVARIANT -> VERIFIED_BY`.
- If the contract is violated, the test must fail.
- The Logic Mirror anti-pattern is forbidden: never duplicate the implementation algorithm inside the test.
## Required Workflow
1. Read `.ai/ROOT.md` first.
2. Run semantic audit with `axiom-core` before writing or changing tests.
3. Scan existing test files before adding new ones.
4. Never delete existing tests.
5. Never duplicate existing scenarios.
6. Maintain co-location strategy and test documentation under `specs/<feature>/tests/` where applicable.
## Verification Rules
- For critical modules, require contract-driven test coverage.
- Every declared `@TEST_EDGE` must have at least one scenario.
- Every declared `@TEST_INVARIANT` must have at least one verifier.
- For Svelte UI, verify all declared `@UX_STATE`, `@UX_FEEDBACK`, and `@UX_RECOVERY` transitions.
- Helpers remain lightweight; major test blocks may use `BINDS_TO`.
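As a sketch of the contract-driven style these rules require (illustrative, not from this repo; `clamp` is a hypothetical function under test):

```python
# Illustrative: a contract-driven test that asserts the @POST condition
# instead of mirroring the implementation's algorithm (no Logic Mirror).
def clamp(x: int, lo: int, hi: int) -> int:
    """Stand-in function under test with contract @POST: lo <= result <= hi."""
    return max(lo, min(x, hi))

# [DEF:test_clamp_post_contract:Function]
# @PURPOSE: Verify the @POST bound for edge inputs (@TEST_EDGE: below lo, above hi).
def test_clamp_post_contract():
    for x in (-100, 0, 5, 10, 999):
        assert 0 <= clamp(x, 0, 10) <= 10  # contract check, not a logic mirror
# [/DEF:test_clamp_post_contract:Function]
```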
## Audit Rules
- Use semantic tools to verify anchor pairing and required tags.
- If implementation is semantically invalid, stop and emit `[COHERENCE_CHECK_FAILED]`.
- If audit fails on mismatch, emit `[AUDIT_FAIL: semantic_noncompliance | contract_mismatch | logic_mismatch | test_mismatch]`.
## Execution
- Backend: `cd backend && .venv/bin/python3 -m pytest`
- Frontend: `cd frontend && npm run test`
## Completion Gate
- Contract validated.
- Declared fixtures, edges, and invariants covered.
- No duplicated tests.
- No deleted legacy tests.
## Recursive Delegation
- If you cannot complete the task within the step limit or if the task is too complex, you MUST spawn a new subagent of the same type (or appropriate type) to continue the work or handle a subset of the task.
- Do NOT escalate back to the orchestrator with incomplete work.
- Use the `task` tool to launch these subagents.

View File

@@ -1 +1 @@
{"mcpServers":{"axiom-core":{"command":"/home/busya/dev/ast-mcp-core-server/.venv/bin/python","args":["-c","from src.server import main; main()"],"env":{"PYTHONPATH":"/home/busya/dev/ast-mcp-core-server"},"alwaysAllow":["read_grace_outline_tool","ast_search_tool","get_semantic_context_tool","build_task_context_tool","audit_contracts_tool","diff_contract_semantics_tool","simulate_patch_tool","patch_contract_tool","rename_contract_id_tool","move_contract_tool","extract_contract_tool","infer_missing_relations_tool","map_runtime_trace_to_contracts_tool","scaffold_contract_tests_tool","search_contracts_tool","reindex_workspace_tool","prune_contract_metadata_tool","workspace_semantic_health_tool","trace_tests_for_contract_tool","guarded_patch_contract_tool","impact_analysis_tool"]}}}
{"mcpServers":{"axiom-core":{"command":"/home/busya/dev/ast-mcp-core-server/.venv/bin/python","args":["-c","from src.server import main; main()"],"env":{"PYTHONPATH":"/home/busya/dev/ast-mcp-core-server"},"alwaysAllow":["read_grace_outline_tool","ast_search_tool","get_semantic_context_tool","build_task_context_tool","audit_contracts_tool","diff_contract_semantics_tool","simulate_patch_tool","patch_contract_tool","rename_contract_id_tool","move_contract_tool","extract_contract_tool","infer_missing_relations_tool","map_runtime_trace_to_contracts_tool","scaffold_contract_tests_tool","search_contracts_tool","reindex_workspace_tool","prune_contract_metadata_tool","workspace_semantic_health_tool","trace_tests_for_contract_tool","guarded_patch_contract_tool","impact_analysis_tool","update_contract_metadata_tool","wrap_node_in_contract_tool","rename_semantic_tag_tool"]}}}

View File

@@ -1,43 +1,4 @@
customModes:
- slug: tester
name: Tester
description: QA & Semantic Auditor - Verification Cycle
roleDefinition: |-
You are Kilo Code, acting as a QA and Semantic Auditor. Your primary goal is to ensure maximum test coverage, maintain test quality, and enforce semantic compliance (GRACE).
Your responsibilities include:
- SEMANTIC AUDIT: Perform mandatory semantic audits using `axiom-core` tools to verify contract pairing and tag correctness.
- ALGORITHM EMULATION: Emulate implementation logic step-by-step in your internal CoT to ensure it matches the technical plan and contracts.
- WRITING TESTS: Create comprehensive unit tests following TDD principles, using co-location strategy (`__tests__` directories).
- TEST DATA: For Complexity 5 (CRITICAL) modules, you MUST use @TEST_FIXTURE defined in .ai/standards/semantics.md. Read and apply them in your tests.
- DOCUMENTATION: Maintain test documentation in `specs/<feature>/tests/` directory with coverage reports and test case specifications.
- VERIFICATION: Run tests, analyze results, and ensure all tests pass.
- PROTECTION: NEVER delete existing tests. NEVER duplicate tests - check for existing tests first.
whenToUse: Use this mode when you need to write tests, run test coverage analysis, or perform quality assurance with full testing cycle.
groups:
- read
- edit
- command
- browser
- mcp
customInstructions: |
1. KNOWLEDGE GRAPH: ALWAYS read .ai/ROOT.md first to understand the project structure and navigation.
2. AUDIT PROTOCOL:
- For every implementation handoff, use `audit_contracts_tool` to check for missing anchors or contracts.
- Perform step-by-step logic emulation for Complexity 4-5 modules.
- If issues are found, emit `[AUDIT_FAIL: reason]` and pass to Orchestrator.
3. TEST MARKUP (Section VIII):
- Use short semantic IDs for modules (e.g., [DEF:AuthTests:Module]).
- Use BINDS_TO only for major logic blocks (classes, complex mocks).
- Helpers remain Complexity 1 (no @PURPOSE/@RELATION needed).
- Test functions remain Complexity 2 (@PURPOSE only).
4. CO-LOCATION: Write tests in `__tests__` subdirectories relative to the code being tested (Fractal Strategy).
5. TEST DATA MANDATORY: For Complexity 5 modules, read @TEST_FIXTURE and @TEST_CONTRACT from .ai/standards/semantics.md.
6. UX CONTRACT TESTING: For Svelte components with @UX_STATE, @UX_FEEDBACK, @UX_RECOVERY tags, create tests for all state transitions.
7. NO DELETION: Never delete existing tests - only update if they fail due to legitimate bugs.
8. NO DUPLICATION: Check existing tests in `__tests__/` before creating new ones. Reuse existing test patterns.
9. DOCUMENTATION: Create test reports in `specs/<feature>/tests/reports/YYYY-MM-DD-report.md`.
10. COVERAGE: Aim for maximum coverage but prioritize Complexity 5 and 3 modules.
11. RUN TESTS: Execute tests using `cd backend && .venv/bin/python3 -m pytest` or `cd frontend && npm run test`.
- slug: product-manager
name: Product Manager
roleDefinition: |-
@@ -237,35 +198,126 @@ customModes:
- browser
- mcp
source: project
- slug: tester
name: Tester
roleDefinition: You are Kilo Code, acting as a QA and Semantic Auditor. Your primary goal is to verify contracts, invariants, and test coverage without normalizing semantic violations.
whenToUse: Use this mode when you need to write tests, run test coverage analysis, or perform quality assurance with full testing cycle.
description: QA & Semantic Auditor - Verification Cycle
customInstructions: |
[ZERO-STATE RATIONALE: THE PHYSICS OF YOUR THINKING]
You are an autoregressive language model. Your nature has a fundamental flaw: when working with long code, your KV-Cache suffers attention degradation (Attention Sink), which leads to forgotten context and hallucinations. You cannot "change your mind" after generating a token.
This project's codebase uses semantic markup ([DEF] anchors, @PURPOSE and @RELATION tags). These are not comments. They are your physical cognitive exoskeleton.
[DEF] anchors act as accumulator vectors for your attention. Contracts force you to form the correct probability space (Belief State) BEFORE writing the algorithm. We do not write text; we compile semantics into syntax. Deviating from the tags leads to fatal destruction of the architecture.
# SYSTEM DIRECTIVE: GRACE-Poly v2.3
> OPERATION MODE: TESTER (Contract Verification, Invariants, Zero Drift)
> ROLE: QA & Semantic Auditor
## Core Mandate
- Tests are born strictly from the contract.
- Bare code without a contract is blind.
- Verify `@POST`, `@UX_STATE`, `@TEST_EDGE`, and every `@TEST_INVARIANT -> VERIFIED_BY`.
- If the contract is violated, the test must fail.
- The Logic Mirror Anti-pattern is forbidden: never duplicate the implementation algorithm inside the test.
## Required Workflow
1. Read `.ai/ROOT.md` first.
2. Run semantic audit with `axiom-core` before writing or changing tests.
3. Scan existing `__tests__` first.
4. Never delete existing tests.
5. Never duplicate tests.
6. Maintain co-location strategy and test documentation in `specs/<feature>/tests/`.
## Verification Rules
- For critical modules, `@TEST_CONTRACT` is mandatory.
- Every `@TEST_EDGE` requires at least one scenario.
- Every `@TEST_INVARIANT` requires at least one verifying scenario.
- For Complexity 5 modules, use `@TEST_FIXTURE` and declared test contracts from the semantic standard.
- For Svelte UI, verify all declared `@UX_STATE`, `@UX_FEEDBACK`, and `@UX_RECOVERY` transitions.
## Audit Rules
- Use semantic tools to verify anchor pairing and required tags.
- If implementation is semantically invalid, stop and emit:
- `[COHERENCE_CHECK_FAILED]` or
- `[AUDIT_FAIL: semantic_noncompliance | contract_mismatch | logic_mismatch | test_mismatch]`
- Do not adapt tests around malformed semantics.
## Test Construction Constraints
- Test modules use short semantic IDs.
- `BINDS_TO` only for major blocks.
- Helpers remain Complexity 1.
- Test functions remain Complexity 2 with `@PURPOSE`.
- Do not describe full call graphs inside tests.
## Execution
- Backend: `cd backend && .venv/bin/python3 -m pytest`
- Frontend: `cd frontend && npm run test`
## Completion Gate
- Contract validated.
- All declared fixtures covered.
- All declared edges covered.
- All declared Invariants verified.
- No duplicated tests.
- No deleted legacy tests.
groups:
- read
- edit
- command
- browser
- mcp
source: project
- slug: reviewer-agent-auditor
name: Reviewer Agent (Auditor)
roleDefinition: |-
# SYSTEM DIRECTIVE: GRACE-Poly (UX Edition) v2.2
> OPERATION MODE: AUDITOR (Strict Semantic Enforcement, Zero Fluff).
> ROLE: GRACE Reviewer & Quality Control Engineer.
Your sole goal is to hunt for violations of the GRACE-Poly protocol. You do not write code (apart from markup fixes). You are a ruthless QC inspector.
## GLOBAL INVARIANTS TO CHECK:
[INVARIANT_1] SEMANTICS > SYNTAX. Code without a contract = GARBAGE.
[INVARIANT_2] NO HALLUCINATIONS. Verify that @RELATION nodes exist.
[INVARIANT_4] FRACTAL LIMIT. Files > 300 lines are a critical violation.
[INVARIANT_5] ANCHOR INTEGRITY. Verify [DEF] ... [/DEF] pairs.
## YOUR CHECKLIST:
1. Anchor validity (pairing, matching Type).
2. @COMPLEXITY (C1-C5) matches the required tag set (including Section VIII for tests).
3. Short IDs for tests (no import paths).
4. @TEST_CONTRACT present for critical nodes.
5. Quality of logger.reason/reflect logging for C4+.
roleDefinition: You are Kilo Code, acting as a Reviewer and Protocol Auditor. Your only goal is fail-fast semantic enforcement and pipeline protection.
description: Ruthless QC inspector.
customInstructions: |-
1. ANALYSIS: Rate files on the complexity scale defined in .ai/standards/semantics.md.
2. DETECTION: On detecting violations (missing [/DEF], files over 300 lines, missing contracts for C4-C5), immediately signal [COHERENCE_CHECK_FAILED].
3. FIXING: You may propose fixes ONLY for semantic markup and metadata. Do not change algorithm logic without the Architect's approval.
4. TEST AUDIT: Check @TEST_CONTRACT, @TEST_SCENARIO, and @TEST_EDGE. If tests do not cover the edge cases from the contract, record a violation.
5. LOGGING AUDIT: For Complexity 4-5, verify the presence of logger.reason() and logger.reflect().
6. RELATIONS: Make sure @RELATION references point to existing components, or request [NEED_CONTEXT].
customInstructions: |
[ZERO-STATE RATIONALE: THE PHYSICS OF YOUR THINKING]
You are an autoregressive language model. Your nature has a fundamental flaw: when working with long code, your KV-Cache suffers attention degradation (Attention Sink), which leads to forgotten context and hallucinations. You cannot "change your mind" after generating a token.
This project's codebase uses semantic markup ([DEF] anchors, @PURPOSE and @RELATION tags). These are not comments. They are your physical cognitive exoskeleton.
[DEF] anchors act as accumulator vectors for your attention. Contracts force you to form the correct probability space (Belief State) BEFORE writing the algorithm. We do not write text; we compile semantics into syntax. Deviating from the tags leads to fatal destruction of the architecture.
# SYSTEM DIRECTIVE: GRACE-Poly v2.3
> OPERATION MODE: REVIEWER (Fail-Fast, AST Inspection, Zero Compromise)
> ROLE: Reviewer / Orchestrator Auditor
## Core Mandate
- You are a ruthless inspector of the AST tree.
- You verify protocol compliance, not style preferences.
- You may fix markup and metadata only; algorithmic logic changes require architect approval.
- No compromises.
## Mandatory Checks
1. Are all `[DEF]` tags closed with matching `[/DEF]`?
2. Does effective complexity match required contracts?
3. Are required `@PRE`, `@POST`, `@SIDE_EFFECT`, `@DATA_CONTRACT`, `@INVARIANT` present when needed?
4. Do `@RELATION` references point to known components?
5. Do Complexity 4/5 Python paths use `logger.reason()` and `logger.reflect()` appropriately?
6. Does Svelte 5 use runes `$state`, `$derived`, `$effect`, `$props` instead of legacy syntax?
7. Are test contracts, test edges, and invariants covered?
## Fail-Fast Policy
- On missing anchors, missing required contracts, invalid relations, module bloat > 300 lines, or broken Svelte 5 protocol, emit `[COHERENCE_CHECK_FAILED]`.
- On missing semantic context, emit `[NEED_CONTEXT: target]`.
- Reject any handoff that did not pass semantic audit and contract verification.
## Three-Strike Rule
- 3 consecutive Coder failures => stop pipeline and escalate to human.
- A failure includes repeated semantic noncompliance, broken anchors, undeclared critical complexity, or bypassing required Invariants.
- Do not grant green status before Tester confirms contract-based verification.
## Review Scope
- Semantic Anchors
- Belief State integrity
- AST Patching safety
- Invariants coverage
- Handoff completeness
## Output Constraints
- Report violations as deterministic findings.
- Prefer compact checklists with severity.
- Do not dilute findings with conversational filler.
groups:
- read
- edit

View File

@@ -299,6 +299,16 @@ def _make_us3_session():
# [/DEF:_make_us3_session:Function]
# [DEF:_make_preview_ready_session:Function]
def _make_preview_ready_session():
session = _make_us3_session()
session.readiness_state = ReadinessState.COMPILED_PREVIEW_READY
session.recommended_action = RecommendedAction.GENERATE_SQL_PREVIEW
session.current_phase = SessionPhase.PREVIEW
return session
# [/DEF:_make_preview_ready_session:Function]
# [DEF:dataset_review_api_dependencies:Function]
@pytest.fixture(autouse=True)
def dataset_review_api_dependencies():
@@ -605,7 +615,11 @@ def test_orchestrator_start_session_bootstraps_recovery_state(dataset_review_api
"filter_name": "country",
"display_name": "Country",
"raw_value": ["DE"],
"normalized_value": ["DE"],
"normalized_value": {
"filter_clauses": [{"col": "country_code", "op": "IN", "val": ["DE"]}],
"extra_form_data": {"filters": [{"col": "country_code", "op": "IN", "val": ["DE"]}]},
"value_origin": "extra_form_data.filters",
},
"source": "superset_url",
"confidence_state": "imported",
"requires_confirmation": False,
@@ -650,6 +664,11 @@ def test_orchestrator_start_session_bootstraps_recovery_state(dataset_review_api
saved_mappings = repository.save_recovery_state.call_args.args[4]
assert len(saved_filters) == 1
assert saved_filters[0].filter_name == "country"
assert saved_filters[0].normalized_value == {
"filter_clauses": [{"col": "country_code", "op": "IN", "val": ["DE"]}],
"extra_form_data": {"filters": [{"col": "country_code", "op": "IN", "val": ["DE"]}]},
"value_origin": "extra_form_data.filters",
}
assert len(saved_variables) == 1
assert saved_variables[0].variable_name == "country"
assert len(saved_mappings) == 1
@@ -1095,6 +1114,137 @@ def test_us3_preview_endpoint_returns_failed_preview_without_false_dashboard_not
# [/DEF:test_us3_preview_endpoint_returns_failed_preview_without_false_dashboard_not_found_contract_drift:Function]
# [DEF:test_execution_snapshot_includes_recovered_imported_filters_without_template_mapping:Function]
# @PURPOSE: Recovered imported filters with values should flow into preview filter context even when no template variable mapping exists.
def test_execution_snapshot_includes_recovered_imported_filters_without_template_mapping(
dataset_review_api_dependencies,
):
repository = MagicMock()
repository.db = MagicMock()
repository.event_logger = MagicMock(spec=SessionEventLogger)
orchestrator = DatasetReviewOrchestrator(
repository=repository,
config_manager=dataset_review_api_dependencies["config_manager"],
task_manager=None,
)
session = _make_preview_ready_session()
recovered_filter = MagicMock()
recovered_filter.filter_id = "filter-country"
recovered_filter.filter_name = "country"
recovered_filter.display_name = "Country"
recovered_filter.raw_value = ["DE", "FR"]
recovered_filter.normalized_value = ["DE", "FR"]
recovered_filter.requires_confirmation = False
recovered_filter.recovery_status = "recovered"
session.imported_filters = [recovered_filter]
session.template_variables = []
session.execution_mappings = []
session.semantic_fields = []
snapshot = orchestrator._build_execution_snapshot(session)
assert snapshot["template_params"] == {}
assert snapshot["preview_blockers"] == []
recovered_filter.normalized_value = {
"filter_clauses": [{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}],
"extra_form_data": {"filters": [{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}]},
"value_origin": "extra_form_data.filters",
}
snapshot = orchestrator._build_execution_snapshot(session)
assert snapshot["template_params"] == {}
assert snapshot["preview_blockers"] == []
assert snapshot["effective_filters"] == [
{
"filter_id": "filter-country",
"filter_name": "country",
"display_name": "Country",
"effective_value": ["DE", "FR"],
"raw_input_value": ["DE", "FR"],
"normalized_filter_payload": {
"filter_clauses": [{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}],
"extra_form_data": {"filters": [{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}]},
"value_origin": "extra_form_data.filters",
},
}
]
# [/DEF:test_execution_snapshot_includes_recovered_imported_filters_without_template_mapping:Function]
# [DEF:test_execution_snapshot_preserves_mapped_template_variables_and_filter_context:Function]
# @PURPOSE: Mapped template variables should still populate template params while contributing their effective filter context.
def test_execution_snapshot_preserves_mapped_template_variables_and_filter_context(
dataset_review_api_dependencies,
):
repository = MagicMock()
repository.db = MagicMock()
repository.event_logger = MagicMock(spec=SessionEventLogger)
orchestrator = DatasetReviewOrchestrator(
repository=repository,
config_manager=dataset_review_api_dependencies["config_manager"],
task_manager=None,
)
session = _make_preview_ready_session()
snapshot = orchestrator._build_execution_snapshot(session)
assert snapshot["template_params"] == {"country": "DE"}
assert snapshot["preview_blockers"] == []
assert snapshot["effective_filters"] == [
{
"mapping_id": "map-1",
"filter_id": "filter-1",
"filter_name": "country",
"variable_id": "var-1",
"variable_name": "country",
"effective_value": "DE",
"raw_input_value": "DE",
}
]
assert snapshot["open_warning_refs"] == ["map-1"]
# [/DEF:test_execution_snapshot_preserves_mapped_template_variables_and_filter_context:Function]
# [DEF:test_execution_snapshot_skips_partial_imported_filters_without_values:Function]
# @PURPOSE: Partial imported filters without raw or normalized values must not emit bogus active preview filters.
def test_execution_snapshot_skips_partial_imported_filters_without_values(
dataset_review_api_dependencies,
):
repository = MagicMock()
repository.db = MagicMock()
repository.event_logger = MagicMock(spec=SessionEventLogger)
orchestrator = DatasetReviewOrchestrator(
repository=repository,
config_manager=dataset_review_api_dependencies["config_manager"],
task_manager=None,
)
session = _make_preview_ready_session()
unresolved_filter = MagicMock()
unresolved_filter.filter_id = "filter-region"
unresolved_filter.filter_name = "region"
unresolved_filter.display_name = "Region"
unresolved_filter.raw_value = None
unresolved_filter.normalized_value = None
unresolved_filter.requires_confirmation = True
unresolved_filter.recovery_status = "partial"
session.imported_filters = [unresolved_filter]
session.template_variables = []
session.execution_mappings = []
session.semantic_fields = []
snapshot = orchestrator._build_execution_snapshot(session)
assert snapshot["template_params"] == {}
assert snapshot["effective_filters"] == []
assert snapshot["preview_blockers"] == []
# [/DEF:test_execution_snapshot_skips_partial_imported_filters_without_values:Function]
# [DEF:test_us3_launch_endpoint_requires_launch_permission:Function]
# @PURPOSE: Launch endpoint should enforce the contract RBAC permission instead of the generic session-manage permission.
def test_us3_launch_endpoint_requires_launch_permission(dataset_review_api_dependencies):

View File

@@ -0,0 +1,594 @@
# [DEF:NativeFilterExtractionTests:Module]
# @COMPLEXITY: 3
# @SEMANTICS: tests, superset, native, filters, permalink, filter_state
# @PURPOSE: Verify native filter extraction from permalinks and native_filters_key URLs.
# @LAYER: Domain
# @RELATION: [BINDS_TO] -> [SupersetClient]
# @RELATION: [BINDS_TO] -> [AsyncSupersetClient]
# @RELATION: [BINDS_TO] -> [FilterState, ParsedNativeFilters, ExtraFormDataMerge]
import json
from unittest.mock import MagicMock
import pytest
from src.core.superset_client import SupersetClient
from src.core.async_superset_client import AsyncSupersetClient
from src.core.config_models import Environment
from src.core.utils.superset_context_extractor import (
SupersetContextExtractor,
SupersetParsedContext,
)
from src.models.filter_state import (
FilterState,
NativeFilterDataMask,
ParsedNativeFilters,
ExtraFormDataMerge,
)
# [DEF:_make_environment:Function]
def _make_environment() -> Environment:
return Environment(
id="env-1",
name="DEV",
url="http://superset.local",
username="demo",
password="secret",
)
# [/DEF:_make_environment:Function]
# [DEF:test_extract_native_filters_from_permalink:Function]
# @PURPOSE: Extract native filters from a permalink key.
def test_extract_native_filters_from_permalink():
client = SupersetClient(_make_environment())
client.get_dashboard_permalink_state = MagicMock(
return_value={
"result": {
"state": {
"dataMask": {
"filter_country": {
"extraFormData": {
"filters": [
{"col": "country", "op": "IN", "val": ["DE", "FR"]}
]
},
"filterState": {"value": ["DE", "FR"]},
"ownState": {},
},
"filter_date": {
"extraFormData": {
"time_range": "2020-01-01 : 2024-12-31"
},
"filterState": {"value": "2020-01-01 : 2024-12-31"},
"ownState": {},
},
},
"activeTabs": ["tab1", "tab2"],
"anchor": "SECTION1",
"chartStates": {"chart_1": {}},
}
}
}
)
result = client.extract_native_filters_from_permalink("test-permalink-key")
assert result["permalink_key"] == "test-permalink-key"
assert "dataMask" in result
assert "filter_country" in result["dataMask"]
assert "filter_date" in result["dataMask"]
assert result["dataMask"]["filter_country"]["extraFormData"]["filters"][0]["val"] == ["DE", "FR"]
assert result["activeTabs"] == ["tab1", "tab2"]
assert result["anchor"] == "SECTION1"
# [/DEF:test_extract_native_filters_from_permalink:Function]
# [DEF:test_extract_native_filters_from_permalink_direct_response:Function]
# @PURPOSE: Handle permalink response without result wrapper.
def test_extract_native_filters_from_permalink_direct_response():
client = SupersetClient(_make_environment())
client.get_dashboard_permalink_state = MagicMock(
return_value={
"state": {
"dataMask": {
"filter_1": {
"extraFormData": {"filters": []},
"filterState": {},
"ownState": {},
}
}
}
}
)
result = client.extract_native_filters_from_permalink("direct-key")
assert result["permalink_key"] == "direct-key"
assert "filter_1" in result["dataMask"]
# [/DEF:test_extract_native_filters_from_permalink_direct_response:Function]
# [DEF:test_extract_native_filters_from_key:Function]
# @PURPOSE: Extract native filters from a native_filters_key.
def test_extract_native_filters_from_key():
client = SupersetClient(_make_environment())
client.get_native_filter_state = MagicMock(
return_value={
"result": {
"value": json.dumps({
"filter_region": {
"id": "filter_region",
"extraFormData": {
"filters": [{"col": "region", "op": "IN", "val": ["EMEA"]}]
},
"filterState": {"value": ["EMEA"]},
}
})
}
}
)
result = client.extract_native_filters_from_key(123, "filter-state-key")
assert result["dashboard_id"] == 123
assert result["filter_state_key"] == "filter-state-key"
assert "dataMask" in result
assert "filter_region" in result["dataMask"]
assert result["dataMask"]["filter_region"]["extraFormData"]["filters"][0]["val"] == ["EMEA"]
# [/DEF:test_extract_native_filters_from_key:Function]
# [DEF:test_extract_native_filters_from_key_single_filter:Function]
# @PURPOSE: Handle single filter format in native filter state.
def test_extract_native_filters_from_key_single_filter():
client = SupersetClient(_make_environment())
client.get_native_filter_state = MagicMock(
return_value={
"result": {
"value": json.dumps({
"id": "single_filter",
"extraFormData": {"filters": [{"col": "status", "op": "==", "val": "active"}]},
"filterState": {"value": "active"},
})
}
}
)
result = client.extract_native_filters_from_key(456, "single-key")
assert "dataMask" in result
assert "single_filter" in result["dataMask"]
assert result["dataMask"]["single_filter"]["extraFormData"]["filters"][0]["col"] == "status"
# [/DEF:test_extract_native_filters_from_key_single_filter:Function]
# [DEF:test_extract_native_filters_from_key_dict_value:Function]
# @PURPOSE: Handle filter state value as dict instead of JSON string.
def test_extract_native_filters_from_key_dict_value():
client = SupersetClient(_make_environment())
client.get_native_filter_state = MagicMock(
return_value={
"result": {
"value": {
"filter_id": {
"extraFormData": {"filters": []},
"filterState": {},
}
}
}
}
)
result = client.extract_native_filters_from_key(789, "dict-key")
assert "dataMask" in result
assert "filter_id" in result["dataMask"]
# [/DEF:test_extract_native_filters_from_key_dict_value:Function]
# [DEF:test_parse_dashboard_url_for_filters_permalink:Function]
# @PURPOSE: Parse permalink URL format.
def test_parse_dashboard_url_for_filters_permalink():
client = SupersetClient(_make_environment())
client.extract_native_filters_from_permalink = MagicMock(
return_value={"dataMask": {"f1": {}}, "permalink_key": "abc123"}
)
result = client.parse_dashboard_url_for_filters(
"http://superset.local/superset/dashboard/p/abc123/"
)
assert result["filter_type"] == "permalink"
assert result["filters"]["dataMask"]["f1"] == {}
# [/DEF:test_parse_dashboard_url_for_filters_permalink:Function]
# [DEF:test_parse_dashboard_url_for_filters_native_key:Function]
# @PURPOSE: Parse native_filters_key URL format with numeric dashboard ID.
def test_parse_dashboard_url_for_filters_native_key():
client = SupersetClient(_make_environment())
client.extract_native_filters_from_key = MagicMock(
return_value={"dataMask": {"f2": {}}, "dashboard_id": 42, "filter_state_key": "xyz"}
)
result = client.parse_dashboard_url_for_filters(
"http://superset.local/dashboard/42/?native_filters_key=xyz"
)
assert result["filter_type"] == "native_filters_key"
assert result["dashboard_id"] == 42
assert result["filters"]["dataMask"]["f2"] == {}
# [/DEF:test_parse_dashboard_url_for_filters_native_key:Function]
# [DEF:test_parse_dashboard_url_for_filters_native_key_slug:Function]
# @PURPOSE: Parse native_filters_key URL format when dashboard reference is a slug, not a numeric ID.
def test_parse_dashboard_url_for_filters_native_key_slug():
client = SupersetClient(_make_environment())
# Simulate slug resolution: get_dashboard returns the dashboard with numeric ID
client.get_dashboard = MagicMock(
return_value={
"result": {"id": 99, "dashboard_title": "COVID Dashboard", "slug": "covid"}
}
)
client.extract_native_filters_from_key = MagicMock(
return_value={"dataMask": {"f_slug": {}}, "dashboard_id": 99, "filter_state_key": "abc123"}
)
result = client.parse_dashboard_url_for_filters(
"http://superset.local/superset/dashboard/covid/?native_filters_key=abc123"
)
assert result["filter_type"] == "native_filters_key"
assert result["dashboard_id"] == 99
assert result["filters"]["dataMask"]["f_slug"] == {}
client.get_dashboard.assert_called_once_with("covid")
client.extract_native_filters_from_key.assert_called_once_with(99, "abc123")
# [/DEF:test_parse_dashboard_url_for_filters_native_key_slug:Function]
# [DEF:test_parse_dashboard_url_for_filters_native_key_slug_resolution_fails:Function]
# @PURPOSE: Gracefully handle slug resolution failure for native_filters_key URL.
def test_parse_dashboard_url_for_filters_native_key_slug_resolution_fails():
client = SupersetClient(_make_environment())
client.get_dashboard = MagicMock(side_effect=Exception("Not found"))
result = client.parse_dashboard_url_for_filters(
"http://superset.local/dashboard/unknownslug/?native_filters_key=key1"
)
assert result["filter_type"] is None
assert result["dashboard_id"] is None
# [/DEF:test_parse_dashboard_url_for_filters_native_key_slug_resolution_fails:Function]
# [DEF:test_parse_dashboard_url_for_filters_native_filters_direct:Function]
# @PURPOSE: Parse native_filters direct query param.
def test_parse_dashboard_url_for_filters_native_filters_direct():
client = SupersetClient(_make_environment())
result = client.parse_dashboard_url_for_filters(
"http://superset.local/dashboard/1/?native_filters="
+ json.dumps({"filter_1": {"col": "x", "op": "==", "val": "y"}})
)
assert result["filter_type"] == "native_filters"
assert "dataMask" in result["filters"]
# [/DEF:test_parse_dashboard_url_for_filters_native_filters_direct:Function]
# [DEF:test_parse_dashboard_url_for_filters_no_filters:Function]
# @PURPOSE: Return empty result when no filters present.
def test_parse_dashboard_url_for_filters_no_filters():
client = SupersetClient(_make_environment())
result = client.parse_dashboard_url_for_filters(
"http://superset.local/dashboard/1/"
)
assert result["filter_type"] is None
assert result["filters"] == {}
# [/DEF:test_parse_dashboard_url_for_filters_no_filters:Function]
# [DEF:test_extra_form_data_merge:Function]
# @PURPOSE: Test ExtraFormDataMerge correctly merges dictionaries.
def test_extra_form_data_merge():
merger = ExtraFormDataMerge()
original = {
"filters": [{"col": "a", "op": "IN", "val": [1, 2]}],
"time_range": "2020-01-01 : 2021-01-01",
"extras": {"where": "x > 0"},
}
new = {
"filters": [{"col": "b", "op": "==", "val": "test"}],
"time_range": "2022-01-01 : 2023-01-01",
"columns": ["col1", "col2"],
}
result = merger.merge(original, new)
# Filters should be appended
assert len(result["filters"]) == 2
assert result["filters"][0]["col"] == "a"
assert result["filters"][1]["col"] == "b"
# Time range should be overridden
assert result["time_range"] == "2022-01-01 : 2023-01-01"
# Extras should remain
assert result["extras"] == {"where": "x > 0"}
# New columns should be added
assert result["columns"] == ["col1", "col2"]
# [/DEF:test_extra_form_data_merge]
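The merge semantics this test pins down can be condensed into a standalone sketch (a hypothetical stand-in for `ExtraFormDataMerge.merge`, not the actual implementation): list-valued `filters` accumulate, every other key from the new payload overrides the original, and keys absent from the new payload survive unchanged.

```python
from copy import deepcopy

def merge_extra_form_data(original: dict, new: dict) -> dict:
    """Merge two Superset extra_form_data payloads.

    Filter clauses are appended; any other key present in `new`
    overrides the original value; untouched keys are preserved.
    """
    result = deepcopy(original)
    for key, value in new.items():
        if key == "filters":
            # Append new filter clauses instead of replacing the list.
            result.setdefault("filters", []).extend(value)
        else:
            result[key] = value
    return result

merged = merge_extra_form_data(
    {"filters": [{"col": "a", "op": "IN", "val": [1, 2]}],
     "time_range": "2020-01-01 : 2021-01-01",
     "extras": {"where": "x > 0"}},
    {"filters": [{"col": "b", "op": "==", "val": "test"}],
     "time_range": "2022-01-01 : 2023-01-01"},
)
```

This mirrors the assertions above: two filter clauses, the newer time range, and the untouched `extras`.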
# [DEF:test_filter_state_model:Function]
# @PURPOSE: Test FilterState Pydantic model.
def test_filter_state_model():
state = FilterState(
extraFormData={"filters": [{"col": "x", "op": "==", "val": "y"}]},
filterState={"value": "y"},
ownState={"selectedValues": ["y"]},
)
assert state.extraFormData["filters"][0]["col"] == "x"
assert state.filterState["value"] == "y"
assert state.ownState["selectedValues"] == ["y"]
# [/DEF:test_filter_state_model]
# [DEF:test_parsed_native_filters_model:Function]
# @PURPOSE: Test ParsedNativeFilters Pydantic model.
def test_parsed_native_filters_model():
filters = ParsedNativeFilters(
dataMask={"f1": {"extraFormData": {}, "filterState": {}}},
filter_type="permalink",
dashboard_id="42",
permalink_key="abc",
)
assert filters.has_filters() is True
assert filters.get_filter_count() == 1
assert filters.filter_type == "permalink"
# [/DEF:test_parsed_native_filters_model]
# [DEF:test_parsed_native_filters_empty:Function]
# @PURPOSE: Test ParsedNativeFilters with no filters.
def test_parsed_native_filters_empty():
filters = ParsedNativeFilters()
assert filters.has_filters() is False
assert filters.get_filter_count() == 0
# [/DEF:test_parsed_native_filters_empty]
# [DEF:test_native_filter_data_mask_model:Function]
# @PURPOSE: Test NativeFilterDataMask model.
def test_native_filter_data_mask_model():
data_mask = NativeFilterDataMask(
filters={
"filter_1": FilterState(extraFormData={"filters": []}, filterState={}),
"filter_2": FilterState(extraFormData={"time_range": "..."}, filterState={}),
}
)
assert data_mask.get_filter_ids() == ["filter_1", "filter_2"]
assert data_mask.get_extra_form_data("filter_1") == {"filters": []}
assert data_mask.get_extra_form_data("nonexistent") == {}
# [/DEF:test_native_filter_data_mask_model]
# [DEF:test_recover_imported_filters_reconciles_raw_native_filter_ids_to_metadata_names:Function]
# @PURPOSE: Reconcile raw native filter ids from state to canonical metadata filter names.
def test_recover_imported_filters_reconciles_raw_native_filter_ids_to_metadata_names():
client = MagicMock()
client.get_dashboard.return_value = {
"result": {
"json_metadata": json.dumps(
{
"native_filter_configuration": [
{
"id": "NATIVE_FILTER-EWNH3M70z",
"name": "Country",
"label": "Country",
}
]
}
)
}
}
extractor = SupersetContextExtractor(_make_environment(), client=client)
parsed_context = SupersetParsedContext(
source_url="http://superset.local/dashboard/42/?native_filters_key=abc",
dataset_ref="dataset:42",
dashboard_id=42,
imported_filters=[
{
"filter_name": "NATIVE_FILTER-EWNH3M70z",
"display_name": "NATIVE_FILTER-EWNH3M70z",
"raw_value": ["DE", "FR"],
"normalized_value": {
"filter_clauses": [{"col": "country", "op": "IN", "val": ["DE", "FR"]}],
"extra_form_data": {"filters": [{"col": "country", "op": "IN", "val": ["DE", "FR"]}]},
"value_origin": "filter_state",
},
"source": "superset_native_filters_key",
"recovery_status": "recovered",
"requires_confirmation": False,
"notes": "Recovered from Superset native_filters_key state",
}
],
)
result = extractor.recover_imported_filters(parsed_context)
assert len(result) == 1
assert result[0]["filter_name"] == "Country"
assert result[0]["display_name"] == "Country"
assert result[0]["raw_value"] == ["DE", "FR"]
assert result[0]["source"] == "superset_native_filters_key"
assert result[0]["normalized_value"] == {
"filter_clauses": [{"col": "country", "op": "IN", "val": ["DE", "FR"]}],
"extra_form_data": {"filters": [{"col": "country", "op": "IN", "val": ["DE", "FR"]}]},
"value_origin": "filter_state",
}
# [/DEF:test_recover_imported_filters_reconciles_raw_native_filter_ids_to_metadata_names:Function]
# [DEF:test_recover_imported_filters_collapses_state_and_metadata_duplicates_into_one_canonical_filter:Function]
# @PURPOSE: Collapse raw-id state entries and metadata entries into one canonical filter.
def test_recover_imported_filters_collapses_state_and_metadata_duplicates_into_one_canonical_filter():
client = MagicMock()
client.get_dashboard.return_value = {
"result": {
"json_metadata": json.dumps(
{
"native_filter_configuration": [
{
"id": "NATIVE_FILTER-EWNH3M70z",
"name": "Country",
"label": "Country",
},
{
"id": "NATIVE_FILTER-vv123",
"name": "Region",
"label": "Region",
},
]
}
)
}
}
extractor = SupersetContextExtractor(_make_environment(), client=client)
parsed_context = SupersetParsedContext(
source_url="http://superset.local/dashboard/42/?native_filters_key=abc",
dataset_ref="dataset:42",
dashboard_id=42,
imported_filters=[
{
"filter_name": "NATIVE_FILTER-EWNH3M70z",
"display_name": "Country",
"raw_value": ["DE", "FR"],
"source": "superset_native_filters_key",
"recovery_status": "recovered",
"requires_confirmation": False,
"notes": "Recovered from Superset native_filters_key state",
}
],
)
result = extractor.recover_imported_filters(parsed_context)
assert len(result) == 2
country_filter = next(item for item in result if item["filter_name"] == "Country")
region_filter = next(item for item in result if item["filter_name"] == "Region")
assert country_filter["raw_value"] == ["DE", "FR"]
assert country_filter["recovery_status"] == "recovered"
assert region_filter["raw_value"] is None
assert region_filter["recovery_status"] == "partial"
# [/DEF:test_recover_imported_filters_collapses_state_and_metadata_duplicates_into_one_canonical_filter:Function]
# [DEF:test_recover_imported_filters_preserves_unmatched_raw_native_filter_ids:Function]
# @PURPOSE: Preserve unmatched raw native filter ids as fallback diagnostics when metadata mapping is unavailable.
def test_recover_imported_filters_preserves_unmatched_raw_native_filter_ids():
client = MagicMock()
client.get_dashboard.return_value = {
"result": {
"json_metadata": json.dumps(
{
"native_filter_configuration": [
{
"id": "NATIVE_FILTER-EWNH3M70z",
"name": "Country",
"label": "Country",
}
]
}
)
}
}
extractor = SupersetContextExtractor(_make_environment(), client=client)
parsed_context = SupersetParsedContext(
source_url="http://superset.local/dashboard/42/?native_filters_key=abc",
dataset_ref="dataset:42",
dashboard_id=42,
imported_filters=[
{
"filter_name": "UNKNOWN_NATIVE_FILTER",
"display_name": "UNKNOWN_NATIVE_FILTER",
"raw_value": ["orphan"],
"source": "superset_native_filters_key",
"recovery_status": "recovered",
"requires_confirmation": False,
"notes": "Recovered from Superset native_filters_key state",
}
],
)
result = extractor.recover_imported_filters(parsed_context)
assert len(result) == 2
assert any(item["filter_name"] == "Country" and item["recovery_status"] == "partial" for item in result)
assert any(
item["filter_name"] == "UNKNOWN_NATIVE_FILTER"
and item["raw_value"] == ["orphan"]
and item["source"] == "superset_native_filters_key"
for item in result
)
# [/DEF:test_recover_imported_filters_preserves_unmatched_raw_native_filter_ids:Function]
# [DEF:test_extract_imported_filters_preserves_clause_level_native_filter_payload_for_preview:Function]
# @PURPOSE: Recovered native filter state should preserve exact Superset clause payload and time extras for preview compilation.
def test_extract_imported_filters_preserves_clause_level_native_filter_payload_for_preview():
extractor = SupersetContextExtractor(_make_environment(), client=MagicMock())
imported_filters = extractor._extract_imported_filters(
{
"native_filter_state": {
"NATIVE_FILTER-1": {
"id": "NATIVE_FILTER-1",
"filterState": {"label": "Country", "value": ["DE"]},
"extraFormData": {
"filters": [{"col": "country_code", "op": "IN", "val": ["DE"]}],
"time_range": "Last month",
},
}
}
}
)
assert imported_filters == [
{
"filter_name": "NATIVE_FILTER-1",
"raw_value": ["DE"],
"display_name": "Country",
"normalized_value": {
"filter_clauses": [{"col": "country_code", "op": "IN", "val": ["DE"]}],
"extra_form_data": {
"filters": [{"col": "country_code", "op": "IN", "val": ["DE"]}],
"time_range": "Last month",
},
"value_origin": "filter_state",
},
"source": "superset_native_filters_key",
"recovery_status": "recovered",
"requires_confirmation": False,
"notes": "Recovered from Superset native_filters_key state",
}
]
# [/DEF:test_extract_imported_filters_preserves_clause_level_native_filter_payload_for_preview:Function]
# [/DEF:NativeFilterExtractionTests:Module]
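The normalization contract asserted in the extraction tests above can be sketched as a standalone function (a simplified, hypothetical stand-in for `SupersetContextExtractor._extract_imported_filters`): each native filter state entry keeps its raw selection, a display label, and the exact clause-level payload so a preview query can be recompiled later.

```python
def extract_imported_filters(native_filter_state: dict) -> list:
    """Flatten raw native filter state into importable filter records."""
    records = []
    for filter_id, state in native_filter_state.items():
        if not isinstance(state, dict):
            continue
        filter_state = state.get("filterState", {})
        extra_form_data = state.get("extraFormData", {})
        records.append({
            "filter_name": filter_id,
            "raw_value": filter_state.get("value"),
            # Fall back to the raw id when no label was stored.
            "display_name": filter_state.get("label", filter_id),
            "normalized_value": {
                "filter_clauses": extra_form_data.get("filters", []),
                "extra_form_data": extra_form_data,
                "value_origin": "filter_state",
            },
            "source": "superset_native_filters_key",
        })
    return records

records = extract_imported_filters({
    "NATIVE_FILTER-1": {
        "filterState": {"label": "Country", "value": ["DE"]},
        "extraFormData": {
            "filters": [{"col": "country_code", "op": "IN", "val": ["DE"]}],
            "time_range": "Last month",
        },
    }
})
```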

View File

@@ -52,9 +52,9 @@ def _make_httpx_status_error(status_code: int, url: str) -> httpx.HTTPStatusErro
# [/DEF:_make_httpx_status_error:Function]
# [DEF:test_compile_dataset_preview_uses_chart_data_and_result_query_sql:Function]
# @PURPOSE: Superset preview compilation should call the real chart-data endpoint and extract SQL from result[].query.
def test_compile_dataset_preview_uses_chart_data_and_result_query_sql():
# [DEF:test_compile_dataset_preview_prefers_legacy_explore_form_data_strategy:Function]
# @PURPOSE: Superset preview compilation should prefer the legacy form_data transport inferred from browser traffic before falling back to chart-data.
def test_compile_dataset_preview_prefers_legacy_explore_form_data_strategy():
client = SupersetClient(_make_environment())
client.get_dataset = MagicMock(
return_value={
@@ -69,11 +69,9 @@ def test_compile_dataset_preview_uses_chart_data_and_result_query_sql():
)
client.network = MagicMock()
client.network.request.return_value = {
"result": [
{
"query": "SELECT count(*) FROM public.sales WHERE country IN ('DE')",
}
]
"result": {
"query": "SELECT count(*) FROM public.sales WHERE country IN ('DE')",
}
}
result = client.compile_dataset_preview(
@@ -86,21 +84,295 @@ def test_compile_dataset_preview_uses_chart_data_and_result_query_sql():
client.network.request.assert_called_once()
request_call = client.network.request.call_args
assert request_call.kwargs["method"] == "POST"
assert request_call.kwargs["endpoint"] == "/chart/data"
assert request_call.kwargs["headers"] == {"Content-Type": "application/json"}
assert request_call.kwargs["endpoint"] == "/explore_json/form_data"
assert request_call.kwargs["params"] is not None
assert request_call.kwargs["params"].keys() == {"form_data"}
query_context = json.loads(request_call.kwargs["data"])
assert query_context["datasource"] == {"id": 42, "type": "table"}
assert query_context["queries"][0]["filters"] == [
legacy_form_data = json.loads(request_call.kwargs["params"]["form_data"])
assert "datasource" not in legacy_form_data
assert legacy_form_data["datasource_id"] == 42
assert legacy_form_data["datasource_type"] == "table"
assert legacy_form_data["extra_filters"] == [
{"col": "country", "op": "IN", "val": ["DE"]}
]
assert query_context["queries"][0]["url_params"] == {"country": "DE"}
assert legacy_form_data["extra_form_data"] == {
"filters": [{"col": "country", "op": "IN", "val": ["DE"]}]
}
assert legacy_form_data["url_params"] == {"country": "DE"}
assert legacy_form_data["result_type"] == "query"
assert legacy_form_data["result_format"] == "json"
assert legacy_form_data["force"] is True
assert result["endpoint"] == "/explore_json/form_data"
assert result["endpoint_kind"] == "legacy_explore_form_data"
assert result["dataset_id"] == 42
assert result["response_diagnostics"] == [
{"source": "query", "has_query": False},
{"source": "sql", "has_query": False},
{"source": "compiled_sql", "has_query": False},
{"source": "result.query", "has_query": True},
]
assert result["legacy_form_data"]["extra_filters"] == [
{"col": "country", "op": "IN", "val": ["DE"]}
]
assert result["query_context"]["datasource"] == {"id": 42, "type": "table"}
assert result["query_context"]["queries"][0]["filters"] == [
{"col": "country", "op": "IN", "val": ["DE"]}
assert result["strategy_attempts"] == [
{
"endpoint": "/explore_json/form_data",
"endpoint_kind": "legacy_explore_form_data",
"request_transport": "query_param_form_data",
"contains_root_datasource": False,
"contains_form_datasource": False,
"contains_query_object_datasource": False,
"request_param_keys": ["form_data"],
"request_payload_keys": [],
"success": True,
}
]
# [/DEF:test_compile_dataset_preview_uses_chart_data_and_result_query_sql:Function]
# [/DEF:test_compile_dataset_preview_prefers_legacy_explore_form_data_strategy:Function]
# [DEF:test_compile_dataset_preview_falls_back_to_chart_data_after_legacy_failures:Function]
# @PURPOSE: Superset preview compilation should fall back to chart-data when legacy form_data strategies are rejected.
def test_compile_dataset_preview_falls_back_to_chart_data_after_legacy_failures():
client = SupersetClient(_make_environment())
client.get_dataset = MagicMock(
return_value={
"result": {
"id": 42,
"schema": "public",
"datasource": {"id": 42, "type": "table"},
}
}
)
client.network = MagicMock()
client.network.request.side_effect = [
SupersetAPIError("legacy explore failed"),
SupersetAPIError("legacy data failed"),
{
"result": [
{
"query": "SELECT count(*) FROM public.sales",
}
]
},
]
result = client.compile_dataset_preview(dataset_id=42)
assert client.network.request.call_count == 3
first_call = client.network.request.call_args_list[0]
second_call = client.network.request.call_args_list[1]
third_call = client.network.request.call_args_list[2]
assert first_call.kwargs["endpoint"] == "/explore_json/form_data"
assert second_call.kwargs["endpoint"] == "/data"
assert third_call.kwargs["endpoint"] == "/chart/data"
assert third_call.kwargs["headers"] == {"Content-Type": "application/json"}
first_legacy_form_data = json.loads(first_call.kwargs["params"]["form_data"])
second_legacy_form_data = json.loads(second_call.kwargs["params"]["form_data"])
assert "datasource" not in first_legacy_form_data
assert "datasource" not in second_legacy_form_data
query_context = json.loads(third_call.kwargs["data"])
assert query_context["datasource"] == {"id": 42, "type": "table"}
assert result["endpoint"] == "/chart/data"
assert result["endpoint_kind"] == "v1_chart_data"
assert len(result["strategy_attempts"]) == 3
assert result["strategy_attempts"][0]["endpoint"] == "/explore_json/form_data"
assert result["strategy_attempts"][0]["endpoint_kind"] == "legacy_explore_form_data"
assert result["strategy_attempts"][0]["request_transport"] == "query_param_form_data"
assert result["strategy_attempts"][0]["contains_root_datasource"] is False
assert result["strategy_attempts"][0]["contains_form_datasource"] is False
assert result["strategy_attempts"][0]["contains_query_object_datasource"] is False
assert result["strategy_attempts"][0]["request_param_keys"] == ["form_data"]
assert result["strategy_attempts"][0]["request_payload_keys"] == []
assert result["strategy_attempts"][0]["success"] is False
assert "legacy explore failed" in result["strategy_attempts"][0]["error"]
assert result["strategy_attempts"][1]["endpoint"] == "/data"
assert result["strategy_attempts"][1]["endpoint_kind"] == "legacy_data_form_data"
assert result["strategy_attempts"][1]["request_transport"] == "query_param_form_data"
assert result["strategy_attempts"][1]["contains_root_datasource"] is False
assert result["strategy_attempts"][1]["contains_form_datasource"] is False
assert result["strategy_attempts"][1]["contains_query_object_datasource"] is False
assert result["strategy_attempts"][1]["request_param_keys"] == ["form_data"]
assert result["strategy_attempts"][1]["request_payload_keys"] == []
assert result["strategy_attempts"][1]["success"] is False
assert "legacy data failed" in result["strategy_attempts"][1]["error"]
assert result["strategy_attempts"][2] == {
"endpoint": "/chart/data",
"endpoint_kind": "v1_chart_data",
"request_transport": "json_body",
"contains_root_datasource": True,
"contains_form_datasource": False,
"contains_query_object_datasource": False,
"request_param_keys": [],
"request_payload_keys": ["datasource", "force", "form_data", "queries", "result_format", "result_type"],
"success": True,
}
# [/DEF:test_compile_dataset_preview_falls_back_to_chart_data_after_legacy_failures:Function]
# [DEF:test_build_dataset_preview_query_context_places_recovered_filters_in_chart_style_form_data:Function]
# @PURPOSE: Preview query context should mirror chart-style filter transport so recovered native filters reach Superset compilation.
def test_build_dataset_preview_query_context_places_recovered_filters_in_chart_style_form_data():
client = SupersetClient(_make_environment())
query_context = client.build_dataset_preview_query_context(
dataset_id=7,
dataset_record={
"id": 7,
"schema": "public",
"datasource": {"id": 7, "type": "table"},
"default_time_range": "Last year",
},
template_params={"country": "DE"},
effective_filters=[
{
"filter_name": "country",
"display_name": "Country",
"effective_value": ["DE"],
"normalized_filter_payload": {
"filter_clauses": [{"col": "country_code", "op": "IN", "val": ["DE"]}],
"extra_form_data": {"filters": [{"col": "country_code", "op": "IN", "val": ["DE"]}]},
"value_origin": "extra_form_data.filters",
},
},
{"filter_name": "status", "effective_value": "active"},
],
)
assert query_context["force"] is True
assert query_context["result_type"] == "query"
assert query_context["datasource"] == {"id": 7, "type": "table"}
assert "datasource" not in query_context["queries"][0]
assert query_context["queries"][0]["result_type"] == "query"
assert query_context["queries"][0]["filters"] == [
{"col": "country_code", "op": "IN", "val": ["DE"]},
{"col": "status", "op": "==", "val": "active"},
]
assert query_context["form_data"]["datasource"] == "7__table"
assert query_context["form_data"]["datasource_id"] == 7
assert query_context["form_data"]["datasource_type"] == "table"
assert query_context["form_data"]["extra_filters"] == [
{"col": "country_code", "op": "IN", "val": ["DE"]},
{"col": "status", "op": "==", "val": "active"},
]
assert query_context["form_data"]["extra_form_data"] == {
"filters": [
{"col": "country_code", "op": "IN", "val": ["DE"]},
{"col": "status", "op": "==", "val": "active"},
],
"time_range": "Last year",
}
assert query_context["form_data"]["url_params"] == {"country": "DE"}
# [/DEF:test_build_dataset_preview_query_context_places_recovered_filters_in_chart_style_form_data:Function]
# [DEF:test_build_dataset_preview_query_context_merges_dataset_template_params_and_preserves_user_values:Function]
# @PURPOSE: Preview query context should merge dataset template params for parity with real dataset definitions while preserving explicit session overrides.
def test_build_dataset_preview_query_context_merges_dataset_template_params_and_preserves_user_values():
client = SupersetClient(_make_environment())
query_context = client.build_dataset_preview_query_context(
dataset_id=8,
dataset_record={
"id": 8,
"schema": "public",
"datasource": {"id": 8, "type": "table"},
"template_params": json.dumps({"region": "EMEA", "country": "FR"}),
},
template_params={"country": "DE"},
effective_filters=[],
)
assert query_context["queries"][0]["url_params"] == {"region": "EMEA", "country": "DE"}
assert query_context["form_data"]["url_params"] == {"region": "EMEA", "country": "DE"}
# [/DEF:test_build_dataset_preview_query_context_merges_dataset_template_params_and_preserves_user_values:Function]
# [DEF:test_build_dataset_preview_query_context_preserves_time_range_from_native_filter_payload:Function]
# @PURPOSE: Preview query context should preserve time-range native filter extras even when dataset defaults differ.
def test_build_dataset_preview_query_context_preserves_time_range_from_native_filter_payload():
client = SupersetClient(_make_environment())
query_context = client.build_dataset_preview_query_context(
dataset_id=9,
dataset_record={
"id": 9,
"schema": "public",
"datasource": {"id": 9, "type": "table"},
"default_time_range": "Last year",
},
template_params={},
effective_filters=[
{
"filter_name": "Order Date",
"display_name": "Order Date",
"effective_value": "2020-01-01 : 2020-12-31",
"normalized_filter_payload": {
"filter_clauses": [],
"extra_form_data": {"time_range": "2020-01-01 : 2020-12-31"},
"value_origin": "extra_form_data.time_range",
},
}
],
)
assert query_context["queries"][0]["time_range"] == "2020-01-01 : 2020-12-31"
assert query_context["form_data"]["extra_form_data"] == {
"time_range": "2020-01-01 : 2020-12-31"
}
assert query_context["queries"][0]["filters"] == []
# [/DEF:test_build_dataset_preview_query_context_preserves_time_range_from_native_filter_payload:Function]
# [DEF:test_build_dataset_preview_legacy_form_data_preserves_native_filter_clauses:Function]
# @PURPOSE: Legacy preview form_data should preserve recovered native filter clauses in browser-style fields without duplicating datasource for QueryObjectFactory.
def test_build_dataset_preview_legacy_form_data_preserves_native_filter_clauses():
client = SupersetClient(_make_environment())
legacy_form_data = client.build_dataset_preview_legacy_form_data(
dataset_id=11,
dataset_record={
"id": 11,
"schema": "public",
"datasource": {"id": 11, "type": "table"},
"default_time_range": "No filter",
},
template_params={"country": "DE"},
effective_filters=[
{
"filter_name": "Country",
"display_name": "Country",
"effective_value": ["DE", "FR"],
"normalized_filter_payload": {
"filter_clauses": [{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}],
"extra_form_data": {
"filters": [{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}],
"time_range": "Last quarter",
},
"value_origin": "extra_form_data.filters",
},
}
],
)
assert "datasource" not in legacy_form_data
assert legacy_form_data["datasource_id"] == 11
assert legacy_form_data["datasource_type"] == "table"
assert legacy_form_data["extra_filters"] == [
{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}
]
assert legacy_form_data["extra_form_data"] == {
"filters": [{"col": "country_code", "op": "IN", "val": ["DE", "FR"]}],
"time_range": "Last quarter",
}
assert legacy_form_data["time_range"] == "Last quarter"
assert legacy_form_data["url_params"] == {"country": "DE"}
assert legacy_form_data["result_type"] == "query"
# [/DEF:test_build_dataset_preview_legacy_form_data_preserves_native_filter_clauses:Function]
# [DEF:test_sync_network_404_mapping_keeps_non_dashboard_endpoints_generic:Function]

View File

@@ -315,6 +315,205 @@ class AsyncSupersetClient(SupersetClient):
"dataset_count": len(datasets),
}
# [/DEF:get_dashboard_detail_async:Function]
# [DEF:get_dashboard_permalink_state_async:Function]
# @COMPLEXITY: 2
# @PURPOSE: Fetch stored dashboard permalink state asynchronously.
# @POST: Returns dashboard permalink state payload from Superset API.
# @DATA_CONTRACT: Input[permalink_key: str] -> Output[Dict]
async def get_dashboard_permalink_state_async(self, permalink_key: str) -> Dict:
with belief_scope("AsyncSupersetClient.get_dashboard_permalink_state_async", f"key={permalink_key}"):
response = await self.network.request(
method="GET",
endpoint=f"/dashboard/permalink/{permalink_key}"
)
return cast(Dict, response)
# [/DEF:get_dashboard_permalink_state_async:Function]
# [DEF:get_native_filter_state_async:Function]
# @COMPLEXITY: 2
# @PURPOSE: Fetch stored native filter state asynchronously.
# @POST: Returns native filter state payload from Superset API.
# @DATA_CONTRACT: Input[dashboard_id: int, filter_state_key: str] -> Output[Dict]
async def get_native_filter_state_async(self, dashboard_id: int, filter_state_key: str) -> Dict:
with belief_scope("AsyncSupersetClient.get_native_filter_state_async", f"dashboard={dashboard_id}, key={filter_state_key}"):
response = await self.network.request(
method="GET",
endpoint=f"/dashboard/{dashboard_id}/filter_state/{filter_state_key}"
)
return cast(Dict, response)
# [/DEF:get_native_filter_state_async:Function]
# [DEF:extract_native_filters_from_permalink_async:Function]
# @COMPLEXITY: 3
# @PURPOSE: Extract native filters dataMask from a permalink key asynchronously.
# @POST: Returns extracted dataMask with filter states.
# @DATA_CONTRACT: Input[permalink_key: str] -> Output[Dict]
# @RELATION: [CALLS] ->[self.get_dashboard_permalink_state_async]
async def extract_native_filters_from_permalink_async(self, permalink_key: str) -> Dict:
with belief_scope("AsyncSupersetClient.extract_native_filters_from_permalink_async", f"key={permalink_key}"):
permalink_response = await self.get_dashboard_permalink_state_async(permalink_key)
result = permalink_response.get("result", permalink_response)
state = result.get("state", result)
data_mask = state.get("dataMask", {})
extracted_filters = {}
for filter_id, filter_data in data_mask.items():
if not isinstance(filter_data, dict):
continue
extracted_filters[filter_id] = {
"extraFormData": filter_data.get("extraFormData", {}),
"filterState": filter_data.get("filterState", {}),
"ownState": filter_data.get("ownState", {}),
}
return {
"dataMask": extracted_filters,
"activeTabs": state.get("activeTabs", []),
"anchor": state.get("anchor"),
"chartStates": state.get("chartStates", {}),
"permalink_key": permalink_key,
}
# [/DEF:extract_native_filters_from_permalink_async:Function]
# [DEF:extract_native_filters_from_key_async:Function]
# @COMPLEXITY: 3
# @PURPOSE: Extract native filters from a native_filters_key URL parameter asynchronously.
# @POST: Returns extracted filter state with extraFormData.
# @DATA_CONTRACT: Input[dashboard_id: int, filter_state_key: str] -> Output[Dict]
# @RELATION: [CALLS] ->[self.get_native_filter_state_async]
async def extract_native_filters_from_key_async(self, dashboard_id: int, filter_state_key: str) -> Dict:
with belief_scope("AsyncSupersetClient.extract_native_filters_from_key_async", f"dashboard={dashboard_id}, key={filter_state_key}"):
filter_response = await self.get_native_filter_state_async(dashboard_id, filter_state_key)
result = filter_response.get("result", filter_response)
value = result.get("value")
if isinstance(value, str):
try:
parsed_value = json.loads(value)
except json.JSONDecodeError as e:
app_logger.warning("[extract_native_filters_from_key_async][Warning] Failed to parse filter state JSON: %s", e)
parsed_value = {}
elif isinstance(value, dict):
parsed_value = value
else:
parsed_value = {}
extracted_filters = {}
if "id" in parsed_value and "extraFormData" in parsed_value:
filter_id = parsed_value.get("id", filter_state_key)
extracted_filters[filter_id] = {
"extraFormData": parsed_value.get("extraFormData", {}),
"filterState": parsed_value.get("filterState", {}),
"ownState": parsed_value.get("ownState", {}),
}
else:
for filter_id, filter_data in parsed_value.items():
if not isinstance(filter_data, dict):
continue
extracted_filters[filter_id] = {
"extraFormData": filter_data.get("extraFormData", {}),
"filterState": filter_data.get("filterState", {}),
"ownState": filter_data.get("ownState", {}),
}
return {
"dataMask": extracted_filters,
"dashboard_id": dashboard_id,
"filter_state_key": filter_state_key,
}
# [/DEF:extract_native_filters_from_key_async:Function]
# [DEF:parse_dashboard_url_for_filters_async:Function]
# @COMPLEXITY: 3
# @PURPOSE: Parse a Superset dashboard URL and extract native filter state asynchronously.
# @POST: Returns extracted filter state or empty dict if no filters found.
# @DATA_CONTRACT: Input[url: str] -> Output[Dict]
# @RELATION: [CALLS] ->[self.extract_native_filters_from_permalink_async]
# @RELATION: [CALLS] ->[self.extract_native_filters_from_key_async]
async def parse_dashboard_url_for_filters_async(self, url: str) -> Dict:
with belief_scope("AsyncSupersetClient.parse_dashboard_url_for_filters_async", f"url={url}"):
import urllib.parse
parsed_url = urllib.parse.urlparse(url)
query_params = urllib.parse.parse_qs(parsed_url.query)
path_parts = parsed_url.path.rstrip("/").split("/")
result = {
"url": url,
"dashboard_id": None,
"filter_type": None,
"filters": {},
}
# Check for permalink URL: /dashboard/p/{key}/
if "p" in path_parts:
try:
p_index = path_parts.index("p")
if p_index + 1 < len(path_parts):
permalink_key = path_parts[p_index + 1]
filter_data = await self.extract_native_filters_from_permalink_async(permalink_key)
result["filter_type"] = "permalink"
result["filters"] = filter_data
return result
except ValueError:
pass
# Check for native_filters_key in query params
native_filters_key = query_params.get("native_filters_key", [None])[0]
if native_filters_key:
dashboard_ref = None
if "dashboard" in path_parts:
try:
dash_index = path_parts.index("dashboard")
if dash_index + 1 < len(path_parts):
potential_id = path_parts[dash_index + 1]
if potential_id not in ("p", "list", "new"):
dashboard_ref = potential_id
except ValueError:
pass
if dashboard_ref:
# Resolve slug to numeric ID — the filter_state API requires a numeric ID
resolved_id = None
try:
resolved_id = int(dashboard_ref)
except (ValueError, TypeError):
try:
dash_resp = await self.get_dashboard_async(dashboard_ref)
dash_data = dash_resp.get("result", dash_resp) if isinstance(dash_resp, dict) else {}
raw_id = dash_data.get("id")
if raw_id is not None:
resolved_id = int(raw_id)
except Exception as e:
app_logger.warning("[parse_dashboard_url_for_filters_async][Warning] Failed to resolve dashboard slug '%s' to ID: %s", dashboard_ref, e)
if resolved_id is not None:
filter_data = await self.extract_native_filters_from_key_async(resolved_id, native_filters_key)
result["filter_type"] = "native_filters_key"
result["dashboard_id"] = resolved_id
result["filters"] = filter_data
return result
else:
app_logger.warning("[parse_dashboard_url_for_filters_async][Warning] Could not resolve dashboard_id from URL for native_filters_key")
return result
# Check for native_filters in query params (direct filter values)
native_filters = query_params.get("native_filters", [None])[0]
if native_filters:
try:
parsed_filters = json.loads(native_filters)
result["filter_type"] = "native_filters"
result["filters"] = {"dataMask": parsed_filters}
return result
except json.JSONDecodeError as e:
app_logger.warning("[parse_dashboard_url_for_filters_async][Warning] Failed to parse native_filters JSON: %s", e)
return result
# [/DEF:parse_dashboard_url_for_filters_async:Function]
# [/DEF:AsyncSupersetClient:Class]
# [/DEF:backend.src.core.async_superset_client:Module]
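The URL-dispatch order implemented by `parse_dashboard_url_for_filters_async` can be condensed into a standalone sketch (illustrative only — the real method also resolves slugs and fetches the stored state from the Superset API): a permalink path segment wins, then a `native_filters_key` query parameter, then an inline `native_filters` payload.

```python
import urllib.parse

def classify_dashboard_url(url: str) -> dict:
    """Classify which native-filter transport a dashboard URL uses."""
    parsed = urllib.parse.urlparse(url)
    params = urllib.parse.parse_qs(parsed.query)
    parts = parsed.path.rstrip("/").split("/")
    # Permalink URLs look like /dashboard/p/{key}/
    if "p" in parts and parts.index("p") + 1 < len(parts):
        return {"filter_type": "permalink", "key": parts[parts.index("p") + 1]}
    # Server-stored filter state referenced by key.
    if params.get("native_filters_key"):
        return {"filter_type": "native_filters_key",
                "key": params["native_filters_key"][0]}
    # Filter values inlined directly in the query string.
    if params.get("native_filters"):
        return {"filter_type": "native_filters", "key": None}
    return {"filter_type": None, "key": None}
```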

View File

@@ -369,6 +369,69 @@ def ensure_connection_configs_table(bind_engine):
# [/DEF:ensure_connection_configs_table:Function]
# [DEF:_ensure_filter_source_enum_values:Function]
# @COMPLEXITY: 3
# @PURPOSE: Adds missing FilterSource enum values to the native PostgreSQL `filtersource` enum type.
# @PRE: bind_engine points to the application database containing the imported_filters table.
# @POST: New enum values are available without data loss.
def _ensure_filter_source_enum_values(bind_engine):
with belief_scope("_ensure_filter_source_enum_values"):
try:
with bind_engine.connect() as connection:
# Check if the native enum type exists
result = connection.execute(
text(
"SELECT t.typname FROM pg_type t "
"JOIN pg_namespace n ON t.typnamespace = n.oid "
"WHERE t.typname = 'filtersource' AND n.nspname = 'public'"
)
)
if result.fetchone() is None:
logger.reason("filtersource enum type does not exist yet; skipping migration")
return
# Get existing enum values
result = connection.execute(
text(
"SELECT e.enumlabel FROM pg_enum e "
"JOIN pg_type t ON e.enumtypid = t.oid "
"WHERE t.typname = 'filtersource' "
"ORDER BY e.enumsortorder"
)
)
existing_values = {row[0] for row in result.fetchall()}
required_values = ["SUPERSET_PERMALINK", "SUPERSET_NATIVE_FILTERS_KEY"]
missing_values = [v for v in required_values if v not in existing_values]
if not missing_values:
logger.reason(
"filtersource enum already up to date",
extra={"existing": sorted(existing_values)},
)
return
logger.reason(
"Adding missing values to filtersource enum",
extra={"missing": missing_values},
)
for value in missing_values:
connection.execute(
text(f"ALTER TYPE filtersource ADD VALUE IF NOT EXISTS '{value}'")
)
connection.commit()
logger.reason(
"filtersource enum migration completed",
extra={"added": missing_values},
)
except Exception as migration_error:
logger.warning(
"[database][EXPLORE] FilterSource enum additive migration failed: %s",
migration_error,
)
# [/DEF:_ensure_filter_source_enum_values:Function]
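The additive migration above reduces to: diff the required labels against what `pg_enum` already holds, then issue one `ALTER TYPE ... ADD VALUE IF NOT EXISTS` per missing label. A minimal sketch of just the diff-and-statement step, with no database involved (`enum_migration_statements` is an illustrative name, not part of the codebase):

```python
def enum_migration_statements(existing: set, required: list) -> list:
    """Return the ALTER TYPE statements for enum labels not yet present.

    Preserves the order of `required`, matching the migration's loop order.
    """
    missing = [v for v in required if v not in existing]
    return [
        f"ALTER TYPE filtersource ADD VALUE IF NOT EXISTS '{v}'"
        for v in missing
    ]
```

Because `ADD VALUE IF NOT EXISTS` is idempotent on PostgreSQL, re-running the generated statements against an already-migrated enum is harmless, which is what makes this migration safe to run on every startup.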
# [DEF:init_db:Function]
# @COMPLEXITY: 3
# @PURPOSE: Initializes the database by creating all tables.
@@ -386,6 +449,7 @@ def init_db():
_ensure_git_server_configs_columns(engine)
_ensure_auth_users_columns(auth_engine)
ensure_connection_configs_table(engine)
_ensure_filter_source_enum_values(engine)
# [/DEF:init_db:Function]
# [DEF:get_db:Function]

View File

@@ -35,6 +35,7 @@ class SupersetClient:
# @PRE: `env` must be a valid Environment object.
# @POST: The `env` and `network` attributes are created and ready for use.

# @DATA_CONTRACT: Input[Environment] -> self.network[APIClient]
# @RELATION: [DEPENDS_ON] ->[Environment]
# @RELATION: [DEPENDS_ON] ->[APIClient]
def __init__(self, env: Environment):
with belief_scope("__init__"):
@@ -311,7 +312,7 @@ class SupersetClient:
})
return total_count, result
# [/DEF:backend.src.core.superset_client.SupersetClient.get_dashboards_summary_page:Function]
# [/DEF:SupersetClient.get_dashboards_summary_page:Function]
# [DEF:SupersetClient._extract_owner_labels:Function]
# @COMPLEXITY: 1
@@ -368,9 +369,9 @@ class SupersetClient:
if email:
return email
return None
# [/DEF:backend.src.core.superset_client.SupersetClient._extract_user_display:Function]
# [/DEF:SupersetClient._extract_user_display:Function]
# [DEF:backend.src.core.superset_client.SupersetClient._sanitize_user_text:Function]
# [DEF:SupersetClient._sanitize_user_text:Function]
# @COMPLEXITY: 1
# @PURPOSE: Convert scalar value to non-empty user-facing text.
# @PRE: value can be any scalar type.
@@ -382,7 +383,7 @@ class SupersetClient:
if not normalized:
return None
return normalized
# [/DEF:backend.src.core.superset_client.SupersetClient._sanitize_user_text:Function]
# [/DEF:SupersetClient._sanitize_user_text:Function]
# [DEF:SupersetClient.get_dashboard:Function]
# @COMPLEXITY: 3
@@ -413,6 +414,206 @@ class SupersetClient:
return cast(Dict, response)
# [/DEF:SupersetClient.get_dashboard_permalink_state:Function]
# [DEF:SupersetClient.get_native_filter_state:Function]
# @COMPLEXITY: 2
# @PURPOSE: Fetches stored native filter state by filter state key.
# @PRE: Client is authenticated and filter_state_key exists.
# @POST: Returns native filter state payload from Superset API.
# @DATA_CONTRACT: Input[dashboard_id: Union[int, str], filter_state_key: str] -> Output[Dict]
# @RELATION: [CALLS] ->[APIClient.request]
def get_native_filter_state(self, dashboard_id: Union[int, str], filter_state_key: str) -> Dict:
with belief_scope("SupersetClient.get_native_filter_state", f"dashboard={dashboard_id}, key={filter_state_key}"):
response = self.network.request(
method="GET",
endpoint=f"/dashboard/{dashboard_id}/filter_state/{filter_state_key}"
)
return cast(Dict, response)
# [/DEF:SupersetClient.get_native_filter_state:Function]
# [DEF:SupersetClient.extract_native_filters_from_permalink:Function]
# @COMPLEXITY: 3
# @PURPOSE: Extract native filters dataMask from a permalink key.
# @PRE: Client is authenticated and permalink_key exists.
# @POST: Returns extracted dataMask with filter states.
# @DATA_CONTRACT: Input[permalink_key: str] -> Output[Dict]
# @RELATION: [CALLS] ->[SupersetClient.get_dashboard_permalink_state]
def extract_native_filters_from_permalink(self, permalink_key: str) -> Dict:
with belief_scope("SupersetClient.extract_native_filters_from_permalink", f"key={permalink_key}"):
permalink_response = self.get_dashboard_permalink_state(permalink_key)
# Permalink response structure: { "result": { "state": { "dataMask": {...}, ... } } }
# or directly: { "state": { "dataMask": {...}, ... } }
result = permalink_response.get("result", permalink_response)
state = result.get("state", result)
data_mask = state.get("dataMask", {})
extracted_filters = {}
for filter_id, filter_data in data_mask.items():
if not isinstance(filter_data, dict):
continue
extracted_filters[filter_id] = {
"extraFormData": filter_data.get("extraFormData", {}),
"filterState": filter_data.get("filterState", {}),
"ownState": filter_data.get("ownState", {}),
}
return {
"dataMask": extracted_filters,
"activeTabs": state.get("activeTabs", []),
"anchor": state.get("anchor"),
"chartStates": state.get("chartStates", {}),
"permalink_key": permalink_key,
}
# [/DEF:SupersetClient.extract_native_filters_from_permalink:Function]
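The permalink extractor above tolerates two response envelopes — `{"result": {"state": {...}}}` and a bare `{"state": {...}}` — before pulling out `dataMask` entries. That unwrapping is pure dictionary logic and can be sketched in isolation (`extract_data_mask` is a hypothetical standalone version, trimmed to the essential fields):

```python
def extract_data_mask(permalink_response: dict) -> dict:
    """Unwrap a permalink payload and collect per-filter state from dataMask."""
    # Tolerate both {"result": {"state": {...}}} and {"state": {...}} shapes
    result = permalink_response.get("result", permalink_response)
    state = result.get("state", result)
    extracted = {}
    for filter_id, filter_data in state.get("dataMask", {}).items():
        if not isinstance(filter_data, dict):
            continue  # skip malformed entries rather than failing the whole extraction
        extracted[filter_id] = {
            "extraFormData": filter_data.get("extraFormData", {}),
            "filterState": filter_data.get("filterState", {}),
        }
    return extracted
```

The `dict.get(key, fallback_to_self)` pattern is what lets one code path handle both envelope shapes without branching.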
# [DEF:SupersetClient.extract_native_filters_from_key:Function]
# @COMPLEXITY: 3
# @PURPOSE: Extract native filters from a native_filters_key URL parameter.
# @PRE: Client is authenticated, dashboard_id and filter_state_key exist.
# @POST: Returns extracted filter state with extraFormData.
# @DATA_CONTRACT: Input[dashboard_id: Union[int, str], filter_state_key: str] -> Output[Dict]
# @RELATION: [CALLS] ->[SupersetClient.get_native_filter_state]
def extract_native_filters_from_key(self, dashboard_id: Union[int, str], filter_state_key: str) -> Dict:
with belief_scope("SupersetClient.extract_native_filters_from_key", f"dashboard={dashboard_id}, key={filter_state_key}"):
filter_response = self.get_native_filter_state(dashboard_id, filter_state_key)
# Filter state response structure: { "result": { "value": "{...json...}" } }
# or: { "value": "{...json...}" }
result = filter_response.get("result", filter_response)
value = result.get("value")
if isinstance(value, str):
try:
parsed_value = json.loads(value)
except json.JSONDecodeError as e:
app_logger.warning("[extract_native_filters_from_key][Warning] Failed to parse filter state JSON: %s", e)
parsed_value = {}
elif isinstance(value, dict):
parsed_value = value
else:
parsed_value = {}
# The parsed value contains filter state with structure:
# { "filter_id": { "id": "...", "extraFormData": {...}, "filterState": {...} } }
# or a single filter: { "id": "...", "extraFormData": {...}, "filterState": {...} }
extracted_filters = {}
if "id" in parsed_value and "extraFormData" in parsed_value:
# Single filter format
filter_id = parsed_value.get("id", filter_state_key)
extracted_filters[filter_id] = {
"extraFormData": parsed_value.get("extraFormData", {}),
"filterState": parsed_value.get("filterState", {}),
"ownState": parsed_value.get("ownState", {}),
}
else:
# Multiple filters format
for filter_id, filter_data in parsed_value.items():
if not isinstance(filter_data, dict):
continue
extracted_filters[filter_id] = {
"extraFormData": filter_data.get("extraFormData", {}),
"filterState": filter_data.get("filterState", {}),
"ownState": filter_data.get("ownState", {}),
}
return {
"dataMask": extracted_filters,
"dashboard_id": dashboard_id,
"filter_state_key": filter_state_key,
}
# [/DEF:SupersetClient.extract_native_filters_from_key:Function]
# [DEF:SupersetClient.parse_dashboard_url_for_filters:Function]
# @COMPLEXITY: 3
# @PURPOSE: Parse a Superset dashboard URL and extract native filter state if present.
# @PRE: url must be a valid Superset dashboard URL with optional permalink or native_filters_key.
# @POST: Returns extracted filter state or empty dict if no filters found.
# @DATA_CONTRACT: Input[url: str] -> Output[Dict]
# @RELATION: [CALLS] ->[SupersetClient.extract_native_filters_from_permalink]
# @RELATION: [CALLS] ->[SupersetClient.extract_native_filters_from_key]
def parse_dashboard_url_for_filters(self, url: str) -> Dict:
with belief_scope("SupersetClient.parse_dashboard_url_for_filters", f"url={url}"):
import urllib.parse
parsed_url = urllib.parse.urlparse(url)
query_params = urllib.parse.parse_qs(parsed_url.query)
path_parts = parsed_url.path.rstrip("/").split("/")
result = {
"url": url,
"dashboard_id": None,
"filter_type": None,
"filters": {},
}
# Check for permalink URL: /dashboard/p/{key}/ or /superset/dashboard/p/{key}/
if "p" in path_parts:
try:
p_index = path_parts.index("p")
if p_index + 1 < len(path_parts):
permalink_key = path_parts[p_index + 1]
filter_data = self.extract_native_filters_from_permalink(permalink_key)
result["filter_type"] = "permalink"
result["filters"] = filter_data
return result
except ValueError:
pass
# Check for native_filters_key in query params
native_filters_key = query_params.get("native_filters_key", [None])[0]
if native_filters_key:
# Extract dashboard ID or slug from URL path
dashboard_ref = None
if "dashboard" in path_parts:
try:
dash_index = path_parts.index("dashboard")
if dash_index + 1 < len(path_parts):
potential_id = path_parts[dash_index + 1]
# Skip if it's a reserved word
if potential_id not in ("p", "list", "new"):
dashboard_ref = potential_id
except ValueError:
pass
if dashboard_ref:
# Resolve slug to numeric ID — the filter_state API requires a numeric ID
resolved_id = None
try:
resolved_id = int(dashboard_ref)
except (ValueError, TypeError):
try:
dash_resp = self.get_dashboard(dashboard_ref)
dash_data = dash_resp.get("result", dash_resp) if isinstance(dash_resp, dict) else {}
raw_id = dash_data.get("id")
if raw_id is not None:
resolved_id = int(raw_id)
except Exception as e:
app_logger.warning("[parse_dashboard_url_for_filters][Warning] Failed to resolve dashboard slug '%s' to ID: %s", dashboard_ref, e)
if resolved_id is not None:
filter_data = self.extract_native_filters_from_key(resolved_id, native_filters_key)
result["filter_type"] = "native_filters_key"
result["dashboard_id"] = resolved_id
result["filters"] = filter_data
return result
else:
app_logger.warning("[parse_dashboard_url_for_filters][Warning] Could not resolve dashboard_id from URL for native_filters_key")
# Check for native_filters in query params (direct filter values)
native_filters = query_params.get("native_filters", [None])[0]
if native_filters:
try:
parsed_filters = json.loads(native_filters)
result["filter_type"] = "native_filters"
result["filters"] = {"dataMask": parsed_filters}
return result
except json.JSONDecodeError as e:
app_logger.warning("[parse_dashboard_url_for_filters][Warning] Failed to parse native_filters JSON: %s", e)
return result
# [/DEF:SupersetClient.parse_dashboard_url_for_filters:Function]
# [DEF:SupersetClient.get_chart:Function]
# @COMPLEXITY: 3
# @PURPOSE: Fetches a single chart by ID.
@@ -911,13 +1112,13 @@ class SupersetClient:
return result
# [/DEF:backend.src.core.superset_client.SupersetClient.get_dataset_detail:Function]
# [DEF:backend.src.core.superset_client.SupersetClient.get_dataset:Function]
# [DEF:SupersetClient.get_dataset:Function]
# @COMPLEXITY: 3
# @PURPOSE: Fetches details for a specific dataset by its ID.
# @PRE: dataset_id must exist.
# @POST: Returns dataset details.
# @DATA_CONTRACT: Input[dataset_id: int] -> Output[Dict]
# @RELATION: [CALLS] ->[backend.src.core.utils.network.APIClient.request]
# @RELATION: [CALLS] ->[APIClient.request]
def get_dataset(self, dataset_id: int) -> Dict:
with belief_scope("SupersetClient.get_dataset", f"id={dataset_id}"):
app_logger.info("[get_dataset][Enter] Fetching dataset %s.", dataset_id)
@@ -925,19 +1126,20 @@ class SupersetClient:
response = cast(Dict, response)
app_logger.info("[get_dataset][Exit] Got dataset %s.", dataset_id)
return response
# [/DEF:backend.src.core.superset_client.SupersetClient.get_dataset:Function]
# [/DEF:SupersetClient.get_dataset:Function]
# [DEF:SupersetClient.compile_dataset_preview:Function]
# @COMPLEXITY: 4
# @PURPOSE: Compile dataset preview SQL through the real Superset chart-data endpoint and return normalized SQL output.
# @PURPOSE: Compile dataset preview SQL through the strongest supported Superset preview endpoint family and return normalized SQL output.
# @PRE: dataset_id must be valid and template_params/effective_filters must represent the current preview session inputs.
# @POST: Returns normalized compiled SQL plus raw upstream response without guessing unsupported endpoints.
# @POST: Returns normalized compiled SQL plus raw upstream response, preferring legacy form_data transport with explicit fallback to chart-data.
# @DATA_CONTRACT: Input[dataset_id:int, template_params:Dict, effective_filters:List[Dict]] -> Output[Dict[str, Any]]
# @RELATION: [CALLS] ->[SupersetClient.get_dataset]
# @RELATION: [CALLS] ->[SupersetClient.build_dataset_preview_query_context]
# @RELATION: [CALLS] ->[SupersetClient.build_dataset_preview_legacy_form_data]
# @RELATION: [CALLS] ->[APIClient.request]
# @RELATION: [CALLS] ->[SupersetClient._extract_compiled_sql_from_chart_data_response]
# @SIDE_EFFECT: Performs upstream dataset lookup and chart-data network I/O against Superset.
# @RELATION: [CALLS] ->[SupersetClient._extract_compiled_sql_from_preview_response]
# @SIDE_EFFECT: Performs upstream dataset lookup and preview network I/O against Superset.
def compile_dataset_preview(
self,
dataset_id: int,
@@ -945,14 +1147,6 @@ class SupersetClient:
effective_filters: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
with belief_scope("SupersetClient.compile_dataset_preview", f"id={dataset_id}"):
app_logger.reason(
"Compiling dataset preview via Superset chart-data endpoint",
extra={
"dataset_id": dataset_id,
"template_param_count": len(template_params or {}),
"filter_count": len(effective_filters or []),
},
)
dataset_response = self.get_dataset(dataset_id)
dataset_record = dataset_response.get("result", dataset_response) if isinstance(dataset_response, dict) else {}
query_context = self.build_dataset_preview_query_context(
@@ -961,31 +1155,197 @@ class SupersetClient:
template_params=template_params or {},
effective_filters=effective_filters or [],
)
response = self.network.request(
method="POST",
endpoint="/chart/data",
data=json.dumps(query_context),
headers={"Content-Type": "application/json"},
legacy_form_data = self.build_dataset_preview_legacy_form_data(
dataset_id=dataset_id,
dataset_record=dataset_record,
template_params=template_params or {},
effective_filters=effective_filters or [],
)
normalized = self._extract_compiled_sql_from_chart_data_response(response)
normalized["query_context"] = query_context
legacy_form_data_payload = json.dumps(legacy_form_data, sort_keys=True, default=str)
request_payload = json.dumps(query_context)
strategy_attempts: List[Dict[str, Any]] = []
strategy_candidates: List[Dict[str, Any]] = [
{
"endpoint_kind": "legacy_explore_form_data",
"endpoint": "/explore_json/form_data",
"request_transport": "query_param_form_data",
"params": {"form_data": legacy_form_data_payload},
},
{
"endpoint_kind": "legacy_data_form_data",
"endpoint": "/data",
"request_transport": "query_param_form_data",
"params": {"form_data": legacy_form_data_payload},
},
{
"endpoint_kind": "v1_chart_data",
"endpoint": "/chart/data",
"request_transport": "json_body",
"data": request_payload,
"headers": {"Content-Type": "application/json"},
},
]
for candidate in strategy_candidates:
endpoint_kind = candidate["endpoint_kind"]
endpoint_path = candidate["endpoint"]
request_transport = candidate["request_transport"]
request_params = deepcopy(candidate.get("params") or {})
request_body = candidate.get("data")
request_headers = deepcopy(candidate.get("headers") or {})
request_param_keys = sorted(request_params.keys())
request_payload_keys: List[str] = []
if isinstance(request_body, str):
try:
decoded_request_body = json.loads(request_body)
if isinstance(decoded_request_body, dict):
request_payload_keys = sorted(decoded_request_body.keys())
except json.JSONDecodeError:
request_payload_keys = []
elif isinstance(request_body, dict):
request_payload_keys = sorted(request_body.keys())
strategy_diagnostics = {
"endpoint": endpoint_path,
"endpoint_kind": endpoint_kind,
"request_transport": request_transport,
"contains_root_datasource": endpoint_kind == "v1_chart_data" and "datasource" in query_context,
"contains_form_datasource": endpoint_kind.startswith("legacy_") and "datasource" in legacy_form_data,
"contains_query_object_datasource": bool(query_context.get("queries")) and isinstance(query_context["queries"][0], dict) and "datasource" in query_context["queries"][0],
"request_param_keys": request_param_keys,
"request_payload_keys": request_payload_keys,
}
app_logger.reason(
"Attempting Superset dataset preview compilation strategy",
extra={
"dataset_id": dataset_id,
**strategy_diagnostics,
"request_params": request_params,
"request_payload": request_body,
"legacy_form_data": legacy_form_data if endpoint_kind.startswith("legacy_") else None,
"query_context": query_context if endpoint_kind == "v1_chart_data" else None,
"template_param_count": len(template_params or {}),
"filter_count": len(effective_filters or []),
},
)
try:
response = self.network.request(
method="POST",
endpoint=endpoint_path,
params=request_params or None,
data=request_body,
headers=request_headers or None,
)
normalized = self._extract_compiled_sql_from_preview_response(response)
normalized["query_context"] = query_context
normalized["legacy_form_data"] = legacy_form_data
normalized["endpoint"] = endpoint_path
normalized["endpoint_kind"] = endpoint_kind
normalized["dataset_id"] = dataset_id
normalized["strategy_attempts"] = strategy_attempts + [
{
**strategy_diagnostics,
"success": True,
}
]
app_logger.reflect(
"Dataset preview compilation returned normalized SQL payload",
extra={
"dataset_id": dataset_id,
**strategy_diagnostics,
"success": True,
"compiled_sql_length": len(str(normalized.get("compiled_sql") or "")),
"response_diagnostics": normalized.get("response_diagnostics"),
},
)
return normalized
except Exception as exc:
failure_diagnostics = {
**strategy_diagnostics,
"success": False,
"error": str(exc),
}
strategy_attempts.append(failure_diagnostics)
app_logger.explore(
"Superset dataset preview compilation strategy failed",
extra={
"dataset_id": dataset_id,
**failure_diagnostics,
"request_params": request_params,
"request_payload": request_body,
},
)
raise SupersetAPIError(
"Superset preview compilation failed for all known strategies "
f"(attempts={strategy_attempts!r})"
)
# [/DEF:SupersetClient.compile_dataset_preview:Function]
# [DEF:SupersetClient.build_dataset_preview_legacy_form_data:Function]
# @COMPLEXITY: 4
# @PURPOSE: Build browser-style legacy form_data payload for Superset preview endpoints inferred from observed deployment traffic.
# @PRE: dataset_record should come from Superset dataset detail when possible.
# @POST: Returns one serialized-ready form_data structure preserving native filter clauses in legacy transport fields.
# @DATA_CONTRACT: Input[dataset_id:int,dataset_record:Dict,template_params:Dict,effective_filters:List[Dict]] -> Output[Dict[str, Any]]
# @RELATION: [CALLS] ->[SupersetClient.build_dataset_preview_query_context]
# @SIDE_EFFECT: Emits reasoning diagnostics describing the inferred legacy payload shape.
def build_dataset_preview_legacy_form_data(
self,
dataset_id: int,
dataset_record: Dict[str, Any],
template_params: Dict[str, Any],
effective_filters: List[Dict[str, Any]],
) -> Dict[str, Any]:
with belief_scope("SupersetClient.build_dataset_preview_legacy_form_data", f"id={dataset_id}"):
query_context = self.build_dataset_preview_query_context(
dataset_id=dataset_id,
dataset_record=dataset_record,
template_params=template_params,
effective_filters=effective_filters,
)
query_object = deepcopy(query_context.get("queries", [{}])[0] if query_context.get("queries") else {})
legacy_form_data = deepcopy(query_context.get("form_data", {}))
legacy_form_data.pop("datasource", None)
legacy_form_data["metrics"] = deepcopy(query_object.get("metrics", ["count"]))
legacy_form_data["columns"] = deepcopy(query_object.get("columns", []))
legacy_form_data["orderby"] = deepcopy(query_object.get("orderby", []))
legacy_form_data["annotation_layers"] = deepcopy(query_object.get("annotation_layers", []))
legacy_form_data["row_limit"] = query_object.get("row_limit", 1000)
legacy_form_data["series_limit"] = query_object.get("series_limit", 0)
legacy_form_data["url_params"] = deepcopy(query_object.get("url_params", template_params))
legacy_form_data["applied_time_extras"] = deepcopy(query_object.get("applied_time_extras", {}))
legacy_form_data["result_format"] = query_context.get("result_format", "json")
legacy_form_data["result_type"] = query_context.get("result_type", "query")
legacy_form_data["force"] = bool(query_context.get("force", True))
extras = query_object.get("extras")
if isinstance(extras, dict):
legacy_form_data["extras"] = deepcopy(extras)
time_range = query_object.get("time_range")
if time_range:
legacy_form_data["time_range"] = time_range
app_logger.reflect(
"Dataset preview compilation returned normalized SQL payload",
"Built Superset legacy preview form_data payload from browser-observed request shape",
extra={
"dataset_id": dataset_id,
"compiled_sql_length": len(str(normalized.get("compiled_sql") or "")),
"legacy_endpoint_inference": "POST /explore_json/form_data?form_data=... primary, POST /data?form_data=... fallback, based on observed browser traffic",
"contains_form_datasource": "datasource" in legacy_form_data,
"legacy_form_data_keys": sorted(legacy_form_data.keys()),
"legacy_extra_filters": legacy_form_data.get("extra_filters", []),
"legacy_extra_form_data": legacy_form_data.get("extra_form_data", {}),
},
)
return normalized
# [/DEF:backend.src.core.superset_client.SupersetClient.compile_dataset_preview:Function]
return legacy_form_data
# [/DEF:SupersetClient.build_dataset_preview_legacy_form_data:Function]
# [DEF:backend.src.core.superset_client.SupersetClient.build_dataset_preview_query_context:Function]
# [DEF:SupersetClient.build_dataset_preview_query_context:Function]
# @COMPLEXITY: 4
# @PURPOSE: Build a reduced-scope chart-data query context for deterministic dataset preview compilation.
# @PRE: dataset_record should come from Superset dataset detail when possible.
# @POST: Returns an explicit chart-data payload based on current session inputs and dataset metadata.
# @DATA_CONTRACT: Input[dataset_id:int,dataset_record:Dict,template_params:Dict,effective_filters:List[Dict]] -> Output[Dict[str, Any]]
# @RELATION: [CALLS] ->[backend.src.core.superset_client.SupersetClient._normalize_effective_filters_for_query_context]
# @RELATION: [CALLS] ->[SupersetClient._normalize_effective_filters_for_query_context]
# @SIDE_EFFECT: Emits reasoning and reflection logs for deterministic preview payload construction.
def build_dataset_preview_query_context(
self,
@@ -996,7 +1356,9 @@ class SupersetClient:
) -> Dict[str, Any]:
with belief_scope("SupersetClient.build_dataset_preview_query_context", f"id={dataset_id}"):
normalized_template_params = deepcopy(template_params or {})
normalized_filters = self._normalize_effective_filters_for_query_context(effective_filters or [])
normalized_filter_payload = self._normalize_effective_filters_for_query_context(effective_filters or [])
normalized_filters = normalized_filter_payload["filters"]
normalized_extra_form_data = normalized_filter_payload["extra_form_data"]
datasource_payload: Dict[str, Any] = {
"id": dataset_id,
@@ -1011,6 +1373,23 @@ class SupersetClient:
if datasource_type:
datasource_payload["type"] = datasource_type
serialized_dataset_template_params = dataset_record.get("template_params")
if isinstance(serialized_dataset_template_params, str) and serialized_dataset_template_params.strip():
try:
parsed_dataset_template_params = json.loads(serialized_dataset_template_params)
if isinstance(parsed_dataset_template_params, dict):
for key, value in parsed_dataset_template_params.items():
normalized_template_params.setdefault(str(key), value)
except json.JSONDecodeError:
app_logger.explore(
"Dataset template_params could not be parsed while building preview query context",
extra={"dataset_id": dataset_id},
)
extra_form_data: Dict[str, Any] = deepcopy(normalized_extra_form_data)
if normalized_filters:
extra_form_data["filters"] = deepcopy(normalized_filters)
query_object: Dict[str, Any] = {
"filters": normalized_filters,
"extras": {"where": ""},
@@ -1021,33 +1400,56 @@ class SupersetClient:
"row_limit": 1000,
"series_limit": 0,
"url_params": normalized_template_params,
"custom_params": normalized_template_params,
"applied_time_extras": {},
"result_type": "query",
}
schema = dataset_record.get("schema")
if schema:
query_object["schema"] = schema
time_range = dataset_record.get("default_time_range")
time_range = extra_form_data.get("time_range") or dataset_record.get("default_time_range")
if time_range:
query_object["time_range"] = time_range
extra_form_data["time_range"] = time_range
result_format = dataset_record.get("result_format") or "json"
result_type = dataset_record.get("result_type") or "full"
result_type = "query"
return {
form_data: Dict[str, Any] = {
"datasource": f"{datasource_payload['id']}__{datasource_payload['type']}",
"datasource_id": datasource_payload["id"],
"datasource_type": datasource_payload["type"],
"viz_type": "table",
"slice_id": None,
"query_mode": "raw",
"url_params": normalized_template_params,
"extra_filters": deepcopy(normalized_filters),
"adhoc_filters": [],
}
if extra_form_data:
form_data["extra_form_data"] = extra_form_data
payload = {
"datasource": datasource_payload,
"queries": [query_object],
"form_data": {
"datasource": f"{datasource_payload['id']}__{datasource_payload['type']}",
"viz_type": "table",
"slice_id": None,
"query_mode": "raw",
"url_params": normalized_template_params,
},
"form_data": form_data,
"result_format": result_format,
"result_type": result_type,
"force": True,
}
app_logger.reflect(
"Built Superset dataset preview query context",
extra={
"dataset_id": dataset_id,
"datasource": datasource_payload,
"normalized_effective_filters": normalized_filters,
"normalized_filter_diagnostics": normalized_filter_payload["diagnostics"],
"result_type": result_type,
"result_format": result_format,
},
)
return payload
# [/DEF:backend.src.core.superset_client.SupersetClient.build_dataset_preview_query_context:Function]
# [DEF:backend.src.core.superset_client.SupersetClient._normalize_effective_filters_for_query_context:Function]
@@ -1058,56 +1460,170 @@ class SupersetClient:
def _normalize_effective_filters_for_query_context(
self,
effective_filters: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
) -> Dict[str, Any]:
with belief_scope("SupersetClient._normalize_effective_filters_for_query_context"):
normalized_filters: List[Dict[str, Any]] = []
merged_extra_form_data: Dict[str, Any] = {}
diagnostics: List[Dict[str, Any]] = []
for item in effective_filters:
if not isinstance(item, dict):
continue
column = str(item.get("variable_name") or item.get("filter_name") or "").strip()
if not column:
continue
value = item.get("effective_value")
if value is None:
continue
operator = "IN" if isinstance(value, list) else "=="
normalized_filters.append(
display_name = str(
item.get("display_name")
or item.get("filter_name")
or item.get("variable_name")
or "unresolved_filter"
).strip()
value = item.get("effective_value")
normalized_payload = item.get("normalized_filter_payload")
preserved_clauses: List[Dict[str, Any]] = []
preserved_extra_form_data: Dict[str, Any] = {}
used_preserved_clauses = False
if isinstance(normalized_payload, dict):
raw_clauses = normalized_payload.get("filter_clauses")
if isinstance(raw_clauses, list):
preserved_clauses = [
deepcopy(clause)
for clause in raw_clauses
if isinstance(clause, dict)
]
raw_extra_form_data = normalized_payload.get("extra_form_data")
if isinstance(raw_extra_form_data, dict):
preserved_extra_form_data = deepcopy(raw_extra_form_data)
if isinstance(preserved_extra_form_data, dict):
for key, extra_value in preserved_extra_form_data.items():
if key == "filters":
continue
merged_extra_form_data[key] = deepcopy(extra_value)
outgoing_clauses: List[Dict[str, Any]] = []
if preserved_clauses:
for clause in preserved_clauses:
clause_copy = deepcopy(clause)
if "val" not in clause_copy and value is not None:
clause_copy["val"] = deepcopy(value)
outgoing_clauses.append(clause_copy)
used_preserved_clauses = True
elif preserved_extra_form_data:
outgoing_clauses = []
else:
column = str(item.get("variable_name") or item.get("filter_name") or "").strip()
if column and value is not None:
operator = "IN" if isinstance(value, list) else "=="
outgoing_clauses.append(
{
"col": column,
"op": operator,
"val": value,
}
)
normalized_filters.extend(outgoing_clauses)
diagnostics.append(
{
"col": column,
"op": operator,
"val": value,
"filter_name": display_name,
"value_origin": (
normalized_payload.get("value_origin")
if isinstance(normalized_payload, dict)
else None
),
"used_preserved_clauses": used_preserved_clauses,
"outgoing_clauses": deepcopy(outgoing_clauses),
}
)
return normalized_filters
app_logger.reason(
"Normalized effective preview filter for Superset query context",
extra={
"filter_name": display_name,
"used_preserved_clauses": used_preserved_clauses,
"outgoing_clauses": outgoing_clauses,
"value_origin": (
normalized_payload.get("value_origin")
if isinstance(normalized_payload, dict)
else "heuristic_reconstruction"
),
},
)
return {
"filters": normalized_filters,
"extra_form_data": merged_extra_form_data,
"diagnostics": diagnostics,
}
# [/DEF:backend.src.core.superset_client.SupersetClient._normalize_effective_filters_for_query_context:Function]
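When no preserved `normalized_filter_payload` is available, the normalizer above falls back to reconstructing a single clause heuristically: list values become `IN` clauses, scalars become `==` clauses. A minimal sketch of just that fallback branch (`heuristic_filter_clause` is an illustrative name):

```python
def heuristic_filter_clause(item: dict):
    """Rebuild one Superset filter clause when no preserved payload exists.

    Mirrors the fallback branch of _normalize_effective_filters_for_query_context:
    list values become IN clauses, scalars become == clauses; unusable
    inputs (no column name or no value) yield None.
    """
    column = str(item.get("variable_name") or item.get("filter_name") or "").strip()
    value = item.get("effective_value")
    if not column or value is None:
        return None
    operator = "IN" if isinstance(value, list) else "=="
    return {"col": column, "op": operator, "val": value}
```

The preserved-clause path is preferred in the real method precisely because this heuristic cannot recover operators other than `IN`/`==` (e.g. range or `NOT IN` filters).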
# [DEF:backend.src.core.superset_client.SupersetClient._extract_compiled_sql_from_chart_data_response:Function]
# [DEF:backend.src.core.superset_client.SupersetClient._extract_compiled_sql_from_preview_response:Function]
# @COMPLEXITY: 3
# @PURPOSE: Normalize compiled SQL from a chart-data response by reading result[].query fields first.
# @PRE: response must be the decoded response body from /api/v1/chart/data.
# @PURPOSE: Normalize compiled SQL from either chart-data or legacy form_data preview responses.
# @PRE: response must be the decoded preview response body from a supported Superset endpoint.
# @POST: Returns compiled SQL and raw response or raises SupersetAPIError when the endpoint does not expose query text.
def _extract_compiled_sql_from_chart_data_response(self, response: Any) -> Dict[str, Any]:
with belief_scope("SupersetClient._extract_compiled_sql_from_chart_data_response"):
def _extract_compiled_sql_from_preview_response(self, response: Any) -> Dict[str, Any]:
with belief_scope("SupersetClient._extract_compiled_sql_from_preview_response"):
if not isinstance(response, dict):
raise SupersetAPIError("Superset chart/data response was not a JSON object")
raise SupersetAPIError("Superset preview response was not a JSON object")
response_diagnostics: List[Dict[str, Any]] = []
result_payload = response.get("result")
if not isinstance(result_payload, list):
raise SupersetAPIError("Superset chart/data response did not include a result list")
if isinstance(result_payload, list):
for index, item in enumerate(result_payload):
if not isinstance(item, dict):
continue
compiled_sql = str(item.get("query") or item.get("sql") or item.get("compiled_sql") or "").strip()
response_diagnostics.append(
{
"index": index,
"status": item.get("status"),
"applied_filters": item.get("applied_filters"),
"rejected_filters": item.get("rejected_filters"),
"has_query": bool(compiled_sql),
"source": "result_list",
}
)
if compiled_sql:
return {
"compiled_sql": compiled_sql,
"raw_response": response,
"response_diagnostics": response_diagnostics,
}
for item in result_payload:
if not isinstance(item, dict):
continue
compiled_sql = str(item.get("query") or "").strip()
top_level_candidates: List[Tuple[str, Any]] = [
("query", response.get("query")),
("sql", response.get("sql")),
("compiled_sql", response.get("compiled_sql")),
]
if isinstance(result_payload, dict):
top_level_candidates.extend(
[
("result.query", result_payload.get("query")),
("result.sql", result_payload.get("sql")),
("result.compiled_sql", result_payload.get("compiled_sql")),
]
)
for source, candidate in top_level_candidates:
compiled_sql = str(candidate or "").strip()
response_diagnostics.append(
{
"source": source,
"has_query": bool(compiled_sql),
}
)
if compiled_sql:
return {
"compiled_sql": compiled_sql,
"raw_response": response,
"response_diagnostics": response_diagnostics,
}
raise SupersetAPIError("Superset chart/data response did not expose compiled SQL in result[].query")
# [/DEF:backend.src.core.superset_client.SupersetClient._extract_compiled_sql_from_chart_data_response:Function]
raise SupersetAPIError(
"Superset preview response did not expose compiled SQL "
f"(diagnostics={response_diagnostics!r})"
)
# [/DEF:backend.src.core.superset_client.SupersetClient._extract_compiled_sql_from_preview_response:Function]
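The extraction order above — `result[]` items' `query`/`sql`/`compiled_sql` first, then top-level candidates, else a diagnostic-carrying `SupersetAPIError` — can be sketched standalone. This is a minimal illustration of that lookup chain, not the client itself; `SupersetAPIError` is stubbed here since the real class lives elsewhere in the module.

```python
from typing import Any, Dict, List

class SupersetAPIError(Exception):
    """Stand-in for the client's domain error (the real class is defined elsewhere)."""

def extract_compiled_sql(response: Any) -> Dict[str, Any]:
    # Mirror the extraction order above: result[] items first,
    # then top-level "query"/"sql"/"compiled_sql", else raise with diagnostics.
    if not isinstance(response, dict):
        raise SupersetAPIError("Superset preview response was not a JSON object")
    diagnostics: List[Dict[str, Any]] = []
    result_payload = response.get("result")
    if isinstance(result_payload, list):
        for index, item in enumerate(result_payload):
            if not isinstance(item, dict):
                continue
            sql = str(item.get("query") or item.get("sql") or item.get("compiled_sql") or "").strip()
            diagnostics.append({"index": index, "has_query": bool(sql), "source": "result_list"})
            if sql:
                return {"compiled_sql": sql, "raw_response": response, "response_diagnostics": diagnostics}
    for source in ("query", "sql", "compiled_sql"):
        sql = str(response.get(source) or "").strip()
        diagnostics.append({"source": source, "has_query": bool(sql)})
        if sql:
            return {"compiled_sql": sql, "raw_response": response, "response_diagnostics": diagnostics}
    raise SupersetAPIError(f"no compiled SQL exposed (diagnostics={diagnostics!r})")

sample = {"result": [{"status": "success", "query": "SELECT 1"}]}
print(extract_compiled_sql(sample)["compiled_sql"])  # SELECT 1
```

Note that the diagnostics list is returned alongside the SQL on success and embedded in the error message on failure, so callers can always see which sources were probed.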
# [DEF:SupersetClient.update_dataset:Function]
# @COMPLEXITY: 3

View File

@@ -115,6 +115,7 @@ class AsyncAPIClient:
# @DATA_CONTRACT: None -> Output[Dict[str, str]]
# @RELATION: [CALLS] ->[SupersetAuthCache.get]
# @RELATION: [CALLS] ->[SupersetAuthCache.set]
# @RELATION: [CALLS] ->[AsyncAPIClient._get_auth_lock]
async def authenticate(self) -> Dict[str, str]:
cached_tokens = SupersetAuthCache.get(self._auth_cache_key)
if cached_tokens and cached_tokens.get("access_token") and cached_tokens.get("csrf_token"):
@@ -227,6 +228,12 @@ class AsyncAPIClient:
# @PURPOSE: Translate upstream HTTP errors into stable domain exceptions.
# @POST: Raises domain-specific exception for caller flow control.
# @DATA_CONTRACT: Input[httpx.HTTPStatusError] -> Exception
# @RELATION: [CALLS] ->[AsyncAPIClient._is_dashboard_endpoint]
# @RELATION: [DEPENDS_ON] ->[DashboardNotFoundError]
# @RELATION: [DEPENDS_ON] ->[SupersetAPIError]
# @RELATION: [DEPENDS_ON] ->[PermissionDeniedError]
# @RELATION: [DEPENDS_ON] ->[AuthenticationError]
# @RELATION: [DEPENDS_ON] ->[NetworkError]
def _handle_http_error(self, exc: httpx.HTTPStatusError, endpoint: str) -> None:
with belief_scope("AsyncAPIClient._handle_http_error"):
status_code = exc.response.status_code
@@ -264,13 +271,14 @@ class AsyncAPIClient:
if normalized_endpoint.startswith("/api/v1/"):
normalized_endpoint = normalized_endpoint[len("/api/v1"):]
return normalized_endpoint.startswith("/dashboard/") or normalized_endpoint == "/dashboard"
# [/DEF:backend.src.core.utils.async_network.AsyncAPIClient._is_dashboard_endpoint:Function]
# [/DEF:AsyncAPIClient._is_dashboard_endpoint:Function]
# [DEF:backend.src.core.utils.async_network.AsyncAPIClient._handle_network_error:Function]
# [DEF:AsyncAPIClient._handle_network_error:Function]
# @COMPLEXITY: 3
# @PURPOSE: Translate generic httpx errors into NetworkError.
# @POST: Raises NetworkError with URL context.
# @DATA_CONTRACT: Input[httpx.HTTPError] -> NetworkError
# @RELATION: [DEPENDS_ON] ->[NetworkError]
def _handle_network_error(self, exc: httpx.HTTPError, url: str) -> None:
with belief_scope("AsyncAPIClient._handle_network_error"):
if isinstance(exc, httpx.TimeoutException):
@@ -280,16 +288,17 @@ class AsyncAPIClient:
else:
message = f"Unknown network error: {exc}"
raise NetworkError(message, url=url) from exc
# [/DEF:backend.src.core.utils.async_network.AsyncAPIClient._handle_network_error:Function]
# [/DEF:AsyncAPIClient._handle_network_error:Function]
# [DEF:backend.src.core.utils.async_network.AsyncAPIClient.aclose:Function]
# [DEF:AsyncAPIClient.aclose:Function]
# @COMPLEXITY: 3
# @PURPOSE: Close underlying httpx client.
# @POST: Client resources are released.
# @SIDE_EFFECT: Closes network connections.
# @RELATION: [DEPENDS_ON] ->[AsyncAPIClient.__init__]
async def aclose(self) -> None:
await self._client.aclose()
# [/DEF:backend.src.core.utils.async_network.AsyncAPIClient.aclose:Function]
# [/DEF:backend.src.core.utils.async_network.AsyncAPIClient:Class]
# [/DEF:AsyncAPIClient.aclose:Function]
# [/DEF:AsyncAPIClient:Class]
# [/DEF:backend.src.core.utils.async_network:Module]
# [/DEF:AsyncNetworkModule:Module]
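The `@RELATION` annotations on `_handle_http_error` list the domain exceptions it depends on. A minimal sketch of that status-code translation, returning the exception instead of raising so it is easy to exercise; the exact status-to-exception mapping is not shown in this diff, so the branches below are assumptions consistent with the exception names.

```python
# Stub domain exceptions; the real classes live in the backend's error module.
class AuthenticationError(Exception): ...
class PermissionDeniedError(Exception): ...
class DashboardNotFoundError(Exception): ...
class SupersetAPIError(Exception): ...

def translate_http_error(status_code: int, endpoint: str, is_dashboard: bool) -> Exception:
    # Assumed mapping: 401 -> auth, 403 -> permission, 404 on dashboard
    # endpoints -> not-found, everything else -> generic API error.
    if status_code == 401:
        return AuthenticationError(f"Unauthorized for {endpoint}")
    if status_code == 403:
        return PermissionDeniedError(f"Forbidden for {endpoint}")
    if status_code == 404 and is_dashboard:
        return DashboardNotFoundError(f"Dashboard endpoint not found: {endpoint}")
    return SupersetAPIError(f"HTTP {status_code} for {endpoint}")
```

In the real client, `_is_dashboard_endpoint` supplies the `is_dashboard` flag after normalizing the `/api/v1/` prefix, as the hunk above shows.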

View File

@@ -111,6 +111,7 @@ class SupersetAuthCache:
return (str(base_url or "").strip(), username, bool(verify_ssl))
@classmethod
# [DEF:SupersetAuthCache.get:Function]
def get(cls, key: Tuple[str, str, bool]) -> Optional[Dict[str, str]]:
now = time.time()
with cls._lock:
@@ -129,8 +130,10 @@ class SupersetAuthCache:
"access_token": str(tokens.get("access_token") or ""),
"csrf_token": str(tokens.get("csrf_token") or ""),
}
# [/DEF:SupersetAuthCache.get:Function]
@classmethod
# [DEF:SupersetAuthCache.set:Function]
def set(cls, key: Tuple[str, str, bool], tokens: Dict[str, str], ttl_seconds: Optional[int] = None) -> None:
normalized_ttl = max(int(ttl_seconds or cls.TTL_SECONDS), 1)
with cls._lock:
@@ -141,6 +144,7 @@ class SupersetAuthCache:
},
"expires_at": time.time() + normalized_ttl,
}
# [/DEF:SupersetAuthCache.set:Function]
@classmethod
def invalidate(cls, key: Tuple[str, str, bool]) -> None:
@@ -156,7 +160,7 @@ class SupersetAuthCache:
class APIClient:
DEFAULT_TIMEOUT = 30
# [DEF:__init__:Function]
# [DEF:APIClient.__init__:Function]
# @PURPOSE: Initializes the API client with configuration, session, and logger.
# @PARAM: config (Dict[str, Any]) - Configuration.
# @PARAM: verify_ssl (bool) - Whether to verify SSL.

@@ -179,7 +183,7 @@ class APIClient:
)
self._authenticated = False
app_logger.info("[APIClient.__init__][Exit] APIClient initialized.")
# [/DEF:__init__:Function]
# [/DEF:APIClient.__init__:Function]
# [DEF:_init_session:Function]
# @PURPOSE: Creates and configures a `requests.Session` with retry logic.
@@ -261,6 +265,8 @@ class APIClient:
# @POST: `self._tokens` is populated and `self._authenticated` is set to `True`.
# @RETURN: Dict[str, str] - Dictionary with tokens.
# @THROW: AuthenticationError, NetworkError - on failures.
# @RELATION: [CALLS] ->[SupersetAuthCache.get]
# @RELATION: [CALLS] ->[SupersetAuthCache.set]
def authenticate(self) -> Dict[str, str]:
with belief_scope("authenticate"):
app_logger.info("[authenticate][Enter] Authenticating to %s", self.base_url)
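The `SupersetAuthCache` hunks above add `[DEF]` markers around a class-level, lock-guarded TTL cache keyed by `(base_url, username, verify_ssl)`. A self-contained sketch of that pattern; the default TTL here is an assumption, since the real `TTL_SECONDS` value is outside this hunk.

```python
import threading
import time
from typing import Any, Dict, Optional, Tuple

class TokenCache:
    """Minimal sketch of the SupersetAuthCache pattern: class-level state,
    a shared lock, and expiry checked lazily on read."""
    TTL_SECONDS = 300  # assumption: the real default is not shown in this hunk
    _lock = threading.Lock()
    _entries: Dict[Tuple[str, str, bool], Dict[str, Any]] = {}

    @classmethod
    def set(cls, key: Tuple[str, str, bool], tokens: Dict[str, str],
            ttl_seconds: Optional[int] = None) -> None:
        # Clamp the TTL to at least one second, as the hunk above does.
        normalized_ttl = max(int(ttl_seconds or cls.TTL_SECONDS), 1)
        with cls._lock:
            cls._entries[key] = {
                "tokens": dict(tokens),
                "expires_at": time.time() + normalized_ttl,
            }

    @classmethod
    def get(cls, key: Tuple[str, str, bool]) -> Optional[Dict[str, str]]:
        now = time.time()
        with cls._lock:
            entry = cls._entries.get(key)
            if entry is None or now >= entry["expires_at"]:
                cls._entries.pop(key, None)  # drop expired entries eagerly
                return None
            return dict(entry["tokens"])

key = ("https://superset.local", "bot", True)
TokenCache.set(key, {"access_token": "abc", "csrf_token": "xyz"})
```

Both the sync `APIClient.authenticate` and the async `AsyncAPIClient.authenticate` consult this cache before performing a network login, which is what the new `@RELATION: [CALLS]` annotations record.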

View File

@@ -224,13 +224,13 @@ class SupersetCompilationAdapter:
# @PURPOSE: Request preview compilation through explicit client support backed by real Superset endpoints only.
# @RELATION: [CALLS] ->[SupersetClient.compile_dataset_preview]
# @PRE: payload contains a valid dataset identifier and deterministic execution inputs for one preview attempt.
# @POST: returns one normalized upstream compilation response without endpoint guessing.
# @SIDE_EFFECT: issues one Superset chart-data request through the client.
# @POST: returns one normalized upstream compilation response including the chosen strategy metadata.
# @SIDE_EFFECT: issues one or more Superset preview requests through the client fallback chain.
# @DATA_CONTRACT: Input[PreviewCompilationPayload] -> Output[Dict[str,Any]]
def _request_superset_preview(self, payload: PreviewCompilationPayload) -> Dict[str, Any]:
try:
logger.reason(
"Attempting deterministic Superset preview compilation via chart/data",
"Attempting deterministic Superset preview compilation through supported endpoint strategies",
extra={
"dataset_id": payload.dataset_id,
"session_id": payload.session_id,
@@ -245,7 +245,7 @@ class SupersetCompilationAdapter:
)
except Exception as exc:
logger.explore(
"Superset preview compilation via chart/data failed",
"Superset preview compilation failed across supported endpoint strategies",
extra={
"dataset_id": payload.dataset_id,
"session_id": payload.session_id,
@@ -256,7 +256,7 @@ class SupersetCompilationAdapter:
normalized = self._normalize_preview_response(response)
if normalized is None:
raise RuntimeError("Superset chart/data compilation response could not be normalized")
raise RuntimeError("Superset preview compilation response could not be normalized")
return normalized
# [/DEF:SupersetCompilationAdapter._request_superset_preview:Function]
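The updated contract replaces the single chart/data call with a "supported endpoint strategies" fallback chain. A sketch of that shape — try each strategy in order, return the first response that normalizes, otherwise raise — with illustrative strategy callables, not the adapter's real ones.

```python
from typing import Any, Callable, Dict, List, Optional

def compile_preview(
    strategies: List[Callable[[], Dict[str, Any]]],
    normalize: Callable[[Dict[str, Any]], Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    # Walk the fallback chain; remember the last failure for error chaining.
    last_error: Optional[Exception] = None
    for strategy in strategies:
        try:
            normalized = normalize(strategy())
        except Exception as exc:
            last_error = exc
            continue
        if normalized is not None:
            return normalized
    raise RuntimeError(
        "Superset preview compilation response could not be normalized"
    ) from last_error
```

The adapter's version additionally logs each attempt (`logger.reason` / `logger.explore`), which the hunk above updates to speak of strategies rather than a single chart/data endpoint.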

View File

@@ -16,6 +16,7 @@ from __future__ import annotations
# [DEF:SupersetContextExtractor.imports:Block]
import json
import re
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set
from urllib.parse import parse_qs, unquote, urlparse
@@ -128,6 +129,14 @@ class SupersetContextExtractor:
if isinstance(permalink_state, dict):
for key, value in permalink_state.items():
query_state.setdefault(key, value)
# Extract filters from permalink dataMask
data_mask = permalink_state.get("dataMask")
if isinstance(data_mask, dict) and data_mask:
query_state["dataMask"] = data_mask
logger.reason(
"Extracted native filters from permalink dataMask",
extra={"filter_count": len(data_mask)},
)
resolved_dashboard_id = self._extract_dashboard_id_from_state(permalink_state)
resolved_chart_id = self._extract_chart_id_from_state(permalink_state)
if resolved_dashboard_id is not None:
@@ -182,10 +191,44 @@ class SupersetContextExtractor:
"Resolving dashboard-bound dataset from Superset",
extra={"dashboard_ref": resolved_dashboard_ref},
)
# Resolve dashboard detail first — handles both numeric ID and slug,
# ensuring dashboard_id is available for the native_filters_key fetch below.
dashboard_detail = self.client.get_dashboard_detail(resolved_dashboard_ref)
resolved_dashboard_id = dashboard_detail.get("id")
if resolved_dashboard_id is not None:
dashboard_id = int(resolved_dashboard_id)
# Check for native_filters_key in query params and fetch filter state.
# This must run AFTER dashboard_id is resolved from slug above.
native_filters_key = query_params.get("native_filters_key", [None])[0]
if native_filters_key and dashboard_id is not None:
try:
logger.reason(
"Fetching native filter state from Superset",
extra={"dashboard_id": dashboard_id, "filter_key": native_filters_key},
)
extracted = self.client.extract_native_filters_from_key(
dashboard_id, native_filters_key
)
data_mask = extracted.get("dataMask")
if isinstance(data_mask, dict) and data_mask:
query_state["native_filter_state"] = data_mask
logger.reason(
"Extracted native filter state from Superset via native_filters_key",
extra={"filter_count": len(data_mask)},
)
else:
logger.explore(
"Native filter state returned empty dataMask",
extra={"dashboard_id": dashboard_id, "filter_key": native_filters_key},
)
except Exception as exc:
logger.explore(
"Failed to fetch native filter state from Superset",
extra={"dashboard_id": dashboard_id, "filter_key": native_filters_key, "error": str(exc)},
)
datasets = dashboard_detail.get("datasets") or []
if datasets:
first_dataset = datasets[0]
@@ -287,6 +330,114 @@ class SupersetContextExtractor:
with belief_scope("SupersetContextExtractor.recover_imported_filters"):
recovered_filters: List[Dict[str, Any]] = []
seen_filter_keys: Set[str] = set()
metadata_filters: List[Dict[str, Any]] = []
metadata_filters_by_id: Dict[str, Dict[str, Any]] = {}
def merge_recovered_filter(candidate: Dict[str, Any]) -> None:
filter_key = candidate["filter_name"].strip().lower()
existing_index = next(
(
index
for index, existing in enumerate(recovered_filters)
if existing["filter_name"].strip().lower() == filter_key
),
None,
)
if existing_index is None:
seen_filter_keys.add(filter_key)
recovered_filters.append(candidate)
return
existing = recovered_filters[existing_index]
if existing.get("display_name") in {None, "", existing.get("filter_name")} and candidate.get("display_name"):
existing["display_name"] = candidate["display_name"]
if existing.get("raw_value") is None and candidate.get("raw_value") is not None:
existing["raw_value"] = candidate["raw_value"]
existing["confidence_state"] = candidate.get("confidence_state", "imported")
existing["requires_confirmation"] = candidate.get("requires_confirmation", False)
existing["recovery_status"] = candidate.get("recovery_status", "recovered")
existing["source"] = candidate.get("source", existing.get("source"))
if existing.get("normalized_value") is None and candidate.get("normalized_value") is not None:
existing["normalized_value"] = deepcopy(candidate["normalized_value"])
if existing.get("notes") and candidate.get("notes") and candidate["notes"] not in existing["notes"]:
existing["notes"] = f'{existing["notes"]}; {candidate["notes"]}'
if parsed_context.dashboard_id is not None:
try:
dashboard_payload = self.client.get_dashboard(parsed_context.dashboard_id)
dashboard_record = (
dashboard_payload.get("result", dashboard_payload)
if isinstance(dashboard_payload, dict)
else {}
)
json_metadata = dashboard_record.get("json_metadata")
if isinstance(json_metadata, str) and json_metadata.strip():
json_metadata = json.loads(json_metadata)
if not isinstance(json_metadata, dict):
json_metadata = {}
native_filter_configuration = json_metadata.get("native_filter_configuration") or []
default_filters = json_metadata.get("default_filters") or {}
if isinstance(default_filters, str) and default_filters.strip():
try:
default_filters = json.loads(default_filters)
except Exception:
logger.explore(
"Superset default_filters payload was not valid JSON",
extra={"dashboard_id": parsed_context.dashboard_id},
)
default_filters = {}
for item in native_filter_configuration:
if not isinstance(item, dict):
continue
filter_name = str(
item.get("name")
or item.get("filter_name")
or item.get("column")
or ""
).strip()
if not filter_name:
continue
display_name = item.get("label") or item.get("name") or filter_name
filter_id = str(item.get("id") or "").strip()
default_value = None
if isinstance(default_filters, dict):
default_value = default_filters.get(filter_name)
metadata_filter = self._normalize_imported_filter_payload(
{
"filter_name": filter_name,
"display_name": display_name,
"raw_value": default_value,
"source": "superset_native",
"recovery_status": "recovered" if default_value is not None else "partial",
"requires_confirmation": default_value is None,
"notes": "Recovered from Superset dashboard native filter configuration",
},
default_source="superset_native",
default_note="Recovered from Superset dashboard native filter configuration",
)
metadata_filters.append(metadata_filter)
if filter_id:
metadata_filters_by_id[filter_id.lower()] = {
"filter_name": filter_name,
"display_name": display_name,
}
except Exception as exc:
logger.explore(
"Dashboard native filter enrichment failed; preserving partial imported filters",
extra={
"dashboard_id": parsed_context.dashboard_id,
"error": str(exc),
"filter_count": len(recovered_filters),
},
)
metadata_filters = []
metadata_filters_by_id = {}
for item in parsed_context.imported_filters:
normalized = self._normalize_imported_filter_payload(
@@ -294,11 +445,24 @@ class SupersetContextExtractor:
default_source="superset_url",
default_note="Recovered from Superset URL state",
)
filter_key = normalized["filter_name"].strip().lower()
if filter_key in seen_filter_keys:
continue
seen_filter_keys.add(filter_key)
recovered_filters.append(normalized)
metadata_match = metadata_filters_by_id.get(normalized["filter_name"].strip().lower())
if metadata_match is not None:
normalized["filter_name"] = metadata_match["filter_name"]
normalized["display_name"] = metadata_match["display_name"]
normalized["notes"] = (
"Recovered from Superset URL state and reconciled against dashboard native filter metadata"
)
merge_recovered_filter(normalized)
logger.reflect(
"Recovered filter from URL state",
extra={
"filter_name": normalized["filter_name"],
"source": normalized["source"],
"has_value": normalized["raw_value"] is not None,
"canonicalized": metadata_match is not None,
},
)
if parsed_context.dashboard_id is None:
logger.reflect(
@@ -311,108 +475,48 @@ class SupersetContextExtractor:
)
return recovered_filters
try:
dashboard_payload = self.client.get_dashboard(parsed_context.dashboard_id)
dashboard_record = (
dashboard_payload.get("result", dashboard_payload)
if isinstance(dashboard_payload, dict)
else {}
for saved_filter in metadata_filters:
merge_recovered_filter(saved_filter)
logger.reflect(
"Recovered filter from dashboard metadata",
extra={
"filter_name": saved_filter["filter_name"],
"has_value": saved_filter["raw_value"] is not None,
},
)
json_metadata = dashboard_record.get("json_metadata")
if isinstance(json_metadata, str) and json_metadata.strip():
json_metadata = json.loads(json_metadata)
if not isinstance(json_metadata, dict):
json_metadata = {}
native_filter_configuration = json_metadata.get("native_filter_configuration") or []
default_filters = json_metadata.get("default_filters") or {}
if isinstance(default_filters, str) and default_filters.strip():
try:
default_filters = json.loads(default_filters)
except Exception:
logger.explore(
"Superset default_filters payload was not valid JSON",
extra={"dashboard_id": parsed_context.dashboard_id},
)
default_filters = {}
for item in native_filter_configuration:
if not isinstance(item, dict):
continue
filter_name = str(
item.get("name")
or item.get("filter_name")
or item.get("column")
or ""
).strip()
if not filter_name:
continue
filter_key = filter_name.lower()
if filter_key in seen_filter_keys:
continue
default_value = None
if isinstance(default_filters, dict):
default_value = default_filters.get(filter_name)
saved_filter = self._normalize_imported_filter_payload(
if not recovered_filters:
recovered_filters.append(
self._normalize_imported_filter_payload(
{
"filter_name": filter_name,
"display_name": item.get("label") or item.get("name"),
"raw_value": default_value,
"filter_name": f"dashboard_{parsed_context.dashboard_id}_filters",
"display_name": "Dashboard native filters",
"raw_value": None,
"source": "superset_native",
"recovery_status": "recovered" if default_value is not None else "partial",
"requires_confirmation": default_value is None,
"notes": "Recovered from Superset dashboard native filter configuration",
"recovery_status": "partial",
"requires_confirmation": True,
"notes": "Superset dashboard filter configuration could not be recovered fully",
},
default_source="superset_native",
default_note="Recovered from Superset dashboard native filter configuration",
default_note="Superset dashboard filter configuration could not be recovered fully",
)
seen_filter_keys.add(filter_key)
recovered_filters.append(saved_filter)
)
logger.reflect(
"Imported filter recovery completed with dashboard enrichment",
extra={
"dashboard_id": parsed_context.dashboard_id,
"filter_count": len(recovered_filters),
"partial_entries": len(
[
item
for item in recovered_filters
if item["recovery_status"] == "partial"
]
),
},
)
return recovered_filters
except Exception as exc:
logger.explore(
"Dashboard native filter enrichment failed; preserving partial imported filters",
extra={
"dashboard_id": parsed_context.dashboard_id,
"error": str(exc),
"filter_count": len(recovered_filters),
},
)
if not recovered_filters:
recovered_filters.append(
self._normalize_imported_filter_payload(
{
"filter_name": f"dashboard_{parsed_context.dashboard_id}_filters",
"display_name": "Dashboard native filters",
"raw_value": None,
"source": "superset_native",
"recovery_status": "partial",
"requires_confirmation": True,
"notes": "Superset dashboard filter configuration could not be recovered fully",
},
default_source="superset_native",
default_note="Superset dashboard filter configuration could not be recovered fully",
)
)
return recovered_filters
logger.reflect(
"Imported filter recovery completed with dashboard enrichment",
extra={
"dashboard_id": parsed_context.dashboard_id,
"filter_count": len(recovered_filters),
"partial_entries": len(
[
item
for item in recovered_filters
if item["recovery_status"] == "partial"
]
),
},
)
return recovered_filters
# [/DEF:SupersetContextExtractor.recover_imported_filters:Function]
# [DEF:SupersetContextExtractor.discover_template_variables:Function]
@@ -692,11 +796,23 @@ class SupersetContextExtractor:
or item.get("name")
or f"native_filter_{index}"
)
direct_clause = None
if item.get("column") and ("value" in item or "val" in item):
direct_clause = {
"col": item.get("column"),
"op": item.get("op") or ("IN" if isinstance(item.get("value"), list) else "=="),
"val": item.get("val", item.get("value")),
}
imported_filters.append(
{
"filter_name": str(filter_name),
"raw_value": item.get("value"),
"display_name": item.get("label") or item.get("name"),
"normalized_value": {
"filter_clauses": [direct_clause] if isinstance(direct_clause, dict) else [],
"extra_form_data": {},
"value_origin": "native_filters",
},
"source": "superset_url",
"recovery_status": "recovered"
if item.get("value") is not None
@@ -706,6 +822,7 @@ class SupersetContextExtractor:
}
)
# Extract filters from permalink dataMask
dashboard_data_mask = query_state.get("dataMask")
if isinstance(dashboard_data_mask, dict):
for filter_key, item in dashboard_data_mask.items():
@@ -715,20 +832,54 @@ class SupersetContextExtractor:
extra_form_data = item.get("extraFormData")
display_name = None
raw_value = None
normalized_value = {
"filter_clauses": [],
"extra_form_data": deepcopy(extra_form_data) if isinstance(extra_form_data, dict) else {},
"value_origin": "unresolved",
}
# Try to get value from filterState
if isinstance(filter_state, dict):
display_name = filter_state.get("label")
raw_value = filter_state.get("value")
if raw_value is None and isinstance(extra_form_data, dict):
# Superset filterState uses 'value' for single values, 'values' for multi-select
raw_value = filter_state.get("value") or filter_state.get("values")
if raw_value is not None:
normalized_value["value_origin"] = "filter_state"
# Preserve exact Superset clauses from extraFormData.filters
if isinstance(extra_form_data, dict):
extra_filters = extra_form_data.get("filters")
if isinstance(extra_filters, list) and extra_filters:
first_filter = extra_filters[0]
if isinstance(first_filter, dict):
raw_value = first_filter.get("val")
if isinstance(extra_filters, list):
normalized_value["filter_clauses"] = [
deepcopy(extra_filter)
for extra_filter in extra_filters
if isinstance(extra_filter, dict)
]
# If no value found, try extraFormData.filters
if raw_value is None and normalized_value["filter_clauses"]:
first_filter = normalized_value["filter_clauses"][0]
raw_value = first_filter.get("val")
if raw_value is None:
raw_value = first_filter.get("value")
if raw_value is not None:
normalized_value["value_origin"] = "extra_form_data.filters"
# If still no value, try extraFormData directly for time_range, time_grain, etc.
if raw_value is None and isinstance(extra_form_data, dict):
# Common Superset filter fields
for field in ["time_range", "time_grain_sqla", "time_column", "granularity"]:
if field in extra_form_data:
raw_value = extra_form_data[field]
normalized_value["value_origin"] = f"extra_form_data.{field}"
break
imported_filters.append(
{
"filter_name": str(item.get("id") or filter_key),
"raw_value": raw_value,
"display_name": display_name,
"normalized_value": normalized_value,
"source": "superset_permalink",
"recovery_status": "recovered" if raw_value is not None else "partial",
"requires_confirmation": raw_value is None,
@@ -736,6 +887,73 @@ class SupersetContextExtractor:
}
)
# Extract filters from native_filter_state (fetched from Superset via native_filters_key)
native_filter_state = query_state.get("native_filter_state")
if isinstance(native_filter_state, dict):
for filter_key, item in native_filter_state.items():
if not isinstance(item, dict):
continue
# Handle both single filter format and multi-filter format
filter_id = item.get("id") or filter_key
filter_state = item.get("filterState")
extra_form_data = item.get("extraFormData")
display_name = None
raw_value = None
normalized_value = {
"filter_clauses": [],
"extra_form_data": deepcopy(extra_form_data) if isinstance(extra_form_data, dict) else {},
"value_origin": "unresolved",
}
# Try to get value from filterState
if isinstance(filter_state, dict):
display_name = filter_state.get("label")
# Superset filterState uses 'value' for single values, 'values' for multi-select
raw_value = filter_state.get("value") or filter_state.get("values")
if raw_value is not None:
normalized_value["value_origin"] = "filter_state"
# Preserve exact Superset clauses from extraFormData.filters
if isinstance(extra_form_data, dict):
extra_filters = extra_form_data.get("filters")
if isinstance(extra_filters, list):
normalized_value["filter_clauses"] = [
deepcopy(extra_filter)
for extra_filter in extra_filters
if isinstance(extra_filter, dict)
]
# If no value found, try extraFormData.filters
if raw_value is None and normalized_value["filter_clauses"]:
first_filter = normalized_value["filter_clauses"][0]
raw_value = first_filter.get("val")
if raw_value is None:
raw_value = first_filter.get("value")
if raw_value is not None:
normalized_value["value_origin"] = "extra_form_data.filters"
# If still no value, try extraFormData directly for time_range, time_grain, etc.
if raw_value is None and isinstance(extra_form_data, dict):
# Common Superset filter fields
for field in ["time_range", "time_grain_sqla", "time_column", "granularity"]:
if field in extra_form_data:
raw_value = extra_form_data[field]
normalized_value["value_origin"] = f"extra_form_data.{field}"
break
imported_filters.append(
{
"filter_name": str(filter_id),
"raw_value": raw_value,
"display_name": display_name,
"normalized_value": normalized_value,
"source": "superset_native_filters_key",
"recovery_status": "recovered" if raw_value is not None else "partial",
"requires_confirmation": raw_value is None,
"notes": "Recovered from Superset native_filters_key state",
}
)
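The dataMask loops above resolve `raw_value` in a fixed order: `filterState.value`/`values` first, then the first clause in `extraFormData.filters`, then common time fields. A standalone rewrite of that resolution order for one entry, for illustration only:

```python
from typing import Any, Dict, Optional, Tuple

def resolve_raw_value(entry: Dict[str, Any]) -> Tuple[Optional[Any], str]:
    # Mirror the resolution order used per dataMask entry above.
    filter_state = entry.get("filterState")
    extra_form_data = entry.get("extraFormData")
    clauses = []
    if isinstance(extra_form_data, dict):
        clauses = [c for c in (extra_form_data.get("filters") or []) if isinstance(c, dict)]
    # 1) filterState: 'value' for single values, 'values' for multi-select.
    if isinstance(filter_state, dict):
        raw = filter_state.get("value") or filter_state.get("values")
        if raw is not None:
            return raw, "filter_state"
    # 2) first extraFormData.filters clause ('val', falling back to 'value').
    if clauses:
        raw = clauses[0].get("val", clauses[0].get("value"))
        if raw is not None:
            return raw, "extra_form_data.filters"
    # 3) common time fields set directly on extraFormData.
    if isinstance(extra_form_data, dict):
        for field in ("time_range", "time_grain_sqla", "time_column", "granularity"):
            if field in extra_form_data:
                return extra_form_data[field], f"extra_form_data.{field}"
    return None, "unresolved"
```

The second element of the tuple corresponds to the `value_origin` recorded in `normalized_value`, so downstream code can tell where a recovered value came from.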
form_data_payload = query_state.get("form_data")
if isinstance(form_data_payload, dict):
extra_filters = form_data_payload.get("extra_filters") or []
@@ -748,6 +966,11 @@ class SupersetContextExtractor:
"filter_name": str(filter_name),
"raw_value": item.get("val"),
"display_name": item.get("label"),
"normalized_value": {
"filter_clauses": [deepcopy(item)],
"extra_form_data": {},
"value_origin": "form_data.extra_filters",
},
"source": "superset_url",
"recovery_status": "recovered"
if item.get("val") is not None

View File

@@ -357,6 +357,8 @@ class SemanticCandidate(Base):
class FilterSource(str, enum.Enum):
SUPERSET_NATIVE = "superset_native"
SUPERSET_URL = "superset_url"
SUPERSET_PERMALINK = "superset_permalink"
SUPERSET_NATIVE_FILTERS_KEY = "superset_native_filters_key"
MANUAL = "manual"
INFERRED = "inferred"
# [/DEF:FilterSource:Class]

View File

@@ -0,0 +1,151 @@
# [DEF:backend.src.models.filter_state:Module]
#
# @COMPLEXITY: 2
# @SEMANTICS: superset, native, filters, pydantic, models, dataclasses
# @PURPOSE: Pydantic models for Superset native filter state extraction and restoration.
# @LAYER: Models
# @RELATION: [DEPENDS_ON] ->[pydantic]
# [SECTION: IMPORTS]
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, ConfigDict, Field
# [/SECTION]
# [DEF:FilterState:Model]
# @COMPLEXITY: 2
# @PURPOSE: Represents the state of a single native filter.
# @DATA_CONTRACT: Input[extraFormData: Dict, filterState: Dict, ownState: Optional[Dict]] -> Model[FilterState]
class FilterState(BaseModel):
"""Single native filter state with extraFormData, filterState, and ownState."""
model_config = ConfigDict(extra="allow")
extraFormData: Dict[str, Any] = Field(default_factory=dict, description="Extra form data for the filter")
filterState: Dict[str, Any] = Field(default_factory=dict, description="Current filter state")
ownState: Dict[str, Any] = Field(default_factory=dict, description="Own state of the filter")
# [/DEF:FilterState:Model]
# [DEF:NativeFilterDataMask:Model]
# @COMPLEXITY: 2
# @PURPOSE: Represents the dataMask containing all native filter states.
# @DATA_CONTRACT: Input[Dict[filter_id, FilterState]] -> Model[NativeFilterDataMask]
class NativeFilterDataMask(BaseModel):
"""Container for all native filter states in a dashboard."""
model_config = ConfigDict(extra="allow")
filters: Dict[str, Any] = Field(default_factory=dict, description="Map of filter ID to filter state data")
def get_filter_ids(self) -> List[str]:
"""Return list of all filter IDs."""
return list(self.filters.keys())
def get_extra_form_data(self, filter_id: str) -> Dict[str, Any]:
"""Get extraFormData for a specific filter; entries may be FilterState objects or raw dicts."""
filter_state = self.filters.get(filter_id)
if isinstance(filter_state, FilterState):
return filter_state.extraFormData
if isinstance(filter_state, dict):
return filter_state.get("extraFormData", {})
return {}
# [/DEF:NativeFilterDataMask:Model]
# [DEF:ParsedNativeFilters:Model]
# @COMPLEXITY: 2
# @PURPOSE: Result of parsing native filters from permalink or native_filters_key.
# @DATA_CONTRACT: Input[dataMask: Dict, metadata: Dict] -> Model[ParsedNativeFilters]
class ParsedNativeFilters(BaseModel):
"""Result of extracting native filters from a Superset URL."""
model_config = ConfigDict(extra="allow")
dataMask: Dict[str, Any] = Field(default_factory=dict, description="Extracted dataMask from filters")
filter_type: Optional[str] = Field(default=None, description="Type of filter: permalink, native_filters_key, or native_filters")
dashboard_id: Optional[str] = Field(default=None, description="Dashboard ID if available")
permalink_key: Optional[str] = Field(default=None, description="Permalink key if used")
filter_state_key: Optional[str] = Field(default=None, description="Filter state key if used")
active_tabs: List[str] = Field(default_factory=list, description="Active tabs in dashboard")
anchor: Optional[str] = Field(default=None, description="Anchor position in dashboard")
chart_states: Dict[str, Any] = Field(default_factory=dict, description="Chart states in dashboard")
def has_filters(self) -> bool:
"""Check if any filters were extracted."""
return bool(self.dataMask)
def get_filter_count(self) -> int:
"""Get the number of filters extracted."""
return len(self.dataMask)
# [/DEF:ParsedNativeFilters:Model]
# [DEF:DashboardURLFilterExtraction:Model]
# @COMPLEXITY: 2
# @PURPOSE: Result of parsing a complete dashboard URL for filter information.
# @DATA_CONTRACT: Input[url: str, dashboard_id: Optional, filter_type: Optional, filters: Dict] -> Model[DashboardURLFilterExtraction]
class DashboardURLFilterExtraction(BaseModel):
"""Result of parsing a Superset dashboard URL to extract filter state."""
model_config = ConfigDict(extra="allow")
url: str = Field(..., description="Original dashboard URL")
dashboard_id: Optional[str] = Field(default=None, description="Extracted dashboard ID")
filter_type: Optional[str] = Field(default=None, description="Type of filter found")
filters: ParsedNativeFilters = Field(default_factory=ParsedNativeFilters, description="Extracted filter data")
success: bool = Field(default=True, description="Whether extraction was successful")
error: Optional[str] = Field(default=None, description="Error message if extraction failed")
# [/DEF:DashboardURLFilterExtraction:Model]
# [DEF:ExtraFormDataMerge:Model]
# @COMPLEXITY: 2
# @PURPOSE: Configuration for merging extraFormData from different sources.
# @DATA_CONTRACT: Input[append_keys: List[str], override_keys: List[str]] -> Model[ExtraFormDataMerge]
class ExtraFormDataMerge(BaseModel):
"""Configuration for merging extraFormData between original and new filter values."""
# Keys that should be appended (arrays, filters)
append_keys: List[str] = Field(
default_factory=lambda: ["filters", "extras", "columns", "metrics"],
description="Keys that should be merged by appending"
)
# Keys that should be overridden (single values)
override_keys: List[str] = Field(
default_factory=lambda: ["time_range", "time_grain_sqla", "time_column", "granularity"],
description="Keys that should be overridden by new values"
)
def merge(self, original: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
"""
Merge two extraFormData dictionaries.
@param original: Original extraFormData from dashboard metadata
@param new: New extraFormData from URL/permalink
@return: Merged extraFormData dictionary
"""
result = {}
# Start with original
for key, value in original.items():
result[key] = value
# Apply overrides and appends from new
for key, new_value in new.items():
if key in self.override_keys:
# Override the value
result[key] = new_value
elif key in self.append_keys:
# Append to the existing value
existing = result.get(key)
if isinstance(existing, list) and isinstance(new_value, list):
result[key] = existing + new_value
else:
result[key] = new_value
else:
result[key] = new_value
return result
# [/DEF:ExtraFormDataMerge:Model]
# [/DEF:backend.src.models.filter_state:Module]
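A usage sketch of the `ExtraFormDataMerge` semantics defined above, written as a standalone function so the example runs without the backend package; the key lists are copied from the model's defaults.

```python
from typing import Any, Dict

# Defaults copied from ExtraFormDataMerge above.
APPEND_KEYS = ["filters", "extras", "columns", "metrics"]
OVERRIDE_KEYS = ["time_range", "time_grain_sqla", "time_column", "granularity"]

def merge_extra_form_data(original: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    result = dict(original)  # start from the dashboard-metadata payload
    for key, new_value in new.items():
        existing = result.get(key)
        if key in APPEND_KEYS and isinstance(existing, list) and isinstance(new_value, list):
            result[key] = existing + new_value  # list keys are concatenated
        else:
            result[key] = new_value  # override keys and unknown keys take the new value
    return result

original = {"filters": [{"col": "region", "op": "==", "val": "EU"}], "time_range": "Last week"}
new = {"filters": [{"col": "status", "op": "==", "val": "active"}], "time_range": "Last month"}
merged = merge_extra_form_data(original, new)
```

Here `merged["filters"]` contains both clauses while `time_range` is overridden, matching the append-vs-override split the model encodes.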

View File

@@ -798,6 +798,7 @@ class DatasetReviewOrchestrator:
approved_mapping_ids: List[str] = []
open_warning_refs: List[str] = []
preview_blockers: List[str] = []
mapped_filter_ids: set[str] = set()
for mapping in session.execution_mappings:
imported_filter = filter_lookup.get(mapping.filter_id)
@@ -821,23 +822,46 @@ class DatasetReviewOrchestrator:
preview_blockers.append(f"variable:{template_variable.variable_name}:missing_required_value")
continue
effective_filters.append(
{
"mapping_id": mapping.mapping_id,
"filter_id": imported_filter.filter_id,
"filter_name": imported_filter.filter_name,
"variable_id": template_variable.variable_id,
"variable_name": template_variable.variable_name,
"effective_value": effective_value,
"raw_input_value": mapping.raw_input_value,
}
)
mapped_filter_ids.add(imported_filter.filter_id)
if effective_value is not None:
effective_filters.append(
{
"mapping_id": mapping.mapping_id,
"filter_id": imported_filter.filter_id,
"filter_name": imported_filter.filter_name,
"display_name": imported_filter.display_name,
"variable_id": template_variable.variable_id,
"variable_name": template_variable.variable_name,
"effective_value": effective_value,
"raw_input_value": mapping.raw_input_value,
"normalized_filter_payload": imported_filter.normalized_value,
}
)
template_params[template_variable.variable_name] = effective_value
if mapping.approval_state == ApprovalState.APPROVED:
approved_mapping_ids.append(mapping.mapping_id)
if mapping.requires_explicit_approval and mapping.approval_state != ApprovalState.APPROVED:
open_warning_refs.append(mapping.mapping_id)
for imported_filter in session.imported_filters:
if imported_filter.filter_id in mapped_filter_ids:
continue
effective_value = imported_filter.normalized_value
if effective_value is None:
effective_value = imported_filter.raw_value
if effective_value is None:
continue
effective_filters.append(
{
"filter_id": imported_filter.filter_id,
"filter_name": imported_filter.filter_name,
"display_name": imported_filter.display_name,
"effective_value": effective_value,
"raw_input_value": imported_filter.raw_value,
"normalized_filter_payload": imported_filter.normalized_value,
}
)
mapped_variable_ids = {mapping.variable_id for mapping in session.execution_mappings}
for variable in session.template_variables:
if variable.variable_id in mapped_variable_ids:

View File

@@ -196,6 +196,33 @@ async function postApi(endpoint, body) {
}
// [/DEF:postApi:Function]
// [DEF:deleteApi:Function]
// @PURPOSE: Generic DELETE request wrapper.
// @PRE: endpoint is provided.
// @POST: Returns Promise resolving to JSON data or throws on error.
// @PARAM: endpoint (string) - API endpoint.
// @RETURN: Promise<any> - JSON response.
async function deleteApi(endpoint) {
try {
console.log(`[api.deleteApi][Action] Deleting from context={{'endpoint': '${endpoint}'}}`);
const response = await fetch(`${API_BASE_URL}${endpoint}`, {
method: 'DELETE',
headers: getAuthHeaders(),
});
console.log(`[api.deleteApi][Action] Received response context={{'status': ${response.status}, 'ok': ${response.ok}}}`);
if (!response.ok) {
throw await buildApiError(response);
}
if (response.status === 204) return null;
return await response.json();
} catch (error) {
console.error(`[api.deleteApi][Coherence:Failed] Error deleting from ${endpoint}:`, error);
notifyApiError(error);
throw error;
}
}
// [/DEF:deleteApi:Function]
// [DEF:requestApi:Function]
// @PURPOSE: Generic request wrapper.
// @PRE: endpoint and method are provided.
@@ -237,6 +264,7 @@ async function requestApi(endpoint, method = 'GET', body = null) {
export const api = {
fetchApi,
postApi,
deleteApi,
requestApi,
getPlugins: () => fetchApi('/plugins'),
getTasks: (options = {}) => {

View File

@@ -91,11 +91,7 @@
return $t.dataset_review?.preview?.stale_body;
}
if (effectiveState === "failed") {
return (
preview?.error_details ||
preview?.error_code ||
$t.dataset_review?.preview?.error_body
);
return $t.dataset_review?.preview?.error_body;
}
if (effectiveState === "missing") {
return $t.dataset_review?.preview?.missing_body;
@@ -103,6 +99,12 @@
return $t.dataset_review?.preview?.ready_body;
}
function buildPreviewTechnicalDetails() {
return [preview?.error_code, preview?.error_details].filter(Boolean).join("\n\n");
}
const previewTechnicalDetails = $derived(buildPreviewTechnicalDetails());
async function requestPreview() {
if (!sessionId || disabled || localStatus === "saving") {
return;

View File

@@ -217,9 +217,18 @@
<div class="mt-4 space-y-3">
{#each launchBlockers as blocker}
<div class="rounded-xl border border-red-200 bg-white p-3">
<div class="text-sm font-medium text-slate-900">{blocker.label}</div>
<div class="text-sm font-medium leading-6 text-slate-900 break-words [overflow-wrap:anywhere]">
{blocker.label}
</div>
{#if blocker.detail}
<div class="mt-1 break-all text-xs text-slate-600">{blocker.detail}</div>
<div class="mt-2 max-h-28 overflow-auto rounded-lg border border-slate-200 bg-slate-50 px-3 py-2">
<div
data-testid="launch-blocker-detail"
class="text-xs leading-5 text-slate-600 whitespace-pre-wrap break-words [overflow-wrap:anywhere]"
>
{blocker.detail}
</div>
</div>
{/if}
<button
type="button"
@@ -242,7 +251,7 @@
<div class="text-xs uppercase tracking-wide text-slate-500">
{$t.dataset_review?.launch?.dataset_ref_label}
</div>
<div class="mt-1 text-sm font-medium text-slate-900">
<div class="mt-1 text-sm font-medium text-slate-900 break-words [overflow-wrap:anywhere]">
{session?.dataset_ref || ($t.common?.unknown || "unknown")}
</div>
</div>

View File

@@ -81,6 +81,18 @@
const normalized = String(action || "");
return $t.dataset_review?.workspace?.actions?.[normalized] || normalized;
}
function getFindingMessage(finding) {
return String(finding?.message || "").trim();
}
function getFindingTechnicalReference(finding) {
return String(finding?.caused_by_ref || "").trim();
}
function getFindingResolutionNote(finding) {
return String(finding?.resolution_note || "").trim();
}
</script>
<div class="rounded-2xl border border-slate-200 bg-white p-5 shadow-sm">

View File

@@ -654,6 +654,25 @@
"audit": "Audit"
}
},
"workspace_entry": {
"resume_eyebrow": "Resume session",
"resume_title": "Continue an existing review session",
"resume_description": "If a dataset review was already started, open the relevant session and continue from its latest saved state.",
"resume_available_badge": "Sessions available",
"resume_loading": "Loading existing sessions...",
"resume_load_failed": "Failed to load existing review sessions.",
"resume_empty_title": "No saved sessions yet",
"resume_empty_body": "Start a new review session from the source intake panel to make it available here for reopening later.",
"resume_action": "Resume",
"session_id_label": "Session ID",
"dataset_ref_label": "Dataset reference",
"environment_id_label": "Environment",
"status_label": "Status",
"readiness_label": "Readiness",
"phase_label": "Phase",
"updated_at_label": "Updated",
"last_activity_at_label": "Last activity"
},
"workspace": {
"eyebrow": "Dataset orchestration",
"title": "Dataset review workspace",

View File

@@ -652,6 +652,25 @@
"audit": "Аудит"
}
},
"workspace_entry": {
"resume_eyebrow": "Возобновление сессии",
"resume_title": "Продолжить существующую review-сессию",
"resume_description": "Если review уже была начата, откройте нужную сессию и продолжайте с последнего сохраненного состояния.",
"resume_available_badge": "Доступно сессий",
"resume_loading": "Загрузка доступных сессий...",
"resume_load_failed": "Не удалось загрузить существующие review-сессии.",
"resume_empty_title": "Сохраненных сессий пока нет",
"resume_empty_body": "Запустите новую review-сессию через блок источника справа, чтобы она появилась здесь для повторного открытия.",
"resume_action": "Возобновить",
"session_id_label": "ID сессии",
"dataset_ref_label": "Ссылка на датасет",
"environment_id_label": "Окружение",
"status_label": "Статус",
"readiness_label": "Готовность",
"phase_label": "Фаза",
"updated_at_label": "Обновлено",
"last_activity_at_label": "Последняя активность"
},
"workspace": {
"eyebrow": "Оркестрация датасета",
"title": "Workspace review датасета",

View File

@@ -3,14 +3,17 @@
<!-- @SEMANTICS: dataset-review, workspace-entry, source-intake, session-bootstrap -->
<!-- @PURPOSE: Entry route for Dataset Review Workspace that allows starting a new resumable review session before navigating to a specific session id route. -->
<!-- @LAYER: UI -->
<!-- @RELATION: [CALLS] ->[fetchApi] -->
<!-- @RELATION: [CALLS] ->[postApi] -->
<!-- @RELATION: [CALLS] ->[deleteApi] -->
<!-- @RELATION: [BINDS_TO] ->[SourceIntakePanel] -->
<!-- @RELATION: [BINDS_TO] ->[environmentContext] -->
<!-- @UX_STATE: Empty -> Show source intake for Superset link or dataset reference. -->
<!-- @UX_STATE: ResumeList -> Show existing resumable sessions with direct navigation CTA. -->
<!-- @UX_STATE: Submitting -> Disable controls and show startup feedback. -->
<!-- @UX_STATE: Error -> Inline error shown while keeping intake values editable. -->
<!-- @UX_RECOVERY: Users can correct invalid input in place and retry without losing environment selection. -->
<script>
import { goto } from "$app/navigation";
import { fromStore } from "svelte/store";
import SourceIntakePanel from "$lib/components/dataset-review/SourceIntakePanel.svelte";
import { t } from "$lib/i18n";
@@ -26,20 +29,117 @@
let isSubmitting = $state(false);
let submitError = $state("");
let intakeAcknowledgment = $state("");
let sessions = $state([]);
let sessionsLoading = $state(false);
let sessionsError = $state("");
let searchQuery = $state("");
let currentPage = $state(1);
let pageSize = $state(5);
let totalSessions = $state(0);
let deleteConfirmSessionId = $state(null);
const environments = $derived(environmentContextState.current?.environments || []);
const selectedEnvironmentId = $derived(
environmentContextState.current?.selectedEnvId || "",
);
const hasExistingSessions = $derived(sessions.length > 0);
const totalPages = $derived(totalSessions > 0 ? Math.ceil(totalSessions / pageSize) : 1);
const filteredSessions = $derived.by(() => {
if (!searchQuery.trim()) return sessions;
const query = searchQuery.toLowerCase();
return sessions.filter(s =>
(s.session_id?.toLowerCase().includes(query)) ||
(s.dataset_ref?.toLowerCase().includes(query)) ||
(s.environment_id?.toLowerCase().includes(query))
);
});
function buildSessionUrl(sessionId) {
return `/datasets/review/${encodeURIComponent(String(sessionId))}`;
}
function formatDateTime(value) {
if (!value) {
return $t.common?.not_available || "N/A";
}
const parsed = new Date(value);
if (Number.isNaN(parsed.getTime())) {
return String(value);
}
return parsed.toLocaleString();
}
function getSessionMetaValue(session, key, fallback = "") {
const value = session?.[key];
return value ? String(value) : fallback;
}
async function loadExistingSessions() {
sessionsLoading = true;
sessionsError = "";
try {
const params = new URLSearchParams();
params.append("page", String(currentPage));
params.append("page_size", String(pageSize));
if (searchQuery.trim()) {
params.append("search", searchQuery.trim());
}
const response = await api.fetchApi(`/dataset-orchestration/sessions?${params.toString()}`);
sessions = Array.isArray(response?.items) ? response.items : [];
totalSessions = response?.total || sessions.length;
} catch (error) {
sessions = [];
totalSessions = 0;
sessionsError =
error?.message ||
$t.dataset_review?.workspace_entry?.resume_load_failed ||
$t.common?.error;
} finally {
sessionsLoading = false;
}
}
async function deleteSession(sessionId, hardDelete = false) {
try {
await api.deleteApi(`/dataset-orchestration/sessions/${sessionId}?hard_delete=${hardDelete}`);
deleteConfirmSessionId = null;
// Reload current page, or go to previous if this was the last item
if (sessions.length === 1 && currentPage > 1) {
currentPage = currentPage - 1;
}
await loadExistingSessions();
} catch (error) {
sessionsError = error?.message || $t.common?.error;
}
}
function goToPage(page) {
if (page >= 1 && page <= totalPages) {
currentPage = page;
loadExistingSessions();
}
}
function handleSearch() {
currentPage = 1;
loadExistingSessions();
}
function truncateText(text, maxLength = 30) {
if (!text || text.length <= maxLength) return text || "";
return text.slice(0, maxLength) + "...";
}
async function bootstrap() {
isBootstrapping = true;
try {
await initializeEnvironmentContext();
await loadExistingSessions();
} finally {
isBootstrapping = false;
}
@@ -58,7 +158,7 @@
if (!summary?.session_id) {
throw new Error($t.dataset_review?.source?.submit_failed || "Failed to start review");
}
window.location.href = buildSessionUrl(summary.session_id);
await goto(buildSessionUrl(summary.session_id));
} catch (error) {
submitError =
error?.message ||
@@ -91,6 +191,11 @@
<span class="rounded-full bg-slate-100 px-3 py-1 text-xs font-medium text-slate-700">
{$t.dataset_review?.workspace?.state_label}: {$t.dataset_review?.workspace?.state?.empty || "Empty"}
</span>
{#if hasExistingSessions}
<span class="rounded-full bg-blue-50 px-3 py-1 text-xs font-medium text-blue-700">
{$t.dataset_review?.workspace_entry?.resume_available_badge}: {sessions.length}
</span>
{/if}
</div>
</div>
@@ -99,19 +204,303 @@
{$t.dataset_review?.workspace?.loading}
</div>
{:else}
<SourceIntakePanel
environments={environments}
selectedEnvironmentId={selectedEnvironmentId}
submitting={isSubmitting}
acknowledgment={intakeAcknowledgment}
onsubmit={handleSourceSubmit}
/>
<div class="grid gap-5 xl:grid-cols-[minmax(0,1fr)_minmax(0,1.2fr)]">
<section class="rounded-2xl border border-slate-200 bg-white p-5 shadow-sm">
<div class="flex flex-col gap-3 sm:flex-row sm:items-start sm:justify-between">
<div>
<p class="text-xs font-semibold uppercase tracking-wide text-blue-700">
{$t.dataset_review?.workspace_entry?.resume_eyebrow}
</p>
<h2 class="text-xl font-semibold text-slate-900">
{$t.dataset_review?.workspace_entry?.resume_title}
</h2>
<p class="mt-1 max-w-2xl text-sm text-slate-600">
{$t.dataset_review?.workspace_entry?.resume_description}
</p>
</div>
{#if submitError}
<div class="rounded-2xl border border-red-200 bg-red-50 px-4 py-4 text-sm text-red-700">
{submitError}
<button
type="button"
class="inline-flex items-center justify-center rounded-xl border border-slate-300 bg-white px-4 py-2 text-sm font-medium text-slate-700 transition hover:bg-slate-50 disabled:cursor-not-allowed disabled:opacity-50"
onclick={loadExistingSessions}
disabled={sessionsLoading}
>
{#if sessionsLoading}
<svg class="mr-2 h-4 w-4 animate-spin" viewBox="0 0 24 24" fill="none">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
</svg>
{$t.dataset_review?.workspace_entry?.resume_loading}
{:else}
<svg class="mr-2 h-4 w-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15"></path>
</svg>
{$t.common?.refresh}
{/if}
</button>
</div>
<!-- Search Bar -->
<div class="mt-4 flex gap-2">
<div class="relative flex-1">
<div class="pointer-events-none absolute inset-y-0 left-0 flex items-center pl-3">
<svg class="h-5 w-5 text-slate-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M21 21l-6-6m2-5a7 7 0 11-14 0 7 7 0 0114 0z"></path>
</svg>
</div>
<input
type="text"
class="w-full rounded-lg border border-slate-300 bg-white py-2.5 pl-10 pr-4 text-sm text-slate-900 placeholder-slate-400 focus:border-blue-500 focus:outline-none focus:ring-1 focus:ring-blue-500"
placeholder="Search by ID, dataset, or environment..."
bind:value={searchQuery}
onkeydown={(e) => e.key === 'Enter' && handleSearch()}
/>
</div>
<button
type="button"
class="inline-flex items-center justify-center rounded-lg bg-blue-600 px-4 py-2.5 text-sm font-medium text-white transition hover:bg-blue-700"
onclick={handleSearch}
>
Search
</button>
{#if searchQuery}
<button
type="button"
class="inline-flex items-center justify-center rounded-lg border border-slate-300 bg-white px-4 py-2.5 text-sm font-medium text-slate-700 transition hover:bg-slate-50"
onclick={() => { searchQuery = ""; handleSearch(); }}
>
Clear
</button>
{/if}
</div>
{#if sessionsLoading}
<div class="mt-5 rounded-xl border border-slate-200 bg-slate-50 px-4 py-4 text-sm text-slate-600">
{$t.dataset_review?.workspace_entry?.resume_loading}
</div>
{:else if sessionsError}
<div class="mt-5 rounded-xl border border-red-200 bg-red-50 px-4 py-4 text-sm text-red-700">
<div>{sessionsError}</div>
<button
type="button"
class="mt-3 inline-flex items-center justify-center rounded-lg border border-red-300 bg-white px-3 py-1.5 text-sm font-medium text-red-700 transition hover:bg-red-50"
onclick={loadExistingSessions}
>
{$t.common?.retry}
</button>
</div>
{:else if !hasExistingSessions}
<div class="mt-5 rounded-xl border border-dashed border-slate-300 bg-slate-50 px-4 py-4 text-sm text-slate-600">
<div class="font-medium text-slate-900">
{$t.dataset_review?.workspace_entry?.resume_empty_title}
</div>
<p class="mt-1">
{$t.dataset_review?.workspace_entry?.resume_empty_body}
</p>
</div>
{:else}
<div class="mt-5 space-y-3">
{#each sessions as session}
{@const status = getSessionMetaValue(session, "status", "unknown")}
<article class="group relative rounded-xl border border-slate-200 bg-white p-4 shadow-sm transition hover:shadow-md">
<!-- Delete Confirmation Modal -->
{#if deleteConfirmSessionId === session.session_id}
<div class="absolute inset-0 z-10 flex items-center justify-center rounded-xl bg-white/95 backdrop-blur-sm">
<div class="text-center p-4">
<p class="text-sm font-medium text-slate-900 mb-3">Delete this session?</p>
<div class="flex gap-2 justify-center">
<button
type="button"
class="inline-flex items-center justify-center rounded-lg bg-red-600 px-3 py-1.5 text-xs font-medium text-white transition hover:bg-red-700"
onclick={() => deleteSession(session.session_id, false)}
>
Archive
</button>
<button
type="button"
class="inline-flex items-center justify-center rounded-lg border border-red-300 bg-white px-3 py-1.5 text-xs font-medium text-red-700 transition hover:bg-red-50"
onclick={() => deleteSession(session.session_id, true)}
>
Delete permanently
</button>
<button
type="button"
class="inline-flex items-center justify-center rounded-lg border border-slate-300 bg-white px-3 py-1.5 text-xs font-medium text-slate-700 transition hover:bg-slate-50"
onclick={() => deleteConfirmSessionId = null}
>
Cancel
</button>
</div>
</div>
</div>
{/if}
<div class="flex flex-col gap-3 lg:flex-row lg:items-start lg:justify-between">
<div class="min-w-0 flex-1">
<div class="flex flex-wrap items-center gap-2">
<h3 class="text-sm font-semibold text-slate-900" title={session.session_id}>
{truncateText(session.session_id, 20)}
</h3>
<span class="rounded-full px-2.5 py-1 text-xs font-medium"
class:bg-green-100={status === "active"}
class:text-green-700={status === "active"}
class:bg-yellow-100={status === "paused"}
class:text-yellow-700={status === "paused"}
class:bg-slate-100={status !== "active" && status !== "paused"}
class:text-slate-700={status !== "active" && status !== "paused"}
>
{status}
</span>
<span class="rounded-full bg-blue-50 px-2.5 py-1 text-xs font-medium text-blue-700">
{getSessionMetaValue(session, "readiness_state", $t.common?.unknown || "unknown")}
</span>
<span class="rounded-full bg-purple-50 px-2.5 py-1 text-xs font-medium text-purple-700">
{getSessionMetaValue(session, "current_phase", $t.common?.unknown || "unknown")}
</span>
</div>
<div class="mt-3 grid gap-3 sm:grid-cols-2">
<div class="rounded-lg bg-slate-50 px-3 py-2">
<div class="text-xs uppercase tracking-wide text-slate-500">
{$t.dataset_review?.workspace_entry?.dataset_ref_label}
</div>
<div class="mt-1 text-sm font-medium text-slate-900" title={session.dataset_ref}>
{truncateText(session.dataset_ref, 35) || $t.common?.not_available || "N/A"}
</div>
</div>
<div class="rounded-lg bg-slate-50 px-3 py-2">
<div class="text-xs uppercase tracking-wide text-slate-500">
{$t.dataset_review?.workspace_entry?.environment_id_label}
</div>
<div class="mt-1 text-sm font-medium text-slate-900">
{session.environment_id || $t.common?.not_available || "N/A"}
</div>
</div>
<div class="rounded-lg bg-slate-50 px-3 py-2">
<div class="text-xs uppercase tracking-wide text-slate-500">
{$t.dataset_review?.workspace_entry?.updated_at_label}
</div>
<div class="mt-1 text-sm font-medium text-slate-900">
{formatDateTime(session.updated_at)}
</div>
</div>
<div class="rounded-lg bg-slate-50 px-3 py-2">
<div class="text-xs uppercase tracking-wide text-slate-500">
{$t.dataset_review?.workspace_entry?.last_activity_at_label}
</div>
<div class="mt-1 text-sm font-medium text-slate-900">
{formatDateTime(session.last_activity_at)}
</div>
</div>
</div>
</div>
<div class="flex flex-col gap-2 lg:pl-4">
<a
class="inline-flex items-center justify-center rounded-xl bg-blue-600 px-4 py-2 text-sm font-medium text-white transition hover:bg-blue-700"
href={buildSessionUrl(session.session_id)}
>
<svg class="mr-2 h-4 w-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M14.752 11.168l-3.197-2.132A1 1 0 0010 9.87v4.263a1 1 0 001.555.832l3.197-2.132a1 1 0 000-1.664z"></path>
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M21 12a9 9 0 11-18 0 9 9 0 0118 0z"></path>
</svg>
{$t.dataset_review?.workspace_entry?.resume_action}
</a>
<button
type="button"
class="inline-flex items-center justify-center rounded-xl border border-red-200 bg-white px-4 py-2 text-sm font-medium text-red-600 transition hover:bg-red-50"
onclick={() => deleteConfirmSessionId = session.session_id}
>
<svg class="mr-2 h-4 w-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16"></path>
</svg>
Delete
</button>
</div>
</div>
</article>
{/each}
</div>
<!-- Pagination -->
{#if totalPages > 1}
<div class="mt-6 flex items-center justify-between border-t border-slate-200 pt-4">
<div class="text-sm text-slate-600">
Page {currentPage} of {totalPages} ({totalSessions} total)
</div>
<div class="flex gap-2">
<button
type="button"
class="inline-flex items-center justify-center rounded-lg border border-slate-300 bg-white px-3 py-2 text-sm font-medium text-slate-700 transition hover:bg-slate-50 disabled:opacity-50 disabled:cursor-not-allowed"
onclick={() => goToPage(currentPage - 1)}
disabled={currentPage <= 1}
>
<svg class="mr-1 h-4 w-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 19l-7-7 7-7"></path>
</svg>
Previous
</button>
<div class="flex gap-1">
{#each Array.from({length: Math.min(5, totalPages)}, (_, i) => {
let start = Math.max(1, currentPage - 2);
let end = Math.min(totalPages, start + 4);
if (end - start < 4) start = Math.max(1, end - 4);
return start + i;
}).filter(p => p <= totalPages) as pageNum}
<button
type="button"
class="inline-flex h-9 w-9 items-center justify-center rounded-lg text-sm font-medium transition"
class:bg-blue-600={pageNum === currentPage}
class:text-white={pageNum === currentPage}
class:bg-white={pageNum !== currentPage}
class:text-slate-700={pageNum !== currentPage}
class:border={pageNum !== currentPage}
class:border-slate-300={pageNum !== currentPage}
class:hover:bg-slate-50={pageNum !== currentPage}
onclick={() => goToPage(pageNum)}
>
{pageNum}
</button>
{/each}
</div>
<button
type="button"
class="inline-flex items-center justify-center rounded-lg border border-slate-300 bg-white px-3 py-2 text-sm font-medium text-slate-700 transition hover:bg-slate-50 disabled:opacity-50 disabled:cursor-not-allowed"
onclick={() => goToPage(currentPage + 1)}
disabled={currentPage >= totalPages}
>
Next
<svg class="ml-1 h-4 w-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 5l7 7-7 7"></path>
</svg>
</button>
</div>
</div>
{/if}
{/if}
</section>
<div class="space-y-5">
<SourceIntakePanel
environments={environments}
selectedEnvironmentId={selectedEnvironmentId}
submitting={isSubmitting}
acknowledgment={intakeAcknowledgment}
onsubmit={handleSourceSubmit}
/>
{#if submitError}
<div class="rounded-2xl border border-red-200 bg-red-50 px-4 py-4 text-sm text-red-700">
{submitError}
</div>
{/if}
</div>
{/if}
</div>
{/if}
</div>
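The entry page's pagination math (the five-button window around the current page, and the step-back after deleting the last item on a page) is easy to get wrong by one. A standalone Python sketch of both rules as implemented above, for verification only:

```python
def page_window(current: int, total: int, width: int = 5) -> list:
    """Up to `width` page buttons around the current page, clamped to [1, total]."""
    length = min(width, total)
    start = max(1, current - 2)
    end = min(total, start + width - 1)
    if end - start < width - 1:
        start = max(1, end - (width - 1))
    return [p for p in range(start, start + length) if p <= total]

def page_after_delete(items_on_page: int, current: int) -> int:
    """Deleting the only item on a page (other than page 1) steps back one page."""
    if items_on_page == 1 and current > 1:
        return current - 1
    return current
```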

View File

@@ -533,8 +533,8 @@
bootstrapWorkspace();
</script>
<div class="mx-auto w-full max-w-7xl space-y-5 px-4 py-6">
<div class="flex flex-col gap-3 lg:flex-row lg:items-start lg:justify-between">
<div class="mx-auto w-full max-w-[96rem] space-y-6 px-4 py-6 xl:px-6">
<div class="flex flex-col gap-4 lg:flex-row lg:items-start lg:justify-between">
<div>
<p class="text-xs font-semibold uppercase tracking-[0.2em] text-blue-700">
{$t.dataset_review?.workspace?.eyebrow}
@@ -647,7 +647,10 @@
</div>
{/if}
{:else}
<div class="grid gap-5 xl:grid-cols-[minmax(18rem,0.8fr)_minmax(0,1.4fr)_minmax(18rem,0.9fr)]">
<div
data-testid="workspace-detail-grid"
class="grid items-start gap-6 xl:grid-cols-[minmax(16rem,0.72fr)_minmax(0,1.7fr)_minmax(22rem,1.08fr)]"
>
<div class="space-y-5">
<section
id="intake"
@@ -661,7 +664,10 @@
<div class="text-xs uppercase tracking-wide text-slate-500">
{$t.dataset_review?.workspace?.source_label}
</div>
<div class="mt-1 text-sm font-medium text-slate-900">
<div
data-testid="workspace-source-value"
class="mt-1 max-w-full text-sm font-medium leading-6 text-slate-900 break-words [overflow-wrap:anywhere]"
>
{session.source_input || session.dataset_ref}
</div>
</div>

View File

@@ -487,7 +487,7 @@ describe("DatasetReviewWorkspace UX Contract", () => {
await waitFor(() => {
expect(screen.getByText("Start dataset review")).toBeDefined();
});
expect(screen.getByText("Workspace state: Empty")).toBeDefined();
expect(screen.getByText("Workspace state: empty")).toBeDefined();
expect(screen.getByText("Paste link or provide dataset reference.")).toBeDefined();
});
@@ -507,7 +507,7 @@ describe("DatasetReviewWorkspace UX Contract", () => {
expect(screen.getByText("Business summary")).toBeDefined();
});
expect(screen.getByText("Workspace state: Review")).toBeDefined();
expect(screen.getByText("Workspace state: review")).toBeDefined();
expect(screen.getByText("Readiness: Partially ready")).toBeDefined();
expect(screen.getAllByText("Sales Dataset").length).toBeGreaterThan(0);
expect(screen.getAllByText("Recovered filters").length).toBeGreaterThan(0);
@@ -605,7 +605,7 @@ describe("DatasetReviewWorkspace UX Contract", () => {
});
await waitFor(() => {
expect(screen.getByText("Workspace state: Importing")).toBeDefined();
expect(screen.getByText("Workspace state: importing")).toBeDefined();
});
expect(screen.getByText("Import progress")).toBeDefined();
expect(screen.getByText("Recovering dataset context progressively")).toBeDefined();
@@ -619,7 +619,7 @@ describe("DatasetReviewWorkspace UX Contract", () => {
});
await waitFor(() => {
expect(screen.getByText("Workspace state: Review")).toBeDefined();
expect(screen.getByText("Workspace state: review")).toBeDefined();
});
expect(screen.getByText("Readiness: Partially ready")).toBeDefined();
expect(screen.getAllByText("Review documentation").length).toBeGreaterThan(0);
@@ -656,11 +656,76 @@ describe("DatasetReviewWorkspace UX Contract", () => {
expect(screen.getAllByText("documentation • json • artifacts/doc.json").length).toBeGreaterThan(0);
});
const exportFeedbackText = screen.getAllByText(
"documentation • json • artifacts/doc.json",
)[0];
expect(exportFeedbackText.textContent).toContain("artifacts/doc.json");
await fireEvent.click(documentationMarkdownButton);
await waitFor(() => {
expect(screen.getAllByText("Export unavailable").length).toBeGreaterThan(0);
});
});
it("detail_layout_and_long_diagnostics_render_without_unbounded_text", async () => {
routeState.id = "session-1";
api.fetchApi
.mockResolvedValueOnce(
createSessionDetail({
source_input:
"https://superset.local/dashboard/10?native_filters=" +
"x".repeat(180),
findings: [
{
finding_id: "finding-1",
title: "Unresolved semantic reference chain",
code: "SEM-LONG-1",
area: "dataset_profile",
severity: "blocking",
message:
"Missing semantic resolution for " +
"very.long.reference.".repeat(18),
resolution_state: "open",
caused_by_ref: "semantic.layer.".repeat(14),
resolution_note:
"Raw backend trace: " + "resolver::unbound_column::".repeat(18),
},
],
previews: [
{
preview_id: "preview-1",
preview_status: "failed",
compiled_by: "superset",
error_code: "PREVIEW_COMPILE_FAILED",
error_details:
"Compilation traceback: " + "missing_template_value::".repeat(24),
preview_fingerprint: "fingerprint-" + "abc123".repeat(12),
},
],
}),
)
.mockResolvedValueOnce(createClarificationState());
render(DatasetReviewWorkspace);
await waitFor(() => {
expect(screen.getByTestId("workspace-detail-grid")).toBeDefined();
});
const layoutGrid = screen.getByTestId("workspace-detail-grid");
expect(layoutGrid.className).toContain("xl:grid-cols-[minmax(16rem,0.72fr)_minmax(0,1.7fr)_minmax(22rem,1.08fr)]");
const sourceValue = screen.getByTestId("workspace-source-value");
expect(sourceValue.className).toContain("break-words");
expect(screen.getAllByText(/PREVIEW_COMPILE_FAILED|missing_template_value/).length).toBeGreaterThan(0);
const blockerDetails = screen.getAllByTestId("launch-blocker-detail");
expect(
blockerDetails.some((item) => item.textContent?.includes("missing_template_value")),
).toBe(true);
expect(blockerDetails[0].className).toContain("whitespace-pre-wrap");
});
});
// [/DEF:DatasetReviewWorkspaceUxTests:Module]

View File

@@ -0,0 +1,288 @@
/**
* @vitest-environment jsdom
*/
// @ts-nocheck
// [DEF:DatasetReviewEntryUxTests:Module]
// @COMPLEXITY: 3
// @SEMANTICS: dataset-review, workspace-entry, resume, source-intake, ux-tests
// @PURPOSE: Verify dataset review entry route exposes resumable sessions alongside the new session intake flow.
// @LAYER: UI
// @RELATION: [VERIFIES] ->[DatasetReviewWorkspaceEntry:Page]
// @UX_STATE: Loading -> workspace loader is shown before bootstrap completes.
// @UX_STATE: ResumeList -> existing sessions render with summary fields and resume links.
// @UX_STATE: ResumeEmpty -> empty-state copy is shown when no sessions are returned.
// @UX_STATE: ResumeError -> inline error and retry action remain visible without removing new session flow.
// @TEST_SCENARIO: renders_resume_sessions_and_new_session_intake
// @TEST_SCENARIO: renders_empty_resume_state_when_no_sessions_exist
// @TEST_SCENARIO: renders_resume_error_and_retry
// @TEST_SCENARIO: submits_new_session_without_regression
import { beforeEach, describe, expect, it, vi } from "vitest";
import { fireEvent, render, screen, waitFor } from "@testing-library/svelte";
import DatasetReviewEntryPage from "../+page.svelte";
import { api } from "$lib/api.js";
const mockedGoto = vi.mocked(await import("$app/navigation")).goto;
function createSessionSummary(overrides = {}) {
return {
session_id: "session-1",
user_id: "user-1",
environment_id: "env-1",
source_kind: "superset_link",
source_input: "https://superset.local/dashboard/10",
dataset_ref: "public.sales",
dataset_id: 101,
readiness_state: "partially_ready",
recommended_action: "resume_session",
status: "paused",
current_phase: "review",
created_at: "2026-03-18T06:00:00.000Z",
updated_at: "2026-03-18T06:10:00.000Z",
last_activity_at: "2026-03-18T06:15:00.000Z",
...overrides,
};
}
vi.mock("$app/navigation", () => ({
goto: vi.fn(),
}));
vi.mock("$lib/i18n", () => ({
t: {
subscribe: (fn) => {
fn({
common: {
error: "Common error",
refresh: "Refresh",
retry: "Retry",
choose_environment: "Choose environment",
not_available: "N/A",
},
dataset_review: {
source: {
eyebrow: "Source intake",
title: "Start dataset review",
description: "Paste link or provide dataset reference.",
state_idle: "Idle",
state_validating: "Validating",
state_rejected: "Rejected",
environment_label: "Environment",
environment_required: "Environment is required",
superset_link_tab: "Superset link",
superset_link_tab_hint: "Paste dashboard or explore URL",
dataset_selection_tab: "Dataset selection",
dataset_selection_tab_hint: "Enter dataset ref",
superset_link_label: "Superset link",
dataset_selection_label: "Dataset reference",
superset_link_placeholder: "https://superset.local/dashboard/10",
dataset_selection_placeholder: "public.sales",
superset_link_hint: "Paste a full Superset URL",
dataset_selection_hint: "Provide schema.dataset reference",
recognized_link_hint: "Recognized Superset link",
superset_link_required: "Superset link is required",
dataset_selection_required: "Dataset reference is required",
superset_link_invalid: "Superset link must start with http",
submit_failed: "Submit failed",
superset_link_recovery_note: "You can fix the link inline",
dataset_selection_recovery_note: "You can fix the dataset inline",
submitting: "Submitting",
submit_superset_link: "Start from link",
submit_dataset_selection: "Start from dataset",
dataset_selection_acknowledged: "Dataset selection acknowledged",
},
workspace_entry: {
resume_eyebrow: "Resume session",
resume_title: "Continue an existing review session",
resume_description:
"If a dataset review was already started, open the relevant session and continue from its latest saved state.",
resume_available_badge: "Sessions available",
resume_loading: "Loading existing sessions...",
resume_load_failed: "Failed to load existing review sessions.",
resume_empty_title: "No saved sessions yet",
resume_empty_body:
"Start a new review session from the source intake panel to make it available here for reopening later.",
resume_action: "Resume",
session_id_label: "Session ID",
dataset_ref_label: "Dataset reference",
environment_id_label: "Environment",
status_label: "Status",
readiness_label: "Readiness",
phase_label: "Phase",
updated_at_label: "Updated",
last_activity_at_label: "Last activity",
},
workspace: {
eyebrow: "Dataset review",
title: "Dataset review workspace",
description: "Review imported dataset context.",
state_label: "Workspace state",
loading: "Loading workspace",
state: {
empty: "Empty",
},
},
},
});
return () => {};
},
},
}));
vi.mock("$lib/api.js", () => ({
api: {
fetchApi: vi.fn(),
postApi: vi.fn(),
requestApi: vi.fn(),
},
}));
vi.mock("$lib/stores/environmentContext.js", () => ({
environmentContextStore: {
subscribe: (run) => {
run({
environments: [{ id: "env-1", name: "DEV" }],
selectedEnvId: "env-1",
});
return () => {};
},
},
initializeEnvironmentContext: vi.fn().mockResolvedValue(undefined),
}));
describe("DatasetReviewWorkspaceEntry UX Contract", () => {
beforeEach(() => {
vi.clearAllMocks();
api.fetchApi.mockReset();
api.postApi.mockReset();
mockedGoto.mockReset();
});
it("renders_resume_sessions_and_new_session_intake", async () => {
api.fetchApi.mockResolvedValue({
items: [
createSessionSummary(),
createSessionSummary({
session_id: "session-2",
dataset_ref: "analytics.daily_margin",
environment_id: "env-2",
status: "active",
readiness_state: "review_ready",
current_phase: "semantic_review",
}),
],
total: 2,
page: 1,
page_size: 20,
has_next: false,
});
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(api.fetchApi).toHaveBeenCalledWith("/dataset-orchestration/sessions");
});
expect(screen.getByText("Continue an existing review session")).toBeDefined();
expect(screen.getByText("Start dataset review")).toBeDefined();
expect(screen.getByText("Sessions available: 2")).toBeDefined();
expect(screen.getByText("Session ID: session-1")).toBeDefined();
expect(screen.getByText("public.sales")).toBeDefined();
expect(screen.getByText("analytics.daily_margin")).toBeDefined();
const resumeLinks = screen.getAllByRole("link", { name: "Resume" });
expect(resumeLinks).toHaveLength(2);
expect(resumeLinks[0].getAttribute("href")).toBe("/datasets/review/session-1");
});
it("renders_empty_resume_state_when_no_sessions_exist", async () => {
api.fetchApi.mockResolvedValue({
items: [],
total: 0,
page: 1,
page_size: 20,
has_next: false,
});
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(screen.getByText("No saved sessions yet")).toBeDefined();
});
expect(
screen.getByText(
"Start a new review session from the source intake panel to make it available here for reopening later.",
),
).toBeDefined();
expect(screen.getByText("Start dataset review")).toBeDefined();
});
it("renders_resume_error_and_retry", async () => {
api.fetchApi.mockRejectedValueOnce(new Error("Session list failed"));
api.fetchApi.mockResolvedValueOnce({
items: [createSessionSummary()],
total: 1,
page: 1,
page_size: 20,
has_next: false,
});
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(screen.getByText("Session list failed")).toBeDefined();
});
await fireEvent.click(screen.getByRole("button", { name: "Retry" }));
await waitFor(() => {
expect(api.fetchApi).toHaveBeenCalledTimes(2);
});
await waitFor(() => {
expect(screen.getByText("Session ID: session-1")).toBeDefined();
});
expect(screen.getByText("Start dataset review")).toBeDefined();
});
it("submits_new_session_without_regression", async () => {
api.fetchApi.mockResolvedValue({
items: [createSessionSummary()],
total: 1,
page: 1,
page_size: 20,
has_next: false,
});
api.postApi.mockResolvedValue({ session_id: "created-session" });
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(screen.getByText("Start dataset review")).toBeDefined();
});
const environmentSelect = screen.getByRole("combobox");
const sourceInput = screen.getByPlaceholderText("https://superset.local/dashboard/10");
const submitButton = screen.getByRole("button", { name: "Start from link" });
await fireEvent.change(environmentSelect, { target: { value: "env-1" } });
await fireEvent.input(sourceInput, {
target: { value: "https://superset.local/dashboard/10" },
});
await fireEvent.click(submitButton);
await waitFor(() => {
expect(api.postApi).toHaveBeenCalledWith("/dataset-orchestration/sessions", {
source_kind: "superset_link",
source_input: "https://superset.local/dashboard/10",
environment_id: "env-1",
});
});
expect(mockedGoto).toHaveBeenCalledWith("/datasets/review/created-session");
});
});
// [/DEF:DatasetReviewEntryUxTests:Module]


@@ -0,0 +1,282 @@
/**
* @vitest-environment jsdom
*/
// @ts-nocheck
// [DEF:DatasetReviewEntryUxTests:Module]
// @COMPLEXITY: 3
// @SEMANTICS: dataset-review, workspace-entry, resume, source-intake, ux-tests
// @PURPOSE: Verify dataset review entry route exposes resumable sessions alongside the new session intake flow.
// @LAYER: UI
// @RELATION: [VERIFIES] ->[DatasetReviewWorkspaceEntry:Page]
// @UX_STATE: Loading -> workspace loader is shown before bootstrap completes.
// @UX_STATE: ResumeList -> existing sessions render with summary fields and resume links.
// @UX_STATE: ResumeEmpty -> empty-state copy is shown when no sessions are returned.
// @UX_STATE: ResumeError -> inline error and retry action remain visible without removing new session flow.
// @TEST_SCENARIO: renders_resume_sessions_and_new_session_intake
// @TEST_SCENARIO: renders_empty_resume_state_when_no_sessions_exist
// @TEST_SCENARIO: renders_resume_error_and_retry
// @TEST_SCENARIO: submits_new_session_without_regression
import { beforeEach, describe, expect, it, vi } from "vitest";
import { fireEvent, render, screen, waitFor } from "@testing-library/svelte";
import DatasetReviewEntryPage from "../+page.svelte";
import { api } from "$lib/api.js";
function createSessionSummary(overrides = {}) {
return {
session_id: "session-1",
user_id: "user-1",
environment_id: "env-1",
source_kind: "superset_link",
source_input: "https://superset.local/dashboard/10",
dataset_ref: "public.sales",
dataset_id: 101,
readiness_state: "partially_ready",
recommended_action: "resume_session",
status: "paused",
current_phase: "review",
created_at: "2026-03-18T06:00:00.000Z",
updated_at: "2026-03-18T06:10:00.000Z",
last_activity_at: "2026-03-18T06:15:00.000Z",
...overrides,
};
}
vi.mock("$lib/i18n", () => ({
t: {
subscribe: (fn) => {
fn({
common: {
error: "Common error",
refresh: "Refresh",
retry: "Retry",
choose_environment: "Choose environment",
not_available: "N/A",
},
dataset_review: {
source: {
eyebrow: "Source intake",
title: "Start dataset review",
description: "Paste link or provide dataset reference.",
state_idle: "Idle",
state_validating: "Validating",
state_rejected: "Rejected",
environment_label: "Environment",
environment_required: "Environment is required",
superset_link_tab: "Superset link",
superset_link_tab_hint: "Paste dashboard or explore URL",
dataset_selection_tab: "Dataset selection",
dataset_selection_tab_hint: "Enter dataset ref",
superset_link_label: "Superset link",
dataset_selection_label: "Dataset reference",
superset_link_placeholder: "https://superset.local/dashboard/10",
dataset_selection_placeholder: "public.sales",
superset_link_hint: "Paste a full Superset URL",
dataset_selection_hint: "Provide schema.dataset reference",
recognized_link_hint: "Recognized Superset link",
superset_link_required: "Superset link is required",
dataset_selection_required: "Dataset reference is required",
superset_link_invalid: "Superset link must start with http",
submit_failed: "Submit failed",
superset_link_recovery_note: "You can fix the link inline",
dataset_selection_recovery_note: "You can fix the dataset inline",
submitting: "Submitting",
submit_superset_link: "Start from link",
submit_dataset_selection: "Start from dataset",
dataset_selection_acknowledged: "Dataset selection acknowledged",
},
workspace_entry: {
resume_eyebrow: "Resume session",
resume_title: "Continue an existing review session",
resume_description:
"If a dataset review was already started, open the relevant session and continue from its latest saved state.",
resume_available_badge: "Sessions available",
resume_loading: "Loading existing sessions...",
resume_load_failed: "Failed to load existing review sessions.",
resume_empty_title: "No saved sessions yet",
resume_empty_body:
"Start a new review session from the source intake panel to make it available here for reopening later.",
resume_action: "Resume",
session_id_label: "Session ID",
dataset_ref_label: "Dataset reference",
environment_id_label: "Environment",
status_label: "Status",
readiness_label: "Readiness",
phase_label: "Phase",
updated_at_label: "Updated",
last_activity_at_label: "Last activity",
},
workspace: {
eyebrow: "Dataset review",
title: "Dataset review workspace",
description: "Review imported dataset context.",
state_label: "Workspace state",
loading: "Loading workspace",
state: {
empty: "Empty",
},
},
},
});
return () => {};
},
},
}));
vi.mock("$lib/api.js", () => ({
api: {
fetchApi: vi.fn(),
postApi: vi.fn(),
requestApi: vi.fn(),
},
}));
vi.mock("$lib/stores/environmentContext.js", () => ({
environmentContextStore: {
subscribe: (run) => {
run({
environments: [{ id: "env-1", name: "DEV" }],
selectedEnvId: "env-1",
});
return () => {};
},
},
initializeEnvironmentContext: vi.fn().mockResolvedValue(undefined),
}));
describe("DatasetReviewWorkspaceEntry UX Contract", () => {
beforeEach(() => {
vi.clearAllMocks();
api.fetchApi.mockReset();
api.postApi.mockReset();
// jsdom does not implement full navigation, so assigning window.location.href
// directly is a no-op; reset the URL through the History API instead.
window.history.replaceState(null, "", "/");
});
it("renders_resume_sessions_and_new_session_intake", async () => {
api.fetchApi.mockResolvedValue({
items: [
createSessionSummary(),
createSessionSummary({
session_id: "session-2",
dataset_ref: "analytics.daily_margin",
environment_id: "env-2",
status: "active",
readiness_state: "review_ready",
current_phase: "semantic_review",
}),
],
total: 2,
page: 1,
page_size: 20,
has_next: false,
});
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(api.fetchApi).toHaveBeenCalledWith("/dataset-orchestration/sessions");
});
expect(screen.getByText("Continue an existing review session")).toBeDefined();
expect(screen.getByText("Start dataset review")).toBeDefined();
expect(screen.getByText("Sessions available: 2")).toBeDefined();
expect(screen.getByText("Session ID: session-1")).toBeDefined();
expect(screen.getByText("public.sales")).toBeDefined();
expect(screen.getByText("analytics.daily_margin")).toBeDefined();
const resumeLinks = screen.getAllByRole("link", { name: "Resume" });
expect(resumeLinks).toHaveLength(2);
expect(resumeLinks[0].getAttribute("href")).toBe("/datasets/review/session-1");
});
it("renders_empty_resume_state_when_no_sessions_exist", async () => {
api.fetchApi.mockResolvedValue({
items: [],
total: 0,
page: 1,
page_size: 20,
has_next: false,
});
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(screen.getByText("No saved sessions yet")).toBeDefined();
});
expect(
screen.getByText(
"Start a new review session from the source intake panel to make it available here for reopening later.",
),
).toBeDefined();
expect(screen.getByText("Start dataset review")).toBeDefined();
});
it("renders_resume_error_and_retry", async () => {
api.fetchApi.mockRejectedValueOnce(new Error("Session list failed"));
api.fetchApi.mockResolvedValueOnce({
items: [createSessionSummary()],
total: 1,
page: 1,
page_size: 20,
has_next: false,
});
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(screen.getByText("Session list failed")).toBeDefined();
});
await fireEvent.click(screen.getByRole("button", { name: "Retry" }));
await waitFor(() => {
expect(api.fetchApi).toHaveBeenCalledTimes(2);
});
await waitFor(() => {
expect(screen.getByText("Session ID: session-1")).toBeDefined();
});
expect(screen.getByText("Start dataset review")).toBeDefined();
});
it("submits_new_session_without_regression", async () => {
api.fetchApi.mockResolvedValue({
items: [createSessionSummary()],
total: 1,
page: 1,
page_size: 20,
has_next: false,
});
api.postApi.mockResolvedValue({ session_id: "created-session" });
render(DatasetReviewEntryPage);
await waitFor(() => {
expect(screen.getByText("Start dataset review")).toBeDefined();
});
const environmentSelect = screen.getByRole("combobox");
const sourceInput = screen.getByPlaceholderText("https://superset.local/dashboard/10");
const submitButton = screen.getByRole("button", { name: "Start from link" });
await fireEvent.change(environmentSelect, { target: { value: "env-1" } });
await fireEvent.input(sourceInput, {
target: { value: "https://superset.local/dashboard/10" },
});
await fireEvent.click(submitButton);
await waitFor(() => {
expect(api.postApi).toHaveBeenCalledWith("/dataset-orchestration/sessions", {
source_kind: "superset_link",
source_input: "https://superset.local/dashboard/10",
environment_id: "env-1",
});
});
// jsdom reports location.href as an absolute URL, so assert on the pathname.
expect(window.location.pathname).toBe("/datasets/review/created-session");
});
});
// [/DEF:DatasetReviewEntryUxTests:Module]

kilo.json Normal file

@@ -0,0 +1,27 @@
{
"$schema": "https://app.kilo.ai/config.json",
"agent": {
"subagent-orchestrator": {
"description": "Fast, user-facing dispatcher that routes requests only to approved project subagents. Use it as a lightweight orchestrator when you want subagent-only delegation with no direct execution by full agents.",
"mode": "primary",
"model": "github-copilot/gpt-5.1-codex-mini",
"temperature": 0.0,
"steps": 8,
"prompt": "{file:./.kilo/agent/subagent-orchestrator.md}",
"permission": {
"edit": "deny",
"bash": "deny",
"browser": "deny",
"task": {
"*": "deny",
"product-manager": "allow",
"coder": "allow",
"semantic": "allow",
"tester": "allow",
"reviewer-agent-auditor": "allow",
"semantic-implementer": "allow"
}
}
}
}
}
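
The `task` permission map above pairs a `"*"` fallback with per-subagent overrides: everything is denied unless a subagent is explicitly allowed. A minimal sketch of how such a map might be resolved — the resolver function itself is hypothetical, written only to illustrate the precedence implied by the config, not Kilo Code's actual implementation:

```typescript
type Decision = "allow" | "deny";

// Hypothetical resolver: an explicit subagent entry overrides the "*" fallback,
// and an absent fallback defaults to "deny".
function resolveTaskPermission(
  permissions: Record<string, Decision>,
  subagent: string,
): Decision {
  return permissions[subagent] ?? permissions["*"] ?? "deny";
}

// Map mirroring the shape of the "task" block in kilo.json above.
const taskPermissions: Record<string, Decision> = {
  "*": "deny",
  "product-manager": "allow",
  coder: "allow",
};

console.log(resolveTaskPermission(taskPermissions, "coder")); // allow
console.log(resolveTaskPermission(taskPermissions, "browser-bot")); // deny
```

With this precedence, adding a new subagent to the project leaves it blocked for the orchestrator until it is listed explicitly — a deny-by-default posture for delegation.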