Files
2026-03-10 12:00:18 +03:00

110 lines
5.9 KiB
Markdown

# UX Reference: Dashboard Health Windows
**Feature Branch**: `026-dashboard-health-windows`
**Created**: 2026-03-10
**Status**: Phase 6 Verified
## 1. User Persona & Context
* **Who is the user?**: Data Engineer, System Administrator, or Data Analyst (Data Team).
* **What is their goal?**: Schedule automated LLM checks for many dashboards without overloading the database, and quickly monitor which dashboards are currently broken.
* **Context**: Managing dozens or hundreds of dashboards across environments (e.g., production, dev) via a web interface, and needing a high-level overview of system health every morning.
## 2. The "Happy Path" Narrative
The user navigates to "Settings -> Automation" and creates a new validation policy for 15 critical production dashboards. Instead of writing complex cron jobs, they simply set an "Execution Window" from 01:00 to 05:00. They check the box to "Auto-notify Owners". The system automatically spaces out the 15 checks within those 4 hours.
During the night, a dashboard owned by the user fails. The user immediately receives a Telegram message: "🚨 Validation Failed: Sales Executive (Broken Chart). [View in ss-tools]".
In the morning, the user opens the app and immediately notices a red badge `[2🔴]` next to "Dashboards" in the sidebar. They click it to open the "Health Center", which displays a clear "traffic light" summary: 13 Green, 0 Yellow, 2 Red. The two broken dashboards are listed at the top with a summary of the issues. They can also just ask the AI Assistant: "Which dashboards failed last night?" and get a direct link.
## 3. Interface Mockups
### UI Layout & Flow
**Screen/Component**: Automation Settings (Validation Policies)
* **Layout**: List view of existing policies with a "Create Rule" modal.
* **Key Elements**:
* **Policy List**: Shows Active/Paused status, target count, schedule window, and active notification channels.
* **Create Modal**:
* Scope selection (Environment dropdown, Filter by selected/tags/all).
* Schedule configuration (Days of week checkboxes, Start/End time for Execution Window).
* Alerts configuration:
* Trigger condition (e.g., "Only FAIL").
* Toggle: "Auto-notify Owners (Email / Messenger)".
* Add custom Channels (Slack webhook, General Email).
* **States**:
* **Info Message**: When selecting times, a helpful hint appears: "💡 System will automatically distribute 15 checks within this 4-hour window to avoid peak database load."
**Screen/Component**: Dashboard Health Center
* **Layout**: Top summary cards (Traffic lights), followed by a data table.
* **Key Elements**:
* **Environment Selector**: Dropdown to switch between `ss-prod`, `ss-dev`, etc.
* **Health Matrix Summary**: `🟢 145 Pass | 🟡 5 Warn | 🔴 2 Fail`
* **Data Table**: Columns for Dashboard Name, Status (Badge), Detected Issues (Text), Checked At (Time), and a "View Report" action button.
* **Sidebar Badge**: The main navigation sidebar has a badge `[2🔴]` next to the Dashboards menu item.
**Screen/Component**: Global Settings -> Notifications
* **Layout**: Form fields grouped by provider type (Email, Slack, Telegram).
* **Key Elements**:
* **Email (SMTP)**: Host, Port, Login, Password, "From" Address.
* **Slack**: Webhook URL list.
* **Telegram**: Bot Token.
### Notification Payloads
**Email Example (Rich HTML)**:
* **Subject**: 🔴 [ss-tools] Ошибка валидации: Sales Executive Dashboard
* **Body**: 🔴 **FAIL:** Sales Executive (Env: ss-prod). "Сломан чарт Revenue YTD. Ошибка: колонка profit не найдена." [Embedded Screenshot] -> [Button: View Full Report].
**Messenger Example (Compact Text)**:
```text
🚨 **Dashboard Validation Failed**
**Dashboard:** [Sales Executive](link)
**Env:** ss-prod | **Time:** 03:15 AM
🤖 **LLM Summary:**
Обнаружена SQL-ошибка в чарте "Revenue". Данные не отображаются.
🔗 [View in ss-tools]
```
### Chat Assistant Interaction
```text
User: "Show dashboards that failed the night check"
Assistant: "I found 2 dashboards that failed their last validation in production:
1. 🔴 Sales Executive (Issues: Broken Chart, Timeout)
2. 🔴 Marketing Funnel (Issues: SQL Syntax Error)
[View detailed reports in Health Center]"
```
## 4. The "Error" Experience
**Philosophy**: Guide the user to resolve failing dashboard validations quickly and prevent misconfiguration of schedules.
### Scenario A: Execution Window Too Small
* **User Action**: Selects an execution window of 5 minutes for 100 dashboards.
* **System Response**:
* (UI) Warning message appears below the time selection: "⚠️ The selected window is too narrow for 100 dashboards. This may cause database spikes. Recommended window: at least 60 minutes."
* **Recovery**: User adjusts the end time to broaden the window before saving.
### Scenario B: Dashboard Fails Validation
* **User Action**: Navigates to Health Center to view a failed dashboard.
* **System Response**: Row is highlighted in red. Clicking "View Report" takes them directly to the detailed LLM analysis page showing the screenshot and specific errors.
## 5. Tone & Voice
* **Style**: Professional, monitoring-focused, reassuring.
* **Terminology**: Use "Execution Window" for time ranges, "Health Center" for the monitoring view, and "Policies" for automated rules. Avoid overly technical scheduling terms like "Cron expressions" in the UI.
## 6. Phase 6 Implementation Verification
- [x] **T029**: Profile UI updated with Telegram ID and Email Override. Matches "Smart Notifications" section in User Preferences.
- [x] **T030/T031**: Backend routing logic implemented. Supports owner routing and custom channels.
- [x] **T032**: Dispatch wired into LLM analysis plugin.
- [x] **T033**: Global Notification Settings UI implemented with SMTP, Telegram, and Slack sections.