# UX Reference: Dashboard Health Windows **Feature Branch**: `026-dashboard-health-windows` **Created**: 2026-03-10 **Status**: Phase 6 Verified ## 1. User Persona & Context * **Who is the user?**: Data Engineer, System Administrator, or Data Analyst (Data Team). * **What is their goal?**: Schedule automated LLM checks for many dashboards without overloading the database, and quickly monitor which dashboards are currently broken. * **Context**: Managing dozens or hundreds of dashboards across environments (e.g., production, dev) via a web interface, and needing a high-level overview of system health every morning. ## 2. The "Happy Path" Narrative The user navigates to "Settings -> Automation" and creates a new validation policy for 15 critical production dashboards. Instead of writing complex cron jobs, they simply set an "Execution Window" from 01:00 to 05:00. They check the box to "Auto-notify Owners". The system automatically spaces out the 15 checks within those 4 hours. During the night, a dashboard owned by the user fails. The user immediately receives a Telegram message: "🚨 Validation Failed: Sales Executive (Broken Chart). [View in ss-tools]". In the morning, the user opens the app and immediately notices a red badge `[2🔴]` next to "Dashboards" in the sidebar. They click it to open the "Health Center", which displays a clear "traffic light" summary: 13 Green, 0 Yellow, 2 Red. The two broken dashboards are listed at the top with a summary of the issues. They can also just ask the AI Assistant: "Which dashboards failed last night?" and get a direct link. ## 3. Interface Mockups ### UI Layout & Flow **Screen/Component**: Automation Settings (Validation Policies) * **Layout**: List view of existing policies with a "Create Rule" modal. * **Key Elements**: * **Policy List**: Shows Active/Paused status, target count, schedule window, and active notification channels. * **Create Modal**: * Scope selection (Environment dropdown, Filter by selected/tags/all). * Schedule configuration (Days of week checkboxes, Start/End time for Execution Window). * Alerts configuration: * Trigger condition (e.g., "Only FAIL"). * Toggle: "Auto-notify Owners (Email / Messenger)". * Add custom Channels (Slack webhook, General Email). * **States**: * **Info Message**: When selecting times, a helpful hint appears: "💡 System will automatically distribute 15 checks within this 4-hour window to avoid peak database load." **Screen/Component**: Dashboard Health Center * **Layout**: Top summary cards (Traffic lights), followed by a data table. * **Key Elements**: * **Environment Selector**: Dropdown to switch between `ss-prod`, `ss-dev`, etc. * **Health Matrix Summary**: `🟢 145 Pass | 🟡 5 Warn | 🔴 2 Fail` * **Data Table**: Columns for Dashboard Name, Status (Badge), Detected Issues (Text), Checked At (Time), and a "View Report" action button. * **Sidebar Badge**: The main navigation sidebar has a badge `[2🔴]` next to the Dashboards menu item. **Screen/Component**: Global Settings -> Notifications * **Layout**: Form fields grouped by provider type (Email, Slack, Telegram). * **Key Elements**: * **Email (SMTP)**: Host, Port, Login, Password, "From" Address. * **Slack**: Webhook URL list. * **Telegram**: Bot Token. ### Notification Payloads **Email Example (Rich HTML)**: * **Subject**: 🔴 [ss-tools] Ошибка валидации: Sales Executive Dashboard * **Body**: 🔴 **FAIL:** Sales Executive (Env: ss-prod). "Сломан чарт Revenue YTD. Ошибка: колонка profit не найдена." [Embedded Screenshot] -> [Button: View Full Report]. **Messenger Example (Compact Text)**: ```text 🚨 **Dashboard Validation Failed** **Dashboard:** [Sales Executive](link) **Env:** ss-prod | **Time:** 03:15 AM 🤖 **LLM Summary:** Обнаружена SQL-ошибка в чарте "Revenue". Данные не отображаются. 🔗 [View in ss-tools] ``` ### Chat Assistant Interaction ```text User: "Show dashboards that failed the night check" Assistant: "I found 2 dashboards that failed their last validation in production: 1. 🔴 Sales Executive (Issues: Broken Chart, Timeout) 2. 🔴 Marketing Funnel (Issues: SQL Syntax Error) [View detailed reports in Health Center]" ``` ## 4. The "Error" Experience **Philosophy**: Guide the user to resolve failing dashboard validations quickly and prevent misconfiguration of schedules. ### Scenario A: Execution Window Too Small * **User Action**: Selects an execution window of 5 minutes for 100 dashboards. * **System Response**: * (UI) Warning message appears below the time selection: "⚠️ The selected window is too narrow for 100 dashboards. This may cause database spikes. Recommended window: at least 60 minutes." * **Recovery**: User adjusts the end time to broaden the window before saving. ### Scenario B: Dashboard Fails Validation * **User Action**: Navigates to Health Center to view a failed dashboard. * **System Response**: Row is highlighted in red. Clicking "View Report" takes them directly to the detailed LLM analysis page showing the screenshot and specific errors. ## 5. Tone & Voice * **Style**: Professional, monitoring-focused, reassuring. * **Terminology**: Use "Execution Window" for time ranges, "Health Center" for the monitoring view, and "Policies" for automated rules. Avoid overly technical scheduling terms like "Cron expressions" in the UI. ## 6. Phase 6 Implementation Verification - [x] **T029**: Profile UI updated with Telegram ID and Email Override. Matches "Smart Notifications" section in User Preferences. - [x] **T030/T031**: Backend routing logic implemented. Supports owner routing and custom channels. - [x] **T032**: Dispatch wired into LLM analysis plugin. - [x] **T033**: Global Notification Settings UI implemented with SMTP, Telegram, and Slack sections.