# UX Reference: Dashboard Health Windows

**Feature Branch**: `026-dashboard-health-windows`
**Created**: 2026-03-10
**Status**: Phase 6 Verified

## 1. User Persona & Context

*   **Who is the user?**: Data Engineer, System Administrator, or Data Analyst (Data Team).
*   **What is their goal?**: Schedule automated LLM checks for many dashboards without overloading the database, and quickly monitor which dashboards are currently broken.
*   **Context**: Managing dozens or hundreds of dashboards across environments (e.g., production, dev) via a web interface, and needing a high-level overview of system health every morning.

## 2. The "Happy Path" Narrative

The user navigates to "Settings -> Automation" and creates a new validation policy for 15 critical production dashboards. Instead of writing complex cron jobs, they set an "Execution Window" from 01:00 to 05:00 and enable the "Auto-notify Owners" toggle. The system automatically spaces out the 15 checks across those 4 hours.

During the night, a dashboard owned by the user fails. The user immediately receives a Telegram message: "🚨 Validation Failed: Sales Executive (Broken Chart). [View in ss-tools]".

In the morning, the user opens the app and immediately notices a red badge `[2🔴]` next to "Dashboards" in the sidebar. They click it to open the "Health Center", which displays a clear "traffic light" summary: 13 Green, 0 Yellow, 2 Red. The two broken dashboards are listed at the top with a summary of the issues. Alternatively, they can ask the AI Assistant "Which dashboards failed last night?" and get a direct link.

## 3. Interface Mockups

### UI Layout & Flow

**Screen/Component**: Automation Settings (Validation Policies)

*   **Layout**: List view of existing policies with a "Create Rule" modal.
*   **Key Elements**:
    *   **Policy List**: Shows Active/Paused status, target count, schedule window, and active notification channels.
    *   **Create Modal**:
        *   Scope selection (Environment dropdown, Filter by selected/tags/all).
        *   Schedule configuration (Days of week checkboxes, Start/End time for Execution Window).
        *   Alerts configuration:
            *   Trigger condition (e.g., "Only FAIL").
            *   Toggle: "Auto-notify Owners (Email / Messenger)".
            *   Add custom Channels (Slack webhook, General Email).
*   **States**:
    *   **Info Message**: When selecting times, a helpful hint appears: "💡 System will automatically distribute 15 checks within this 4-hour window to avoid peak database load."
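
The distribution mentioned in the hint can be illustrated with a minimal sketch. It assumes even spacing, which the spec does not prescribe; `distribute_checks` is a hypothetical name, and the caller is assumed to pass concrete datetimes (so an overnight window is simply an end time on the following day).

```python
from datetime import datetime, timedelta


def distribute_checks(window_start: datetime, window_end: datetime, count: int) -> list[datetime]:
    """Return evenly spaced start times for `count` checks inside the window.

    Hypothetical helper: even spacing is an assumption, not the documented scheduler.
    """
    if count <= 0:
        return []
    if window_end <= window_start:
        raise ValueError("window_end must be after window_start")
    # Split the window into `count` equal slots; each check starts at the
    # beginning of its slot, so the last check starts before the window closes.
    slot = (window_end - window_start) / count
    return [window_start + i * slot for i in range(count)]


start = datetime(2026, 3, 10, 1, 0)
end = datetime(2026, 3, 10, 5, 0)
slots = distribute_checks(start, end, 15)
print(slots[1] - slots[0])  # 0:16:00
```

For the happy-path example (15 dashboards, 01:00 to 05:00) this yields one check every 16 minutes, with the last one starting at 04:44.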

**Screen/Component**: Dashboard Health Center

*   **Layout**: Top summary cards (Traffic lights), followed by a data table.
*   **Key Elements**:
    *   **Environment Selector**: Dropdown to switch between `ss-prod`, `ss-dev`, etc.
    *   **Health Matrix Summary**: `🟢 145 Pass | 🟡 5 Warn | 🔴 2 Fail`
    *   **Data Table**: Columns for Dashboard Name, Status (Badge), Detected Issues (Text), Checked At (Time), and a "View Report" action button.
    *   **Sidebar Badge**: The main navigation sidebar has a badge `[2🔴]` next to the Dashboards menu item.
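
As a rough illustration of how the summary line, the sidebar badge, and the failed-first ordering could be derived from one set of results, here is a small sketch. The `HealthRow` type, the `PASS`/`WARN`/`FAIL` status names, and hiding the badge when nothing has failed are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class HealthRow:
    dashboard: str
    status: str  # assumed values: "PASS", "WARN", "FAIL"
    issues: str = ""


# Lower number sorts first, so failures appear at the top of the table.
SEVERITY = {"FAIL": 0, "WARN": 1, "PASS": 2}


def summarize(rows):
    """Return (summary line, sidebar badge, rows ordered worst-first)."""
    counts = Counter(r.status for r in rows)  # missing statuses count as 0
    summary = f"🟢 {counts['PASS']} Pass | 🟡 {counts['WARN']} Warn | 🔴 {counts['FAIL']} Fail"
    # Assumption: the badge is hidden (empty string) when nothing has failed.
    badge = f"[{counts['FAIL']}🔴]" if counts["FAIL"] else ""
    ordered = sorted(rows, key=lambda r: (SEVERITY[r.status], r.dashboard))
    return summary, badge, ordered


rows = [
    HealthRow("Ops Overview", "PASS"),  # hypothetical passing dashboard
    HealthRow("Sales Executive", "FAIL", "Broken Chart, Timeout"),
    HealthRow("Marketing Funnel", "FAIL", "SQL Syntax Error"),
]
summary, badge, ordered = summarize(rows)
print(summary)
print(badge)
```

Deriving the badge and the summary from the same counts keeps the sidebar and the Health Center from disagreeing.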

**Screen/Component**: Global Settings -> Notifications

*   **Layout**: Form fields grouped by provider type (Email, Slack, Telegram).
*   **Key Elements**:
    *   **Email (SMTP)**: Host, Port, Login, Password, "From" Address.
    *   **Slack**: Webhook URL list.
    *   **Telegram**: Bot Token.
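
The fields above map naturally onto a small settings structure. The sketch below is only one possible shape; the class and attribute names are hypothetical and not taken from the ss-tools codebase.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SmtpSettings:
    host: str
    port: int
    login: str
    password: str
    from_address: str


@dataclass
class NotificationSettings:
    """Hypothetical container mirroring the three provider groups in the form."""

    smtp: Optional[SmtpSettings] = None
    slack_webhook_urls: List[str] = field(default_factory=list)
    telegram_bot_token: Optional[str] = None

    def enabled_providers(self) -> List[str]:
        """List the providers that have enough configuration to be used."""
        providers = []
        if self.smtp is not None:
            providers.append("email")
        if self.slack_webhook_urls:
            providers.append("slack")
        if self.telegram_bot_token:
            providers.append("telegram")
        return providers


settings = NotificationSettings(telegram_bot_token="<bot-token>")
print(settings.enabled_providers())  # ['telegram']
```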

### Notification Payloads

**Email Example (Rich HTML)**:
*   **Subject**: 🔴 [ss-tools] Validation error: Sales Executive Dashboard
*   **Body**: 🔴 **FAIL:** Sales Executive (Env: ss-prod). "The Revenue YTD chart is broken. Error: column profit not found." [Embedded Screenshot] -> [Button: View Full Report].

**Messenger Example (Compact Text)**:
```text
🚨 **Dashboard Validation Failed**
**Dashboard:** [Sales Executive](link)
**Env:** ss-prod | **Time:** 03:15

🤖 **LLM Summary:**
SQL error detected in the "Revenue" chart. Data is not displayed.
🔗 [View in ss-tools]
```
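
A compact message like the one above could be assembled from a handful of fields. The following is a sketch under assumptions: `format_messenger_alert` and its parameters are hypothetical, the report URL is a placeholder, and attaching the URL to the final "View in ss-tools" link is a choice made for the example.

```python
def format_messenger_alert(dashboard: str, report_url: str, env: str,
                           checked_at: str, llm_summary: str) -> str:
    """Build the compact messenger text for a failed validation (illustrative only)."""
    return "\n".join([
        "🚨 **Dashboard Validation Failed**",
        f"**Dashboard:** [{dashboard}]({report_url})",
        f"**Env:** {env} | **Time:** {checked_at}",
        "",
        "🤖 **LLM Summary:**",
        llm_summary,
        f"🔗 [View in ss-tools]({report_url})",
    ])


print(format_messenger_alert(
    dashboard="Sales Executive",
    report_url="https://example.com/reports/1",  # placeholder URL
    env="ss-prod",
    checked_at="03:15",
    llm_summary='SQL error detected in the "Revenue" chart. Data is not displayed.',
))
```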

### Chat Assistant Interaction

```text
User: "Show dashboards that failed last night's check"
Assistant: "I found 2 dashboards that failed their last validation in production:
1. 🔴 Sales Executive (Issues: Broken Chart, Timeout)
2. 🔴 Marketing Funnel (Issues: SQL Syntax Error)
[View detailed reports in Health Center]"
```

## 4. The "Error" Experience

**Philosophy**: Guide the user to resolve failing dashboard validations quickly and prevent misconfiguration of schedules.

### Scenario A: Execution Window Too Small

*   **User Action**: Selects an execution window of 5 minutes for 100 dashboards.
*   **System Response**: 
    *   (UI) Warning message appears below the time selection: "⚠️ The selected window is too narrow for 100 dashboards. This may cause database spikes. Recommended window: at least 60 minutes."
*   **Recovery**: User adjusts the end time to broaden the window before saving.
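
One way to produce this warning is to compare the window length against a minimum spacing per check. The sketch below assumes 36 seconds per dashboard, a figure back-derived from the example above (100 dashboards, 60 minutes recommended) rather than a documented threshold; `window_warning` and `MIN_SPACING` are hypothetical names.

```python
from datetime import timedelta

# Assumed minimum spacing per check, inferred from the scenario:
# 100 dashboards -> 60 minutes recommended, i.e. 36 seconds each.
MIN_SPACING = timedelta(seconds=36)


def window_warning(window: timedelta, dashboard_count: int):
    """Return the warning text if the window is too narrow, otherwise None."""
    recommended = MIN_SPACING * dashboard_count
    if window >= recommended:
        return None
    # Round the recommendation up to whole minutes (integer ceiling division).
    minutes = -(-recommended // timedelta(minutes=1))
    return (
        f"⚠️ The selected window is too narrow for {dashboard_count} dashboards. "
        f"This may cause database spikes. Recommended window: at least {minutes} minutes."
    )


print(window_warning(timedelta(minutes=5), 100))
print(window_warning(timedelta(hours=4), 15))  # None
```

Because the function returns a message instead of raising, the UI can show it as a non-blocking warning and still let the user save.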

### Scenario B: Dashboard Fails Validation

*   **User Action**: Navigates to Health Center to view a failed dashboard.
*   **System Response**: Row is highlighted in red. Clicking "View Report" takes them directly to the detailed LLM analysis page showing the screenshot and specific errors.

## 5. Tone & Voice

*   **Style**: Professional, monitoring-focused, reassuring.
*   **Terminology**: Use "Execution Window" for time ranges, "Health Center" for the monitoring view, and "Policies" for automated rules. Avoid overly technical scheduling terms like "Cron expressions" in the UI.

## 6. Phase 6 Implementation Verification

- [x] **T029**: Profile UI updated with Telegram ID and Email Override fields, matching the "Smart Notifications" section in User Preferences.
- [x] **T030/T031**: Backend routing logic implemented. Supports owner routing and custom channels.
- [x] **T032**: Dispatch wired into LLM analysis plugin.
- [x] **T033**: Global Notification Settings UI implemented with SMTP, Telegram, and Slack sections.
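
For readers unfamiliar with the routing behaviour verified under T030/T031, the sketch below shows how a trigger condition, the "Auto-notify Owners" toggle, and custom channels might combine. It is an illustration under assumptions, not the implemented backend logic: `Policy`, `resolve_recipients`, and the string-based targets are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Policy:
    """Hypothetical policy shape based on the Create modal's alert options."""

    notify_owners: bool = True
    trigger_statuses: frozenset = frozenset({"FAIL"})  # "Only FAIL"
    custom_channels: List[str] = field(default_factory=list)


def resolve_recipients(policy: Policy, status: str, owners: List[str]) -> List[str]:
    """Return the targets to notify for one validation result."""
    if status not in policy.trigger_statuses:
        return []  # result does not match the trigger condition
    targets = list(policy.custom_channels)
    if policy.notify_owners:
        # Add owners after custom channels, skipping duplicates.
        targets.extend(o for o in owners if o not in targets)
    return targets


policy = Policy(custom_channels=["slack:#data-alerts"])  # placeholder channel
print(resolve_recipients(policy, "FAIL", ["telegram:owner-1"]))
print(resolve_recipients(policy, "PASS", ["telegram:owner-1"]))  # []
```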