# Complete Notification Flow Analysis

**Date**: 2026-01-06

**Purpose**: Trace the entire notification system flow to identify issues and improvements

---

## 🔍 **FLOW 1: Initial Page Load & Count Display**

### Step-by-Step Flow:

1. **Component Mount** (`notification-badge.tsx`)
   - `useNotifications()` hook initializes
   - `useEffect` triggers when `status === 'authenticated'`
   - Calls `fetchNotificationCount(true)` (force refresh)
   - Calls `fetchNotifications()`
   - Starts polling every 60 seconds

2. **Count Fetch** (`use-notifications.ts` → `/api/notifications/count`)
   - Hook calls `/api/notifications/count?_t=${Date.now()}` (cache-busting)
   - API route authenticates user
   - Calls `NotificationService.getNotificationCount(userId)`

3. **Service Layer** (`notification-service.ts`)
   - **Checks Redis cache first** (`notifications:count:${userId}`)
   - If cached: Returns cached data immediately
   - If not cached: Fetches from adapters

4. **Adapter Layer** (`leantime-adapter.ts`)
   - `getNotificationCount()` calls `getNotifications(userId, 1, 100)`
   - **⚠️ ISSUE**: Only fetches first 100 notifications for counting
   - Filters unread: `notifications.filter(n => !n.isRead).length`
   - Returns count object

5. **Cache Storage**
   - Service stores count in Redis with 30-second TTL
   - Returns to API route
   - API returns to hook
   - Hook updates React state: `setNotificationCount(data)`

6. **UI Update**
   - Badge displays `notificationCount.unread`
   - Shows "65" if there are 65 unread notifications

---

## 🔍 **FLOW 2: Mark Single Notification as Read**

### Step-by-Step Flow:

1. **User Action** (`notification-badge.tsx`)
   - User clicks "Mark as read" button
   - Calls `handleMarkAsRead(notificationId)`
   - Calls `markAsRead(notificationId)` from hook

2. **Hook Action** (`use-notifications.ts`)
   - Makes POST to `/api/notifications/${notificationId}/read`
   - **Optimistic UI Update**:
     - Updates notification in state: `isRead: true`
     - Decrements count: `unread: Math.max(0, prev.unread - 1)`
   - Waits 100ms, then calls `fetchNotificationCount(true)`

3. **API Route** (`app/api/notifications/[id]/read/route.ts`)
   - Authenticates user
   - Extracts notification ID: `leantime-2732` → splits to get source and ID
   - Calls `NotificationService.markAsRead(userId, notificationId)`

4. **Service Layer** (`notification-service.ts`)
   - Extracts source: `leantime` from ID
   - Gets adapter: `this.adapters.get('leantime')`
   - Calls `adapter.markAsRead(userId, notificationId)`

5. **Adapter Layer** (`leantime-adapter.ts`)
   - **Gets user email from session**: `getUserEmail()`
   - **Gets Leantime user ID**: `getLeantimeUserId(email)`
   - **⚠️ CRITICAL ISSUE**: If `getLeantimeUserId()` fails → returns `false`
   - If successful: Calls Leantime API `markNotificationRead`
   - Returns success/failure

6. **Cache Invalidation** (`notification-service.ts`)
   - If `markAsRead()` returns `true`:
     - Calls `invalidateCache(userId)`
     - Deletes count cache: `notifications:count:${userId}`
     - Deletes all list caches: `notifications:list:${userId}:*`
   - If it returns `false`: **Cache NOT invalidated** ❌

7. **Count Refresh** (`use-notifications.ts`)
   - After 100ms delay, calls `fetchNotificationCount(true)`
   - Fetches fresh count from API
   - **⚠️ ISSUE**: If the cache wasn't invalidated, the fetch might return a stale count

---

## 🔍 **FLOW 3: Mark All Notifications as Read**

### Step-by-Step Flow:

1. **User Action** (`notification-badge.tsx`)
   - User clicks "Mark all read" button
   - Calls `handleMarkAllAsRead()`
   - Calls `markAllAsRead()` from hook

2. **Hook Action** (`use-notifications.ts`)
   - Makes POST to `/api/notifications/read-all`
   - **Optimistic UI Update**:
     - Sets all notifications: `isRead: true`
     - Sets count: `unread: 0`
   - Waits 200ms, then calls `fetchNotificationCount(true)`

3. **API Route** (`app/api/notifications/read-all/route.ts`)
   - Authenticates user
   - Calls `NotificationService.markAllAsRead(userId)`

4. **Service Layer** (`notification-service.ts`)
   - Loops through all adapters
   - For each adapter:
     - Checks if configured
     - Calls `adapter.markAllAsRead(userId)`
   - Collects results: `[true/false, ...]`
   - Determines: `success = results.every(r => r)`, `anySuccess = results.some(r => r)`
   - **Cache Invalidation**:
     - If `anySuccess === true`: Invalidates cache ✅
     - If `anySuccess === false`: **Cache NOT invalidated** ❌

5. **Adapter Layer** (`leantime-adapter.ts`)
   - **Gets user email**: `getUserEmail()`
   - **Gets Leantime user ID**: `getLeantimeUserId(email)`
   - **⚠️ CRITICAL ISSUE**: If this fails → returns `false` immediately
   - If successful:
     - Fetches all notifications directly from API (up to 1000)
     - Filters unread: `rawNotifications.filter(n => n.read === 0)`
     - Marks each individually using `markNotificationRead`
     - Returns success if any were marked

6. **Cache Invalidation** (`notification-service.ts`)
   - Only happens if `anySuccess === true`
   - **⚠️ ISSUE**: If `getLeantimeUserId()` fails, `anySuccess = false`
   - Cache stays stale → count remains 65

7. **Count Refresh** (`use-notifications.ts`)
   - After 200ms, calls `fetchNotificationCount(true)`
   - **⚠️ ISSUE**: If the cache wasn't invalidated, the fetch returns the stale count from cache

---

## 🔍 **FLOW 4: Fetch Notification List**

### Step-by-Step Flow:

1. **User Opens Dropdown** (`notification-badge.tsx`)
   - `handleOpenChange(true)` called
   - Calls `manualFetch()` which calls `fetchNotifications(1, 10)`

2. **Hook Action** (`use-notifications.ts`)
   - Makes GET to `/api/notifications?page=1&limit=20`
   - Updates state: `setNotifications(data.notifications)`

3. **API Route** (`app/api/notifications/route.ts`)
   - Authenticates user
   - Calls `NotificationService.getNotifications(userId, page, limit)`

4. **Service Layer** (`notification-service.ts`)
   - **Checks Redis cache first**: `notifications:list:${userId}:${page}:${limit}`
   - If cached: Returns cached data immediately
   - If not cached: Fetches from adapters

5. **Adapter Layer** (`leantime-adapter.ts`)
   - Gets user email and Leantime user ID
   - Calls Leantime API `getAllNotifications` with pagination
   - Transforms notifications to our format
   - Returns array

6. **Cache Storage**
   - Service stores list in Redis with 5-minute TTL
   - Returns to API
   - API returns to hook
   - Hook updates React state

---

## 🐛 **IDENTIFIED ISSUES**

### **Issue #1: getLeantimeUserId() Fails Inconsistently**

**Problem**:
- `getLeantimeUserId()` works in `getNotifications()` and `getNotificationCount()`
- But it fails in `markAllAsRead()` and sometimes in `markAsRead()`
- Logs show: `"User not found in Leantime: a.tmiri@clm.foundation"`

**Root Cause**:
- `getLeantimeUserId()` calls Leantime API `getAll` users endpoint
- Fetches ALL users, then searches for matching email
- **Possible causes**:
  1. **Race condition**: API call happens at different times
  2. **Session timing**: Session might be different between calls
  3. **API rate limiting**: Leantime API might throttle requests
  4. **Caching issue**: No caching of user ID lookup

**Impact**:
- Mark all as read fails → cache not invalidated → count stays 65
- Mark single as read might fail → cache not invalidated → count doesn't update

**Solution**:
- Cache Leantime user ID in Redis with longer TTL
- Add retry logic with exponential backoff
- Add better error handling and logging

---

### **Issue #2: Cache Invalidation Only on Success**

**Problem**:
- Cache is only invalidated if `markAsRead()` or `markAllAsRead()` returns `true`
- If the operation fails (e.g., `getLeantimeUserId()` fails), the cache stays stale
- Count remains at old value (65)

**Root Cause**:
```typescript
if (success) {
  await this.invalidateCache(userId);
}
```

**Impact**:
- User sees stale count even after attempting to mark as read
- UI shows optimistic update, but server count doesn't match

**Solution**:
- Always invalidate cache after marking attempt (even on failure)
- Or: Invalidate cache before marking, then refresh after
- Or: Use optimistic updates with eventual consistency

---

### **Issue #3: Count Based on First 100 Notifications**

**Problem**:
- `getNotificationCount()` only fetches the first 100 notifications
- If a user has 200 notifications with 66 unread, the count shows 66 only when all 66 fall within the first 100
- Any unread notifications beyond the first 100 are not counted, so the count is wrong

**Root Cause**:
```typescript
const notifications = await this.getNotifications(userId, 1, 100);
const unreadCount = notifications.filter(n => !n.isRead).length;
```

**Impact**:
- Count might be inaccurate if >100 notifications exist
- User might see "66 unread" but only 10 displayed (pagination)

**Solution**:
- Use dedicated count API if Leantime provides one
- Or: Fetch all notifications for counting (up to reasonable limit)
- Or: Show "66+ unread" when the 100-notification fetch limit is reached

---

### **Issue #4: Race Condition Between Cache Invalidation and Count Fetch**

**Problem**:
- Hook calls `fetchNotificationCount(true)` after 100-200ms delay
- But cache invalidation might not be complete
- Count fetch might still get stale cache

**Root Cause**:
```typescript
setTimeout(() => {
  fetchNotificationCount(true);
}, 200);
```

**Impact**:
- Count might not update immediately after marking
- User sees optimistic update, then stale count

**Solution**:
- Increase delay to 500ms
- Or: Poll count until it matches expected value
- Or: Use WebSocket/SSE for real-time updates

---

### **Issue #5: No Caching of Leantime User ID**

**Problem**:
- `getLeantimeUserId()` fetches ALL users from Leantime API every time
- No caching, so repeated calls are slow and might fail
- Different calls might get different results (race condition)

**Root Cause**:
- No Redis cache for user ID mapping
- Each call makes full API request

**Impact**:
- Slow performance
- Inconsistent results
- API rate limiting issues

**Solution**:
- Cache user ID in Redis: `leantime:userid:${email}` with 1-hour TTL
- Invalidate cache only when user changes or on explicit refresh

---

### **Issue #6: getNotificationCount Uses Cached getNotifications**

**Problem**:
- `getNotificationCount()` calls `getNotifications(userId, 1, 100)`
- `getNotifications()` uses cache if available
- Count might be based on stale cached notifications

**Root Cause**:
```typescript
async getNotificationCount(userId: string): Promise {
  const notifications = await this.getNotifications(userId, 1, 100);
  // Uses cached data if available
}
```

**Impact**:
- Count might be stale even if notifications were marked as read
- Cache TTL mismatch: count cache (30s) vs list cache (5min)

**Solution**:
- Fetch notifications directly from API for counting (bypass cache)
- Or: Use dedicated count endpoint
- Or: Invalidate list cache when count cache is invalidated

---

### **Issue #7: Optimistic Updates Don't Match Server State**

**Problem**:
- Hook optimistically updates count: `unread: 0`
- But server count might still be 65 (cache not invalidated)
- After refresh, count jumps back to 65

**Root Cause**:
- Optimistic update happens immediately
- Server cache invalidation might fail
- Count refresh gets stale data

**Impact**:
- Confusing UX: count goes to 0, then back to 65
- User thinks operation failed when it might have succeeded

**Solution**:
- Only show optimistic update if we're confident operation will succeed
- Or: Show loading state until server confirms
- Or: Poll until count matches expected value

---

## 🎯 **RECOMMENDED IMPROVEMENTS**

### **Priority 1: Fix getLeantimeUserId() Reliability**

1. **Cache User ID Mapping**
   ```typescript
   // Cache key: leantime:userid:${email}
   // TTL: 1 hour
   // Invalidate on user update or explicit refresh
   ```

2. **Add Retry Logic**
   ```typescript
   // Retry 3 times with exponential backoff
   // Log each attempt
   // Return cached value if API fails
   ```

3. **Better Error Handling**
   ```typescript
   // Log full error details
   // Return null only after all retries fail
   // Don't fail entire operation on user ID lookup failure
   ```

---

### **Priority 2: Always Invalidate Cache After Marking**

1. **Invalidate Before Marking**
   ```typescript
   // Invalidate cache first
   // Then mark as read
   // Then refresh count
   ```

2. **Or: Always Invalidate After Attempt**
   ```typescript
   // Always invalidate cache after marking attempt
   // Even if operation failed
   // This ensures fresh data on next fetch
   ```

---

### **Priority 3: Fix Count Accuracy**

1. **Use Dedicated Count API** (if available)
   ```typescript
   // Check if Leantime has count-only endpoint
   // Use that instead of fetching all notifications
   ```

2. **Or: Fetch All for Counting**
   ```typescript
   // Fetch up to 1000 notifications for counting
   // Or use pagination to count all
   ```

3. **Or: Show "66+ unread" if limit reached**
   ```typescript
   // If count === 100, show "100+ unread"
   // Indicate there might be more
   ```

---

### **Priority 4: Improve Cache Strategy**

1. **Unified Cache Invalidation**
   ```typescript
   // When count cache is invalidated, also invalidate list cache
   // When list cache is invalidated, also invalidate count cache
   // Keep them in sync
   ```

2. **Shorter Cache TTLs**
   ```typescript
   // Count cache: 10 seconds (currently 30s)
   // List cache: 1 minute (currently 5min)
   // More frequent updates
   ```

3. **Cache Tags/Versioning**
   ```typescript
   // Use cache version numbers
   // Increment on invalidation
   // Check version before using cache
   ```

---

### **Priority 5: Better Error Recovery**

1. **Graceful Degradation**
   ```typescript
   // If mark as read fails, still invalidate cache
   // Show error message to user
   // Allow retry
   ```

2. **Retry Logic**
   ```typescript
   // Retry failed operations automatically
   // Exponential backoff
   // Max 3 retries
   ```

---

## 📊 **FLOW DIAGRAM: Current vs Improved**

### **Current Flow (Mark All As Read)**:

```
User clicks → Hook → API → Service → Adapter
    ↓
getLeantimeUserId() → FAILS ❌
    ↓
Returns false → Service: anySuccess = false
    ↓
Cache NOT invalidated ❌
    ↓
Count refresh → Gets stale cache → Shows 65 ❌
```

### **Improved Flow (Mark All As Read)**:

```
User clicks → Hook → API → Service → Adapter
    ↓
getLeantimeUserId() → Check cache first
    ↓
If cached: Use cached ID ✅
If not cached: Fetch from API → Cache result ✅
    ↓
Mark all as read → Success ✅
    ↓
Always invalidate cache (even on partial failure) ✅
    ↓
Count refresh → Gets fresh data → Shows 0 ✅
```

---

## 🚀 **IMPLEMENTATION PRIORITY**

1. **Fix getLeantimeUserId() caching** (High Priority)
   - Add Redis cache for user ID mapping
   - Add retry logic
   - Better error handling

2. **Always invalidate cache** (High Priority)
   - Invalidate cache even on failure
   - Or invalidate before marking

3. **Fix count accuracy** (Medium Priority)
   - Use dedicated count API or fetch all
   - Show "66+ unread" if limit reached

4. **Improve cache strategy** (Medium Priority)
   - Unified invalidation
   - Shorter TTLs
   - Cache versioning

5. **Better error recovery** (Low Priority)
   - Graceful degradation
   - Retry logic
   - Better UX

---

**Status**: Analysis complete. Ready for implementation.
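
**Sketch for Priority 1 (cached user ID lookup with retry)**: A minimal sketch of how the caching and retry recommendations could fit together, not the project's actual code. The names `TtlCache`, `MemoryCache` (an in-memory stand-in for Redis so the sketch runs on its own), `getCachedLeantimeUserId`, and the `fetchUserId` callback (representing the existing Leantime lookup) are all hypothetical. Only the cache key `leantime:userid:${email}`, the 1-hour TTL, and the 3 retries come from the analysis.

```typescript
// Hypothetical sketch: these names are illustrative, not the project's real API.
type UserIdFetcher = (email: string) => Promise<number | null>;

interface TtlCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in for Redis (assumption) so the sketch is self-contained.
class MemoryCache implements TtlCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= Date.now()) {
      this.store.delete(key); // expired entries behave like a cache miss
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

const USER_ID_TTL_SECONDS = 60 * 60; // 1 hour, as proposed in the analysis

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function getCachedLeantimeUserId(
  email: string,
  cache: TtlCache,
  fetchUserId: UserIdFetcher,
  maxAttempts: number = 3,
  baseDelayMs: number = 100,
): Promise<number | null> {
  const key = `leantime:userid:${email}`;

  // 1. Serve from cache when possible, skipping the "fetch ALL users" call.
  const cached = await cache.get(key);
  if (cached !== null) return Number(cached);

  // 2. Otherwise retry the lookup with exponential backoff (100ms, 200ms, ...).
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const id = await fetchUserId(email);
      if (id !== null) {
        await cache.set(key, String(id), USER_ID_TTL_SECONDS);
        return id;
      }
      console.warn(`Leantime user lookup found no match (attempt ${attempt}/${maxAttempts})`);
    } catch (error) {
      console.warn(`Leantime user lookup failed (attempt ${attempt}/${maxAttempts})`, error);
    }
    if (attempt < maxAttempts) await sleep(baseDelayMs * 2 ** (attempt - 1));
  }

  // 3. Return null only after every attempt has failed.
  return null;
}
```

With this shape, `markAsRead()` and `markAllAsRead()` would share the ID that `getNotifications()` already resolved successfully, which is one plausible way to remove the inconsistency described in Issue #1.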
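
**Sketch for Priority 2 (always invalidate after a marking attempt)**: The following is one possible shape for the "always invalidate" option, under stated assumptions. `MarkableAdapter`, `markAllAsReadAndInvalidate`, and the injected `invalidateCache` callback are hypothetical names; the real service presumably calls its own `invalidateCache(userId)` method and also checks whether each adapter is configured, which is omitted here.

```typescript
// Hypothetical shapes, not the project's real interfaces.
interface MarkableAdapter {
  markAllAsRead(userId: string): Promise<boolean>;
}

async function markAllAsReadAndInvalidate(
  userId: string,
  adapters: MarkableAdapter[],
  invalidateCache: (userId: string) => Promise<void>,
): Promise<boolean> {
  const results: boolean[] = [];

  for (const adapter of adapters) {
    try {
      results.push(await adapter.markAllAsRead(userId));
    } catch (error) {
      // A throwing adapter is recorded as a failure instead of aborting the loop.
      console.error("markAllAsRead failed for an adapter", error);
      results.push(false);
    }
  }

  // Invalidate unconditionally: success, partial failure, and total failure
  // all lead to a fresh count on the next fetch.
  await invalidateCache(userId);

  // Same success rule as the current service: every adapter must report true.
  return results.every((ok) => ok);
}
```

The trade-off is an extra cache miss after failed attempts, which seems acceptable given that the count cache is only held for 30 seconds anyway.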
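
**Sketch for Priority 3 (showing "66+ unread" when the limit is reached)**: A small sketch of the third option, reading "limit reached" as "the fetch returned a full page of 100 notifications", which is an interpretation rather than something the analysis states outright. `summarizeUnread`, `UnreadSummary`, and `COUNT_FETCH_LIMIT` are hypothetical names.

```typescript
// Page size used by getNotificationCount() according to the analysis.
const COUNT_FETCH_LIMIT = 100;

interface CountableNotification {
  isRead: boolean;
}

interface UnreadSummary {
  unread: number;
  isLowerBound: boolean; // true when more notifications may exist beyond the page
  label: string;
}

function summarizeUnread(
  page: CountableNotification[],
  limit: number = COUNT_FETCH_LIMIT,
): UnreadSummary {
  const unread = page.filter((n) => !n.isRead).length;
  // A full page means the fetch may have been truncated, so the unread
  // count is only a lower bound and the label gets a "+" suffix.
  const isLowerBound = page.length >= limit;
  const label = isLowerBound ? `${unread}+ unread` : `${unread} unread`;
  return { unread, isLowerBound, label };
}
```

This keeps the badge honest without extra API calls; a dedicated count endpoint, if Leantime offers one, would still be the more accurate fix.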