NeahNew/NOTIFICATION_FLOW_ANALYSIS.md

# Complete Notification Flow Analysis

**Date**: 2026-01-06
**Purpose**: Trace the entire notification system flow to identify issues and improvements

---

## 🔍 **FLOW 1: Initial Page Load & Count Display**

### Step-by-Step Flow:

1. **Component Mount** (`notification-badge.tsx`)
   - `useNotifications()` hook initializes
   - `useEffect` triggers when `status === 'authenticated'`
   - Calls `fetchNotificationCount(true)` (force refresh)
   - Calls `fetchNotifications()`
   - Starts polling every 60 seconds

2. **Count Fetch** (`use-notifications.ts` → `/api/notifications/count`)
   - Hook calls `/api/notifications/count?_t=${Date.now()}` (cache-busting)
   - API route authenticates user
   - Calls `NotificationService.getNotificationCount(userId)`

3. **Service Layer** (`notification-service.ts`)
   - **Checks Redis cache first** (`notifications:count:${userId}`)
   - If cached: Returns cached data immediately
   - If not cached: Fetches from adapters

4. **Adapter Layer** (`leantime-adapter.ts`)
   - `getNotificationCount()` calls `getNotifications(userId, 1, 100)`
   - **⚠️ ISSUE**: Only fetches first 100 notifications for counting
   - Filters unread: `notifications.filter(n => !n.isRead).length`
   - Returns count object

5. **Cache Storage**
   - Service stores count in Redis with 30-second TTL
   - Returns to API route
   - API returns to hook
   - Hook updates React state: `setNotificationCount(data)`

6. **UI Update**
   - Badge displays `notificationCount.unread`
   - Shows "65" when there are 65 unread notifications
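
Steps 3 to 5 above amount to a cache-aside lookup. The following is a minimal sketch of that pattern, not the actual service code: an in-memory `TtlCache` stands in for Redis, and `TtlCache`, `getCountCached`, `fetchFromAdapters`, and the `total` field are hypothetical names introduced for illustration.

```typescript
// Shape of the count object; only `unread` is confirmed by the flow above,
// `total` is an assumption for illustration.
interface NotificationCount {
  total: number;
  unread: number;
}

// In-memory stand-in for Redis with per-key expiry.
class TtlCache<T> {
  private store = new Map<string, { value: T; expiresAt: number }>();

  get(key: string, now: number = Date.now()): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.store.delete(key); // expired: behave like a cache miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, ttlMs: number, now: number = Date.now()): void {
    this.store.set(key, { value, expiresAt: now + ttlMs });
  }
}

const COUNT_TTL_MS = 30000; // 30-second TTL, as described in step 5

async function getCountCached(
  cache: TtlCache<NotificationCount>,
  userId: string,
  fetchFromAdapters: (userId: string) => Promise<NotificationCount>,
): Promise<NotificationCount> {
  const key = `notifications:count:${userId}`;
  const cached = cache.get(key);
  if (cached) return cached; // cache hit: adapters are not called

  const fresh = await fetchFromAdapters(userId);
  cache.set(key, fresh, COUNT_TTL_MS);
  return fresh;
}
```

Note that the `_t=${Date.now()}` query parameter only defeats HTTP-level caching; in this pattern a forced refresh from the hook can still be answered from the server-side cache until the TTL expires or the key is deleted.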

---

## 🔍 **FLOW 2: Mark Single Notification as Read**

### Step-by-Step Flow:

1. **User Action** (`notification-badge.tsx`)
   - User clicks "Mark as read" button
   - Calls `handleMarkAsRead(notificationId)`
   - Calls `markAsRead(notificationId)` from hook

2. **Hook Action** (`use-notifications.ts`)
   - Makes POST to `/api/notifications/${notificationId}/read`
   - **Optimistic UI Update**:
     - Updates notification in state: `isRead: true`
     - Decrements count: `unread: Math.max(0, prev.unread - 1)`
   - Waits 100ms, then calls `fetchNotificationCount(true)`

3. **API Route** (`app/api/notifications/[id]/read/route.ts`)
   - Authenticates user
   - Parses the notification ID: `leantime-2732` is split into the source (`leantime`) and the source-specific ID (`2732`)
   - Calls `NotificationService.markAsRead(userId, notificationId)`

4. **Service Layer** (`notification-service.ts`)
   - Extracts source: `leantime` from ID
   - Gets adapter: `this.adapters.get('leantime')`
   - Calls `adapter.markAsRead(userId, notificationId)`

5. **Adapter Layer** (`leantime-adapter.ts`)
   - **Gets user email from session**: `getUserEmail()`
   - **Gets Leantime user ID**: `getLeantimeUserId(email)`
   - **⚠️ CRITICAL ISSUE**: If `getLeantimeUserId()` fails → returns `false`
   - If successful: Calls Leantime API `markNotificationRead`
   - Returns success/failure

6. **Cache Invalidation** (`notification-service.ts`)
   - If `markAsRead()` returns `true`:
     - Calls `invalidateCache(userId)`
     - Deletes count cache: `notifications:count:${userId}`
     - Deletes all list caches: `notifications:list:${userId}:*`
   - If it returns `false`: **Cache NOT invalidated** ❌

7. **Count Refresh** (`use-notifications.ts`)
   - After 100ms delay, calls `fetchNotificationCount(true)`
   - Fetches fresh count from API
   - **⚠️ ISSUE**: If cache wasn't invalidated, might get stale count
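
Steps 3 and 4 both rely on splitting a prefixed ID such as `leantime-2732` into a source and a source-specific ID. A minimal sketch of that parsing, assuming the source name never contains a hyphen; `parseNotificationId` and `ParsedNotificationId` are hypothetical names, not the helpers used in the codebase:

```typescript
interface ParsedNotificationId {
  source: string;   // adapter key, e.g. "leantime"
  sourceId: string; // ID inside that source, e.g. "2732"
}

// Splits on the first hyphen only, so IDs that themselves contain
// hyphens stay intact on the right-hand side.
function parseNotificationId(notificationId: string): ParsedNotificationId | null {
  const separator = notificationId.indexOf("-");
  const hasSource = separator > 0;
  const hasId = separator < notificationId.length - 1;
  if (!hasSource || !hasId) return null; // malformed: missing prefix or ID

  return {
    source: notificationId.slice(0, separator),
    sourceId: notificationId.slice(separator + 1),
  };
}
```

Returning `null` for malformed IDs lets the caller distinguish "bad request" from "adapter failed", which matters because only the latter should leave the cache in doubt.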

---

## 🔍 **FLOW 3: Mark All Notifications as Read**

### Step-by-Step Flow:

1. **User Action** (`notification-badge.tsx`)
   - User clicks "Mark all read" button
   - Calls `handleMarkAllAsRead()`
   - Calls `markAllAsRead()` from hook

2. **Hook Action** (`use-notifications.ts`)
   - Makes POST to `/api/notifications/read-all`
   - **Optimistic UI Update**:
     - Sets all notifications: `isRead: true`
     - Sets count: `unread: 0`
   - Waits 200ms, then calls `fetchNotificationCount(true)`

3. **API Route** (`app/api/notifications/read-all/route.ts`)
   - Authenticates user
   - Calls `NotificationService.markAllAsRead(userId)`

4. **Service Layer** (`notification-service.ts`)
   - Loops through all adapters
   - For each adapter:
     - Checks if configured
     - Calls `adapter.markAllAsRead(userId)`
   - Collects results: `[true/false, ...]`
   - Determines: `success = results.every(r => r)`, `anySuccess = results.some(r => r)`
   - **Cache Invalidation**:
     - If `anySuccess === true`: Invalidates cache ✅
     - If `anySuccess === false`: **Cache NOT invalidated** ❌

5. **Adapter Layer** (`leantime-adapter.ts`)
   - **Gets user email**: `getUserEmail()`
   - **Gets Leantime user ID**: `getLeantimeUserId(email)`
   - **⚠️ CRITICAL ISSUE**: If this fails → returns `false` immediately
   - If successful:
     - Fetches all notifications directly from API (up to 1000)
     - Filters unread: `rawNotifications.filter(n => n.read === 0)`
     - Marks each individually using `markNotificationRead`
     - Returns success if any were marked

6. **Cache Invalidation** (`notification-service.ts`)
   - Only happens if `anySuccess === true`
   - **⚠️ ISSUE**: If `getLeantimeUserId()` fails, `anySuccess = false`
   - Cache stays stale → count remains 65

7. **Count Refresh** (`use-notifications.ts`)
   - After 200ms, calls `fetchNotificationCount(true)`
   - **⚠️ ISSUE**: If cache wasn't invalidated, gets stale count from cache

---

## 🔍 **FLOW 4: Fetch Notification List**

### Step-by-Step Flow:

1. **User Opens Dropdown** (`notification-badge.tsx`)
   - `handleOpenChange(true)` called
   - Calls `manualFetch()` which calls `fetchNotifications(1, 10)`

2. **Hook Action** (`use-notifications.ts`)
   - Makes GET to `/api/notifications?page=1&limit=20`
   - Updates state: `setNotifications(data.notifications)`

3. **API Route** (`app/api/notifications/route.ts`)
   - Authenticates user
   - Calls `NotificationService.getNotifications(userId, page, limit)`

4. **Service Layer** (`notification-service.ts`)
   - **Checks Redis cache first**: `notifications:list:${userId}:${page}:${limit}`
   - If cached: Returns cached data immediately
   - If not cached: Fetches from adapters

5. **Adapter Layer** (`leantime-adapter.ts`)
   - Gets user email and Leantime user ID
   - Calls Leantime API `getAllNotifications` with pagination
   - Transforms notifications to our format
   - Returns array

6. **Cache Storage**
   - Service stores list in Redis with 5-minute TTL
   - Returns to API
   - API returns to hook
   - Hook updates React state

---

## 🐛 **IDENTIFIED ISSUES**

### **Issue #1: getLeantimeUserId() Fails Inconsistently**

**Problem**:
- `getLeantimeUserId()` works in `getNotifications()` and `getNotificationCount()`
- But fails in `markAllAsRead()` and sometimes in `markAsRead()`
- Logs show: `"User not found in Leantime: a.tmiri@clm.foundation"`

**Root Cause**:
- `getLeantimeUserId()` calls Leantime API `getAll` users endpoint
- Fetches ALL users, then searches for matching email
- **Possible causes**:
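  1. **Race condition**: The lookups run at different times and can receive different API responses
  2. **Session timing**: The session (and therefore the email) might differ between calls
  3. **API rate limiting**: Leantime API might throttle requests
  4. **No caching**: Every call repeats the full user lookup, so a single transient failure breaks the operation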

**Impact**:
- Mark all as read fails → cache not invalidated → count stays 65
- Mark single as read might fail → cache not invalidated → count doesn't update

**Solution**:
- Cache Leantime user ID in Redis with longer TTL
- Add retry logic with exponential backoff
- Add better error handling and logging
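
The retry suggestion can be sketched as a small generic wrapper. This is an illustration under assumptions, not existing project code: `withRetry`, its parameter names, and the 200ms base delay are all hypothetical, and the `sleep` parameter exists only so the delay can be replaced in tests.

```typescript
// Retries an async operation with exponential backoff.
// Rethrows the last error once all attempts are used up.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      console.warn(`Attempt ${attempt}/${maxAttempts} failed`, error);
      if (attempt < maxAttempts) {
        // Delay doubles each time: base, 2x base, 4x base, ...
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}
```

A lookup such as `getLeantimeUserId(email)` would need to throw (rather than return `null`) on transport errors for this wrapper to retry it; a genuine "user not found" answer should not be retried.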

---

### **Issue #2: Cache Invalidation Only on Success**

**Problem**:
- Cache is only invalidated if `markAsRead()` or `markAllAsRead()` returns `true`
- If operation fails (e.g., `getLeantimeUserId()` fails), cache stays stale
- Count remains at old value (65)

**Root Cause**:
```typescript
if (success) {
  await this.invalidateCache(userId);
}
```

**Impact**:
- User sees stale count even after attempting to mark as read
- UI shows optimistic update, but server count doesn't match

**Solution**:
- Always invalidate cache after marking attempt (even on failure)
- Or: Invalidate cache before marking, then refresh after
- Or: Use optimistic updates with eventual consistency
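
The first option (always invalidate) can be expressed with `try`/`finally`. This is a hedged sketch rather than the real service method: `NotificationAdapter` is reduced to the one method needed here, `markAllAsReadAndInvalidate` is a hypothetical name, and the adapters run in parallel for brevity whereas the service described above loops over them.

```typescript
interface NotificationAdapter {
  markAllAsRead(userId: string): Promise<boolean>;
}

async function markAllAsReadAndInvalidate(
  adapters: NotificationAdapter[],
  userId: string,
  invalidateCache: (userId: string) => Promise<void>,
): Promise<boolean> {
  try {
    const results = await Promise.all(
      adapters.map((adapter) =>
        // A rejected adapter promise counts as a failure
        // instead of aborting the other adapters.
        adapter.markAllAsRead(userId).catch(() => false),
      ),
    );
    return results.every((ok) => ok);
  } finally {
    // Runs whether the adapters succeeded, returned false, or threw,
    // so the next count fetch always goes back to the source.
    await invalidateCache(userId);
  }
}
```

The trade-off is one extra upstream fetch after a failed attempt, which is cheap compared with showing a stale count for the remainder of the TTL.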

---

### **Issue #3: Count Based on First 100 Notifications**

**Problem**:
- `getNotificationCount()` only fetches first 100 notifications
- If a user has 200 notifications and all 66 unread ones fall within the first 100, the count correctly shows 66
- But any unread notifications beyond the first 100 are never counted, so the reported total is too low

**Root Cause**:
```typescript
const notifications = await this.getNotifications(userId, 1, 100);
const unreadCount = notifications.filter(n => !n.isRead).length;
```

**Impact**:
- Count might be inaccurate if >100 notifications exist
- User might see "66 unread" but only 10 displayed (pagination)

**Solution**:
- Use dedicated count API if Leantime provides one
- Or: Fetch all notifications for counting (up to reasonable limit)
- Or: Show "100+ unread" if count reaches 100
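
If Leantime offers no count-only endpoint, counting across pages with an upper bound is one way to combine the last two options. The sketch below assumes a page-fetching function with a `(page, limit)` signature similar to `getNotifications`; `countUnread`, `UnreadTally`, `formatUnreadLabel`, and the 10-page cap are hypothetical.

```typescript
interface NotificationLike {
  isRead: boolean;
}

interface UnreadTally {
  unread: number;
  truncated: boolean; // true when the page cap was hit before the data ran out
}

async function countUnread(
  fetchPage: (page: number, limit: number) => Promise<NotificationLike[]>,
  pageSize: number = 100,
  maxPages: number = 10,
): Promise<UnreadTally> {
  let unread = 0;
  for (let page = 1; page <= maxPages; page++) {
    const items = await fetchPage(page, pageSize);
    unread += items.filter((n) => !n.isRead).length;
    if (items.length < pageSize) {
      return { unread, truncated: false }; // short page: nothing left to fetch
    }
  }
  return { unread, truncated: true }; // cap reached; more notifications may exist
}

// Adds a "+" suffix when the tally may be incomplete.
function formatUnreadLabel(tally: UnreadTally): string {
  return tally.truncated ? `${tally.unread}+ unread` : `${tally.unread} unread`;
}
```

Keying the "+" suffix on whether the fetch was truncated, rather than on the unread number itself, avoids mislabeling a user who has exactly 100 unread notifications and nothing more.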

---

### **Issue #4: Race Condition Between Cache Invalidation and Count Fetch**

**Problem**:
- Hook calls `fetchNotificationCount(true)` after 100-200ms delay
- But cache invalidation might not be complete
- Count fetch might still get stale cache

**Root Cause**:
```typescript
setTimeout(() => {
  fetchNotificationCount(true);
}, 200);
```

**Impact**:
- Count might not update immediately after marking
- User sees optimistic update, then stale count

**Solution**:
- Increase delay to 500ms
- Or: Poll count until it matches expected value
- Or: Use WebSocket/SSE for real-time updates
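
The polling option can be sketched as a bounded loop that stops as soon as the server reports the expected value. `pollCountUntil`, its defaults, and the injectable `sleep` are hypothetical; `fetchUnread` stands in for whatever the hook uses to read the unread count.

```typescript
interface PollResult {
  unread: number;   // last value reported by the server
  matched: boolean; // whether it reached the expected value in time
}

async function pollCountUntil(
  fetchUnread: () => Promise<number>,
  expected: number,
  maxAttempts: number = 5,
  intervalMs: number = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<PollResult> {
  let unread = await fetchUnread();
  // Keep fetching until the value matches or the attempt budget is spent.
  for (let attempt = 1; attempt < maxAttempts && unread !== expected; attempt++) {
    await sleep(intervalMs);
    unread = await fetchUnread();
  }
  return { unread, matched: unread === expected };
}
```

Unlike a fixed `setTimeout`, this returns immediately when the first fetch is already fresh and gives up after a bounded number of attempts, so the caller can decide how to present a count that never converged.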

---

### **Issue #5: No Caching of Leantime User ID**

**Problem**:
- `getLeantimeUserId()` fetches ALL users from Leantime API every time
- No caching, so repeated calls are slow and might fail
- Different calls might get different results (race condition)

**Root Cause**:
- No Redis cache for user ID mapping
- Each call makes full API request

**Impact**:
- Slow performance
- Inconsistent results
- API rate limiting issues

**Solution**:
- Cache user ID in Redis: `leantime:userid:${email}` with 1-hour TTL
- Invalidate cache only when user changes or on explicit refresh
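
A minimal sketch of the proposed lookup cache follows. It is written against a tiny hypothetical `KeyValueStore` interface rather than a specific Redis client, and it assumes the Leantime user ID is numeric; `getLeantimeUserIdCached` and `lookupFromApi` are illustrative names.

```typescript
// Hypothetical minimal store interface; a Redis client would sit behind it.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

const USER_ID_TTL_SECONDS = 60 * 60; // 1-hour TTL, as proposed above

async function getLeantimeUserIdCached(
  store: KeyValueStore,
  email: string,
  lookupFromApi: (email: string) => Promise<number | null>,
): Promise<number | null> {
  const key = `leantime:userid:${email}`;
  const cached = await store.get(key);
  if (cached !== null) return Number(cached);

  const userId = await lookupFromApi(email);
  if (userId !== null) {
    // Cache only successful lookups; caching a miss would pin a
    // transient failure for the full TTL.
    await store.set(key, String(userId), USER_ID_TTL_SECONDS);
  }
  return userId;
}
```

Because every flow (list, count, mark as read, mark all as read) would resolve the ID through the same cached path, a lookup that succeeded once keeps succeeding for the TTL window even if the users endpoint is briefly unavailable.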

---

### **Issue #6: getNotificationCount Uses Cached getNotifications**

**Problem**:
- `getNotificationCount()` calls `getNotifications(userId, 1, 100)`
- `getNotifications()` uses cache if available
- Count might be based on stale cached notifications

**Root Cause**:
```typescript
async getNotificationCount(userId: string): Promise<NotificationCount> {
  const notifications = await this.getNotifications(userId, 1, 100);
  // Uses cached data if available
}
```

**Impact**:
- Count might be stale even if notifications were marked as read
- Cache TTL mismatch: count cache (30s) vs list cache (5min)

**Solution**:
- Fetch notifications directly from API for counting (bypass cache)
- Or: Use dedicated count endpoint
- Or: Invalidate list cache when count cache is invalidated

---

### **Issue #7: Optimistic Updates Don't Match Server State**

**Problem**:
- Hook optimistically updates count: `unread: 0`
- But server count might still be 65 (cache not invalidated)
- After refresh, count jumps back to 65

**Root Cause**:
- Optimistic update happens immediately
- Server cache invalidation might fail
- Count refresh gets stale data

**Impact**:
- Confusing UX: count goes to 0, then back to 65
- User thinks operation failed when it might have succeeded

**Solution**:
- Only show optimistic update if we're confident operation will succeed
- Or: Show loading state until server confirms
- Or: Poll until count matches expected value
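
The "loading state until server confirms" option could look like the sketch below. It is an assumption-laden illustration: `BadgeState`, `markAllReadWithPendingState`, and the callback parameters are hypothetical and do not mirror the real hook's state shape.

```typescript
interface BadgeState {
  unread: number;
  pending: boolean; // true while waiting for server confirmation
}

async function markAllReadWithPendingState(
  state: BadgeState,
  setState: (next: BadgeState) => void,
  markAllRead: () => Promise<boolean>,
  fetchUnread: () => Promise<number>,
): Promise<BadgeState> {
  // Keep the old number on screen, but flag it as in flight.
  setState({ ...state, pending: true });

  let next: BadgeState;
  try {
    await markAllRead();
    // Whatever the outcome, display what the server now reports.
    next = { unread: await fetchUnread(), pending: false };
  } catch {
    next = { ...state, pending: false }; // request failed: keep last known count
  }
  setState(next);
  return next;
}
```

With this approach the badge never shows 0 unless the server has reported 0, which removes the 0-then-65 jump at the cost of a short spinner.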

---

## 🎯 **RECOMMENDED IMPROVEMENTS**

### **Priority 1: Fix getLeantimeUserId() Reliability**

1. **Cache User ID Mapping**
   ```typescript
   // Cache key: leantime:userid:${email}
   // TTL: 1 hour
   // Invalidate on user update or explicit refresh
   ```

2. **Add Retry Logic**
   ```typescript
   // Retry 3 times with exponential backoff
   // Log each attempt
   // Return cached value if API fails
   ```

3. **Better Error Handling**
   ```typescript
   // Log full error details
   // Return null only after all retries fail
   // Don't fail entire operation on user ID lookup failure
   ```

---

### **Priority 2: Always Invalidate Cache After Marking**

1. **Invalidate Before Marking**
   ```typescript
   // Invalidate cache first
   // Then mark as read
   // Then refresh count
   ```

2. **Or: Always Invalidate After Attempt**
   ```typescript
   // Always invalidate cache after marking attempt
   // Even if operation failed
   // This ensures fresh data on next fetch
   ```

---

### **Priority 3: Fix Count Accuracy**

1. **Use Dedicated Count API** (if available)
   ```typescript
   // Check if Leantime has count-only endpoint
   // Use that instead of fetching all notifications
   ```

2. **Or: Fetch All for Counting**
   ```typescript
   // Fetch up to 1000 notifications for counting
   // Or use pagination to count all
   ```

3. **Or: Show "100+ unread" if limit reached**
   ```typescript
   // If count === 100, show "100+ unread"
   // Indicate there might be more
   ```

---

### **Priority 4: Improve Cache Strategy**

1. **Unified Cache Invalidation**
   ```typescript
   // When count cache is invalidated, also invalidate list cache
   // When list cache is invalidated, also invalidate count cache
   // Keep them in sync
   ```

2. **Shorter Cache TTLs**
   ```typescript
   // Count cache: 10 seconds (currently 30s)
   // List cache: 1 minute (currently 5min)
   // More frequent updates
   ```

3. **Cache Tags/Versioning**
   ```typescript
   // Use cache version numbers
   // Increment on invalidation
   // Check version before using cache
   ```
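
One way to read the versioning idea: embed a per-user version number in every cache key and bump it on invalidation, so the count and list entries go stale together without pattern-based deletes. The in-memory sketch below is hypothetical (`VersionedCache` and its key layout are not from the codebase); with Redis, the orphaned old-version keys would be left to expire through their TTL.

```typescript
class VersionedCache<T> {
  private versions = new Map<string, number>();
  private entries = new Map<string, T>();

  private versionOf(userId: string): number {
    return this.versions.get(userId) ?? 0;
  }

  // The current version is part of the key, so a bump hides all older entries.
  private keyFor(userId: string, name: string): string {
    return `notifications:${userId}:v${this.versionOf(userId)}:${name}`;
  }

  get(userId: string, name: string): T | undefined {
    return this.entries.get(this.keyFor(userId, name));
  }

  set(userId: string, name: string, value: T): void {
    this.entries.set(this.keyFor(userId, name), value);
  }

  // Single write that invalidates count and every list page for this user.
  invalidate(userId: string): void {
    this.versions.set(userId, this.versionOf(userId) + 1);
  }
}
```

For example, after `cache.set("u1", "count", …)` and `cache.set("u1", "list:1:20", …)`, one `cache.invalidate("u1")` makes both lookups miss while leaving other users' entries untouched.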

---

### **Priority 5: Better Error Recovery**

1. **Graceful Degradation**
   ```typescript
   // If mark as read fails, still invalidate cache
   // Show error message to user
   // Allow retry
   ```

2. **Retry Logic**
   ```typescript
   // Retry failed operations automatically
   // Exponential backoff
   // Max 3 retries
   ```

---

## 📊 **FLOW DIAGRAM: Current vs Improved**

### **Current Flow (Mark All As Read)**:
```
User clicks → Hook → API → Service → Adapter
  ↓
getLeantimeUserId() → FAILS ❌
  ↓
Returns false → Service: anySuccess = false
  ↓
Cache NOT invalidated ❌
  ↓
Count refresh → Gets stale cache → Shows 65 ❌
```

### **Improved Flow (Mark All As Read)**:
```
User clicks → Hook → API → Service → Adapter
  ↓
getLeantimeUserId() → Check cache first
  ↓
If cached: Use cached ID ✅
If not cached: Fetch from API → Cache result ✅
  ↓
Mark all as read → Success ✅
  ↓
Always invalidate cache (even on partial failure) ✅
  ↓
Count refresh → Gets fresh data → Shows 0 ✅
```

---

## 🚀 **IMPLEMENTATION PRIORITY**

1. **Fix getLeantimeUserId() caching** (High Priority)
   - Add Redis cache for user ID mapping
   - Add retry logic
   - Better error handling

2. **Always invalidate cache** (High Priority)
   - Invalidate cache even on failure
   - Or invalidate before marking

3. **Fix count accuracy** (Medium Priority)
   - Use dedicated count API or fetch all
   - Show "100+ unread" if limit reached

4. **Improve cache strategy** (Medium Priority)
   - Unified invalidation
   - Shorter TTLs
   - Cache versioning

5. **Better error recovery** (Low Priority)
   - Graceful degradation
   - Retry logic
   - Better UX

---

**Status**: Analysis complete. Ready for implementation.