NeahNew/NOTIFICATION_ISSUE_ANALYSIS.md
2026-01-06 19:59:37 +01:00

Notification Issue Analysis - Mark All Read Behavior

Date: 2026-01-06
Issue: "Mark all read" starts successfully, then fails partway through with connection errors


🔍 What's Happening

Initial Success:

  1. Dashboard shows 60 messages (count is working)
  2. User clicks "Mark all read"
  3. The marking operation starts successfully

Then Connection Issues:

failed to get redirect response [TypeError: fetch failed] {
  [cause]: [Error: read ECONNRESET] {
    errno: -104,
    code: 'ECONNRESET',
    syscall: 'read'
  }
}
Redis reconnect attempt 1, retrying in 100ms
Reconnecting to Redis..

📊 Analysis

What the Logs Show:

  1. IMAP Pool Activity:

    [IMAP POOL] Size: 1, Active: 1, Connecting: 0, Max: 20
    [IMAP POOL] Size: 0, Active: 0, Connecting: 0, Max: 20
    
    • IMAP connections are being used and released
    • This is normal behavior
  2. Connection Reset Error:

    • ECONNRESET - Connection was reset by peer
    • Happens during a fetch request (likely to Leantime API)
    • This points to a network or connection problem rather than a bug in the application logic
  3. Redis Reconnection:

    • Redis is trying to reconnect (expected behavior)
    • Our retry logic is working
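The logged error has a recognizable shape: a `TypeError` with the message `fetch failed` whose `cause` carries `code: 'ECONNRESET'`. A minimal sketch of how such errors could be told apart from application errors is shown here. The helper name `isTransientNetworkError` and the exact set of codes treated as transient are assumptions for illustration, not part of the existing code.

```typescript
// Codes treated as transient in this sketch; the exact set is an assumption.
const TRANSIENT_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "EPIPE"]);

// Read a string `code` property off an unknown value, if it has one.
function errorCode(value: unknown): string | undefined {
  if (typeof value === "object" && value !== null && "code" in value) {
    const code = (value as { code?: unknown }).code;
    return typeof code === "string" ? code : undefined;
  }
  return undefined;
}

// Hypothetical helper: true when the error itself, or its `cause`,
// carries one of the transient network codes above.
function isTransientNetworkError(err: unknown): boolean {
  const direct = errorCode(err);
  if (direct !== undefined && TRANSIENT_CODES.has(direct)) {
    return true;
  }
  if (typeof err === "object" && err !== null && "cause" in err) {
    const causeCode = errorCode((err as { cause?: unknown }).cause);
    return causeCode !== undefined && TRANSIENT_CODES.has(causeCode);
  }
  return false;
}
```

A check like this would let the caller retry only on connection-level failures and surface everything else (for example a "user not found" response) immediately.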

🎯 Root Cause

Scenario:

  1. User clicks "Mark all read"
  2. System starts marking notifications (works initially)
  3. During the process, the network connection to the Leantime API is reset
  4. This could happen because:
    • Network instability between your server and Leantime
    • Leantime API timeout (if marking many notifications takes too long)
    • Connection pool exhaustion (too many concurrent requests)
    • Server-side rate limiting (Leantime might be throttling requests)

Why It Works Initially Then Fails:

  • First few notifications: Marked successfully
  • After some time: Connection resets
  • Result: Partial success (some marked, some not)

🔧 What Our Fixes Handle

What's Working:

  1. User ID Caching: Should prevent the "user not found" error
  2. Retry Logic: Will retry failed requests automatically
  3. Cache Invalidation: Always happens, so count will refresh
  4. Count Accuracy: Fetches up to 1000 notifications
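The retry logic itself is not shown in this analysis. As a rough sketch of what such logic could look like, the following retries an async operation with exponential backoff. The name `withRetry`, the default of 3 attempts, and the 100 ms base delay are assumptions chosen for illustration.

```typescript
// Resolve after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: run `operation`, retrying on failure with
// exponential backoff. Rethrows the last error once attempts run out.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // With the default base, the waits are 100 ms, then 200 ms, and so on.
        await sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  throw lastError;
}
```

Retrying is safe here only because marking a notification as read is idempotent: repeating a request that already succeeded does not change the outcome.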

⚠️ What's Not Handled:

  1. Long-running operations: Marking 60 notifications individually can take time
  2. Connection timeouts: The Leantime API may respond slowly or time out
  3. Rate limiting: If Leantime throttles too many requests
  4. Partial failures: Some notifications marked, some not

💡 What's Likely Happening

Flow:

1. User clicks "Mark all read"
   ↓
2. System fetches 60 unread notifications ✅
   ↓
3. Starts marking each one individually
   ↓
4. First 10-20 succeed ✅
   ↓
5. Connection resets (ECONNRESET) ❌
   ↓
6. Remaining notifications fail to mark
   ↓
7. Cache is invalidated (our fix) ✅
   ↓
8. Count refresh shows remaining unread (e.g., 40 instead of 0)

Why Count Might Not Be 0:

  • Some notifications were marked (e.g., 20 out of 60)
  • Connection reset prevented marking the rest
  • Cache was invalidated (good!)
  • Count refresh shows remaining unread (40 unread)
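To make the partial outcome explicit rather than inferring it from the refreshed count, the marking loop could record which IDs succeeded and which failed. The following is a sketch under assumptions: `markAllWithSummary` is a hypothetical helper, and `markOne` stands in for whatever call marks a single notification as read.

```typescript
interface MarkSummary {
  marked: string[]; // IDs whose request succeeded
  failed: string[]; // IDs whose request threw or rejected
}

// Hypothetical helper: try every ID one at a time and report the outcome.
async function markAllWithSummary(
  ids: string[],
  markOne: (id: string) => Promise<void>,
): Promise<MarkSummary> {
  const summary: MarkSummary = { marked: [], failed: [] };
  for (const id of ids) {
    try {
      await markOne(id);
      summary.marked.push(id);
    } catch (_err) {
      // Keep going so one reset does not hide the state of the other IDs.
      summary.failed.push(id);
    }
  }
  return summary;
}
```

In the scenario above, a run over 60 IDs where only the first 20 requests get through would return 20 entries in `marked` and 40 in `failed`, which matches the refreshed count of 40.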

🎯 Expected Behavior

With Our Fixes:

  1. User ID lookup is cached (faster, more reliable)
  2. Retry logic handles transient failures
  3. Cache always invalidated (count will refresh)
  4. Count shows accurate number (up to 1000)

What You Should See:

  • First attempt: Some notifications marked, count decreases (e.g., 60 → 40)
  • Second attempt: More notifications marked, count decreases further (e.g., 40 → 20)
  • Eventually: All marked, count reaches 0

If Connection Issues Persist:

  • Count will show remaining unread
  • User can retry "Mark all read"
  • Each retry should mark more notifications
  • All notifications should eventually be marked, provided each attempt gets at least some requests through

🔍 Diagnostic Questions

  1. How many notifications are marked?

    • Check if count decreases (e.g., 60 → 40 → 20 → 0)
    • If it decreases, marking is working but incomplete
  2. Does retry help?

    • Click "Mark all read" again
    • If count decreases further, retry logic is working
  3. Is it always the same number?

    • If the count always stops at the same number (e.g., always 40), specific notifications might be failing
    • If the count varies between attempts, connection issues are the likely cause
  4. Network stability?

    • Check if the connection to the Leantime API is stable
    • Monitor for timeouts or rate limiting

📝 Recommendations

Immediate:

  1. Retry the operation: Click "Mark all read" again

    • Should mark more notifications
    • Count should decrease further
  2. Check logs for specific errors:

    • Look for which notification IDs are failing
    • Check if it's always the same ones
  3. Monitor network:

    • Check connection stability to Leantime
    • Look for timeout patterns
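For the "is it always the same ones" check, the failed IDs collected from the logs of each attempt can be intersected. This is a small sketch; `persistentFailures` is a hypothetical helper, and it assumes the failed notification IDs for each attempt have already been extracted as string arrays.

```typescript
// Hypothetical helper: given the failed notification IDs from each attempt,
// return the IDs that failed in every attempt.
function persistentFailures(failedPerAttempt: string[][]): string[] {
  const first = failedPerAttempt[0] ?? [];
  const rest = failedPerAttempt.slice(1);
  return first.filter((id) => rest.every((attempt) => attempt.includes(id)));
}
```

A non-empty result suggests specific notifications are failing regardless of network conditions. An empty result across several attempts is more consistent with intermittent connection resets.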

Future Improvements (if needed):

  1. Batch marking: Mark notifications in smaller batches (e.g., 10 at a time)
  2. Progress indicator: Show "Marking X of Y..." to user
  3. Resume on failure: Track which notifications were marked, resume from where it failed
  4. Connection pooling: Better management of concurrent requests
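Items 1 to 3 could be combined along the following lines. This is a sketch under assumptions, not the existing implementation: `markInBatches`, `markOne`, and `onProgress` are hypothetical names, and the batch size of 10 is taken from the example in item 1.

```typescript
interface BatchResult {
  marked: string[];    // IDs confirmed as marked in this run
  remaining: string[]; // IDs to pass to the next run (resume point)
}

// Split `items` into consecutive chunks of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) {
    throw new RangeError("size must be at least 1");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper: mark IDs batch by batch and stop after the first
// batch that contains a failure, so the caller can resume later.
async function markInBatches(
  ids: string[],
  markOne: (id: string) => Promise<void>,
  batchSize = 10,
  onProgress?: (done: number, total: number) => void,
): Promise<BatchResult> {
  const marked: string[] = [];
  for (const batch of chunk(ids, batchSize)) {
    // Requests within one batch run concurrently; batches run in sequence.
    const outcomes = await Promise.all(
      batch.map((id) => markOne(id).then(() => true, () => false)),
    );
    batch.forEach((id, i) => {
      if (outcomes[i]) marked.push(id);
    });
    if (onProgress) onProgress(marked.length, ids.length); // "Marking X of Y..."
    if (outcomes.some((ok) => !ok)) break; // likely connection trouble: stop here
  }
  const done = new Set(marked);
  return { marked, remaining: ids.filter((id) => !done.has(id)) };
}
```

Passing `remaining` from one run as `ids` to the next gives the resume behavior from item 3, and capping concurrency at the batch size keeps the number of simultaneous requests to Leantime bounded.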

Summary

What's Working:

  • Initial marking starts successfully
  • User ID caching prevents lookup failures
  • Cache invalidation ensures count refreshes
  • Retry logic handles transient failures

What's Failing:

  • ⚠️ Connection resets during long operations
  • ⚠️ Partial marking (some succeed, some fail)
  • ⚠️ Possible network instability between server and Leantime (one of several candidate causes)

Solution:

  • Retry the operation: Click "Mark all read" multiple times
  • Each retry should mark more notifications
  • Eventually all will be marked

Status: Partial marking is expected while the connection to Leantime is unstable. The fixes ensure the cache is invalidated and the count stays accurate, so the user can retry until all notifications are marked.