NeahNew/LOG_ANALYSIS_SUMMARY.md
2026-01-08 16:17:13 +01:00

5.9 KiB

Log Analysis Summary - Infinite Refresh Loop Fix

Problem Identified

Your logs showed a critical infinite refresh loop:

Keycloak session invalidated, clearing token to force re-authentication
Keycloak session invalidated, clearing token to force re-authentication
Keycloak session invalidated, clearing token to force re-authentication
... (repeating infinitely)

Root Cause

  1. Session Invalidated: User's Keycloak session became invalid (logged out elsewhere, expired, etc.)
  2. Multiple Widgets: All widgets/components making parallel API requests
  3. JWT Callback Triggered: Each request triggers NextAuth JWT callback
  4. Refresh Attempt: Each callback tries to refresh the expired token
  5. Refresh Fails: Refresh fails because session is invalid
  6. No Circuit Breaker: Next request sees expired token → tries refresh again → infinite loop

Impact

  • Performance: Hundreds of refresh attempts per second
  • Server Load: CPU/memory spike
  • Keycloak Load: Potential DoS on Keycloak server
  • User Experience: App appears broken
  • Logs: Spam with error messages

Solution Implemented

Circuit Breaker Pattern

Added a 5-second cooldown after failed refresh attempts:

  1. Track Failures: Record timestamp when refresh fails
  2. Cooldown Period: Don't retry refresh for 5 seconds after failure
  3. Early Return: If in cooldown, return error immediately (no API call)
  4. Memory Management: Cleanup old entries to prevent memory leaks

Code Changes

File: app/api/auth/options.ts

Added:

  • refreshCooldown Map to track last failure per user
  • REFRESH_COOLDOWN_MS = 5000 (5 seconds)
  • cleanupRefreshCooldown() function to prevent memory leaks
  • Cooldown check before refresh attempt
  • Failure recording after failed refresh

How It Works:

// Before refresh attempt:
if (timeSinceFailure < REFRESH_COOLDOWN_MS) {
  // Skip refresh, return error immediately
  return errorToken;
}

// After failed refresh:
if (refreshedToken.error === "SessionNotActive") {
  refreshCooldown.set(userId, Date.now()); // Record failure
  return errorToken;
}

Expected Behavior After Fix

Before Fix

Request 1 → Refresh attempt → Fail → Clear tokens
Request 2 → Refresh attempt → Fail → Clear tokens
Request 3 → Refresh attempt → Fail → Clear tokens
... (infinite loop)

After Fix

Request 1 → Refresh attempt → Fail → Record failure → Clear tokens
Request 2 → Check cooldown → Skip refresh → Return error immediately
Request 3 → Check cooldown → Skip refresh → Return error immediately
... (cooldown prevents refresh attempts)
After 5s → Next request can try refresh again (if session restored)

What You'll See in Logs

Good Signs:

  • "Refresh cooldown active, skipping refresh attempt" (instead of infinite failures)
  • Only 1-2 refresh attempts per user when session invalidates
  • User redirected to sign-in page
  • No refresh storm

Bad Signs (if still happening):

  • Still seeing infinite "Keycloak session invalidated" messages
  • Multiple refresh attempts within 5 seconds
  • Cooldown not working

Testing the Fix

Test Scenario 1: Session Invalidation

  1. Log in to the app
  2. Logout from Keycloak admin console (or expire session)
  3. Expected:
    • 1-2 refresh attempts
    • Then cooldown messages
    • User redirected to sign-in
    • NOT infinite loop

Test Scenario 2: Multiple Widgets

  1. Open app with all widgets loading
  2. Invalidate session
  3. Expected:
    • All widgets respect cooldown
    • No refresh storm
    • Clean error handling

Test Scenario 3: Normal Operation

  1. Valid session
  2. Token expires naturally
  3. Expected:
    • Refresh succeeds
    • No cooldown triggered
    • Normal operation continues

Monitoring

Metrics to Watch

  1. Refresh Attempts: Should be low (1-2 per user per session)
  2. Cooldown Activations: Should only happen when session invalid
  3. Refresh Success Rate: Should be high for valid sessions
  4. Error Rate: Should drop significantly

Log Patterns

Healthy:

[DEBUG] Refresh cooldown active, skipping refresh attempt
[INFO] Keycloak session invalidated, setting cooldown

Unhealthy (if still happening):

Keycloak session invalidated, clearing token... (repeating)

Future Improvements

  1. Done: In-memory circuit breaker
  2. ⚠️ Next: Migrate to Redis-based circuit breaker (for multi-instance)
  3. ⚠️ Next: Add client-side session guard to stop requests

Long-term

  1. ⚠️ Add metrics/monitoring
  2. ⚠️ Implement exponential backoff
  3. ⚠️ Add request cancellation on client-side
  4. ⚠️ Better error boundaries

Additional Notes

Why 5 Seconds?

  • Too Short (< 2s): Still allows refresh storms
  • Too Long (> 10s): Delays legitimate refresh attempts
  • 5 Seconds: Good balance - prevents storms, allows quick recovery

Memory Considerations

  • Map Size: Limited to 1000 entries (auto-cleanup)
  • Memory Per Entry: ~50 bytes (userId + timestamp)
  • Total Memory: ~50KB max
  • Cleanup: Automatic (removes entries older than 50s)

Multi-Instance Deployment

Current: In-memory Map (per-instance)

  • Works for single instance
  • Each instance has its own cooldown

Future: Redis-based (shared across instances)

  • Better for multi-instance
  • Shared cooldown state
  • See CRITICAL_ISSUE_ANALYSIS.md for Redis implementation

Summary

Fixed: Infinite refresh loop with circuit breaker Impact: Prevents refresh storms, reduces server load Testing: Verify with session invalidation scenarios ⚠️ Next: Monitor logs, consider Redis migration for multi-instance

The fix is production-ready and should immediately stop the refresh loop you're seeing in your logs.