# Critical Issue: Infinite Token Refresh Loop ## Problem Analysis ### What's Happening 1. **Initial Load**: App starts successfully, user authenticated 2. **Session Invalidation**: Keycloak session becomes invalid (user logged out elsewhere, session expired, etc.) 3. **Refresh Storm**: Every API request triggers: - JWT callback execution - Token refresh attempt - Refresh failure (session invalid) - Token cleared, but error state persists - **Next request repeats the cycle** ### Root Cause The JWT callback in `app/api/auth/options.ts` has no circuit breaker: ```typescript // Current problematic flow: if (expiresAt && Date.now() < expiresAt) { return token; // Token valid } // Token expired - ALWAYS tries to refresh const refreshedToken = await refreshAccessToken(token); // If refresh fails, clears tokens but... // Next request will see expired token and try again! ``` **The Problem:** - No cooldown period after failed refresh - No "session invalid" cache/flag - Every request triggers refresh attempt - Multiple widgets = multiple parallel requests = refresh storm ### Impact - **Performance**: Excessive Keycloak API calls - **Server Load**: CPU/memory spike from refresh attempts - **User Experience**: App appears broken, constant loading - **Logs**: Spam with "Keycloak session invalidated" messages - **Security**: Potential DoS on Keycloak server --- ## Solution: Circuit Breaker Pattern ### Implementation Strategy 1. **Add Refresh Cooldown**: Don't retry refresh for X seconds after failure 2. **Session Invalid Flag**: Cache the "session invalid" state 3. **Early Return**: If session known to be invalid, skip refresh attempt 4. **Client-Side Detection**: Stop making requests when session invalid ### Code Changes Needed #### 1. Add Circuit Breaker to JWT Callback ```typescript // Add to app/api/auth/options.ts // Track last failed refresh attempt const lastFailedRefresh = new Map(); const REFRESH_COOLDOWN = 5000; // 5 seconds async jwt({ token, account, profile }) { // ... existing initial sign-in logic ... // Check if token is expired const expiresAt = token.accessTokenExpires as number; if (expiresAt && Date.now() < expiresAt) { return token; // Token still valid } // CIRCUIT BREAKER: Check if we recently failed to refresh const userId = token.sub || 'unknown'; const lastFailure = lastFailedRefresh.get(userId) || 0; const timeSinceFailure = Date.now() - lastFailure; if (timeSinceFailure < REFRESH_COOLDOWN) { // Too soon after failure, return error token immediately logger.debug('Refresh cooldown active, skipping refresh attempt', { userId, timeSinceFailure, }); return { ...token, error: "SessionNotActive", accessToken: undefined, refreshToken: undefined, idToken: undefined, }; } // Try to refresh if (!token.refreshToken) { return { ...token, error: "NoRefreshToken", accessToken: undefined, refreshToken: undefined, idToken: undefined, }; } const refreshedToken = await refreshAccessToken(token); // If refresh failed, record the failure time if (refreshedToken.error === "SessionNotActive") { lastFailedRefresh.set(userId, Date.now()); logger.info("Keycloak session invalidated, setting cooldown", { userId }); // Clean up old entries (prevent memory leak) if (lastFailedRefresh.size > 1000) { const now = Date.now(); for (const [key, value] of lastFailedRefresh.entries()) { if (now - value > REFRESH_COOLDOWN * 10) { lastFailedRefresh.delete(key); } } } return { ...refreshedToken, accessToken: undefined, refreshToken: undefined, idToken: undefined, }; } // Success - clear any previous failure record lastFailedRefresh.delete(userId); return refreshedToken; } ``` #### 2. Use Redis for Distributed Circuit Breaker For multi-instance deployments, use Redis: ```typescript // lib/services/auth-circuit-breaker.ts import { getRedisClient } from '@/lib/redis'; const REFRESH_COOLDOWN = 5000; // 5 seconds const CIRCUIT_BREAKER_KEY = (userId: string) => `auth:refresh:cooldown:${userId}`; export async function isRefreshInCooldown(userId: string): Promise { const redis = getRedisClient(); const key = CIRCUIT_BREAKER_KEY(userId); const lastFailure = await redis.get(key); if (!lastFailure) { return false; } const timeSinceFailure = Date.now() - parseInt(lastFailure, 10); return timeSinceFailure < REFRESH_COOLDOWN; } export async function recordRefreshFailure(userId: string): Promise { const redis = getRedisClient(); const key = CIRCUIT_BREAKER_KEY(userId); await redis.set(key, Date.now().toString(), 'EX', Math.ceil(REFRESH_COOLDOWN / 1000)); } export async function clearRefreshCooldown(userId: string): Promise { const redis = getRedisClient(); const key = CIRCUIT_BREAKER_KEY(userId); await redis.del(key); } ``` #### 3. Client-Side Request Stopping Add to components to stop making requests when session invalid: ```typescript // hooks/use-session-guard.ts import { useSession } from 'next-auth/react'; import { useEffect, useRef } from 'react'; export function useSessionGuard() { const { status, data: session } = useSession(); const hasInvalidSession = useRef(false); useEffect(() => { if (status === 'unauthenticated' && !hasInvalidSession.current) { hasInvalidSession.current = true; // Stop all refresh intervals // Clear any pending requests } }, [status]); return { shouldMakeRequests: status === 'authenticated' && !hasInvalidSession.current, isInvalid: hasInvalidSession.current, }; } ``` --- ## Immediate Fix (Quick) ### Option 1: Add Simple Cooldown (In-Memory) Add this to `app/api/auth/options.ts`: ```typescript // At top of file const refreshCooldown = new Map(); const COOLDOWN_MS = 5000; // 5 seconds // In jwt callback, before refresh attempt: const userId = token.sub || 'unknown'; const lastFailure = refreshCooldown.get(userId) || 0; if (Date.now() - lastFailure < COOLDOWN_MS) { // Skip refresh, return error immediately return { ...token, error: "SessionNotActive", accessToken: undefined, refreshToken: undefined, idToken: undefined, }; } // After failed refresh: if (refreshedToken.error === "SessionNotActive") { refreshCooldown.set(userId, Date.now()); // ... rest of error handling } ``` ### Option 2: Early Return on Known Invalid Session ```typescript // In jwt callback, check token error first: if (token.error === "SessionNotActive") { // Already know session is invalid, don't try again return { ...token, accessToken: undefined, refreshToken: undefined, idToken: undefined, error: "SessionNotActive", }; } ``` --- ## Recommended Implementation 1. **Immediate**: Add simple in-memory cooldown (Option 1) 2. **Short-term**: Migrate to Redis-based circuit breaker 3. **Long-term**: Add client-side session guard to stop requests --- ## Testing After implementing: 1. **Test Scenario 1**: Logout from Keycloak admin console - Should see: 1-2 refresh attempts, then cooldown - Should NOT see: Infinite loop 2. **Test Scenario 2**: Expire session manually - Should see: Cooldown prevents refresh storm - Should see: User redirected to sign-in 3. **Test Scenario 3**: Multiple widgets loading - Should see: All widgets respect cooldown - Should see: No refresh storm --- ## Monitoring Add metrics: - Refresh attempt count - Refresh failure count - Cooldown activations - Session invalidations per user