293 lines
7.6 KiB
Markdown
293 lines
7.6 KiB
Markdown
# Critical Issue: Infinite Token Refresh Loop
|
|
|
|
## Problem Analysis
|
|
|
|
### What's Happening
|
|
|
|
1. **Initial Load**: App starts successfully, user authenticated
|
|
2. **Session Invalidation**: Keycloak session becomes invalid (user logged out elsewhere, session expired, etc.)
|
|
3. **Refresh Storm**: Every API request triggers:
|
|
- JWT callback execution
|
|
- Token refresh attempt
|
|
- Refresh failure (session invalid)
|
|
- Token cleared, but error state persists
|
|
- **Next request repeats the cycle**
|
|
|
|
### Root Cause
|
|
|
|
The JWT callback in `app/api/auth/options.ts` has no circuit breaker:
|
|
|
|
```typescript
|
|
// Current problematic flow:
|
|
if (expiresAt && Date.now() < expiresAt) {
|
|
return token; // Token valid
|
|
}
|
|
|
|
// Token expired - ALWAYS tries to refresh
|
|
const refreshedToken = await refreshAccessToken(token);
|
|
|
|
// If refresh fails, clears tokens but...
|
|
// Next request will see expired token and try again!
|
|
```
|
|
|
|
**The Problem:**
|
|
- No cooldown period after failed refresh
|
|
- No "session invalid" cache/flag
|
|
- Every request triggers refresh attempt
|
|
- Multiple widgets = multiple parallel requests = refresh storm
|
|
|
|
### Impact
|
|
|
|
- **Performance**: Excessive Keycloak API calls
|
|
- **Server Load**: CPU/memory spike from refresh attempts
|
|
- **User Experience**: App appears broken, constant loading
|
|
- **Logs**: Spam with "Keycloak session invalidated" messages
|
|
- **Security**: Potential DoS on Keycloak server
|
|
|
|
---
|
|
|
|
## Solution: Circuit Breaker Pattern
|
|
|
|
### Implementation Strategy
|
|
|
|
1. **Add Refresh Cooldown**: Don't retry refresh for X seconds after failure
|
|
2. **Session Invalid Flag**: Cache the "session invalid" state
|
|
3. **Early Return**: If session known to be invalid, skip refresh attempt
|
|
4. **Client-Side Detection**: Stop making requests when session invalid
|
|
|
|
### Code Changes Needed
|
|
|
|
#### 1. Add Circuit Breaker to JWT Callback
|
|
|
|
```typescript
|
|
// Add to app/api/auth/options.ts
|
|
|
|
// Track last failed refresh attempt
|
|
const lastFailedRefresh = new Map<string, number>();
|
|
const REFRESH_COOLDOWN = 5000; // 5 seconds
|
|
|
|
async jwt({ token, account, profile }) {
|
|
// ... existing initial sign-in logic ...
|
|
|
|
// Check if token is expired
|
|
const expiresAt = token.accessTokenExpires as number;
|
|
if (expiresAt && Date.now() < expiresAt) {
|
|
return token; // Token still valid
|
|
}
|
|
|
|
// CIRCUIT BREAKER: Check if we recently failed to refresh
|
|
const userId = token.sub || 'unknown';
|
|
const lastFailure = lastFailedRefresh.get(userId) || 0;
|
|
const timeSinceFailure = Date.now() - lastFailure;
|
|
|
|
if (timeSinceFailure < REFRESH_COOLDOWN) {
|
|
// Too soon after failure, return error token immediately
|
|
logger.debug('Refresh cooldown active, skipping refresh attempt', {
|
|
userId,
|
|
timeSinceFailure,
|
|
});
|
|
return {
|
|
...token,
|
|
error: "SessionNotActive",
|
|
accessToken: undefined,
|
|
refreshToken: undefined,
|
|
idToken: undefined,
|
|
};
|
|
}
|
|
|
|
// Try to refresh
|
|
if (!token.refreshToken) {
|
|
return {
|
|
...token,
|
|
error: "NoRefreshToken",
|
|
accessToken: undefined,
|
|
refreshToken: undefined,
|
|
idToken: undefined,
|
|
};
|
|
}
|
|
|
|
const refreshedToken = await refreshAccessToken(token);
|
|
|
|
// If refresh failed, record the failure time
|
|
if (refreshedToken.error === "SessionNotActive") {
|
|
lastFailedRefresh.set(userId, Date.now());
|
|
logger.info("Keycloak session invalidated, setting cooldown", { userId });
|
|
|
|
// Clean up old entries (prevent memory leak)
|
|
if (lastFailedRefresh.size > 1000) {
|
|
const now = Date.now();
|
|
for (const [key, value] of lastFailedRefresh.entries()) {
|
|
if (now - value > REFRESH_COOLDOWN * 10) {
|
|
lastFailedRefresh.delete(key);
|
|
}
|
|
}
|
|
}
|
|
|
|
return {
|
|
...refreshedToken,
|
|
accessToken: undefined,
|
|
refreshToken: undefined,
|
|
idToken: undefined,
|
|
};
|
|
}
|
|
|
|
// Success - clear any previous failure record
|
|
lastFailedRefresh.delete(userId);
|
|
|
|
return refreshedToken;
|
|
}
|
|
```
|
|
|
|
#### 2. Use Redis for Distributed Circuit Breaker
|
|
|
|
For multi-instance deployments, use Redis:
|
|
|
|
```typescript
|
|
// lib/services/auth-circuit-breaker.ts
|
|
import { getRedisClient } from '@/lib/redis';
|
|
|
|
const REFRESH_COOLDOWN = 5000; // 5 seconds
|
|
const CIRCUIT_BREAKER_KEY = (userId: string) => `auth:refresh:cooldown:${userId}`;
|
|
|
|
export async function isRefreshInCooldown(userId: string): Promise<boolean> {
|
|
const redis = getRedisClient();
|
|
const key = CIRCUIT_BREAKER_KEY(userId);
|
|
const lastFailure = await redis.get(key);
|
|
|
|
if (!lastFailure) {
|
|
return false;
|
|
}
|
|
|
|
const timeSinceFailure = Date.now() - parseInt(lastFailure, 10);
|
|
return timeSinceFailure < REFRESH_COOLDOWN;
|
|
}
|
|
|
|
export async function recordRefreshFailure(userId: string): Promise<void> {
|
|
const redis = getRedisClient();
|
|
const key = CIRCUIT_BREAKER_KEY(userId);
|
|
await redis.set(key, Date.now().toString(), 'EX', Math.ceil(REFRESH_COOLDOWN / 1000));
|
|
}
|
|
|
|
export async function clearRefreshCooldown(userId: string): Promise<void> {
|
|
const redis = getRedisClient();
|
|
const key = CIRCUIT_BREAKER_KEY(userId);
|
|
await redis.del(key);
|
|
}
|
|
```
|
|
|
|
#### 3. Client-Side Request Stopping
|
|
|
|
Add to components to stop making requests when session invalid:
|
|
|
|
```typescript
|
|
// hooks/use-session-guard.ts
|
|
import { useSession } from 'next-auth/react';
|
|
import { useEffect, useRef } from 'react';
|
|
|
|
export function useSessionGuard() {
|
|
const { status, data: session } = useSession();
|
|
const hasInvalidSession = useRef(false);
|
|
|
|
useEffect(() => {
|
|
if (status === 'unauthenticated' && !hasInvalidSession.current) {
|
|
hasInvalidSession.current = true;
|
|
// Stop all refresh intervals
|
|
// Clear any pending requests
|
|
}
|
|
}, [status]);
|
|
|
|
return {
|
|
shouldMakeRequests: status === 'authenticated' && !hasInvalidSession.current,
|
|
isInvalid: hasInvalidSession.current,
|
|
};
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## Immediate Fix (Quick)
|
|
|
|
### Option 1: Add Simple Cooldown (In-Memory)
|
|
|
|
Add this to `app/api/auth/options.ts`:
|
|
|
|
```typescript
|
|
// At top of file
|
|
const refreshCooldown = new Map<string, number>();
|
|
const COOLDOWN_MS = 5000; // 5 seconds
|
|
|
|
// In jwt callback, before refresh attempt:
|
|
const userId = token.sub || 'unknown';
|
|
const lastFailure = refreshCooldown.get(userId) || 0;
|
|
|
|
if (Date.now() - lastFailure < COOLDOWN_MS) {
|
|
// Skip refresh, return error immediately
|
|
return {
|
|
...token,
|
|
error: "SessionNotActive",
|
|
accessToken: undefined,
|
|
refreshToken: undefined,
|
|
idToken: undefined,
|
|
};
|
|
}
|
|
|
|
// After failed refresh:
|
|
if (refreshedToken.error === "SessionNotActive") {
|
|
refreshCooldown.set(userId, Date.now());
|
|
// ... rest of error handling
|
|
}
|
|
```
|
|
|
|
### Option 2: Early Return on Known Invalid Session
|
|
|
|
```typescript
|
|
// In jwt callback, check token error first:
|
|
if (token.error === "SessionNotActive") {
|
|
// Already know session is invalid, don't try again
|
|
return {
|
|
...token,
|
|
accessToken: undefined,
|
|
refreshToken: undefined,
|
|
idToken: undefined,
|
|
error: "SessionNotActive",
|
|
};
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## Recommended Implementation
|
|
|
|
1. **Immediate**: Add simple in-memory cooldown (Option 1)
|
|
2. **Short-term**: Migrate to Redis-based circuit breaker
|
|
3. **Long-term**: Add client-side session guard to stop requests
|
|
|
|
---
|
|
|
|
## Testing
|
|
|
|
After implementing:
|
|
|
|
1. **Test Scenario 1**: Logout from Keycloak admin console
|
|
- Should see: 1-2 refresh attempts, then cooldown
|
|
- Should NOT see: Infinite loop
|
|
|
|
2. **Test Scenario 2**: Expire session manually
|
|
- Should see: Cooldown prevents refresh storm
|
|
- Should see: User redirected to sign-in
|
|
|
|
3. **Test Scenario 3**: Multiple widgets loading
|
|
- Should see: All widgets respect cooldown
|
|
- Should see: No refresh storm
|
|
|
|
---
|
|
|
|
## Monitoring
|
|
|
|
Add metrics:
|
|
- Refresh attempt count
|
|
- Refresh failure count
|
|
- Cooldown activations
|
|
- Session invalidations per user
|
|
|