Building Resilient Systems: Timeout, Retry, and Fallback

Resilient systems handle failures gracefully instead of hanging or crashing when a dependency misbehaves. The three patterns below (timeout, retry, and fallback) are ones that hold up in production systems.

Timeout Pattern

Implementation

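function withTimeout(promise, timeoutMs) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => {
            reject(new Error('Operation timed out'));
        }, timeoutMs);
    });
    // Clear the timer once either promise settles so it cannot keep
    // the process alive or fire after the result is already known.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}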

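// Usage (wrapped in an async function, since it uses await and return;
// loadUser is just an illustrative name)
async function loadUser(userId) {
    try {
        return await withTimeout(
            fetchUser(userId),
            5000 // 5 second timeout
        );
    } catch (error) {
        if (error.message === 'Operation timed out') {
            // On timeout, serve the cached copy instead
            return getCachedUser(userId);
        }
        throw error;
    }
}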
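
Racing against a timer only stops the caller from waiting; the underlying operation keeps running in the background. Where the operation supports cancellation, the timeout can also abort it. The following is a minimal sketch, assuming the task accepts an AbortSignal (as fetch does). The names withAbortableTimeout and task are hypothetical, not part of any library.

```javascript
// Hypothetical helper: runs task(signal), and on timeout both rejects
// the caller and aborts the task through the AbortSignal.
async function withAbortableTimeout(task, timeoutMs) {
    const controller = new AbortController();
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => {
            // Reject first so the caller sees the timeout error rather
            // than whatever error the aborted task produces.
            reject(new Error('Operation timed out'));
            controller.abort();
        }, timeoutMs);
    });
    try {
        return await Promise.race([task(controller.signal), timeout]);
    } finally {
        // Always clear the timer so it cannot fire after the task settles.
        clearTimeout(timer);
    }
}
```

With fetch, the task would pass the signal through, for example `(signal) => fetch(url, { signal })`, so the request itself is cancelled when the timer fires. A task that ignores the signal still times out for the caller, but keeps running.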

Retry Pattern

Exponential Backoff

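// Resolves after `ms` milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff(fn, options = {}) {
    // ?? keeps an explicit 0 (e.g. maxRetries: 0), which || would discard
    const maxRetries = options.maxRetries ?? 3;
    const initialDelay = options.initialDelay ?? 1000;

    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
            return await fn();
        } catch (error) {
            if (attempt === maxRetries) {
                throw error; // out of retries: surface the last error
            }

            // Delay doubles each attempt: 1s, 2s, 4s with the defaults
            const delay = initialDelay * Math.pow(2, attempt);
            await sleep(delay);
        }
    }
}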
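
Pure exponential delays have one weakness: clients that fail at the same moment also retry at the same moment. A common remedy is to cap the delay and randomize it ("full jitter"). Here is a small sketch of that calculation; backoffDelay, maxDelay, and the injectable random parameter are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: delay in ms before retrying after failed attempt
// number `attempt` (0-based). Picks a random value in
// [0, min(maxDelay, initialDelay * 2^attempt)).
function backoffDelay(attempt, { initialDelay = 1000, maxDelay = 30000, random = Math.random } = {}) {
    const ceiling = Math.min(maxDelay, initialDelay * 2 ** attempt);
    return Math.floor(random() * ceiling);
}
```

In the retry loop above, this would replace the fixed `initialDelay * Math.pow(2, attempt)` computation. Passing `random` as a parameter is only there to make the function testable with a deterministic value.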

Fallback Pattern

Implementation

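async function getUserWithFallback(userId) {
    try {
        return await userService.getUser(userId);
    } catch (error) {
        // Record the failure so the fallback does not hide an outage
        console.warn(`getUser failed for ${userId}, using fallback:`, error);

        // Fallback to cache
        let cached = null;
        try {
            cached = await cache.get(`user:${userId}`);
        } catch (cacheError) {
            // Cache is unavailable too; fall through to the default
        }
        if (cached) {
            return cached;
        }

        // Fallback to default
        return {
            id: userId,
            name: 'Guest',
            email: 'guest@example.com'
        };
    }
}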
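
The primary, cache, default chain above generalizes to any ordered list of sources. The following sketch shows one way to express that; firstAvailable and its parameters are hypothetical names, and treating null or undefined as "no value" is an assumption of this example.

```javascript
// Hypothetical helper: tries each source in order and returns the first
// value that is not null/undefined. A source that throws is skipped.
async function firstAvailable(sources, defaultValue) {
    for (const source of sources) {
        try {
            const value = await source();
            if (value !== null && value !== undefined) {
                return value;
            }
        } catch (error) {
            // Skip this source; real code would log or count the failure here.
        }
    }
    // Every source failed or returned nothing.
    return defaultValue;
}
```

Each source is a zero-argument function, so a later source is only called once the earlier ones have failed.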

Best Practices

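Set timeouts - Bound every remote call so a slow dependency cannot hang the caller
Retry wisely - Use exponential backoff, cap the attempts, and retry only transient failures of idempotent operations
Use fallbacks - Degrade gracefully with cached or default data
Monitor failures - Track timeout, retry, and fallback rates to spot patterns
Circuit breakers - Stop calling a failing dependency to prevent cascading failures
Handle errors - Use distinct error types rather than matching on message strings
Test failures - Inject faults (chaos engineering) to confirm the patterns work
Document patterns - Give clear guidelines on which pattern applies where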
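
Circuit breakers are listed above but not shown, so here is a minimal sketch of the idea: after a run of consecutive failures the breaker "opens" and rejects calls immediately, then lets a trial call through once a reset period has passed. The class, its option names, and the default thresholds are all hypothetical; production libraries offer more states and metrics.

```javascript
// Minimal circuit breaker sketch; class and option names are hypothetical.
class CircuitBreaker {
    constructor({ failureThreshold = 5, resetTimeoutMs = 30000, now = Date.now } = {}) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMs = resetTimeoutMs;
        this.now = now;        // injectable clock, useful for tests
        this.failures = 0;     // consecutive failures seen so far
        this.openedAt = null;  // null means the circuit is closed
    }

    async call(fn) {
        if (this.openedAt !== null &&
            this.now() - this.openedAt < this.resetTimeoutMs) {
            // Open: fail fast without touching the dependency.
            throw new Error('Circuit open');
        }
        // Closed, or the reset period elapsed (half-open trial call).
        // This sketch does not limit concurrent trial calls.
        try {
            const result = await fn();
            this.failures = 0;
            this.openedAt = null;
            return result;
        } catch (error) {
            this.failures += 1;
            if (this.openedAt !== null || this.failures >= this.failureThreshold) {
                // Open the circuit, or re-open it after a failed trial.
                this.openedAt = this.now();
            }
            throw error;
        }
    }
}
```

A caller would typically create one breaker per dependency and route every call to that dependency through `breaker.call(...)`, treating the 'Circuit open' error as a signal to use a fallback.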

Conclusion

Resilient systems require:

Timeout handling
Retry logic
Fallback strategies
Error monitoring

Combine the patterns for production resilience: a timeout bounds each attempt, retries absorb transient failures, and a fallback covers the case where every attempt fails.
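
As a rough illustration of how the three patterns compose, the sketch below gives each attempt its own timeout, retries failed attempts with exponential backoff, and calls a fallback only after the last attempt fails. resilientCall and its option names are hypothetical, and the helper definitions are repeated so the example runs on its own.

```javascript
// Resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Rejects if `promise` has not settled within timeoutMs.
function withTimeout(promise, timeoutMs) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error('Operation timed out')), timeoutMs);
    });
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical composition: timeout per attempt, backoff between
// attempts, fallback once all attempts are exhausted.
async function resilientCall(fn, { timeoutMs = 5000, maxRetries = 3, initialDelay = 1000, fallback }) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
            return await withTimeout(fn(), timeoutMs);
        } catch (error) {
            if (attempt < maxRetries) {
                // Wait 1x, 2x, 4x... the initial delay before the next attempt.
                await sleep(initialDelay * 2 ** attempt);
            }
        }
    }
    // Every attempt failed or timed out.
    return fallback();
}
```

The ordering matters: placing the timeout inside the retry loop means a single hung call costs at most `timeoutMs`, rather than consuming the whole retry budget.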

Building resilient systems from May 2022, covering timeout, retry, and fallback patterns.