When systems fail, users do not care which dependency is down. They care whether the product still helps them finish the task they opened it for.
That is why graceful degradation matters more than uptime as an abstract KPI. A system can be partially degraded and still preserve most of the user value if the failure modes were designed well.
What Good Degradation Looks Like
Instead of a blank page or a global error modal, a resilient product may:
- serve cached catalog data
- hide a secondary widget
- show stale but still useful content
- keep read paths available while write paths are paused
This is not hiding reality. It is deciding which experience is less harmful when part of the stack is unhealthy.
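As a minimal sketch of the first and last bullets, the read path below falls back to the last known good copy when the live fetch fails, and the write path can be paused without touching reads. The names `CatalogService`, `CatalogResponse`, `fetch_live`, and `writes_paused` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CatalogResponse:
    items: List[str]
    degraded: bool  # True when served from the last known good copy


class CatalogService:
    """Hypothetical service: reads degrade to cached data, writes can be paused."""

    def __init__(self, fetch_live: Callable[[], List[str]]):
        self._fetch_live = fetch_live
        self._last_good: Optional[List[str]] = None
        self.writes_paused = False

    def read(self) -> CatalogResponse:
        try:
            items = self._fetch_live()
        except Exception:
            if self._last_good is None:
                raise  # nothing cached yet, so there is nothing to degrade to
            return CatalogResponse(list(self._last_good), degraded=True)
        self._last_good = list(items)  # remember the last successful read
        return CatalogResponse(items, degraded=False)

    def write(self, item: str, save: Callable[[str], None]) -> str:
        if self.writes_paused:
            return "paused"  # caller can show a calm "try again later" message
        save(item)
        return "saved"
```

The `degraded` flag is what lets the UI say that the data may be out of date, rather than pretending nothing is wrong.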
Why stale-while-revalidate Helps
The stale-while-revalidate model is useful for content that changes infrequently but is read constantly.
Cache-Control: s-maxage=60, stale-while-revalidate=86400
With this header, a shared cache treats the response as fresh for 60 seconds. For up to 86400 seconds (24 hours) after that, it may keep serving the stale copy while it refreshes in the background. If a refresh fails inside that window, users may still get a useful response instead of an origin error.
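To make the behavior concrete, here is a toy in-memory cache that imitates those semantics. It is a sketch under stated assumptions, not how any particular CDN implements the directive: `SWRCache`, `get`, and `wait_for_refresh` are hypothetical names, the clock is injectable so the example is deterministic, and the background refresh is a plain thread.

```python
import threading
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class Entry:
    value: Any
    stored_at: float


class SWRCache:
    """Toy cache imitating s-maxage plus stale-while-revalidate."""

    def __init__(self, max_age: float, swr: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_age = max_age  # seconds a stored response counts as fresh
        self.swr = swr          # extra seconds a stale response may be served
        self.clock = clock
        self._entries: Dict[str, Entry] = {}
        self._refreshing: Dict[str, threading.Thread] = {}
        self._lock = threading.Lock()

    def get(self, key: str, fetch: Callable[[], Any]) -> Tuple[Any, str]:
        now = self.clock()
        with self._lock:
            entry = self._entries.get(key)
        if entry is not None:
            age = now - entry.stored_at
            if age < self.max_age:
                return entry.value, "fresh"
            if age < self.max_age + self.swr:
                self._refresh_in_background(key, fetch)
                return entry.value, "stale"  # answer now, refresh off-path
        value = fetch()  # miss or too old: this request waits on the origin
        self._store(key, value)
        return value, "miss"

    def _store(self, key: str, value: Any) -> None:
        with self._lock:
            self._entries[key] = Entry(value, self.clock())

    def _refresh_in_background(self, key: str, fetch: Callable[[], Any]) -> None:
        def work() -> None:
            try:
                self._store(key, fetch())
            except Exception:
                pass  # refresh failed: keep the stale entry

        with self._lock:
            running = self._refreshing.get(key)
            if running is not None and running.is_alive():
                return  # one refresh per key at a time
            thread = threading.Thread(target=work, daemon=True)
            self._refreshing[key] = thread
            thread.start()

    def wait_for_refresh(self, key: str) -> None:
        """Test helper: block until the pending refresh for key finishes."""
        with self._lock:
            thread = self._refreshing.get(key)
        if thread is not None:
            thread.join()
```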
Serving stale content this way makes sense for:
- product pages
- docs
- public content
It makes much less sense for:
- balances
- checkout totals
- highly sensitive real-time state
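One way to encode that split is a small policy function that picks the `Cache-Control` value per content kind. This is only a sketch: the kind labels and the function name `cache_control_for` are hypothetical, and the choice of `no-store` for sensitive data and as the default for unknown kinds is an assumption, not something the text above prescribes.

```python
# Kinds where serving a stale copy is acceptable (labels are illustrative).
STALE_TOLERANT = {"product_page", "docs", "public_content"}

# Kinds where a stale answer could mislead the user or cost money.
NEVER_STALE = {"balance", "checkout_total", "realtime_state"}


def cache_control_for(kind: str) -> str:
    """Return a Cache-Control header value for a content kind."""
    if kind in STALE_TOLERANT:
        return "s-maxage=60, stale-while-revalidate=86400"
    # Sensitive and unknown kinds both fall through to the safe choice.
    return "no-store"
```

Defaulting unknown kinds to `no-store` means a newly added endpoint has to opt in to staleness rather than inherit it by accident.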
UI Degradation Matters Too
Backend resilience is only part of the story. Frontend architecture should isolate failure so one weak dependency does not take down the whole page.
That often means:
- component-level fallbacks
- partial rendering
- clear but calm status messaging
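The same idea can be sketched independently of any frontend framework: render each section on its own, and let a failure replace only that section. The example below is in Python for brevity; `render_page` and the section names are hypothetical, and a real frontend would use its framework's own mechanism for isolating component failures.

```python
from typing import Callable, List, Tuple

# (name, render function, fallback text; empty string hides the section)
Section = Tuple[str, Callable[[], str], str]


def render_page(sections: List[Section]) -> Tuple[str, List[str]]:
    """Render every section, isolating failures to the section that raised."""
    parts: List[str] = []
    failed: List[str] = []
    for name, render, fallback in sections:
        try:
            parts.append(render())
        except Exception:
            failed.append(name)         # record for logging or metrics
            if fallback:
                parts.append(fallback)  # calm message in place of the widget
    return "\n".join(parts), failed


def broken_recommendations() -> str:
    raise TimeoutError("recommendation service timed out")


def broken_reviews() -> str:
    raise ConnectionError("review service unreachable")


page, failed = render_page([
    ("details", lambda: "Desk lamp, 40 W", "Details are unavailable."),
    ("recommendations", broken_recommendations, ""),  # secondary: just hide it
    ("reviews", broken_reviews, "Reviews are temporarily unavailable."),
])
```

Here the product details still render, the recommendations widget disappears silently, and the reviews area shows a short status line instead of an error.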