Upstash Status · History · Incident #2212

RESOLVED

Fly.io Upstash Redis Service Distruption

Critical · Started May 12, 2026 · 9:49 AM

  • Duration

    3h 59m

  • Severity

    Critical

  • Detection lead

  • User reports

Summary

Fly.io Upstash Redis Service Distruption

On May 12th and 13th at various times, a subset of Upstash Redis instances on [Fly.io](http://Fly.io) experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall — alive but not making progress — which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug \([cloud-hypervisor#7672](https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672)\) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.


  • Started

    May 12, 2026 · 9:49 AM

  • Resolved

    May 12, 2026 · 1:48 PM

  • Duration

    3h 59m

  • Severity

    Critical

Event timeline

How this incident unfolded

  • Investigating

    May 12 · 9:49 AM Upstash

    Some regions are experiencing connectivity issues due to an ongoing network problem. We are currently investigating

  • Identified

    May 12 · 10:50 AM Upstash

    The issue has been identified and the fix is being implemented.

  • Monitoring

    May 12 · 12:05 PM Upstash

    A fix has been implemented and we are monitoring the results.

  • Resolved

    May 12 · 1:48 PM Upstash

    This incident has been resolved.

  • Postmortem

    May 15 · 12:43 PM Upstash

    On May 12th and 13th at various times, a subset of Upstash Redis instances on [Fly.io](http://Fly.io) experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall — alive but not making progress — which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug \([cloud-hypervisor#7672](https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672)\) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.

Get alerted before the next Upstash outage.

Pulsetic catches degradations minutes before vendors acknowledge them.

Start monitoring free
Hey there 👋  Friends from designmodo are here to help!