Upstash Status · History · Incident #2202

RESOLVED

Fly.io Upstash Redis service disruption (FRA region)

Major · Started May 11, 2026 · 3:05 PM

  • Duration

    3h 16m

  • Severity

    Major

Summary

Fly.io Upstash Redis service disruption (FRA region)

On May 12th and 13th at various times, a subset of Upstash Redis instances on Fly.io experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall (alive but not making progress), which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug ([cloud-hypervisor#7672](https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672)) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.


  • Started

    May 11, 2026 · 3:05 PM

  • Resolved

    May 11, 2026 · 6:22 PM

Event timeline

How this incident unfolded

  • Investigating

    May 11 · 3:05 PM Upstash

    Some databases may experience increased latency or timeouts in Fly.io’s FRA region.

  • Investigating

    May 11 · 3:07 PM Upstash

    We are continuing to investigate the issue.

  • Investigating

    May 11 · 5:19 PM Upstash

    We are working with the Fly team to investigate the root cause.

  • Resolved

    May 11 · 6:22 PM Upstash

    The incident has been resolved. We are working with the Fly team on a root cause analysis (RCA).

  • Postmortem

    May 15 · 12:49 PM Upstash

    On May 12th and 13th at various times, a subset of Upstash Redis instances on Fly.io experienced intermittent hangs and elevated error rates. The Redis process would stall inside a logging syscall (alive but not making progress), which made the issue hard to spot from our usual telemetry. After investigating with Fly's team, we identified the root cause as a bad interaction between a recent guest kernel update on Fly's newer machines and an upstream Cloud Hypervisor bug ([cloud-hypervisor#7672](https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7672)) affecting log writes from inside the VM. We mitigated by disabling the affected logging paths, and Fly has since rolled out a hypervisor-side patch, fully resolving the issue. No data was lost. Sorry for the disruption.
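
    The "alive but not making progress" failure mode described above is the kind of stall that a process-exists check tends to miss. As a rough illustration only, and not a description of Upstash's or Fly's actual monitoring, the sketch below separates liveness from progress: a hypothetical `ProgressWatchdog` records when the last unit of work completed and reports "stalled" once that timestamp is older than an assumed threshold. The class name, the `stall_after_s` parameter, and the three status strings are all assumptions made for this example.

```python
import time
from typing import Callable


class ProgressWatchdog:
    """Hypothetical helper: distinguishes a live process from one making progress."""

    def __init__(
        self,
        stall_after_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # stall_after_s is an assumed threshold, not a value from the postmortem.
        self.stall_after_s = stall_after_s
        self.clock = clock
        self.last_progress = clock()

    def record_progress(self) -> None:
        # Call this after each completed unit of work (e.g. a served command),
        # not from a timer, so a process blocked in a syscall stops updating it.
        self.last_progress = self.clock()

    def classify(self, process_alive: bool) -> str:
        if not process_alive:
            return "down"
        if self.clock() - self.last_progress > self.stall_after_s:
            # The process exists but has not finished any work recently.
            return "stalled"
        return "healthy"


if __name__ == "__main__":
    fake_now = [0.0]  # injectable clock so the demo does not need to sleep
    watchdog = ProgressWatchdog(stall_after_s=5.0, clock=lambda: fake_now[0])
    print(watchdog.classify(process_alive=True))
    fake_now[0] = 10.0  # ten seconds pass with no completed work
    print(watchdog.classify(process_alive=True))
```

    The design choice here is that the heartbeat is tied to completed work rather than to the process's existence, so a hang inside a blocking call shows up as a stale timestamp even though the process is still running.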
