QStash US Region Service Disruption

Critical · Started May 8, 2026 · 9:46 AM

Duration

22m
Severity

Critical

Summary

**Root Cause Analysis**

On **April 24**, we deployed a more optimized scheduler implementation in the **US East (N. Virginia)** region. On **May 8**, a user who had active schedules deleted their account.

Under normal behavior, scheduled tasks associated with a deleted account should wake up, detect that the account no longer exists, and exit after performing cleanup. Due to a bug introduced in the new scheduler implementation, this code path did not return early as intended. Execution continued and resulted in a nil pointer dereference.

A second issue then amplified the impact. When a panic occurs in the scheduler, it is designed to be recovered, logged, and isolated so that the process remains healthy. Because of another bug in the panic recovery path, the panic was not properly caught, which caused the worker process handling the scheduled job to terminate.

After that process exited, another worker picked up responsibility for delivering the same scheduled task. Since the same faulty execution path was still present, that worker also failed. This created a cascading failure pattern across workers attempting to process the affected schedules.

**Resolution**

We deployed two fixes:

* Added the missing early return in the deleted-account cleanup path, preventing the nil pointer dereference.
* Corrected the panic recovery logic so that future panics are safely recovered, logged, and reported without causing worker processes to terminate.

With these changes in place, the affected execution path is now safe. Even if a future bug triggers a panic in this area, it will be isolated and reported rather than causing process-level failure.
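The two failure modes described above can be illustrated with a minimal sketch. It is not the actual QStash scheduler code: the postmortem does not name the implementation language, and Go is assumed here only because "nil pointer dereference" and "panic" are Go terminology. Every identifier (`Account`, `Schedule`, `accounts`, `deliver`, `deliverMissingReturn`, `safeRun`) is hypothetical. The sketch shows a deleted-account path with and without the early return, and a recovery wrapper that turns a panic into a logged error so the process keeps running.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Account and Schedule are hypothetical types used only for illustration.
type Account struct {
	ID    string
	Email string
}

type Schedule struct {
	ID        string
	AccountID string
}

// accounts stands in for an account store; a missing key models a deleted account.
var accounts = map[string]*Account{
	"acct-1": {ID: "acct-1", Email: "owner@example.com"},
}

// cleaned records which schedules were cleaned up after their account vanished.
var cleaned = map[string]bool{}

var errAccountDeleted = errors.New("account deleted; schedule cleaned up")

// deliver handles one schedule firing and returns early for a deleted account.
func deliver(s Schedule) error {
	acct := accounts[s.AccountID] // nil when the account no longer exists
	if acct == nil {
		cleaned[s.ID] = true
		return errAccountDeleted // the early return that keeps acct from being used
	}
	fmt.Printf("delivering %s for %s\n", s.ID, acct.Email)
	return nil
}

// deliverMissingReturn models the kind of bug described: cleanup runs, but
// execution continues and dereferences the nil account.
func deliverMissingReturn(s Schedule) error {
	acct := accounts[s.AccountID]
	if acct == nil {
		cleaned[s.ID] = true
		// no return here, so the next line panics
	}
	fmt.Printf("delivering %s for %s\n", s.ID, acct.Email)
	return nil
}

// safeRun isolates a panic in fn: it is recovered, logged, and reported as an
// error instead of terminating the worker process.
func safeRun(s Schedule, fn func(Schedule) error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("recovered panic in schedule %s: %v", s.ID, r)
			err = fmt.Errorf("schedule %s panicked: %v", s.ID, r)
		}
	}()
	return fn(s)
}

func main() {
	gone := Schedule{ID: "sched-2", AccountID: "acct-gone"}
	fmt.Println("fixed path:", safeRun(gone, deliver))
	fmt.Println("buggy path, recovered:", safeRun(gone, deliverMissingReturn))
}
```

In this sketch either fix alone would have contained the incident: the early return avoids the panic, and the recovery wrapper stops a panic from taking down the worker. Note that in Go, `recover` only takes effect when called directly inside a deferred function on the panicking goroutine, which is one common way a recovery path can silently fail to catch a panic; whether that was the case here is not stated in the postmortem.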

Started

May 8, 2026 · 9:46 AM
Resolved

May 8, 2026 · 10:08 AM

Event timeline

How this incident unfolded

◐

Investigating
May 8 · 9:46 AM Upstash

We are currently investigating the issue.
◐

Investigating
May 8 · 9:46 AM Upstash

We are continuing to investigate this issue.
◉

Monitoring
May 8 · 10:06 AM Upstash

A fix has been implemented and we are monitoring the results.
✓

Resolved
May 8 · 10:08 AM Upstash

This incident has been resolved. We will publish the RCA soon.
✓

Postmortem
May 12 · 8:25 AM Upstash

The root cause analysis and resolution were published; the full text appears in the Summary above.
