QStash US Region: Schedule Degradation

Minor · Started May 5, 2026 · 2:24 PM

Duration

17h 40m
Severity

Minor

Summary

# **Incident Postmortem: Scheduled Jobs Inconsistency in US Region**

On May 1, 2026, we experienced an incident affecting a subset of schedules in the US region following a recent infrastructure update. The issue has been resolved, and all affected schedules have been restored.

## **Summary**

As part of an ongoing scalability improvement, we recently updated scheduling infrastructure in the US region to a new architecture. During this transition, a legacy execution path remained in the codebase as a fallback mechanism. On May 1, a bug caused the system to revert to the legacy path. This resulted in inconsistent state between the old and new scheduling systems for some users.

## **Impact**

The incident affected a limited number of users in the US region. **Most users were not affected**, and the vast majority of schedules continued operating normally throughout the incident. Users who did not update schedules during the transition window continued operating normally throughout the incident.

**A subset of users who created, edited, paused, or deleted schedules between April 24 and May 1 may have experienced one or more of the following:**

* Schedule updates not being reflected
* Paused schedules becoming active again
* Deleted schedules reappearing
* Newly created schedules not executing as expected

Schedules created after the transition may have stopped executing briefly before recovery.

## **Root Cause**

During the transition, the new scheduling infrastructure became the source of truth for schedule state. Due to a bug, the system unexpectedly reverted traffic to the legacy scheduling path, which began accepting updates independently from the new system. This caused the two systems to diverge and resulted in inconsistent schedule state for affected users.

## **Resolution**

After identifying the issue, we:

1. Restored the new scheduling system as the active source of truth
2. Reconciled data between the legacy and new systems
3. Updated missing schedule changes back into the new infrastructure
4. Performed conflict resolution to preserve user data and schedule continuity

In some cases, schedules that had previously been paused or deleted were restored to avoid permanent data loss.

## **Preventive Measures**

We are implementing several changes to prevent similar incidents:

* Removing obsolete fallback execution paths after transitions complete
* Adding automated safeguards and alerts for unexpected system fallback behavior
* Improving consistency validation between systems
* Expanding rollback and reconciliation testing

We apologize for the disruption and appreciate everyone’s patience while we resolved the issue.
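
The postmortem does not publish the reconciliation logic, so the following is only a minimal sketch of how a consistency check and a conservative merge between two schedule stores might look. The `Schedule` fields, the `updated_at` timestamp, and the rule "keep a schedule unless both systems agree it was deleted" are all assumptions made for illustration. That rule was chosen because it would produce the behavior described above, where deleted schedules reappear rather than being lost.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Schedule:
    # Hypothetical record shape; not the vendor's actual data model.
    schedule_id: str
    cron: str
    paused: bool
    deleted: bool
    updated_at: int  # assumed: Unix seconds of the last change


def find_divergence(legacy: Dict[str, Schedule], new: Dict[str, Schedule]) -> List[str]:
    """Return the sorted ids whose state differs between the two stores."""
    ids = set(legacy) | set(new)
    return sorted(i for i in ids if legacy.get(i) != new.get(i))


def resolve(legacy_rec: Optional[Schedule], new_rec: Optional[Schedule]) -> Schedule:
    """Merge one schedule: the latest write wins, but deletion needs agreement."""
    if legacy_rec is None and new_rec is None:
        raise ValueError("at least one record is required")
    if legacy_rec is None:
        return new_rec
    if new_rec is None:
        return legacy_rec
    # On a timestamp tie, prefer the new system (the intended source of truth).
    latest = legacy_rec if legacy_rec.updated_at > new_rec.updated_at else new_rec
    # Assumed policy: only treat the schedule as deleted if both sides say so.
    return replace(latest, deleted=legacy_rec.deleted and new_rec.deleted)


def reconcile(legacy: Dict[str, Schedule], new: Dict[str, Schedule]) -> Dict[str, Schedule]:
    """Return the merged state to write back into the new system."""
    merged = dict(new)
    for sid in find_divergence(legacy, new):
        merged[sid] = resolve(legacy.get(sid), new.get(sid))
    return merged


legacy_store = {
    "s1": Schedule("s1", "*/5 * * * *", False, False, 200),
    "s2": Schedule("s2", "0 * * * *", False, False, 100),
}
new_store = {
    "s1": Schedule("s1", "*/10 * * * *", False, False, 100),
    "s2": Schedule("s2", "0 * * * *", False, True, 150),
}
merged_store = reconcile(legacy_store, new_store)
```

In this example `s1` takes the later edit made through the legacy path, and `s2` comes back as not deleted because only one side recorded the deletion. A check like `find_divergence` could also back an alert that fires whenever the two stores disagree.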

Started

May 5, 2026 · 2:24 PM
Resolved

May 6, 2026 · 8:05 AM

Event timeline

How this incident unfolded

Identified
May 5 · 2:24 PM Upstash

We are currently experiencing issues in the US region.

- Duplicate Deliveries: During this period, some scheduled jobs may be executed twice.
- Schedule Disruption: Schedules created between April 24, 2026 and May 2, 2026 are currently not running.

Our team is actively working on a fix. Once the migration is complete, affected schedules will resume normal operation. We will provide updates as progress continues.
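
Because the update above warns that some scheduled jobs may run twice, a receiving endpoint can guard itself by making its handler idempotent. The sketch below is a hypothetical in-memory deduplicator, not part of any Upstash SDK. The key format (schedule id plus intended run time) and the way those values reach the handler are assumptions. A real deployment would need a shared store instead of a process-local dictionary.

```python
import time
from typing import Callable, Dict


class DeliveryDeduplicator:
    """Remembers delivery keys for a TTL so repeated deliveries can be skipped."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen: Dict[str, float] = {}  # key -> expiry time

    def first_time(self, key: str) -> bool:
        """Return True once per key within the TTL, False for repeats."""
        now = self._clock()
        # Drop expired keys so memory use stays bounded.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if key in self._seen:
            return False
        self._seen[key] = now + self._ttl
        return True


def delivery_key(schedule_id: str, scheduled_for: int) -> str:
    # Assumed key: one logical run per schedule per intended fire time.
    return f"{schedule_id}:{scheduled_for}"


def handle(dedup: DeliveryDeduplicator, schedule_id: str, scheduled_for: int,
           job: Callable[[], None]) -> bool:
    """Run the job only for the first delivery of a given scheduled run."""
    if not dedup.first_time(delivery_key(schedule_id, scheduled_for)):
        return False  # duplicate delivery, skip the side effects
    job()
    return True
```

The TTL only needs to exceed the window in which a duplicate can plausibly arrive. The injectable `clock` parameter is there to make the expiry behavior easy to test.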

Identified
May 5 · 2:29 PM Upstash

We are continuing to work on a fix for this issue.

Monitoring
May 5 · 3:44 PM Upstash

A fix has been implemented and we are monitoring the results.

Monitoring
May 5 · 9:31 PM Upstash

Main schedule functionality is back to normal. We are currently checking if previously created schedules are delivered as expected before marking the incident as resolved.

Resolved
May 6 · 8:05 AM Upstash

This incident has been resolved.

Postmortem
May 6 · 8:55 AM Upstash

The postmortem was published; its full text appears in the Summary section above.
