Upstash Status · History · Incident #1894
RESOLVED · QStash US Region: Schedule Degradation
Minor · Started May 5, 2026 · 2:24 PM
Duration
17h 40m
Severity
Minor
Summary
# **Incident Postmortem: Scheduled Jobs Inconsistency in US Region**

On May 1, 2026, we experienced an incident affecting a subset of schedules in the US region following a recent infrastructure update. The issue has been resolved, and all affected schedules have been restored.

## **Summary**

As part of an ongoing scalability improvement, we recently updated the scheduling infrastructure in the US region to a new architecture. During this transition, a legacy execution path remained in the codebase as a fallback mechanism. On May 1, a bug caused the system to revert to the legacy path. This resulted in inconsistent state between the old and new scheduling systems for some users.

## **Impact**

The incident affected a limited number of users in the US region. **Most users were not affected**, and the vast majority of schedules continued operating normally throughout the incident. Users who did not update schedules during the transition window were not affected.

**A subset of users who created, edited, paused, or deleted schedules between April 24 and May 1 may have experienced one or more of the following:**

* Schedule updates not being reflected
* Paused schedules becoming active again
* Deleted schedules reappearing
* Newly created schedules not executing as expected

Schedules created after the transition may have stopped executing briefly before recovery.

## **Root Cause**

During the transition, the new scheduling infrastructure became the source of truth for schedule state. Due to a bug, the system unexpectedly reverted traffic to the legacy scheduling path, which began accepting updates independently of the new system. This caused the two systems to diverge and resulted in inconsistent schedule state for affected users.

## **Resolution**

After identifying the issue, we:

1. Restored the new scheduling system as the active source of truth
2. Reconciled data between the legacy and new systems
3. Wrote missing schedule changes back into the new infrastructure
4. Performed conflict resolution to preserve user data and schedule continuity

In some cases, schedules that had previously been paused or deleted were restored to avoid permanent data loss.

## **Preventive Measures**

We are implementing several changes to prevent similar incidents:

* Removing obsolete fallback execution paths after transitions complete
* Adding automated safeguards and alerts for unexpected system fallback behavior
* Improving consistency validation between systems
* Expanding rollback and reconciliation testing

We apologize for the disruption and appreciate everyone’s patience while we resolved the issue.
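The consistency validation mentioned under Preventive Measures is not described in detail in the postmortem. As a rough illustration only, a drift check between two schedule stores might look like the sketch below. The `Schedule` record, its fields, and the `find_drift` helper are all hypothetical names invented for this example; they are not Upstash's implementation or part of any QStash API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Schedule:
    """Hypothetical schedule record; field names are illustrative only."""
    schedule_id: str
    cron: str
    paused: bool


def find_drift(
    primary: Dict[str, Schedule], legacy: Dict[str, Schedule]
) -> Dict[str, List[str]]:
    """Compare two stores keyed by schedule_id and report where they diverge."""
    primary_ids = set(primary)
    legacy_ids = set(legacy)
    shared = primary_ids & legacy_ids
    return {
        # Present in the source of truth but unknown to the legacy path.
        "only_in_primary": sorted(primary_ids - legacy_ids),
        # Present only in the legacy path, e.g. a deleted schedule reappearing.
        "only_in_legacy": sorted(legacy_ids - primary_ids),
        # Same ID but different state, e.g. a paused schedule active again.
        "mismatched": sorted(i for i in shared if primary[i] != legacy[i]),
    }


# Assumed sample data: "a" was paused in the new system only,
# "b" exists only in the new system, "c" exists only in the legacy system.
new_store = {
    "a": Schedule("a", "*/5 * * * *", paused=True),
    "b": Schedule("b", "0 * * * *", paused=False),
}
legacy_store = {
    "a": Schedule("a", "*/5 * * * *", paused=False),
    "c": Schedule("c", "0 0 * * *", paused=False),
}

report = find_drift(new_store, legacy_store)
print(report)
```

A non-empty report from a check like this could feed an alert, which would surface divergence between the two systems shortly after it begins rather than after users notice it.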
Started
May 5, 2026 · 2:24 PM
Resolved
May 6, 2026 · 8:05 AM
Event timeline
Identified
May 5 · 2:24 PM · Upstash: We are currently experiencing issues in the US region.
- Duplicate Deliveries: During this period, some scheduled jobs may be executed twice.
- Schedule Disruption: Schedules created between April 24, 2026 and May 2, 2026 are currently not running.

Our team is actively working on a fix. Once the migration is complete, affected schedules will resume normal operation. We will provide updates as progress continues.
Identified
May 5 · 2:29 PM · Upstash: We are continuing to work on a fix for this issue.
Monitoring
May 5 · 3:44 PM · Upstash: A fix has been implemented and we are monitoring the results.
Monitoring
May 5 · 9:31 PM · Upstash: Main schedule functionality is back to normal. We are currently checking whether previously created schedules are being delivered as expected before marking the incident as resolved.
Resolved
May 6 · 8:05 AM · Upstash: This incident has been resolved.
Postmortem
May 6 · 8:55 AM · Upstash: Postmortem published. The full text appears under Summary above.