GitHub Status · History · Incident #270
RESOLVED · Incident with Actions
Critical · Started May 5, 2026 · 1:37 PM
Duration
3h 48m
Severity
Critical
Detection lead
—
User reports
—
Summary
On May 5, 2026, from approximately 13:22 UTC to 17:05 UTC, GitHub Actions hosted runners in the East US region were degraded. 13.5% of jobs requesting a standard runner failed, and ~16% of requests for Larger Runners with private networking pinned to East US failed or were delayed by more than 5 minutes. Copilot Code Review was also impacted: approximately 8,500 code review requests timed out during this window. Affected users saw an error comment on their pull requests and could retry by re-requesting a review. Most runner requests were picked up by other regions automatically, but a portion of requests still routing to East US were impacted.

This was triggered by a scale-up operation for hosted runner VMs in the East US region. This is a routine operation, but the burst of VM creations hit an internal rate limit on pulling VM images from storage. The existing backoff logic was not triggered because of the response code returned in this case. The rate limiting and VM creation failures were mitigated by reducing load so the system could recover and process queued work. By 15:34 UTC, queued and failed job assignments were mostly mitigated, with less than 0.5% of runner assignments impacted between 15:34 and full recovery at 17:05.

We are improving our system's throttling behavior when limits are hit, improving our controls to mitigate similar situations more quickly, and reviewing all limits end-to-end for similar operations. We have also immediately paused all scale-up and similar operations until these changes are in place and validated.
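The report does not disclose which response code defeated the backoff, but the failure mode it describes is a common one: retry logic keyed to an explicit throttle status (such as HTTP 429) treats any other code as an ordinary failure, so callers retry immediately and sustain the very load that tripped the limit. The sketch below illustrates that general anti-pattern; the endpoint, status codes, and client library are hypothetical, not GitHub's actual implementation.

```python
import random
import time

import requests  # hypothetical HTTP client; the real stack is not disclosed

THROTTLE_CODES = {429}  # backoff only fires on codes listed here

def create_vm_with_backoff(url: str, payload: dict, max_attempts: int = 5):
    """Illustrative VM-create call with backoff keyed to status codes.

    If the storage layer signals throttling with a code NOT in
    THROTTLE_CODES (e.g. 503 or a vendor-specific code), this function
    never sleeps: every retry lands immediately, giving the rate limit
    no room to recover.
    """
    for attempt in range(max_attempts):
        resp = requests.post(url, json=payload, timeout=30)
        if resp.status_code < 400:
            return resp
        if resp.status_code in THROTTLE_CODES:
            # Exponential backoff with jitter -- only reached for 429.
            time.sleep(min(2 ** attempt, 60) + random.random())
            continue
        # Any other error falls through: fail fast, the caller retries
        # right away, and load on the throttled dependency never drops.
        resp.raise_for_status()
    raise RuntimeError("VM create failed after retries")
```

Under this pattern, the remediation GitHub describes (better throttling behavior when limits occur) amounts to treating every limit-related response as a signal to shed load, e.g. widening the set of codes that take the backoff path or classifying on response headers rather than status code alone.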
Started
May 5, 2026 · 1:37 PM
Resolved
May 5, 2026 · 5:26 PM
Event timeline
Investigating
May 5 · 1:37 PM · GitHub: We are investigating reports of degraded availability for Actions.
Investigating
May 5 · 1:48 PM · GitHub: We are investigating elevated queue times on Actions jobs running on Standard Hosted Runners in East US, affecting 10% of runs.
Investigating
May 5 · 2:14 PM · GitHub: We are investigating elevated queue times and failures on Actions jobs running on Hosted Runners in East US, affecting 8% of runs. Hosted Runners with private networking can fail over to a different Azure region to mitigate the issue.
Investigating
May 5 · 3:12 PM · GitHub: We are working with our compute provider to alleviate elevated queue times and failures for Actions jobs running on Hosted Runners in the East US region, affecting 10% of runs. Hosted Runners with private networking can fail over to a different region to mitigate the issue.
Investigating
May 5 · 3:54 PM · GitHub: We've applied a mitigation for long queue times and failures on Standard Hosted Runners and are monitoring for full recovery. Hosted Runners with private networking in the East US region remain affected as we continue working with our compute provider to restore capacity.
Investigating
May 5 · 4:33 PM · GitHub: We've seen signs of recovery for Standard Hosted Runners and are continuing to monitor for full recovery. Hosted Runners with private networking in the East US region remain affected as we continue working with our compute provider to restore capacity.
Investigating
May 5 · 5:11 PM · GitHub: Standard Hosted Runners have now reached full recovery. Hosted Runners with private networking in the East US region remain degraded as we continue working with our compute provider to restore capacity. Hosted Runners with private networking can fail over to a different region to mitigate the issue.
Investigating
May 5 · 5:11 PM · GitHub: Actions is experiencing degraded performance. We are continuing to investigate.
Resolved
May 5 · 5:26 PM · GitHub: On May 5, 2026, from approximately 13:22 UTC to 17:05 UTC, GitHub Actions hosted runners in the East US region were degraded. 13.5% of jobs requesting a standard runner failed, and ~16% of requests for Larger Runners with private networking pinned to East US failed or were delayed by more than 5 minutes. Copilot Code Review was also impacted: approximately 8,500 code review requests timed out during this window. Affected users saw an error comment on their pull requests and could retry by re-requesting a review. Most runner requests were picked up by other regions automatically, but a portion of requests still routing to East US were impacted.

This was triggered by a scale-up operation for hosted runner VMs in the East US region. This is a routine operation, but the burst of VM creations hit an internal rate limit on pulling VM images from storage. The existing backoff logic was not triggered because of the response code returned in this case. The rate limiting and VM creation failures were mitigated by reducing load so the system could recover and process queued work. By 15:34 UTC, queued and failed job assignments were mostly mitigated, with less than 0.5% of runner assignments impacted between 15:34 and full recovery at 17:05.

We are improving our system's throttling behavior when limits are hit, improving our controls to mitigate similar situations more quickly, and reviewing all limits end-to-end for similar operations. We have also immediately paused all scale-up and similar operations until these changes are in place and validated.