GitHub Outage
Incident Report for CircleCI
Resolved
GitHub has resolved the issue. We are no longer seeing any abnormal load, and all queues are normal. Thank you for your patience as we worked through this set of events.
Posted Oct 22, 2018 - 23:25 UTC
Update
Per GitHub: Webhook deliveries have caught up. We will continue to monitor and maintain capacity as we work through the backlog of jobs.
Posted Oct 22, 2018 - 22:27 UTC
Update
We are continuing to process webhooks as we receive them and are meeting current demand. Some fleets will continue to see a backlog.
Posted Oct 22, 2018 - 22:03 UTC
Update
We are continuing to process webhooks as we receive them and are meeting current demand. Some fleets will continue to see a backlog.
Posted Oct 22, 2018 - 21:33 UTC
Update
We are continuing to process webhooks as we receive them and are meeting current demand. Some fleets will continue to see a backlog for a while.
Posted Oct 22, 2018 - 20:54 UTC
Update
We are continuing to process webhooks as we receive them and have scaled to meet the continued demand.
Posted Oct 22, 2018 - 20:22 UTC
Update
We are continuing to process webhooks as we receive them and have scaled to meet the continued demand.
Posted Oct 22, 2018 - 19:53 UTC
Update
We are continuing to process webhooks as we receive them and have scaled to meet the demand.
Posted Oct 22, 2018 - 19:19 UTC
Update
We are continuing to process jobs as they are pushed from GitHub. Be aware that our macOS fleet is going to be at capacity.
Posted Oct 22, 2018 - 18:36 UTC
Update
We have seen inbound hooks flowing into our system and we are monitoring to ensure that we have capacity to meet the demand.
Posted Oct 22, 2018 - 18:25 UTC
Update
Per GitHub: We have temporarily paused delivery of webhooks while we address an issue. We are working to resume delivery as soon as possible.
Posted Oct 22, 2018 - 17:40 UTC
Update
Per GitHub: We have resumed delivery of webhooks and will continue to monitor as we process a delayed backlog of events.
Posted Oct 22, 2018 - 16:46 UTC
Update
From GitHub -- We've completed validation of data consistency and have enabled some background jobs. We're continuing to monitor as the system recovers and expect to resume delivering webhooks at 16:45 UTC.
Posted Oct 22, 2018 - 16:31 UTC
Monitoring
At 22:52 UTC on 21 October (15:52 PDT), GitHub experienced a network partition and subsequent database failure. This has caused intermittent issues with webhook delivery and other events that CircleCI depends on to manage your CircleCI workflows and jobs. The downtime has also prevented us from making API calls to GitHub to check on authorization and project/organization status.

Until GitHub's outage has ended, we will not know the full extent of the changes or issues it has caused with your projects or jobs within our system. Furthermore, when GitHub does start delivering webhooks again, we will see a surge of jobs starting. We will scale immediately in response and remain overprovisioned until the surge is complete.

CircleCI Discuss: https://discuss.circleci.com/t/github-outage-on-21-october-2018/25903
Posted Oct 22, 2018 - 16:15 UTC