Resolved
Jan 7th, 2023 11:59AM PST
All services have been restored. It may take a few minutes for some users to see the websites correctly as caching services rebuild.
Our team has identified a power issue impacting physical hardware as the root cause of the outage.
Postmortem:
Incident Summary
Our team received alerts indicating that oregon.gov websites and applications were unavailable. While investigating, the team discovered that physical hardware in the primary data center had lost power and was unavailable.
The team initiated a failover of oregon.gov websites. The failover completed successfully, but the sites were not rendering completely. The team then discovered that a critical piece of software, an asset loader system, had failed to fail over, which caused the incomplete rendering of the oregon.gov sites.
Once power was restored to the affected hardware, the asset loader system became available again and all websites returned to full operation.
Root Cause
The root cause was power draw from a single rack of servers that exceeded the amperage of the circuit serving that rack.
In addition, the asset loader system did not fail over to the alternate data center as expected, resulting in some sites loading improperly or not at all.
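As an illustration of the power condition described above, the following is a minimal sketch of a rack power budget check. It is not the team's tooling. The class and function names, the per-device amperage figures, and the 80% continuous-load factor (a common planning practice) are all assumptions made for the example and are not taken from this incident.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    """A single power circuit feeding one rack (hypothetical model)."""
    breaker_amps: float
    # Assumption: plan continuous load at no more than 80% of the breaker rating.
    continuous_load_factor: float = 0.8

    @property
    def usable_amps(self) -> float:
        return self.breaker_amps * self.continuous_load_factor


def rack_headroom(circuit: Circuit, device_amps: list[float]) -> float:
    """Return remaining usable amperage; a negative value means the rack is over budget."""
    return circuit.usable_amps - sum(device_amps)


def is_overloaded(circuit: Circuit, device_amps: list[float]) -> bool:
    """True when the summed device draw exceeds the circuit's usable amperage."""
    return rack_headroom(circuit, device_amps) < 0


# Illustrative numbers only: three servers drawing 6 A each.
servers = [6.0, 6.0, 6.0]
original = Circuit(breaker_amps=20.0)
tripled = Circuit(breaker_amps=3 * original.breaker_amps)

print(is_overloaded(original, servers))
print(is_overloaded(tripled, servers))
```

Running a check like this whenever hardware is added to a rack would flag a circuit that is over budget before a breaker trips.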
Corrective Action
A new circuit providing three times the amperage was run to the affected rack to supply adequate power. Our team is also working to fix the issue that prevented the critical asset loader software from failing over to the alternate data center.
We have updated the asset loader system that is a critical piece of the oregon.gov websites to allow for a more seamless failover to the alternate data center in the event that the primary data center is unavailable.
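One general way to catch this class of problem is a readiness check that treats a failover as complete only when every required component is healthy in the alternate data center. The sketch below is hypothetical and does not describe the actual update to the asset loader system. The component names and function names are invented for illustration, and the health data is assumed to come from existing monitoring.

```python
def failover_blockers(required: list[str], healthy_in_alternate: set[str]) -> list[str]:
    """Return required components that are not healthy in the alternate data center."""
    return sorted(set(required) - healthy_in_alternate)


def failover_complete(required: list[str], healthy_in_alternate: set[str]) -> bool:
    """A failover counts as complete only when no required component is missing."""
    return not failover_blockers(required, healthy_in_alternate)


# Hypothetical component names; the real dependency list is not given in this report.
required_components = ["web-frontend", "asset-loader"]
healthy_now = {"web-frontend"}

print(failover_blockers(required_components, healthy_now))
print(failover_complete(required_components, healthy_now))
```

Listing every dependency explicitly means that a site which fails over while a supporting service does not is reported as a partial failover instead of a success.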