Cloud Functions Incident #18018

Investigating an issue with Functions

Incident began at 2018-03-09 03:30 and ended at 2018-03-09 08:39 (all times are US/Pacific).

Date Time Description
Mar 13, 2018 16:50

SUMMARY: A configuration error resulted in an outage of HTTP functions, delays in executing functions triggered by other sources, and deploy failures.

DETAILED DESCRIPTION OF IMPACT: During the outage, all requests to HTTP functions returned 500 responses. Functions triggered from other sources, such as authentication or database events, were delayed but eventually executed once functionality was fully restored. During the outage, deployments also failed.

ROOT CAUSE: We root caused this outage to a bad configuration file which was prematurely pushed to production. The problem was complicated by a bug in code intended to gracefully handle configuration errors.

REMEDIATION AND PREVENTION: To prevent this from happening again, we will implement improvements to our deploy tools and process docs, as well as performing a code review to identify and mitigate any similar risks that might exist. Thank you for your patience as we continue to improve the reliability of Cloud Functions.

Mar 09, 2018 08:40

The issue with Cloud Functions should have been resolved for all affected projects as of 08:40 US/Pacific. We will conduct an internal investigation of this issue and make appropriate improvements to our systems to prevent or minimize future recurrence.

Mar 09, 2018 07:20

Outage is still ongoing. We are seeing improvements but the problem is not fully resolved. We will keep this incident opened until we have more news.

Mar 09, 2018 04:45

We are still investigating the issue with Cloud Functions execution. We will provide another status update as soon as possible.

Mar 09, 2018 04:05

We've received a report of an issue with Google Cloud Functions as of 2018-03-09 03:30 AM US/Pacific. We will provide more information by 04:30 AM US/Pacific.

All times are US/Pacific
Send Feedback