Console Incident #16008
Issue with Firebase Console
Incident began at 2016-06-09 21:10 and ended at 2016-06-09 22:25 (all times are US/Pacific).
Jul 12, 2016 14:09
SUMMARY:
On Thursday 9 June 2016, the Firebase Console was unavailable for 93 minutes, with significant performance degradation in the preceding half hour. Although this did not affect user resources running on Firebase, we appreciate that many of our customers rely on the Firebase Console to manage those resources, and we apologize to everyone who was affected by the incident.
DETAILED DESCRIPTION OF IMPACT:
On Thursday 9 June 2016 from 20:52 to 22:25 PDT, the Firebase Console was unavailable. Users who attempted to connect to the Firebase Console observed high latency and HTTP server errors. Many users also observed increasing latency and error rates during the half hour before the incident.
Developers' applications were unaffected by the incident and continued to run normally.
ROOT CAUSE:
The Firebase Console, like the Google Cloud Console, runs on Google App Engine, where it uses internal functionality that is not used by customer applications. Google App Engine version 1.9.39 introduced a bug in one internal function. The bug affected both Firebase Console and Google Cloud Console instances but not customer-owned applications, and thus escaped detection during testing and during the initial rollout. Once enough instances of both consoles had been switched to 1.9.39, the consoles became unavailable and internal monitoring alerted the engineering team, who restored service by starting additional Console instances on 1.9.38.
During the entire incident, customer-owned applications were not affected and continued to operate normally.
REMEDIATION AND PREVENTION:
When the issue was provisionally identified as a specific interaction between Google App Engine version 1.9.39 and the Console, App Engine engineers brought up capacity running the previous App Engine version and transferred the Console to it, restoring service at 22:23 PDT.
The low-level bug that triggered the error has been identified and fixed.
To prevent a future recurrence, Google engineers are:
1) augmenting testing and rollout monitoring to cover error rates on internal-only features, complementing existing monitoring for customer applications;
2) increasing the fidelity of the rollout monitoring framework to detect high error rates in individual apps, even if increases in global App Engine error rates are too small to detect.
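To illustrate why the second item matters, the following is a minimal sketch and not Google's actual monitoring framework. The names (`AppStats`, `global_error_rate`, `apps_over_threshold`), the traffic figures, and the thresholds are all hypothetical. It shows how one low-traffic app can fail most of its requests while the fleet-wide error rate stays well below a global alerting threshold.

```python
from dataclasses import dataclass


@dataclass
class AppStats:
    """Request and error counts for one app during a rollout window (hypothetical)."""
    requests: int
    errors: int


def error_rate(stats: AppStats) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return stats.errors / stats.requests if stats.requests else 0.0


def global_error_rate(per_app: dict[str, AppStats]) -> float:
    """Error rate aggregated over every app, weighted by traffic."""
    total_requests = sum(s.requests for s in per_app.values())
    total_errors = sum(s.errors for s in per_app.values())
    return total_errors / total_requests if total_requests else 0.0


def apps_over_threshold(
    per_app: dict[str, AppStats],
    threshold: float,
    min_requests: int = 100,
) -> list[str]:
    """Names of apps whose own error rate exceeds the threshold.

    Apps with fewer than min_requests requests are skipped so that a
    handful of failures on a nearly idle app does not raise an alert.
    """
    return sorted(
        name
        for name, stats in per_app.items()
        if stats.requests >= min_requests and error_rate(stats) > threshold
    )


# Illustrative numbers only: two large healthy apps and one small failing app.
per_app = {
    "customer-app-a": AppStats(requests=500_000, errors=500),
    "customer-app-b": AppStats(requests=480_000, errors=480),
    "internal-console": AppStats(requests=2_000, errors=1_200),
}

GLOBAL_THRESHOLD = 0.01   # assumed fleet-wide alert threshold (1%)
PER_APP_THRESHOLD = 0.05  # assumed per-app alert threshold (5%)

global_alert = global_error_rate(per_app) > GLOBAL_THRESHOLD
flagged = apps_over_threshold(per_app, PER_APP_THRESHOLD)
```

With these made-up figures the aggregate error rate is roughly 0.2%, so a check on the global rate alone would stay silent, while the per-app check flags `internal-console`, which is failing 60% of its requests.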
We apologize again for the inconvenience this issue caused our customers.
Jun 09, 2016 22:39
The issue with Firebase Console should have been resolved for all users as of 22:25 US/Pacific. We will conduct an internal investigation of this issue and make appropriate improvements to our systems to prevent or minimize future recurrence. We will provide a more detailed analysis of this incident once we have completed our internal investigation.
Jun 09, 2016 21:59
We are still investigating the issue with Firebase Console. Current data indicates that all users are affected by this issue.
App serving is unaffected.
We will provide another status update by 23:00 US/Pacific with current details.
Jun 09, 2016 21:22
We are experiencing an issue with Firebase Console beginning on Thursday, 2016-06-09 at 21:10 US/Pacific. All users are affected and unable to access the console.
For everyone who is affected, we apologize for any inconvenience you may be experiencing. We will provide an update by 22:00 US/Pacific with current details.