Google Workspace Status Dashboard

This page provides status information on the services that are part of Google Workspace. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://workspace.google.com/. For incidents related to Google Analytics, visit the Google Ads Status Dashboard.

Incident affecting Google Docs

Incident began at 2024-08-12 13:20 and ended at 2024-08-12 15:32 (times are in Coordinated Universal Time (UTC)).

Date Time Description
Aug 15, 2024 9:42 PM UTC

Incident Report

Summary

On 12 August 2024 at 06:20 US/Pacific, multiple Google Cloud and Google Workspace products experienced connectivity issues in europe-west2 for a duration of 40 minutes. During this time, ingress traffic to europe-west2 and egress traffic from europe-west2 experienced elevated latencies, connection timeouts, and connection failures.
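This report gives times in both US/Pacific and UTC. As a convenience for readers correlating it with their own logs, the following minimal sketch converts a US/Pacific wall-clock time to UTC. The helper name `pacific_to_utc` is illustrative and not part of any Google tooling, and the sketch assumes the IANA zone `America/Los_Angeles` corresponds to "US/Pacific" as used in this report.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: "US/Pacific" in the report maps to this IANA zone.
PACIFIC = ZoneInfo("America/Los_Angeles")

def pacific_to_utc(date_str: str, time_str: str) -> datetime:
    """Interpret a US/Pacific wall-clock time and return it in UTC."""
    local = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M")
    return local.replace(tzinfo=PACIFIC).astimezone(timezone.utc)

start_utc = pacific_to_utc("2024-08-12", "06:20")
print(start_utc.strftime("%Y-%m-%d %H:%M UTC"))  # 2024-08-12 13:20 UTC
```

Daylight saving time is in effect on this date, so US/Pacific is seven hours behind UTC.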

Root Cause

On 12 August 2024 at 06:20 US/Pacific, both the primary and backup power feeds were lost in a Google Point of Presence (POP) due to a substation switchgear failure. The affected POP hosts about one third of the serving first-layer Google Front Ends (GFEs) located in europe-west2 and some distributed networking equipment for that region. The power loss impacted the following Google products and services that depend on GFEs in that region:

  • Google Cloud APIs, Google Workspace, and other Google services like YouTube.
  • Customer-created global external application and proxy network load balancers, including Cloud CDN.

The power loss also impacted the following Google Cloud products which depended on impacted networking equipment:

  • Customer-created regional external application, proxy network, and passthrough network load balancers in the europe-west2 region.
  • External protocol forwarding and VM external IP address connectivity for VMs in the europe-west2 region.
  • Google Cloud Interconnect connections in some LHR colocation facilities.

Impact was limited to situations where one or both of the following were true:

  • Inbound requests or connections were routed into the europe-west2 region of Google’s network, from the Internet, and those requests or connections depended on networking equipment that was offline, or unreachable pending reconvergence.
  • Outbound responses were routed to the Internet, from the europe-west2 region of Google’s network, and those responses depended on networking equipment that was without power.

The power outage caused Internet routes advertised by Google to be withdrawn in networks connected to Google’s network. The withdrawn routes were automatically replaced by other Google-advertised routes that didn’t depend on impacted networking equipment. Withdrawing and replacing routes relies on BGP and its timers, so replacement route convergence is not instantaneous. In addition, overloading of the GFEs behind the automatically selected replacement routes extended the duration of the incident.
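The withdrawal-and-replacement behavior described above can be pictured with a deliberately simplified model. This is an illustrative sketch only: it is not BGP and not Google's routing implementation. The `Route` class, the site labels `pop-a` and `pop-b`, and the single `preference` number are all hypothetical stand-ins (real best-path selection uses many attributes), and the model ignores the timers that make convergence non-instantaneous.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

@dataclass(frozen=True)
class Route:
    prefix: str      # destination prefix (a documentation range is used below)
    via: str         # hypothetical label for the site advertising the route
    preference: int  # lower value wins; stands in for real best-path rules

def best_route(routes: Iterable[Route], prefix: str, offline: Set[str]) -> Optional[Route]:
    """Pick the preferred route for a prefix, skipping routes via offline sites."""
    candidates = [r for r in routes if r.prefix == prefix and r.via not in offline]
    return min(candidates, key=lambda r: r.preference, default=None)

routes = [
    Route("203.0.113.0/24", via="pop-a", preference=10),
    Route("203.0.113.0/24", via="pop-b", preference=20),
]

before = best_route(routes, "203.0.113.0/24", offline=set())
after = best_route(routes, "203.0.113.0/24", offline={"pop-a"})
print(before.via, "->", after.via)  # pop-a -> pop-b
```

In this toy model the fallback is immediate. In practice, as the report notes, neighboring networks need time to process the withdrawal and converge on the replacement route.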

Detailed Description of Impact

  • Google Workspace: Gmail, Google Calendar, Google Chat, Google Docs, Google Drive, Google Meet and Google Tasks users connecting to Workspace services from the UK region and surrounding areas experienced connectivity issues as described in the next point.
  • GFE-based products and services: Customers on the Internet experienced a spike of broken connections followed by elevated latencies or HTTP error responses when communicating with GFE-powered Google APIs and services or customer-created global external application and proxy network load balancers. At roughly 06:23 US/Pacific, Google automatically redirected connections to the nearest possible first-layer GFEs with some latency penalty. Unfortunately, some of the nearest possible first-layer GFEs were overloaded until 06:48 US/Pacific, when Google engineers made adjustments to more efficiently distribute incoming requests among nearby first-layer GFEs. Depending on the Google API or service or the customer-created global external load balancer, elevated latencies could have persisted until about 08:30 US/Pacific. Elevated latencies also could have applied to customer-created global external load balancers that had Cloud CDN enabled.
  • Regional Google Cloud products and services: Until replacement routes were in effect, customers on the Internet experienced connection failures to the following GCP resources in the europe-west2 region:
    • Regional external application, proxy network, and passthrough network load balancers.
    • External protocol forwarding and VM external IP addresses.
  • Google Cloud Interconnect: Google Cloud Interconnect connections in some LHR colocation facilities (lhr-zone1-47, lhr-zone1-832, lhr-zone1-2262, lhr-zone1-4885, lhr-zone1-99051 and lhr-zone2-47) remained offline from 06:20 US/Pacific to at least 06:57 US/Pacific, when power was restored.
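The GFE overload described in the list above, where redirected connections landed on the nearest first-layer GFEs until requests were distributed more evenly, can be illustrated with a toy comparison. The report does not describe the actual adjustments Google engineers made, so everything here is an assumption for illustration: the site names, distances, spare-capacity figures, and both strategies (`nearest_first` and `spare_capacity_weighted`) are hypothetical.

```python
def nearest_first(load, sites):
    """Send all displaced load to the single nearest site."""
    nearest = min(sites, key=lambda s: s["distance_km"])
    return {s["name"]: (load if s is nearest else 0) for s in sites}

def spare_capacity_weighted(load, sites):
    """Split displaced load in proportion to each site's spare capacity."""
    total_spare = sum(s["spare"] for s in sites)
    return {s["name"]: load * s["spare"] / total_spare for s in sites}

# Hypothetical nearby sites; "spare" is in the same arbitrary units as load.
sites = [
    {"name": "site-a", "distance_km": 50, "spare": 100},
    {"name": "site-b", "distance_km": 200, "spare": 300},
]
displaced_load = 200

print(nearest_first(displaced_load, sites))            # site-a receives 200, above its spare capacity of 100
print(spare_capacity_weighted(displaced_load, sites))  # site-a receives 50.0, site-b receives 150.0
```

Choosing the nearest site minimizes added latency but can exceed that site's capacity. Spreading load by spare capacity trades a little latency for headroom.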

At 06:43 US/Pacific, power was restored to the impacted networking equipment. Google networking equipment was fully operational by 06:57 US/Pacific, and connectivity to GFE-based products and services, regional Google Cloud products and services, and Google Cloud Interconnect resumed shortly thereafter.
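To put the milestones quoted in this report on a single timeline, the following small sketch computes the minutes elapsed since the power loss. The event labels are paraphrased from the report, all times are US/Pacific on 12 August 2024, and the helper name `minutes_since_start` is illustrative.

```python
from datetime import datetime

# Times quoted in this report (US/Pacific, 12 August 2024).
events = {
    "power lost": "06:20",
    "connections redirected (approx.)": "06:23",
    "power restored": "06:43",
    "GFE load distribution adjusted": "06:48",
    "networking equipment fully operational": "06:57",
}

def minutes_since_start(events, start_key="power lost"):
    """Return minutes elapsed from the start event to each event."""
    fmt = "%H:%M"
    start = datetime.strptime(events[start_key], fmt)
    return {
        name: int((datetime.strptime(t, fmt) - start).total_seconds() // 60)
        for name, t in events.items()
    }

for name, minutes in minutes_since_start(events).items():
    print(f"+{minutes:>2} min  {name}")
```

On these figures, the equipment was without power for 23 minutes and was fully operational 37 minutes after the initial loss.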

Remediation and Prevention

Multiple Google engineering teams were alerted and automated recovery tooling was triggered as expected; however, manual adjustments were required to address subsequent first-layer GFE overload. Google is reviewing automation improvements in tasks that required manual intervention to reduce the duration of future power event impact. Similarly, Google is working to increase Cloud Interconnect control plane resilience and reduce mitigation time through automated reaction to isolation events.

Additionally, Google's partner who maintains the affected facility power in LHR (London) is conducting a full root cause analysis with the switchboard manufacturer and substation owner(s) involved in supplying power, including follow-up as to why stored or generated on-site emergency power did not carry loads.

Aug 12, 2024 7:33 PM UTC

A mini incident report has been posted to https://status.cloud.google.com/incidents/ETJGhvY9Xaktw7tgi8dF

Aug 12, 2024 3:39 PM UTC

The issue with Gmail, Google Calendar, Google Chat, Google Docs, Google Drive, and Google Tasks has been resolved for all affected users as of Monday, 2024-08-12 08:35 US/Pacific.

During the issue, users connecting to Workspace services from the UK region may have experienced connectivity issues.

We will publish an analysis of this incident once we have completed our internal investigation.

We thank you for your patience while we worked on resolving the issue.

Aug 12, 2024 3:16 PM UTC

SUMMARY

Multiple Workspace services experienced brief connectivity issues for users connecting from the UK region.

DESCRIPTION

We have experienced an issue with Gmail, Google Drive, Google Calendar, Google Docs, Google Chat, and Google Tasks beginning on Monday, 2024-08-12 06:28 US/Pacific. The issue is now mitigated and our engineering teams are closely monitoring for any residual impact. We will provide an update by Monday, 2024-08-12 09:15 US/Pacific with current details. We apologize to all who are affected by the disruption.

DIAGNOSIS

Multiple Workspace services experienced brief connectivity issues for users connecting from the UK region.

WORKAROUND

None at this time.