GCP AI Notebooks Vulnerability - Remediation

This is an update to my previous blog post which documented a mechanism for GCP Org Policy Bypass using custom metadata on compute instances.


The Original Issue
Google uses custom metadata on compute instances to authorize access to AI Notebooks and their web UIs.
Individuals granted access via custom metadata need not have any IAM permissions on the compute instance or on the service account running the Notebook, and they do not even need to be members of the Organization. Authorization via custom metadata bypasses a specific Organization Policy Constraint intended to restrict cross-domain resource sharing.
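
To make that concrete, the grant itself is roughly a single metadata write along these lines; the instance name, zone, and e-mail address are placeholders, and proxy-user-mail is the metadata key referenced later in this post:

    # Illustrative placeholders only: grant access to a notebook's web UI by
    # writing an e-mail address into the instance's custom metadata, with no
    # IAM bindings involved.
    gcloud compute instances add-metadata example-notebook \
      --zone=us-central1-a \
      --metadata=proxy-user-mail=someone@outside-the-org.example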

Google's Notification To Customers
On September 9th, Google sent a notification to customers informing them of a ‘Security Enhancement’ being made to the AI Notebooks API.

(Screenshots: Notification-1, Notification-2)

Highlights of the notification include:

  • AI Notebooks will require a user who is granted access to a Notebook in Single-User Mode to have the iam.serviceAccounts.actAs permission on the Service Account the Notebook Instance is running as.
  • Indicates the fix will be retroactive, meaning that if any access is currently being granted to a user without the iam.serviceAccounts.actAs permission, this change will actively break that access.
  • Provided a gcloud command for customers to search their compute instances for any that use metadata as an authorization mechanism (an approximation is sketched after this list).
  • Stated the roll-out date would be Friday, September 10th, 2021.
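
The exact command from the notification isn't reproduced here, but a rough approximation of that search, keyed on the proxy-user-mail metadata field discussed below, looks something like this:

    # Approximation only, not the exact command from Google's notification.
    # Walks every instance in the current project and flags the ones carrying
    # the proxy-user-mail metadata key used for metadata-based access.
    gcloud compute instances list --format="value(name,zone.basename())" |
    while read -r name zone; do
      if gcloud compute instances describe "$name" --zone "$zone" \
           --format="value(metadata.items)" | grep -q "proxy-user-mail"; then
        echo "Metadata-authorized access found on: $name ($zone)"
      fi
    done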

Is the Original Issue Remediated?
Finally, yes.
While the engineers at Google pushed a patch for this issue on September 10th, they did not invalidate the persistent auth cookie set by the Datalab sign-in endpoint.
With a valid ‘DATALAB_TUNNEL_TOKEN’ cookie, an end user is not redirected through the authentication flow (and through the patched piece of code); instead, they are fast-tracked to the ‘/lab?authuser=0’ endpoint, where they are authorized to access the Jupyter Notebook, again irrespective of IAM Role assignment.
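
For illustration, replaying a cached cookie looked roughly like this; the proxy hostname and token value are placeholders, while the cookie name and path are as described above:

    # A 200 response here meant the cached cookie still granted access to the
    # Jupyter UI without passing back through the patched sign-in flow.
    curl -s -o /dev/null -w "%{http_code}\n" \
      -H "Cookie: DATALAB_TUNNEL_TOKEN=<previously-issued-token>" \
      "https://<notebook-proxy-host>/lab?authuser=0"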

Is there any lingering risk?
I don’t believe so.

  1. The initial fix Google rolled out on Friday, September 10th seemed to address all active connections through the Inverting Proxy.
    A forcible re-registration process was performed that had the effect of actively terminating access to notebooks if the user in the proxy-user-mail custom metadata field DID NOT have the iam.serviceAccounts.actAs permission (a quick way to check who holds that permission is sketched after this list).
  2. The ‘DATALAB_TUNNEL_TOKEN’ cookie has a TTL of 24 hours, and the expiration appears to be enforced server-side. It has been more than 24 hours since the Datalab sign-in endpoint could have issued a ‘DATALAB_TUNNEL_TOKEN’ cookie to a user who does not have the iam.serviceAccounts.actAs permission.
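
For anyone who wants to double-check their own notebooks, one way to see who still holds a role containing iam.serviceAccounts.actAs on the service account a notebook runs as, and to grant it to a legitimate user, is sketched below; the service-account and user e-mail addresses are placeholders:

    # List who holds roles on the notebook's service account; anyone keeping
    # access must hold a role containing iam.serviceAccounts.actAs, such as
    # roles/iam.serviceAccountUser.
    gcloud iam service-accounts get-iam-policy \
      notebook-sa@example-project.iam.gserviceaccount.com \
      --flatten="bindings[].members" \
      --format="table(bindings.role, bindings.members)"

    # Grant actAs to a legitimate user who should keep their notebook access.
    gcloud iam service-accounts add-iam-policy-binding \
      notebook-sa@example-project.iam.gserviceaccount.com \
      --member="user:legitimate-user@example.com" \
      --role="roles/iam.serviceAccountUser"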

Lessons Learned

  1. It’s never good when two people perform seemingly the same test and come up with different results.
  2. Invalidate persistent auth cookies when patching auth bypass issues.
  3. If you release details of an insufficiently mitigated vulnerability in a major cloud provider on a Friday night, expect to be working the weekend.

Shout Outs
Special thanks to the Google Engineers who partnered with me over the weekend to figure out what was going on.
And to the friends who let me complain to them.