Schema and example queries for Drive inventory exports in BigQuery

Supported editions for this feature: Enterprise Standard and Enterprise Plus; Education Standard and Education Plus; Enterprise Essentials Plus; Cloud Identity Premium. Compare your edition

After you set up Drive inventory exports, you can analyze the dataset in BigQuery. This page describes the table schema and example queries for common reports you can get from BigQuery.

Note: You must set up billing for your Google Cloud project because there is a cost to export and query your Drive inventory in BigQuery. Enable billing for a project.

Drive inventory export schema

Schema and definitions

The Drive inventory export is added to the BigQuery dataset with the following schema:

Name Schema field Description
Item ID id The unique Drive item identifier of the item.
Version version The latest version number of the file. The file version is a monotonically increasing number. This reflects every change made to the file on the server, even those not visible to the user.
Owner ID owner.user.id

For items owned by a user, the user ID of the file owner.

Owner email owner.user.email For items owned by a user, the email of the file owner.
Shared drive ID owner.shared_drive.id For items in a shared drive, the shared drive ID.
Creator creator The email and user ID of the item's creator. This field isn't populated if the creator no longer has access to the item.
Snapshot timestamps snapshot_start_time_micros and snapshot_end_time_micros The timestamps of when the report generation started and ended.
Title title

The user-specified file title.

Description description A short description of the item.
Trashed Status trashed Whether or not the item was moved to the trash but not yet deleted.
Mime Type mime_type The MIME type of the item. See Google Workspace and Google Drive supported MIME types.
File Extension file_extension The final component of the full file extension, which is extracted from the name field. This is only available for items with binary content in Google Drive
Creation Time create_time_micros The timestamp of when the item was created.
Last Modified Time last_modified_time_micros The timestamp of the last time the item was modified by anyone.
File Size file_size_bytes The size in bytes of blobs and Docs Editors files. Not reported for items that have no size, like shortcuts and folders.
Consumed Quota consumed_quota_bytes The number of storage quota bytes used by the item. This includes the head revision as well as previous revisions that have been kept.
Shared Drive ID shared_drive_id For items in a shared drive, the shared drive ID.
Organizational Unit org_unit_path The organizational unit of the item's owner.
Applied Labels* applied_labels.id The ID of the label applied to the item.
applied_labels.title The resource name of the label applied to the item.
applied_labels.field_values.id

The ID of a field, unique within a label or library.

applied_labels.field_values.type

The type of field. Can be one of the following:

applied_labels.field_values.display_name

The text to show in the UI that identifies the field.
Permissions access.permissions.permission_id

The ID of this permission. This is a unique identifier for the grantee.

Note: The permission ID is not the same as the user ID.

access.permissions.name

The "pretty" name of the value of the permission. The following is a list of examples for each type of permission:

  • user - User's full name, as defined for their Google account, such as "Dana Cruz"
  • group - Name of the group, such as "Example Admins"
  • domain - String domain name, such as "example.com"
  • anyone - No name reported
access.permissions.type

The type of the permission. Valid values are:

  • user
  • group
  • domain
  • anyone
access.permissions.role

The role granted by this permission. Valid values are:

  • owner
  • organizer
  • fileOrganizer
  • writer
  • commenter
  • reader

Note: See roles and permission for details on the type.

access.permissions.view

If the permission belongs to a view, the value is published.

access.permissions.allow_file_discovery

Whether the permission allows the file to be searchable. Only reported for permissions of type domain or anyone.
access.permissions.email The email address of the user or group to which this permission refers.
access.permissions.domain The domain name of the entity this permission refers to. This is present when the permission type is user, group or domain.
access.permissions.user_id The user ID associated with email on the permissions.

* Refers to labels applied to items in Drive.

Query examples

These query examples assume legacy SQL. Replace project_name.dataset_name with your own project name and dataset name. Learn more about querying BigQuery data.

Expand all  |  Collapse all

Find items that are findable with Drive search

Replace project_name.dataset_name with your project name and dataset name.

Returns a table with File ID, title, owner email, and type.

SELECT id, title, mime_type, owner.user.email
FROM project_name.dataset_name
WHERE EXISTS (
  SELECT 1
  FROM UNNEST(access.permissions) AS permission
  WHERE permission.allow_file_discovery = true
)
Find items with a specific label

Replace project_name.dataset_name with your project name and dataset name. Replace label_id with your label ID, which you can get in the label manager.

This query returns a table with File ID, title, owner email, and type.

SELECT id, title, mime_type, owner.user.email
FROM project_name.dataset_name
WHERE EXISTS (
  SELECT 1
  FROM UNNEST(applied_labels) AS label
  WHERE label.id = label_id
)
Find users who are above a Drive storage use threshold

Replace project_name.dataset_name with your project name and dataset name. Replace threshold with your threshold in MB. For example, to find users with over 15 GB of Drive storage use, the second-to-last line is HAVING total_storage_mb > 15000.

This query returns a table user email and total Drive storage use in MB in descending order.

SELECT owner.user.email AS user_email,
  SUM(size_bytes) / (1000000) AS total_storage_mb
FROM project_name.dataset_name
GROUP BY user_email
HAVING total_storage_mb > threshold
ORDER BY total_storage_mb DESC
Find which item types use the most space

Replace project_name.dataset_name with your project name and dataset name.

This query returns a table of item types by descending size.

SELECT mime_type, SUM(consumed_quota_bytes) AS total_consumed
FROM project_name.dataset_name
GROUP BY mime_type
ORDER BY total_consumed DESC
Find items encrypted with client-side encryption (CSE)

Replace project_name.dataset_name with your project name and dataset name.

SELECT id, owner.user.email, title
FROM project_name.dataset_name
WHERE mime_type LIKE '%.encrypted'
Find items shared with a specific user

Replace project_name.dataset_name with your project name and dataset name. Replace [email protected] with your specific user.

Note: This query doesn't find sharing for the user through groups or target audience membership.

SELECT id, title, mime_type, owner.user.email as owner_email
FROM project_name.dataset_name
WHERE EXISTS (
  SELECT 1
  FROM UNNEST(access.permissions) AS permissions
  WHERE permissions.email = [email protected]
)

Related topics

Was this helpful?

How can we improve it?
Search
Clear search
Close search
Google apps
Main menu
9077178885817568510
true
Search Help Center
true
true
true
true
true
73010
false
false