Supported editions for this feature: Enterprise Standard and Enterprise Plus; Education Standard and Education Plus; Enterprise Essentials Plus; Cloud Identity Premium. Compare your edition
After you set up Drive inventory exports, you can analyze the dataset in BigQuery. This page describes the table schema and example queries for common reports you can get from BigQuery.
Note: You must set up billing for your Google Cloud project because there is a cost to export and query your Drive inventory in BigQuery. Enable billing for a project.
Drive inventory export schema
Schema and definitions
The Drive inventory export is added to the BigQuery dataset with the following schema:
Name | Schema field | Description |
---|---|---|
Item ID | id |
The unique Drive item identifier of the item. |
Version | version |
The latest version number of the file. The file version is a monotonically increasing number. This reflects every change made to the file on the server, even those not visible to the user. |
Owner ID | owner.user.id |
For items owned by a user, the user ID of the file owner. |
Owner email | owner.user.email |
For items owned by a user, the email of the file owner. |
Shared drive ID | owner.shared_drive.id |
For items in a shared drive, the shared drive ID. |
Creator | creator |
The email and user ID of the item's creator. This field isn't populated if the creator no longer has access to the item. |
Snapshot timestamps | snapshot_start_time_micros and snapshot_end_time_micros |
The timestamps of when the report generation started and ended. |
Title | title |
The user-specified file title. |
Description | description |
A short description of the item. |
Trashed Status | trashed |
Whether or not the item was moved to the trash but not yet deleted. |
Mime Type | mime_type |
The MIME type of the item. See Google Workspace and Google Drive supported MIME types. |
File Extension | file_extension |
The final component of the full file extension, which is extracted from the name field. This is only available for items with binary content in Google Drive |
Creation Time | create_time_micros |
The timestamp of when the item was created. |
Last Modified Time | last_modified_time_micros |
The timestamp of the last time the item was modified by anyone. |
File Size | file_size_bytes |
The size in bytes of blobs and Docs Editors files. Not reported for items that have no size, like shortcuts and folders. |
Consumed Quota | consumed_quota_bytes |
The number of storage quota bytes used by the item. This includes the head revision as well as previous revisions that have been kept. |
Shared Drive ID | shared_drive_id |
For items in a shared drive, the shared drive ID. |
Organizational Unit | org_unit_path |
The organizational unit of the item's owner. |
Applied Labels* | applied_labels.id |
The ID of the label applied to the item. |
applied_labels.title |
The resource name of the label applied to the item. | |
applied_labels.field_values.id |
The ID of a field, unique within a label or library. |
|
|
The type of field. Can be one of the following: |
|
|
The text to show in the UI that identifies the field. | |
Permissions | access.permissions.permission_id |
The ID of this permission. This is a unique identifier for the grantee. Note: The permission ID is not the same as the user ID. |
access.permissions.name |
The "pretty" name of the value of the permission. The following is a list of examples for each type of permission:
|
|
access.permissions.type |
The type of the permission. Valid values are:
|
|
access.permissions.role |
The role granted by this permission. Valid values are:
Note: See roles and permission for details on the type. |
|
access.permissions.view |
If the permission belongs to a view, the value is |
|
|
Whether the permission allows the file to be searchable. Only reported for permissions of type domain or anyone . |
|
access.permissions.email |
The email address of the user or group to which this permission refers. | |
access.permissions.domain |
The domain name of the entity this permission refers to. This is present when the permission type is user , group or domain . |
|
access.permissions.user_id |
The user ID associated with email on the permissions. |
* Refers to labels applied to items in Drive.
Query examples
These query examples assume legacy SQL. Replace project_name.dataset_name with your own project name and dataset name. Learn more about querying BigQuery data.
Find items that are findable with Drive searchReplace project_name.dataset_name with your project name and dataset name.
Returns a table with File ID, title, owner email, and type.
SELECT id, title, mime_type, owner.user.email
FROM project_name.dataset_name
WHERE EXISTS (
SELECT 1
FROM UNNEST(access.permissions) AS permission
WHERE permission.allow_file_discovery = true
)
Replace project_name.dataset_name with your project name and dataset name. Replace label_id with your label ID, which you can get in the label manager.
This query returns a table with File ID, title, owner email, and type.
SELECT id, title, mime_type, owner.user.email
FROM project_name.dataset_name
WHERE EXISTS (
SELECT 1
FROM UNNEST(applied_labels) AS label
WHERE label.id = label_id
)
Replace project_name.dataset_name with your project name and dataset name. Replace threshold with your threshold in MB. For example, to find users with over 15 GB of Drive storage use, the second-to-last line is HAVING total_storage_mb > 15000
.
This query returns a table user email and total Drive storage use in MB in descending order.
SELECT owner.user.email AS user_email,
SUM(size_bytes) / (1000000) AS total_storage_mb
FROM project_name.dataset_name
GROUP BY user_email
HAVING total_storage_mb > threshold
ORDER BY total_storage_mb DESC
Replace project_name.dataset_name with your project name and dataset name.
This query returns a table of item types by descending size.
SELECT mime_type, SUM(consumed_quota_bytes) AS total_consumed
FROM project_name.dataset_name
GROUP BY mime_type
ORDER BY total_consumed DESC
Replace project_name.dataset_name with your project name and dataset name.
SELECT id, owner.user.email, title
FROM project_name.dataset_name
WHERE mime_type LIKE '%.encrypted'
Replace project_name.dataset_name with your project name and dataset name. Replace [email protected] with your specific user.
Note: This query doesn't find sharing for the user through groups or target audience membership.
SELECT id, title, mime_type, owner.user.email as owner_email
FROM project_name.dataset_name
WHERE EXISTS (
SELECT 1
FROM UNNEST(access.permissions) AS permissions
WHERE permissions.email = [email protected]
)