Best practices for bulk data export queries
Always use aggregation functions
Data in the tables is not guaranteed to be consolidated by date, URL, site, or any combination of keys. Therefore you should always aggregate your fields (SUM, COUNT, and so on) to guarantee that your row values are consolidated. For example, a table might have several rows with the query "fat cats":
query date search_type impressions ... fat cats 10-10-2022 WEB 1000 fat cats 10-10-2022 WEB 200 fat cats 10-10-2022 IMAGE 5000
Searching simply for the query "fat cats", even for a specific date and search type, can result in multiple rows. So you should aggregate as follows to see the sum of your impressions:
/* BAD way to see most popular query: will get only 1 row per query */ SELECT query, impressions FROM searchconsole.searchdata_site_impression ORDER BY 2 DESC LIMIT 1; Result: fat cats 5000 /* RIGHT way to see the most popular query */ SELECT query, SUM(impressions) FROM searchconsole.searchdata_url_impression WHERE query != '' /* Filter out anonymized queries */ GROUP BY query ORDER BY 2 DESC LIMIT 1; Result: fat cats 6200
Limit queries by date to save processing costs
Remember that when you run a query on BigQuery you will be charged for it, and your tables can become quite large. The tables exported are date-partitioned, which allows for efficient limiting of the input scan to only days of interest - for large sites this can translate into real savings in query cost. A good way to minimize query costs is to use a WHERE clause to limit the date range in the date partitioned table.
Filter out null query strings
Anonymized queries often make up the single most common query for a site. An anonymized query is reported as a zero-length string in the table. If you want to see the most popular query for specific criteria, you probably want to filter out zero-length query values from your SQL query.
Sample queries
Below are some sample queries to help you get up and running with the data.
Results by site
All web stats, by day (last two weeks)
SELECT data_date AS date, sum(impressions) AS impressions, sum(clicks) as clicks, sum(clicks) / sum(impressions) AS ctr, /* Added one below, because position is zero-based */ ((sum(sum_top_position) / sum(impressions)) + 1.0) AS avg_position FROM searchconsole.searchdata_site_impression WHERE search_type = 'WEB' AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY data_date ORDER BY date desc LIMIT 1000
Top USA mobile web queries by clicks (last two weeks)
SELECT query, device, sum(impressions) AS impressions, sum(clicks) AS clicks, sum(clicks) / sum(impressions) AS ctr, ((sum(sum_top_position) / sum(impressions)) + 1.0) AS avg_position FROM searchconsole.searchdata_site_impression WHERE search_type = 'WEB' AND country = 'usa' AND device = 'MOBILE' AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1,2 ORDER BY clicks LIMIT 1000
Results by URL
Top Discover URLs by clicks (on a specific day)
SELECT url, sum(impressions), sum(clicks) FROM searchconsole.searchdata_url_impression WHERE search_type = 'DISCOVER' AND data_date = DATE(2022, 02, 01) GROUP BY url ORDER BY 3 LIMIT 1000
Total query+url stats for FAQ rich results (last two weeks)
SELECT url, query, sum(impressions) AS impressions, sum(clicks) AS clicks, sum(clicks) / sum(impressions) AS ctr, ((sum(sum_position) / sum(impressions)) + 1.0) AS avg_position FROM searchconsole.searchdata_url_impression WHERE search_type = 'WEB' AND is_tpf_faq = true AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1,2 ORDER BY clicks LIMIT 1000
All your traffic by URL (last two weeks)
SELECT url, search_type, sum(impressions) as impressions, sum(clicks) as clicks FROM searchconsole.searchdata_url_impression WHERE data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1,2 ORDER BY 3 desc LIMIT 1000
Impressions for queries containing "your_brand" (last two weeks)
SELECT regexp_contains(query, 'your_brand') AS contains_brand, sum(impressions) AS impressions, sum(clicks) AS clicks FROM searchconsole.searchdata_url_impression WHERE search_type = 'WEB' AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1 ORDER BY 2 desc LIMIT 1000