This guide will help you integrate Google BigQuery with Twing Data to optimize your data warehouse performance, reduce costs, and gain valuable insights into query patterns and resource utilization.
Twing Data analyzes your BigQuery metadata to provide actionable insights into query patterns, cost drivers, and resource utilization. The integration reads only BigQuery's INFORMATION_SCHEMA job views (such as JOBS_BY_PROJECT and JOBS_TIMELINE); it never reads the contents of your data tables.
The live integration creates a dedicated service account with minimal permissions to access only the necessary system views. This ensures secure, continuous monitoring while maintaining strict access controls.
# Create the service account and grant it the necessary permissions.
# Run these commands in Cloud Shell or with the gcloud CLI.

# Create the service account
gcloud iam service-accounts create twing-data-monitor \
  --display-name="Twing Data Monitor"

# Grant the BigQuery Job User role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:twing-data-monitor@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"

# Grant the BigQuery Resource Viewer role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:twing-data-monitor@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.resourceViewer"

# Create and download a service account key
gcloud iam service-accounts keys create twing-data-key.json \
  --iam-account=twing-data-monitor@YOUR_PROJECT_ID.iam.gserviceaccount.com
Note: The service account has read-only access to system views and job information. It cannot access your actual data tables or modify any resources.
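Before sharing the downloaded key file, it can help to sanity-check that it is a well-formed service-account key. Here is a minimal Python sketch; the required field names are the standard ones Google writes into every service-account key JSON, but the `validate_key` helper itself is illustrative, not part of Twing Data's tooling:

```python
import json

# Fields present in every Google service-account key file.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def validate_key(path):
    """Parse a key file and check that it looks like a service-account key."""
    with open(path) as f:
        key = json.load(f)
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"key file missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"expected type 'service_account', got {key['type']!r}")
    return key
```

Running `validate_key("twing-data-key.json")` against a file produced by the `gcloud iam service-accounts keys create` command above should return the parsed key without raising.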
For teams preferring a one-time analysis, this option extracts historical query data for a comprehensive performance audit. The analysis includes query patterns, resource utilization, and optimization opportunities.
-- Query job history and performance metrics for the last 30 days
SELECT
  creation_time,
  user_email,
  job_type,
  statement_type,
  total_bytes_processed,
  total_slot_ms,
  cache_hit,
  query,
  job_stages
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
  AND query NOT LIKE '%INFORMATION_SCHEMA%'
ORDER BY creation_time DESC
LIMIT 5000;

-- Get detailed timeline information
SELECT
  job_id,
  period_start,
  period_end,
  active_units,
  timeline
FROM `region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
WHERE period_start >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
ORDER BY period_start DESC;
Because the export contains only job metadata and query text, never table contents, the audit preserves data privacy while still surfacing query performance, resource utilization, and optimization opportunities. The NOT LIKE filter also excludes the metadata queries themselves from the results.
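Once the job rows are exported, even simple aggregates reveal a lot. The sketch below is illustrative, not Twing Data's actual pipeline; it assumes each row is a dict mirroring the columns selected from JOBS_BY_PROJECT above:

```python
def summarize(jobs):
    """Aggregate basic cost/performance metrics from JOBS_BY_PROJECT rows.

    Each job is a dict with total_bytes_processed, total_slot_ms, and
    cache_hit keys, matching the columns in the extraction query.
    """
    total_bytes = sum(j["total_bytes_processed"] or 0 for j in jobs)
    total_slot_ms = sum(j["total_slot_ms"] or 0 for j in jobs)
    cache_hits = sum(1 for j in jobs if j["cache_hit"])
    return {
        "job_count": len(jobs),
        "total_tib_scanned": total_bytes / 2**40,   # bytes -> TiB
        "total_slot_hours": total_slot_ms / 3_600_000,  # ms -> hours
        "cache_hit_rate": cache_hits / len(jobs) if jobs else 0.0,
    }
```

On a 30-day export, `total_slot_hours` and `cache_hit_rate` are quick first indicators of whether slot contention or under-used result caching deserves a closer look.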