How to Use SQL with pandasql and BigQuery?

Using SQL with pandasql and BigQuery


This guide shows how to run SQL queries in two powerful environments:

  • Python’s pandasql for in-memory DataFrame querying
  • Google BigQuery for cloud-scale data warehousing

1. pandasql: SQL on Pandas DataFrames

pandasql lets you write SQL queries directly on pandas DataFrames using SQLite syntax.

Setup

pip install pandasql

Example Usage

import pandas as pd
import pandasql as ps

2. BigQuery: SQL at Cloud Scale

BigQuery is a fully managed enterprise data warehouse on Google Cloud Platform (GCP). It lets you run SQL queries on massive datasets hosted in the cloud.

Setup

  1. Search for ‘BigQuery’ or follow the link.
  2. Sign in to (or create) a Google account and enable BigQuery.

    You can use BigQuery for free while in the trial period.

  3. You’re ready to go!

Example Usage

-- Replace your-project-id, dataset, and table with your own values
SELECT *
FROM `your-project-id`.dataset.table
LIMIT 10;
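
The same preview query can also be submitted from Python instead of the console. The sketch below is one possible approach, not something this post covers in detail. It assumes the `google-cloud-bigquery` client library is installed and that Application Default Credentials are configured. The helper names `preview_query` and `run_preview` are hypothetical.

```python
def preview_query(project_id: str, dataset: str, table: str, limit: int = 10) -> str:
    """Build a preview query with a backtick-quoted, fully qualified table name."""
    return f"SELECT *\nFROM `{project_id}.{dataset}.{table}`\nLIMIT {int(limit)}"


def run_preview(project_id: str, dataset: str, table: str, limit: int = 10) -> list:
    """Run the preview query and return the rows as a list of dictionaries."""
    # Imported here so preview_query works even without the client library installed
    from google.cloud import bigquery

    # Assumption: credentials are picked up from the environment
    client = bigquery.Client(project=project_id)
    rows = client.query(preview_query(project_id, dataset, table, limit)).result()
    return [dict(row.items()) for row in rows]


# Placeholder identifiers; replace them with your own before calling run_preview
print(preview_query("your-project-id", "dataset", "table"))
```

Only `preview_query` runs without network access. `run_preview` needs a real project, dataset, and table.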

3. Summary

Tool     | Use Case                               | Data Size | Environment
pandasql | Quick, local SQL queries on DataFrames | Small     | Local / In-Memory
BigQuery | Scalable analytics, public datasets    | Large     | Cloud (GCP)

Now you can use SQL with both pandas and BigQuery.

Want to learn some SQL queries? Check out this guide!

Happy querying!

This post is licensed under CC BY 4.0 by the author.