How to Use SQL with pandasql and BigQuery?
Using SQL with pandasql and BigQuery
This guide shows how to run SQL queries in two powerful environments:
- Python’s pandasql for in-memory DataFrame querying
- Google BigQuery for cloud-scale data warehousing
1. pandasql: SQL on Pandas DataFrames
pandasql lets you write SQL queries directly on pandas DataFrames using SQLite syntax.
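To make "SQL on DataFrames with SQLite syntax" concrete, here is a rough sketch of the same idea using only pandas and Python's built-in `sqlite3` module. It is not pandasql's actual implementation, and the sample data and the table name `df` are made up for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "score": [91, 78, 85]})

# Copy the DataFrame into a throwaway in-memory SQLite database.
conn = sqlite3.connect(":memory:")
df.to_sql("df", conn, index=False)

# Query it with ordinary SQLite SQL and read the result back as a DataFrame.
top = pd.read_sql_query(
    "SELECT name, score FROM df WHERE score > 80 ORDER BY score DESC", conn
)
conn.close()

print(top)
```

Because the query runs in SQLite, any function or syntax you use has to be valid SQLite SQL.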
Setup
pip install pandasql
Example Usage
import pandas as pd
import pandasql as ps
2. BigQuery: SQL at Cloud Scale
BigQuery is a fully-managed enterprise data warehouse provided by Google Cloud Platform (GCP). It allows you to run SQL queries on massive datasets hosted in the cloud.
Setup
- Search for ‘BigQuery’ or follow the link
- Sign in or create a Google account and enable BigQuery.
You can use BigQuery for free while in the trial period.
- You’re ready to go!
Example Usage
SELECT *
FROM `your-project-id`.dataset.table
-- Replace `your-project-id`, dataset, and table with your own values --
LIMIT 10;
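If you would rather run the same query from Python than from the console, a minimal sketch could look like this. It assumes the `google-cloud-bigquery` client library is installed and that your environment already has Google Cloud credentials configured, neither of which is covered in the setup above. The helper names `build_preview_query` and `run_preview` are hypothetical.

```python
def build_preview_query(project_id: str, dataset: str, table: str, limit: int = 10) -> str:
    """Return a SELECT * preview query for a fully qualified BigQuery table."""
    return f"SELECT *\nFROM `{project_id}.{dataset}.{table}`\nLIMIT {int(limit)}"


def run_preview(project_id: str, dataset: str, table: str, limit: int = 10) -> list:
    """Run the preview query and return the rows as plain dicts."""
    # Imported lazily so the query builder works without the client installed.
    # Assumption: `pip install google-cloud-bigquery` has been run.
    from google.cloud import bigquery

    # Assumption: default credentials are already configured for this project.
    client = bigquery.Client(project=project_id)
    rows = client.query(build_preview_query(project_id, dataset, table, limit)).result()
    return [dict(row.items()) for row in rows]


# Placeholder identifiers; replace them with your own before running a query.
print(build_preview_query("your-project-id", "dataset", "table"))
```

Only `run_preview` contacts BigQuery; printing the query first is a cheap way to check the table path before running anything billable.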
3. Summary
| Tool | Use Case | Data Size | Environment |
|---|---|---|---|
| pandasql | Quick, local SQL queries on DataFrames | Small | Local / In-Memory |
| BigQuery | Scalable analytics, public datasets | Large | Cloud (GCP) |
Now you can use SQL with both pandas DataFrames and BigQuery.
Want to learn some SQL queries? Check out this guide!
Happy querying!
This post is licensed under CC BY 4.0 by the author.