Smartsheet to BigQuery: A Practical Integration Guide
How to get Smartsheet data into BigQuery for real analysis — covering the API, ETL options, schema design, incremental loading, and how to deliver charts downstream to Slack automatically.
Smartsheet is where a lot of operational work actually lives. Project plans, resource allocation grids, budget trackers, intake forms, programme portfolios. Teams love it because it looks like a spreadsheet but behaves more like a database — row hierarchy, conditional logic, cross-sheet formulas, automated workflows.
The problem is that Smartsheet's analytics are limited to what Smartsheet can see. If you want to know how your project delivery rate correlates with your sales pipeline, or how resource utilisation has trended over the last two years, or how your intake queue backlog compares across quarters — you need a warehouse.
This guide covers the full journey: getting Smartsheet data into BigQuery correctly, handling the data model quirks that trip most people up, and then using that data to power automated chart delivery to Slack.
Why Smartsheet Data Belongs in BigQuery
Before getting into the mechanics, it's worth being precise about which Smartsheet data is valuable in a warehouse — because not all of it is.
High-value data to bring into BigQuery:
- Project portfolio status — current phase, RAG status, owner, target dates across all projects. In Smartsheet this is usually a multi-level hierarchy sheet. In BQ you can flatten it, aggregate by department or portfolio, and join against delivery outcomes.
- Resource allocation / capacity — who is assigned to what, at what percentage, across what dates. Smartsheet is commonly used as a capacity planner. In BQ you can calculate utilisation rates, spot overallocation, and trend it over time.
- Budget vs actuals — finance teams often maintain spend tracking in Smartsheet sheets with monthly actuals filled in. In BQ this can be joined against GL data or ARR data for a complete picture.
- Intake / request queues — volume, source, category, resolution time, assignee. Smartsheet forms feed into sheets; that operational data is rich for SLA analysis.
- Cross-sheet summaries — Smartsheet reports can pull rows from multiple sheets, but they can't do proper SQL aggregations. A BQ view can.
Lower-value data to think carefully about:
- Attachment metadata and discussion threads (useful for audit trails, but rarely for analytics)
- Auto-number columns and Smartsheet system fields (mostly internal plumbing)
- Sheets that are really just working documents rather than structured data records
Option 1: Managed ETL (Fastest Path)
If you want Smartsheet in BigQuery with minimal engineering investment, managed connectors are the right call:
Fivetran
Fivetran has a Smartsheet connector that runs incremental syncs on a schedule. It normalises sheets into BQ tables, handles pagination and API rate limits, and alerts on schema drift. The catch is cost — Fivetran's consumption-based pricing (monthly active rows) adds up fast for large orgs with many frequently updated sheets.
Best for: Enterprises where engineering time is more expensive than tooling cost.
Airbyte
Airbyte is the open-source alternative. The Smartsheet source connector is community-maintained. You can self-host on a VM or Cloud Run, or use Airbyte Cloud. Less polish than Fivetran, but free to run and highly configurable.
Best for: Engineering teams comfortable with self-hosted infrastructure who want to avoid per-connector costs.
Stitch (by Talend)
Stitch sits between Fivetran and Airbyte in cost and complexity. It has a Smartsheet integration and destinations including BigQuery. Simpler setup than Airbyte, cheaper than Fivetran for smaller volumes.
Best for: Smaller data teams who want a managed service without enterprise pricing.
Option 2: Custom Python Pipeline (Most Control)
If your Smartsheet schema is complex, if you need to transform data as it lands, or if you want to run this inside your existing GCP infrastructure, a custom pipeline gives you full control.
The Smartsheet Python SDK
Smartsheet publishes an official Python SDK. Install it alongside the BigQuery client:
pip install smartsheet-python-sdk google-cloud-bigquery pandas pyarrow
Pulling a sheet
The core unit in Smartsheet is a Sheet. Every sheet has:
- A columns list — each column has an id, title, and type
- A rows list — each row has cells, with each cell referencing a column_id
import smartsheet
ss = smartsheet.Smartsheet(access_token='your_token')
sheet = ss.Sheets.get_sheet(
sheet_id=1234567890,
include=['objectValue'], # Needed for CONTACT_LIST and multi-value types
)
# Build a column ID → title map
columns = {col.id: col.title for col in sheet.columns}
col_types = {col.id: col.type for col in sheet.columns}
Handling Smartsheet column types
This is where most DIY pipelines go wrong. Smartsheet's column type system is richer than it first appears:
| Smartsheet type | What it contains | How to extract |
|---|---|---|
| TEXT_NUMBER | String or number | cell.value |
| DATE | ISO date string | cell.value |
| DATETIME | ISO datetime | cell.value |
| CONTACT_LIST | Object with name, email | cell.object_value.email |
| MULTI_CONTACT_LIST | List of contact objects | Join emails with comma |
| CHECKBOX | Boolean | cell.value (True/False) |
| PICKLIST | String | cell.value |
| MULTI_PICKLIST | List of strings | cell.object_value.values |
| DURATION | String like "4W 2D" | cell.value (parse separately) |
| PREDECESSOR | Dependency string | cell.value |
| AUTO_NUMBER | Auto-incremented string | cell.value |
def extract_cell_value(cell, col_type):
if cell.value is None:
return None
if col_type == 'CONTACT_LIST':
obj = cell.object_value
if obj and hasattr(obj, 'email'):
return obj.email
return cell.display_value
elif col_type == 'MULTI_CONTACT_LIST':
obj = cell.object_value
if obj and hasattr(obj, 'values'):
return ','.join(
v.email for v in obj.values if hasattr(v, 'email')
)
return cell.display_value
elif col_type == 'MULTI_PICKLIST':
obj = cell.object_value
if obj and hasattr(obj, 'values'):
return ','.join(obj.values)
return cell.display_value
    else:
        # TEXT_NUMBER, DATE, DATETIME, CHECKBOX, PICKLIST: return the raw value,
        # which matches the extraction column in the table above
        return cell.value
Handling row hierarchy
Smartsheet rows can be nested — a project has sub-tasks, sub-tasks have checklist items. Each row carries a parent_id (None for top-level rows); the API doesn't return an indent level directly, but you can derive the depth by walking the parent chain. Whether you need to preserve this depends on your use case:
def parse_rows(sheet, columns, col_types):
    # Map row id → parent id so each row's depth can be derived from the parent chain
    # (the API returns parent_id but not an indent level)
    parent_ids = {row.id: row.parent_id for row in sheet.rows}

    def depth(row_id, level=1):
        parent = parent_ids.get(row_id)
        return level if parent is None else depth(parent, level + 1)

    rows = []
    for row in sheet.rows:
        record = {
            '_row_id': row.id,
            '_row_number': row.row_number,
            '_parent_id': row.parent_id,  # None if top-level
            '_indent': depth(row.id),  # 1 = top-level, 2 = child, etc.
            '_created_at': row.created_at.isoformat() if row.created_at else None,
            '_modified_at': row.modified_at.isoformat() if row.modified_at else None,
        }
for cell in row.cells:
col_title = columns.get(cell.column_id)
col_type = col_types.get(cell.column_id)
if col_title:
# Sanitise column name for BQ
bq_col = col_title.lower().replace(' ', '_').replace('-', '_')
record[bq_col] = extract_cell_value(cell, col_type)
rows.append(record)
return rows
If you're building project hierarchy analysis, keep _parent_id and _indent in your BQ table and use recursive CTEs or ARRAY_AGG to reconstruct the tree.
Writing to BigQuery
import pandas as pd
from google.cloud import bigquery
def load_to_bigquery(rows, project, dataset, table, write_mode='WRITE_TRUNCATE'):
df = pd.DataFrame(rows)
bq = bigquery.Client(project=project)
table_ref = f'{project}.{dataset}.{table}'
job_config = bigquery.LoadJobConfig(
write_disposition=write_mode, # WRITE_TRUNCATE for full refresh
autodetect=True,
source_format=bigquery.SourceFormat.PARQUET,
)
job = bq.load_table_from_dataframe(df, table_ref, job_config=job_config)
job.result()
print(f'Loaded {len(df)} rows into {table_ref}')
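Chaining the pieces gives a full-refresh sync for a single sheet. The sheet ID, project, and dataset below are placeholders:
# Full refresh: pull the sheet, flatten the rows, load into BQ
ss = smartsheet.Smartsheet(access_token='your_token')
sheet = ss.Sheets.get_sheet(sheet_id=1234567890, include=['objectValue'])

columns = {col.id: col.title for col in sheet.columns}
col_types = {col.id: col.type for col in sheet.columns}

rows = parse_rows(sheet, columns, col_types)
load_to_bigquery(rows, project='your-project', dataset='smartsheet', table='projects')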
Option 3: CSV Export → GCS → BigQuery (Simple but Manual)
For teams that don't need real-time freshness and prefer simplicity:
- In Smartsheet, export a sheet as CSV (File → Export → Export Sheet to CSV)
- Upload to a GCS bucket (gsutil cp sheet.csv gs://your-bucket/smartsheet/)
- Load into BigQuery via bq load or the console
This is fine for monthly snapshots or ad hoc analysis, but breaks down for anything you want to automate or keep fresh. The CSV export also loses some metadata (row IDs, hierarchy) that the API preserves.
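If you want to script the last step, a minimal load from GCS with the BigQuery Python client might look like this — the bucket path and destination table are placeholders:
from google.cloud import bigquery

bq = bigquery.Client(project='your-project')
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # header row from the Smartsheet export
    autodetect=True,
    write_disposition='WRITE_TRUNCATE',  # full snapshot each time
)
job = bq.load_table_from_uri(
    'gs://your-bucket/smartsheet/sheet.csv',
    'your-project.smartsheet.projects_snapshot',
    job_config=job_config,
)
job.result()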
Incremental Loading: Only Sync What Changed
Full sheet refreshes are expensive at scale and burn through your Smartsheet API quota (300 requests/minute on most plans). Once your initial load is in place, switch to incremental loading.
The Smartsheet API supports a rowsModifiedSince filter:
from datetime import datetime, timezone
def get_modified_rows(ss, sheet_id, since: datetime):
"""Pull only rows modified after `since`."""
sheet = ss.Sheets.get_sheet(
sheet_id=sheet_id,
rows_modified_since=since.strftime('%Y-%m-%dT%H:%M:%SZ'),
include=['objectValue'],
)
return sheet
# Load last sync time from a metadata table in BQ
def get_last_sync_time(bq, project, dataset, sheet_id) -> datetime:
    query = f"""
        SELECT MAX(sync_at) AS last_sync
        FROM `{project}.{dataset}._sync_metadata`
        WHERE sheet_id = @sheet_id
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter('sheet_id', 'STRING', str(sheet_id)),
        ]
    )
    row = list(bq.query(query, job_config=job_config).result())[0]
    # Fall back to the epoch if this sheet has never been synced
    # (assumes sync_at is a TIMESTAMP column)
    return row.last_sync or datetime(1970, 1, 1, tzinfo=timezone.utc)
# After a successful load, record the sync time
def record_sync(bq, project, dataset, sheet_id):
now = datetime.now(timezone.utc)
rows = [{'sheet_id': str(sheet_id), 'sync_at': now.isoformat()}]
bq.insert_rows_json(f'{project}.{dataset}._sync_metadata', rows)
For the BQ write, don't WRITE_TRUNCATE the main table. Load the changed rows into a staging table, then run a MERGE to upsert them into the existing table:
-- Merge incremental rows into the main table
MERGE `your-project.smartsheet.projects` AS target
USING `your-project.smartsheet.projects_staging` AS source
ON target._row_id = source._row_id
WHEN MATCHED THEN
UPDATE SET
status = source.status,
owner = source.owner,
target_date = source.target_date,
_modified_at = source._modified_at
WHEN NOT MATCHED THEN
INSERT ROW
This is the right pattern for any Smartsheet sheet that updates frequently — project status grids, resource plans, budget trackers.
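Wired together, one incremental run is a short orchestration function. This is a sketch: the staging and main table names match the MERGE above, and merge_sql stands in for that statement as a string:
def incremental_sync(ss, bq, project, dataset, sheet_id, merge_sql):
    # 1. Find the last successful sync for this sheet
    since = get_last_sync_time(bq, project, dataset, sheet_id)

    # 2. Pull only the rows modified since then and flatten them
    sheet = get_modified_rows(ss, sheet_id, since)
    columns = {col.id: col.title for col in sheet.columns}
    col_types = {col.id: col.type for col in sheet.columns}
    rows = parse_rows(sheet, columns, col_types)

    if rows:
        # 3. Land the changed rows in the staging table, then MERGE into the main table
        load_to_bigquery(rows, project, dataset, 'projects_staging',
                         write_mode='WRITE_TRUNCATE')
        bq.query(merge_sql).result()

    # 4. Record the sync time for the next run
    record_sync(bq, project, dataset, sheet_id)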
Schema Design Recommendations
A few patterns that work well for Smartsheet data in BigQuery:
One table per sheet (simple, recommended to start)
Each Smartsheet sheet becomes a BQ table. Column names are sanitised sheet column titles. This mirrors the source structure and is easy to reason about.
smartsheet.projects -- Main project portfolio sheet
smartsheet.resources -- Resource allocation sheet
smartsheet.budget_tracker -- Budget vs actuals sheet
smartsheet.intake_queue -- Incoming request form responses
Add a _sheet_id column for multi-sheet merges
If you're ingesting multiple sheets with the same structure (e.g., one sheet per region or one sheet per team), add _sheet_id and _sheet_name columns so you can UNION ALL them cleanly:
SELECT *, '1234567890' AS _sheet_id, 'EMEA Projects' AS _sheet_name
FROM `smartsheet.emea_projects`
UNION ALL
SELECT *, '9876543210' AS _sheet_id, 'APAC Projects' AS _sheet_name
FROM `smartsheet.apac_projects`
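You can also stamp these columns at ingest time instead of in SQL. A sketch, assuming the regional sheets share the same structure (the sheet IDs are the same placeholders as above):
# One table per region, each row tagged with its source sheet
REGION_SHEETS = {
    1234567890: ('EMEA Projects', 'emea_projects'),
    9876543210: ('APAC Projects', 'apac_projects'),
}

for sheet_id, (sheet_name, table) in REGION_SHEETS.items():
    sheet = ss.Sheets.get_sheet(sheet_id=sheet_id, include=['objectValue'])
    columns = {col.id: col.title for col in sheet.columns}
    col_types = {col.id: col.type for col in sheet.columns}
    rows = parse_rows(sheet, columns, col_types)
    for row in rows:
        row['_sheet_id'] = str(sheet_id)
        row['_sheet_name'] = sheet_name
    load_to_bigquery(rows, project='your-project', dataset='smartsheet', table=table)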
Flatten hierarchy into a view
Keep the raw _parent_id and _indent in your base table, then expose a flattened view for analysts:
-- View: top-level projects with summary child counts
CREATE OR REPLACE VIEW `smartsheet.vw_projects_summary` AS
SELECT
p.project_name,
p.status,
p.owner,
p.target_date,
COUNT(c._row_id) AS task_count,
COUNTIF(c.status = 'Complete') AS tasks_complete,
SAFE_DIVIDE(COUNTIF(c.status = 'Complete'), COUNT(c._row_id)) AS completion_rate
FROM `smartsheet.projects` p
LEFT JOIN `smartsheet.projects` c ON c._parent_id = p._row_id
WHERE p._parent_id IS NULL -- Top-level rows only
GROUP BY 1, 2, 3, 4
Querying Smartsheet Data in BigQuery
Once your pipeline is running, BQ unlocks analysis that Smartsheet simply cannot do:
-- Portfolio health: how many projects are at risk this quarter?
SELECT
owner,
status,
COUNT(*) AS project_count,
COUNTIF(target_date < CURRENT_DATE() AND status != 'Complete') AS overdue
FROM `smartsheet.vw_projects_summary`
WHERE DATE_TRUNC(target_date, QUARTER) = DATE_TRUNC(CURRENT_DATE(), QUARTER)
GROUP BY 1, 2
ORDER BY overdue DESC;
-- Resource utilisation: who is over 100% allocated next month?
SELECT
resource_name,
SUM(allocation_pct) AS total_allocation,
COUNT(DISTINCT project_name) AS project_count
FROM `smartsheet.resources`
WHERE start_date <= DATE_ADD(CURRENT_DATE(), INTERVAL 1 MONTH)
AND end_date >= CURRENT_DATE()
GROUP BY 1
HAVING total_allocation > 100
ORDER BY total_allocation DESC;
-- Intake queue SLA: average resolution time by category over last 90 days
SELECT
category,
COUNT(*) AS total_requests,
COUNTIF(resolution_date IS NOT NULL) AS resolved,
AVG(DATE_DIFF(resolution_date, submitted_date, DAY)) AS avg_days_to_resolve
FROM `smartsheet.intake_queue`
WHERE submitted_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
GROUP BY 1
ORDER BY avg_days_to_resolve DESC;
This kind of analysis is what makes the pipeline worth building. Smartsheet shows you a snapshot; BigQuery lets you ask questions across time.
Downstream: Delivering Charts to Slack with Chartcastr
Getting data into BigQuery is the hard part. The last mile — making sure the people who need to act on that data actually see it — is where most teams drop the ball.
Stakeholders don't query BigQuery. They're in Slack. If your Smartsheet project health data lives in a BQ table with nobody looking at it, you've moved the "data sitting unread" problem from Smartsheet to a fancier location.
The pattern that works
Step 1: Surface BQ results in Google Sheets (Connected Sheets)
Use Connected Sheets to pull your BQ query results into a Google Sheet on a refresh schedule:
- Open a Google Sheet → Data → Data connectors → Connect to BigQuery
- Connect to your project and run your query (e.g. the portfolio health query above)
- Set Connected Sheets to refresh daily
The Sheet now stays current with your BQ data automatically.
Step 2: Build your chart in Sheets
Build a chart on the Connected Sheets range — bar chart of projects by status, line chart of intake queue volume over time, whatever is most useful for your audience.
Step 3: Connect to Chartcastr → deliver to Slack
Connect the Google Sheet to Chartcastr, pick the chart or data range, select a Slack channel, and set a schedule.
Every Monday morning, Chartcastr:
- Reads the latest data from your Connected Sheet (which has pulled from BQ)
- Renders a high-resolution chart
- Posts it to your Slack channel with an AI-written summary

The AI summary explains what changed week-over-week — not just "here is the chart" but "project completion rate dropped 8 points; 3 projects moved from On Track to At Risk, all owned by the same team."
What to send, and to whom
| BQ query | Slack channel | Cadence |
|---|---|---|
| Portfolio health by owner | #project-management | Weekly Monday |
| Overdue projects | #leadership | Weekly Monday |
| Resource overallocation warnings | #people-ops | Weekly Friday |
| Intake queue volume + SLA | #ops-team | Daily |
| Budget burn rate | #finance | Monthly first Monday |
The pattern is: one BQ query → one Connected Sheet chart → one Chartcastr connection → one Slack channel on a schedule. You can have as many of these as you need.
The alternative: Chartcastr direct BigQuery connection
If you'd rather skip the Connected Sheets step, Chartcastr supports BigQuery as a direct source. Connect your GCP project, write the SQL, and Chartcastr handles the chart rendering and Slack delivery end-to-end — no intermediate Sheet required.
Putting It Together
The full Smartsheet → BigQuery → Slack pipeline looks like this:
Smartsheet (operational data)
↓ [Python pipeline / Fivetran / Airbyte]
BigQuery (analytical layer)
↓ [Connected Sheets or Chartcastr direct]
Google Sheets (optional rendering layer)
↓ [Chartcastr]
Slack (where the team actually sees it)
Each layer has a clear job:
- Smartsheet — where the work happens and data is entered
- BigQuery — where the data is analysed, joined, and historised
- Connected Sheets / Chartcastr — where the data becomes a chart
- Slack — where the insight reaches the team without them having to go look for it
The pipeline from Smartsheet to Slack isn't trivial to set up — but once it's running, the reporting that used to require someone manually pulling numbers and pasting screenshots into channels happens automatically every week.
Set up your first BigQuery → Slack chart delivery with Chartcastr →