A semantic layer for the 90% who do not have a data warehouse

7 min read · By Michael Carter

The semantic layer conversation has been captured by warehouse-adjacent tools. Most business teams will never build a warehouse. They still need their AI to understand what the numbers mean.

TL;DR

The semantic layer conversation has been captured by warehouse-adjacent tools — dbt, Cube.js, AtScale, LookML. All of them assume a data warehouse at the other end of the pipe. But most business teams will never build a warehouse. Their data lives in Shopify, Google Sheets, HubSpot, and Xero. An AI-facing semantic layer works at the interpretation level, not the query level. It encodes what metrics mean and how to analyze them, connecting directly to SaaS tools. No SQL. No ETL. No warehouse.

If you follow the data engineering conversation, the semantic layer is having a moment. dbt launched theirs. Cube.js has been building one for years. AtScale sells one to enterprises. Every data conference in the last two years has had at least one talk on why you need a semantic layer.

They are all talking about the same thing: an abstraction that sits between your data warehouse and your BI tool, ensuring that "revenue" compiles to the same SQL no matter who writes the query or which dashboard it appears on.
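To make "compiles to the same SQL" concrete, here is a minimal Python sketch of the idea. It is not the API of dbt, Cube.js, AtScale, or LookML. The `Metric` class, the `compile_metric` function, and the `orders` table with its `amount`, `discount`, and `status` columns are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Metric:
    """One shared definition of a metric (hypothetical structure)."""
    name: str
    expression: str               # SQL aggregate expression
    table: str                    # source table in the warehouse
    filters: Tuple[str, ...] = () # conditions every query must apply


# Assumed definition of "revenue"; the table and columns are made up.
REVENUE = Metric(
    name="revenue",
    expression="SUM(amount - discount)",
    table="orders",
    filters=("status = 'paid'",),
)


def compile_metric(metric: Metric, group_by: Optional[str] = None) -> str:
    """Build SQL from the shared definition, so every caller gets the same logic."""
    select = [f"{metric.expression} AS {metric.name}"]
    if group_by:
        select.insert(0, group_by)
    sql = f"SELECT {', '.join(select)} FROM {metric.table}"
    if metric.filters:
        sql += " WHERE " + " AND ".join(metric.filters)
    if group_by:
        sql += f" GROUP BY {group_by}"
    return sql


print(compile_metric(REVENUE))
print(compile_metric(REVENUE, group_by="order_month"))
```

Whoever asks for revenue, by month or in total, gets the same expression and the same filter, because neither is written by hand in the query. Note that every piece of this sketch presumes a SQL-queryable table on the other end.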

This is useful work. It solves a real problem. And it is completely irrelevant to most businesses.

The warehouse assumption

Every major semantic layer tool shares a prerequisite: you need a data warehouse. dbt's semantic layer generates SQL against Snowflake, BigQuery, Postgres, or Databricks. Cube.js sits on top of a warehouse and exposes a consistent API. LookML is Looker's modeling language for warehouse data.

If you don't have a warehouse, these tools don't apply. Not "they're hard to use" — they literally don't connect.

And most businesses don't have a warehouse. Not because they're unsophisticated. Because they don't need one yet. Their revenue is in Shopify. Their pipeline is in HubSpot. Their marketing spend is in Google Ads and Meta Ads. Their tracking spreadsheet is in Google Sheets. Their bookkeeping is in Xero or QuickBooks. The data is already structured, already queryable through each tool's own API, and not large enough to justify a centralization project.

Telling these teams they need a semantic layer that requires a warehouse is like telling someone who needs a bicycle that they need a parking garage first.

Two different problems

The warehouse-adjacent semantic layer solves query normalization. It makes sure that every dashboard, every report, and every ad-hoc query computes "revenue" the same way. This matters when you have 50 analysts writing SQL against the same warehouse and you need consistency.

The AI-facing semantic layer solves interpretation. It makes sure that when an AI model analyzes your Shopify revenue chart, it knows the difference between Total Sales and Net Sales, understands that a flat conversion rate during a traffic spike is actually good news, and recognizes that AOV increases can mask order-count declines.
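The AOV point can be checked with simple arithmetic. Since revenue equals orders times AOV, a revenue change can be split into an order-volume effect and an AOV effect. The sketch below uses midpoint weights so the two effects sum exactly to the total change. The function name and the sample figures are illustrative assumptions, not real store data or Chartcastr's method.

```python
def decompose_revenue_change(orders_prev, aov_prev, orders_curr, aov_curr):
    """Split a revenue change into volume and AOV effects.

    Relies on revenue = orders * AOV. Each effect is weighted by the
    midpoint of the other factor, so the two effects add up to the
    total change with no leftover interaction term.
    """
    avg_aov = (aov_prev + aov_curr) / 2
    avg_orders = (orders_prev + orders_curr) / 2
    volume_effect = (orders_curr - orders_prev) * avg_aov
    aov_effect = (aov_curr - aov_prev) * avg_orders
    total_change = orders_curr * aov_curr - orders_prev * aov_prev
    return {
        "total_change": total_change,
        "volume_effect": volume_effect,
        "aov_effect": aov_effect,
        # Revenue is up, but only because AOV rose while orders fell.
        "aov_masks_order_decline": total_change > 0 and volume_effect < 0,
    }


# Made-up figures: orders fall from 1000 to 900 while AOV rises from 50 to 60.
result = decompose_revenue_change(1000, 50.0, 900, 60.0)
print(result)
# total_change is 4000.0: volume_effect -5500.0 plus aov_effect 9500.0
```

A headline of "revenue up 8%" hides a 10% drop in orders here. An interpretation layer that runs this kind of check can report the growth and the order decline together.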

These are related but distinct problems. One is about consistent computation. The other is about correct understanding.

| Dimension | Warehouse semantic layer | AI-facing semantic layer |
| --- | --- | --- |
| Prerequisite | Data warehouse | Data source (any SaaS tool) |
| What it abstracts | SQL queries | Metric interpretation |
| Who uses it | Analysts, BI tools | AI models during analysis |
| Core problem | "Does revenue mean the same thing everywhere?" | "Does the AI understand what this number means?" |
| Example tools | dbt, Cube.js, AtScale, LookML | Chartcastr's five-layer model |
| Requires ETL | Yes | No |

Most teams need the right column, not the left one.

What the AI-facing semantic layer actually does

In the five-layer model we use at Chartcastr, the AI semantic layer includes:

  1. Metric definitions — what each number means for each provider. Shopify's "Total Sales" includes tax; "Net Sales" does not. This matters for every downstream interpretation.
  2. Domain expertise — how to analyze data from each source. HubSpot pipeline analysis requires knowing stage-conversion benchmarks. Google Ads CPC analysis requires knowing that holiday spikes are normal.
  3. Business context — your targets, plans, and decisions, linked from Google Docs, Notion, or planning spreadsheets.
  4. Institutional memory — past conversations about the data. When the team already explained a dip, the AI doesn't re-flag it.
  5. Cross-source intelligence — the AI queries other connected systems during analysis. When Shopify revenue dips, it checks whether ad spend changed in the same window.

None of these layers require a warehouse. They require connections to the tools you already use and a knowledge system that encodes what matters about each one.
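One way to picture the five layers is as a plain data structure that gets rendered into text and handed to the model alongside the chart data. The sketch below is an illustration under that assumption, not Chartcastr's actual schema. `SemanticContext`, its field names, and `to_prompt` are hypothetical, and the sample entries paraphrase examples from this post.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SemanticContext:
    """Hypothetical container for the five layers (not a real product schema)."""
    metric_definitions: Dict[str, str] = field(default_factory=dict)
    domain_expertise: List[str] = field(default_factory=list)
    business_context: List[str] = field(default_factory=list)
    institutional_memory: List[str] = field(default_factory=list)
    cross_source_findings: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the non-empty layers as plain text to place ahead of the data."""
        layers = [
            ("Metric definitions",
             [f"{name}: {meaning}"
              for name, meaning in self.metric_definitions.items()]),
            ("Domain expertise", self.domain_expertise),
            ("Business context", self.business_context),
            ("Institutional memory", self.institutional_memory),
            ("Cross-source findings", self.cross_source_findings),
        ]
        lines: List[str] = []
        for title, items in layers:
            if items:  # skip layers with nothing to say
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


# Sample entries are illustrative only.
context = SemanticContext(
    metric_definitions={
        "Total Sales": "includes tax",
        "Net Sales": "excludes tax",
    },
    domain_expertise=["Holiday CPC spikes in Google Ads are normal."],
    institutional_memory=["A dip the team already explained should not be re-flagged."],
)
print(context.to_prompt())
```

Nothing in this structure is a query. It is text about what the numbers mean, which is why it can sit on top of a Shopify or HubSpot connection as easily as on top of a warehouse.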

Who this is for

The teams being left out of the semantic layer conversation are exactly the teams that need it most:

Agencies managing client data across Google Ads, Meta Ads, Shopify, and Google Sheets. They need the AI to understand that a CPC spike in a client's Google Ads account during Black Friday is expected, not alarming.

Small e-commerce brands running Shopify with Klaviyo and a spreadsheet for inventory. They need the AI to know that a revenue increase driven entirely by AOV (with flat or declining orders) is a different story than one driven by order volume.

Non-technical ops teams tracking KPIs in Google Sheets and HubSpot. They need the AI to interpret a pipeline report against the targets the team set in a Google Doc last quarter.

Startups pre-Series B with data in five SaaS tools and zero data infrastructure. They need their Monday morning Slack report to say something useful, not just recite numbers.

None of these teams are going to build a warehouse, install dbt, and configure a semantic layer. But every one of them needs the AI that reads their data to understand what the numbers mean.

The convergence

The data engineering world and the AI analytics world are converging on the same realization: models need context to be useful. The warehouse-adjacent tools are adding AI features. The AI-native tools are encoding domain knowledge. Both roads lead to the same destination — a knowledge system that makes analysis accurate and meaningful.

The difference is where you start. If you have a warehouse and a data team, the traditional semantic layer is the right foundation. If your data lives in SaaS tools and you don't have a warehouse, the AI-facing semantic layer is what makes AI analytics useful instead of generic.

For the full five-layer breakdown and how each layer changes the AI's output, see "What is a semantic layer, and why does your AI analytics need one?"
