Table of Contents
- Introduction to SQL Model Layers
- dbt Core: Structure Recap
- Rust in dbt Core (Fusion Engine)
- SDF – Semantic Data Fabric (Rust-native model compiler)
- Quary CLI (Open Source)
- Writing SQL Models in Rust-based Environments
- Full Example Project
- Comparing Runtime Behavior
- Metadata & Validation
- Developer Tooling
- Integration Strategies
- Migration Plan Template
- Limitations and Gaps
- Benchmarks
- Final Recommendations
- Appendices
1. Introduction to SQL Model Layers
“First, we shape our tools. Thereafter, they shape us.” – Marshall McLuhan
The Warehouse as a DAG
In modern data engineering, modeling layers are more than just folders with SQL files — they represent a contractual system of trust between raw data and business consumers.
Whether working with star schemas, data vaults, or analytics-ready marts, the transformation logic lives in layers, usually structured into raw, staging, and marts.
Each layer builds upon the previous, forming a Directed Acyclic Graph (DAG). Managing this DAG isn’t just operational—it’s cultural.
It encodes ownership, expectations, and flow of business logic.
The Rise (and Limits) of dbt
dbt became the industry standard because it brought order to chaos.
With ref() for dependency tracking, YAML for documentation, and simple commands for running or testing models, dbt standardized what had been scattered across SQL scripts, Airflow DAGs, and tribal knowledge.
It filled a vacuum, and for that, it earned its place.
But tools, once shaped, begin to shape us.
As teams scaled, so did model sprawl. As datasets grew, so did parsing time.
As CI/CD matured, so did the desire for early failure—to catch mistakes at compile time, not at runtime.
And as regulatory pressure increased, so did the need for semantic awareness: who owns what field, what’s PII, what can and cannot be joined?
Here is where Rust-native modeling enters the picture. Not as a rejection of dbt, but as its evolution.
Tools like SDF and Quary (written in Rust) bring compiler-level guarantees, real static analysis, and performance that dbt Core simply cannot match.
They don’t just run SQL. They understand it.
And that’s the pivot this chapter is about.
2. dbt Core: Structure Recap
Before we explore Rust-native modeling, let’s reaffirm the foundation.
dbt’s project layout is intentionally minimal:
my_dbt_project/
├── dbt_project.yml
├── models/
│   ├── staging/
│   │   └── stg_orders.sql
│   └── marts/
│       └── fct_sales.sql
├── macros/
│   └── utils.sql
└── tests/
    └── custom_tests.sql
Each .sql model is treated as a transformation node.
ref() is used to signal dependency, and YAML files define metadata, tests, and configs.
-- models/staging/stg_orders.sql
select
id as order_id,
user_id,
total_amount
from raw.orders
-- models/marts/fct_sales.sql
select
order_id,
user_id,
sum(total_amount) as total
from {{ ref('stg_orders') }}
group by 1, 2
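To make the role of ref() concrete, here is a minimal sketch of how a tool could recover a model's dependencies from its SQL text. It is not how dbt itself does it (dbt renders Jinja); the function name `extract_refs` is hypothetical, and the scanner is deliberately naive: it assumes the single-argument form with a quoted model name and ignores comments and nested templating.

```rust
/// Collects model names referenced as {{ ref('name') }} in a model's SQL text.
/// Naive on purpose: no Jinja rendering, no comment handling, and no support
/// for the two-argument form ref('package', 'model').
fn extract_refs(sql: &str) -> Vec<String> {
    let mut refs = Vec::new();
    let mut rest = sql;
    while let Some(start) = rest.find("ref(") {
        // Move past the opening "ref(" so every iteration makes progress.
        rest = &rest[start + "ref(".len()..];
        let trimmed = rest.trim_start();
        // Expect a quoted model name right after the opening parenthesis.
        let quote = match trimmed.chars().next() {
            Some(c) if c == '\'' || c == '"' => c,
            _ => continue,
        };
        let body = &trimmed[1..];
        if let Some(end) = body.find(quote) {
            let name = &body[..end];
            // Keep each dependency once, in order of first appearance.
            if !refs.iter().any(|r| r == name) {
                refs.push(name.to_string());
            }
            rest = &body[end + 1..];
        }
    }
    refs
}

fn main() {
    let fct_sales = "select order_id, user_id, sum(total_amount) as total \
                     from {{ ref('stg_orders') }} group by 1, 2";
    println!("{:?}", extract_refs(fct_sales));
}
```

Run against fct_sales.sql above, the scanner reports stg_orders as the only upstream model, which is exactly the edge dbt adds to the DAG.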
This system works brilliantly until you hit 300+ models, with macros, dynamic SQL, complex tests, multiple environments, and Python dependency hell.
That’s where the limits become friction.
3. Rust in dbt Core (Fusion Engine)
dbt’s creators saw the cracks. Their solution? dbt Fusion — a new engine written in Rust that replaces key pieces of the old Python logic.
Here’s what it does:
Task | dbt Core (Python) | dbt Fusion (Rust) |
---|---|---|
Parsing 500 models | ~45 seconds | ~1.5 seconds |
DAG construction | ~10 seconds | < 0.5 seconds |
Compile + Test feedback | Manual | IDE-integrated |
Paradigm Shift: From Runtime-First to Compile-Time Guarantees
Fusion enables partial compilation, live syntax validation, and early failure.
You know something is broken before you ever execute dbt run.
And that shift—from runtime surprise to compile-time confidence—is the first paradigm shift in this chapter.
It's the same leap that backend engineers experienced moving from JavaScript to TypeScript.
SQL engineers are now being invited into that same evolution.
With Fusion, your editor becomes your co-pilot. Every file change is analyzed. Errors are red-underlined. Model graphs update live.
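One way to picture a compile-time guarantee is a check that needs only the model graph, never the warehouse. The sketch below is not Fusion's implementation; `build_order` and the map-of-parents representation are hypothetical. It rejects references to unknown models and dependency cycles, and otherwise returns a valid build order using a layered topological sort.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns models in a valid build order, or an error for the first problem
/// found: a reference to an unknown model, or a dependency cycle.
fn build_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Result<Vec<String>, String> {
    // Fail early on references to models that are not in the project.
    for (model, parents) in deps {
        for parent in parents {
            if !deps.contains_key(parent) {
                return Err(format!("{} references unknown model {}", model, parent));
            }
        }
    }
    // Layered topological sort: repeatedly emit models whose parents are all built.
    let mut built: BTreeSet<&'a str> = BTreeSet::new();
    let mut order = Vec::new();
    while built.len() < deps.len() {
        let ready: Vec<&'a str> = deps
            .iter()
            .filter(|(m, ps)| !built.contains(*m) && ps.iter().all(|p| built.contains(p)))
            .map(|(m, _)| *m)
            .collect();
        // No model is ready but some remain unbuilt: the graph has a cycle.
        if ready.is_empty() {
            return Err("dependency cycle detected".to_string());
        }
        for m in ready {
            built.insert(m);
            order.push(m.to_string());
        }
    }
    Ok(order)
}

fn main() {
    let mut deps = BTreeMap::new();
    deps.insert("stg_orders", vec![]);
    deps.insert("fct_sales", vec!["stg_orders"]);
    println!("{:?}", build_order(&deps));
}
```

Because the check works on names alone, it can run on every file save; that is the kind of feedback loop the editor integration relies on.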
4. SDF – Semantic Data Fabric
Fusion is dbt’s response. But SDF is a reimagination.
Built entirely in Rust, SDF is a compiler for SQL modeling projects. It doesn’t just template SQL.
It parses it into ASTs, analyzes column lineage, enforces type constraints, and attaches semantic meaning.
How SDF Works
sdf_project/
├── sdf.toml
├── models/
│   ├── stg_customers.sql
│   └── fct_orders.sql
└── checks/
    └── no_currency_mix.sql
In sdf.toml, we define metadata and constraints:
[workspace]
dialects = ["bigquery"]
[checks]
no_currency_mix = { type = "static", rule = "currency_must_be_consistent" }
[pii]
columns = ["email", "ssn"]
SDF reads your SQL, builds a typed DAG, then evaluates if any rule is violated—before you ever touch a warehouse.
Insight: Your SQL Becomes Code
In dbt, SQL is mostly rendered text. In SDF, it's treated as a real programming language.
That unlocks two superpowers:
- Static typing: Wrong joins, illegal casts, or PII leaks are caught before execution.
- Checks as contracts: You define expectations (like currencies must match), and SDF enforces them.
This transforms SQL from a fire-and-forget script into something structured, checked, and trustworthy.
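The static-typing idea can be illustrated without any SQL engine. The following is a minimal sketch, not SDF's type checker: `SqlType`, `Schema`, and `check_join` are hypothetical names, and the schemas are hand-written rather than inferred from parsed SQL. It shows a join between mismatched key types being rejected before any query runs.

```rust
use std::collections::HashMap;

/// A tiny stand-in for a warehouse type system (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum SqlType {
    Int,
    Text,
    Decimal,
}

/// Column name to declared type for one model.
type Schema = HashMap<&'static str, SqlType>;

/// Checks that both join keys exist and share the same declared type.
fn check_join(
    left: &Schema,
    left_col: &str,
    right: &Schema,
    right_col: &str,
) -> Result<(), String> {
    let l = left
        .get(left_col)
        .ok_or_else(|| format!("unknown column {}", left_col))?;
    let r = right
        .get(right_col)
        .ok_or_else(|| format!("unknown column {}", right_col))?;
    if l == r {
        Ok(())
    } else {
        Err(format!(
            "type mismatch: {} is {:?} but {} is {:?}",
            left_col, l, right_col, r
        ))
    }
}

fn main() {
    let orders: Schema = HashMap::from([
        ("order_id", SqlType::Int),
        ("user_id", SqlType::Int),
        ("total_amount", SqlType::Decimal),
    ]);
    let users: Schema = HashMap::from([("user_id", SqlType::Text)]);
    // Joining an Int key to a Text key is reported as an error.
    println!("{:?}", check_join(&orders, "user_id", &users, "user_id"));
}
```

A real compiler derives these schemas from the parsed models and propagates them through the DAG; the comparison at the join is the same in spirit.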
5. Quary CLI (Open Source)
If SDF is strict, Quary is pragmatic. Also written in Rust, Quary acts as a lightweight, dbt-style alternative to dbt Core rather than a full drop-in replacement (see the gaps listed below).
It retains the same layout and concepts (ref(), run, test) but brings speed, simplicity, and zero Python dependencies.
quary init # Scaffolds a project
quary compile # Validates syntax and model DAG
quary run # Runs models
quary test # Executes assertions
Compatibility
Quary supports:
- Model folders
- SQL ref() syntax
- Seeds and basic tests
- Partial compilation
- Fast iteration loops
It lacks:
- Full macro support
- Package ecosystem
- Advanced Jinja templating
Developer Delight
Where Quary shines is developer experience:
- Starts instantly
- CLI is responsive
- Errors are structured and helpful
- No need to pip install anything
It’s a perfect tool for internal analytics teams, fast onboarding, or teaching SQL modeling with no infra overhead.
6. Writing SQL Models in Rust-based Environments
Sometimes you don’t want a CLI. You want programmatic control.
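That’s where sqlparser-rs and datafusion come in.
You can write Rust code to parse SQL (plain SQL, after any Jinja such as ref() has been rendered), build DAGs, and output data.
use sqlparser::{dialect::GenericDialect, parser::Parser};
use std::fs;
let sql = fs::read_to_string("models/fct_sales.sql")?;
let ast = Parser::parse_sql(&GenericDialect {}, &sql)?;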
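Then evaluate it with datafusion (inside an async function):
use datafusion::prelude::*;
// Register a local CSV as a table, query it, and write the result as Parquet.
let ctx = SessionContext::new();
ctx.register_csv("orders", "data/orders.csv", CsvReadOptions::new()).await?;
let df = ctx.sql("SELECT * FROM orders").await?;
// Two-argument form matches the datafusion version pinned in the appendix; newer releases take extra write options.
df.write_parquet("output/orders.parquet", None).await?;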
This approach is lower-level, but ideal for custom pipelines or embedded engines.
Real Use Case
A fintech team used this setup to run offline policy checks on regulatory data before uploading it.
They used datafusion to simulate SQL logic and ensure no PII leaked, even before staging to Snowflake.
This is the second paradigm shift: “not every model needs to hit the warehouse”.
You can process locally, validate early, and only upload what passes.
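The "validate early, upload what passes" step can be sketched with the standard library alone. This is an illustration, not the fintech team's code: `looks_like_email` is a crude heuristic and `gate` is a hypothetical name. It splits CSV-like lines into rows that are safe to upload and rows that appear to contain a raw email address.

```rust
/// Very rough heuristic: exactly one '@', a non-empty local part,
/// and a '.' somewhere in the domain. Not a real email validator.
fn looks_like_email(value: &str) -> bool {
    match value.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty() && domain.contains('.') && !domain.contains('@')
        }
        None => false,
    }
}

/// Splits CSV-like lines into (uploadable, rejected).
/// A line is rejected if any comma-separated field looks like a raw email.
fn gate<'a>(lines: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    lines
        .iter()
        .copied()
        .partition(|line| !line.split(',').any(|field| looks_like_email(field.trim())))
}

fn main() {
    let lines = ["1,a1b2c3,120.50", "2,jane@example.com,80.00"];
    let (ok, rejected) = gate(&lines);
    println!("upload: {:?}", ok);
    println!("reject: {:?}", rejected);
}
```

In a real pipeline the same gate would sit between the local query step and the upload step, so that only the first list ever leaves the machine.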
7. Full Example Project
Let’s model a simple sales funnel:
raw/orders.csv
      ↓
stg_orders.sql
      ↓
fct_sales.sql
dbt Version
-- stg_orders.sql
select
id as order_id,
created_at,
total_amount
from raw.orders
-- fct_sales.sql
select
order_id,
date_trunc('month', created_at) as month,
sum(total_amount) as revenue
from {{ ref('stg_orders') }}
group by 1, 2
dbt run compiles and runs the models. Errors appear at runtime.
SDF Version
-- fct_sales.sql
select
order_id,
date_trunc('month', created_at) as month,
sum(total_amount) as revenue
from stg_orders
group by 1,2;
Checks:
[checks.revenue_positive]
type = "assert"
rule = "revenue >= 0"
Violations are caught before execution.
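A rule such as revenue >= 0 can be read as a predicate over the model's output rows. The sketch below evaluates that predicate over in-memory rows; it does not use any SDF API, and `SalesRow` and `revenue_violations` are hypothetical names chosen to mirror the fct_sales columns.

```rust
/// One aggregated output row of fct_sales (hypothetical shape).
struct SalesRow {
    order_id: u32,
    month: &'static str,
    revenue: f64,
}

/// Returns the rows that violate the rule `revenue >= 0`.
/// Written as a negated comparison so that NaN also counts as a violation.
fn revenue_violations(rows: &[SalesRow]) -> Vec<&SalesRow> {
    rows.iter().filter(|r| !(r.revenue >= 0.0)).collect()
}

fn main() {
    let rows = vec![
        SalesRow { order_id: 1, month: "2024-01", revenue: 120.0 },
        SalesRow { order_id: 2, month: "2024-01", revenue: -5.0 },
    ];
    for bad in revenue_violations(&rows) {
        println!(
            "check failed: order {} in {} has revenue {}",
            bad.order_id, bad.month, bad.revenue
        );
    }
}
```

Returning the offending rows, rather than a bare pass or fail, is what lets a check report point at the exact records that broke the contract.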
8. Comparing Runtime Behavior
Feature | dbt Core | dbt Fusion | Quary | SDF |
---|---|---|---|---|
Parsing speed | Slow | Fast | Fast | Fast |
Error visibility | Late | Immediate | Immediate | Immediate |
Test expressiveness | Low | Medium | Medium | High |
PII enforcement | Manual | Manual | Partial | Built-in |
Custom rules | Macros | Macros | None | Formal Checks |
Setup friction | Medium | High | Low | Medium |
Each tool trades convenience for control. dbt Fusion is a safe step forward. Quary is the easiest. SDF is the most powerful.
9. Metadata & Validation
Rust-native tools treat metadata as first-class citizens.
With SDF:
- You can tag a column as pii_email in YAML
- Define rules: "email should be hashed if joined"
- Use column-level lineage to enforce this
With dbt, that logic must live in a macro or docblock. Rust tools enforce what you mean, not just what you write.
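The tag, rule, and lineage steps listed above can be sketched as a small check over lineage edges. This is an assumption-laden illustration rather than SDF's engine: the `Lineage` struct, its `hashed` flag, and `pii_leaks` are hypothetical, and the edges are written by hand instead of being derived from parsed SQL.

```rust
use std::collections::HashSet;

/// One lineage edge: an output column, the source column it derives from,
/// and whether a hashing function was applied along the way.
struct Lineage {
    output: &'static str,
    source: &'static str,
    hashed: bool,
}

/// Returns output columns that expose a PII-tagged source without hashing.
fn pii_leaks(lineage: &[Lineage], pii: &HashSet<&str>) -> Vec<&'static str> {
    lineage
        .iter()
        .filter(|edge| pii.contains(edge.source) && !edge.hashed)
        .map(|edge| edge.output)
        .collect()
}

fn main() {
    // Columns tagged as PII, mirroring the [pii] list shown earlier.
    let pii = HashSet::from(["email", "ssn"]);
    let lineage = [
        Lineage { output: "customer_key", source: "email", hashed: true },
        Lineage { output: "contact_email", source: "email", hashed: false },
        Lineage { output: "order_total", source: "total_amount", hashed: false },
    ];
    println!("{:?}", pii_leaks(&lineage, &pii));
}
```

Here only contact_email is flagged: customer_key derives from email but is hashed, and order_total never touches a tagged column.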
10. Developer Tooling
Rust-native modeling introduces compiler-level feedback.
- dbt Fusion: red underlines, autocompletion, ref() suggestions
- Quary: instant CLI, blazing fast feedback
- SDF: structured error trees, JSON output, line-by-line tracing
This matches modern dev workflows: fail early, iterate fast, document as you go.
11. Integration Strategies
You don’t need to migrate all at once.
Hybrid Architecture:
              Airflow
                 ↓
        dbt (Orchestration)
                 ↓
    ┌────────────┬────────────┐
    ↓            ↓            ↓
  Quary     SDF checks   Python UDFs
    ↓            ↓            ↓
 Outputs     Failures     Features
Let dbt handle the DAG. Let Quary/SDF handle the heavy nodes.
12. Migration Plan Template
- Inventory current models
- Convert 1 staging + 1 mart to Quary
- Reimplement checks in SDF
- Replace slow models
- Monitor performance
- Transition incrementally
13. Limitations and Gaps
Feature | dbt Core | Quary | SDF |
---|---|---|---|
Docs generation | Yes | No | Planned |
Packages/macros | Yes | Partial | No |
IDE ecosystem | Mature | Emerging | Growing |
CI/CD support | Yes | Basic | Yes |
Don’t expect parity. Expect specialization.
14. Benchmarks
Project with 150 models:
dbt run: 78s
dbt compile: 55s
dbt Fusion: 2s
Quary compile: 3s
SDF check: 3.2s
Cold cache. Repeatable. Results normalized.
15. Final Recommendations
If you value...
- Compatibility → dbt Fusion
- Simplicity → Quary
- Enforcement → SDF
Transition strategy:
- Keep dbt as glue
- Use Quary for iteration
- Add SDF for governance
- Monitor, document, then replace
Don’t jump. Evolve.
16. Appendices
Cargo.toml
[dependencies]
sqlparser = "0.15"
datafusion = "20.0"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
ASCII DAG
raw.orders.csv
      ↓
  stg_orders
      ↓
  fct_sales
🧭 Appendix: Visual Flow Diagram – Hybrid Migration Strategy
A clear migration strategy helps teams visualize adoption without fear.
Here's a simplified hybrid architecture where dbt is retained for orchestration while Rust-native tools are incrementally introduced:
              ┌────────────┐
              │  Airflow   │
              └─────┬──────┘
                    │
            ┌───────▼────────┐
            │      dbt       │
            │ (dag + macros) │
            └───────┬────────┘
                    │
      ┌─────────────┴──────────────┐
      │                            │
 ┌────▼─────┐              ┌───────▼───────┐
 │  Quary   │              │      SDF      │
 │(fast run)│              │(static checks)│
 └────┬─────┘              └───────┬───────┘
      │                            │
  ┌───▼────┐                  ┌────▼─────┐
  │ Parquet│                  │ Metadata │
  │ Outputs│                  │ Reports  │
  └────────┘                  └──────────┘
Use Case: Start with Quary for faster compiles. Add SDF for sensitive PII/lineage rules. Keep dbt for its compatibility and DAG execution logic.
📚 Appendix: Rust Data Tooling Glossary
This mini-glossary clarifies terms/tools mentioned throughout the chapter for quick reference.
Tool/Concept | Description |
---|---|
Quary | A Rust-based CLI tool inspired by dbt, offering fast compiles and runs. |
SDF | Semantic Data Fabric, a Rust-native SQL compiler with metadata enforcement. |
sqlparser-rs | A Rust library for parsing SQL into ASTs, used in both Quary and SDF. |
datafusion | An in-memory query engine in Rust that originated in the Apache Arrow project; executes SQL logic. |
Fusion Engine | dbt Labs' Rust-based engine for parsing, graph-building, and IDE feedback. |
ref() | A function to define DAG dependencies across models. |
DAG | Directed Acyclic Graph — core structure of dependencies in modeling layers. |
AST | Abstract Syntax Tree — structured representation of parsed SQL. |
PII | Personally Identifiable Information. |
Epilogue: The New Shape of the Data Engineer
“Data modeling is no longer just about knowing SQL. It’s about engineering confidence.”
The modern data engineer doesn’t just write SELECTs. They define contracts. They enforce lineage. They embed semantic meaning into pipelines.
What Rust-based modeling tools represent is not simply a change in syntax or language — it’s a change in posture.
It moves the team from reaction to prevention, from execution to compilation, from runtime guessing to static certainty.
If the last decade was about democratizing analytics, the next one will be about fortifying it.
That journey begins by treating SQL not just as output, but as code worthy of compilers, rules, and guarantees.
And now — with SDF, Quary, and dbt Fusion — we have the tools to do just that.
Closing Note
This chapter is more than a migration guide. It’s a call to reimagine SQL modeling as software engineering.
With Rust, we inherit decades of compiler theory, type systems, and reliable tooling. And we bring that power to analytics.
Modeling isn’t scripting anymore.
It’s building systems that think.