~15 min read

Direct Lake

Understanding Connection Modes in Microsoft Fabric

From data chaos to unified analytics -- how OneLake and Direct Lake change the game

Compare Import, DirectQuery & Direct Lake

The Data Landscape -- Before OneLake

Data scattered across disconnected silos

The Challenge

Organizations often have data scattered across multiple disconnected systems -- each with its own format, security model, refresh schedule, and cost structure.

Azure Synapse
Analytics warehouse
Power BI
Business intelligence
ADLS Gen2
Data lake storage
Data Factory
ETL / orchestration
Databricks
Spark compute
Amazon S3
External cloud
Azure ML
Machine learning
Data Explorer
Log & time-series
SQL Server
On-prem database
Excel
Spreadsheets
SharePoint
Document storage
On-Prem DBs
Legacy systems
The Problem
Multiple disconnected platforms
Duplicated storage across systems
Separate security models per tool
Independent refresh schedules
Siloed access and governance
This is the problem OneLake solves. Instead of managing dozens of separate systems, OneLake unifies everything under a single storage layer with one security model, one copy of the data, and one governance framework.

OneLake -- The Unified Platform

One lake, multiple engines, all data

Foundation: OneLake
Storage
OneLake
Delta Lake Format · One Copy of Data · Shortcuts to External · ADLS Gen2 Compatible
OneLake is like OneDrive for data -- one copy of the data, accessible by many tools. All Fabric workloads read from and write to OneLake in open Delta/Parquet format.
Compute Engines
Compute Engines
Spark
T-SQL
KQL
Analysis Services
Storage
OneLake

Different compute engines can all query the same data in OneLake -- no copying needed.
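As a minimal sketch of the "one copy, many engines" idea, the PySpark snippet below writes a Delta table from a Fabric notebook and reads it back with Spark SQL. The table name and sample rows are illustrative assumptions, not part of the deck.

```python
# Minimal PySpark sketch (Fabric notebook). Table name and sample data are
# illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # pre-created as `spark` in Fabric notebooks

sales = spark.createDataFrame(
    [(1, "EMEA", "2024-01-05", 1200.0), (2, "AMER", "2024-01-06", 950.0)],
    ["OrderID", "Region", "Date", "Amount"],
)

# Saving to the lakehouse Tables area writes Delta/Parquet files into OneLake.
sales.write.format("delta").mode("overwrite").saveAsTable("sales_orders")

# The same files can now be read by Spark SQL here, by the T-SQL endpoint,
# or by a Direct Lake semantic model -- no additional copies are made.
spark.sql(
    "SELECT Region, SUM(Amount) AS Total FROM sales_orders GROUP BY Region"
).show()
```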

Fabric Items
Items
Lakehouse
Warehouse
Notebooks
Pipelines
Reports
Real-Time Analytics
Compute Engines
Spark
T-SQL
KQL
Analysis Services
Storage
OneLake

Connection Modes

Four ways Power BI connects to your data

Import -- The traditional approach
Sources
Scheduled copy
Semantic Model
Report
Data is copied from sources at scheduled intervals. The ETL process can apply complex transformations. Queries are fast (in-memory), but data can be stale between refreshes.
1GB model limit in shared capacity, up to 400GB in Premium. Storage is duplicated -- source + import cache.
DirectQuery -- Live queries to source
Source DB
Live query
Report
Every interaction generates a query back to the source database. Data is always real-time, but query performance depends entirely on the source.
Best for: real-time dashboards with simple models. Can overwhelm source systems under heavy report usage.
Direct Lake -- Fabric-native
Sources
Fabric
OneLake
Delta tables
Live by default
Semantic Model
Optional scheduled refresh
Report
Best of both worlds. The semantic model reads only the columns it needs directly from OneLake Lakehouses or Warehouses. Data flows live by default -- a "refresh" simply updates the delta version pointer so the model sees the latest files.
Falls back to DirectQuery if data exceeds capacity memory limits. Schedule a refresh after data pipeline loads to ensure the model reflects the complete, consistent dataset.
Composite -- Mix of storage modes
Sources
Mixed
Semantic Model
Import DL DQ
Report
Different tables use different storage modes within the same model. Import small lookup tables for speed, use Direct Lake for large fact tables, add DirectQuery for real-time dimensions.
Also allows extending a Live Connection model with local Import or DQ tables. Adds complexity -- use when a single mode doesn't fit all tables.

Direct Lake Mode

Best of Both Worlds: Import Performance with DirectQuery Freshness

How It Works
1

Data in OneLake

Delta tables stored in Parquet format. Data is ingested via pipelines, Dataflows, Spark, or Shortcuts.

2

Transcoding on Demand

When a query arrives, Parquet data is converted to VertiPaq columnar format. Only needed columns are loaded.

3

Targeted Caching

Transcoded columns are cached in memory. Subsequent queries on the same columns run at import-mode speed.
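To make the column-level behaviour concrete, here is a hedged sketch that runs a DAX query from a Fabric notebook using semantic link (sempy). The model name, table, and columns are assumptions for illustration.

```python
# Hedged sketch using semantic link (sempy) in a Fabric notebook.
# Model, table, and column names are illustrative assumptions.
import sempy.fabric as fabric

dax_query = """
EVALUATE
SUMMARIZECOLUMNS(
    'sales_orders'[Date],
    "Total Amount", SUM('sales_orders'[Amount])
)
"""

# First run: only [Date] and [Amount] are transcoded from Parquet into
# VertiPaq and cached; the other columns stay on disk in OneLake.
cold = fabric.evaluate_dax("Sales Direct Lake Model", dax_query)

# Second run: the same columns are already in memory, so the query is
# answered at import-mode speed.
warm = fabric.evaluate_dax("Sales Direct Lake Model", dax_query)
print(warm.head())
```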

4

Auto Refresh

When Delta tables change, the cache auto-invalidates. The next query triggers fresh transcoding -- data stays current. Schedule a refresh after incremental pipeline loads to ensure the model reflects the complete dataset.
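If you want that post-load refresh automated, one option is to call the Power BI REST API at the end of the pipeline. The sketch below is one possible approach, not the deck's prescribed method: the workspace and semantic model IDs are placeholders, and token acquisition via azure-identity is an assumption about your environment.

```python
# Hedged sketch: queue a semantic model refresh (a re-frame for Direct Lake)
# right after a pipeline load, using the Power BI REST API.
# WORKSPACE_ID / DATASET_ID are placeholders; adapt authentication to your setup.
import requests
from azure.identity import DefaultAzureCredential

WORKSPACE_ID = "<workspace-guid>"
DATASET_ID = "<semantic-model-guid>"

token = DefaultAzureCredential().get_token(
    "https://analysis.windows.net/powerbi/api/.default"
).token

response = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/refreshes",
    headers={"Authorization": f"Bearer {token}"},
    json={"notifyOption": "NoNotification"},
)
response.raise_for_status()  # HTTP 202 means the refresh request was accepted
```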

Key Benefits

Fast Queries

Near-import performance via in-memory caching and on-demand transcoding.

Transcoding takes milliseconds per column -- cached data is served at full VertiPaq speed.

Always Fresh

No scheduled refresh needed -- data stays current as Delta tables update.

Can also be set to snapshot mode for consistent point-in-time reporting. Ideal for daily or weekly board-level reports that need a stable view.

No Duplication

Single copy in OneLake -- no import cache consuming extra storage and memory.

Eliminates the storage cost of maintaining a separate VertiPaq copy of your data.

Massive Scale

No model size limits imposed by import. Scales with your Fabric capacity.

Automatically falls back to DirectQuery for data exceeding memory, so queries never fail.

Direct Lake vs Import

Import Mode

Power BI
Import
OneLake
Requires Refresh
vs

Direct Lake Mode -- Best of Both

Power BI
Import
OneLake
Always Fresh
Targeted Caching
OneLake Table
ID
Sales
Region
Date
Notes
Amount
Status
Ref
Only queried columns
In-Memory Cache
Columns not in your query stay on disk. Cached columns run at full import speed.
Schedule a refresh after pipeline loads to pick up new delta versions.

Storage Mode Comparison

Import
Speed: Fastest
Freshness: Scheduled
Storage: Duplicated
Refresh: Required
Data: In model
DirectQuery
Speed: Slower
Freshness: Real-time
Storage: None
Refresh: Not needed
Data: At source
Direct Lake
Speed: Near Import
Freshness: Real-time
Storage: None
Refresh: Optional
Data: OneLake
DirectQuery Variants
To Relational Source
SQL Server, Azure SQL, etc.
Report -> SQL Database
Sends live T-SQL queries directly to the database. Speed depends on source optimization.
Speed = Source DB
vs
Over Analysis Services
Chaining via published models
Report -> Semantic Model
Chains to a published model, inheriting its storage mode without duplication.
Inherits Mode

Requirements

What you need to use Direct Lake

Delta Tables

Data must be in Delta/Parquet format in OneLake. This is the native storage format for Fabric Lakehouses and Warehouses.

Use Fabric notebooks, pipelines, or Dataflows Gen2 to convert existing data to Delta format.
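A minimal notebook-based conversion might look like the sketch below; the source path `Files/raw/sales` and the target table name are assumptions.

```python
# Hedged sketch: converting existing CSV files in the lakehouse Files area
# into a Delta table that Direct Lake can use. Paths and names are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

raw = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("Files/raw/sales/*.csv")
)

# Landing the data as a managed table stores it as Delta/Parquet in OneLake.
raw.write.format("delta").mode("overwrite").saveAsTable("sales_orders")

# For data that is already Parquet, Delta Lake's CONVERT TO DELTA can register
# the existing files in place instead of rewriting them:
# spark.sql("CONVERT TO DELTA parquet.`Files/raw/sales_parquet`")
```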

Lakehouse or Warehouse

Delta tables must reside in a Fabric Lakehouse or Warehouse. Shortcuts to external Delta tables are also supported.

Shortcuts enable Direct Lake on data stored in ADLS Gen2 or S3 without moving it.
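Shortcuts are typically created through the Lakehouse UI, but they can also be scripted. The sketch below uses the Fabric OneLake Shortcuts REST API as I understand it; the endpoint, payload shape, IDs, storage URL, and connection GUID are all assumptions to verify against the current API reference.

```python
# Hedged sketch: creating an ADLS Gen2 shortcut under a lakehouse's Tables
# folder via the Fabric REST API. All IDs, URLs, and the payload shape are
# assumptions -- check the OneLake Shortcuts API docs before relying on this.
import requests
from azure.identity import DefaultAzureCredential

WORKSPACE_ID = "<workspace-guid>"
LAKEHOUSE_ID = "<lakehouse-item-guid>"

token = DefaultAzureCredential().get_token(
    "https://api.fabric.microsoft.com/.default"
).token

payload = {
    "path": "Tables",
    "name": "external_sales",
    "target": {
        "adlsGen2": {
            "location": "https://contosodata.dfs.core.windows.net",
            "subpath": "/lake/delta/sales",
            "connectionId": "<connection-guid>",
        }
    },
}

response = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{LAKEHOUSE_ID}/shortcuts",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
response.raise_for_status()
```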

Fabric Capacity

A Microsoft Fabric capacity (F2 or higher) is required. Direct Lake is not available in shared/Pro-only workspaces.

F2 is the minimum SKU. Larger capacities allow more data to be cached in memory before fallback occurs.

V-Order Optimized

Tables should be V-Order optimized for best transcoding performance. Fabric applies V-Order by default.

V-Order pre-sorts data for faster VertiPaq transcoding. Run OPTIMIZE on existing tables.
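For existing tables, the OPTIMIZE step can be run from a notebook. A hedged sketch follows, assuming a table named `sales_orders`; the session-level V-Order setting name reflects the Fabric Spark documentation at the time of writing and may differ in your runtime.

```python
# Hedged sketch: applying V-Order to an existing Delta table from a Fabric
# notebook. The table name is an assumption; tables written by Fabric are
# V-Ordered by default, so this mainly matters for tables created elsewhere.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Ensure new writes in this session produce V-Ordered Parquet files.
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")

# Rewrite existing files with V-Order so Direct Lake transcoding stays fast.
spark.sql("OPTIMIZE sales_orders VORDER")
```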
All four requirements must be met. F2 SKU is the minimum Fabric capacity. Without Delta tables in OneLake, a semantic model cannot use Direct Lake mode at all -- you are left with DirectQuery or Import instead.
Start by establishing the **problem** -- everyone has data everywhere. *Ask the audience: "How many data sources does your organization have?"* Usually gets laughs. The point is: **silos create friction**.
Let the issues sink in:
- Multiple platforms, **duplicated storage**
- Separate security models
- Independent refresh schedules
- **Siloed governance**
This is why organizations struggle with data strategy.
*Transition: "This is the problem OneLake solves."*
**OneLake is like OneDrive for data** -- one copy, many consumers. Emphasize: you don't move data *to* Fabric -- Fabric's tools all read *from* OneLake. Open **Delta/Parquet** format.
**Key point**: four different engines, same data, **no copies**.
- Data engineer uses **Spark**
- DBA uses **T-SQL**
- Analyst uses **Power BI**
All hitting the same OneLake tables.
**Fabric Items** create and consume data:
- **Lakehouse** for data engineering
- **Warehouse** for SQL
- **Notebooks** for data science
- **Reports** for visualization
All share OneLake storage underneath. **Semantic Model** is the 4th compute engine -- it powers Direct Lake.
**Import** -- the classic. Data is copied into **VertiPaq** (in-memory columnar). Queries are blazing fast, but data can be **stale**.
- 1GB model limit in shared capacity, up to 400GB in Premium
- You're duplicating storage
*This is what most people know.*
**DirectQuery** -- every click = a query to the source. Good for **real-time dashboards**, bad for complex models with many visuals. Performance depends entirely on the source database -- can **overwhelm the source**.
**Composite** -- the hybrid. Different tables can use different storage modes:
- **Import** your small lookup tables
- **Direct Lake** your large facts
- **DirectQuery** for real-time feeds
*Powerful but complex.*
**Direct Lake** -- the Fabric-native mode. No ETL needed for the semantic model. Reads Delta files directly from OneLake. Only the **columns needed** for the current query are transcoded to VertiPaq format. Data flows **live by default** -- a "refresh" just updates the delta version pointer. *Schedule refreshes after pipeline loads for consistency.*
**On screen:** Process card #1 -- "Data in OneLake" with Delta/Parquet description
- This is the **foundation**: data lives as Delta tables in OneLake, written by pipelines, Dataflows, Spark, or Shortcuts
- Emphasize the **open format** -- Parquet files with a Delta transaction log on top
- The semantic model doesn't copy this data; it reads from it
*Transition: "So the data is there -- what happens when someone opens a report?"*
**On screen:** Process card #2 -- "Transcoding on Demand"
- This is the key differentiator: transcoding happens **at query time**, not upfront
- Parquet columns are converted to **VertiPaq** columnar format on the fly
- Only the **specific columns** the DAX query touches get transcoded -- not the whole table
- This is why there's no traditional "refresh" -- no bulk import step
*Key message: "On demand" is the magic phrase here.*
**On screen:** Process card #3 -- "Targeted Caching"
- Once a column is transcoded, it's **cached in memory**
- Subsequent queries hitting those same columns run at full import speed
- The cache is column-level, not table-level -- granular and memory-efficient
- This is why the first query is slightly slower, but everything after that is fast
*Transition: "But what happens when the underlying data changes?"*
**On screen:** Process card #4 -- "Auto Refresh"
- When Delta tables are updated, the cache **auto-invalidates**
- The next query triggers fresh transcoding -- no manual intervention needed
- **Pro tip**: schedule a semantic model refresh after pipeline loads to ensure the model picks up the latest delta version immediately
- You *can* use snapshot mode for consistent point-in-time reporting (board reports, monthly closes)
*Key message: Data freshness is automatic, not a scheduled chore.*
**On screen:** Benefit card -- "Fast Queries" (teal, lightning icon) - Near-import performance via in-memory caching - Transcoding takes **milliseconds per column** -- users won't notice the difference - Cached columns serve at full VertiPaq speed - First-time column access is the only "cost" -- and it's barely noticeable *Transition: "Speed is great, but what about freshness?"*
**On screen:** Benefit card -- "Always Fresh" (green, refresh icon) - No scheduled refresh needed for data currency -- Delta changes flow through automatically - Mention the footnote: **snapshot mode** is available for when you *want* a stable view (weekly board reports, audit periods) - This is the DirectQuery benefit without the DirectQuery performance penalty *Key message: You get real-time freshness without sacrificing query speed.*
**On screen:** Benefit card -- "No Duplication" (orange, database-with-X icon) - Single copy of data lives in OneLake -- the semantic model doesn't create a second copy - Eliminates the **storage cost** of maintaining a separate VertiPaq dataset - For large datasets, this savings is significant -- no more 400GB import models duplicating your lakehouse *Transition: "And it scales beyond what import can handle."*
**On screen:** Benefit card -- "Massive Scale" (purple, expand icon) - No model size limits imposed by import mode - Scales with your Fabric capacity SKU - **Fallback behavior**: if data exceeds available memory, Direct Lake automatically falls back to DirectQuery for those columns -- queries never fail - This is a safety net that import mode doesn't have *Key message: Direct Lake grows with your data -- you don't hit a wall.*
**On screen:** Side-by-side comparison -- Import Mode panel (left side, with "Requires Refresh" badge)
- Walk through the Import flow: Power BI -> Import cache (red highlight) -> OneLake
- That middle layer -- the import cache -- is the **problem**: it creates staleness and storage duplication
- Data is only as fresh as the last scheduled refresh
- Point out the red highlight on the Import box -- that's the bottleneck
*Transition: "Now look at what Direct Lake does differently."*
**On screen:** Direct Lake Mode panel (right side, with bypass arrow and "Always Fresh" badge)
- The animated dashed arrow **bypasses** the import cache entirely -- Power BI reads straight from OneLake
- The crossed-out Import box shows what's been eliminated
- "Best of Both" badge reinforces: import speed + DirectQuery freshness
- Only the columns your current query needs are cached -- not the entire dataset
*Key message: The visual makes it obvious -- Direct Lake removes the middleman.*
**On screen:** Targeted Caching diagram -- OneLake table columns with 3 highlighted, arrow to In-Memory Cache
- This visual shows exactly how selective caching works: 8 columns in the table, only 3 (Sales, Date, Amount) are cached
- The dimmed columns (ID, Region, Notes, Status, Ref) stay on disk -- no memory wasted
- Cached columns run at **full import speed**
- Footnote reminds: schedule a refresh after pipeline loads to pick up new delta versions
*Key message: "Only what you need, when you need it" -- that's the efficiency of Direct Lake.*
**On screen:** Import mode feature card (orange) -- Speed: Fastest, Freshness: Scheduled, Storage: Duplicated, Refresh: Required
- Import is the **baseline** everyone knows -- fastest queries, but at a cost
- **Scheduled** freshness means data can be hours or days stale
- **Duplicated** storage means you're paying for the same data twice
- Refresh is **required** -- miss one and your dashboard is wrong
*Transition: "What if you flip those trade-offs?"*
**On screen:** DirectQuery feature card (blue) -- Speed: Slower, Freshness: Real-time, Storage: None, Refresh: Not needed
- DirectQuery solves the freshness problem but creates a **speed** problem
- Every click sends a live query to the source -- performance depends on the source database
- No storage duplication, no refresh needed -- but **complex models with many visuals can overwhelm the source**
- Good for operational dashboards with low visual counts
*Key message: DirectQuery trades speed for freshness -- the opposite of Import.*
**On screen:** Direct Lake feature card (teal, highlighted) -- Speed: Near Import, Freshness: Real-time, Storage: None, Refresh: Optional
- This is the **punchline**: Direct Lake gets the best of both columns
- **Near Import** speed -- first query slightly slower, subsequent queries match import
- **Real-time** freshness like DirectQuery
- **No storage duplication** -- single copy in OneLake
- Refresh is **Optional** -- you *can* schedule it for consistency, but you don't *have* to
*Key message: Point at the green checkmarks -- this is why Direct Lake matters.*
**On screen:** DirectQuery Variants panel -- Relational Source vs. Chaining over Analysis Services
- Two types of DirectQuery that audiences often confuse:
  - **Relational DQ**: Report sends live SQL to a database (SQL Server, Azure SQL) -- speed = source DB performance
  - **Chaining DQ**: Report connects to a published semantic model, inheriting its storage mode -- no data duplication
- In **composite models**, both types can coexist: import lookup tables for speed + DQ fact tables for freshness
*Key message: Chaining is how teams share a single source of truth without copying data.*
**On screen:** Checklist item -- "Delta Tables" with gold checkmark - **Hard requirement #1**: data must be in Delta/Parquet format in OneLake - This is the native format for Fabric Lakehouses and Warehouses - If your data isn't Delta yet, use notebooks, pipelines, or Dataflows Gen2 to convert - Shortcuts to external Delta tables (ADLS Gen2, S3) also count *Transition: "Where do those Delta tables need to live?"*
**On screen:** Checklist item -- "Lakehouse or Warehouse" with gold checkmark - **Hard requirement #2**: Delta tables must reside in a Fabric Lakehouse or Warehouse - **Shortcuts** are a powerful option -- point to Delta tables in ADLS Gen2 or S3 without moving them - This means you don't have to migrate everything into Fabric storage to use Direct Lake *Transition: "What infrastructure do you need?"*
**On screen:** Checklist item -- "Fabric Capacity" with gold checkmark - **Hard requirement #3**: Microsoft Fabric capacity, **F2 minimum** - Direct Lake is **not available** in shared or Pro-only workspaces - Larger SKUs allow more data to be cached in memory before fallback to DirectQuery occurs - This is often the blocker for smaller organizations -- plan for the capacity cost *Transition: "One more optimization to get the best performance."*
**On screen:** Checklist item -- "V-Order Optimized" with gold checkmark - **Hard requirement #4**: tables should be V-Order optimized for best transcoding performance - V-Order **pre-sorts** data in a way that aligns with VertiPaq's columnar compression - Fabric applies V-Order by default on new tables -- but for existing tables, run **OPTIMIZE** - Without V-Order, transcoding still works but takes longer -- noticeable on large tables *Key message: V-Order is the difference between "near import" and "noticeably slower."*
**On screen:** Summary callout -- "All four requirements must be met" with F2 minimum and fallback explanation
- Reinforce: **all four** are requirements, not optional extras
- Without Delta tables in OneLake, the semantic model **cannot** use Direct Lake mode
- Fallback behavior: the model drops to DirectQuery (slower) or requires Import (back to square one)
- Good closing question: *"Which of these four does your organization already have in place?"*
*Key message: This is a checklist -- if you can check all four, you're ready for Direct Lake.*