MotherDuck (DuckDB)
DuckDB-powered cloud data warehouse scaling to terabytes with ease.
Features
| Feature | Supported |
|---|---|
| Batch Mode | ✅ |
| Stream Mode | ✅ |
| Deduplication | ✅ |
| Queries Optimization | ✅ |
Configuration
Advanced: Implementation Details
This section describes how Jitsu implements various modes and features for DuckDB.
Batch Mode
Algorithm
-- Jitsu collects event batches in a temporary file on the file system.
ATTACH ':memory:' AS jitsu_memdb
INSERT INTO jitsu_memdb.tmp_table
INSERT INTO target_table SELECT * FROM jitsu_memdb.tmp_table
Stream Mode
INSERT INTO target_table (...) VALUES (..)
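The two write paths described above (batch and stream) can be sketched as a small runnable script. This is an illustration under stated assumptions, not Jitsu's actual code: Python's built-in `sqlite3` stands in for DuckDB/MotherDuck only so the sketch runs without credentials, and the table columns (`id`, `ts`, `name`) and helper names are hypothetical. The statement shapes used here (`ATTACH ':memory:'`, `INSERT ... SELECT`, plain `INSERT`) are available in both engines, though details such as types and transaction handling may differ.

```python
import sqlite3


def open_target():
    """Open a throwaway database with a hypothetical target table."""
    # isolation_level=None gives autocommit, so ATTACH/DETACH are not
    # wrapped in an implicit transaction.
    con = sqlite3.connect(":memory:", isolation_level=None)
    con.execute("CREATE TABLE target_table (id TEXT, ts TEXT, name TEXT)")
    return con


def load_batch(con, events):
    """Batch mode: stage rows in an attached in-memory database, then copy them."""
    con.execute("ATTACH ':memory:' AS jitsu_memdb")
    try:
        con.execute(
            "CREATE TABLE jitsu_memdb.tmp_table (id TEXT, ts TEXT, name TEXT)"
        )
        con.executemany(
            "INSERT INTO jitsu_memdb.tmp_table VALUES (?, ?, ?)", events
        )
        # One statement moves the whole staged batch into the target table.
        con.execute(
            "INSERT INTO target_table SELECT * FROM jitsu_memdb.tmp_table"
        )
    finally:
        # Detaching discards the staging table along with the in-memory database.
        con.execute("DETACH jitsu_memdb")


def load_stream(con, event):
    """Stream mode: write each event directly with a single INSERT."""
    con.execute(
        "INSERT INTO target_table (id, ts, name) VALUES (?, ?, ?)", event
    )


if __name__ == "__main__":
    con = open_target()
    load_batch(con, [("1", "2024-01-01T00:00:00Z", "signup"),
                     ("2", "2024-01-01T00:05:00Z", "login")])
    load_stream(con, ("3", "2024-01-01T00:10:00Z", "logout"))
    print(con.execute("SELECT COUNT(*) FROM target_table").fetchone()[0])
```

The trade-off the sketch makes visible: batch mode pays for staging but touches the target table once per batch, while stream mode issues one statement per event.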
Deduplication
For batch mode the following algorithm is used:
Algorithm
-- Jitsu collects event batches in a temporary file on the file system.
-- Rows in the temporary file are deduplicated.
ATTACH ':memory:' AS jitsu_memdb
INSERT INTO jitsu_memdb.tmp_table
INSERT OR REPLACE INTO target_table SELECT * FROM jitsu_memdb.tmp_table
For stream mode:
INSERT OR REPLACE INTO target_table (...) VALUES (..)
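A minimal sketch of the deduplicating variants follows. As an illustration only, it uses Python's built-in `sqlite3` as a stand-in for DuckDB/MotherDuck; `INSERT OR REPLACE` exists in both engines and in both depends on a primary key or unique constraint on the target table. The `id` primary key, the two-column schema, the helper names, and the "last event per id wins" rule for deduplicating a batch are all assumptions made for the example; the source does not state which duplicate is kept.

```python
import sqlite3


def open_target():
    """Open a throwaway database with a hypothetical keyed target table."""
    con = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    # The primary key is what lets INSERT OR REPLACE detect duplicates.
    con.execute("CREATE TABLE target_table (id TEXT PRIMARY KEY, name TEXT)")
    return con


def dedupe_batch(events):
    """Keep one row per id; the last occurrence wins (assumed rule)."""
    latest = {}
    for event_id, name in events:
        latest[event_id] = (event_id, name)
    return list(latest.values())


def load_batch_dedup(con, events):
    """Batch mode: dedupe the batch, stage it, then upsert into the target."""
    con.execute("ATTACH ':memory:' AS jitsu_memdb")
    try:
        con.execute("CREATE TABLE jitsu_memdb.tmp_table (id TEXT, name TEXT)")
        con.executemany(
            "INSERT INTO jitsu_memdb.tmp_table VALUES (?, ?)",
            dedupe_batch(events),
        )
        # Rows whose id already exists in target_table are replaced.
        con.execute(
            "INSERT OR REPLACE INTO target_table "
            "SELECT * FROM jitsu_memdb.tmp_table"
        )
    finally:
        con.execute("DETACH jitsu_memdb")


def load_stream_dedup(con, event):
    """Stream mode: upsert a single event by primary key."""
    con.execute(
        "INSERT OR REPLACE INTO target_table (id, name) VALUES (?, ?)", event
    )


if __name__ == "__main__":
    con = open_target()
    load_batch_dedup(con, [("1", "a"), ("2", "b"), ("1", "c")])
    load_stream_dedup(con, ("2", "d"))
    print(sorted(con.execute("SELECT id, name FROM target_table").fetchall()))
```

Deduplicating inside the batch first matters because the staged rows are written with one statement; the replace step then only has to resolve conflicts against rows that are already in the target table.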
Queries Optimization
The Timestamp connection setting is used to optimize SELECT queries.
A regular index is created on the specified timestamp column.
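As a rough illustration of what an index on the timestamp column looks like in practice, the sketch below creates one and runs a time-range query against it. It again uses Python's built-in `sqlite3` as a stand-in for DuckDB/MotherDuck; `CREATE INDEX` has the same basic form in both, but how much a given query benefits depends on the engine. The column name `ts`, the index name `target_table_ts_idx`, and the helper names are hypothetical.

```python
import sqlite3


def open_indexed_target():
    """Create a hypothetical target table with an index on its timestamp column."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE target_table (id TEXT, ts TEXT, name TEXT)")
    # Regular (non-unique) index on the column named by the Timestamp setting.
    con.execute(
        "CREATE INDEX IF NOT EXISTS target_table_ts_idx ON target_table (ts)"
    )
    return con


def events_since(con, since):
    """Return ids of events at or after `since`, oldest first."""
    rows = con.execute(
        "SELECT id FROM target_table WHERE ts >= ? ORDER BY ts", (since,)
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    con = open_indexed_target()
    con.executemany(
        "INSERT INTO target_table VALUES (?, ?, ?)",
        [("1", "2024-01-01T00:00:00Z", "signup"),
         ("2", "2024-01-02T00:00:00Z", "login"),
         ("3", "2024-01-03T00:00:00Z", "logout")],
    )
    print(events_since(con, "2024-01-02T00:00:00Z"))
```

Queries that filter or sort on the timestamp column are the ones such an index is meant to help; queries on other columns are unaffected.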