MotherDuck (DuckDB)

DuckDB-powered cloud data warehouse scaling to terabytes with ease.

Features

Feature | Supported
Batch Mode
Stream Mode
Deduplication
Queries Optimization

Configuration

Advanced: Implementation Details

This section describes how Jitsu implements various modes and features for DuckDB.

Batch Mode

Algorithm
-- Jitsu collects event batches in a temporary file on the file system.
ATTACH ':memory:' AS jitsu_memdb
INSERT INTO jitsu_memdb.tmp_table
INSERT INTO target_table SELECT * FROM jitsu_memdb.tmp_table
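
The staging flow above can be sketched end to end. This is a minimal illustration, not Jitsu's actual code: it uses Python's built-in `sqlite3` module as a stand-in for DuckDB (both accept `ATTACH ':memory:' AS ...`), and the `message_id` and `event_type` columns and the sample batch are hypothetical.

```python
import sqlite3

# Hypothetical batch; in the flow above these rows would come from the temporary file.
batch = [("e1", "page"), ("e2", "track"), ("e3", "identify")]

# sqlite3 is only a stand-in for the warehouse connection.
# isolation_level=None means autocommit, so ATTACH never runs inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE target_table (message_id TEXT, event_type TEXT)")

# Step 1: attach a separate in-memory database used only for staging.
conn.execute("ATTACH ':memory:' AS jitsu_memdb")
conn.execute("CREATE TABLE jitsu_memdb.tmp_table (message_id TEXT, event_type TEXT)")

# Step 2: load the whole batch into the staging table.
conn.executemany("INSERT INTO jitsu_memdb.tmp_table VALUES (?, ?)", batch)

# Step 3: copy the staged rows into the target table with a single statement.
conn.execute("INSERT INTO target_table SELECT * FROM jitsu_memdb.tmp_table")

loaded = conn.execute("SELECT COUNT(*) FROM target_table").fetchone()[0]
print(loaded)
```

Staging first means the target table is written by one `INSERT ... SELECT` statement per batch rather than by one statement per event.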

Stream Mode

INSERT INTO target_table (...) VALUES (...)
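
The statement above writes explicit values for a single row. A rough sketch of issuing it once per event follows, assuming events arrive as flat dictionaries. The `insert_event` helper and the column names are hypothetical, and Python's built-in `sqlite3` module stands in for DuckDB.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # stand-in for the warehouse
conn.execute("CREATE TABLE target_table (message_id TEXT, event_type TEXT)")


def insert_event(conn, event):
    """Insert one event as one row (hypothetical helper, not a Jitsu API)."""
    # Column names are taken from the event keys and assumed to be trusted;
    # values are always passed as bound parameters.
    columns = ", ".join(f'"{name}"' for name in event)
    placeholders = ", ".join("?" for _ in event)
    conn.execute(
        f"INSERT INTO target_table ({columns}) VALUES ({placeholders})",
        list(event.values()),
    )


events = [
    {"message_id": "e1", "event_type": "page"},
    {"message_id": "e2", "event_type": "track"},
]
for event in events:
    insert_event(conn, event)

stream_count = conn.execute("SELECT COUNT(*) FROM target_table").fetchone()[0]
print(stream_count)
```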

Deduplication

For batch mode, the following algorithm is used:

Algorithm
-- Jitsu collects event batches in a temporary file on the file system.
-- Jitsu deduplicates rows in the temporary file.
ATTACH ':memory:' AS jitsu_memdb
INSERT INTO jitsu_memdb.tmp_table
INSERT OR REPLACE INTO target_table SELECT * FROM jitsu_memdb.tmp_table
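
The "deduplicate rows in the temporary file" step could look like the sketch below. It assumes that rows are identified by a `message_id` key and that the last occurrence of a key wins. Both are assumptions made for illustration, and `dedupe_batch` is a hypothetical helper.

```python
def dedupe_batch(rows, key="message_id"):
    """Keep one row per key; the last occurrence wins (an assumption)."""
    latest = {}
    for row in rows:
        # A later row with the same key overwrites the earlier one.
        latest[row[key]] = row
    return list(latest.values())


batch = [
    {"message_id": "e1", "event_type": "page"},
    {"message_id": "e2", "event_type": "track"},
    {"message_id": "e1", "event_type": "page_updated"},
]
deduped = dedupe_batch(batch)
print(len(deduped))
```

After this step each key appears at most once in the batch, so the final `INSERT OR REPLACE` only has to resolve conflicts against rows already present in the target table.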

For stream mode:

INSERT OR REPLACE INTO target_table (...) VALUES (...)
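
`INSERT OR REPLACE` can only detect a duplicate when the table has a primary key or unique constraint. The sketch below shows the resulting behavior under the assumption that `message_id` is that key. It uses Python's built-in `sqlite3` module as a stand-in for DuckDB, and the table layout is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # stand-in for the warehouse
# Assumption: message_id is the primary key that identifies duplicates.
conn.execute(
    "CREATE TABLE target_table (message_id TEXT PRIMARY KEY, event_type TEXT)"
)

# The third event repeats the key of the first one.
events = [("e1", "page"), ("e2", "track"), ("e1", "page_updated")]
for event in events:
    conn.execute(
        "INSERT OR REPLACE INTO target_table (message_id, event_type) VALUES (?, ?)",
        event,
    )

rows = conn.execute(
    "SELECT message_id, event_type FROM target_table ORDER BY message_id"
).fetchall()
print(rows)
```

The repeated key replaces the earlier row instead of adding a second one, so the table ends up with one row per `message_id`.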

Queries Optimization

The Timestamp connection setting is used to optimize SELECT queries.

A regular index is created on the specified timestamp column.
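
As a rough illustration, the sketch below creates such an index and runs the kind of time-range query it is meant to help. The column name `event_ts` and the index name are hypothetical, and Python's built-in `sqlite3` module stands in for DuckDB, so the actual statements issued by Jitsu may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # stand-in for the warehouse
conn.execute("CREATE TABLE target_table (message_id TEXT, event_ts TEXT)")

# Assumption: event_ts is the column named in the Timestamp connection setting.
conn.execute(
    "CREATE INDEX IF NOT EXISTS target_table_event_ts_idx ON target_table (event_ts)"
)

conn.executemany(
    "INSERT INTO target_table VALUES (?, ?)",
    [
        ("e1", "2024-01-01T10:00:00"),
        ("e2", "2024-01-02T09:00:00"),
        ("e3", "2024-01-03T08:00:00"),
    ],
)

# A time-range filter on the indexed column.
recent = conn.execute(
    "SELECT message_id FROM target_table WHERE event_ts >= ? ORDER BY event_ts",
    ("2024-01-02",),
).fetchall()

index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'target_table'"
    )
]
print(recent, index_names)
```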