Retroactive User Recognition

Jitsu can store all events from anonymous users and, once a user is identified, update those events in the DWH with the user ID. At present, this functionality is supported only for Postgres, Redshift, Snowflake, MySQL and ClickHouse*.

*User Recognition support for ClickHouse is limited to the ReplacingMergeTree and ReplicatedReplacingMergeTree engines.
*ClickHouse handles data mutation differently. Please read ClickHouse specifics to avoid unexpected results of Retroactive User Recognition on ClickHouse data tables.

Example

event_id  anonymous_id  email
event1    1
event2    1
event3    1             a@b.com
event4    1             a@b.com

Right after event3 Jitsu amends event1 and event2 and adds email=a@b.com. As a result, there will be the following events in DWH:

event_id  anonymous_id  email
event1    1             a@b.com
event2    1             a@b.com
event3    1             a@b.com
event4    1             a@b.com

Fields anonymous_id and email are configurable. See identification_nodes below.
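The amendment in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not Jitsu's implementation: the function name `recognize`, the in-memory `pending` map (standing in for Redis) and the `stored` list (standing in for the DWH table) are all assumptions made for this sketch.

```python
from collections import defaultdict


def recognize(events, anonymous_id_field="anonymous_id", id_field="email"):
    """Replay events in order and back-fill the identification field.

    Hypothetical helper: `stored` plays the role of the DWH table and
    `pending` plays the role of the Redis buffer of anonymous events.
    """
    stored = []
    pending = defaultdict(list)  # anonymous_id -> indexes of unidentified rows

    for event in events:
        row = dict(event)  # copy so the caller's events are left untouched
        stored.append(row)
        anon = row.get(anonymous_id_field)
        ident = row.get(id_field)
        if anon is None:
            continue  # nothing to match on
        if ident is None:
            # Anonymous event: remember it until the user is identified.
            pending[anon].append(len(stored) - 1)
        else:
            # Identified event: amend every buffered event of this anonymous_id.
            for index in pending.pop(anon, []):
                stored[index][id_field] = ident
    return stored


events = [
    {"event_id": "event1", "anonymous_id": "1"},
    {"event_id": "event2", "anonymous_id": "1"},
    {"event_id": "event3", "anonymous_id": "1", "email": "a@b.com"},
    {"event_id": "event4", "anonymous_id": "1", "email": "a@b.com"},
]
result = recognize(events)
```

After the call, every row in `result` carries the same email, which mirrors the second table above.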

Resources

[Figure: user recognition flow]

Retroactive User Recognition stores all incoming anonymous events in Redis, so RAM consumption can be high. A few measures reduce it: use a dedicated Redis instance, and configure eviction and compression. Read how to optimize Redis memory.

Configuration

To enable this feature, set users_recognition.enabled to true in the configuration file, or use the equivalent env variable: USER_RECOGNITION_ENABLED=true.

This setting enables user recognition for all supported destinations: Postgres, Redshift, Snowflake, MySQL and ClickHouse. By default, /user/anonymous_id is used as the node for getting anonymous_id, and /user/id and /user/email are used as the sources for the user identification fields.

These settings can be overridden at the global level of the config file:

users_recognition:
  enabled: true # Disabled by default.
  anonymous_id_node: /user/anonymous_id
  identification_nodes:
    - /user/id
    - /user/email

The anonymous_id_node and identification_nodes settings cannot be configured with env variables at the moment :(
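To make the node notation concrete, here is a minimal Python sketch of how a slash-separated path such as /user/anonymous_id can address a field in a nested JSON event. The helpers `get_node` and `collect_identifiers` are hypothetical names used only for this illustration, and the sample event shape is an assumption; Jitsu's own lookup logic may differ.

```python
def get_node(event, path):
    """Walk a slash-separated path (e.g. "/user/email") through nested dicts.

    Returns None when any segment of the path is missing.
    """
    current = event
    for key in path.strip("/").split("/"):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def collect_identifiers(event, identification_nodes):
    """Return {path: value} for every identification node present in the event."""
    found = {}
    for path in identification_nodes:
        value = get_node(event, path)
        if value is not None:
            found[path] = value
    return found


# Assumed event shape, matching the default node paths.
event = {"user": {"anonymous_id": "1", "email": "a@b.com"}}

anonymous_id = get_node(event, "/user/anonymous_id")
identifiers = collect_identifiers(event, ["/user/id", "/user/email"])
```

Here `anonymous_id` resolves to "1", and `identifiers` contains only the /user/email entry because the sample event has no /user/id.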

By default, a system-wide Redis instance will be used for storing the data (meta.storage.redis in config file or REDIS_URL env var).

You can use a dedicated Redis instance (separate from the Redis used for configuration and short-lived caches) and apply memory optimization. Read more about options here.

This feature requires:

  1. users_recognition.redis or meta.storage.redis configuration
  2. primary_key_fields configuration in Postgres, Redshift and MySQL destinations. Read more about those settings on the General Configuration page.