Like Jitsu? Give us a star on ⭐ GitHub!

πŸ“œ Configuration

Configuration UI

πŸ‘©β€πŸ”¬ Extending Jitsu

Overview
Destination Extensions
Source Extensions
API Specs

Jitsu Internals

Airbyte Based Sources

Make sure Jitsu has access to docker. If you've deployed Jitsu with Docker - you should mount volumes: /var/run/docker.sock:/var/run/docker.sock. and jitsu_workspace:/home/eventnative/data/airbyte Read more about volumes for Joint Image (if you deploy @jitsucom/jitsu or for Jitsu Server if you deploy @jitsucom/server

Airbyte Based Sources may not work properly in managed environments that limits access to docker, e.g., Heroku.

If you deploy Jitsu to Kubernetes please check How to run Airbyte sources in Kubernetes

Airbyte is an open-source ETL-framework. Jitsu supports Airbyte as an of the connector backend (the other one being Singer and native connectors

This page describes how to configure Jitsu Server if it runs in standalone mode (without Configurator). If you deployed Jitsu along with Configurator, you can configure Airbyte connectors directly in the UI

Understanding Airbyte connectors

Each Airbyte connector is a standalone docker image. The connector implements Airbyte protocol. The protocol is CLI based: connector executor (Jitsu) feeds connector with a formatted JSON objects (see below) and parses the stdout which is a stream of JSON objects as well.

Configuration

Airbyte configuration is a two of JSON objects (see specification):

NameDescription
Config (required)JSON object contains connector's configuration parameters. Each Airbyte connector has a configuration specification which describes configuration parameters.
CatalogJSON object contains all streams (object types) and fields to download. If not provided, Jitsu will do discover and save catalog with all available streams. JSON structure is standardized, but stream and field names depend on the connector.

The connectors' configuration should be placed under the sources section of configuration file. The structure is an extension of generic source configuration - type: airbyte indicates that the source should be handled by Airbyte executor

sources:
  <connector_id>:
    type: airbyte
    schedule: '...'                  # Cron expression or period notation (@daily or @hourly)
    destinations: [ ]                # List of destinations
    config:
      config: {}                     # Airbyte config object (see below)
      docker_image: 'airbyte/image'  # Airbyte connector image
      catalog: {}                    # Airbyte catalog object (see below).
                                     # Optional. If not provided, all streams
                                     # will be synchronized

Example:

sources:
  ...
  jitsu_airbyte_hubspot:
    type: airbyte
    destinations: [ "clickhouse_destination_id" ]
    schedule: '*/5 * * * *'
    config:
      config:
        credentials:
          api_key: '<HUBSPOT_API_KEY>'
        start_date: "2017-01-25T00:00:00Z"
      docker_image: source-hubspot
  airbyte_source_shopify:
    destinations: [ "clickhouse_destination_id" ]
    type: airbyte
    schedule: '@daily'
    config:
      config:
        api_password: '<SHOPIFY_API_PASSWORD>'
        shop: 'https://EXAMPLE.myshopify.com'
        start_date: "2020-10-10"
      docker_image: source-shopify

JSON configuration parameters such as config, catalog, state can be an object or a raw JSON or JSON string or path to local JSON file

Table Names

Jitsu creates tables with names $sourceID_$AirbyteStreamName by default. For instance, table with name jitsu_airbyte_shopify_orders will be created according to the following configuration:

sources:
  ...
  airbyte_source_shopify:
    destinations: [ "clickhouse_destination_id" ]
    type: airbyte
    schedule: '@daily'
    config:
      config:
        api_password: '<SHOPIFY_API_PASSWORD>'
        shop: 'https://EXAMPLE.myshopify.com'
        start_date: "2020-10-10"
      catalog: '{"streams":[{"stream": {"name":"orders", ...}}]}'
      docker_image: source-shopify

Table names might be overridden by adding stream_table_names configuration parameter:

sources:
  ...
  airbyte_source_shopify:
    destinations: [ "clickhouse_destination_id" ]
    type: airbyte
    schedule: '@daily'
    config:
      config:
        api_password: '<SHOPIFY_API_PASSWORD>'
        shop: 'https://EXAMPLE.myshopify.com'
        start_date: "2020-10-10"
      catalog: '{"streams":[{"stream": {"name":"orders", ...}},{"stream": {"name":"products", ...}}]}'
      docker_image: source-shopify
      stream_table_names:
        orders: my_orders
        products: my_products