PostgreSQL

Postgres is a powerful, open source object-relational database system.

Features

FeatureSupported
Batch Mode
Stream Mode
Deduplication
Queries Optimization

Configuration

Jitsu supports database username/password based authentication for postgres.

For Google Cloud SQL for PostgreSQL, Private Service Connect connection is also supported.

General parameters

Parameter nameDescription
Authentication Methodpassword - Username/Password based authentication, google-psc - Private Service Connect for Google-managed postgres instances
DatabasePostgres database name
SchemaPostgres schema

Configuration settings depend on the selected authentication method.

Username/Password based authentication

Parameter nameDescription
HostPostgres host
PortPostgres port
UsernamePostgres username
PasswordPostgres password
SSL ModeSSL mode for Postgres connection: disable,require,verify-ca,verify-full
SSL Server CASSL Certificate Authority for verify-ca,verify-full SSL Modes
SSL Client CertSSL Client Certificate for verify-ca,verify-full SSL Modes
SSL Client KeySSL Client Key for verify-ca,verify-full SSL Modes

Google Cloud Private Service Connect mode

Parameter nameDescription
Instance Connection NameGoogle Cloud SQL instance connection name in the format project-name:region:instance-name

Advanced: Private Service Connect for Google Cloud SQL

Private Service Connect (PSC) is a Google Cloud networking feature that allows to connect to Google manages services from multiple VPC networks that belong to different groups, teams, projects, or organizations.

Private Service Connect allows you to grant access to your Google Cloud SQL instances to the Jitsu service account without exposing the instances to the public internet. This is particularly useful for organizations that need to maintain strict security and access controls.

More on Private Service Connect

More on Private Service Connect for Cloud SQL

Configuring Google Cloud SQL

Enable Private Service Connect for a Cloud SQL Instance

https://cloud.google.com/sql/docs/postgres/configure-private-services-access-and-private-service-connect#enable-psa-instance

Jitsu Project ID to provide in the allowed-psc-projects parameter: jitsu-cloud-infra

Configure Cloud SQL Instances for IAM Database Authentication

https://cloud.google.com/sql/docs/postgres/create-edit-iam-instances#console_1

Grant Permissions for Jitsu Service Account:

You need to grant the Jitsu service account permissions to connect to your Cloud SQL instances.

Contact Jitsu Support to get the service account email address.

Then complete the following steps to grant the necessary permissions:

  • Open the Google Cloud Console
  • Navigate to IAM & Admin > IAM
  • Click "Grant Access"
  • Add Jitsu service account email address as a Principal and assign roles:
    • cloudsql.instances.connect (Cloud SQL Client)
    • cloudsql.instanceUser (Cloud SQL Instance User)
  • Optional: Limit access to specific instances using Conditions: https://cloud.google.com/sql/docs/postgres/iam-conditions

Add Jitsu Service Account to a Cloud SQL Instance as a database user

Please refer to the following documentation to create a database user for Jitsu service account:

https://cloud.google.com/sql/docs/postgres/add-manage-iam-users#creating-a-database-user

Grant Database privileges

Keep in mind that database users created from service account use shortened domain name:

name@google-project.iam must be used instead of name@google-project.iam.gserviceaccount.com

Example:

-- Allow user to use and create objects in the public schema
GRANT USAGE, CREATE ON SCHEMA public TO "name@google-project.iam";

Provide Jitsu with Cloud SQL Instance Details

  • Connection Name
  • Private Service Connect (PSC) - DNS name
  • Private Service Connect (PSC) - Service attachment

You can get this information by running:

gcloud sql instances describe YOUR_CLOUD_SQL_INSTANCE_NAME --project=YOUR_PROJECT | grep "dnsName:\|connectionName\|pscServiceAttachmentLink"

Alternatively, find these details in the Cloud SQL instance’s UI - Overview section.

Advanced: Implementation Details

This section describes how Jitsu implements various modes and features for Postgres.

Batch Mode

Algorithm
-- Jitsu collects events batches in tmp file on file system.
BEGIN -- start transaction
COPY from STDIN to tmp_table -- load tmp file into tmp_table
INSERT into target_table select from tmp_table
COMMIT -- commit transaction

Stream Mode

INSERT INTO target_table (...) VALUES (..)

Deduplication

For batch mode the following algorithm is used:

Algorithm
-- Jitsu collects events batches in tmp file on file system.
-- Deduplicate rows in tmp file
BEGIN -- start transaction
COPY from STDIN to tmp_table -- load tmp file into tmp_table
INSERT into target_table select from tmp_table ON CONFLICT UPDATE ...
COMMIT -- commit transaction

For stream mode:

INSERT INTO target_table (...) VALUES (..) ON CONFLICT UPDATE ...

Queries Optimization

Timestamp connection setting is used to optimize SELECT queries.

Regular index is created on specified timestamp column.