PostgreSQL
Postgres is a powerful, open source object-relational database system.
Features
| Feature | Supported |
|---|---|
| Batch Mode | ✅ |
| Stream Mode | ✅ |
| Deduplication | ✅ |
| Queries Optimization | ✅ |
Configuration
Jitsu supports database username/password based authentication for postgres.
For Google Cloud SQL for PostgreSQL, Private Service Connect connection is also supported.
General parameters
| Parameter name | Description |
|---|---|
| Authentication Method | password - Username/Password based authentication, google-psc - Private Service Connect for Google-managed postgres instances |
| Database | Postgres database name |
| Schema | Postgres schema |
Configuration settings depend on the selected authentication method.
Username/Password based authentication
| Parameter name | Description |
|---|---|
| Host | Postgres host |
| Port | Postgres port |
| Username | Postgres username |
| Password | Postgres password |
| SSL Mode | SSL mode for Postgres connection: disable,require,verify-ca,verify-full |
| SSL Server CA | SSL Certificate Authority for verify-ca,verify-full SSL Modes |
| SSL Client Cert | SSL Client Certificate for verify-ca,verify-full SSL Modes |
| SSL Client Key | SSL Client Key for verify-ca,verify-full SSL Modes |
Google Cloud Private Service Connect mode
| Parameter name | Description |
|---|---|
| Instance Connection Name | Google Cloud SQL instance connection name in the format project-name:region:instance-name |
Advanced: Private Service Connect for Google Cloud SQL
Private Service Connect (PSC) is a Google Cloud networking feature that allows to connect to Google manages services from multiple VPC networks that belong to different groups, teams, projects, or organizations.
Private Service Connect allows you to grant access to your Google Cloud SQL instances to the Jitsu service account without exposing the instances to the public internet. This is particularly useful for organizations that need to maintain strict security and access controls.
More on Private Service Connect
More on Private Service Connect for Cloud SQL
Configuring Google Cloud SQL
Enable Private Service Connect for a Cloud SQL Instance
Jitsu Project ID to provide in the allowed-psc-projects parameter: jitsu-cloud-infra
Configure Cloud SQL Instances for IAM Database Authentication
https://cloud.google.com/sql/docs/postgres/create-edit-iam-instances#console_1
Grant Permissions for Jitsu Service Account:
You need to grant the Jitsu service account permissions to connect to your Cloud SQL instances.
Contact Jitsu Support to get the service account email address.
Then complete the following steps to grant the necessary permissions:
- Open the Google Cloud Console
- Navigate to IAM & Admin > IAM
- Click "Grant Access"
- Add Jitsu service account email address as a Principal and assign roles:
cloudsql.instances.connect(Cloud SQL Client)cloudsql.instanceUser(Cloud SQL Instance User)
- Optional: Limit access to specific instances using Conditions: https://cloud.google.com/sql/docs/postgres/iam-conditions
Add Jitsu Service Account to a Cloud SQL Instance as a database user
Please refer to the following documentation to create a database user for Jitsu service account:
https://cloud.google.com/sql/docs/postgres/add-manage-iam-users#creating-a-database-user
Grant Database privileges
Keep in mind that database users created from service account use shortened domain name:
name@google-project.iam must be used instead of name@google-project.iam.gserviceaccount.com
Example:
-- Allow user to use and create objects in the public schema
GRANT USAGE, CREATE ON SCHEMA public TO "name@google-project.iam";Provide Jitsu with Cloud SQL Instance Details
- Connection Name
- Private Service Connect (PSC) - DNS name
- Private Service Connect (PSC) - Service attachment
You can get this information by running:
gcloud sql instances describe YOUR_CLOUD_SQL_INSTANCE_NAME --project=YOUR_PROJECT | grep "dnsName:\|connectionName\|pscServiceAttachmentLink"Alternatively, find these details in the Cloud SQL instance’s UI - Overview section.
Advanced: Implementation Details
This section describes how Jitsu implements various modes and features for Postgres.
Batch Mode
Algorithm
-- Jitsu collects events batches in tmp file on file system.
BEGIN -- start transaction
COPY from STDIN to tmp_table -- load tmp file into tmp_table
INSERT into target_table select from tmp_table
COMMIT -- commit transactionStream Mode
INSERT INTO target_table (...) VALUES (..)
Deduplication
For batch mode the following algorithm is used:
Algorithm
-- Jitsu collects events batches in tmp file on file system.
-- Deduplicate rows in tmp file
BEGIN -- start transaction
COPY from STDIN to tmp_table -- load tmp file into tmp_table
INSERT into target_table select from tmp_table ON CONFLICT UPDATE ...
COMMIT -- commit transactionFor stream mode:
INSERT INTO target_table (...) VALUES (..) ON CONFLICT UPDATE ...
Queries Optimization
Timestamp connection setting is used to optimize SELECT queries.
Regular index is created on specified timestamp column.