Apify Dataset

Overview

Apify is a web scraping and web automation platform. It provides ready-made and custom solutions, an open-source SDK for web scraping, proxies, and other tools to help you build and run web automation jobs at scale. The results of a scraping job are usually stored in an Apify dataset. This connector automatically syncs the contents of a dataset to your chosen destination. To sync data from a dataset, all you need is its ID, which you can find in the Apify Console under storages.

This source uses the Airbyte Docker image (@airbyte/source-apify-dataset). Learn more about how Airbyte-based sources work.

How to connect

Obtain the Apify dataset ID.
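
Before configuring the connector, it can help to confirm that the ID you copied points at the dataset you expect. The sketch below only builds a preview URL; it assumes the public Apify API v2 "dataset items" endpoint and its format, clean, and limit query parameters. The helper name dataset_items_url and the ID my-dataset-id are hypothetical, and a private dataset would additionally need an API token, which is not shown here.

```python
from urllib.parse import quote, urlencode

# Assumption: base URL of the public Apify API v2.
APIFY_API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, clean: bool = False, limit: int = 5) -> str:
    """Build a URL that previews the first few items of a dataset (hypothetical helper)."""
    if not dataset_id:
        raise ValueError("dataset_id must be a non-empty string")
    # Query parameters assumed from the Apify API docs: format, clean, limit.
    query = urlencode({"format": "json", "clean": str(clean).lower(), "limit": limit})
    # Percent-encode the ID so it is safe to place in the URL path.
    return f"{APIFY_API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{query}"


# "my-dataset-id" is a placeholder; substitute the ID from the Apify Console.
print(dataset_items_url("my-dataset-id"))
```

Opening the printed URL in a browser or with an HTTP client should return a small JSON array if the ID is valid and the dataset is accessible.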

Connection Parameters

Parameter | Type | Required | Documentation
datasetId | string | yes | ID of the dataset you would like to load to Airbyte.
clean | boolean | no | If set to true, only clean items will be downloaded from the dataset. See the description of what clean means in the Apify API docs. If not sure, set clean to false.
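
As an illustration of how the two parameters fit together, here is a minimal sketch that assembles and validates them as a plain mapping. The function build_source_config is a hypothetical helper, not part of Jitsu or Airbyte, and the way the resulting mapping is passed to the connector (UI form, YAML, or API) is not shown because it depends on your setup.

```python
from typing import Any, Dict


def build_source_config(dataset_id: str, clean: bool = False) -> Dict[str, Any]:
    """Assemble the connection parameters for the source (hypothetical helper)."""
    # datasetId is required and must be a non-empty string.
    if not isinstance(dataset_id, str) or not dataset_id.strip():
        raise ValueError("datasetId is required and must be a non-empty string")
    # clean is optional; the documentation suggests false when unsure.
    if not isinstance(clean, bool):
        raise TypeError("clean must be a boolean")
    return {"datasetId": dataset_id.strip(), "clean": clean}


# "my-dataset-id" is a placeholder value.
config = build_source_config("my-dataset-id")
print(config)
```

Defaulting clean to false mirrors the guidance in the table: the connector then downloads the dataset items as stored, without filtering.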