# Data Pipelines
Ventryx pipelines let you move, transform, and route data between systems without writing custom ETL code. Define a source, apply transformations, and deliver the results to a supported destination in real time or on a schedule.
## Pipeline concepts
- Source — where data originates (an API endpoint, an event stream, a webhook, or a scheduled query)
- Transform — optional steps that filter, map, enrich, or validate data records
- Destination — where processed data is delivered (a database, a data warehouse, an external API, or a storage bucket)
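The three concepts compose linearly: every record flows from the source, through each transform in order, to the destination. A minimal Python sketch of that flow (the `run_pipeline`, `Record`, and `Transform` names are illustrative, not part of the Ventryx API):

```python
from typing import Callable, Optional

# A record is modeled as a dict; a transform takes a record and returns a
# (possibly modified) record, or None to drop it from the stream.
Record = dict
Transform = Callable[[Record], Optional[Record]]

def run_pipeline(source: list[Record],
                 transforms: list[Transform],
                 destination: list[Record]) -> None:
    """Push each source record through the transform chain, then deliver it."""
    for record in source:
        out: Optional[Record] = record
        for transform in transforms:
            out = transform(out)
            if out is None:  # a filter dropped the record
                break
        if out is not None:
            destination.append(out)

# Illustrative use: filter out small orders, then project two fields.
source = [{"order_id": 1, "amount": 250}, {"order_id": 2, "amount": 40}]
sink: list[Record] = []
run_pipeline(
    source,
    [
        lambda r: r if r["amount"] > 100 else None,
        lambda r: {"order_id": r["order_id"], "total": r["amount"]},
    ],
    sink,
)
# sink now holds only the large order, reduced to the mapped fields
```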
## Supported sources
| Source type | Description |
|---|---|
| Ventryx Events | Stream all or filtered platform events into a pipeline |
| Webhook ingestion | Accept POST payloads from any external service |
| Scheduled API pull | Poll an external REST endpoint on a cron schedule |
| Database CDC | Capture row-level changes from PostgreSQL or MySQL (Enterprise) |
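For illustration, a scheduled API pull source might be configured like this. Only the `type` discriminator follows the shape shown in the pipeline example later on this page; the `url` and `cron` field names are assumptions, not a documented schema:

```json
{
  "type": "scheduled_api_pull",
  "url": "https://api.example.com/v1/orders",
  "cron": "*/15 * * * *"
}
```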
## Supported destinations
| Destination | Type |
|---|---|
| AWS S3 | Object storage (JSON, CSV, Parquet) |
| BigQuery | Data warehouse |
| Snowflake | Data warehouse |
| PostgreSQL | Relational database |
| HTTP endpoint | Any REST API |
| Ventryx Events | Loop back as platform events |
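As a sketch, an S3 destination could look like the following; the `bucket`, `prefix`, and `format` field names are assumptions for illustration, with `format` standing in for the object-storage output formats listed above (JSON, CSV, Parquet):

```json
{
  "type": "s3",
  "bucket": "acme-data-lake",
  "prefix": "orders/",
  "format": "parquet"
}
```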
## Transforms
Transforms run in sequence: each transform receives the output of the previous step, so a record dropped by an earlier transform never reaches later steps or the destination.
### Pipeline definition (JSON)

```json
{
  "name": "Order events to BigQuery",
  "source": { "type": "ventryx_events", "filter": "order.*" },
  "transforms": [
    { "type": "filter", "condition": "data.amount > 100" },
    { "type": "map", "fields": { "order_id": "data.order_id", "total": "data.amount" } }
  ],
  "destination": {
    "type": "bigquery",
    "dataset": "prod_analytics",
    "table": "orders"
  }
}
```
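To make the `map` transform concrete, here is a hedged Python sketch of how its dotted field paths could resolve against an incoming platform event. The event shape and the `resolve`/`apply_map` helpers are assumptions for illustration, not part of the Ventryx runtime:

```python
from functools import reduce

def resolve(path: str, event: dict):
    """Follow a dotted path like 'data.order_id' into a nested dict."""
    return reduce(lambda obj, key: obj[key], path.split("."), event)

def apply_map(fields: dict, event: dict) -> dict:
    """Build an output record from {output_name: dotted_source_path}."""
    return {name: resolve(path, event) for name, path in fields.items()}

# An order.created platform event, shaped as the example pipeline assumes.
event = {"type": "order.created", "data": {"order_id": "ord_123", "amount": 250}}

# The same field mapping as the example pipeline above.
fields = {"order_id": "data.order_id", "total": "data.amount"}
row = apply_map(fields, event)
# row == {"order_id": "ord_123", "total": 250}
```

Each output column name on the left maps to a dotted path into the event on the right, which is why the BigQuery table in the example ends up with `order_id` and `total` columns.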