Data Pipelines

Ventryx pipelines let you move, transform, and route data between systems without writing custom ETL code. Define sources, apply transformations, and deliver to any destination in real time or on a schedule.

Pipeline concepts

  • Source — where data originates (an API endpoint, an event stream, a webhook, or a scheduled query)
  • Transform — optional steps that filter, map, enrich, or validate data records
  • Destination — where processed data is delivered (a database, a data warehouse, an external API, or a storage bucket)

Supported sources

Source type          Description
Ventryx Events       Stream all or filtered platform events into a pipeline
Webhook ingestion    Accept POST payloads from any external service
Scheduled API pull   Poll an external REST endpoint on a cron schedule
Database CDC         Capture row-level changes from PostgreSQL or MySQL (Enterprise)
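For webhook ingestion, an external service simply POSTs JSON to the pipeline's ingestion URL. A minimal sketch, assuming a hypothetical ingestion URL (the real path comes from your pipeline's webhook configuration):

```python
import json
import urllib.request

# Hypothetical ingestion URL; substitute the one shown for your pipeline.
INGEST_URL = "https://ingest.example.com/pipelines/orders/webhook"

payload = {"event": "order.created", "data": {"order_id": "ord_123", "amount": 250}}

req = urllib.request.Request(
    INGEST_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```

Any HTTP client works; the only requirements are a POST method and a JSON body.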

Supported destinations

Destination      Type
AWS S3           Object storage (JSON, CSV, Parquet)
BigQuery         Data warehouse
Snowflake        Data warehouse
PostgreSQL       Relational database
HTTP endpoint    Any REST API
Ventryx Events   Loop back as platform events

Transforms

Transforms are applied in sequence. Each transform receives the output of the previous step:

Pipeline definition (JSON)
{
  "name": "Order events to BigQuery",
  "source": { "type": "ventryx_events", "filter": "order.*" },
  "transforms": [
    { "type": "filter", "condition": "data.amount > 100" },
    { "type": "map", "fields": { "order_id": "data.order_id", "total": "data.amount" } }
  ],
  "destination": {
    "type": "bigquery",
    "dataset": "prod_analytics",
    "table": "orders"
  }
}
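To make the sequencing concrete, here is a minimal sketch of how the two transforms in the definition above behave, with each step receiving the output of the previous one. The `apply_filter` and `apply_map` helpers are illustrative stand-ins, not the platform's internal implementation:

```python
def apply_filter(records, condition):
    # The condition from the example is hard-coded here for clarity;
    # the real engine evaluates the "condition" expression string.
    return [r for r in records if r["data"]["amount"] > 100]

def apply_map(records, fields):
    # fields maps an output column name to a dotted path into the record
    def lookup(record, path):
        value = record
        for key in path.split("."):
            value = value[key]
        return value
    return [{out: lookup(r, path) for out, path in fields.items()} for r in records]

events = [
    {"data": {"order_id": "ord_1", "amount": 250}},
    {"data": {"order_id": "ord_2", "amount": 40}},
]

rows = apply_map(
    apply_filter(events, "data.amount > 100"),
    {"order_id": "data.order_id", "total": "data.amount"},
)
# rows == [{"order_id": "ord_1", "total": 250}]
```

The filter step drops the 40-unit order, and the map step reshapes the surviving record into the two columns the BigQuery table expects.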