# Data Pipelines
Ventryx pipelines let you move, transform, and route data between systems without writing custom ETL code. Define a source, apply transformations, and deliver the results to a supported destination in real time or on a schedule.
## Pipeline concepts
- Source — where data originates (an API endpoint, an event stream, a webhook, or a scheduled query)
- Transform — optional steps that filter, map, enrich, or validate data records
- Destination — where processed data is delivered (a database, a data warehouse, an external API, or a storage bucket)
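The three concepts compose linearly: every record flows from the source, through each transform in order, to the destination. A minimal Python sketch of that flow (the `run_pipeline`, `Record`, and `Transform` names are illustrative, not part of the Ventryx API):

```python
from typing import Callable, Optional

# A record is modeled as a dict; a transform takes a record and returns a
# (possibly modified) record, or None to drop it from the stream.
Record = dict
Transform = Callable[[Record], Optional[Record]]

def run_pipeline(source: list[Record],
                 transforms: list[Transform],
                 destination: list[Record]) -> None:
    """Push each source record through the transform chain, then deliver it."""
    for record in source:
        out: Optional[Record] = record
        for transform in transforms:
            out = transform(out)
            if out is None:  # a filter dropped the record
                break
        if out is not None:
            destination.append(out)

# Illustrative use: filter out small orders, then project two fields.
source = [{"order_id": 1, "amount": 250}, {"order_id": 2, "amount": 40}]
sink: list[Record] = []
run_pipeline(
    source,
    [
        lambda r: r if r["amount"] > 100 else None,
        lambda r: {"order_id": r["order_id"], "total": r["amount"]},
    ],
    sink,
)
# sink now holds only the large order, reduced to the mapped fields
```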
## Supported sources
| Source type | Description |
|---|---|
| Ventryx Events | Stream all or filtered platform events into a pipeline |
| Webhook ingestion | Accept POST payloads from any external service |
| Scheduled API pull | Poll an external REST endpoint on a cron schedule |
| Database CDC | Capture row-level changes from PostgreSQL or MySQL (Enterprise) |
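For illustration, a scheduled API pull source might be configured like this. Only the `type` discriminator follows the shape shown in the pipeline example later on this page; the `url` and `cron` field names are assumptions, not a documented schema:

```json
{
  "type": "scheduled_api_pull",
  "url": "https://api.example.com/v1/orders",
  "cron": "*/15 * * * *"
}
```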
## Supported destinations
| Destination | Type |
|---|---|
| AWS S3 | Object storage (JSON, CSV, Parquet) |
| BigQuery | Data warehouse |
| Snowflake | Data warehouse |
| PostgreSQL | Relational database |
| HTTP endpoint | Any REST API |
| Ventryx Events | Loop back as platform events |
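As a sketch, an S3 destination could look like the following; the `bucket`, `prefix`, and `format` field names are assumptions for illustration, with `format` standing in for the object-storage output formats listed above (JSON, CSV, Parquet):

```json
{
  "type": "s3",
  "bucket": "acme-data-lake",
  "prefix": "orders/",
  "format": "parquet"
}
```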
## Transforms
Transforms run in sequence: each transform receives the output of the previous step, so a record dropped by an earlier transform never reaches later steps or the destination.
### Pipeline definition (JSON)

```json
{
  "name": "Order events to BigQuery",
  "source": { "type": "ventryx_events", "filter": "order.*" },
  "transforms": [
    { "type": "filter", "condition": "data.amount > 100" },
    { "type": "map", "fields": { "order_id": "data.order_id", "total": "data.amount" } }
  ],
  "destination": {
    "type": "bigquery",
    "dataset": "prod_analytics",
    "table": "orders"
  }
}
```
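To make the `map` transform concrete, here is a hedged Python sketch of how its dotted field paths could resolve against an incoming platform event. The event shape and the `resolve`/`apply_map` helpers are assumptions for illustration, not part of the Ventryx runtime:

```python
from functools import reduce

def resolve(path: str, event: dict):
    """Follow a dotted path like 'data.order_id' into a nested dict."""
    return reduce(lambda obj, key: obj[key], path.split("."), event)

def apply_map(fields: dict, event: dict) -> dict:
    """Build an output record from {output_name: dotted_source_path}."""
    return {name: resolve(path, event) for name, path in fields.items()}

# An order.created platform event, shaped as the example pipeline assumes.
event = {"type": "order.created", "data": {"order_id": "ord_123", "amount": 250}}

# The same field mapping as the example pipeline above.
fields = {"order_id": "data.order_id", "total": "data.amount"}
row = apply_map(fields, event)
# row == {"order_id": "ord_123", "total": 250}
```

Each output column name on the left maps to a dotted path into the event on the right, which is why the BigQuery table in the example ends up with `order_id` and `total` columns.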