The Data Pipeline API provides customers with a webhook for inserting event data into a data warehouse table via HTTP. You receive a webhook when a Data Pipeline API subscription is created in your account.

How Does It Work?

Each webhook is mapped to one database and a single table. Once an endpoint is established and data has been sent, all later data is expected to follow the same schema. To submit data to multiple tables, you must provision multiple webhooks.

When our system receives an event at your webhook, it parses the event and routes the data to the target database and table. The system is designed to accept requests containing JSON data. The data should be formatted as a single JSON object, like this:

{
"field1": "value1",
"field2": 2,
"field3": true,
"field4": null
}

Send the data to the webhook in a POST request. Before sending the request, you can check that the JSON is valid with a JSON validation tool.

Please note, the system expects to receive a single event per request.
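As a minimal sketch of the format rules above, the following Python helper serializes one event into a JSON request body and rejects anything that is not a single object. The function name encode_event is hypothetical and not part of the API; only the standard library is used.

```python
import json


def encode_event(event):
    """Serialize one event as a UTF-8 JSON object, ready to use as a POST body."""
    # The system expects a single event per request, so lists of events
    # and bare values are rejected before anything is sent.
    if not isinstance(event, dict):
        raise TypeError("an event must be a single JSON object (a dict)")
    # allow_nan=False makes json.dumps raise on NaN/Infinity,
    # which are not valid JSON values.
    return json.dumps(event, allow_nan=False).encode("utf-8")


body = encode_event({"field1": "value1", "field2": 2, "field3": True, "field4": None})
print(body.decode("utf-8"))
```

Serializing through a JSON library rather than building the string by hand also takes care of quoting, which is the most common source of invalid payloads.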

Setting Up Your Database and Table

Initializing the output table

The system expects the same data definition to be sent to a given webhook every time. The data format (number of fields, data types, etc.) is set by the first event that is sent and cannot be changed. For this reason, we recommend priming the API with an event that contains data in all fields.

IMPORTANT: If an event with a different schema is sent to a previously initialized webhook, a new version of the table will be created. Once this table is initialized, the process repeats: if a new type of event is sent, another table version will be created.
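Because the first event fixes the table definition, it can help to check the priming event before sending it. The sketch below is a hypothetical client-side helper (unprimed_fields is not part of the API) that lists the top-level fields whose value is null. It assumes the recommendation above means every top-level field should carry a real value, and it does not look inside nested objects.

```python
def unprimed_fields(event):
    """Return the names of top-level fields whose value is None (JSON null)."""
    return [name for name, value in event.items() if value is None]


# Example priming event with data in every field.
first_event = {
    "id": 123,
    "location": "USA",
    "event_time": "2018-03-01 13:00:00",
    "created_date": "2018-03-01",
    "is_valid": True,
    "record_info": {"order": 1},
}

missing = unprimed_fields(first_event)
if missing:
    print("Fill these fields before priming the webhook:", ", ".join(missing))
else:
    print("All fields contain data; the event is suitable for priming.")
```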

Code examples

Any method of sending POST requests to an HTTP endpoint will be sufficient. See the following cURL example:

curl -H "Content-Type: application/json" \
-d '{ "id": "123", "location": "USA", "event_time": "2018-03-01 13:00:00", "created_date": "2018-03-01", "is_valid": true, "record_info": { "order": 1 } }' \
<YOUR_WEBHOOK_URL>

This sends the JSON data in the -d parameter to the API.

Feel free to use any HTTP client or create your own. If there are no issues with the request, you can expect to receive a 204 No Content response.
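For readers who prefer a scripted client, here is a rough Python equivalent using only the standard library. The names build_request and send_event and the URL in the usage comment are placeholders introduced for illustration; substitute the webhook URL from your own subscription.

```python
import json
import urllib.request


def build_request(url, event):
    """Build a POST request carrying one event as a JSON body."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(url, event, timeout=10):
    """Send one event and return True if the response is 204 No Content.

    urllib raises urllib.error.HTTPError for 4xx and 5xx responses,
    so callers may want to catch that exception as well.
    """
    request = build_request(url, event)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 204


# Usage (placeholder URL, replace with your own webhook):
# ok = send_event("https://example.invalid/your-webhook", {"id": "123", "location": "USA"})
```

Keeping request construction separate from sending makes it possible to inspect the method, headers, and body without touching the network.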

Given this information, consider the following event:

{
"id": 123,
"location": "USA",
"event_time": "2018-03-01 13:00:00",
"created_date": "2018-03-01",
"is_valid": true,
"record_info": { "order": 1 }
}

The data type of each Redshift column in the resulting table is dependent on the type received in the data. The following column types can be expected from an event:

This would generate the following table: 

Now that the table has been generated, consider two more events:

{
"id": 321,
"location": null,
"event_time": "2018-03-01 14:20:00",
"created_date": "2018-03-01",
"is_valid": false,
"record_info": null
}

{
"id": 456,
"location": "GER",
"event_time": null,
"created_date": null,
"is_valid": true,
"record_info": { "order": 2 }
}

Because the table has already been initialized, the NULLs will be accepted normally. The resulting table will be as follows:
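To avoid creating a new table version by accident, a client can compare each outgoing event with the event that initialized the table. The sketch below is an approximation under stated assumptions: the service's exact rules for deciding that a schema has changed are not documented here, so this check only compares top-level field names and broad JSON types, and it treats null as compatible with any type. The names json_type and schema_mismatches are hypothetical.

```python
def json_type(value):
    """Map a Python value to a broad JSON type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int because bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    raise TypeError(f"not a JSON value: {value!r}")


def schema_mismatches(reference, event):
    """List the differences between an event and the reference (priming) event."""
    problems = []
    for name in sorted(set(reference) | set(event)):
        if name not in event:
            problems.append(f"missing field: {name}")
        elif name not in reference:
            problems.append(f"unexpected field: {name}")
        else:
            expected = json_type(reference[name])
            actual = json_type(event[name])
            # Assumption: a null on either side is compatible with any type.
            if "null" not in (expected, actual) and expected != actual:
                problems.append(f"type change in {name}: {expected} to {actual}")
    return problems


reference = {
    "id": 123,
    "location": "USA",
    "event_time": "2018-03-01 13:00:00",
    "created_date": "2018-03-01",
    "is_valid": True,
    "record_info": {"order": 1},
}
later = {
    "id": 321,
    "location": None,
    "event_time": "2018-03-01 14:20:00",
    "created_date": "2018-03-01",
    "is_valid": False,
    "record_info": None,
}
print(schema_mismatches(reference, later))
```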

Frequently Asked Questions

Can I use GET requests to query the database via the webhook?

No. The API only consumes POST requests, so its only function is data insertion. It does not support querying, deletion, or database administration tasks.

Can I send non-JSON payloads?

No, only JSON-formatted payloads are accepted. No other data formats are currently supported. 

What is the max size of the JSON payload?

The payload must be less than 10 MB. Only one event can be processed from a single payload. 

Can I send multiple JSON events in a single payload?

No. To process multiple events, send each one in its own POST request.
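When a client collects events in batches, the two limits above can be applied before anything is sent. This sketch turns a list of events into one request body per event and rejects any body that reaches the size limit. The name encode_events is hypothetical, and the limit is taken conservatively as 10,000,000 bytes because the exact definition of 10 MB is not specified.

```python
import json

# Assumption: "10 MB" is interpreted as 10,000,000 bytes, the stricter reading.
MAX_PAYLOAD_BYTES = 10_000_000


def encode_events(events):
    """Return one JSON body per event, each intended for its own POST request."""
    bodies = []
    for index, event in enumerate(events):
        body = json.dumps(event).encode("utf-8")
        # The payload must be strictly smaller than the limit.
        if len(body) >= MAX_PAYLOAD_BYTES:
            raise ValueError(f"event {index} is {len(body)} bytes, which exceeds the limit")
        bodies.append(body)
    return bodies


bodies = encode_events([
    {"id": 321, "is_valid": False},
    {"id": 456, "is_valid": True},
])
print(len(bodies), "separate request bodies")
```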
