Once you have activated a data pipeline, it can take 24-48 hours for the first sync to occur. In some cases, the sync may start within a couple of hours or take as much as a couple of weeks (e.g., Amazon Pay Settlement Reports) due to how the source system generates data.

Types of Data Pipeline Schedules

There are four primary types of schedules:

  1. Daily

  2. Hourly

  3. Lookback

  4. Historical

Daily Schedules

With a daily schedule, we request data using a date offset. The offset keeps our requests aligned with when the source makes data available. For example, for Amazon we use a -1 offset: on 1/6/2021, we request data for 1/5/2021. Daily schedules are commonly used for reporting and insights APIs.
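As a minimal sketch of the offset arithmetic described above, the snippet below computes which date is requested on a given run date. The function name `daily_request_date` and its parameters are hypothetical and only illustrate the idea; they are not part of any product API.

```python
from datetime import date, timedelta


def daily_request_date(run_date: date, offset_days: int = -1) -> date:
    """Return the date whose data is requested when the schedule runs on run_date.

    offset_days is assumed to be a signed number of days (for example, -1).
    """
    return run_date + timedelta(days=offset_days)


# On 1/6/2021 with a -1 offset, the request is for 1/5/2021.
requested = daily_request_date(date(2021, 1, 6), offset_days=-1)
print(requested)  # 2021-01-05
```

A source that publishes data later than Amazon would use a larger negative offset under the same logic.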

Hourly Schedules

With an hourly schedule, we request data each hour for a one-hour window. For example, 1 AM to 2 AM, 2 AM to 3 AM, and so forth. Hourly schedules are typical for transactional systems such as orders or shipping data APIs. These APIs and connectors are often referred to as "real-time" or "near real-time".
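The hourly windows can be pictured as consecutive one-hour (start, end) pairs. The sketch below is illustrative only; `hourly_windows` is a hypothetical helper, and the example date is arbitrary.

```python
from datetime import datetime, timedelta


def hourly_windows(day_start: datetime, hours: int = 24):
    """Return a list of (start, end) pairs, one per hour, beginning at day_start."""
    return [
        (day_start + timedelta(hours=h), day_start + timedelta(hours=h + 1))
        for h in range(hours)
    ]


windows = hourly_windows(datetime(2021, 1, 6))

# The second window covers 1 AM to 2 AM, the third covers 2 AM to 3 AM.
start, end = windows[1]
print(start.strftime("%H:%M"), "to", end.strftime("%H:%M"))  # 01:00 to 02:00
```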

Lookback Schedules

Lookback schedules re-collect data for prior dates. These schedules often run in combination with go-forward daily processes. Some data sources update previous dates with new data. For example, on 10/1/2021, Amazon updated impression counts for 9/1/2021. Lookbacks are common for ad platforms, where performance attribution metrics (sales, impressions, clicks) can change after the original date.
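To make the idea of re-collecting prior dates concrete, the sketch below lists the dates a lookback run would request again. The 30-day window is an assumption chosen so the example reaches 9/1/2021; actual lookback windows vary by source, and `lookback_dates` is a hypothetical helper.

```python
from datetime import date, timedelta


def lookback_dates(run_date: date, window_days: int):
    """Return the prior dates to re-request, newest first, excluding run_date."""
    return [run_date - timedelta(days=d) for d in range(1, window_days + 1)]


# Assumed 30-day lookback window, run on 10/1/2021.
dates = lookback_dates(date(2021, 10, 1), window_days=30)

print(dates[0])   # 2021-09-30 (most recent prior date)
print(dates[-1])  # 2021-09-01 (oldest date in the window)
```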

Historical Schedules & Requests

The purpose of historical schedules is to recreate, by date, a snapshot of data in a source. An example would be requesting, on July 1, data for the prior 180 days. This creates requests to an API, per day, for 180 days of data. If the connector covers 5 reports, this equals 900 reports (180 x 5). The number of API requests needed to recreate those 180 days could easily exceed 3,000.
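The 180 x 5 arithmetic can be sketched by enumerating one request per day per report. The report names, the year, and the choice to exclude the run date itself are assumptions made for illustration; `historical_requests` is a hypothetical helper.

```python
from datetime import date, timedelta


def historical_requests(run_date: date, days_back: int, reports):
    """Return one (date, report) pair per prior day and per report."""
    return [
        (run_date - timedelta(days=d), report)
        for d in range(1, days_back + 1)
        for report in reports
    ]


# Hypothetical report names; the source only says the connector covers 5 reports.
reports = ["report_1", "report_2", "report_3", "report_4", "report_5"]
requests = historical_requests(date(2021, 7, 1), days_back=180, reports=reports)

print(len(requests))  # 900
```

Each of these 900 report requests may translate into more than one API call, which is consistent with the total exceeding 3,000 requests.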

Given the nature of historical requests, these jobs run as secondary processes to minimize impacts to "go forward" daily, hourly, or lookback schedules. They are optimized to run as long-running background schedules, carefully requesting prior dates as API capacity permits. For example, running historical schedules for Amazon retail reports requires close to 10,000 API requests for 12 months of data. If the API limit is 60 requests an hour, that is roughly 167 hours (about 7 days) even if every request slot went to the historical job. Because historical jobs only use capacity left over by go-forward schedules, this can take close to 4 weeks to complete.
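A rough duration estimate follows from dividing the request count by the hourly limit. In the sketch below, `capacity_share` is a hypothetical parameter for the fraction of the limit left over for historical jobs; the 25% value is purely an illustrative assumption, not a documented figure.

```python
import math


def estimate_backfill_days(total_requests: int,
                           requests_per_hour: float,
                           capacity_share: float = 1.0) -> float:
    """Estimate days needed to issue total_requests under an hourly API limit.

    capacity_share is the assumed fraction of the limit available to the job.
    """
    effective_rate = requests_per_hour * capacity_share
    hours = math.ceil(total_requests / effective_rate)
    return hours / 24


# All 60 requests/hour go to the historical job: about 7 days.
full_capacity = estimate_backfill_days(10_000, 60)

# Assumed 25% of capacity left for historical requests: close to 4 weeks.
shared_capacity = estimate_backfill_days(10_000, 60, capacity_share=0.25)

print(round(full_capacity, 1), round(shared_capacity, 1))
```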

Due to the nature of recreating, by day, snapshots of historical data, these requests can take many days or weeks to complete.

Additional Information For Schedules & Timing

For more information on the topic of Data Source and Destination automation and timing, see Key Considerations For Data Source and Destination Automation Timing.
