Transactional logs are manifested as two tables:
Transactions

Transactions reflect responses received from upstream source systems and downstream data destinations. Here are the columns:
Transaction events are typically captured at the API request level. For example, when calling the Amazon API for orders data, there would be a collection of events for each unique order-type data feed (e.g., mws_orders_by_last_update). This is because each API request may have different request limits, throttles, permissions, and retry constraints.
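Because constraints apply per feed rather than per pipeline, each feed's events carry their own limits. A minimal sketch of how such per-feed constraints might be modeled (the second feed name and all limit values are hypothetical; only mws_orders_by_last_update appears in the text):

```python
# Hypothetical per-feed constraints. Each feed type carries its own request
# limits and retry rules, which is why events are tracked per API request.
FEED_CONSTRAINTS = {
    "mws_orders_by_last_update": {   # feed named in the text above
        "max_requests_per_hour": 30,  # illustrative value
        "max_retries": 5,             # illustrative value
    },
    "example_feed": {                # hypothetical second feed
        "max_requests_per_hour": 15,
        "max_retries": 3,
    },
}

def constraints_for(feed_name):
    """Look up the limits that apply to a single feed's API requests."""
    return FEED_CONSTRAINTS[feed_name]
```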
Any error messages are supplied directly by the source or destination system, which allows you to look up possible reasons for an error. For example, if an authorization token has expired, an event would list the cause as:
"code":"UNAUTHORIZED","details":"Not authorized to access scope 441232344543454","requestId":"123234211223443"
In the event of an error, retries will occur for up to 8 hours. Once these initial retry attempts have failed, the event messages are transferred to a dead-letter queue (DLQ).
A DLQ is a holding queue for messages that cannot be processed as expected.
For example, if we get an AUTHORIZATION ERROR that prevents us from connecting to a source or destination, the event will ultimately be routed to a DLQ. Since these messages cannot be processed, they are stored for later retrieval and re-processing.
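The retry-then-DLQ flow described above can be sketched as follows. This is an illustrative model, not the actual implementation: the `attempt_delivery` callable, the fixed backoff, and the in-memory queue are all assumptions; only the 8-hour retry window comes from the text.

```python
import time
from collections import deque

RETRY_WINDOW_SECONDS = 8 * 60 * 60  # retries continue for up to 8 hours

dead_letter_queue = deque()  # holds messages whose retries were exhausted

def process_with_retries(message, attempt_delivery,
                         now=time.time, sleep=time.sleep, backoff=60):
    """Try to deliver a message, retrying until the 8-hour window closes.

    `attempt_delivery` is a hypothetical callable returning True on success.
    `now` and `sleep` are injectable so the window can be simulated in tests.
    """
    deadline = now() + RETRY_WINDOW_SECONDS
    while now() < deadline:
        if attempt_delivery(message):
            return True
        sleep(backoff)  # fixed backoff for illustration only
    # Initial retries exhausted: route the message to the DLQ so it can be
    # retrieved and re-processed later.
    dead_letter_queue.append(message)
    return False
```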
The re-processing is a secondary retry process that runs at intervals that vary by source or destination. For example, in the event of an AUTH error for Facebook, we will retry data collection once a day for 7 days after the initial failure.
Another example: if a BigQuery data destination has removed access permissions, all loading will fail. The load-attempt event messages will ultimately be placed into a DLQ because of the failed load attempts. The messages in the DLQ will be reattempted at various intervals, though for no more than 7 days. If the permission issues are not resolved, any message in the DLQ will expire and no longer be attempted.
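The DLQ re-processing and expiry behavior can be sketched like this. The function name, the tuple shape of DLQ entries, and the `attempt_delivery` callable are assumptions for illustration; the 7-day expiry window comes from the text.

```python
from datetime import datetime, timedelta, timezone

REPROCESS_WINDOW = timedelta(days=7)  # DLQ messages expire after 7 days

def reprocess_dlq(dlq, attempt_delivery, now):
    """One re-processing pass over the DLQ (e.g. run once a day).

    `dlq` is a list of (enqueued_at, message) tuples. Messages older than
    7 days expire and are dropped; the rest are retried via the
    hypothetical `attempt_delivery` callable. Returns the entries that are
    still pending after this pass.
    """
    still_pending = []
    for enqueued_at, message in dlq:
        if now - enqueued_at > REPROCESS_WINDOW:
            continue  # expired: no further attempts will be made
        if not attempt_delivery(message):
            still_pending.append((enqueued_at, message))
    return still_pending
```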
In the event of hard failures like the ones detailed above, meaning we cannot connect due to issues like authorization failures or long-term source/destination system outages, we expire the event message, and the trigger will "suspend" the pipeline until the customer can take action. In cases of prolonged failure, there is a risk that collected data cannot be recovered.
Metadata (ob_transactions_metadata_v1)
Metadata contains information about a specific pipeline: for example, the ID, when it was created, who created it, associated authorizations, the status of the pipeline, and connected data destinations.
The metadata is useful because you can join it to the transactions to identify the specific subscription ID an event is linked to. For example, if you observed an AUTH error in transactions, you can look up the details of the subscription ID in the metadata. The AUTH error may have occurred because a user made a breaking change to a subscription in our system (i.e., removed a permission).
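A minimal sketch of that join, assuming a shared subscription ID column (the column and row names are illustrative; only the table name ob_transactions_metadata_v1 appears in the text):

```python
# Hypothetical rows from the transactions table.
transactions = [
    {"subscription_id": "sub-1", "status": "AUTH_ERROR"},
    {"subscription_id": "sub-2", "status": "OK"},
]

# Hypothetical rows from ob_transactions_metadata_v1.
metadata = [
    {"subscription_id": "sub-1", "created_by": "alice", "pipeline_status": "suspended"},
    {"subscription_id": "sub-2", "created_by": "bob", "pipeline_status": "active"},
]

# Index the metadata by subscription ID so transactions can be joined to it.
meta_by_id = {row["subscription_id"]: row for row in metadata}

# For every transaction that reported an AUTH error, pull in the pipeline
# details (who created it, its current status) from the metadata.
auth_failures = [
    {**txn, **meta_by_id[txn["subscription_id"]]}
    for txn in transactions
    if txn["status"] == "AUTH_ERROR"
]
```

In a warehouse, the same lookup would be an ordinary SQL join on the subscription ID between the two tables.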