When we deliver data to a destination such as Azure Data Lake, BigQuery, AWS Athena, AWS Redshift, or Redshift Spectrum, we append metadata to each record. Your tables and views will include a series of system-generated fields that describe how and when the data we collected on your behalf was retrieved and loaded.

Not only does this provide critical context about a record, but it also simplifies queries and data modeling.

System Generated Metadata Fields

In addition to the fields retrieved from source files, your table will include a series of system-generated fields, all prefixed with ob_. Each of these fields is described below.

ob_date - The date that was used in the request to retrieve data from the source system. This field is only present in non-batch integrations (e.g., Facebook).

ob_transaction_id - A system-generated unique ID based on the hash of all field values for each row of data. This field is used to prevent duplicate data from being loaded into a table. The system generates the hash value for each row being loaded and compares it to all previously loaded data. If a matching record is found, the new row is excluded from the load process.
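
The deduplication behavior described above can be illustrated with a minimal Python sketch. It is not the actual implementation: the hash algorithm (SHA-256 here), the field ordering, the separator, and the helper names row_hash and load_new_rows are all assumptions made for illustration.

```python
import hashlib


def row_hash(row: dict) -> str:
    """Hash all field values of a row (assumed scheme: SHA-256 over sorted fields)."""
    # Sort by field name so identical rows always produce the same hash.
    canonical = "\x1f".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def load_new_rows(incoming: list, loaded_ids: set) -> list:
    """Return only rows not seen before, tagging each with ob_transaction_id."""
    accepted = []
    for row in incoming:
        transaction_id = row_hash(row)
        if transaction_id in loaded_ids:
            continue  # an identical row was already loaded, so exclude it
        loaded_ids.add(transaction_id)
        accepted.append({**row, "ob_transaction_id": transaction_id})
    return accepted


# Hypothetical sample data: the second load repeats one row from the first.
loaded_ids = set()
first = load_new_rows(
    [{"campaign": "a", "clicks": 10}, {"campaign": "b", "clicks": 3}], loaded_ids
)
second = load_new_rows(
    [{"campaign": "a", "clicks": 10}, {"campaign": "a", "clicks": 11}], loaded_ids
)
```

In this sketch the repeated row is dropped from the second load, while the row whose clicks value changed is kept, because any change to a field value produces a different hash.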

ob_file_name - A system-generated field containing the (Amazon S3) path and file name from which the data was loaded. This field can be useful for quality control purposes, for example to validate that all data from a particular source file was loaded successfully.
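
As a rough sketch of that kind of quality-control check, the Python below counts loaded rows per ob_file_name and compares them with the row counts you expect from each source file. The file paths, the expected counts, and the helper names rows_per_file and find_mismatches are hypothetical; the exact format of the ob_file_name value in your tables may differ.

```python
from collections import Counter


def rows_per_file(rows: list) -> Counter:
    """Count loaded rows grouped by their ob_file_name value."""
    return Counter(row["ob_file_name"] for row in rows)


def find_mismatches(rows: list, expected_counts: dict) -> dict:
    """Map each file whose loaded row count differs to (expected, actual)."""
    actual = rows_per_file(rows)
    return {
        name: (expected, actual.get(name, 0))
        for name, expected in expected_counts.items()
        if actual.get(name, 0) != expected
    }


# Hypothetical rows as they might be read back from a destination table.
rows = [
    {"order_id": 1, "ob_file_name": "s3://example-bucket/orders/2024-01-01.csv"},
    {"order_id": 2, "ob_file_name": "s3://example-bucket/orders/2024-01-01.csv"},
    {"order_id": 3, "ob_file_name": "s3://example-bucket/orders/2024-01-02.csv"},
]

# Hypothetical row counts taken from the source files themselves.
expected_counts = {
    "s3://example-bucket/orders/2024-01-01.csv": 2,
    "s3://example-bucket/orders/2024-01-02.csv": 3,
}

mismatches = find_mismatches(rows, expected_counts)
```

Here mismatches would flag only the second file, which is expected to contain three rows but has one loaded. Keep in mind that rows excluded as duplicates through ob_transaction_id can also lower the loaded count for a file.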

ob_processed_at - A system-generated field representing the timestamp at which file processing started. For some sources that provide lifetime metrics, this timestamp represents the date for which the lifetime metrics are valid.

ob_modified_date - A system-generated field representing the timestamp at which file processing started. For some sources that provide lifetime metrics, this timestamp represents the date for which the lifetime metrics are valid.
