When we deliver data to a destination such as Azure Data Lake, BigQuery, AWS Athena, AWS Redshift, or Redshift Spectrum, we append additional metadata specific to each record. Your tables and views will include a series of system-generated fields that give users vital information about the data we collected on your behalf.
Not only does this provide critical context about a record, but it also simplifies queries and data modeling.
System Generated Metadata Fields
In addition to the fields retrieved from source files, your table will include a series of system-generated fields (all with a prefix of ob_*). Each of these fields is described below.
ob_date
- The date used in the request to retrieve data from the source system. This field is only present in non-batch integrations (e.g., Facebook).
ob_transaction_id
- A system-generated unique ID based on a hash of all field values in a row of data. This field is used to prevent duplicate data from being loaded into a table. The system generates the hash value for each row being loaded and compares it to all previously loaded data. If a matching record is found, the new row is excluded from the load process.
ob_file_name
- A system-generated field containing the Amazon S3 path and file name from which the data was loaded. This field can be useful for quality control, for example to validate that all data from a particular source file was loaded successfully.
ob_processed_at
- A system-generated field representing the timestamp at which file processing started. For some sources that provide lifetime metrics, this is the date for which the lifetime metrics are valid.
ob_modified_date
- A system-generated field representing the timestamp at which file processing started. For some sources that provide lifetime metrics, this is the date for which the lifetime metrics are valid.
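To make the duplicate check behind ob_transaction_id and the quality-control use of ob_file_name more concrete, here is a minimal Python sketch. It is an illustration only, not the actual load process: the hashing scheme (SHA-256 over a sorted JSON serialization of the row), the helper names row_hash and load_rows, the in-memory list standing in for a table, and the example S3 paths are all assumptions made for this example.

```python
import hashlib
import json
from collections import Counter


def row_hash(row: dict) -> str:
    """Hash all field values of a row.

    Assumed scheme for illustration: SHA-256 over the row serialized as
    JSON with sorted keys, so field order does not change the hash.
    """
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def load_rows(table: list, new_rows: list, file_name: str) -> int:
    """Append rows whose hash is not already in the table.

    Returns the number of rows actually loaded.
    """
    # Hashes of everything loaded so far.
    seen = {r["ob_transaction_id"] for r in table}
    loaded = 0
    for row in new_rows:
        tx_id = row_hash(row)
        if tx_id in seen:
            continue  # a matching record exists, so this row is excluded
        seen.add(tx_id)
        table.append({**row, "ob_transaction_id": tx_id, "ob_file_name": file_name})
        loaded += 1
    return loaded


# Hypothetical data and file paths, for illustration only.
table = []
first = load_rows(
    table,
    [{"id": 1, "clicks": 10}, {"id": 2, "clicks": 7}],
    "s3://example-bucket/day1.csv",
)
second = load_rows(
    table,
    [{"id": 2, "clicks": 7}, {"id": 3, "clicks": 4}],  # id 2 repeats day1
    "s3://example-bucket/day2.csv",
)

# Quality control: count how many rows were loaded from each source file.
rows_per_file = Counter(r["ob_file_name"] for r in table)
```

In this sketch the repeated row in the second file is skipped because its hash matches a previously loaded row, and grouping by ob_file_name shows how many rows each source file contributed. The same per-file count can be produced in your warehouse by grouping on ob_file_name.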