Skip to main content
File Naming Overview
Openbridge Support avatar
Written by Openbridge Support
Updated this week

Our file naming convention is designed to generate unique and traceable names for all file objects based on the transaction ID and the job’s scheduled run time. This consistent approach ensures that each file can be easily identified and managed.

Transaction ID Calculation

The transaction ID is a key component of our naming scheme. It is computed using the following Python snippet:

tx_raw = str(subscription_id) + str(runtime) + str(payload_id)
return hashlib.sha1(tx_raw.encode('utf8')).hexdigest()

This code concatenates the subscription ID, runtime, and payload ID, then applies SHA-1 hashing to produce a unique transaction ID.

File Name Generation

Once the transaction ID is determined, the complete file name is generated using this code:

file_id = '{tx_id}-{uuid}_{start_date}'.format(
tx_id=id.split(':', 1)[-1],
uuid=hashlib.sha256(self.name.encode('utf8')).hexdigest()[:5],
start_date=context['start_date'].strftime('%Y%m%d%H%M%S')
)

Breakdown of Components:

  • Transaction ID (tx_id): The code splits the combined identifier to extract only the transaction ID, removing the subscription component for clarity.

  • UUID Fragment (uuid): A portion of the SHA-256 hash of the file’s base name. This ensures that files generated by the same payload receive distinct names.

  • Start Date (start_date): Formatted as YYYYMMDDHHMMSS, this timestamp reflects either the attribution date or hour, depending on the job type. It corresponds to the report date or data date inherent to the payload.

File Chunking and Extensions

If a file is large, the system automatically splits it into multiple chunks. Each chunk receives:

  • A Sequential Suffix: A five-digit index appended to the file name (e.g., .00000, .00001, etc.).

  • A File Extension: Currently, supported extensions include .gz for compressed files or .SNAPPY.parquet for Parquet files using Snappy compression.

Understanding Payloads

payload represents the definition of a report or upstream dataset, which is defined by the source system. For example, an “Amazon Vendor Central Sales Report” would be a specific payload as defined by Amazon. Amazon will also offer a "Net PPM report", which is a different payload from Sales. Each payload influences file naming by contributing to its uniqueness through the transaction ID and UUID.

Post-Processing and Storage

Once a file name is generated:

  • Immutability: The file name remains unchanged after creation and is copied under a specific prefix to a configured destination.

  • Storage Agnosticism: This process is consistent across all storage systems where applicable.

This naming logic applies uniformly, including for historical data or “lookback” requests, regardless of the report date or request type.

Following this standardized naming process, our system ensures that every file object is uniquely identifiable, traceable, and ready for seamless integration with various storage and retrieval operations.

Did this answer your question?