The process outlined in this document described how to configure Openbridge to automate reception of Adobe Analytics export feeds via SFTP, processing them and load them into a target database.
Step 1: Define Your Adobe Reporting Exports
The process of defining the extracts and scheduling delivery occurs within the Adobe website. In the Adobe management interface you can generate ad hoc data reports for ongoing or historical reporting data.
In the Adobe UI you build a query to filter your data and isolate specific feeds. The vast majority of requests take less than a day to process, but depending on the complexity of your query and the amount of data it can take longer to process.
The following are a few Adobe documents which describe how to setup Adobe to deliver exports:
The reports are delivered as CSV files via SFTP from Adobe to Openbridge.
Understanding Adobe Report Types
A report type is a user defined export. This means reports are defined according to the structure and data that was configured in the Adobe interface for export. For example, reports may be a bounces
export or it may be related to channels
, products
or visits
. End users have the flexibility to define these exports in Adobe as they see fit.
Defining What Is In Each Adobe Reporting Export
As we stated earlier the content of an export is defined by the user. The delivery format from Adobe will be uncompressed CSV files. The raw CSV files will have headers that look like this based on the entities the user set during the report export setup in Adobe:
|Date|Visits|Unique Visitors|Something (23) (event23)|RSID|
The headers as supplied are technically not valid for import into a database. In above example there are mixed case, special characters and spacing used for column names. Our system will automatically standardize the headers, among other things, supplied by Adobe to ensure that downstream databases can properly import the data.
Our cleanaup process will generate a resulting header and table that look like this:
|date|visits|unique_visitors|something_23_event23|RSID|
The a sample row of values associated with the header should follow typical CSV conventions:
"January 1, 2015",111292,12513,8139, myRSID
Did you notice the RSID in the output? We suggest ALWAYS including this unique identifier in the output. If you ever include multiple RSIDs in a feed, you would have no means to distinguish which results aligns with a given property. Also, any other unique identifiers that will help you filter or associate data across properties should also be considered
Step 2: Define Your Batch Data Pipeline For Each Reporting Export
For each report type export out of Adobe you will need to setup a delivery location, or data pipeline, within Openbridge so we can process and load the data.
You will need to setup a pipeline for each export from Adobe. This article details the process of setting up a batch data pipeline for services like Adobe: https://blog.openbridge.com/how-to-setup-a-batch-data-pipeline-for-csv-files-8c4d0cd7394b
Delivery should occur via SFTP. The standard host name is the following:
SFTP Hostname =
pipeline-01.openbridge.io
The following are the PORT numbers you will need to make sure are used by the system responsible for file delivery:
SFTP Port =
22 or 443
Credentials are provided during the pipeline setup process referenced above.
Organizing Adobe Reports Exports
Reporting exports have specific directories on the Openbridge system that correspond to the pipeline you configured in Step 2 below. As a result the output from Adobe must be delivered to the unique destination that aligns with the corresponding pipeline configuration. Also, you should use a unique filemask for exports that corresponds to the report suite and report type being sent. For example: "/(report_type)/(report_suite_report_type_YYYY-mm-dd).csv"
This would result in a unique destination and filemask would like this: "/adob_site_totals/mysuiteid_adob_site_totals_20150101.csv"
Why use the suite ID? in the event you are sending multiple RSID exports to the same report type pipeline, this will ensure you can audit which files belong to which report suite. Without this identifier it might make any auditing you perform more time consuming.
Testing Report Deliveries
Please make sure they are only putting sample or test data into the /testing
directory. Each pipeline comes with a /testing
directory to allow you to transfer files without Openbridge processing them. For example, you would tell Adobe to deliver your test files into the testing path:
/adobe_channel/testing
This means the test data will always be sent to like this:
/adobe_channel/testing/mytestfiletovalidate.csv
/adobe_channel/testing/mytestfiletovalidate2.csv
/adobe_channel/testing/mytestfiletova098.csv
/adobe_channel/testing/mytestvalidate.csv
/adobe_channel/testing/mytestforsuetocheck.csv
Example Delivery: Multiple Reporting Exports
The following shows a collection of user defined report exports each has data that was delivered by Adobe to Openbridge. For example;
/adobe_channel/mysuiteid_adobe_channel_20180101.csv
/adobe_pages/mysuiteid_adobe_pages_20180101.csv
/adobe_products/mysuiteid_adob_products_20180101.csv
/adobe_site_totals/mysuiteid_adob_site_totals_20180101.csv
/adobe_video/mysuiteid_adob_video_20180101.csv
/adobe_visits/mysuiteid_adob_visits_20180101.csv
In the example above the report export called adobe_channel
was defined as a single pipeline via the configuration process. This means all subsequent adobe_channel
exports would be delivered to the same location:
/adobe_channel/mysuiteid_adobe_channel_20180101.csv
/adobe_channel/mysuiteid_adobe_channel_20180102.csv
/adobe_channel/mysuiteid_adobe_channel_20180103.csv
/adobe_channel/mysuiteid_adobe_channel_20180104.csv
/adobe_channel/mysuiteid_adobe_channel_20180105.csv
The data would be loaded into a target table you defined in your batch setup process. You repeat this process for all your pipelines.
Loading Adobe Warehouse Data To The Database
Once the pipelines have been configured and delivery from Adobe has commenced feeds will arrive daily. For example, this shows daily files being delivered to a user defined report called adob_site_totals
;
"/adobe_site_totals/rsid_adob_site_tot_20180101.csv"
"/adobe_site_totals/rsid_adob_site_tot_20180102.csv"
"/adobe_site_totals/rsid_adob_site_tot_20180103.csv"
The system will process each CSV file according to the organizational structure the files are being delivered. For example, based on the above structure a table would be created called adob_site_totals
. This is based on the report type directory name.
Based on the CSV header residing in the report type folder adob_site_totals
columns are created which align with the schema of the file. In this example, the CSV contained the header:
date, visits, unique_visitors, something_23_event23, rsid
The CSV file will have values that would align to those headers. In this example the CSV file has following row values:
"January 8, 2015",111292,12513,8139,rsid
associated with each header item
date, visits, unique_visitors, something_23_event23, rsid
With each subsequent file delivery of for the adob_site_totals
report suite the same process would occur.This would result in new rows being added to adob_site_totals
:
date,visits,unique_visitors,something_23_event23,rsid
"January 1, 2015",111292,3256,2243,x
"January 2, 2015",171540,12385,5654,x
"January 3, 2015",131791,23423,8453,x
"January 4, 2015",81292,4513,1296,x
Access Credentials
Adobe supports standard username and password credential for SFTP. Openbridge will supply these to you to supply during the batch pipeline setup process.
Please note, that Adobe will only have access to the specific directory path in a given pipeline. They are "locked" there and only have access to deliver files to the target directory.
Delivery Frequency
You have the option of scheduling the frequency of your data warehouse with the Adobe website. We suggest that the feeds are scheduled daily, delivered between 3AM and 6AM daily.