All Collections
Data Sources
Adobe
Setting Up Adobe Data Warehouse Feeds (ADW)
Setting Up Adobe Data Warehouse Feeds (ADW)
Openbridge Support avatar
Written by Openbridge Support
Updated over a week ago

The process outlined in this document described how to configure Openbridge to automate reception of Adobe Analytics export feeds via SFTP, processing them and load them into a target database.

Step 1:  Define Your Adobe Reporting Exports

The process of defining the extracts and scheduling delivery occurs within the Adobe website. In the Adobe management interface you can generate ad hoc data reports for ongoing or historical reporting data. 

In the Adobe UI you build a query to filter your data and isolate specific feeds. The vast majority of requests take less than a day to process, but depending on the complexity of your query and the amount of data it can take longer to process.

The following are a few Adobe documents which describe how to setup Adobe to deliver exports:

The reports are delivered as CSV files via SFTP from Adobe to Openbridge.

Understanding Adobe Report Types

A report type is a user defined export. This means reports are defined according to the structure and data that was configured in the Adobe interface for export. For example, reports may be a bounces export or it may be related to channels, products or visits. End users have the flexibility to define these exports in Adobe as they see fit.

Defining What Is In Each Adobe Reporting Export

As we stated earlier the content of an export is defined by the user. The delivery format from Adobe will be uncompressed CSV files. The raw CSV files will have headers that look like this based on the entities the user set during the report export setup in Adobe:

|Date|Visits|Unique Visitors|Something (23) (event23)|RSID|

The headers as supplied are technically not valid for import into a database. In above example there are mixed case, special characters and spacing used for column names. Our system will automatically standardize the headers, among other things, supplied by Adobe to ensure that downstream databases can properly import the data.

Our cleanaup process will generate a resulting header and table that look like this:

|date|visits|unique_visitors|something_23_event23|RSID|

The a sample row of values associated with the header should follow typical CSV conventions:

"January 1, 2015",111292,12513,8139, myRSID

Did you notice the RSID in the output? We suggest ALWAYS including this unique identifier in the output. If you ever include multiple RSIDs in a feed, you would have no means to distinguish which results aligns with a given property. Also, any other unique identifiers that will help you filter or associate data across properties should also be considered

Step 2:  Define Your Batch Data Pipeline For Each Reporting Export

For each report type export out of Adobe you will need to setup a delivery location, or data pipeline, within Openbridge so we can process and load the data. 

You will need to setup a pipeline for each export from Adobe. This article details the process of setting up a batch data pipeline for services like Adobe: https://blog.openbridge.com/how-to-setup-a-batch-data-pipeline-for-csv-files-8c4d0cd7394b

Delivery should occur via SFTP. The standard host name is the following:

  • SFTP Hostname = pipeline-01.openbridge.io

The following are the PORT numbers you will need to make sure are used by the system responsible for file delivery:

  • SFTP Port = 22 or 443

Credentials are provided during the pipeline setup process referenced above.

Organizing Adobe Reports Exports

Reporting exports have specific directories on the Openbridge system that correspond to the pipeline you configured in Step 2 below. As a result the output from Adobe must be delivered to the unique destination that aligns with the corresponding pipeline configuration. Also, you should use a unique filemask for exports that corresponds to the report suite and report type being sent. For example: "/(report_type)/(report_suite_report_type_YYYY-mm-dd).csv"

This would result in a unique destination and filemask would like this: "/adob_site_totals/mysuiteid_adob_site_totals_20150101.csv"

Why use the suite ID? in the event you are sending multiple RSID exports to the same report type pipeline, this will ensure you can audit which files belong to which report suite. Without this identifier it might make any auditing you perform more time consuming.

Testing Report Deliveries

Please make sure they are only putting sample or test data into the /testing  directory. Each pipeline comes with a /testing  directory to allow you to transfer files without Openbridge processing them. For example, you would tell Adobe to deliver your test files into the testing path:

/adobe_channel/testing

This means the test data will always be sent to like this:

/adobe_channel/testing/mytestfiletovalidate.csv
/adobe_channel/testing/mytestfiletovalidate2.csv
/adobe_channel/testing/mytestfiletova098.csv
/adobe_channel/testing/mytestvalidate.csv
/adobe_channel/testing/mytestforsuetocheck.csv

Example Delivery: Multiple Reporting Exports

The following shows a collection of user defined report exports each has data that was delivered by Adobe to Openbridge. For example;

/adobe_channel/mysuiteid_adobe_channel_20180101.csv
/adobe_pages/mysuiteid_adobe_pages_20180101.csv
/adobe_products/mysuiteid_adob_products_20180101.csv
/adobe_site_totals/mysuiteid_adob_site_totals_20180101.csv
/adobe_video/mysuiteid_adob_video_20180101.csv
/adobe_visits/mysuiteid_adob_visits_20180101.csv

In the example above the report export called adobe_channel was defined as a single pipeline via the configuration process. This means all subsequent adobe_channel exports would be delivered to the same location:

/adobe_channel/mysuiteid_adobe_channel_20180101.csv
/adobe_channel/mysuiteid_adobe_channel_20180102.csv
/adobe_channel/mysuiteid_adobe_channel_20180103.csv
/adobe_channel/mysuiteid_adobe_channel_20180104.csv
/adobe_channel/mysuiteid_adobe_channel_20180105.csv

The data would be loaded into a target table you defined in your batch setup process. You repeat this process for all your pipelines.

Loading Adobe Warehouse Data To The Database

Once the pipelines have been configured and delivery from Adobe has commenced feeds will arrive daily. For example, this shows daily files being delivered to a user defined report called adob_site_totals;

"/adobe_site_totals/rsid_adob_site_tot_20180101.csv"
"/adobe_site_totals/rsid_adob_site_tot_20180102.csv"
"/adobe_site_totals/rsid_adob_site_tot_20180103.csv"

The system will process each CSV file according to the organizational structure the files are being delivered. For example, based on the above structure a table would be created called adob_site_totals. This is based on the report type directory name.

Based on the CSV header residing in the report type folder adob_site_totals columns are created which align with the schema of the file. In this example, the CSV contained the header:

date, visits, unique_visitors, something_23_event23, rsid

The CSV file will have values that would align to those headers. In this example the CSV file has following row values:

"January 8, 2015",111292,12513,8139,rsid

associated with each header item

date, visits, unique_visitors, something_23_event23, rsid

With each subsequent file delivery of for the adob_site_totals report suite the same process would occur.This would result in new rows being added to adob_site_totals:

date,visits,unique_visitors,something_23_event23,rsid 
"January 1, 2015",111292,3256,2243,x
"January 2, 2015",171540,12385,5654,x
"January 3, 2015",131791,23423,8453,x
"January 4, 2015",81292,4513,1296,x

Access Credentials

Adobe supports standard username and password credential for SFTP. Openbridge will supply these to you to supply during the batch pipeline setup process.

Please note, that Adobe will only have access to the specific directory path in a given pipeline. They are "locked" there and only have access to deliver files to the target directory.

Delivery Frequency

You have the option of scheduling the frequency of your data warehouse with the Adobe website. We suggest that the feeds are scheduled daily, delivered between 3AM and 6AM daily.

Did this answer your question?