Amazon Redshift Spectrum extends Redshift by offloading data to S3 for querying. Getting set up with Amazon Redshift Spectrum is quick and easy. The process should take no more than 5 minutes. There are a few steps that you will need to care for:
Create an S3 bucket to be used for Openbridge and Amazon Redshift Spectrum. This bucket holds your final processed data that was sourced from third-party integrations, our batch API, or streaming events API
Create an IAM policy that allows Openbridge to access S3 and Amazon Redshift Spectrum
Step 1: Create an Amazon S3 Bucket
First, log into Amazon: https://console.aws.amazon.com/
Note: If you already have a bucket you want to use, skip to Step 2: Setting up IAM
Name and region: Create an S3 Bucket with a name like “mycompany001-openbridge-spectrum”. This can be anything you want but please be aware that the bucket name should be a unique name. When you are ready the click next...
2. Set properties: No additional properties or permissions are required from us If you want to set them for your own purposes, please feel free to do so. When you are ready the click next...
3. Set permissions: No additional properties or permissions required at this step. When you are ready the click next...
4. Review: Take a look at the setup and if all looks well you can select “Create bucket”.
All set!
Next is attaching an IAM policy to the bucket which will grant us permission to read and write to it.
Step 2: Setup IAM Role
If you are not already signed into AWS please do so here: Sign in to the IAM console at https://console.aws.amazon.com/iam/
By default, Amazon Redshift Spectrum uses the AWS Glue data catalog in regions that support AWS Glue. In other regions, Redshift Spectrum uses the Athena data catalog. Your cluster needs the authorization to access your external data catalog in AWS Glue or Athena and your data files in Amazon S3.
1. Create an IAM Role
Once complete, click "Create Role..."
Next, select the type of role "AWS Service", then "Redshift". Under "Select your use case" you will want to choose "Redshift-Customizable"
Select "Next: Permissions".
Do not attach any permissions yet. We will add an inline policy later.
Select "Next: Review"
Provide a "Role Name". Lets use "openbridge-redshift-spectrum"
On the next screen, you will see a list of roles. Put "openbridge" in the search bar to filter them. Then select the openbridge-redshift-spectrum role you created: You will now see your new role listed:
There are a couple of things that you will need to do next. First, take note of the "Role ARN". You will want to copy and save this for later use: arn:aws:iam::477979747229:role/openbridge-redshift-spectrum
Step 3: Attached Spectrum Policy
Next, you want to select "Add inline policy"
In the "Create Policy" window select "JSON". The window will look like this with your policy in place:
This is the JSON policy to paste:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetBucketLocation",
"s3:GetObject",
"s3:ListMultipartUploadParts",
"s3:ListBucket",
"s3:ListBucketMultipartUploads"
],
"Resource": [
"arn:aws:s3:::yourbucketname",
"arn:aws:s3:::yourbucketname/*"
]
},
{
"Effect": "Allow",
"Action": [
"glue:CreateDatabase",
"glue:DeleteDatabase",
"glue:GetDatabase",
"glue:GetDatabases",
"glue:UpdateDatabase",
"glue:CreateTable",
"glue:DeleteTable",
"glue:BatchDeleteTable",
"glue:UpdateTable",
"glue:GetTable",
"glue:GetTables",
"glue:BatchCreatePartition",
"glue:CreatePartition",
"glue:DeletePartition",
"glue:BatchDeletePartition",
"glue:UpdatePartition",
"glue:GetPartition",
"glue:GetPartitions",
"glue:BatchGetPartition"
],
"Resource": [
"*"
]
}
]
}
IMPORTANT: Remember to change the yourbucketname
in the policy document to the AWS S3 bucket, you set up in Step 1.
Select "Review Policy".
The last step is to name the policy. Use something like "openbridge-aws-redshift-spectrum" which should be unique and easy to identify.
Next, select "Create Policy". Congratulations, the IAM role, and policy are ready! The last step is attaching this to Redshift.
Step 4: Add IAM User
1. Create an IAM User
Create an IAM User with the username “openbridge-redshift-spectrum”. Next, select the access type of “Programmatic access”.
Once complete, click "Next..."
Skip “2. Permissions” and go to "3. Review". You may see an alert stating the user has no permissions, just ignore that. Click “Create user”
Make sure to download the credentials.csv file. Also, keep this file in a safe place. The CSV contains your AWS Secret and AWS Key for the user you just created. You will need this later.
All set? Congratulations, the user is ready!
Find the IAM openbridge-redshift-spectrum user and click to “Add inline policy”.
Name this policy as "openbridge-spectrum-policy".
In the dialog screen, you want to paste the policy where it asks for "Policy Document"
Here is the most recent spectrum-s3.json
policy document to paste. Below is a sample:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "123",
"Effect": "Allow",
"Action": "s3:PutObject",
"Resource": [
"arn:aws:s3:::yourbucketname/*"
]
}
]
}
IMPORTANT: Remember to change the yourbucketname
in the policy document to the AWS S3 bucket, you set up in Step 1.
Step 5: Add IAM Role To Your Redshift Cluster
Sign in to the AWS Management Console and open the Amazon Redshift console: https://console.aws.amazon.com/redshift/.
In the navigation pane, choose Clusters.
In the list, choose the cluster that you want the IAM role you created to be associated with.
Choose Manage IAM Roles.
To associate an IAM role with the cluster, select your IAM role from the Available roles list.
You can also manually enter an IAM role if you don’t see it included in the list (for example, if the IAM role hasn’t been created yet).
In this case, select your "openbridge-aws-redshift-spectrum"
To disassociate an IAM role from the cluster, choose X for the IAM role that you want to disassociate.
After you have finished modifying the IAM role associations for the cluster, choose Apply Changes to update the IAM roles that are associated with the cluster.
After you select to apply, the cluster will be updated with the new changes.
That is it! You have configured Amazon to allow Openbridge to manage Amazon Redshift Spectrum on your behalf.