Redshift setup
profiles.yml file is for dbt Core users onlyIf you're using dbt Cloud, you don't need to create a profiles.yml file. This file is only for dbt Core users. To connect your data platform to dbt Cloud, refer to About data platforms.
- Maintained by: dbt Labs
- Authors: core dbt maintainers
- GitHub repo: dbt-labs/dbt-redshift
- PyPI package:
dbt-redshift - Slack channel: #db-redshift
- Supported dbt Core version: v0.10.0 and newer
- dbt Cloud support: Supported
- Minimum data platform version: n/a
Installing dbt-redshift
Use pip to install the adapter. Before 1.8, installing the adapter would automatically install dbt-core and any additional dependencies. Beginning in 1.8, installing an adapter does not automatically install dbt-core. This is because adapters and dbt Core versions have been decoupled from each other so we no longer want to overwrite existing dbt-core installations.
Use the following command for installation:
Configuring dbt-redshift
For Redshift-specific configuration, please refer to Redshift configs.
Configurations
| Profile field | Example | Description |
|---|---|---|
type | redshift | The type of data warehouse you are connecting to |
host | hostname.region.redshift.amazonaws.com or workgroup.account.region.redshift-serverless.amazonaws.com | Host of cluster |
port | 5439 | |
dbname | my_db | Database name |
schema | my_schema | Schema name |
connect_timeout | None or 30 | Number of seconds before connection times out |
sslmode | prefer | optional, set the sslmode to connect to the database. Default prefer, which will use 'verify-ca' to connect. For more information on sslmode, see Redshift note below |
role | None | Optional, user identifier of the current session |
autocreate | false | Optional, default false. Creates user if they do not exist |
db_groups | ['ANALYSTS'] | Optional. A list of existing database group names that the DbUser joins for the current session |
ra3_node | true | Optional, default False. Enables cross-database sources |
autocommit | true | Optional, default True. Enables autocommit after each statement |
retries | 1 | Number of retries |
Authentication Parameters
The authentication methods that dbt Core supports on Redshift are:
Database— Password-based authentication (default, will be used ifmethodis not provided)IAM User— IAM User authentication via AWS Profile
Click on one of these authentication methods for further details on how to configure your connection profile. Each tab also includes an example profiles.yml configuration file for you to review.
- Database
- IAM User via AWS Profile (Core)
The following table contains the parameters for the database (password-based) connection method.
| Profile field | Example | Description |
|---|---|---|
method | database | Leave this parameter unconfigured, or set this to database |
user | username | Account username to log into your cluster |
password | password1 | Password for authentication |
Example profiles.yml for database authentication
company-name:
target: dev
outputs:
dev:
type: redshift
host: hostname.region.redshift.amazonaws.com
user: username
password: password1
dbname: analytics
schema: analytics
port: 5439
# Optional Redshift configs:
sslmode: prefer
role: None
ra3_node: true
autocommit: true
threads: 4
connect_timeout: None
The following table lists the authentication parameters to use IAM authentication.
To set up a Redshift profile using IAM Authentication, set the method parameter to iam as shown below. Note that a password is not required when using IAM Authentication. For more information on this type of authentication, consult the Redshift Documentation and boto3 docs on generating user credentials with IAM Auth.
If you receive the "You must specify a region" error when using IAM Authentication, then your aws credentials are likely misconfigured. Try running aws configure to set up AWS access keys, and pick a default region. If you have any questions, please refer to the official AWS documentation on Configuration and credential file settings.
| Profile field | Example | Description |
|---|---|---|
method | IAM | use IAM to authenticate via IAM User authentication |
iam_profile | analyst | dbt will use the specified profile from your ~/.aws/config file |
cluster_id | CLUSTER_ID | Required for IAM authentication only for provisoned cluster, not for Serverless |
user | username | User querying the database, ignored for Serverless (but field still required) |
region | us-east-1 | Region of your Redshift instance |
Example profiles.yml for IAM
my-redshift-db:
target: dev
outputs:
dev:
type: redshift
method: iam
cluster_id: CLUSTER_ID
host: hostname.region.redshift.amazonaws.com
user: alice
iam_profile: analyst
region: us-east-1
dbname: analytics
schema: analytics
port: 5439
# Optional Redshift configs:
threads: 4
connect_timeout: None
retries: 1
role: None
sslmode: prefer
ra3_node: true
autocommit: true
autocreate: true
db_groups: ['ANALYSTS']
Specifying an IAM Profile
When the iam_profile configuration is set, dbt will use the specified profile from your ~/.aws/config file instead of using the profile name default
Redshift notes
sslmode change
Before dbt-redshift 1.5, psycopg2 was used as the driver. psycopg2 accepts disable, prefer, allow, require, verify-ca, verify-full as valid inputs of sslmode, and does not have an ssl parameter, as indicated in PostgreSQL doc.
In dbt-redshift 1.5, we switched to using redshift_connector, which accepts verify-ca, and verify-full as valid sslmode inputs, and has a ssl parameter of True or False, according to redshift doc.
For backward compatibility, dbt-redshift now supports valid inputs for sslmode in psycopg2. We've added conversion logic mapping each of psycopg2's accepted sslmode values to the corresponding ssl and sslmode parameters in redshift_connector.
The table below details accepted sslmode parameters and how the connection will be made according to each option:
sslmode parameter | Expected behavior in dbt-redshift | Actions behind the scenes |
|---|---|---|
| disable | Connection will be made without using ssl | Set ssl = False |
| allow | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
| prefer | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
| require | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
| verify-ca | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
| verify-full | Connection will be made using verify-full | Set ssl = True & sslmode = verify-full |
When a connection is made using verify-ca, will look for the CA certificate in ~/redshift-ca-bundle.crt.
For more details on sslmode changes, our design choices, and reasoning — please refer to the PR pertaining to this change.
autocommit parameter
The autocommit mode is useful to execute commands that run outside a transaction. Connection objects used in Python must have autocommit = True to run operations such as CREATE DATABASE, and VACUUM. autocommit is off by default in redshift_connector, but we've changed this default to True to ensure certain macros run successfully in your dbt project.
If desired, you can define a separate target with autocommit=True as such:
profile-to-my-RS-target:
target: dev
outputs:
dev:
type: redshift
...
autocommit: False
profile-to-my-RS-target-with-autocommit-enabled:
target: dev
outputs:
dev:
type: redshift
...
autocommit: True
To run certain macros with autocommit, load the profile with autocommit using the --profile flag. For more context, please refer to this PR.
Deprecated profile parameters in 1.5
-
iam_duration_seconds -
keepalives_idle
sort and dist keys
Where possible, dbt enables the use of sort and dist keys. See the section on Redshift specific configurations.
retries
If dbt-redshift encounters an operational error or timeout when opening a new connection, it will retry up to the number of times configured by retries. If set to 2+ retries, dbt will wait 1 second before retrying. The default value is 1 retry. If set to 0, dbt will not retry at all.