Amazon Redshift
Connect Querri to your Amazon Redshift data warehouse to analyze data using natural language. Once connected, simply ask questions—Querri handles all the SQL for you.
Overview
Amazon Redshift is a fully managed, petabyte-scale data warehouse service from AWS. It is based on PostgreSQL, so Querri connects to it using the PostgreSQL wire protocol for efficient, streaming data access.
The Redshift connector supports:
- Amazon Redshift Provisioned: Traditional cluster-based data warehouse
- Amazon Redshift Serverless: On-demand, auto-scaling data warehouse
- Amazon Redshift RA3 nodes: Managed storage with compute separation
Prerequisites
- An AWS account with an active Redshift cluster or Redshift Serverless workgroup
- Database credentials (username and password) with appropriate permissions
- Network access from Querri to your Redshift cluster endpoint (public accessibility or VPC peering)
IP Whitelisting
Important: For Redshift clusters with restricted network access, whitelist the following IP address:

    18.189.33.77

This is Querri’s outbound IP address. Add it to the inbound rules of your Redshift cluster’s VPC security group.
Finding Your Cluster Endpoint
Redshift Provisioned Cluster
- Open the Amazon Redshift console
- Click Clusters in the left navigation
- Select your cluster
- Under General information, find the Endpoint field
- The endpoint looks like: my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com:5439/mydb
- The host is everything before the colon (e.g., my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com)
Redshift Serverless
- Open the Amazon Redshift console
- Click Serverless dashboard in the left navigation
- Select your workgroup
- Under General information, find the Endpoint field
- The endpoint looks like: my-workgroup.123456789012.us-east-1.redshift-serverless.amazonaws.com:5439/dev
- The host is everything before the colon
Creating the Connector
- Navigate to Connectors
  - Go to Settings > Connectors
  - Click “Add Connector”
- Select Amazon Redshift
  - Choose “Amazon Redshift” from the database connectors
- Configure Connection

      Host: my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com
      Port: 5439 (default Redshift port)
      Database: your_database_name
      Username: your_username
      Password: your_password
      Schema: public (default)

  Connection Parameters:
| Parameter | Description | Default |
|---|---|---|
| Host | Cluster or serverless workgroup endpoint | Required |
| Port | TCP port for Redshift connections | 5439 |
| Database | Name of the database to connect to | Required |
| Username | Database user with read permissions | Required |
| Password | Database password | Required |
| Schema | Database schema to use | public |
- Test Connection
  - Click “Test Connection” to verify connectivity
  - If successful, click “Save”
- Discover and Select Data
  - After connecting, Querri discovers available databases and schemas
  - Select the tables you want to work with, or use custom SQL queries
  - Selected tables appear in your Library
- Start Analyzing
  - Create a project from any table in your Library
  - Ask questions in natural language
Network Configuration
Public Accessibility
If your Redshift cluster has public accessibility enabled, add an inbound rule to its VPC security group:

    Type: Redshift
    Protocol: TCP
    Port: 5439
    Source: 18.189.33.77/32
    Description: Querri Access

Via AWS Console:
- Go to your Redshift cluster in the AWS Console
- Click on the Properties tab
- Under Network and security, click the VPC security group link
- Add an inbound rule with the settings above
- Click Save rules
Private Clusters (VPC Only)
If your Redshift cluster is not publicly accessible, you have two options:
- Enable public accessibility on the cluster and restrict access via security group rules (recommended for simplicity)
- Set up VPC peering or a bastion host to allow Querri to reach the private endpoint
To enable public accessibility:
- Open the Amazon Redshift console
- Select your cluster
- Choose Actions > Modify publicly accessible setting
- Enable Turn on Publicly accessible
- Add Querri’s IP to the security group (see above)
Redshift Serverless
For Redshift Serverless workgroups, configure network access through the workgroup settings:
- Go to Serverless dashboard > Workgroups
- Select your workgroup
- Under Network and security, review the VPC security group
- Add an inbound rule allowing TCP port 5439 from 18.189.33.77/32
Database User Setup
Read-Only User (Recommended)
Create a dedicated read-only user for Querri:

    -- Create user (replace the example password with a strong, unique one)
    CREATE USER querri_readonly PASSWORD 'SecurePassword123!';

    -- Grant usage on the schema
    GRANT USAGE ON SCHEMA public TO querri_readonly;

    -- Grant select on all existing tables
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO querri_readonly;

    -- Grant select on future tables
    -- (applies to tables created by the user who runs this statement)
    ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO querri_readonly;

Restricting Access to Specific Schemas
For tighter security, grant access only to specific schemas:

    -- Create user
    CREATE USER querri_readonly PASSWORD 'SecurePassword123!';

    -- Grant access to analytics schema only
    GRANT USAGE ON SCHEMA analytics TO querri_readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO querri_readonly;
    ALTER DEFAULT PRIVILEGES IN SCHEMA analytics
    GRANT SELECT ON TABLES TO querri_readonly;

Restricting Access to Specific Tables

    -- Grant access to specific tables only
    GRANT USAGE ON SCHEMA public TO querri_readonly;
    GRANT SELECT ON public.orders TO querri_readonly;
    GRANT SELECT ON public.customers TO querri_readonly;
    GRANT SELECT ON public.products TO querri_readonly;

See Database Best Practices for detailed guidance on creating analytics views and restricting access.
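As a rough illustration of the analytics-view approach, the sketch below builds one pre-joined view and exposes only that view to Querri. The view name customer_orders and every column name (order_id, order_date, customer_id, customer_name) are assumptions made for this example, so substitute the names from your own tables.

```sql
-- Hypothetical view and column names; adjust to your own tables.
CREATE SCHEMA IF NOT EXISTS analytics;

-- Pre-join orders and customers so analysts work from a single object.
CREATE VIEW analytics.customer_orders AS
SELECT o.order_id,
       o.order_date,
       c.customer_id,
       c.customer_name
FROM public.orders AS o
JOIN public.customers AS c
  ON c.customer_id = o.customer_id;

-- Expose the view, not the underlying tables, to the Querri user.
GRANT USAGE ON SCHEMA analytics TO querri_readonly;
GRANT SELECT ON analytics.customer_orders TO querri_readonly;
```

With this layout the Querri user needs no direct grants on public.orders or public.customers, because the view is evaluated with its owner's privileges on the underlying tables.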
Verify Permissions

    -- Check user exists
    SELECT usename, usesysid, usecreatedb, usesuper
    FROM pg_user
    WHERE usename = 'querri_readonly';

    -- Check table permissions
    SELECT schemaname, tablename,
           has_table_privilege('querri_readonly', schemaname || '.' || tablename, 'SELECT') AS has_select
    FROM pg_tables
    WHERE schemaname = 'public'
    ORDER BY tablename;

Troubleshooting
Cannot Connect to Cluster
Problem: Connection timed out or refused
Solutions:
- Verify Querri’s IP 18.189.33.77 is in the cluster’s VPC security group
- Check that the cluster is in Available status
- Confirm the cluster endpoint and port (5439) are correct
- Verify the cluster has Publicly accessible enabled (if connecting over the internet)
- Check that the database name in the connection matches an existing database
Authentication Failed
Problem: Invalid credentials or user does not exist
Solutions:
- Double-check the username and password
- Verify the user exists: SELECT * FROM pg_user WHERE usename = 'your_username';
- Ensure the password does not contain characters that need special escaping
- Check that the user has not been locked or disabled
Permission Denied
Problem: Permission denied for relation or schema
Solutions:
- Grant the necessary permissions (see Database User Setup above)
- Verify the schema name is correct
- Check that GRANT USAGE ON SCHEMA has been issued
- Run: SELECT has_schema_privilege('querri_readonly', 'public', 'USAGE');
SSL Connection Error
Problem: SSL-related connection failure
Solutions:
- Querri uses SSL mode require by default, which works with standard Redshift SSL certificates
- Ensure your cluster has not disabled SSL (Redshift enables SSL by default)
- If your cluster requires a specific CA certificate, verify the SSL configuration
- Check the cluster’s require_ssl parameter in the parameter group
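To confirm from the Redshift side whether sessions that did connect were encrypted, a query along these lines may help. It is a sketch that assumes a provisioned cluster (Redshift Serverless does not expose the STL tables), a superuser or otherwise sufficiently privileged session, and the querri_readonly user name from the setup section.

```sql
-- Recent connection events for the Querri user, newest first.
-- A populated sslversion/sslcipher indicates the session negotiated SSL.
SELECT recordtime, event, remotehost, sslversion, sslcipher
FROM stl_connection_log
WHERE username = 'querri_readonly'
ORDER BY recordtime DESC
LIMIT 20;
```

If no rows appear at all, the connection attempt most likely never reached the cluster, which points back to the network checks under Cannot Connect to Cluster.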
Slow Data Sync
Problem: Data sync takes a long time
Querri copies table data from Redshift into its own storage for fast local analysis. Sync speed depends on table size and network throughput, not Redshift query optimization.
Solutions:
- Use table selection to sync only the tables you need, rather than all tables
- Set a row limit in Advanced Settings to cap the number of rows synced per table
- Create views in Redshift that pre-filter to recent data (e.g., last 2 years) and sync those instead of full tables
- Check cluster status: a paused cluster cannot serve queries, and a heavily loaded cluster slows data transfer
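The pre-filtered view suggested in the list above could look roughly like this. The view name orders_recent, the order_date column, and the use of an existing analytics schema are all assumptions for illustration.

```sql
-- Hypothetical names: analytics.orders_recent and the order_date column.
-- Assumes the analytics schema exists and querri_readonly has USAGE on it.
CREATE VIEW analytics.orders_recent AS
SELECT *
FROM public.orders
WHERE order_date >= DATEADD(year, -2, CURRENT_DATE);  -- keep the last 2 years

GRANT SELECT ON analytics.orders_recent TO querri_readonly;
```

Selecting this view instead of public.orders in Querri's table selection means only the filtered rows are copied during sync.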
Security Best Practices
Use SSL Encryption
Querri enforces SSL connections to Redshift by default (sslmode=require). Redshift clusters also enable SSL by default. This means your data is encrypted in transit without additional configuration.
Use Read-Only Credentials
Always connect Querri with a read-only database user:
- Prevents accidental data modification
- Limits potential security impact
- Easier to audit and monitor
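One way to check that the user really is read-only is to look for write privileges, using the same has_table_privilege pattern as the Verify Permissions queries. This sketch assumes the querri_readonly user and only inspects the public schema; repeat it for any other schema you have granted.

```sql
-- Lists tables in public on which querri_readonly holds a write privilege.
SELECT schemaname, tablename
FROM pg_tables
WHERE schemaname = 'public'
  AND (
       has_table_privilege('querri_readonly', schemaname || '.' || tablename, 'INSERT')
    OR has_table_privilege('querri_readonly', schemaname || '.' || tablename, 'UPDATE')
    OR has_table_privilege('querri_readonly', schemaname || '.' || tablename, 'DELETE')
  )
ORDER BY tablename;
```

Any row returned names a table where the user can modify data; revoke those privileges if they were not intended.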
Restrict Schema Access
Grant access only to the schemas and tables analysts need:

    -- Create an analytics schema with pre-joined views
    CREATE SCHEMA analytics;

    -- Grant access only to analytics schema
    GRANT USAGE ON SCHEMA analytics TO querri_readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO querri_readonly;

Monitor Access
Track queries from the Querri user:

    -- View recent queries by the Querri user
    -- (stl_query applies to provisioned clusters; Redshift Serverless exposes SYS_QUERY_HISTORY instead)
    -- Duration is computed from starttime and endtime
    SELECT query, querytxt, starttime, endtime,
           DATEDIFF(ms, starttime, endtime) / 1000.0 AS seconds
    FROM stl_query
    WHERE userid = (SELECT usesysid FROM pg_user WHERE usename = 'querri_readonly')
    ORDER BY starttime DESC
    LIMIT 20;

Redshift-Specific Considerations
How Querri Uses Redshift
Querri copies your Redshift table data into its own storage, then runs all analysis locally. This means:
- Querri does not run analytical queries directly against your cluster
- After the initial sync, your Redshift cluster is not under load from Querri
- Redshift-side optimizations (sort keys, distribution keys) do not affect Querri’s analysis performance
- Sync speed depends on table size and network bandwidth
Redshift Spectrum
If you use Redshift Spectrum to query data in S3, those external tables are accessible through Querri as well, as long as the connected user has the appropriate permissions:

    -- Access to external tables is controlled at the schema level
    GRANT USAGE ON SCHEMA external_schema TO querri_readonly;

Reducing Cluster Impact
Since Querri streams full table data during sync, consider:
- Scheduling syncs during off-peak hours to avoid competing with other workloads
- Enabling concurrency scaling on the WLM queue that handles Querri syncs, so syncs are less likely to queue behind other workloads
- Syncing fewer tables — select only the tables analysts need rather than syncing everything
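To decide which tables are worth syncing, it can help to see how large each one is first. The query below is a sketch that assumes a superuser (or similarly privileged) session, since SVV_TABLE_INFO is restricted by default; note that the view does not list empty tables.

```sql
-- Largest tables first; size is reported in 1 MB blocks.
SELECT "schema", "table", size AS size_mb, tbl_rows
FROM svv_table_info
ORDER BY size DESC
LIMIT 20;
```

Very large tables at the top of this list are natural candidates for a row limit or a pre-filtered view rather than a full sync.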
Next Steps
- Database Best Practices: Create analytics views and optimize for natural language queries
- Data Connectors Overview: See all available connectors
- Managing Connections: Edit and monitor your connectors