Database Best Practices for Analytics

When connecting a database to Querri, a little preparation goes a long way. This guide covers how administrators can set up databases to give analytics users a clean, fast, and secure experience—without exposing them to the complexity of raw normalized tables.

Raw Tables vs Analytics Views

Most production databases are normalized—data is split across many tables to reduce redundancy and maintain integrity. This is great for applications but frustrating for analytics.

The Problem with Raw Tables

Consider a typical e-commerce database:

orders          → order_items      → products      → categories
    ↓                  ↓
customers       → addresses        → regions
    ↓
customer_segments

To answer “What’s our revenue by product category and customer segment?”, an analyst would need to join 6 tables. In a natural language tool, this creates two problems:

Complexity — The AI needs to understand all the relationships
Performance — Multi-table joins on large datasets can be slow
Confusion — Users see dozens of tables and don’t know where to start

The Solution: Analytics Views

Create pre-joined views that present data the way analysts think about it:

-- PostgreSQL / MySQL
CREATE VIEW analytics.order_summary AS
SELECT
    o.order_id,
    o.order_date,
    o.total_amount,
    c.customer_name,
    c.email,
    cs.segment_name as customer_segment,
    p.product_name,
    cat.category_name,
    oi.quantity,
    oi.unit_price,
    oi.line_total
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN customer_segments cs ON c.segment_id = cs.segment_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories cat ON p.category_id = cat.category_id;

Now users see one order_summary view instead of six tables. They can immediately ask:

“Show revenue by customer segment”
“What are the top products by category?”
“Revenue trend by month”

No joins required—the view handles it.

When to Create Views

Create Views For:

Commonly joined data If analysts always need customer info with orders, pre-join them.

Aggregated summaries Daily/weekly/monthly rollups that are expensive to compute on the fly.

Filtered subsets “Active customers only” or “Last 2 years of orders” views run faster than filtering the full table every time.

Calculated fields Profit margins, customer lifetime value, days since last order—compute once in the view.

Denormalized dimensions Flatten hierarchies (Product → Subcategory → Category → Department) into single columns.

Keep Raw Tables For:

Ad-hoc deep dives Sometimes analysts need the raw detail.

Data that changes structure frequently Views on volatile schemas break easily.

Very large fact tables Views on billion-row tables may not perform better than direct access.

Creating Effective Analytics Views

Naming Conventions

Use clear, descriptive names:

-- Good: Clear what it contains
CREATE VIEW analytics.daily_sales_summary ...
CREATE VIEW analytics.customer_lifetime_value ...
CREATE VIEW analytics.product_performance ...

-- Avoid: Cryptic abbreviations
CREATE VIEW v_dss_01 ...
CREATE VIEW rpt_cust_ltv ...

Include Useful Calculated Fields

CREATE VIEW analytics.customer_metrics AS
SELECT
    c.customer_id,
    c.customer_name,
    c.email,
    c.created_at as signup_date,

    -- Calculated fields analysts always need
    COUNT(o.order_id) as total_orders,
    SUM(o.total_amount) as lifetime_value,
    AVG(o.total_amount) as avg_order_value,
    MIN(o.order_date) as first_order_date,
    MAX(o.order_date) as last_order_date,
    DATEDIFF(day, MAX(o.order_date), CURRENT_DATE) as days_since_last_order,

    -- Derived segments
    CASE
        WHEN COUNT(o.order_id) >= 10 THEN 'VIP'
        WHEN COUNT(o.order_id) >= 5 THEN 'Regular'
        WHEN COUNT(o.order_id) >= 2 THEN 'Repeat'
        ELSE 'New'
    END as customer_tier

FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name, c.email, c.created_at;

Pre-Aggregate Time Series

For dashboards that always show monthly or weekly data:

CREATE VIEW analytics.monthly_revenue AS
SELECT
    DATE_TRUNC('month', order_date) as month,
    COUNT(*) as order_count,
    SUM(total_amount) as revenue,
    COUNT(DISTINCT customer_id) as unique_customers,
    SUM(total_amount) / COUNT(DISTINCT customer_id) as revenue_per_customer
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '3 years'
GROUP BY DATE_TRUNC('month', order_date);

Consider Materialized Views for Large Data

For expensive aggregations, use materialized views (pre-computed and stored):

PostgreSQL:

CREATE MATERIALIZED VIEW analytics.product_performance AS
SELECT
    p.product_id,
    p.product_name,
    p.category,
    SUM(oi.quantity) as total_units_sold,
    SUM(oi.line_total) as total_revenue,
    COUNT(DISTINCT o.customer_id) as unique_buyers
FROM products p
JOIN order_items oi ON p.product_id = oi.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY p.product_id, p.product_name, p.category;

-- Refresh periodically (e.g., nightly)
REFRESH MATERIALIZED VIEW analytics.product_performance;

SQL Server:

-- Use indexed views for automatic refresh
CREATE VIEW analytics.product_performance WITH SCHEMABINDING AS
SELECT
    p.product_id,
    p.product_name,
    SUM(oi.line_total) as total_revenue,
    COUNT_BIG(*) as row_count
FROM dbo.products p
JOIN dbo.order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name;

CREATE UNIQUE CLUSTERED INDEX IX_product_performance
ON analytics.product_performance(product_id);

Restricting Access to Specific Tables and Views

Don’t give analytics users access to everything. Create a focused, clean environment.

Create an Analytics Schema

Keep analytics views separate from raw tables:

PostgreSQL:

-- Create analytics schema
CREATE SCHEMA analytics;

-- Create analytics user
CREATE USER querri_analytics WITH PASSWORD 'secure_password';

-- Grant access ONLY to analytics schema
GRANT USAGE ON SCHEMA analytics TO querri_analytics;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO querri_analytics;
ALTER DEFAULT PRIVILEGES IN SCHEMA analytics
    GRANT SELECT ON TABLES TO querri_analytics;

-- Explicitly deny access to other schemas (optional, for extra security)
REVOKE ALL ON SCHEMA public FROM querri_analytics;

MySQL:

-- Create analytics database (MySQL doesn't have schemas like PostgreSQL)
CREATE DATABASE analytics;

-- Create analytics user with access only to analytics database
CREATE USER 'querri_analytics'@'%' IDENTIFIED BY 'secure_password';
GRANT SELECT ON analytics.* TO 'querri_analytics'@'%';
FLUSH PRIVILEGES;

SQL Server:

-- Create analytics schema
CREATE SCHEMA analytics;

-- Create login and user
CREATE LOGIN querri_analytics WITH PASSWORD = 'secure_password';
USE your_database;
CREATE USER querri_analytics FOR LOGIN querri_analytics;

-- Grant access only to analytics schema
GRANT SELECT ON SCHEMA::analytics TO querri_analytics;

-- Deny access to other schemas
DENY SELECT ON SCHEMA::dbo TO querri_analytics;

Grant Access to Specific Tables Only

If you don’t want to use a separate schema, grant access table by table:

PostgreSQL:

-- Create user with no default permissions
CREATE USER querri_analytics WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE your_database TO querri_analytics;
GRANT USAGE ON SCHEMA public TO querri_analytics;

-- Grant access to specific tables only
GRANT SELECT ON orders TO querri_analytics;
GRANT SELECT ON customers TO querri_analytics;
GRANT SELECT ON products TO querri_analytics;

-- User cannot see: invoices, payments, user_credentials, etc.

MySQL:

CREATE USER 'querri_analytics'@'%' IDENTIFIED BY 'secure_password';

-- Grant access to specific tables only
GRANT SELECT ON your_database.orders TO 'querri_analytics'@'%';
GRANT SELECT ON your_database.customers TO 'querri_analytics'@'%';
GRANT SELECT ON your_database.products TO 'querri_analytics'@'%';
FLUSH PRIVILEGES;

SQL Server:

CREATE USER querri_analytics FOR LOGIN querri_analytics;

-- Grant access to specific tables only
GRANT SELECT ON dbo.orders TO querri_analytics;
GRANT SELECT ON dbo.customers TO querri_analytics;
GRANT SELECT ON dbo.products TO querri_analytics;

Reducing Data Clutter

The fewer tables users see, the faster they can find what they need.

Hide Internal/System Tables

Don’t expose:

Migration tracking tables (schema_migrations, __drizzle_migrations)
Audit logs (unless specifically needed)
Session/cache tables
Internal configuration tables

Exclude Sensitive Data

Remove or mask columns analysts don’t need:

-- View without sensitive columns
CREATE VIEW analytics.customers_safe AS
SELECT
    customer_id,
    customer_name,
    -- Mask email
    CONCAT(LEFT(email, 2), '***@***', RIGHT(email, 4)) as email_masked,
    city,
    state,
    country,
    segment,
    created_at
    -- Excluded: password_hash, ssn, credit_card_last4, etc.
FROM customers;

Archive Old Data

Move historical data out of main tables:

-- Main table: recent data only
CREATE VIEW analytics.orders_recent AS
SELECT * FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '2 years';

-- Archive table: older data (separate, if needed)
CREATE VIEW analytics.orders_archive AS
SELECT * FROM orders_archive;

Sample Analytics Schema

Here’s a complete example of an analytics-ready setup:

-- Create schema
CREATE SCHEMA analytics;

-- Core views
CREATE VIEW analytics.orders AS
SELECT
    o.order_id,
    o.order_date,
    o.total_amount,
    o.status,
    c.customer_id,
    c.customer_name,
    c.segment as customer_segment,
    c.city,
    c.state,
    c.country
FROM raw.orders o
JOIN raw.customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '3 years';

CREATE VIEW analytics.order_items AS
SELECT
    oi.order_id,
    oi.line_number,
    p.product_id,
    p.product_name,
    p.category,
    p.subcategory,
    oi.quantity,
    oi.unit_price,
    oi.line_total
FROM raw.order_items oi
JOIN raw.products p ON oi.product_id = p.product_id;

CREATE VIEW analytics.customer_summary AS
SELECT
    c.customer_id,
    c.customer_name,
    c.segment,
    c.city,
    c.state,
    c.country,
    c.created_at as signup_date,
    COUNT(o.order_id) as total_orders,
    COALESCE(SUM(o.total_amount), 0) as lifetime_value,
    MAX(o.order_date) as last_order_date
FROM raw.customers c
LEFT JOIN raw.orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name, c.segment,
         c.city, c.state, c.country, c.created_at;

CREATE VIEW analytics.monthly_summary AS
SELECT
    DATE_TRUNC('month', order_date) as month,
    COUNT(*) as orders,
    SUM(total_amount) as revenue,
    COUNT(DISTINCT customer_id) as customers
FROM raw.orders
WHERE order_date >= CURRENT_DATE - INTERVAL '3 years'
GROUP BY DATE_TRUNC('month', order_date);

-- Create analytics user
CREATE USER querri_analytics WITH PASSWORD 'secure_password';
GRANT USAGE ON SCHEMA analytics TO querri_analytics;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO querri_analytics;

Now analytics users see 4 clean views instead of dozens of normalized tables.

Checklist for Database Admins

Before connecting a database to Querri:

Next Steps

PostgreSQL Connector — PostgreSQL-specific setup
MySQL Connector — MySQL-specific setup
SQL Server Connector — SQL Server-specific setup
Amazon Redshift Connector — Redshift data warehouse setup
Managing Connections — Edit and monitor your connectors