Database Best Practices for Analytics
When connecting a database to Querri, a little preparation goes a long way. This guide covers how administrators can set up databases to give analytics users a clean, fast, and secure experience—without exposing them to the complexity of raw normalized tables.
Raw Tables vs Analytics Views
Section titled “Raw Tables vs Analytics Views”Most production databases are normalized—data is split across many tables to reduce redundancy and maintain integrity. This is great for applications but frustrating for analytics.
The Problem with Raw Tables
Section titled “The Problem with Raw Tables”Consider a typical e-commerce database:
orders → order_items → products → categories ↓ ↓customers → addresses → regions ↓customer_segmentsTo answer “What’s our revenue by product category and customer segment?”, an analyst would need to join 6 tables. In a natural language tool, this creates two problems:
- Complexity — The AI needs to understand all the relationships
- Performance — Multi-table joins on large datasets can be slow
- Confusion — Users see dozens of tables and don’t know where to start
The Solution: Analytics Views
Section titled “The Solution: Analytics Views”Create pre-joined views that present data the way analysts think about it:
-- PostgreSQL / MySQLCREATE VIEW analytics.order_summary ASSELECT o.order_id, o.order_date, o.total_amount, c.customer_name, c.email, cs.segment_name as customer_segment, p.product_name, cat.category_name, oi.quantity, oi.unit_price, oi.line_totalFROM orders oJOIN customers c ON o.customer_id = c.customer_idJOIN customer_segments cs ON c.segment_id = cs.segment_idJOIN order_items oi ON o.order_id = oi.order_idJOIN products p ON oi.product_id = p.product_idJOIN categories cat ON p.category_id = cat.category_id;Now users see one order_summary view instead of six tables. They can immediately ask:
- “Show revenue by customer segment”
- “What are the top products by category?”
- “Revenue trend by month”
No joins required—the view handles it.
When to Create Views
Section titled “When to Create Views”Create Views For:
Section titled “Create Views For:”Commonly joined data If analysts always need customer info with orders, pre-join them.
Aggregated summaries Daily/weekly/monthly rollups that are expensive to compute on the fly.
Filtered subsets “Active customers only” or “Last 2 years of orders” views run faster than filtering the full table every time.
Calculated fields Profit margins, customer lifetime value, days since last order—compute once in the view.
Denormalized dimensions Flatten hierarchies (Product → Subcategory → Category → Department) into single columns.
Keep Raw Tables For:
Section titled “Keep Raw Tables For:”Ad-hoc deep dives Sometimes analysts need the raw detail.
Data that changes structure frequently Views on volatile schemas break easily.
Very large fact tables Views on billion-row tables may not perform better than direct access.
Creating Effective Analytics Views
Section titled “Creating Effective Analytics Views”Naming Conventions
Section titled “Naming Conventions”Use clear, descriptive names:
-- Good: Clear what it containsCREATE VIEW analytics.daily_sales_summary ...CREATE VIEW analytics.customer_lifetime_value ...CREATE VIEW analytics.product_performance ...
-- Avoid: Cryptic abbreviationsCREATE VIEW v_dss_01 ...CREATE VIEW rpt_cust_ltv ...Include Useful Calculated Fields
Section titled “Include Useful Calculated Fields”CREATE VIEW analytics.customer_metrics ASSELECT c.customer_id, c.customer_name, c.email, c.created_at as signup_date,
-- Calculated fields analysts always need COUNT(o.order_id) as total_orders, SUM(o.total_amount) as lifetime_value, AVG(o.total_amount) as avg_order_value, MIN(o.order_date) as first_order_date, MAX(o.order_date) as last_order_date, DATEDIFF(day, MAX(o.order_date), CURRENT_DATE) as days_since_last_order,
-- Derived segments CASE WHEN COUNT(o.order_id) >= 10 THEN 'VIP' WHEN COUNT(o.order_id) >= 5 THEN 'Regular' WHEN COUNT(o.order_id) >= 2 THEN 'Repeat' ELSE 'New' END as customer_tier
FROM customers cLEFT JOIN orders o ON c.customer_id = o.customer_idGROUP BY c.customer_id, c.customer_name, c.email, c.created_at;Pre-Aggregate Time Series
Section titled “Pre-Aggregate Time Series”For dashboards that always show monthly or weekly data:
CREATE VIEW analytics.monthly_revenue ASSELECT DATE_TRUNC('month', order_date) as month, COUNT(*) as order_count, SUM(total_amount) as revenue, COUNT(DISTINCT customer_id) as unique_customers, SUM(total_amount) / COUNT(DISTINCT customer_id) as revenue_per_customerFROM ordersWHERE order_date >= CURRENT_DATE - INTERVAL '3 years'GROUP BY DATE_TRUNC('month', order_date);Consider Materialized Views for Large Data
Section titled “Consider Materialized Views for Large Data”For expensive aggregations, use materialized views (pre-computed and stored):
PostgreSQL:
CREATE MATERIALIZED VIEW analytics.product_performance ASSELECT p.product_id, p.product_name, p.category, SUM(oi.quantity) as total_units_sold, SUM(oi.line_total) as total_revenue, COUNT(DISTINCT o.customer_id) as unique_buyersFROM products pJOIN order_items oi ON p.product_id = oi.product_idJOIN orders o ON oi.order_id = o.order_idWHERE o.order_date >= CURRENT_DATE - INTERVAL '1 year'GROUP BY p.product_id, p.product_name, p.category;
-- Refresh periodically (e.g., nightly)REFRESH MATERIALIZED VIEW analytics.product_performance;SQL Server:
-- Use indexed views for automatic refreshCREATE VIEW analytics.product_performance WITH SCHEMABINDING ASSELECT p.product_id, p.product_name, SUM(oi.line_total) as total_revenue, COUNT_BIG(*) as row_countFROM dbo.products pJOIN dbo.order_items oi ON p.product_id = oi.product_idGROUP BY p.product_id, p.product_name;
CREATE UNIQUE CLUSTERED INDEX IX_product_performanceON analytics.product_performance(product_id);Restricting Access to Specific Tables and Views
Section titled “Restricting Access to Specific Tables and Views”Don’t give analytics users access to everything. Create a focused, clean environment.
Create an Analytics Schema
Section titled “Create an Analytics Schema”Keep analytics views separate from raw tables:
PostgreSQL:
-- Create analytics schemaCREATE SCHEMA analytics;
-- Create analytics userCREATE USER querri_analytics WITH PASSWORD 'secure_password';
-- Grant access ONLY to analytics schemaGRANT USAGE ON SCHEMA analytics TO querri_analytics;GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO querri_analytics;ALTER DEFAULT PRIVILEGES IN SCHEMA analytics GRANT SELECT ON TABLES TO querri_analytics;
-- Explicitly deny access to other schemas (optional, for extra security)REVOKE ALL ON SCHEMA public FROM querri_analytics;MySQL:
-- Create analytics database (MySQL doesn't have schemas like PostgreSQL)CREATE DATABASE analytics;
-- Create analytics user with access only to analytics databaseCREATE USER 'querri_analytics'@'%' IDENTIFIED BY 'secure_password';GRANT SELECT ON analytics.* TO 'querri_analytics'@'%';FLUSH PRIVILEGES;SQL Server:
-- Create analytics schemaCREATE SCHEMA analytics;
-- Create login and userCREATE LOGIN querri_analytics WITH PASSWORD = 'secure_password';USE your_database;CREATE USER querri_analytics FOR LOGIN querri_analytics;
-- Grant access only to analytics schemaGRANT SELECT ON SCHEMA::analytics TO querri_analytics;
-- Deny access to other schemasDENY SELECT ON SCHEMA::dbo TO querri_analytics;Grant Access to Specific Tables Only
Section titled “Grant Access to Specific Tables Only”If you don’t want to use a separate schema, grant access table by table:
PostgreSQL:
-- Create user with no default permissionsCREATE USER querri_analytics WITH PASSWORD 'secure_password';GRANT CONNECT ON DATABASE your_database TO querri_analytics;GRANT USAGE ON SCHEMA public TO querri_analytics;
-- Grant access to specific tables onlyGRANT SELECT ON orders TO querri_analytics;GRANT SELECT ON customers TO querri_analytics;GRANT SELECT ON products TO querri_analytics;
-- User cannot see: invoices, payments, user_credentials, etc.MySQL:
CREATE USER 'querri_analytics'@'%' IDENTIFIED BY 'secure_password';
-- Grant access to specific tables onlyGRANT SELECT ON your_database.orders TO 'querri_analytics'@'%';GRANT SELECT ON your_database.customers TO 'querri_analytics'@'%';GRANT SELECT ON your_database.products TO 'querri_analytics'@'%';FLUSH PRIVILEGES;SQL Server:
CREATE USER querri_analytics FOR LOGIN querri_analytics;
-- Grant access to specific tables onlyGRANT SELECT ON dbo.orders TO querri_analytics;GRANT SELECT ON dbo.customers TO querri_analytics;GRANT SELECT ON dbo.products TO querri_analytics;Reducing Data Clutter
Section titled “Reducing Data Clutter”The fewer tables users see, the faster they can find what they need.
Hide Internal/System Tables
Section titled “Hide Internal/System Tables”Don’t expose:
- Migration tracking tables (
schema_migrations,__drizzle_migrations) - Audit logs (unless specifically needed)
- Session/cache tables
- Internal configuration tables
Exclude Sensitive Data
Section titled “Exclude Sensitive Data”Remove or mask columns analysts don’t need:
-- View without sensitive columnsCREATE VIEW analytics.customers_safe ASSELECT customer_id, customer_name, -- Mask email CONCAT(LEFT(email, 2), '***@***', RIGHT(email, 4)) as email_masked, city, state, country, segment, created_at -- Excluded: password_hash, ssn, credit_card_last4, etc.FROM customers;Archive Old Data
Section titled “Archive Old Data”Move historical data out of main tables:
-- Main table: recent data onlyCREATE VIEW analytics.orders_recent ASSELECT * FROM ordersWHERE order_date >= CURRENT_DATE - INTERVAL '2 years';
-- Archive table: older data (separate, if needed)CREATE VIEW analytics.orders_archive ASSELECT * FROM orders_archive;Sample Analytics Schema
Section titled “Sample Analytics Schema”Here’s a complete example of an analytics-ready setup:
-- Create schemaCREATE SCHEMA analytics;
-- Core viewsCREATE VIEW analytics.orders ASSELECT o.order_id, o.order_date, o.total_amount, o.status, c.customer_id, c.customer_name, c.segment as customer_segment, c.city, c.state, c.countryFROM raw.orders oJOIN raw.customers c ON o.customer_id = c.customer_idWHERE o.order_date >= CURRENT_DATE - INTERVAL '3 years';
CREATE VIEW analytics.order_items ASSELECT oi.order_id, oi.line_number, p.product_id, p.product_name, p.category, p.subcategory, oi.quantity, oi.unit_price, oi.line_totalFROM raw.order_items oiJOIN raw.products p ON oi.product_id = p.product_id;
CREATE VIEW analytics.customer_summary ASSELECT c.customer_id, c.customer_name, c.segment, c.city, c.state, c.country, c.created_at as signup_date, COUNT(o.order_id) as total_orders, COALESCE(SUM(o.total_amount), 0) as lifetime_value, MAX(o.order_date) as last_order_dateFROM raw.customers cLEFT JOIN raw.orders o ON c.customer_id = o.customer_idGROUP BY c.customer_id, c.customer_name, c.segment, c.city, c.state, c.country, c.created_at;
CREATE VIEW analytics.monthly_summary ASSELECT DATE_TRUNC('month', order_date) as month, COUNT(*) as orders, SUM(total_amount) as revenue, COUNT(DISTINCT customer_id) as customersFROM raw.ordersWHERE order_date >= CURRENT_DATE - INTERVAL '3 years'GROUP BY DATE_TRUNC('month', order_date);
-- Create analytics userCREATE USER querri_analytics WITH PASSWORD 'secure_password';GRANT USAGE ON SCHEMA analytics TO querri_analytics;GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO querri_analytics;Now analytics users see 4 clean views instead of dozens of normalized tables.
Checklist for Database Admins
Section titled “Checklist for Database Admins”Before connecting a database to Querri:
- Identify key analytics tables — What do analysts actually need?
- Create pre-joined views — For commonly combined data
- Add calculated fields — Metrics analysts always compute
- Create an analytics schema — Separate from raw tables
- Set up a dedicated user — With minimal permissions
- Grant access to views only — Not raw tables
- Exclude sensitive columns — Mask or remove PII
- Consider materialized views — For expensive aggregations
- Document the views — What each one contains and when to use it
- Test with Querri — Verify tables appear and queries work
Next Steps
Section titled “Next Steps”- PostgreSQL Connector — PostgreSQL-specific setup
- MySQL Connector — MySQL-specific setup
- SQL Server Connector — SQL Server-specific setup
- Managing Connections — Edit and monitor your connectors