
Understanding Your Data

Before you can answer questions with data, you need to understand what you’re working with. Data exploration helps you discover what’s in your dataset, find quality issues early, and identify what questions your data can actually answer.

[Figure: Querri automatically showing a data overview with quality cautions]

When you add data to a project, Querri automatically gives you an overview that includes row and column counts, data types, and data quality cautions—potential issues like missing values, duplicates, or unusual patterns that might affect your analysis. This immediate feedback helps you understand what you’re working with before you start asking questions.

Start by bringing data into your project:

  1. Upload a file — Drag and drop CSV, Excel, or JSON files directly into the chat
  2. Connect a source — Link to databases, Google Sheets, or business apps like QuickBooks or HubSpot
  3. Use existing library data — Pull from datasets already in your Library

Once your data is loaded, Querri shows you a preview with basic statistics and any quality warnings it detects.

A quick visual inspection is one of the best ways to understand what came in and spot potential issues.

[Figure: Sorting and filtering controls for exploring your data]

Use the table controls to:

  • Sort columns — Click headers to sort ascending/descending. Large values at the top? Outliers? Nulls clustering together?
  • Filter values — Narrow down to specific categories or ranges to understand subsets
  • Scan for patterns — Are dates in the expected range? Do amounts look reasonable? Any obvious data entry errors?

This hands-on exploration helps you form better questions and catch issues that summary statistics might miss.

Once you’ve done a visual scan, use these prompts to deeply understand your data.

One of the most powerful exploration techniques is seeing your data from multiple angles at once:

"Draw 5 different graphs to show sales trends"
"Create 5 visualizations showing customer distribution in different ways"
"Show me revenue from 5 different perspectives"

This reveals patterns you might miss with a single chart. For example, a trend line might show growth, while a histogram of monthly totals reveals that most months are flat and a few spikes account for the increase.

Aggregate your data by key dimensions to surface insights:

"Group by Customer Name and calculate total revenue, order count, average order value, and days since last order"
"Aggregate by State showing total sales, customer count, average transaction size, and year-over-year growth"
"Break down by Product Category with units sold, revenue, return rate, and profit margin"

These aggregations transform raw transactions into actionable summaries.
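
To make the idea concrete, here is a minimal sketch in plain Python of the kind of grouping the first prompt describes. It is not Querri's implementation; the `summarize_by_customer` function, the column names, and the sample orders are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records; the column names are illustrative only.
orders = [
    {"customer": "Acme", "amount": 120.0, "order_date": date(2024, 1, 5)},
    {"customer": "Acme", "amount": 80.0, "order_date": date(2024, 3, 1)},
    {"customer": "Birch", "amount": 300.0, "order_date": date(2024, 2, 10)},
]


def summarize_by_customer(rows, today):
    """Group rows by customer and compute per-customer summary metrics."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["customer"]].append(row)

    summary = {}
    for customer, items in groups.items():
        total = sum(r["amount"] for r in items)
        last_order = max(r["order_date"] for r in items)
        summary[customer] = {
            "total_revenue": total,
            "order_count": len(items),
            "avg_order_value": total / len(items),
            "days_since_last_order": (today - last_order).days,
        }
    return summary


summary = summarize_by_customer(orders, today=date(2024, 3, 31))
```

Each raw transaction contributes to exactly one group, so three order rows collapse into two customer rows with four metrics each.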

Understand how your values are spread:

"Show me the distribution of order amounts as a histogram"
"What percentage of revenue comes from the top 10% of customers?"
"Create a box plot of delivery times by region"
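
As an illustration of the concentration question above, the following sketch computes the share of revenue held by the top slice of customers. It assumes revenue has already been totaled per customer; the `top_share` function and the figures are hypothetical, not taken from Querri.

```python
# Hypothetical revenue totals per customer.
revenue_by_customer = {
    "c01": 5000.0, "c02": 1000.0, "c03": 800.0, "c04": 700.0, "c05": 600.0,
    "c06": 500.0, "c07": 400.0, "c08": 400.0, "c09": 300.0, "c10": 300.0,
}


def top_share(totals, percent=10):
    """Return the fraction of total revenue from the top `percent` of customers."""
    ranked = sorted(totals.values(), reverse=True)
    # Integer arithmetic avoids floating-point surprises; always keep at least one.
    top_n = max(1, len(ranked) * percent // 100)
    return sum(ranked[:top_n]) / sum(ranked)


share = top_share(revenue_by_customer, percent=10)
```

A share far above the slice size (here, one customer out of ten holding half the revenue) signals a heavily skewed distribution.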

Look for patterns over time:

"Show me weekly trends for the past 6 months"
"Compare this quarter to the same quarter last year by product"
"Are there any seasonal patterns in this data?"
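
A weekly trend is, at its simplest, a sum per calendar week. The sketch below buckets hypothetical orders by ISO year and week number; the `weekly_totals` function and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records.
orders = [
    {"order_date": date(2024, 1, 1), "amount": 100.0},
    {"order_date": date(2024, 1, 3), "amount": 50.0},
    {"order_date": date(2024, 1, 8), "amount": 70.0},
    {"order_date": date(2024, 1, 14), "amount": 30.0},
]


def weekly_totals(rows):
    """Sum amounts per (ISO year, ISO week), sorted chronologically."""
    totals = defaultdict(float)
    for row in rows:
        iso = row["order_date"].isocalendar()  # (year, week, weekday)
        totals[(iso[0], iso[1])] += row["amount"]
    return dict(sorted(totals.items()))


totals = weekly_totals(orders)
```

Keying on the ISO year as well as the week number keeps week 1 of one year from merging with week 1 of another.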

Find and understand extreme values:

"Show me the top 20 and bottom 20 orders by value"
"Which customers have unusually high return rates?"
"Find transactions that are more than 3 standard deviations from average"
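
The "3 standard deviations" rule in the last prompt can be sketched as follows. This is an illustrative version in plain Python, not Querri's method; the `find_outliers` function and the sample amounts are hypothetical, and the choice of the population standard deviation is an assumption.

```python
import statistics


def find_outliers(values, threshold=3.0):
    """Return values farther than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) > threshold * stdev]


# Hypothetical order amounts: nineteen typical orders and one extreme one.
amounts = [100.0] * 19 + [1000.0]
outliers = find_outliers(amounts)
```

The extreme value itself inflates the mean and standard deviation, so in very small samples no value can be that far from the mean. The rule is most useful on columns with many rows.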

Get oriented quickly with these essential prompts.

"What's in this data?"

This gives you a summary including:

  • Number of rows and columns
  • Column names and types
  • Sample values from each column

"How many rows and columns do I have?"

Knowing the size helps you:

  • Estimate how long operations will take
  • Decide if you need to filter or sample
  • Understand the scope of your data
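
For readers curious about what such an overview involves, here is a minimal sketch in plain Python. It is not Querri's implementation; the `overview` function and the sample rows are hypothetical, and it assumes the data is already loaded as a list of dictionaries.

```python
def overview(rows, sample_size=3):
    """Summarize row count, column count, observed types, and sample values."""
    columns = list(rows[0].keys()) if rows else []
    details = {}
    for col in columns:
        # Ignore missing values when inferring types and picking samples.
        present = [row.get(col) for row in rows if row.get(col) is not None]
        details[col] = {
            "types": sorted({type(v).__name__ for v in present}),
            "samples": present[:sample_size],
        }
    return {"rows": len(rows), "columns": len(columns), "details": details}


# Hypothetical dataset.
rows = [
    {"order_id": 1, "status": "shipped", "amount": 19.99},
    {"order_id": 2, "status": "pending", "amount": None},
    {"order_id": 3, "status": "shipped", "amount": 5.00},
]
result = overview(rows, sample_size=2)
```

A column that reports more than one type is worth a closer look, since mixed types often indicate import problems.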

Get deeper understanding of individual columns.

"Show me statistics for the revenue column"

This returns:

  • Count: How many non-null values
  • Mean: The average
  • Median: The middle value (less affected by outliers)
  • Min/Max: The range
  • Standard deviation: How spread out values are

Statistic                  | What it reveals
Mean vs Median far apart   | You have outliers or skewed data
High standard deviation    | Values vary widely
Min is negative            | Might be returns, corrections, or errors
Max is very high           | Possible outliers or data entry errors
Count < total rows         | You have null values
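
The statistics listed above can be reproduced with Python's standard `statistics` module. The sketch below is illustrative only; the `describe` function and the revenue values are hypothetical, with one missing value and one extreme value included on purpose.

```python
import statistics


def describe(values):
    """Compute basic statistics over the non-null values in a column."""
    present = [v for v in values if v is not None]
    return {
        "total_rows": len(values),
        "count": len(present),  # non-null values only
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
    }


# Hypothetical revenue column.
revenue = [20.0, 25.0, 30.0, None, 35.0, 40.0, 5000.0]
stats = describe(revenue)
```

Here the count is lower than the total row count because of the null, and the single large value pulls the mean far above the median, which stays at 32.5.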

Understand what values exist in your categorical columns.

"What are the unique categories in the status column?"
"Show me all the different product types"

This helps you:

  • Understand your categorization
  • Spot unexpected values (typos, inconsistencies)
  • Plan your groupings for analysis

"How many orders are in each status?"
"Count customers by segment"

This reveals:

  • The distribution of your data
  • Whether categories are balanced or skewed
  • Potential data quality issues (for example, too many “Unknown” values)
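
Counting values per category is a frequency table. A minimal sketch with `collections.Counter` follows; the status values are hypothetical and include a deliberate capitalization inconsistency.

```python
from collections import Counter

# Hypothetical status column with one inconsistent spelling ("Shipped").
statuses = [
    "shipped", "shipped", "pending", "Shipped",
    "unknown", "shipped", "unknown", "cancelled",
]

counts = Counter(statuses)
total = len(statuses)

# Share of rows per category, most common first.
shares = {value: n / total for value, n in counts.most_common()}
```

The table exposes both issues mentioned above: "Shipped" appears as its own category next to "shipped", and a quarter of the rows are "unknown".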

Catch problems early before they affect your analysis.

"How many null values are in each column?"
"Which columns have missing data?"
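
Conceptually, these prompts count the empty cells in each column. The sketch below shows one way to do that in plain Python; the `count_missing` function and the sample rows are hypothetical, and treating empty strings as missing is an assumption that may not suit every dataset.

```python
def count_missing(rows, columns):
    """Count missing values per column: None, absent keys, or blank strings."""
    missing = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)  # absent keys come back as None
            if value is None or (isinstance(value, str) and value.strip() == ""):
                missing[col] += 1
    return missing


# Hypothetical rows; the third has no "email" key at all.
rows = [
    {"order_id": 1, "customer_id": "C1", "email": "a@example.com"},
    {"order_id": 2, "customer_id": None, "email": ""},
    {"order_id": 3, "customer_id": "C3"},
]
missing = count_missing(rows, ["order_id", "customer_id", "email"])
```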

Null values can:

  • Cause calculations to fail or return wrong results
  • Indicate data collection problems
  • Require handling before analysis

"Are there duplicate rows?"
"Find duplicate customer IDs"
"How many unique order IDs vs total rows?"
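
These checks distinguish two kinds of duplication: a key value that repeats, and an entire row that repeats. A minimal sketch, with hypothetical function names and sample data:

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return each value of `key` that appears more than once, with its count."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


def exact_duplicate_rows(rows):
    """Count rows that are exact copies of an earlier row."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return sum(n - 1 for n in counts.values() if n > 1)


# Hypothetical orders; order 102 was imported twice.
rows = [
    {"order_id": 101, "customer_id": "C1"},
    {"order_id": 102, "customer_id": "C1"},
    {"order_id": 102, "customer_id": "C1"},
    {"order_id": 103, "customer_id": "C2"},
]
dup_orders = duplicate_keys(rows, "order_id")
dup_customers = duplicate_keys(rows, "customer_id")
unique_order_ids = len({row["order_id"] for row in rows})
```

A repeated order ID is probably an import problem, while a repeated customer ID may simply mean the customer ordered more than once.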

Duplicates can:

  • Inflate your counts and totals
  • Come from data import issues
  • Indicate legitimate scenarios (multiple orders per customer)

"Are there any negative amounts in the revenue column?"
"Show rows where email doesn't contain @"
"Find orders with dates in the future"
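
Each of these prompts is a simple rule applied to every row. The sketch below expresses the three rules in plain Python; the `find_invalid` function, the column names, and the sample rows are hypothetical, and the "@" test is only a rough screen, not full email validation.

```python
from datetime import date


def find_invalid(rows, today):
    """Return (row index, problem description) pairs for rule violations."""
    problems = []
    for index, row in enumerate(rows):
        if row["amount"] < 0:
            problems.append((index, "negative amount"))
        if "@" not in row["email"]:
            problems.append((index, "email missing @"))
        if row["order_date"] > today:
            problems.append((index, "order date in the future"))
    return problems


# Hypothetical rows.
rows = [
    {"amount": 50.0, "email": "a@example.com", "order_date": date(2024, 5, 1)},
    {"amount": -20.0, "email": "b@example.com", "order_date": date(2024, 5, 2)},
    {"amount": 10.0, "email": "c.example.com", "order_date": date(2025, 1, 1)},
]
problems = find_invalid(rows, today=date(2024, 6, 1))
```

Passing `today` explicitly keeps the check reproducible when the same data is reviewed on a later date.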

Invalid values might be:

  • Data entry errors
  • System glitches
  • Legitimate edge cases you need to understand

"Show me all variations of state names"
"Are there different spellings of product names?"

Inconsistencies like “CA” vs “California” vs “calif” will cause grouping problems.
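
To see why this matters for grouping, the sketch below counts a hypothetical state column before and after normalization. The `STATE_ALIASES` mapping and `normalize_state` function are assumptions made for illustration; a real mapping would need to be built from the variations actually found in the data.

```python
from collections import Counter

# Hypothetical alias table mapping cleaned-up spellings to one canonical code.
STATE_ALIASES = {
    "ca": "CA", "california": "CA", "calif": "CA",
    "ny": "NY", "new york": "NY",
}


def normalize_state(raw):
    """Map known spellings to a canonical code; leave unknown values unchanged."""
    key = raw.strip().lower().rstrip(".")
    return STATE_ALIASES.get(key, raw.strip())


raw_states = ["CA", "California", "calif", " ca ", "Calif.", "NY", "Texsa"]
raw_counts = Counter(raw_states)
clean_counts = Counter(normalize_state(s) for s in raw_states)
```

Before normalization every spelling forms its own group. Afterward the five California variants collapse into one, and the unmapped "Texsa" remains visible for manual review.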

For important analyses, work through this checklist:

  • How many rows and columns?
  • What time period does it cover?
  • What does each column represent?
  • Any null values in key columns?
  • Any duplicate records?
  • Any obviously invalid values?
  • What are the unique values in categorical columns?
  • Are numeric columns in reasonable ranges?
  • Are there unexpected outliers?
  • Do key columns join properly to other data?
  • Are there expected patterns (e.g., revenue increasing over time)?

Exploration answers “what do I have?” Analysis answers “what does it mean?”

Exploration finds…                     | Analysis might ask…
Revenue ranges from $10 to $50,000     | What’s driving the high-value orders?
30% of customers are in California     | How does CA performance compare to other states?
Sales dropped in March                 | What caused the March decline?
5% of orders have null customer IDs    | Are these anonymous purchases? Guest checkouts?

Use your exploration findings to guide your analysis questions.

  1. Start broad, then narrow — Overview first, then drill into specifics
  2. Use multiple views — Don’t rely on a single chart or statistic
  3. Question everything — If something looks odd, investigate it
  4. Document as you go — Note what you learn for later reference
  5. Don’t skip this step — Even familiar data changes over time

Now that you understand your data: