Statistics Tool

The Statistics tool runs quantitative statistical analyses on your data to answer questions about relationships, differences, trends, and patterns. It generates Python code using a suite of scientific Python libraries (statsmodels, scipy.stats, scikit-learn, and numpy), executes it against your dataset, and returns both printed findings and interactive visualizations.

Key capabilities:

  • Test relationships between categorical variables using chi-squared tests of independence
  • Model relationships between numeric variables with OLS linear regression (R², coefficients, p-values)
  • Test whether groups differ significantly using ANOVA (Analysis of Variance)
  • Decompose time series into trend, seasonal, and residual components with STL decomposition
  • Run multiple analyses in one step — e.g., a regression followed by a correlation matrix
  • Generate beginner-friendly visualizations to support each finding

| Scenario | Example |
| --- | --- |
| Testing variable relationships | “Is there a relationship between customer satisfaction and flight distance?” |
| Measuring correlation | “What’s the relationship between marketing spend and revenue?” |
| Comparing group means | “Does fertilizer type significantly affect crop yield?” |
| Understanding seasonality | “How do sales break down into trend, seasonal, and noise components?” |
| Validating assumptions | “Is the difference in churn rates between plans statistically significant?” |
| Exploratory data analysis | “What’s driving variation in support ticket resolution time?” |

The Statistics tool isn’t the right choice when:

  • You need to predict future values — use the Forecaster for time series forecasting
  • You need row-by-row AI analysis — use the Researcher for classification and extraction
  • You need financial statements — use the Financials tool for P&L and cash flow
  • Simple aggregations or calculations — use standard analysis steps for sums, averages, and filters
  • You need to reuse a trained model later — the Statistics tool is self-contained and does not persist models across steps

Describe your analytical question in plain language. The agent recognizes statistical analysis requests and invokes the Statistics tool automatically.

Testing relationships:

  • “Is there a statistically significant relationship between customer tier and churn?”
  • “Does customer satisfaction depend on how long they’ve been a customer?”
  • “Test whether region and product category are independent”

Regression and correlation:

  • “What’s the relationship between ad spend and conversions?”
  • “How well does deal size predict time to close?”
  • “Run a regression of revenue on headcount and marketing spend”

Group comparisons (ANOVA):

  • “Do the three pricing plans have significantly different average usage?”
  • “Does support channel affect resolution time?”
  • “Is there a significant difference in NPS by cohort?”

Time series decomposition:

  • “How do sales break down into trend and seasonality?”
  • “What’s the seasonal pattern in support ticket volume?”
  • “Decompose monthly revenue into its components”

Exploratory analysis:

  • “What factors are most correlated with customer lifetime value?”
  • “Run a full statistical summary of this dataset”

The Statistics tool prints its findings directly in the analysis output. For each test or model, you’ll typically see:

| Output | What It Means |
| --- | --- |
| P-value | Probability of seeing a result at least this extreme if there were no real effect; below 0.05 is conventionally treated as significant |
| R² (R-squared) | For regression: what fraction of variation in the outcome the model explains (0–1) |
| Coefficient | For regression: how much the outcome changes per unit increase in a predictor, holding other predictors fixed |
| F-statistic | For ANOVA: how much group means differ relative to variation within groups |
| Chi-squared statistic | For categorical tests: how far observed counts deviate from the counts expected under independence |
| Correlation coefficient | Strength and direction of the linear relationship between two variables (−1 to +1) |

A p-value below 0.05 is the conventional threshold for statistical significance. It means that, if there were no real effect, a pattern at least this strong would show up in fewer than 5% of samples. It is not the probability that the finding is wrong. The tool always reports exact p-values so you can apply your own threshold.
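
To make the p-value concrete, here is a minimal sketch of a permutation test using only Python's standard library. The data and the function name `permutation_p_value` are invented for illustration; the Statistics tool relies on library routines rather than this approach.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels carry no information
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Share of shuffles with a difference at least as large as the observed one
    return extreme / n_shuffles

# Hypothetical resolution times (hours) for two support channels
chat = [2.1, 2.4, 1.9, 2.8, 2.2, 2.0, 2.5, 2.3]
email = [3.9, 4.4, 3.6, 4.8, 4.1, 3.7, 4.5, 4.2]
p_value = permutation_p_value(chat, email)
print(f"p-value: {p_value:.4f}")
```

A small value here says that random relabeling almost never produces a gap as large as the real one, which is what "statistically significant" is shorthand for.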

The Statistics tool creates visualizations to support every key finding:

  • Scatter plots with trendlines — for regression analyses
  • Box plots — for ANOVA and group comparisons
  • Grouped bar/histogram charts — for chi-squared categorical breakdowns
  • STL component charts — four-panel plots showing observed, trend, seasonal, and residual components
  • Correlation matrices — for multi-variable exploratory analyses

When an analysis produces multiple charts, they are composed into a single figure with labeled subplots.

Chi-Squared Test of Independence

Tests whether two categorical variables are associated or independent.

Use when: Both variables are categorical (e.g., plan type, region, satisfaction label)

What you get:

  • Chi-squared statistic and p-value
  • Degrees of freedom
  • Cross-tabulation of observed vs. expected counts
  • Grouped histogram visualization
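
The numeric outputs listed above can be reproduced in a few lines. This is a minimal sketch, assuming `scipy` and `numpy` are installed; the counts are invented, and the code the tool generates may differ.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical observed counts: rows = support channel, columns = satisfaction
#                     Low  Medium  High
observed = np.array([
    [30, 45, 25],   # Chat
    [50, 35, 15],   # Email
    [20, 40, 40],   # Phone
])

chi2, p_value, dof, expected = chi2_contingency(observed)

print(f"chi-squared = {chi2:.2f}, dof = {dof}, p-value = {p_value:.4g}")
print("expected counts if channel and satisfaction were independent:")
print(expected.round(1))
```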

Example prompt:

"Is there a relationship between customer satisfaction rating and the support channel they used?"

OLS Linear Regression

Fits a linear model to quantify the relationship between one or more numeric predictors and a numeric outcome.

Use when: You want to understand or measure how one variable affects another, or predict an outcome from known inputs

What you get:

  • Full model summary (R², adjusted R², F-statistic)
  • Coefficients and p-values for each predictor
  • Pearson correlation with p-value
  • Scatter plot with OLS trendline

Example prompt:

"What is the relationship between deal size and time to close?"

ANOVA (Analysis of Variance)

Tests whether the means of a numeric variable differ significantly across groups defined by a categorical variable.

Use when: You have one categorical variable (the factor) and want to know if it has a significant effect on a numeric outcome

What you get:

  • ANOVA table with F-statistic and p-value
  • Sum of squares breakdown (between-group vs. within-group)
  • Box plot showing distribution per group
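
A minimal sketch of a one-way ANOVA, assuming `scipy` is available. The three groups are synthetic, and the sum-of-squares breakdown is computed by hand to show how it relates to the F-statistic.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(7)

# Synthetic monthly active days for three plans (the means are made up)
free = rng.normal(8, 3, size=60)
pro = rng.normal(12, 3, size=60)
enterprise = rng.normal(15, 3, size=60)

f_stat, p_value = f_oneway(free, pro, enterprise)
print(f"F = {f_stat:.2f}, p-value = {p_value:.3g}")

# Sum-of-squares breakdown: between-group vs. within-group variation
groups = [free, pro, enterprise]
grand_mean = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}")
```

Dividing each sum of squares by its degrees of freedom (2 between groups, 177 within groups here) and taking the ratio gives the same F-statistic.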

Example prompt:

"Does pricing plan have a significant effect on monthly active usage?"

STL Decomposition

Decomposes a time series into three components: trend (long-term direction), seasonal (repeating cycle), and residual (unexplained noise).

Use when: You have time series data and want to understand its underlying structure

What you get:

  • Summary statistics for each component (mean, standard deviation)
  • Linear regression on the trend component (slope and significance)
  • Four-panel chart: observed, trend, seasonal, residual

Example prompt:

"How do monthly sales break down into trend and seasonal patterns?"

The Statistics tool works best when categorical variables are already encoded. Before running analyses involving categories, prepare them:

"Convert the satisfaction column to a numeric scale (1=Low, 2=Medium, 3=High), then run a correlation analysis"
"One-hot encode the plan type column, then run a regression"

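For reference, the two encoding prompts above correspond roughly to the following pandas operations. This is a sketch with made-up data; the column names `satisfaction` and `plan_type` are taken from the prompts, and everything else is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "satisfaction": ["Low", "High", "Medium", "High"],
    "plan_type": ["Free", "Pro", "Enterprise", "Pro"],
})

# Ordinal encoding: the categories have a natural order
df["satisfaction_score"] = df["satisfaction"].map({"Low": 1, "Medium": 2, "High": 3})

# One-hot encoding: no natural order; drop one level so the columns are not redundant
plan_dummies = pd.get_dummies(df["plan_type"], prefix="plan", drop_first=True, dtype=int)
encoded = pd.concat([df, plan_dummies], axis=1)
print(encoded)
```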
Missing values and outliers can skew statistical results significantly:

"Filter out rows where revenue is null or zero, then run the regression"
"Remove outliers more than 3 standard deviations from the mean before the analysis"

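The two cleaning prompts above translate roughly into the pandas steps below. The data is invented; note that an extreme outlier inflates the standard deviation it is measured against, so the 3-standard-deviation rule works best with a reasonable number of rows.

```python
import pandas as pd

# 30 ordinary values plus one extreme value, one missing value, and one zero
values = [120.0 + (i % 7) * 3 for i in range(30)] + [5000.0, None, 0.0]
df = pd.DataFrame({"revenue": values})

# Drop rows where revenue is missing or zero
clean = df[df["revenue"].notna() & (df["revenue"] != 0)]

# Keep rows within 3 standard deviations of the mean
z_scores = (clean["revenue"] - clean["revenue"].mean()) / clean["revenue"].std()
filtered = clean[z_scores.abs() <= 3]

print(f"{len(df)} rows -> {len(clean)} after null/zero filter -> {len(filtered)} after outlier filter")
```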
For STL decomposition and time series analyses, aggregate to a consistent time grain before running:

"Aggregate to monthly totals by summing revenue, then decompose the time series"

| Your Question | Data Type Needed | Right Test |
| --- | --- | --- |
| Are two categories related? | Two categorical columns | Chi-squared |
| Does X predict Y? | Two numeric columns | OLS Regression |
| Do groups differ in their averages? | One categorical + one numeric column | ANOVA |
| What’s driving trends over time? | Date + numeric column (20+ points) | STL Decomposition |
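
The table above can be read as a simple lookup from column types to a test. The helper below is a hypothetical sketch of that mapping, not the tool's actual selection logic.

```python
def suggest_test(x_kind: str, y_kind: str) -> str:
    """Map a pair of column kinds to a test, following the table above.

    Kinds are "categorical", "numeric", or "date". Hypothetical helper
    written for illustration only.
    """
    kinds = {x_kind, y_kind}
    if kinds == {"categorical"}:
        return "Chi-squared"
    if kinds == {"numeric"}:
        return "OLS Regression"
    if kinds == {"categorical", "numeric"}:
        return "ANOVA"
    if kinds == {"date", "numeric"}:
        return "STL Decomposition"
    raise ValueError(f"No suggested test for {x_kind!r} and {y_kind!r}")

print(suggest_test("categorical", "numeric"))
```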

You don’t need to specify which statistical method to use. Just ask your question and the tool selects the appropriate analysis:

"Is there a significant relationship between customer age group and churn?"

The tool will recognize that age group is categorical and churn is binary, and run a chi-squared test.

You can request several related tests in a single step:

"Run a regression of revenue on headcount, then also show a correlation matrix for all numeric columns"

The tool will run both analyses and compose the visualizations into a single output.

After a broad analysis, ask focused follow-up questions:

Step 1: "What factors correlate most with customer churn?"
Step 2: "Run a regression of churn rate on the top three factors from the correlation analysis"
Step 3: "Chart actual vs. predicted churn for each region"

The Statistics tool runs its code in an isolated execution environment. If it trains a model (e.g., an OLS regression), that model is gone when the step finishes. Plan downstream steps accordingly — extract coefficients, predictions, or summaries as printed output rather than relying on a model object in a later step.
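
One way to plan for this is to print fitted parameters as plain values that a later step can reuse. The sketch below assumes `numpy` and uses synthetic data; the `predict` helper and the JSON output format are illustrative choices, not something the tool prescribes.

```python
import json
import numpy as np

rng = np.random.default_rng(1)
headcount = rng.uniform(10, 200, size=100)
revenue = 50 + 3.0 * headcount + rng.normal(0, 20, size=100)

# Fit revenue = intercept + slope * headcount with ordinary least squares
design = np.column_stack([np.ones_like(headcount), headcount])
(intercept, slope), *_ = np.linalg.lstsq(design, revenue, rcond=None)

# Capture what a later step needs as plain numbers, not as a model object
summary = {"intercept": round(float(intercept), 3), "slope": round(float(slope), 3)}
print(json.dumps(summary))

def predict(x, params):
    """Rebuild predictions from the printed coefficients alone."""
    return params["intercept"] + params["slope"] * x

print(predict(100, summary))
```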

Example 1: Testing Whether Plan Type Affects Usage

Starting data: Customer records with plan_type (Free, Pro, Enterprise) and monthly_active_days

Step 1: Validate your data

"Show me the count and average monthly_active_days for each plan_type"

Step 2: Run ANOVA

"Does plan type have a statistically significant effect on monthly active days?"

Step 3: Interpret

"Summarize the ANOVA findings — is the difference significant, and which plan has the highest average usage?"

Example 2: Measuring the Impact of Ad Spend on Revenue

Starting data: Weekly records with ad_spend and revenue

Step 1: Clean the data

"Filter to weeks where ad_spend is greater than zero"

Step 2: Run regression

"What is the relationship between ad spend and revenue? Run a linear regression."

Step 3: Visualize

"Chart the regression with actual data points and the fitted line"

Example 3: Decomposing Revenue Seasonality

Starting data: Three years of monthly revenue

Step 1: Prepare

"Aggregate to monthly totals and sort by month ascending"

Step 2: Decompose

"Decompose monthly revenue into trend, seasonal, and residual components"

Step 3: Quantify

"What's the average seasonal lift in Q4 based on the decomposition?"

Example 4: Identifying Churn Drivers

Starting data: Customer records with multiple numeric and categorical features

Step 1: Explore correlations

"Run a correlation matrix of all numeric columns against the churn flag"

Step 2: Test categorical factors

"Is there a significant relationship between support_tier and churn? Run a chi-squared test."

Step 3: Model the top drivers

"Run a logistic regression of churn on contract_length, num_support_tickets, and days_since_last_login"

“Regression fails to fit or returns errors”

Cause: The data has multicollinearity, near-constant columns, or extreme outliers that prevent the model from fitting.

Fix:

  • Remove highly correlated predictors: “Check the correlation matrix and drop columns with correlation above 0.9”
  • Remove outliers: “Filter to rows where the value is within 3 standard deviations of the mean”
  • Standardize inputs: “Normalize all numeric columns before running the regression”
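
The first and third fixes above can be sketched in pandas as follows. The data is synthetic, the 0.9 cutoff comes from the prompt above, and the column names are invented for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 200
spend = rng.normal(100, 20, size=n)
df = pd.DataFrame({
    "ad_spend": spend,
    "ad_spend_eur": spend * 0.92 + rng.normal(0, 0.5, size=n),  # near-duplicate column
    "headcount": rng.normal(50, 10, size=n),
})

corr = df.corr().abs()
# Look only above the diagonal so each pair of columns is checked once
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
reduced = df.drop(columns=to_drop)
print("dropped:", to_drop)

# Standardize the remaining columns (mean 0, standard deviation 1)
standardized = (reduced - reduced.mean()) / reduced.std()
```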

"P-values are all near zero but the model seems wrong”

Cause: Very large datasets can make even trivial effects statistically significant.

Fix: Look at the effect size (coefficients, R²) alongside p-values. A significant p-value with a near-zero coefficient or low R² indicates that the effect, while real, may not be practically meaningful.
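
A quick simulation makes the point. This sketch assumes `numpy` and `scipy`, and uses synthetic data with a deliberately tiny true slope.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(0, 1, size=n)
y = 0.02 * x + rng.normal(0, 1, size=n)   # a real but tiny effect

fit = linregress(x, y)
r_squared = fit.rvalue ** 2
print(f"slope = {fit.slope:.4f}, p-value = {fit.pvalue:.2e}, R² = {r_squared:.5f}")
```

The p-value is far below 0.05 because the sample is large, yet the model explains well under 1% of the variation in `y`.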

“STL decomposition fails or produces flat results”

Cause: Fewer than two full seasonal cycles in the data, or the time series isn’t indexed correctly.

Fix:

  • Ensure data covers at least 2 full cycles of the seasonal period you expect
  • Aggregate to a consistent time grain first: “Convert to monthly totals sorted by date”
  • Specify the period explicitly in your prompt: “Decompose assuming a 12-month seasonal cycle”
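
The checks above can be sketched as a small pandas helper. The function name `prepare_for_decomposition` and its parameters are hypothetical, and the daily data is made up.

```python
import pandas as pd

def prepare_for_decomposition(df, date_col, value_col, period=12, min_cycles=2):
    """Aggregate to monthly totals and check there is enough history.

    Hypothetical helper: raises ValueError when the series covers fewer
    than `min_cycles` full seasonal periods.
    """
    monthly = (
        df.assign(**{date_col: pd.to_datetime(df[date_col])})
          .set_index(date_col)[value_col]
          .resample("MS")   # one row per calendar month, labeled by month start
          .sum()
          .sort_index()
    )
    required = min_cycles * period
    if len(monthly) < required:
        raise ValueError(f"need at least {required} monthly points, got {len(monthly)}")
    return monthly

# Three years of made-up daily revenue
days = pd.date_range("2021-01-01", "2023-12-31", freq="D")
daily = pd.DataFrame({"date": days, "revenue": 10.0})
monthly_revenue = prepare_for_decomposition(daily, "date", "revenue", period=12)
print(monthly_revenue.head(3))
```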

“Results change between runs”

Cause: Some methods (like certain clustering or random-initialization algorithms) have stochastic elements.

Fix: Ask for a random seed: “Run the analysis with a fixed random seed of 42 for reproducibility.”
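
To show what a fixed seed buys you, here is a minimal sketch of a bootstrap confidence interval with `numpy`. The function `bootstrap_mean_ci` and the data are invented for the example; the same idea applies to any method with a random component.

```python
import numpy as np

def bootstrap_mean_ci(values, n_resamples=2000, seed=42):
    """95% bootstrap confidence interval for the mean, reproducible via `seed`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Each row is one resample of the data, drawn with replacement
    resamples = rng.choice(values, size=(n_resamples, len(values)), replace=True)
    low, high = np.percentile(resamples.mean(axis=1), [2.5, 97.5])
    return float(low), float(high)

data = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.4, 5.8]
first = bootstrap_mean_ci(data)
second = bootstrap_mean_ci(data)
print(first, second)
```

Both calls return the same interval because they start from the same seed; changing or omitting the seed would give slightly different bounds each time.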

  1. Start with exploration — look at distributions and summaries before jumping to hypothesis tests
  2. Clean before you analyze — outliers and missing values are the most common source of misleading results
  3. Match the test to your data types — categorical vs. numeric column combinations map to specific tests
  4. Report effect size, not just significance — a p-value alone doesn’t tell you how important a finding is
  5. Run multiple related tests together — ask for correlation, regression, and visualization in one step
  6. Prepare categorical variables — encode or label categories before running models that expect numeric inputs