Statistics Tool

The Statistics tool runs quantitative statistical analyses on your data to answer questions about relationships, differences, trends, and patterns. It generates Python code using industry-standard statistical libraries, executes it against your dataset, and returns both printed findings and interactive visualizations.

What the Statistics Tool Does

Under the hood, the tool draws on a suite of scientific Python libraries (statsmodels, scipy.stats, scikit-learn, and numpy) to answer rigorous quantitative questions.

Key capabilities:

Test relationships between categorical variables using chi-squared tests of independence
Model relationships between numeric variables with OLS linear regression (R², coefficients, p-values)
Test whether groups differ significantly using ANOVA (Analysis of Variance)
Decompose time series into trend, seasonal, and residual components with STL decomposition
Run multiple analyses in one step — e.g., a regression followed by a correlation matrix
Generate beginner-friendly visualizations to support each finding

When to Use It

Scenario	Example
Testing variable relationships	“Is there a relationship between customer satisfaction and flight distance?”
Measuring correlation	“What’s the relationship between marketing spend and revenue?”
Comparing group means	“Does fertilizer type significantly affect crop yield?”
Understanding seasonality	“How do sales break down into trend, seasonal, and noise components?”
Validating assumptions	“Is the difference in churn rates between plans statistically significant?”
Exploratory data analysis	“What’s driving variation in support ticket resolution time?”

When NOT to Use It

The Statistics tool isn’t the right choice when:

You need to predict future values — use the Forecaster for time series forecasting
You need row-by-row AI analysis — use the Researcher for classification and extraction
You need financial statements — use the Financials tool for P&L and cash flow
Simple aggregations or calculations — use standard analysis steps for sums, averages, and filters
You need to reuse a trained model later — the Statistics tool is self-contained and does not persist models across steps

How to Prompt for Statistical Analysis

Describe your analytical question in plain language. The agent recognizes statistical analysis requests and invokes the Statistics tool automatically.

Effective Prompts

Testing relationships:

“Is there a statistically significant relationship between customer tier and churn?”
“Does customer satisfaction depend on how long they’ve been a customer?”
“Test whether region and product category are independent”

Regression and correlation:

“What’s the relationship between ad spend and conversions?”
“How well does deal size predict time to close?”
“Run a regression of revenue on headcount and marketing spend”

Group comparisons (ANOVA):

“Do the three pricing plans have significantly different average usage?”
“Does support channel affect resolution time?”
“Is there a significant difference in NPS by cohort?”

Time series decomposition:

“How do sales break down into trend and seasonality?”
“What’s the seasonal pattern in support ticket volume?”
“Decompose monthly revenue into its components”

Exploratory analysis:

“What factors are most correlated with customer lifetime value?”
“Run a full statistical summary of this dataset”

Understanding Your Results

Statistical Output

The Statistics tool prints its findings directly in the analysis output. For each test or model, you’ll typically see:

Output	What It Means
P-value	Probability of seeing a result at least this extreme if there were no real effect; below 0.05 is typically considered significant
R² (R-squared)	For regression: what fraction of variation in the outcome the model explains (0–1)
Coefficient	For regression: how much the outcome changes per one-unit increase in a predictor, holding the other predictors constant
F-statistic	For ANOVA: how much group means differ relative to variation within groups
Chi-squared statistic	For categorical tests: how far observed counts deviate from expected
Correlation coefficient	Strength of linear relationship between two variables (−1 to +1)

Reading Significance

A p-value below 0.05 is the conventional threshold for statistical significance. It means that, if there were no real effect, a pattern at least this strong would show up in fewer than 5% of samples. It is not the probability that the result is due to chance. The tool always reports exact p-values so you can apply your own threshold.

Visualizations

The Statistics tool creates visualizations to support every key finding:

Scatter plots with trendlines — for regression analyses
Box plots — for ANOVA and group comparisons
Grouped bar/histogram charts — for chi-squared categorical breakdowns
STL component charts — four-panel plots showing observed, trend, seasonal, and residual components
Correlation matrices — for multi-variable exploratory analyses

When an analysis produces multiple charts, they are composed into a single figure with labeled subplots.

Statistical Methods Reference

Chi-Squared Test of Independence

Tests whether two categorical variables are associated or independent.

Use when: Both variables are categorical (e.g., plan type, region, satisfaction label)

What you get:

Chi-squared statistic and p-value
Degrees of freedom
Cross-tabulation of observed vs. expected counts
Grouped histogram visualization
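
As a rough illustration of how these outputs can be produced, the sketch below cross-tabulates two categorical columns and passes the table to scipy.stats.chi2_contingency. The column names and the twelve sample rows are invented for the example, and the code the tool actually generates may look different.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical data: satisfaction rating and support channel for 12 customers.
# A sample this small is only for illustration; expected counts below about 5
# make the chi-squared approximation unreliable.
df = pd.DataFrame({
    "satisfaction": ["Low", "High", "High", "Low", "Medium", "High",
                     "Low", "Medium", "High", "Low", "Medium", "High"],
    "channel": ["Phone", "Chat", "Chat", "Phone", "Email", "Chat",
                "Email", "Phone", "Email", "Phone", "Chat", "Email"],
})

# Cross-tabulate the observed counts, then test for independence.
observed = pd.crosstab(df["satisfaction"], df["channel"])
chi2, p_value, dof, expected = chi2_contingency(observed)

print(observed)
print(f"chi2={chi2:.3f}, p={p_value:.4f}, dof={dof}")
```

The expected array holds the counts you would see if the two variables were independent, which is what the observed counts are compared against.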

Example prompt:

"Is there a relationship between customer satisfaction rating and the support channel they used?"

OLS Linear Regression

Fits a linear model to quantify the relationship between one or more numeric predictors and a numeric outcome.

Use when: You want to measure how strongly one variable is associated with another, or predict an outcome from known inputs

What you get:

Full model summary (R², adjusted R², F-statistic)
Coefficients and p-values for each predictor
Pearson correlation with p-value
Scatter plot with OLS trendline

Example prompt:

"What is the relationship between deal size and time to close?"

ANOVA (Analysis of Variance)

Tests whether the means of a numeric variable differ significantly across groups defined by a categorical variable.

Use when: You have one categorical variable (the factor) and want to know if it has a significant effect on a numeric outcome

What you get:

ANOVA table with F-statistic and p-value
Sum of squares breakdown (between-group vs. within-group)
Box plot showing distribution per group

Example prompt:

"Does pricing plan have a significant effect on monthly active usage?"

STL Decomposition

Decomposes a time series into three components: trend (long-term direction), seasonal (repeating cycle), and residual (unexplained noise).

Use when: You have time series data and want to understand its underlying structure

What you get:

Summary statistics for each component (mean, standard deviation)
Linear regression on the trend component (slope and significance)
Four-panel chart: observed, trend, seasonal, residual

Example prompt:

"How do monthly sales break down into trend and seasonal patterns?"

Preparing Your Data for Best Results

Encode Categorical Variables First

Regression and correlation analyses work best when categorical variables are already encoded as numbers, so prepare them before running those analyses (chi-squared tests and ANOVA can use the category labels as they are):

"Convert the satisfaction column to a numeric scale (1=Low, 2=Medium, 3=High), then run a correlation analysis"

"One-hot encode the plan type column, then run a regression"

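In pandas terms, the two encoding prompts above correspond roughly to an ordinal mapping and a one-hot encoding. The sketch below assumes hypothetical satisfaction and plan_type columns; it is an illustration, not the code the tool runs.

```python
import pandas as pd

# Hypothetical customer records.
df = pd.DataFrame({
    "satisfaction": ["Low", "High", "Medium", "High"],
    "plan_type": ["Free", "Pro", "Enterprise", "Pro"],
})

# Ordinal encoding: the categories have a natural order, so map them to 1-3.
df["satisfaction_score"] = df["satisfaction"].map({"Low": 1, "Medium": 2, "High": 3})

# One-hot encoding: plan types have no order, so create 0/1 indicator columns.
# drop_first=True drops one category (the baseline) to avoid perfectly
# collinear columns in a regression with an intercept.
encoded = pd.get_dummies(df, columns=["plan_type"], drop_first=True, dtype=int)

print(encoded)
```

Use an ordinal scale only when the order is meaningful; for unordered categories, one-hot columns avoid implying that one category is "larger" than another.
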
Clean Your Data

Missing values and outliers can skew statistical results significantly:

"Filter out rows where revenue is null or zero, then run the regression"

"Remove outliers more than 3 standard deviations from the mean before the analysis"

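As a sketch of what these cleaning steps amount to, the pandas code below drops null and zero values and then removes rows more than 3 standard deviations from the mean. The revenue column and its values are synthetic assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic revenue column with one missing value, one zero, and one outlier.
rng = np.random.default_rng(3)
revenue = rng.normal(1000, 100, size=100)
revenue[0] = np.nan
revenue[1] = 0.0
revenue[2] = 5000.0
df = pd.DataFrame({"revenue": revenue})

# Step 1: drop rows where revenue is null or zero.
clean = df[df["revenue"].notna() & (df["revenue"] != 0)]

# Step 2: keep rows within 3 standard deviations of the mean.
z = (clean["revenue"] - clean["revenue"].mean()) / clean["revenue"].std()
clean = clean[z.abs() <= 3]

print(f"rows before: {len(df)}, rows after: {len(clean)}")
```

Note that the mean and standard deviation are themselves pulled by extreme values, so a single pass can miss milder outliers that sit next to a very large one.
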
Aggregate Time Series First

For STL decomposition and time series analyses, aggregate to a consistent time grain before running:

"Aggregate to monthly totals by summing revenue, then decompose the time series"

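A minimal pandas sketch of this aggregation step follows, assuming hypothetical order_date and revenue columns with several records per month.

```python
import pandas as pd

# Hypothetical transaction-level data with irregular dates.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-28", "2024-03-15"]
    ),
    "revenue": [100.0, 150.0, 200.0, 50.0, 300.0],
})

# Resample to one row per month ("MS" = month start), summing revenue,
# and keep the result sorted by date.
monthly = (
    orders.set_index("order_date")["revenue"]
    .resample("MS")
    .sum()
    .sort_index()
)

print(monthly)
```

Resampling also inserts a row for any month with no records, which gives the evenly spaced series that a decomposition expects.
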
Match Your Data to the Right Test

Your Question	Data Type Needed	Right Test
Are two categories related?	Two categorical columns	Chi-squared
Does X predict Y?	Two numeric columns	OLS Regression
Do groups differ in their averages?	One categorical + one numeric column	ANOVA
What’s driving trends over time?	Date + numeric column (20+ points, covering at least 2 full seasonal cycles)	STL Decomposition

Tips for Better Results

Let the Tool Pick the Best Analysis

You don’t need to specify which statistical method to use. Just ask your question and the tool selects the appropriate analysis:

"Is there a significant relationship between customer age group and churn?"

The tool will recognize that age group is categorical and churn is binary, and run a chi-squared test.

Ask for Multiple Analyses at Once

You can request several related tests in a single step:

"Run a regression of revenue on headcount, then also show a correlation matrix for all numeric columns"

The tool will run both analyses and compose the visualizations into a single output.

Follow Up with Drill-Down Questions

After a broad analysis, ask focused follow-up questions:

Step 1: "What factors correlate most with customer churn?"
Step 2: "Run a regression of churn rate on the top three factors from the correlation analysis"
Step 3: "Chart actual vs. predicted churn for each region"

Don’t Expect the Model to Persist

The Statistics tool runs its code in an isolated execution environment. If it trains a model (e.g., an OLS regression), that model is gone when the step finishes. Plan downstream steps accordingly — extract coefficients, predictions, or summaries as printed output rather than relying on a model object in a later step.

Real-World Examples

Example 1: Testing Whether Plan Type Affects Usage

Starting data: Customer records with plan_type (Free, Pro, Enterprise) and monthly_active_days

Step 1: Validate your data

"Show me the count and average monthly_active_days for each plan_type"

Step 2: Run ANOVA

"Does plan type have a statistically significant effect on monthly active days?"

Step 3: Interpret

"Summarize the ANOVA findings — is the difference significant, and which plan has the highest average usage?"

Example 2: Measuring the Impact of Ad Spend on Revenue

Starting data: Weekly records with ad_spend and revenue

Step 1: Clean the data

"Filter to weeks where ad_spend is greater than zero"

Step 2: Run regression

"What is the relationship between ad spend and revenue? Run a linear regression."

Step 3: Visualize

"Chart the regression with actual data points and the fitted line"

Example 3: Decomposing Revenue Seasonality

Starting data: Three years of monthly revenue

Step 1: Prepare

"Aggregate to monthly totals and sort by month ascending"

Step 2: Decompose

"Decompose monthly revenue into trend, seasonal, and residual components"

Step 3: Quantify

"What's the average seasonal lift in Q4 based on the decomposition?"

Example 4: Exploring Churn Drivers

Starting data: Customer records with multiple numeric and categorical features

Step 1: Explore correlations

"Run a correlation matrix of all numeric columns against the churn flag"

Step 2: Test categorical factors

"Is there a significant relationship between support_tier and churn? Run a chi-squared test."

Step 3: Model the top drivers

"Run a logistic regression of churn on contract_length, num_support_tickets, and days_since_last_login"

Troubleshooting

“Model failed to converge”

Cause: The data has multicollinearity, near-constant columns, or extreme outliers that prevent the model from fitting.

Fix:

Remove highly correlated predictors: “Check the correlation matrix and drop columns with correlation above 0.9”
Remove outliers: “Filter to rows where the value is within 3 standard deviations of the mean”
Standardize inputs: “Normalize all numeric columns before running the regression”
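
The first fix above can be sketched in pandas as follows: compute the absolute correlation matrix, look only at the upper triangle so each pair is checked once, and drop the later column of any pair above 0.9. The columns a, b, and c are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic predictors (illustrative only): "b" is nearly a copy of "a".
rng = np.random.default_rng(5)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.05, size=200),
    "c": rng.normal(size=200),
})

# Keep only the upper triangle so each pair of columns appears once.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop any column that correlates above 0.9 with an earlier column.
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
reduced = df.drop(columns=to_drop)

print(f"dropped: {to_drop}, kept: {list(reduced.columns)}")
```

Which column of a correlated pair to keep is a judgment call; this sketch simply keeps the one that appears first.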

“P-values are all near zero but the model seems wrong”

Cause: Very large datasets can make even trivial effects statistically significant.

Fix: Look at the effect size (coefficients, R²) alongside p-values. A significant p-value with a near-zero coefficient or low R² indicates that the effect, while real, may not be practically meaningful.

“STL decomposition fails or produces flat results”

Cause: Fewer than two full seasonal cycles in the data, or the time series isn’t indexed correctly.

Fix:

Ensure data covers at least 2 full cycles of the seasonal period you expect
Aggregate to a consistent time grain first: “Convert to monthly totals sorted by date”
Specify the period explicitly in your prompt: “Decompose assuming a 12-month seasonal cycle”

“Results look different each run”

Cause: Some methods (like certain clustering or random-initialization algorithms) have stochastic elements.

Fix: Ask for a random seed: “Run the analysis with a fixed random seed of 42 for reproducibility.”
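
To show what a fixed seed buys you, the sketch below runs a bootstrap confidence interval twice with the same seed and gets identical results. The bootstrap is used here only as a simple example of a randomized method, and the bootstrap_mean_ci helper is hypothetical.

```python
import numpy as np

def bootstrap_mean_ci(values, seed, n_resamples=1000):
    """Return a 95% bootstrap confidence interval for the mean."""
    rng = np.random.default_rng(seed)  # the seed fixes the random draws
    means = [
        rng.choice(values, size=len(values), replace=True).mean()
        for _ in range(n_resamples)
    ]
    return np.percentile(means, [2.5, 97.5])

values = np.arange(1.0, 51.0)  # illustrative data: the numbers 1 to 50

ci_a = bootstrap_mean_ci(values, seed=42)
ci_b = bootstrap_mean_ci(values, seed=42)

print(ci_a, ci_b)
print("identical:", np.array_equal(ci_a, ci_b))
```

Changing the seed gives a slightly different interval; that run-to-run spread is normal for randomized methods and is why fixing the seed matters when you need to reproduce a result.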

Statistics Best Practices

Start with exploration — look at distributions and summaries before jumping to hypothesis tests
Clean before you analyze — outliers and missing values are the most common source of misleading results
Match the test to your data types — categorical vs. numeric column combinations map to specific tests
Report effect size, not just significance — a p-value alone doesn’t tell you how important a finding is
Run multiple related tests together — ask for correlation, regression, and visualization in one step
Prepare categorical variables — encode or label categories before running models that expect numeric inputs

Next Steps

Forecaster Tool — predict future values over time
Researcher Tool — classify and extract from unstructured data
Working with Dates — prepare time series data for decomposition
Aggregating Data — aggregate before statistical analysis
Creating Visualizations — customize charts from statistical outputs