Data Types
Data Types
Section titled “Data Types”Querri Data Frames (QDF) support a variety of data types for storing and processing information. Understanding these types helps you work effectively with data in Querri.
Supported Data Types
Section titled “Supported Data Types”Text (String)
Section titled “Text (String)”Description: Text data of any length, including letters, numbers, and special characters.
Common Uses:
- Names, titles, descriptions
- Categories and labels
- IDs and identifiers
- Email addresses and URLs
Examples:
"John Smith""Product Category A""john.smith@company.com""https://example.com"Characteristics:
- No length limit
- Case-sensitive
- Supports Unicode characters
- Can contain special characters and whitespace
Operations:
- Concatenation
- Pattern matching
- Text search and filtering
- Length calculation
- Case conversion
Number (Integer)
Section titled “Number (Integer)”Description: Whole numbers without decimal points, positive or negative.
Common Uses:
- Count values
- IDs and indices
- Year values
- Quantity fields
Examples:
42-1501000000Range:
- Minimum: -9,223,372,036,854,775,808
- Maximum: 9,223,372,036,854,775,807
Operations:
- Arithmetic (+, -, ×, ÷)
- Comparison (>, <, =)
- Aggregation (sum, count, average)
- Statistical functions
Number (Float/Decimal)
Section titled “Number (Float/Decimal)”Description: Numbers with decimal points, used for precise measurements.
Common Uses:
- Prices and monetary values
- Percentages
- Scientific measurements
- Statistical calculations
Examples:
3.14159-0.599.990.00001Precision:
- Up to 15-17 significant decimal digits
- Suitable for most business calculations
Operations:
- All arithmetic operations
- Rounding and truncation
- Statistical analysis
- Financial calculations
Note: For high-precision financial calculations, values are stored with full decimal precision.
Description: Calendar dates without time information.
Common Uses:
- Birth dates
- Transaction dates
- Deadlines and due dates
- Event dates
Formats Recognized:
2024-03-1503/15/202415-Mar-2024March 15, 20242024-03-15T00:00:00Z (parsed as date)Storage Format: ISO 8601 (YYYY-MM-DD)
Operations:
- Date arithmetic (add/subtract days)
- Date comparisons
- Date range filtering
- Date part extraction (year, month, day)
- Date formatting
Range:
- Minimum: 1900-01-01
- Maximum: 2100-12-31
DateTime (Timestamp)
Section titled “DateTime (Timestamp)”Description: Date and time combined, with optional timezone information.
Common Uses:
- Created/modified timestamps
- Event timestamps
- Log entries
- Scheduled times
Formats Recognized:
2024-03-15T14:30:00Z2024-03-15 14:30:0003/15/2024 2:30 PM2024-03-15T14:30:00+00:002024-03-15T14:30:00.123456ZStorage Format: ISO 8601 with UTC timezone
Precision: Microsecond (up to 6 decimal places)
Operations:
- All date operations
- Time arithmetic
- Timezone conversion
- Time part extraction (hour, minute, second)
- Duration calculation
Timezone Handling:
- All timestamps stored in UTC
- Automatic conversion from local timezones
- Display in user’s local timezone
Boolean
Section titled “Boolean”Description: True or false values, representing binary states.
Common Uses:
- Status flags (active/inactive)
- Feature toggles
- Condition checks
- Binary choices (yes/no)
Values:
truefalseRecognized Inputs (converted to boolean):
- Text: “true”, “false”, “yes”, “no”, “1”, “0”
- Numbers: 1 (true), 0 (false)
- Null/empty: false
Operations:
- Logical operations (AND, OR, NOT)
- Filtering
- Counting
- Conditional logic
JSON (Object)
Section titled “JSON (Object)”Description: Structured data objects with nested properties.
Common Uses:
- API responses
- Configuration data
- Complex nested data
- Metadata storage
Examples:
{ "name": "John Smith", "age": 30, "address": { "street": "123 Main St", "city": "New York" }}Characteristics:
- Supports nested objects
- Supports arrays within objects
- Preserves data structure
- Can contain any valid JSON types
Operations:
- Property access (dot notation)
- Nested value extraction
- JSON path queries
- Structure validation
Querying JSON Fields:
-- Extract nested propertySELECT data.address.city FROM table
-- Filter by JSON propertyWHERE data.status = 'active'Array (List)
Section titled “Array (List)”Description: Ordered collections of values of the same or mixed types.
Common Uses:
- Tag lists
- Multiple selections
- Ordered sequences
- Collection data
Examples:
["apple", "banana", "orange"][1, 2, 3, 4, 5]["text", 123, true, null]Characteristics:
- Ordered elements
- Can contain duplicates
- Zero-based indexing
- Can contain mixed types
Operations:
- Element access by index
- Array length
- Contains/membership checks
- Array aggregation
- Flattening
Querying Arrays:
-- Check if array contains valueWHERE 'tag1' IN tags
-- Array lengthSELECT array_length(tags) FROM tableDescription: Represents missing, unknown, or undefined values.
Common Uses:
- Optional fields without values
- Missing data in datasets
- Not-applicable values
- Unknown information
Representation: null or empty
Behavior:
- Distinct from empty string ""
- Distinct from zero 0
- Distinct from false
- Excluded from most aggregations
Operations:
- NULL checking (IS NULL, IS NOT NULL)
- NULL handling in calculations
- COALESCE (default values)
- NULL propagation in expressions
Important Notes:
- NULL + anything = NULL (in arithmetic)
- NULL comparisons require IS NULL syntax
- COUNT(*) includes NULLs, COUNT(column) excludes NULLs
Type Conversion and Casting
Section titled “Type Conversion and Casting”Automatic Type Inference
Section titled “Automatic Type Inference”When you upload data, Querri automatically infers data types:
Number Detection:
- Values like “123” → Integer
- Values like “45.67” → Float
- Currency values “$1,234.56” → Float
Date Detection:
- Common date formats automatically recognized
- ISO 8601 formats preferred
- Ambiguous formats (MM/DD vs DD/MM) use context
Boolean Detection:
- “true”, “false”, “yes”, “no” → Boolean
- “1”, “0” (in appropriate context) → Boolean
JSON Detection:
- Valid JSON strings → JSON object
- Detects nested structures
Manual Type Conversion
Section titled “Manual Type Conversion”Convert between types using conversions:
To Text:
-- Convert any type to textCAST(number_field AS TEXT)To Number:
-- Convert text to integerCAST(text_field AS INTEGER)
-- Convert text to floatCAST(text_field AS FLOAT)
-- Handle conversion errors with defaultCOALESCE(CAST(text_field AS INTEGER), 0)To Date/DateTime:
-- Convert text to dateCAST(text_field AS DATE)
-- Convert text to datetimeCAST(text_field AS TIMESTAMP)
-- Parse specific date formatSTRPTIME(text_field, '%Y-%m-%d')To Boolean:
-- Convert to booleanCAST(value AS BOOLEAN)
-- Explicit conversionCASE WHEN value = 'yes' THEN true ELSE false ENDType Conversion Rules
Section titled “Type Conversion Rules”Numbers:
- Integer → Float: Always safe
- Float → Integer: Rounds/truncates decimals
- Number → Text: Preserves full precision
- Text → Number: Fails if not numeric
Dates:
- Date → DateTime: Sets time to 00:00:00
- DateTime → Date: Discards time component
- Text → Date: Requires valid date format
- Date → Text: Uses ISO 8601 format
Booleans:
- Boolean → Text: “true” or “false”
- Boolean → Number: 1 or 0
- Number → Boolean: 0 = false, non-zero = true
- Text → Boolean: Specific values recognized
JSON/Arrays:
- JSON → Text: Serializes to JSON string
- Array → Text: Serializes to JSON array string
- Text → JSON: Requires valid JSON syntax
Type Handling Best Practices
Section titled “Type Handling Best Practices”Upload Data Preparation
Section titled “Upload Data Preparation”Consistent Formats:
- Use consistent date formats within a column
- Use consistent number formats (same decimal separator)
- Be explicit with boolean values
Clear Data Types:
- Avoid mixing types in a single column
- Use headers that indicate data type
- Include sample values that demonstrate type
Handle Missing Values:
- Use NULL or leave empty for missing data
- Avoid placeholder text like “N/A” in numeric columns
- Be consistent with missing value representation
Working with Types
Section titled “Working with Types”Type Safety:
- Verify types after upload
- Use explicit casting when needed
- Handle conversion errors gracefully
Performance:
- Use appropriate types (integer vs float)
- Avoid unnecessary conversions
- Index frequently-queried columns
Data Quality:
- Validate data after type conversion
- Check for conversion errors
- Monitor NULL values
Common Type Issues
Section titled “Common Type Issues”Mixed Types in Columns
Section titled “Mixed Types in Columns”Problem: Column contains both numbers and text Solution:
- Clean data before upload
- Use text type and convert specific values
- Filter out invalid values
Date Format Ambiguity
Section titled “Date Format Ambiguity”Problem: Dates like “01/02/2024” unclear (Jan 2 or Feb 1?) Solution:
- Use ISO format YYYY-MM-DD
- Specify format in column name
- Manually specify format during upload
Precision Loss
Section titled “Precision Loss”Problem: Large numbers or many decimals lose precision Solution:
- Use text for very large numbers
- Store full precision monetary values
- Consider external calculation for high precision
NULL vs Empty String
Section titled “NULL vs Empty String”Problem: Confusion between NULL and "" Solution:
- Be explicit about meaning
- Use NULL for missing data
- Use empty string "" for intentionally blank text
Data Type Reference Table
Section titled “Data Type Reference Table”| Type | Storage | Range/Size | Example | NULL Allowed |
|---|---|---|---|---|
| Text | Variable | Unlimited | ”Hello World” | Yes |
| Integer | 8 bytes | ±9.2 quintillion | 42 | Yes |
| Float | 8 bytes | ~15 digits precision | 3.14159 | Yes |
| Date | 4 bytes | 1900-2100 | 2024-03-15 | Yes |
| DateTime | 8 bytes | Microsecond precision | 2024-03-15T14:30:00Z | Yes |
| Boolean | 1 byte | true/false | true | Yes |
| JSON | Variable | Unlimited nesting | {“key”: “value”} | Yes |
| Array | Variable | Unlimited elements | [1, 2, 3] | Yes |
Next Steps
Section titled “Next Steps”- Learn about chart types for visualizing different data types
- Explore connectors to import data with proper types
- Review environment configuration for data processing settings