Troubleshooting
This guide covers common deployment and operational issues, debugging procedures, and solutions for Querri administrators.
Common Deployment Issues
Services Failing to Start
Symptoms:
- Containers exit immediately after starting
- Services show “Exited (1)” status
- Application not accessible
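The number in an "Exited (N)" status often narrows the cause before any logs are read. The following is a minimal sketch that assumes the usual shell and Docker exit-code conventions; `describe_exit_code` is a hypothetical helper, not part of Querri.

```shell
# Map a container exit code to a likely meaning.
# Codes above 128 conventionally mean "killed by signal (code - 128)".
describe_exit_code() {
  case "$1" in
    0)   echo "clean exit" ;;
    1)   echo "application error (check the service logs)" ;;
    125) echo "docker failed to run the container" ;;
    126) echo "container command could not be invoked" ;;
    127) echo "container command not found" ;;
    137) echo "killed by SIGKILL (often the out-of-memory killer)" ;;
    143) echo "terminated by SIGTERM" ;;
    *)   echo "unrecognized exit code" ;;
  esac
}
```

One possible use is `describe_exit_code "$(docker inspect --format '{{.State.ExitCode}}' <container-name>)"`.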
Diagnosis:
# Check service status
docker compose ps
# View service logs
docker compose logs <service-name>
# Check container exit code
docker inspect <container-name> | grep -A 5 "State"
Common Causes & Solutions:
1. Missing Environment Variables
Error in logs:
KeyError: 'MONGODB_HOST'
Environment variable MONGODB_HOST not set
Solution:
# Verify .env-prod exists
ls -la .env-prod
# Check required variables are set
grep MONGODB_HOST .env-prod
grep WORKOS_API_KEY .env-prod
# Restart services
docker compose up -d
2. Database Connection Failure
Error in logs:
pymongo.errors.ServerSelectionTimeoutError: Connection refused
Solution:
# Check MongoDB is running
docker compose ps mongo
# Verify MongoDB credentials
docker compose exec mongo mongosh -u querri -p your_password
# Check network connectivity
docker compose exec server-api ping mongo
# Restart services in order
docker compose restart mongo
sleep 10
docker compose restart hub server-api
3. Port Conflicts
Error in logs:
Error starting userland proxy: listen tcp 0.0.0.0:8080: bind: address already in use
Solution:
# Find process using port
sudo lsof -i :8080
# Kill process or change port in docker-compose.yml
ports:
  - "8081:80"  # Changed from 8080 to 8081
# Restart services
docker compose up -d
Authentication Failures
Symptoms:
- Users cannot log in
- “Unauthorized” errors
- Redirect loops
Diagnosis:
# Check hub service logs
docker compose logs hub | grep -i auth
# Test WorkOS connection
curl -H "Authorization: Bearer ${WORKOS_API_KEY}" \
  https://api.workos.com/organizations
# Verify the WorkOS variables seen inside the hub container
docker compose exec hub env | grep WORKOS
Common Causes & Solutions:
1. Invalid WorkOS Credentials
Solution:
# Verify credentials in .env-prod
grep WORKOS_API_KEY .env-prod
grep WORKOS_CLIENT_ID .env-prod
# Test credentials
curl -u ${WORKOS_CLIENT_ID}:${WORKOS_API_KEY} \
  https://api.workos.com/user_management/users
# Update credentials if expired
nano .env-prod
docker compose restart hub
# If the new values are not picked up (restart keeps the existing container environment), recreate the service:
# docker compose up -d --force-recreate hub
2. Incorrect Redirect URI
Error: “redirect_uri_mismatch”
Solution:
# Check configured redirect URI
grep WORKOS_REDIRECT_URI .env-prod
# Should match: {PUBLIC_BASE_URL}/hub/auth/callback
# Example: https://app.yourcompany.com/hub/auth/callback
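The comparison above can be scripted so that a stray trailing slash or scheme mismatch is caught mechanically. This is a sketch only: `expected_redirect_uri` and `check_redirect_uri` are hypothetical helpers, and the base URL and configured URI are passed in by hand on the assumption that you have read them from `.env-prod`.

```shell
# Build the redirect URI the hub is expected to use from the public base URL.
expected_redirect_uri() {
  base="${1%/}"            # drop one trailing slash, if present
  echo "${base}/hub/auth/callback"
}

# Compare the configured URI ($2) with the one derived from the base URL ($1).
check_redirect_uri() {
  if [ "$2" = "$(expected_redirect_uri "$1")" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}
```

For example, `check_redirect_uri https://app.yourcompany.com/ https://app.yourcompany.com/hub/auth/callback` reports a match, while the same call with an `http://` callback does not.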
# Update WorkOS dashboard to match
# Or update .env-prod
nano .env-prod
docker compose restart hub
3. JWT Validation Failure
Error in logs:
JWT signature verification failed
Solution:
# Verify JWT private key is set
grep JWT_PRIVATE_KEY .env-prod
# Ensure key format is correct (PEM)
# Key should start with: -----BEGIN PRIVATE KEY-----
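PEM private keys come with more than one header, and which one `openssl genrsa` writes depends on the OpenSSL version (older releases emit the `RSA PRIVATE KEY` form). The sketch below reports which header a key file carries; `pem_key_kind` is a hypothetical helper and it assumes the key is available as a file such as `private.pem`.

```shell
# Report which PEM private-key header a file starts with.
pem_key_kind() {
  case "$(head -n 1 "$1")" in
    "-----BEGIN PRIVATE KEY-----")     echo "pkcs8" ;;
    "-----BEGIN RSA PRIVATE KEY-----") echo "pkcs1" ;;
    *)                                 echo "unknown" ;;
  esac
}
```

If the result is `pkcs1` and the PKCS#8 form is needed, `openssl pkcs8 -topk8 -nocrypt -in private.pem -out private_pkcs8.pem` converts the key.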
# Regenerate if needed
openssl genrsa -out private.pem 2048
cat private.pem
# Update .env-prod
nano .env-prod
docker compose restart hub reverse-proxy
Database Connection Errors
Symptoms:
- “Connection refused” errors
- Timeout errors
- “Authentication failed”
Diagnosis:
# Check MongoDB status
docker compose ps mongo
# Test connection
docker compose exec mongo mongosh -u querri -p
# Check MongoDB logs
docker compose logs mongo | tail -50
Common Causes & Solutions:
1. MongoDB Not Running
Solution:
# Start MongoDB
docker compose up -d mongo
# Wait for MongoDB to be ready
sleep 10
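A fixed sleep can be too short on a slow host and needlessly long on a fast one. As an alternative, the sketch below polls a readiness command at a fixed interval; `wait_for` is a hypothetical helper, and the probe command you pass it is your choice.

```shell
# Retry a command until it succeeds or the attempt budget runs out.
# Usage: wait_for <max_attempts> <delay_seconds> <command> [args...]
wait_for() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after ${attempt} attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "not ready after ${max} attempt(s)"
  return 1
}
```

A possible probe, assuming `mongosh` answers an unauthenticated ping in your setup, is `wait_for 30 2 docker compose exec -T mongo mongosh --quiet --eval 'db.runCommand({ ping: 1 })'`.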
# Verify it's running
docker compose ps mongo
2. Incorrect Credentials
Error:
Authentication failed
Solution:
# Verify credentials in .env-prod
grep MONGO_INITDB_ROOT_USERNAME .env-prod
grep MONGO_INITDB_ROOT_PASSWORD .env-prod
# Reset MongoDB with new credentials
docker compose stop mongo
docker compose rm -f mongo            # Remove the stopped container so the volume is no longer in use
docker volume rm querri_mongodb_data  # WARNING: Deletes data!
docker compose up -d mongo
# Or update application to use correct credentials
nano .env-prod
docker compose restart server-api hub
3. Network Issues
Solution:
# Check Docker network
docker network ls
docker network inspect querri_default
# Recreate network
docker compose down
docker compose up -d
# Verify connectivity
docker compose exec server-api ping mongo
Performance Issues
Symptoms:
- Slow page loads
- Timeouts
- High CPU/memory usage
Diagnosis:
# Check resource usage
docker stats --no-stream
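With many containers, scanning the `docker stats` table by eye is error-prone. The sketch below filters a two-column "name percent" listing down to the containers above a threshold; `flag_high_usage` is a hypothetical helper and the 80% threshold in the usage note is an arbitrary example.

```shell
# Read "<name> <percent>%" lines on stdin and print the names whose
# percentage exceeds the threshold given as the first argument.
flag_high_usage() {
  awk -v limit="$1" '{
    pct = $2
    sub(/%$/, "", pct)           # strip the trailing percent sign
    if (pct + 0 > limit + 0) print $1
  }'
}
```

It can be fed from Docker with `docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | flag_high_usage 80`; swapping `{{.MemPerc}}` for `{{.CPUPerc}}` checks CPU instead.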
# Check service logs for slow queries
docker compose logs server-api | grep -i "slow\|timeout"
# Check MongoDB performance
docker compose exec mongo mongosh -u querri -p
> use querri
> db.currentOp({"secs_running": {$gte: 5}})
Common Causes & Solutions:
1. Insufficient Resources
Solution:
# Check available resources
free -h
df -h
# Increase container limits (docker-compose.yml)
services:
  server-api:
    deploy:
      resources:
        limits:
          cpus: '4.0'
          memory: 8G
# Scale replicas
SERVER_API_REPLICAS=8 docker compose up -d --scale server-api=8
2. Missing Database Indexes
Solution:
// Connect to MongoDB
db.projects.createIndex({created_by: 1, created_at: -1})
db.projects.createIndex({organization_id: 1})
db.files.createIndex({uploaded_by: 1})
db.audit_log.createIndex({timestamp: -1})
// Verify indexes
db.projects.getIndexes()
3. Redis Memory Issues
Solution:
# Check Redis memory
docker compose exec redis redis-cli INFO memory
# Clear cache if needed (WARNING: FLUSHALL removes every key in every Redis database)
docker compose exec redis redis-cli FLUSHALL
# Configure maxmemory
docker compose exec redis redis-cli CONFIG SET maxmemory 2gb
docker compose exec redis redis-cli CONFIG SET maxmemory-policy allkeys-lru
Service Connectivity Problems
Traefik Routing Issues
Symptoms:
- 404 errors
- Services not accessible
- Incorrect routing
Diagnosis:
# Check Traefik logs
docker compose logs reverse-proxy
# View Traefik configuration
docker compose exec reverse-proxy cat /etc/traefik/traefik.yml
# Check Traefik dashboard (if enabled)
curl http://localhost:8080/dashboard/
Solutions:
# Verify service labels in docker-compose.yml
# Ensure services have correct Traefik labels
# Restart Traefik
docker compose restart reverse-proxy
# Check Traefik can reach services
docker compose exec reverse-proxy ping web-app
docker compose exec reverse-proxy ping server-api
Internal Service Communication
Symptoms:
- Services can’t communicate
- “Connection refused” between services
Diagnosis:
# Check Docker network
docker network inspect querri_default
# Test connectivity
docker compose exec web-app ping server-api
docker compose exec server-api ping mongo
docker compose exec server-api ping redis
Solutions:
# Ensure all services are on the same network
# In docker-compose.yml, verify network configuration
# Recreate network
docker compose down
docker compose up -d
# Check service DNS resolution
docker compose exec server-api nslookup mongo
File Upload Issues
Symptoms:
- File uploads fail
- “Permission denied” errors
- Files not appearing
Diagnosis:
# Check file storage configuration
grep FILE_STORAGE .env-prod
# Check local storage permissions
ls -la ./server-api/files/
# View server-api logs
docker compose logs server-api | grep -i "upload\|file"
Solutions:
1. Permission Issues (Local Storage)
# Fix directory permissions
sudo chown -R 1000:1000 ./server-api/files/
sudo chmod -R 755 ./server-api/files/
# Verify volume mount
docker compose exec server-api ls -la /app/files/
2. S3 Configuration Issues
# Verify S3 credentials
grep AWS_ACCESS_KEY_ID .env-prod
grep AWS_SECRET_ACCESS_KEY .env-prod
# Test S3 access
aws s3 ls s3://your-bucket-name/
# Verify bucket permissions
aws s3api get-bucket-policy --bucket your-bucket-name
3. File Size Limits
# Check Traefik file size limits
# Add to traefik/dynamic.yml
http:
  middlewares:
    limit:
      buffering:
        maxRequestBodyBytes: 104857600  # 100MB
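The `maxRequestBodyBytes` value is a plain byte count, so a different limit has to be converted by hand. A small sketch of that arithmetic, using 1024-based units as in the 100MB figure above; `mib_to_bytes` is a hypothetical helper.

```shell
# Convert a size in 1024-based megabytes to bytes for byte-valued settings.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}
```

For instance, `mib_to_bytes 100` prints `104857600`, the value used in the configuration above.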
# Update nginx if used
client_max_body_size 100M;
Integration Issues
Prismatic Integration Failures
Symptoms:
- Integrations not loading
- “Authentication failed”
- Sync failures
Diagnosis:
# Check Prismatic configuration
grep PRISMATIC_KEY .env-prod
grep PRISMATIC_TOKEN .env-prod
# View integration logs
docker compose logs server-api | grep -i prismatic
# Check database for integration errors
docker compose exec mongo mongosh -u querri -p
> use querri
> db.integration_logs.find({status: "error"}).sort({timestamp: -1})
Solutions:
1. Expired Prismatic Token
# Token needs refresh
# Get new token from Prismatic dashboard
# Update .env-prod
nano .env-prod
# Add new PRISMATIC_TOKEN
# Restart services
docker compose restart server-api
2. Integration Configuration Errors
// Check integration configuration
db.integrations.find({status: "error"})
// Update integration config
db.integrations.updateOne(
  {_id: "int_xxxxxxxxxxxxx"},
  { $set: { "config.api_key": "new_key", status: "active" } }
)
AI/OpenAI Issues
AI Features Not Working
Symptoms:
- Chat not responding
- Analysis failures
- “AI service unavailable”
Diagnosis:
# Check AI configuration
grep OPENAI_API_KEY .env-prod
grep AZURE_OPENAI_ENDPOINT .env-prod
# Test OpenAI connection
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer ${OPENAI_API_KEY}"
# Check server-api logs
docker compose logs server-api | grep -i "openai\|ai\|gpt"
Solutions:
1. Invalid API Key
# Verify API key is valid
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer ${OPENAI_API_KEY}"
# Update key if invalid
nano .env-prod
docker compose restart server-api
2. Rate Limiting
Error: “Rate limit exceeded”
# Check current usage in OpenAI dashboard
# Implement rate limiting in application
# Upgrade OpenAI tier if needed
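When scripting calls against a rate-limited API, a common client-side pattern is to retry with exponentially growing pauses rather than hammering the endpoint. The following is a generic sketch of that pattern, not Querri's own rate limiter; `retry_with_backoff` is a hypothetical helper.

```shell
# Run a command, doubling the pause after each failed attempt.
# Usage: retry_with_backoff <max_attempts> <base_delay_seconds> <command> [args...]
retry_with_backoff() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while :; do
    "$@" && return 0                       # success: stop retrying
    [ "$attempt" -ge "$max" ] && return 1  # attempt budget exhausted
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Paired with `curl -fsS`, which exits non-zero on HTTP errors such as 429, a call might look like `retry_with_backoff 5 2 curl -fsS https://api.openai.com/v1/models -H "Authorization: Bearer ${OPENAI_API_KEY}"`.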
# Temporary: Reduce AI usage
# Set in .env-prod:
AI_RATE_LIMIT_REQUESTS_PER_MINUTE=10
3. Model Not Available
Error: “Model not found”
# Verify model names
grep STANDARD_MODEL .env-prod
# For Azure OpenAI, use deployment names
# For OpenAI, use model IDs: gpt-4o, gpt-4o-mini
# Update model configuration
nano .env-prod
docker compose restart server-api
Log Locations and Debug Mode
Service Log Locations
# Docker logs (ephemeral)
docker compose logs <service-name>
# Persistent logs (if configured)
/var/log/querri/server-api.log
/var/log/querri/hub.log
/var/log/querri/web-app.log
# MongoDB logs
docker compose exec mongo cat /var/log/mongodb/mongod.log
# Traefik access logs
docker compose exec reverse-proxy cat /var/log/traefik/access.log
Enabling Debug Mode
Application Debug Mode
# Enable debug logging
echo "LOG_LEVEL=DEBUG" >> .env-prod
# Restart services
docker compose restart server-api hub
# If the new log level is not picked up (restart keeps the existing container environment), recreate the services:
# docker compose up -d --force-recreate server-api hub
# View debug logs
docker compose logs -f server-api | grep DEBUG
Traefik Debug Mode
log:
  level: DEBUG
accessLog:
  enabled: true
  filePath: /var/log/traefik/access.log
  format: json
# Restart Traefik
docker compose restart reverse-proxy
# View debug logs
docker compose logs reverse-proxy
MongoDB Debug Mode
# Enable profiling for slow queries
docker compose exec mongo mongosh -u querri -p
> use querri
> db.setProfilingLevel(2)  // Log all queries
> db.setProfilingLevel(1, {slowms: 100})  // Log queries > 100ms
# View profiled queries
> db.system.profile.find().sort({ts: -1}).limit(10)
Debugging Workflows
1. Service Won’t Start
# Step 1: Check logs
docker compose logs <service-name>
# Step 2: Check environment (run is used because exec needs a running container)
docker compose run --rm <service-name> env
# Step 3: Check dependencies
docker compose ps
# Step 4: Manual start with debug
docker compose run --rm <service-name> bash
# Then manually start service to see errors
2. Request Failing
# Step 1: Check Traefik routing
docker compose logs reverse-proxy | grep <request-path>
# Step 2: Check service logs
docker compose logs server-api | grep <request-id>
# Step 3: Check database
docker compose exec mongo mongosh -u querri -p
> use querri
> db.audit_log.find({request_id: "<request-id>"})
# Step 4: Reproduce with curl
curl -v -H "Authorization: Bearer $JWT" \
  https://app.yourcompany.com/api/endpoint
3. Performance Issues
# Step 1: Identify bottleneck
docker stats --no-stream
# Step 2: Check slow queries
docker compose exec mongo mongosh -u querri -p
> db.currentOp({"secs_running": {$gte: 5}})
# Step 3: Check application metrics
docker compose logs server-api | grep "response_time\|latency"
# Step 4: Profile code
# Add timing decorators to Python code
Advanced Troubleshooting
Container Debugging
Access Container Shell
# Access running container
docker compose exec server-api bash
# Or start new container for debugging
docker compose run --rm server-api bash
# Install debugging tools
apt update && apt install -y curl vim less procps
Inspect Container State
# Full container inspection
docker inspect querri-server-api
# Specific information
docker inspect querri-server-api | grep -A 10 "State"
docker inspect querri-server-api | grep -A 20 "NetworkSettings"
docker inspect querri-server-api | grep -A 10 "Mounts"
Network Debugging
# Install network tools (netcat-openbsd is named explicitly because newer Debian releases ship "netcat" only as a virtual package)
docker compose exec server-api apt update
docker compose exec server-api apt install -y iputils-ping dnsutils netcat-openbsd
# Test connectivity
docker compose exec server-api ping mongo
docker compose exec server-api nc -zv mongo 27017
docker compose exec server-api nslookup mongo
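If installing packages inside the container is not possible, a TCP port can still be probed with bash alone. This sketch assumes the image provides `bash` built with `/dev/tcp` support and the coreutils `timeout` command; `port_open` is a hypothetical helper.

```shell
# Check whether a TCP port accepts connections, using bash's /dev/tcp
# pseudo-device so that netcat does not need to be installed.
# Returns 0 if the connection opens within 3 seconds, non-zero otherwise.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

From a shell inside the `server-api` container, `port_open mongo 27017 && echo reachable || echo unreachable` gives the same answer as the `nc -zv` check above.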
# View network configuration
docker network inspect querri_default
Database Debugging
# Connect to MongoDB
docker compose exec mongo mongosh -u querri -p
# Check database stats
> use querri
> db.stats()
# Check collection stats
> db.projects.stats()
# Current operations
> db.currentOp()
# Server status
> db.serverStatus()
# Replication status (if using replica set)
> rs.status()
Getting Help
Information to Gather
When reporting issues, collect:
- System information:
  uname -a
  docker --version
  docker compose version
- Service status:
  docker compose ps
  docker compose logs --tail=100 <service-name>
- Configuration (sanitized):
  # Remove sensitive data before sharing
  grep -Ev "PASSWORD|KEY|SECRET|TOKEN" .env-prod
- Error messages:
  docker compose logs | grep -i "error\|exception\|failed"
- Resource usage:
  docker stats --no-stream
  df -h
  free -h
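Filtering sensitive lines out with `grep -v` also hides which variables are set at all, which support often needs to know. An alternative sketch that keeps the variable names and masks only the values; `sanitize_env` is a hypothetical helper, and the list of sensitive name fragments is an assumption to adjust for your environment.

```shell
# Print an env file with the values of sensitive-looking keys masked.
# A key counts as sensitive if its name contains PASSWORD, KEY, SECRET, or TOKEN.
sanitize_env() {
  sed -E 's/^([A-Za-z0-9_]*(PASSWORD|KEY|SECRET|TOKEN)[A-Za-z0-9_]*)=.*/\1=***REDACTED***/' "$1"
}
```

Run it as `sanitize_env .env-prod` and review the output before sharing: values that span several lines, such as a PEM key, are only masked on their first line.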
Support Channels
- Documentation: https://docs.querri.com
- GitHub Issues: https://github.com/querri/querri-stack/issues
- Email Support: support@querri.com
- Community Forum: https://community.querri.com
Preventive Measures
Health Monitoring
Set up automated health checks:
# Create monitoring script
cat > /usr/local/bin/check-querri-health.sh << 'EOF'
#!/bin/bash
# SLACK_WEBHOOK must be defined here or in the cron environment
STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8180/healthz)
if [ "$STATUS" -ne 200 ]; then
  echo "Health check failed: HTTP $STATUS"
  # Send alert
  curl -X POST "$SLACK_WEBHOOK" -d '{"text":"Querri health check failed"}'
fi
EOF
chmod +x /usr/local/bin/check-querri-health.sh
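When a script like this is scheduled with cron, keep in mind that piping a single line into `crontab -` replaces the user's entire crontab. The sketch below merges a new entry into the existing table instead and skips it if an identical line is already present; `merge_cron_entry` is a hypothetical helper.

```shell
# Read an existing crontab on stdin and print it with the new entry
# appended, unless an identical line is already present.
merge_cron_entry() {
  awk -v entry="$1" '
    { print; if ($0 == entry) seen = 1 }
    END { if (!seen) print entry }
  '
}
```

A typical invocation is `crontab -l 2>/dev/null | merge_cron_entry "*/5 * * * * /usr/local/bin/check-querri-health.sh" | crontab -`, which is safe to run more than once.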
# Add to crontab (append to the existing entries; piping only the new line would replace the whole crontab)
(crontab -l 2>/dev/null; echo "*/5 * * * * /usr/local/bin/check-querri-health.sh") | crontab -
Log Rotation
Prevent disk space issues:
# Configure Docker log rotation in /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
sudo systemctl restart docker
Regular Maintenance
# Weekly cleanup script
cat > /usr/local/bin/querri-maintenance.sh << 'EOF'
#!/bin/bash
# Clean Docker system
docker system prune -f
# Check disk space
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 80 ]; then
  echo "WARNING: Disk usage at ${DISK_USAGE}%"
fi
# Backup database
/path/to/backup_mongodb.sh
EOF
chmod +x /usr/local/bin/querri-maintenance.sh
# Schedule weekly (append to the existing entries; piping only the new line would replace the whole crontab)
(crontab -l 2>/dev/null; echo "0 2 * * 0 /usr/local/bin/querri-maintenance.sh") | crontab -
Quick Reference
Common Commands
# View all service logs
docker compose logs -f
# Restart all services
docker compose restart
# Rebuild and restart specific service
docker compose up -d --build server-api
# Check service health
docker compose ps
# View resource usage
docker stats
# Clean up (with -a this removes all unused images, not only dangling ones)
docker system prune -a -f
# Backup database
docker compose exec -T mongo mongodump ...
# Restore database
docker compose exec -T mongo mongorestore ...
Environment Variables Check
# Required variables checklist
for var in MONGO_INITDB_ROOT_USERNAME MONGO_INITDB_ROOT_PASSWORD WORKOS_API_KEY WORKOS_CLIENT_ID JWT_PRIVATE_KEY; do
  if grep -q "^${var}=" .env-prod; then
    echo "✓ $var is set"
  else
    echo "✗ $var is MISSING"
  fi
done
Next Steps
- Monitoring & Usage - Set up proactive monitoring
- Backup & Maintenance - Regular maintenance procedures
- Installation & Deployment - Review deployment setup
- Security & Permissions - Security troubleshooting