Crawlab Task Worker Configuration Examples

Basic Configuration

config.yml

# Task execution configuration
task:
  workers: 20  # Number of concurrent task workers (default: 10)

# Node configuration (optional)
node:
  maxRunners: 50  # Maximum total tasks per node (0 = unlimited)

Environment Variables

# Set via environment variables
export CRAWLAB_TASK_WORKERS=20
export CRAWLAB_NODE_MAXRUNNERS=50
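
The variable names follow a simple convention: prefix the YAML key path with CRAWLAB_, upper-case it, and replace dots with underscores (task.workers becomes CRAWLAB_TASK_WORKERS). The sketch below shows how such a mapping is typically wired up with spf13/viper; it is illustrative only and assumes a Viper-style loader, which may differ from Crawlab's actual configuration code.

package main

import (
    "fmt"
    "strings"

    "github.com/spf13/viper"
)

func main() {
    // Defaults mirror the documented defaults.
    viper.SetDefault("task.workers", 10)
    viper.SetDefault("node.maxRunners", 0) // 0 = unlimited

    // CRAWLAB_TASK_WORKERS overrides task.workers, CRAWLAB_NODE_MAXRUNNERS
    // overrides node.maxRunners: prefix "CRAWLAB_", dots become underscores.
    // (Illustrative wiring, not Crawlab's actual loader.)
    viper.SetEnvPrefix("crawlab")
    viper.SetEnvKeyReplacer(strings.NewReplacer(".", "_"))
    viper.AutomaticEnv()

    fmt.Println("task.workers =", viper.GetInt("task.workers"))
    fmt.Println("node.maxRunners =", viper.GetInt("node.maxRunners"))
}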

Configuration Guidelines

Worker Count Recommendations

Scenario            Task Workers    Queue Size    Memory Usage
Development         5-10            25-50         ~100MB
Small Production    15-20           75-100        ~200MB
Medium Production   25-35           125-175       ~400MB
Large Production    40-60           200-300       ~800MB

Factors to Consider

  1. Task Complexity: CPU- or memory-intensive tasks need fewer workers
  2. Task Duration: Long-running tasks occupy workers longer, so more workers are needed to sustain throughput
  3. System Resources: Balance the worker count against available CPU and memory
  4. Database Load: Every additional worker adds concurrent database connections
  5. External Dependencies: Network-bound tasks spend most of their time waiting on I/O, so they can tolerate higher worker counts
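
The sizing guidance above comes down to one mechanism: a fixed pool of workers draining a bounded queue that holds roughly five times as many tasks as there are workers (20 workers and queue size 100 in the log examples below). The following Go sketch is a minimal illustration of that mechanism, not Crawlab's actual task handler; the taskPool type and submit method are invented for the example.

package main

import (
    "fmt"
    "log"
    "sync"
    "time"
)

// taskPool is an illustrative bounded worker pool: a fixed number of
// goroutines drain a buffered queue sized at 5x the worker count.
// Names (taskPool, submit) are made up for this example.
type taskPool struct {
    queue chan string
    wg    sync.WaitGroup
}

func newTaskPool(workers int) *taskPool {
    p := &taskPool{queue: make(chan string, workers*5)}
    for i := 0; i < workers; i++ {
        p.wg.Add(1)
        go func() {
            defer p.wg.Done()
            for range p.queue {
                time.Sleep(50 * time.Millisecond) // simulated task work
            }
        }()
    }
    return p
}

// submit enqueues without blocking; a full buffer is the condition behind
// the "task queue is full" warning discussed under Performance Tuning.
func (p *taskPool) submit(id string) bool {
    select {
    case p.queue <- id:
        log.Printf("task[%s] queued, queue usage: %d/%d", id, len(p.queue), cap(p.queue))
        return true
    default:
        log.Printf("task queue is full (%d/%d), consider increasing task.workers", cap(p.queue), cap(p.queue))
        return false
    }
}

func main() {
    p := newTaskPool(20) // task.workers: 20 -> queue capacity 100
    for i := 0; i < 30; i++ {
        p.submit(fmt.Sprintf("task-%d", i))
    }
    close(p.queue)
    p.wg.Wait()
}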

Performance Tuning

Too Few Workers (Queue Full Warnings)

WARN task queue is full (50/50), consider increasing task.workers configuration

Solution: Increase task.workers value

Too Many Workers (Resource Exhaustion)

ERROR failed to create task runner: out of memory
ERROR database connection pool exhausted

Solution: Decrease task.workers value

Optimal Configuration

INFO Task handler service started with 20 workers and queue size 100
DEBUG task[abc123] queued, queue usage: 5/100

Docker Configuration

docker-compose.yml

version: '3'
services:
  crawlab-master:
    image: crawlab/crawlab:latest
    environment:
      - CRAWLAB_TASK_WORKERS=25
      - CRAWLAB_NODE_MAXRUNNERS=100
    # ... other config

  crawlab-worker:
    image: crawlab/crawlab:latest
    environment:
      - CRAWLAB_TASK_WORKERS=30  # Workers can be different per node
      - CRAWLAB_NODE_MAXRUNNERS=150
    # ... other config

Kubernetes ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: crawlab-config
data:
  config.yml: |
    task:
      workers: 25
    node:
      maxRunners: 100

Monitoring Worker Performance

Log Monitoring

# Monitor worker pool status
grep -E "(workers|queue usage|queue is full)" /var/log/crawlab/crawlab.log

# Count queued/finished task events as a rough throughput measure
grep -E "(task.*queued|task.*finished)" /var/log/crawlab/crawlab.log | wc -l

Metrics to Track

  • Queue utilization percentage (see the sketch below)
  • Average task execution time
  • Worker pool saturation
  • Memory usage per worker
  • Task success/failure rates
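
Queue utilization and worker pool saturation are simple ratios once you have the raw numbers, for example from the "queue usage: n/m" log lines above. A minimal sketch, with hypothetical struct and field names:

package main

import "fmt"

// poolStats holds point-in-time numbers taken from the logs above
// ("queue usage: n/m") or from whatever metrics source you have.
// The struct and field names are hypothetical.
type poolStats struct {
    queued   int // tasks currently waiting in the queue
    queueCap int // configured queue capacity (5x task.workers)
    busy     int // workers currently executing a task
    workers  int // configured task.workers
}

// queueUtilization and saturation are the first and third metrics in the
// list above; sustained values near 100% mean the pool is undersized.
func queueUtilization(s poolStats) float64 { return float64(s.queued) / float64(s.queueCap) * 100 }
func saturation(s poolStats) float64       { return float64(s.busy) / float64(s.workers) * 100 }

func main() {
    s := poolStats{queued: 85, queueCap: 100, busy: 20, workers: 20}
    fmt.Printf("queue utilization: %.0f%%\n", queueUtilization(s)) // 85%
    fmt.Printf("worker saturation: %.0f%%\n", saturation(s))       // 100%
}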

Troubleshooting

Queue Always Full

  1. Increase worker count: task.workers
  2. Review task complexity and optimize slow tasks
  3. Verify database performance
  4. Consider scaling horizontally (more nodes)

High Memory Usage

  1. Decrease worker count
  2. Optimize task memory usage
  3. Implement task batching (see the sketch after this list)
  4. Add memory monitoring alerts
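
Task batching (item 3) means grouping many small work items into one task run so that per-task overhead such as process startup, connections, and result writes is paid once per batch rather than once per item. A minimal sketch with hypothetical helper names:

package main

import "fmt"

// processBatch is a placeholder for whatever one task run does with a
// group of items (for example, writing results in a single bulk insert).
func processBatch(items []string) {
    fmt.Printf("processing batch of %d items\n", len(items))
}

// inBatches splits work into fixed-size groups so each task run handles
// `size` items instead of one, reducing per-task overhead and peak memory.
func inBatches(items []string, size int) {
    for start := 0; start < len(items); start += size {
        end := start + size
        if end > len(items) {
            end = len(items)
        }
        processBatch(items[start:end])
    }
}

func main() {
    items := make([]string, 0, 95)
    for i := 0; i < 95; i++ {
        items = append(items, fmt.Sprintf("item-%d", i))
    }
    inBatches(items, 25) // 4 batches: 25, 25, 25, 20
}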

Slow Task Processing

  1. Profile individual tasks
  2. Check database query performance
  3. Optimize external API calls
  4. Consider async task patterns
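
For items 3 and 4, a common Go pattern is to fan out a task's network calls concurrently with a bounded limit instead of issuing them sequentially, so one slow endpoint does not hold a worker slot for the sum of all its calls. The sketch below uses golang.org/x/sync/errgroup; the fetchAll helper and URLs are placeholders, not part of Crawlab.

package main

import (
    "context"
    "fmt"
    "net/http"
    "time"

    "golang.org/x/sync/errgroup"
)

// fetchAll issues one task's HTTP calls concurrently, capped at `limit`
// in-flight requests. Placeholder example, not a Crawlab API.
func fetchAll(ctx context.Context, urls []string, limit int) error {
    g, ctx := errgroup.WithContext(ctx)
    g.SetLimit(limit)
    client := &http.Client{Timeout: 10 * time.Second}
    for _, u := range urls {
        u := u // capture loop variable (needed before Go 1.22)
        g.Go(func() error {
            req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
            if err != nil {
                return err
            }
            resp, err := client.Do(req)
            if err != nil {
                return err
            }
            defer resp.Body.Close()
            fmt.Println(u, resp.Status)
            return nil
        })
    }
    return g.Wait()
}

func main() {
    urls := []string{"https://example.com/a", "https://example.com/b"}
    if err := fetchAll(context.Background(), urls, 5); err != nil {
        fmt.Println("fetch error:", err)
    }
}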

Testing Configuration Changes

# Test new configuration
export CRAWLAB_TASK_WORKERS=30
./scripts/test_goroutine_fixes.sh 900 10

# Monitor during peak load
./scripts/test_goroutine_fixes.sh 3600 5

Best Practices

  1. Start Conservative: Begin with default values and monitor
  2. Load Test: Always test configuration changes under load
  3. Monitor Metrics: Track queue utilization and task throughput
  4. Scale Gradually: Increase worker count in small increments
  5. Resource Limits: Set appropriate memory/CPU limits in containers
  6. High Availability: Configure different worker counts per node type