Concurrency

Ductwork uses a multi-process, multi-threaded architecture designed for resilience, scalability, and efficient resource utilization. Understanding this model helps you optimize performance and troubleshoot issues effectively.

Architecture Overview

Each bin/ductwork instance runs a Supervisor Process that manages child processes for your configured pipelines. The supervisor orchestrates:

One Pipeline Advancer Thread per Pipeline - Creates a thread for each configured pipeline to advance pipelines through their steps
One Job Worker Process per Pipeline - Each creates multiple threads to execute step jobs concurrently

This design isolates pipeline execution while enabling concurrent processing within each pipeline.

Process Model

Supervisor Process

The supervisor is the parent process responsible for:

Launching and monitoring all child processes (advancers and workers)
Detecting failures through heartbeat monitoring
Automatically restarting crashed processes
Coordinating graceful shutdown

Failure recovery: The supervisor monitors child process health through periodic heartbeats. If a child process fails to report a heartbeat within 5 minutes—indicating a crash or hang—the supervisor automatically spawns a replacement process to maintain pipeline availability.

Pipeline Advancer Process

A single advancer process handles all configured pipelines by creating one thread per pipeline. Each thread:

Monitors its pipeline’s current stage
Checks when all steps in a stage reach advancing status
Transitions the pipeline to the next stage when ready
Sleeps for pipeline_advancer.polling_timeout seconds when no work is available

Job Worker Processes

Each configured pipeline gets its own dedicated job worker process. The process creates multiple threads (configured via job_worker.count) that:

Query the database for available jobs
Execute step logic
Update job status atomically
Sleep for job_worker.polling_timeout seconds when no jobs are available

Work distribution: Each worker thread independently queries for the oldest available job using an atomic UPDATE...WHERE query. This ensures:

No job is processed by multiple threads simultaneously
Work is distributed fairly across threads
No external queue or coordinator is needed

Thread Model

Job Worker Threads

Worker threads operate in a continuous loop:

Query for work - Fetch the oldest available job from the database
Claim the job - Atomically update the job record to mark it as in-progress
Execute - Run the step’s #execute method
Update status - Mark the job as complete or failed
Repeat - If no jobs are available, sleep for the configured polling timeout

This simple, database-backed coordination eliminates the need for complex distributed locking or external job queues.

Pipeline Advancer Threads

Each pipeline’s advancer thread:

Query pipeline state - Check if all steps in the current stage are complete
Advance if ready - When all steps reach advancing status, create the next stage’s step records
Sleep - If the pipeline isn’t ready to advance, sleep for the polling timeout
Repeat - Continue monitoring until the pipeline completes

Database Connections

Worker threads check out connections from Rails’ connection pool. This is critical for configuration:

# config/database.yml
production:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 10 } %>

# config/ductwork.yml
default: &default
  job_worker:
    worker_count: 10
  pipeline:
    worker_count: 10

Recommendation: Scale your connection pool size to match the number of worker threads. If you have 3 pipelines with 10 workers each and 10 pipeline advancers total, you’ll need at least 40 database connections.

Scaling Considerations

Vertical Scaling (More Threads per Process)

Increase job_worker.count to process more jobs concurrently within each pipeline:

Pros:

Simple configuration change
Better resource utilization on larger machines
Lower overhead than multiple processes

Cons:

Requires more database connections
Limited by single-process resources
Ruby GIL may limit CPU-bound work

Horizontal Scaling (Multiple Instances)

Run multiple bin/ductwork instances with different configuration files:

Pros:

Better fault isolation
Can run different pipelines on different machines
Bypasses single-process limitations

Cons:

More complex orchestration
Higher operational overhead
More database connections total

Example:

# High-priority pipelines on one instance
bin/ductwork -c config/ductwork.critical.yml

# Background pipelines on another
bin/ductwork -c config/ductwork.background.yml

Performance Tips

Keep steps short - Design steps to complete in seconds, or at least less than the shutdown timeouts
Use expand wisely - Expanding to millions of steps can overwhelm the system
Monitor connection pool - Watch for connection exhaustion as you scale threads
Tune polling timeouts - Shorter timeouts increase responsiveness but add database load
Profile your steps - Use logging and monitoring to identify slow steps
Consider batching - Instead of expanding to 1,000,000 steps, batch into 10,000 steps that process 100 items each