Ductwork Documentation
Build powerful data pipelines with Ruby—no complexity required.
Ductwork is a pipeline framework designed for developers who want to get things done. Build sophisticated multi-stage workflows using familiar Ruby patterns and an intuitive DSL. No complicated object models, no separate infrastructure—just elegant Ruby code that scales.
What Makes Ductwork Different
Simple by design. Define pipelines with a fluent DSL that reads like English. Connect steps, fan out work, merge results—all with natural Ruby syntax.
Runs where your app runs. No message brokers, no separate worker infrastructure. Ductwork runs alongside your Rails application using processes and threads you control.
Built for resilience. Automatic process recovery, graceful shutdowns, and configurable retries keep your pipelines running through failures.
Scales with you. Start with a single process and scale to hundreds of concurrent workers. Adjust thread counts, tune timeouts, and isolate critical pipelines—all through simple YAML configuration.
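As a sketch of what that might look like, assuming a config file at `config/ductwork.yml` with hypothetical key names (none of these are confirmed by this page):

```yaml
# Hypothetical configuration; the file location and key names are
# illustrative assumptions, not Ductwork's documented schema.
production:
  workers: 4             # worker processes running alongside the app
  threads_per_worker: 10 # concurrent threads in each worker
  shutdown_timeout: 30   # seconds allowed for a graceful shutdown
  max_retries: 3         # retry attempts for a failed step
```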
Quick Example
```ruby
class EnrichAllUsersDataPipeline < Ductwork::Pipeline
  define do |pipeline|
    pipeline.start(QueryUsersRequiringEnrichment)                # Begin with a single step
      .expand(to: LoadUserData)                                  # Fan out to process each user
      .divide(to: [FetchDataFromSourceA, FetchDataFromSourceB])  # Split into parallel branches
      .combine(into: CollateUserData)                            # Bring branches back together
      .chain(UpdateUserData)                                     # Sequential processing
      .collapse(into: ReportUserEnrichmentSuccess)               # Final aggregation
  end
end
```
```ruby
# Trigger from anywhere in your Rails app
EnrichAllUsersDataPipeline.trigger(days_outdated: 7)
```
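Each class named in the DSL is a plain Ruby class. As a minimal sketch only, assuming steps expose an instance method such as `execute` that receives the previous step's output (the method name and argument passing shown here are illustrative assumptions, not confirmed Ductwork API):

```ruby
# Hypothetical step class; `execute`, its argument, and its return-value
# contract are assumptions for illustration.
class LoadUserData
  def execute(user_id)
    user = User.find(user_id)           # load one record fanned out by expand
    { id: user.id, email: user.email }  # output is handed to the next step(s)
  end
end
```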
Core Concepts
- Getting Started - Install Ductwork and run your first pipeline
- Defining Pipelines - Learn the DSL for connecting steps and managing workflow
- Configuration - Tune workers, timeouts, and scaling for your needs
- Concurrency - Understand Ductwork’s multi-process, multi-threaded architecture
When to Use Ductwork
Ductwork excels at multi-stage data processing workflows:
- Data enrichment pipelines - Fetch, transform, and enrich records from multiple sources
- ETL processes - Extract, transform, and load data with fault tolerance (see the sketch after this list)
- Batch operations - Process large datasets in manageable, concurrent chunks
- Multi-step integrations - Coordinate complex workflows across services and APIs
- Report generation - Aggregate data from multiple sources into comprehensive reports
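For example, the ETL case might be sketched like this, reusing only the DSL methods shown in the Quick Example above (the step classes are hypothetical):

```ruby
# Hypothetical ETL pipeline; step class names are illustrative only.
class NightlyOrdersEtlPipeline < Ductwork::Pipeline
  define do |pipeline|
    pipeline.start(ExtractOrdersFromWarehouse)   # extract raw rows
      .expand(to: TransformOrderRecord)          # fan out: transform each record
      .collapse(into: LoadOrdersIntoAnalyticsDb) # aggregate and load
  end
end
```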
Ready to Start?
Jump into the Installation Guide and build your first pipeline in minutes.
Have questions? Open an issue on GitHub or upgrade to Pro for custom support.