Dagu is a powerful Cron alternative that comes with a Web UI. It allows you to define dependencies between commands in a declarative YAML format. Additionally, Dagu natively supports running Docker containers, making HTTP requests, and executing commands over SSH. Dagu was designed to be easy to use, self-contained, and require no coding, making it ideal for small projects.
- Why Dagu?
- Core Features
- Common Use Cases
- Community
- Installation
- Quick Start Guide
- Usage / Command Line Interface
- Example DAG
- Minimal examples
- Named Parameters
- Positional Parameters
- Conditional DAG
- Script Execution
- Variable Passing
- Scheduling
- Calling a sub-DAG
- Running a docker image
- Environment Variables
- Notifications on Failure or Success
- HTTP Request and Notifications
- Execute commands over SSH
- Advanced Preconditions
- Handling Various Execution Results
- JSON Processing Examples
- Web UI
- Running as a daemon
- Contributing
- Contributors
- License
Dagu is a modern workflow engine that combines simplicity with power, designed for developers who need reliable automation without the overhead. Here's what makes Dagu stand out:
- Language Agnostic: Run any command or script regardless of programming language. Whether you're working with Python, Node.js, Bash, or any other language, Dagu seamlessly integrates with your existing tools and scripts.
- Local-First Architecture: Deploy and run workflows directly on your machine without external dependencies. This local-first approach ensures complete control over your automation while maintaining the flexibility to scale to distributed environments when needed.
- Zero Configuration: Get started in minutes with minimal setup. Dagu uses simple YAML files to define workflows, eliminating the need for complex configurations or infrastructure setup.
- Built for Developers: Designed with software engineers in mind, Dagu provides powerful features like dependency management, retry logic, and parallel execution while maintaining a clean, intuitive interface (see the sketch after this list).
- Cloud Native Ready: While running perfectly on local environments, Dagu is built to seamlessly integrate with modern cloud infrastructure when you need to scale.
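For instance, here is a minimal sketch of a DAG (step names and scripts are illustrative) that runs two steps in parallel, retries one of them, and fans both results into a final step:

steps:
  - name: extract users            # no depends, so it starts immediately
    command: python extract_users.py
  - name: extract orders           # runs in parallel with "extract users"
    command: python extract_orders.py
    retryPolicy:
      limit: 3                     # retry up to 3 times
      intervalSec: 30              # wait 30 seconds between attempts
  - name: merge                    # waits for both extraction steps
    command: python merge.py
    depends:
      - extract users
      - extract orders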
- Workflow Management
  - Declarative YAML definitions
  - Dependency management
  - Parallel execution
  - Sub-workflows
  - Conditional execution with regex
  - Timeouts and automatic retries
- Execution & Integration
  - Native Docker support
  - SSH command execution
  - HTTP requests
  - JSON processing
  - Email notifications
- Operations
  - Web UI for monitoring
  - Real-time logs
  - Execution history
  - Flexible scheduling
  - Environment variables
  - Automatic logging
- Data Processing
- Scheduled Tasks
- Media Processing
- CI/CD Automation
- ETL Pipelines
- Agentic Workflows
- Issues: GitHub Issues
- Discussion: GitHub Discussions
- Chat: Discord
Dagu can be installed in multiple ways, such as using Homebrew or downloading a single binary from GitHub releases.
curl -L https://raw.githubusercontent.com/dagu-org/dagu/main/scripts/installer.sh | bash
Download the latest binary from the Releases page and place it in your $PATH (e.g. /usr/local/bin).
brew install dagu-org/brew/dagu
Upgrade to the latest version:
brew upgrade dagu-org/brew/dagu
docker run \
  --rm \
  -p 8080:8080 \
  -v ~/.config/dagu:/config \
  -e DAGU_TZ=`ls -l /etc/localtime | awk -F'/zoneinfo/' '{print $2}'` \
  ghcr.io/dagu-org/dagu:latest dagu start-all
Note: The environment variable DAGU_TZ is the timezone for the scheduler and server. You can set it to your local timezone (e.g. America/New_York).
See Environment variables to configure Dagu's default directories.
Start the server and scheduler with the command dagu start-all and browse to http://127.0.0.1:8080 to explore the Web UI.
Navigate to the DAG List page by clicking the menu in the left panel of the Web UI. Then create a DAG by clicking the NEW button at the top of the page. Enter example in the dialog.
Note: DAG (YAML) files will be placed in ~/.config/dagu/dags by default. See Configuration Options for more details.
Go to the SPEC tab and hit the Edit button. Copy and paste the following example and click the Save button.
Example:
schedule: "* * * * *" # Run the DAG every minute
params:
  - NAME: "Dagu"
steps:
  - name: Hello world
    command: echo Hello $NAME
  - name: Done
    command: echo Done!
    depends: Hello world
You can execute the example by pressing the Start button. You can see "Hello Dagu" in the log page in the Web UI.
# Runs the DAG
dagu start <file or DAG name>
# Runs the DAG with named parameters
dagu start <file or DAG name> [-- <key>=<value> ...]
# Runs the DAG with positional parameters
dagu start <file or DAG name> [-- value1 value2 ...]
# Displays the current status of the DAG
dagu status <file or DAG name>
# Re-runs the specified DAG run
dagu retry --req=<request-id> <file or DAG name>
# Stops the DAG execution
dagu stop <file or DAG name>
# Restarts the currently running DAG
dagu restart <file or DAG name>
# Dry-runs the DAG
dagu dry <file or DAG name> [-- <key>=<value> ...]
# Launches both the web UI server and scheduler process
dagu start-all [--host=<host>] [--port=<port>] [--dags=<path to directory>]
# Launches the Dagu web UI server
dagu server [--host=<host>] [--port=<port>] [--dags=<path to directory>]
# Starts the scheduler process
dagu scheduler [--dags=<path to directory>]
# Shows the current binary version
dagu version
A simple example with a named parameter:
params:
  - NAME: "Dagu"
steps:
  - name: Hello world
    command: echo Hello $NAME
  - name: Done
    command: echo Done!
    depends:
      - Hello world
Using a pipe:
steps:
  - name: step 1
    command: echo hello world | xargs echo
Specifying a shell:
steps:
  - name: step 1
    command: echo hello world | xargs echo
    shell: bash # The default shell is `$SHELL` or `sh`.
You can define named parameters in the DAG file and override them when running the DAG.
# Default named parameters
params:
  NAME: "Dagu"
  AGE: 30
steps:
  - name: Hello world
    command: echo Hello $NAME
  - name: Done
    command: echo Done!
    depends: Hello world
Run the DAG with custom parameters:
dagu start my_dag -- NAME=John AGE=40
You can define positional parameters in the DAG file and override them when running the DAG.
# Default positional parameters
params: input.csv output.csv 60 # Default values for $1, $2, and $3
steps:
  # Using positional parameters
  - name: data processing
    command: python
    script: |
      import sys
      import pandas as pd
      input_file = "$1" # First parameter
      output_file = "$2" # Second parameter
      timeout = "$3" # Third parameter
      print(f"Processing {input_file} -> {output_file} with timeout {timeout}s")
      # Add your processing logic here
Run the DAG with custom parameters:
dagu start my_dag -- input.csv output.csv 120
You can define conditions to run a step based on the output of a command.
steps:
  - name: monthly task
    command: monthly.sh
    preconditions:
      - condition: "`date '+%d'`"
        expected: "re:0[1-9]" # Run only if the day is between 01 and 09
You can run a script using the script field.
steps:
  # Python script example
  - name: data analysis
    command: python
    script: |
      import json
      import sys
      data = {'count': 100, 'status': 'ok'}
      print(json.dumps(data))
      sys.stderr.write('Processing complete\n')
    output: RESULT
    stdout: /tmp/analysis.log
    stderr: /tmp/analysis.error

  # Shell script with multiple commands
  - name: cleanup
    command: bash
    script: |
      #!/bin/bash
      echo "Starting cleanup..."
      # Remove old files
      find /tmp -name "*.tmp" -mtime +7 -exec rm {} \;
      # Archive logs
      cd /var/log
      tar -czf archive.tar.gz *.log
      echo "Cleanup complete"
    depends: data analysis
You can pass the output of one step to another step using the output field.
steps:
  # Basic output capture
  - name: generate id
    command: echo "ABC123"
    output: REQUEST_ID
  - name: use id
    command: echo "Processing request ${REQUEST_ID}"
    depends: generate id

# Capture JSON output
steps:
  - name: get config
    command: |
      echo '{"port": 8080, "host": "localhost"}'
    output: CONFIG
  - name: start server
    command: echo "Starting server at ${CONFIG.host}:${CONFIG.port}"
    depends: get config
You can specify flexible schedules using the cron format.
schedule: "5 4 * * *" # Run at 04:05.
steps:
  - name: scheduled job
    command: job.sh
Or you can set multiple schedules.
schedule:
  - "30 7 * * *" # Run at 7:30
  - "0 20 * * *" # Also run at 20:00
steps:
  - name: scheduled job
    command: job.sh
If you want to start and stop a long-running process on a fixed schedule, you can define start and stop times:
schedule:
  start: "0 8 * * *" # starts at 8:00
  stop: "0 13 * * *" # stops at 13:00
steps:
  - name: scheduled job
    command: job.sh
You can call another DAG from a parent DAG.
steps:
  - name: parent
    run: sub-dag
    output: OUT
  - name: use output
    command: echo ${OUT.outputs.result}
    depends: parent
The sub-DAG sub-dag.yaml:
steps:
  - name: sub-dag
    command: echo "Hello from sub-dag"
    output: result
The parent DAG will call the sub-DAG and write the output to the log (stdout). The output will be Hello from sub-dag.
You can run a docker image as a step:
steps:
  - name: hello
    executor:
      type: docker
      config:
        image: alpine
        autoRemove: true
    command: echo "hello"
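The docker executor composes with the other step-level fields shown in this README. As a sketch (the image tag and step names are illustrative), a container's stdout can be captured with output and reused by a later step:

steps:
  - name: get version
    executor:
      type: docker
      config:
        image: alpine:3
        autoRemove: true
    command: cat /etc/alpine-release   # prints the Alpine version to stdout
    output: ALPINE_VERSION
  - name: report
    command: echo "Ran on Alpine ${ALPINE_VERSION}"
    depends: get version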
You can define environment variables and use them in the DAG.
env:
  - DATA_DIR: ${HOME}/data
  - PROCESS_DATE: "`date '+%Y-%m-%d'`"
steps:
  - name: process logs
    command: python process.py
    dir: ${DATA_DIR}
    preconditions:
      - "test -f ${DATA_DIR}/logs_${PROCESS_DATE}.txt" # Check if the file exists
You can send notifications on failure in various ways.
env:
  - SLACK_WEBHOOK_URL: "https://hooks.slack.com/services/XXXXX/YYYYY/ZZZZZ"
dotenv:
  - .env
smtp:
  host: $SMTP_HOST
  port: "587"
  username: $SMTP_USERNAME
  password: $SMTP_PASSWORD
handlerOn:
  failure:
    command: |
      curl -X POST -H 'Content-type: application/json' \
        --data '{"text":"DAG Failed ($DAG_NAME)"}' \
        ${SLACK_WEBHOOK_URL}
steps:
  - name: critical process
    command: important_job.sh
    retryPolicy:
      limit: 3
      intervalSec: 60
mailOn:
  failure: true # Send an email on failure
If you want to configure this globally, you can create ~/.config/dagu/base.yaml and define common configuration shared across all DAGs.
smtp:
  host: $SMTP_HOST
  port: "587"
  username: $SMTP_USERNAME
  password: $SMTP_PASSWORD
mailOn:
  failure: true
  success: true
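With that base configuration in place, an individual DAG file only needs to define its own steps; the shared smtp and mailOn settings apply to every DAG. A minimal sketch (the job script is illustrative):

steps:
  - name: nightly report
    command: generate_report.sh   # failures and successes are mailed per base.yaml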
You can also use the mail executor to send notifications.
params:
  - RECIPIENT_NAME: XXX
  - RECIPIENT_EMAIL: [email protected]
  - MESSAGE: "Hello [RECIPIENT_NAME]"
steps:
  - name: step1
    executor:
      type: mail
      config:
        to: $RECIPIENT_EMAIL
        from: [email protected]
        subject: "Hello [RECIPIENT_NAME]"
        message: $MESSAGE
You can make HTTP requests and send notifications.
dotenv:
  - .env
smtp:
  host: $SMTP_HOST
  port: "587"
  username: $SMTP_USERNAME
  password: $SMTP_PASSWORD
steps:
  - name: fetch data
    executor:
      type: http
      config:
        timeout: 10
    command: GET https://api.example.com/data
    output: API_RESPONSE
  - name: send notification
    executor:
      type: mail
      config:
        to: [email protected]
        from: [email protected]
        subject: "Data Processing Complete"
        message: |
          Process completed successfully.
          Response: ${API_RESPONSE}
    depends: fetch data
You can execute commands over SSH.
steps:
  - name: backup
    executor:
      type: ssh
      config:
        user: admin
        ip: 192.168.1.100
        key: ~/.ssh/id_rsa
    command: tar -czf /backup/data.tar.gz /data
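Step-level options such as retries and output capture work with the ssh executor as well. A hedged sketch (host, user, key path, and command are illustrative):

steps:
  - name: check disk space
    executor:
      type: ssh
      config:
        user: admin
        ip: 192.168.1.100
        key: ~/.ssh/id_rsa
    command: df -h /data             # runs on the remote host
    output: REMOTE_DISK
    retryPolicy:
      limit: 2                       # retry transient SSH failures
      intervalSec: 30
  - name: log result
    command: echo "${REMOTE_DISK}"
    depends: check disk space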
You can define complex conditions to run a step based on the output of a command.
steps:
  # Check multiple conditions
  - name: daily task
    command: process_data.sh
    preconditions:
      # Run only on weekdays
      - condition: "`date '+%u'`"
        expected: "re:[1-5]"
      # Run only if disk space > 20%
      - condition: "`df -h / | awk 'NR==2 {print $5}' | sed 's/%//'`"
        expected: "re:^[0-7][0-9]$|^[1-9]$" # 0-79% used (meaning at least 20% free)
      # Check if input file exists
      - condition: "test -f input.csv"

  # Complex file check
  - name: process files
    command: batch_process.sh
    preconditions:
      - condition: "`find data/ -name '*.csv' | wc -l`"
        expected: "re:[1-9][0-9]*" # At least one CSV file exists
You can use continueOn to control whether the DAG continues or fails based on a step's exit code, output, or other conditions.
steps:
  # Basic error handling
  - name: process data
    command: python process.py
    continueOn:
      failure: true # Continue on any failure
      skipped: true # Continue if preconditions aren't met

  # Handle specific exit codes
  - name: data validation
    command: validate.sh
    continueOn:
      exitCode: [1, 2, 3] # 1:No data, 2:Partial data, 3:Invalid format
      markSuccess: true # Mark as success even with these codes

  # Output pattern matching
  - name: api request
    command: curl -s https://api.example.com/data
    continueOn:
      output:
        - "no records found" # Exact match
        - "re:^Error: [45][0-9]" # Regex match for HTTP errors
        - "rate limit exceeded" # Another exact match

  # Complex pattern
  - name: database backup
    command: pg_dump database > backup.sql
    continueOn:
      exitCode: [0, 1] # Accept specific exit codes
      output: # Accept specific outputs
        - "re:0 rows affected"
        - "already exists"
      failure: false # Don't continue on other failures
      markSuccess: true # Mark as success if conditions match

  # Multiple conditions combined
  - name: data sync
    command: sync_data.sh
    continueOn:
      exitCode: [1] # Exit code 1 is acceptable
      output: # These outputs are acceptable
        - "no changes detected"
        - "re:synchronized [0-9]+ files"
      skipped: true # OK if skipped due to preconditions
      markSuccess: true # Mark as success in these cases

  # Error output handling
  - name: log processing
    command: process_logs.sh
    stderr: /tmp/process.err
    continueOn:
      output:
        - "re:WARNING:.*" # Continue on warnings
        - "no logs found" # Continue if no logs
      exitCode: [0, 1, 2] # Multiple acceptable exit codes
      failure: true # Continue on other failures too

  # Application-specific status
  - name: app health check
    command: check_status.sh
    continueOn:
      output:
        - "re:STATUS:(DEGRADED|MAINTENANCE)" # Accept specific statuses
        - "re:PERF:[0-9]{2,3}ms" # Accept performance in range
      markSuccess: true # Mark these as success
You can use the jq executor to process JSON data.
# Simple data extraction
steps:
  - name: extract value
    executor: jq
    command: .user.name # Get user name from JSON
    script: |
      {
        "user": {
          "name": "John",
          "age": 30
        }
      }
# Output: "John"

# Transform array data
steps:
  - name: get users
    executor: jq
    command: '.users[] | {name: .name}' # Extract name from each user
    script: |
      {
        "users": [
          {"name": "Alice", "age": 25},
          {"name": "Bob", "age": 30}
        ]
      }
# Output:
# {"name": "Alice"}
# {"name": "Bob"}

# Calculate and format
steps:
  - name: sum ages
    executor: jq
    command: '{total_age: ([.users[].age] | add)}' # Sum all ages
    script: |
      {
        "users": [
          {"name": "Alice", "age": 25},
          {"name": "Bob", "age": 30}
        ]
      }
# Output: {"total_age": 55}

# Filter and count
steps:
  - name: count active
    executor: jq
    command: '[.users[] | select(.active == true)] | length'
    script: |
      {
        "users": [
          {"name": "Alice", "active": true},
          {"name": "Bob", "active": false},
          {"name": "Charlie", "active": true}
        ]
      }
# Output: 2
More examples can be found in the documentation.
The Web UI provides:
- Real-time status, logs, and configuration for each DAG, with graph orientation togglable from the top-right corner.
- A list of all DAGs in one place with live status updates.
- Search across all DAG definitions.
- Execution history with past runs and logs at a glance.
- Detailed step-level logs and outputs.
The easiest way to make sure the process is always running on your system is to create the script below and execute it every minute using cron (this way you don't need a root account):
#!/bin/bash
process="dagu start-all"
command="/usr/bin/dagu start-all"

if ps ax | grep -v grep | grep "$process" > /dev/null
then
    exit
else
    $command &
fi

exit
We welcome new contributors! Check out our Contribution Guide for guidelines on how to get started.
Dagu is released under the GNU GPLv3.