🧮 Jobflow: Defining Multi-Step Workflows
A Jobflow lets you define and execute a series of interconnected routines as a directed acyclic graph. The same approach covers both simple linear jobflows and complex hierarchical ones.
🧠 What is a Jobflow?
A Jobflow is a directed acyclic graph (DAG) of routines. Each routine in the graph is defined as a WKubeTask, and their relationships define the order and dependencies of execution.
You can build:
- A simple linear jobflow (Routine A → B → C)
- A diverging jobflow (Routine A → B, A → C)
- A converging jobflow (Routine B → D, Routine C → D)
- Or a deep hierarchical tree
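The shapes listed above can be sketched as plain dependency maps. This is only an illustration of the graph idea, not how accli represents a Jobflow internally: the routine names A to D come from the list, while `SHAPES` and `execution_order` are hypothetical names introduced here. It relies on the standard-library `graphlib` module, so it assumes Python 3.9 or later on the machine where you try it.

```python
from graphlib import CycleError, TopologicalSorter

# Each shape maps a routine to the set of routines it must wait for.
SHAPES = {
    "linear": {"B": {"A"}, "C": {"B"}},
    "diverging": {"B": {"A"}, "C": {"A"}},
    "converging": {"D": {"B", "C"}},
}


def execution_order(dependencies):
    """Return one valid run order, or raise ValueError if the graph has a cycle."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"not a DAG: {exc.args[1]}") from exc


for shape, dependencies in SHAPES.items():
    print(shape, execution_order(dependencies))
```

A graph in which two routines wait on each other would fail the same check, which is why a Jobflow has to stay acyclic.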
✍️ Defining a Jobflow in wkube.py
A Jobflow can be fully defined in your wkube.py file using the same WKubeTask class that defines single routines.
To support multiple interconnected routines, WKubeTask offers two methods:
- add_callback(task): Schedule a routine to run after the current one finishes
- add_child(task): Add a child routine (used with a holder or root task)
⚠️ add_child() can only be called on a WKubeTask created with no arguments, which acts as a holder routine.
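To make the holder rule concrete, here is a toy stand-in class. `SketchTask` is hypothetical and is not the accli implementation; in particular, raising `ValueError` when add_child() is called on a non-holder is an assumption, since this page does not say how WKubeTask reports that misuse.

```python
class SketchTask:
    """Hypothetical stand-in for WKubeTask; not the accli implementation."""

    def __init__(self, **config):
        self.config = config
        self.children = []
        self.callbacks = []

    @property
    def is_holder(self):
        # A holder is a task created with no arguments.
        return not self.config

    def add_child(self, task):
        if not self.is_holder:
            # Assumption: the real library may signal this differently.
            raise ValueError("add_child() is only valid on a holder task")
        self.children.append(task)

    def add_callback(self, task):
        # Record a routine that should run after this one finishes.
        self.callbacks.append(task)


root = SketchTask()
step = SketchTask(name="Step", command="python step.py")
after = SketchTask(name="After", command="python after.py")

root.add_child(step)      # allowed: root was created with no arguments
step.add_callback(after)  # after runs once step finishes

try:
    step.add_child(after)  # rejected: step is not a holder
except ValueError as err:
    print(err)
```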
📂 Example: Diverging and Converging Job Graph
Below is an example of a Jobflow structure with diverging and converging paths:
from accli import WKubeTask
# Holder/root task (doesn't execute anything)
flow = WKubeTask()
# Define individual routines
prep = WKubeTask(
    name="Prepare Data",
    job_folder="./prep",
    base_stack="PYTHON3_7",
    command="python prepare.py",
    ...
)
model_a = WKubeTask(
    name="Model A",
    job_folder="./model_a",
    base_stack="PYTHON3_7",
    command="python model_a.py",
    ...
)
model_b = WKubeTask(
    name="Model B",
    job_folder="./model_b",
    base_stack="PYTHON3_7",
    command="python model_b.py",
    ...
)
combine = WKubeTask(
    name="Combine Results",
    job_folder="./combine",
    base_stack="PYTHON3_7",
    command="python combine.py",
    ...
)
finisher = WKubeTask(
    name="Finisher",
    job_folder="./finisher",
    base_stack="PYTHON3_7",
    command="python finalize.py",
    ...
)
# Define DAG relationships
flow.add_child(prep) # Root → prep
prep.add_callback(model_a) # prep → model_a
prep.add_callback(model_b) # prep → model_b
model_a.add_callback(combine) # model_a → combine
model_b.add_callback(combine) # model_b → combine
flow.add_callback(finisher) # Final step after all children complete
🚀 Dispatching a Jobflow
Dispatch the entire jobflow using the accli CLI, just like a regular routine.
Run the following from the directory containing your wkube.py:
accli login
accli dispatch <project_slug> flow
Replace <project_slug> with your actual project identifier. Here, flow is the root WKubeTask object (holder) you defined in the script.
The example graph results in:
- prep running first
- Then model_a and model_b run in parallel
- Then combine runs after both have completed
- Then finisher runs after all children are done
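The ordering described above can be checked locally with a small simulation. This sketch does not use accli; it models the example graph as a dependency map and groups routines into waves that could run at the same time. `DEPENDENCIES` and `execution_waves` are hypothetical names, and the `finisher` entry encodes the assumption that the holder's callback waits for every child routine. It uses `graphlib`, so it assumes Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# routine -> routines that must finish first
DEPENDENCIES = {
    "prep": set(),
    "model_a": {"prep"},
    "model_b": {"prep"},
    "combine": {"model_a", "model_b"},
    # Assumption: the holder's callback waits for every child routine.
    "finisher": {"prep", "model_a", "model_b", "combine"},
}


def execution_waves(dependencies):
    """Group routines into waves; routines in one wave do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave as finished
    return waves


for number, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
    print(number, wave)
```

Under these assumptions model_a and model_b land in the same wave, which matches the parallel step in the list.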
🔁 Data Sharing Between Routines
Routines in a Jobflow can share data using:
1. Mounted Workflow Volume
- Automatically mounted at /mnt/pipe inside each routine's container
- Use this for fast, intermediate file sharing
- Ideal for passing outputs between jobs in a jobflow
2. Cloud or Remote Sources
- Use URLs such as acc:// or others (HTTP/S, FTP, S3, etc.)
- acc:// is optimized and colocated with the compute cluster for performance
- Other sources are valid if supported by your script and stack
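For the mounted volume, a handoff between two routines might look like the sketch below. Only the /mnt/pipe path comes from this page; the helper names, the JSON file format, and the `PIPE_DIR` environment variable (an override for trying the code outside the cluster) are assumptions made for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def pipe_dir():
    """Return the shared volume path; PIPE_DIR is a hypothetical override for local runs."""
    return Path(os.environ.get("PIPE_DIR", "/mnt/pipe"))


def write_output(base, name, payload):
    """Write JSON via a temporary file so a downstream routine never reads a partial file."""
    base.mkdir(parents=True, exist_ok=True)
    target = base / name
    tmp = base / (name + ".tmp")
    tmp.write_text(json.dumps(payload))
    tmp.replace(target)  # rename into place once the content is complete
    return target


def read_output(base, name):
    """Load a JSON file that an upstream routine left in the shared directory."""
    return json.loads((base / name).read_text())


# Local demonstration: a temporary directory stands in for /mnt/pipe.
with tempfile.TemporaryDirectory() as scratch:
    shared = Path(scratch)
    write_output(shared, "prep.json", {"rows": 3})
    print(read_output(shared, "prep.json"))
```

Inside a routine you would pass `pipe_dir()` as `base`, so an upstream script writes its result and a downstream script reads it from the same location.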
🧱 Holder Routine: The Root of Your Jobflow
To organize a jobflow, define a holder WKubeTask with no arguments:
root = WKubeTask() # Holder or root node
This routine does not execute anything; it is just a logical container that groups the actual routines and connects them through add_child() and add_callback().
🛠️ Minimal Requirements
To build a Jobflow:
- Extend your existing wkube.py file
- Create individual WKubeTask nodes for each routine
- Use add_child() and add_callback() to define the execution graph
No new files or formats are needed — it's all built using Python code.
🖥️ GUI Support
While this page focuses on defining Jobflows in code, the platform also supports building Jobflows visually through the web interface. You'll be able to:
- Drag and connect routines
- Set input/output mappings visually
- Launch and monitor execution interactively
👉 We'll explore the GUI-based Jobflow builder in the next section.
✅ Summary
- A Jobflow is a graph of routines built using WKubeTask
- Use a holder routine to define the root of the jobflow
- Connect routines with add_child() and add_callback()
- Share data via /mnt/pipe or cloud paths like acc://
- All of this is done inside your existing wkube.py file
This structure lets you build flexible, scalable, and maintainable jobflows — all within a familiar Python-based interface.