🧮 Jobflow: Defining Multi-Step Workflows

A Jobflow lets you define and execute a series of interconnected routines as a directed acyclic graph. This approach works for both simple linear pipelines and complex hierarchical jobflows.


🧠 What is a Jobflow?

A Jobflow is a directed acyclic graph (DAG) of routines. Each routine in the graph is defined as a WKubeTask, and their relationships define the order and dependencies of execution.

  • You can build:
    • A simple linear jobflow (Routine A → B → C)
    • A diverging jobflow (Routine A → B, A → C)
    • A converging jobflow (Routine B → D, Routine C → D)
    • Or a deep hierarchical tree
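The shapes listed above can be sketched as plain dependency maps, independent of accli. This is a minimal illustration using Python's standard `graphlib` module (Python 3.9+), not part of the WKubeTask API; the routine names A to D and the helper functions are hypothetical. It shows what "acyclic" means in practice: a valid execution order exists only when no routine depends on itself through a chain of other routines.

```python
from graphlib import CycleError, TopologicalSorter

# Each map reads: routine -> set of routines that must finish before it.
# The names A..D are placeholders taken from the list above.
linear = {"B": {"A"}, "C": {"B"}}        # A -> B -> C
diverging = {"B": {"A"}, "C": {"A"}}     # A -> B, A -> C
converging = {"D": {"B", "C"}}           # B -> D, C -> D


def execution_order(graph):
    """Return one valid execution order; raises CycleError if none exists."""
    return list(TopologicalSorter(graph).static_order())


def is_acyclic(graph):
    """Return True when the dependency map describes a DAG."""
    try:
        execution_order(graph)
        return True
    except CycleError:
        return False


print(execution_order(linear))
print(is_acyclic({"A": {"B"}, "B": {"A"}}))  # a two-routine cycle
```

A check like `is_acyclic` can be handy when sketching a graph on paper first, since a cycle would leave no routine able to start.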

✍️ Defining a Jobflow in wkube.py

A Jobflow can be fully defined in your wkube.py file using the same WKubeTask class that defines single routines.

To support multiple interconnected routines, WKubeTask offers two methods:

  • add_callback(task): Schedule a routine to run after the current one finishes
  • add_child(task): Add a child routine (used with a holder or root task)

⚠️ add_child() can only be called on a WKubeTask created with no arguments, which acts as a holder routine.
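To make the holder rule concrete, here is a toy model of the two relationships. It is only a sketch for reasoning about the rule: `SketchTask` is a hypothetical stand-in, not the accli `WKubeTask` class, and the choice to raise `ValueError` is an assumption, since the error accli actually reports is not described here.

```python
class SketchTask:
    """Toy stand-in for illustration only; NOT the accli WKubeTask class."""

    def __init__(self, name=None):
        # A task created with no arguments plays the role of a holder.
        self.name = name
        self.children = []
        self.callbacks = []

    @property
    def is_holder(self):
        return self.name is None

    def add_child(self, task):
        # Mirrors the documented restriction: only holders accept children.
        # The exception type is an assumption made for this sketch.
        if not self.is_holder:
            raise ValueError("add_child() is only allowed on a holder task")
        self.children.append(task)

    def add_callback(self, task):
        # Any task may schedule another task to run after it finishes.
        self.callbacks.append(task)


root = SketchTask()              # holder: no arguments
prep = SketchTask("prep")
model = SketchTask("model")

root.add_child(prep)             # allowed: root is a holder
prep.add_callback(model)         # allowed: model runs after prep

try:
    prep.add_child(model)        # not allowed: prep is not a holder
except ValueError as exc:
    print(exc)
```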


📂 Example: Diverging and Converging Job Graph

Below is an example of a Jobflow structure with diverging and converging paths:

python
from accli import WKubeTask

# Holder/root task (doesn't execute anything)
flow = WKubeTask()

# Define individual routines
prep = WKubeTask(
    name="Prepare Data",
    job_folder="./prep",
    base_stack="PYTHON3_7",
    command="python prepare.py",
    # ... (remaining arguments omitted)
)

model_a = WKubeTask(
    name="Model A",
    job_folder="./model_a",
    base_stack="PYTHON3_7",
    command="python model_a.py",
    # ... (remaining arguments omitted)
)

model_b = WKubeTask(
    name="Model B",
    job_folder="./model_b",
    base_stack="PYTHON3_7",
    command="python model_b.py",
    # ... (remaining arguments omitted)
)

combine = WKubeTask(
    name="Combine Results",
    job_folder="./combine",
    base_stack="PYTHON3_7",
    command="python combine.py",
    # ... (remaining arguments omitted)
)

finisher = WKubeTask(
    name="Finisher",
    job_folder="./finisher",
    base_stack="PYTHON3_7",
    command="python finalize.py",
    # ... (remaining arguments omitted)
)

# Define DAG relationships
flow.add_child(prep)             # Root → prep
prep.add_callback(model_a)       # prep → model_a
prep.add_callback(model_b)       # prep → model_b
model_a.add_callback(combine)    # model_a → combine
model_b.add_callback(combine)    # model_b → combine

flow.add_callback(finisher)      # Final step after all children complete

🚀 Dispatching a Jobflow

Dispatch the entire jobflow using the accli CLI, just like a regular routine.

Run the following from the directory containing your wkube.py:

bash
accli login
accli dispatch <project_slug> flow

Replace <project_slug> with your actual project identifier. Here, flow is the root WKubeTask object (holder) you defined in the script.

Dispatching the example graph defined above results in:

  • prep running first
  • Then model_a and model_b running in parallel
  • Then combine running after both have completed
  • Then finisher running after all children are done
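The execution order described above can be reproduced offline with a small simulation. This sketch uses the standard `graphlib` module (Python 3.9+) rather than accli, so it only models the dependencies; the dictionary below is a hand-written mirror of the example graph, and treating the holder's callback as depending on every child routine is an assumption about how "after all children complete" behaves.

```python
from graphlib import TopologicalSorter

# routine -> routines that must finish first (hand-written mirror of the
# example graph; finisher is modeled as waiting for every other routine).
dependencies = {
    "prep": set(),
    "model_a": {"prep"},
    "model_b": {"prep"},
    "combine": {"model_a", "model_b"},
    "finisher": {"prep", "model_a", "model_b", "combine"},
}


def execution_waves(graph):
    """Group routines into waves; routines in one wave could run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()   # everything whose dependencies are done
        waves.append(sorted(ready))
        sorter.done(*ready)          # mark the whole wave as finished
    return waves


waves = execution_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Each printed wave corresponds to one bullet in the list above, with model_a and model_b sharing a wave.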

🔁 Data Sharing Between Routines

Routines in a Jobflow can share data using:

1. Mounted Workflow Volume

  • Automatically mounted at /mnt/pipe inside each routine's container
  • Use this for fast, intermediate file sharing
  • Ideal for passing outputs between routines in a jobflow
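As a sketch of how two routines might hand a file to each other through the mounted volume: one routine writes a JSON file under /mnt/pipe and a later routine reads it. The file name `prepared.json`, the helper names, and the `PIPE_DIR` environment override are all hypothetical; only the /mnt/pipe mount point comes from the description above.

```python
import json
import os
from pathlib import Path

# /mnt/pipe is the documented mount point. The PIPE_DIR override is a
# hypothetical convenience for trying this sketch outside the cluster.
PIPE_DIR = Path(os.environ.get("PIPE_DIR", "/mnt/pipe"))


def write_output(filename, payload, pipe_dir=PIPE_DIR):
    """Write payload as JSON; rename at the end so readers never see a partial file."""
    pipe_dir = Path(pipe_dir)
    pipe_dir.mkdir(parents=True, exist_ok=True)
    target = pipe_dir / filename
    tmp = target.parent / (target.name + ".tmp")
    tmp.write_text(json.dumps(payload))
    os.replace(tmp, target)
    return target


def read_input(filename, pipe_dir=PIPE_DIR):
    """Read a JSON file produced by an earlier routine."""
    return json.loads((Path(pipe_dir) / filename).read_text())
```

In the example graph, a script such as prepare.py could call `write_output("prepared.json", {...})`, and model_a.py could call `read_input("prepared.json")` once prep has finished. Writing to a temporary name first is a design choice of this sketch, not a platform requirement.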

2. Cloud or Remote Sources

  • Use URLs such as acc:// or others (HTTP/S, FTP, S3, etc.)
  • acc:// is optimized and colocated with the compute cluster for performance
  • Other sources are valid if supported by your script and stack
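A routine script that accepts several kinds of sources might branch on the URL scheme before fetching anything. The sketch below only classifies a URL with the standard `urllib.parse` module; the function name and the example URLs are made up, and how an acc:// URL is actually fetched is deliberately left out because it is not described here.

```python
from urllib.parse import urlparse

# Schemes named in the list above, other than acc://.
REMOTE_SCHEMES = {"http", "https", "ftp", "s3"}


def source_kind(url):
    """Classify a source URL by its scheme (hypothetical helper)."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "acc":
        return "acc"
    if scheme in REMOTE_SCHEMES:
        return "remote"
    return "unsupported"


# Example URLs are placeholders, not real locations.
for url in ("acc://example/data.csv", "https://example.org/data.csv"):
    print(url, source_kind(url))
```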

🧱 Holder Routine: The Root of Your Jobflow

To organize a jobflow, define a holder WKubeTask with no arguments:

python
root = WKubeTask()  # Holder or root node

This routine does not execute anything. It is a logical container that groups the actual routines and connects them through add_child() and add_callback().


🛠️ Minimal Requirements

To build a Jobflow:

  • Extend your existing wkube.py file
  • Create individual WKubeTask nodes for each routine
  • Use add_child() and add_callback() to define the execution graph

No new files or formats are needed — it's all built using Python code.


🖥️ GUI Support

While this page focuses on defining Jobflows in code, the platform also supports building Jobflows visually through the web interface. You'll be able to:

  • Drag and connect routines
  • Set input/output mappings visually
  • Launch and monitor execution interactively

👉 We'll explore the GUI-based Jobflow builder in the next section.


✅ Summary

  • A Jobflow is a graph of routines built using WKubeTask
  • Use a holder routine to define the root of the jobflow
  • Connect routines with add_child() and add_callback()
  • Share data via /mnt/pipe or cloud paths like acc://
  • All of this is done inside your existing wkube.py file

This structure lets you build flexible, scalable, and maintainable jobflows — all within a familiar Python-based interface.