Web UI How-to Guide¶
Practical recipes for common Web UI tasks.
Connect to a Different Cluster¶
Using an SSH profile:
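A minimal sketch, assuming a profile named `myserver` has already been created with `srunx ssh profile` (the profile name is an assumption; the `SRUNX_SSH_PROFILE` variable is the one the server looks for):

```shell
# Select a saved SSH profile by name, then start the Web UI server
SRUNX_SSH_PROFILE=myserver srunx ui
```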
Using direct connection parameters:
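When the default SSH key is sufficient, hostname and username alone are enough (both values below are placeholders):

```shell
# Connect directly without a saved profile
SRUNX_SSH_HOSTNAME=slurm.example.com \
SRUNX_SSH_USERNAME=user \
srunx ui
```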
With a custom SSH key:
```shell
SRUNX_SSH_HOSTNAME=slurm.example.com \
SRUNX_SSH_USERNAME=user \
SRUNX_SSH_KEY=~/.ssh/id_ed25519 \
srunx ui
```
Change the Server Port¶
By default, the server runs on port 8000. Use the `--port` option:
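For example, to serve on port 8080 instead (the port number here is arbitrary):

```shell
srunx ui --port 8080
```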
Monitor GPU Resources¶
- Navigate to Resources in the sidebar
- Each partition shows:
- GPU utilization bar with color coding (green < 70%, orange < 90%, red > 90%)
- Total/idle/down node counts
- Overall utilization percentage
- Data refreshes every 10 seconds via polling
Cancel a Running Job¶
- Navigate to Jobs
- Find the job in the table (use search or status filter)
- Click the red cancel button (X icon) in the Actions column
- The job will be cancelled via `scancel` on the remote cluster
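If you prefer the terminal, the same effect can be achieved directly on the cluster with SLURM's standard `scancel` command (the job ID below is hypothetical):

```shell
# Cancel SLURM job 12345 by hand; equivalent to the Web UI cancel button
scancel 12345
```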
View Job Logs¶
- Navigate to Jobs
- Click the log icon next to a job
- Switch between stdout and stderr tabs
- For running jobs, logs are polled every 3 seconds
Upload and Visualize a Workflow¶
- Navigate to Workflows
- Click Upload YAML
- Select a `.yaml` or `.yml` file
- After upload, click View DAG on the workflow card
- The DAG view shows:
- Job nodes with status badges
- Dependency edges between jobs
- Resource indicators (GPU count, node count)
- Click a node to see detailed job information
- Toggle between DAG and List views using the toolbar buttons
Warning
Workflow YAML files containing `python:` args are rejected for security reasons.
Use `python:` args only via the CLI (`srunx flow run`).
Build a Workflow with the DAG Builder¶
- Navigate to Workflows and click New Workflow
- Enter a workflow name in the toolbar input (alphanumeric, hyphens, and underscores only)
- Click Add Job to add job nodes to the canvas
- Click each node to open the property panel and configure name, command, and resources
- Drag from a node's bottom handle to another node's top handle to create a dependency
- Click Save Workflow to validate and persist the workflow as YAML
The builder performs client-side validation before submitting:
- Every job must have a non-empty name and command
- Job names must be unique
- The dependency graph must be acyclic (no circular dependencies)
Edit Job Properties¶
- Click a job node on the DAG builder canvas
- The property panel slides in from the right with these sections:
- Basic — Name, command, work directory, log directory
- Resources — Nodes, GPUs per node, tasks per node, CPUs per task, memory, time limit, partition, node list
- Environment — Conda environment, virtualenv path, environment variables
- Container — Enable/disable container mode with runtime (Pyxis, Apptainer, Singularity), image, mounts, workdir
- Retry — Retry count and delay in seconds
- Changes are applied immediately to the node on the canvas
- Click the X button or click empty canvas space to close the panel
- Click the trash icon to delete the selected job
Use the File Browser for Remote Paths¶
The file browser lets you pick files and directories from your local project tree, automatically translating them to remote paths.
- Ensure you have at least one mount configured (see Web UI tutorial Step 8)
- In the property panel, click the folder icon next to Command, Work Dir, or Log Dir
- Select a mount from the Project dropdown
- Browse the directory tree by clicking folders to expand them
- Click a file or directory to select it (the footer shows the remote path)
- Click Select to insert the remote path into the field
Note
When selecting a file for the Command field, the path is made relative to the work directory if one is set. This keeps your commands portable.
Manage Mount Points¶
Mount points define local-to-remote directory mappings for the file browser.
Add a mount:
```shell
srunx ssh profile mount add myserver ml-project \
  --local ~/projects/ml-project \
  --remote /home/researcher/ml-project
```
List mounts:
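A plausible invocation, assuming the `mount` subcommands mirror the `mount add` form shown above (the `list` subcommand name is an assumption):

```shell
# List all mounts configured on the "myserver" profile (subcommand name assumed)
srunx ssh profile mount list myserver
```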
Remove a mount:
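Again assuming symmetry with `mount add` (the `remove` subcommand name is an assumption):

```shell
# Remove the "ml-project" mount from the "myserver" profile (subcommand name assumed)
srunx ssh profile mount remove myserver ml-project
```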
Mounts are stored in the SSH profile configuration. The `--local` path is the directory on your local machine; `--remote` is the corresponding path on the SLURM cluster.
Sync Files Before Running¶
The file browser shows local files, but workflows execute on the remote cluster. Ensure files are synchronized before running.
From the file browser:
- Open the file browser from any path field in the property panel
- Click Sync Now in the yellow banner at the top of the file tree
- Wait for the sync to complete (the button changes to "Synced")
From the command line:
```shell
rsync -avz --delete \
  -e "ssh -i ~/.ssh/id_ed25519" \
  ~/projects/ml-project/ \
  researcher@dgx.example.com:/home/researcher/ml-project/
```
Warning
If you modify local files after syncing, you must sync again before running the workflow on the cluster.
Change Dependency Types¶
By default, edges use `afterok` (run only if the upstream job completes successfully). To change the dependency type:
- Click an edge on the DAG builder canvas
- A popover appears with four options:
- `afterok` — Run after successful completion (default)
- `after` — Run after the upstream job starts running
- `afterany` — Run after the upstream job ends regardless of status
- `afternotok` — Run only if the upstream job fails
- Click the desired type. The edge updates immediately
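These types map onto SLURM's standard `--dependency` syntax. For example, a hypothetical cleanup job that should run only when job 12345 fails would be submitted to SLURM roughly as:

```shell
# Submit cleanup.sh to run only if job 12345 ends in a failed state
# (the job ID and script name are hypothetical)
sbatch --dependency=afternotok:12345 cleanup.sh
```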
Run a Workflow¶
- Navigate to Workflows and click a workflow card to open the detail page
- Click Run Workflow in the toolbar
- The system identifies referenced mounts by matching each job's `work_dir` against mount remote paths (longest prefix match), then syncs them via rsync
- Jobs are submitted in dependency order. SLURM handles scheduling through `--dependency` flags (e.g., `--dependency=afterok:12345`)
- Status updates appear in the DAG view every 10 seconds. Each node shows the current SLURM state (PENDING, RUNNING, COMPLETED, FAILED)
Note
The run transitions through these phases: syncing (pushing files), submitting (sending sbatch commands), running (polling job statuses), then a terminal state (completed, failed, or cancelled).
Run a Parameter Sweep¶
Launch the same workflow over a cross-product of arg values without
duplicating YAML files. Every cell runs as an independent `workflow_run`
under one parent `sweep_run`.
Open the Run dialog:
- Navigate to Workflows
- On the target workflow card, click the Run button (triangle icon)
Toggle LIST mode on args:
- Each arg row has a SINGLE | LIST toggle on the right
- Switch the arg you want to sweep to LIST
- Enter comma-separated values (for example `0.001, 0.01, 0.1`)
- Repeat for any additional sweep axes -- the cell count is the cross-product of all LIST args
- A `Sweep: N cells · lr[3] · seed[3]` badge appears above the submit button, showing the total cell count and the cardinality of each axis
Configure Advanced options:
- Expand the Advanced section below the args
- Toggle `fail_fast` to cancel peers on first failure (default: off)
- Set `max_parallel` (default: 4; required for sweeps)
Submit:
- Click RUN SWEEP (N) where N is the cell count
- The UI navigates to `/sweep_runs/{id}` for the new sweep
Track progress on the sweep detail page:
- Progress bar -- segmented by cell status (completed / failed / cancelled / running)
- ETA -- linear estimate based on completed cells; hidden until at least one cell reaches a terminal state
- Draining indicator -- spinner plus "draining pending cells" label while a cancelled sweep finishes up active cells
- Cells table with:
- Status filter dropdown (all / pending / running / completed / failed / cancelled)
- Sortable columns: #, Status, Started, Completed, Duration
- Per-cell cancel x button (shown only for non-terminal cells)
- Click any cell row to drill into `/workflow_runs/{cell_id}` for the per-cell DAG view and logs
Note
Sweep submissions always route through the configured SSH adapter via
a per-sweep pool. If no SSH profile is configured, the RUN SWEEP
button is disabled.
Cancel a Running Workflow¶
- While a workflow is running, click Cancel in the toolbar on the workflow detail page
- All submitted SLURM jobs are cancelled via `scancel`
- The run status changes to `cancelled`
Warning
Jobs that have already completed before the cancel request are not affected. Only PENDING and RUNNING jobs are cancelled.
Edit an Existing Workflow¶
- On the workflow detail page, click the Edit button (pencil icon) in the toolbar
- Alternatively, on the workflow list page, click Edit on a workflow card
- The DAG builder opens with all jobs and dependency edges pre-loaded on the canvas
- Modify job properties by clicking nodes, add or remove connections by dragging handles, or add new jobs with Add Job
- Click Update Workflow to validate and save the changes
The update overwrites the existing YAML file on disk. The same validation rules apply: unique job names, non-empty commands, and an acyclic dependency graph.
Delete a Workflow¶
- On the workflow detail page, click the delete button (trash icon)
- Confirm in the dialog that appears
- The YAML file is removed from the workflow directory and you are redirected to the workflow list
Warning
Deletion is permanent. There is no undo. Active runs for the deleted workflow are not affected (they continue to track job statuses in memory), but you will not be able to start new runs.
Manage Mounts from the Web UI¶
In addition to the CLI, mounts can be managed directly from the DAG builder.
- In the DAG builder, click the gear icon in the toolbar to open Mount Settings
- The modal shows all configured mounts with their name, local path, and remote path
- To add a new mount, fill in the name, local path, and remote path fields, then click Add
- To remove a mount, click the trash icon on the corresponding row
- Changes are saved immediately to the SSH profile configuration (`~/.config/srunx/config.json`)
Note
Mounts added through the Web UI are identical to those added via `srunx ssh profile mount add`. Both methods modify the same configuration file.
View Job Logs from a Workflow Run¶
- During or after a workflow run, click a job node in the DAG view
- The sidebar shows job details including the SLURM job ID and current status
- Click View Logs to see stdout and stderr output
- For running jobs, log content is polled and refreshed automatically
Run Without SSH (Frontend Only)¶
For frontend development or demos, the server can start without an SSH connection:
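Just launch the server with none of the `SRUNX_SSH_*` variables set:

```shell
# Start the Web UI with no SSH configuration; SLURM-backed endpoints return 503
srunx ui
```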
If no SSH profile is configured, the server starts with a warning:
```
No SSH configuration provided. Set SRUNX_SSH_PROFILE or
SRUNX_SSH_HOSTNAME + SRUNX_SSH_USERNAME to connect to a SLURM cluster.
```
SLURM endpoints will return 503 Service Unavailable, but the frontend loads normally.
Develop the Frontend¶
For frontend development with hot-reload:
```shell
# Terminal 1: Start the backend
srunx ui

# Terminal 2: Start the Vite dev server
cd src/srunx/web/frontend
npm install
npm run dev
```
The Vite dev server runs on http://localhost:3000 and proxies API requests to the backend on port 8000.
Run Tests¶
Backend tests:
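A typical invocation; the exact test runner is an assumption (pytest is the common choice for Python backends):

```shell
# Run the backend test suite from the repository root (runner assumed to be pytest)
pytest
```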
Frontend E2E tests:
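Assuming Playwright, a common E2E runner for Vite frontends (the runner choice is an assumption):

```shell
# Run end-to-end tests from the frontend directory (runner assumed to be Playwright)
cd src/srunx/web/frontend
npx playwright test
```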
Frontend type check:
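Assuming a standard TypeScript setup in the frontend directory:

```shell
# Type-check the frontend without emitting build output
cd src/srunx/web/frontend
npx tsc --noEmit
```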
