github.com/clemsonciti/jobperf
Version: 0.1.0
Repository: https://github.com/clemsonciti/jobperf.git
Documentation: pkg.go.dev

# README

Jobperf

A tool to check the resource usage of jobs on HPC clusters. Jobperf was originally developed by Clemson University for use on its HPC cluster, Palmetto.

For more information on usage, see Clemson's jobperf documentation.

The design of this system was documented in "A Simple Resource Usage Monitor for Users of PBS and Slurm", presented at PEARC24.

Install

Pre-built binaries are available as GitHub releases for Linux/amd64.

Requirements

  • For GPUs, nvidia-smi should be installed and available.
  • Jobperf has been run on both PBS and Slurm. However, scheduler deployments vary widely, so it is not expected to work on every cluster. The results of testing jobperf on a variety of clusters:

| Cluster           | Scheduler | Version  | JobAcctGatherType     | CLI works? | Web works? |
|-------------------|-----------|----------|-----------------------|------------|------------|
| Palmetto 1        | OpenPBS   | 20.0.0   | N/A                   | Yes        | Yes        |
| Palmetto 2        | Slurm     | 23.11.3  | jobacct_gather/cgroup | Yes        | Yes        |
| Stampede 3 (TACC) | Slurm     | 23.11.1  | jobacct_gather/linux  | No         | No         |
| Anvil (Purdue)    | Slurm     | 23.11.1  | jobacct_gather/linux  | Yes        | Yes        |
| Delta (NCSA)      | Slurm     | 23.02.7  | jobacct_gather/cgroup | Yes        | No         |
| Bridges-2 (PSC)   | Slurm     | 22.05.11 | jobacct_gather/cgroup | No         | No         |
| Expanse (SDSC)    | Slurm     | 23.02.7  | jobacct_gather/linux  | Yes        | Yes        |

  • Jobperf failed on Stampede 3 because the expected cgroups were missing and scontrol listpids did not behave as expected.
  • Jobperf failed on Bridges-2 because squeue failed with the --json flag when filtering by job ID. This could be due to an older version of Slurm.

Jobperf may break on future versions of Slurm because it relies on the JSON output of squeue and sacct staying consistent. Usually it is not too hard to fix Jobperf once the new version's output is known.
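
To illustrate the kind of breakage this refers to, here is a minimal sketch of tolerant decoding in Go. It assumes a hypothetical change in which a numeric field goes from a bare number to a wrapper object between scheduler versions. The type and function names (flexInt, job, parseJobs) and the JSON field names are assumptions made for illustration; they describe neither Slurm's actual schema nor how jobperf parses scheduler output.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// flexInt (hypothetical) accepts either a bare JSON number, e.g. 4, or a
// wrapper object, e.g. {"set": true, "infinite": false, "number": 4}.
type flexInt struct {
	Value int64
	Set   bool
}

// UnmarshalJSON tries the bare-number form first, then the wrapper form.
func (f *flexInt) UnmarshalJSON(data []byte) error {
	if bytes.Equal(data, []byte("null")) {
		return nil // leave the field unset
	}
	var n int64
	if err := json.Unmarshal(data, &n); err == nil {
		f.Value, f.Set = n, true
		return nil
	}
	var obj struct {
		Set    bool  `json:"set"`
		Number int64 `json:"number"`
	}
	if err := json.Unmarshal(data, &obj); err != nil {
		return fmt.Errorf("unrecognized numeric encoding %s: %w", data, err)
	}
	f.Value, f.Set = obj.Number, obj.Set
	return nil
}

// job lists only the fields this sketch cares about; the field names are
// assumed for illustration.
type job struct {
	JobID int64   `json:"job_id"`
	CPUs  flexInt `json:"cpus"`
}

// parseJobs decodes a document of the assumed form {"jobs": [...]}.
func parseJobs(raw []byte) ([]job, error) {
	var out struct {
		Jobs []job `json:"jobs"`
	}
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, fmt.Errorf("decoding scheduler JSON: %w", err)
	}
	return out.Jobs, nil
}

func main() {
	// Two hand-written samples, one per encoding; not real scheduler output.
	samples := []string{
		`{"jobs":[{"job_id":101,"cpus":4}]}`,
		`{"jobs":[{"job_id":102,"cpus":{"set":true,"infinite":false,"number":8}}]}`,
	}
	for _, s := range samples {
		jobs, err := parseJobs([]byte(s))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		for _, j := range jobs {
			fmt.Printf("job %d uses %d CPUs\n", j.JobID, j.CPUs.Value)
		}
	}
}
```

Decoding only the fields that are needed, and returning a descriptive error for anything unrecognized, makes it easier to see which field changed when a new scheduler version is deployed.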

Configuration

TODO

Build

To build this tool, you need a recent version of Go (at least version 1.21). For complete install instructions, see the official Go website.

Then build it like most other Go tools, first running go generate to fetch the JS dependencies:

go generate ./...
go build ./cmd/jobperf

The built binary will be available as jobperf.

Build Configuration

When building the binary, you can embed the version and some default configuration parameters with -ldflags:

| Parameter              | Meaning                                                                            |
|------------------------|------------------------------------------------------------------------------------|
| buildVersion           | The build version.                                                                 |
| buildCommit            | The build commit.                                                                  |
| buildDate              | The build date.                                                                    |
| defaultSupportURL      | The URL used for the support link.                                                 |
| defaultDocsURL         | The URL used for the documentation link.                                           |
| defaultUseOpenOnDemand | Whether Open OnDemand should be used as a reverse proxy when HTTP mode is enabled. |
| defaultOpenOnDemandURL | The URL of the Open OnDemand instance to use as reverse proxy.                     |

For example, to set the documentation URL to https://example.com when building, run:

go build -ldflags='-X main.defaultDocsURL=https://example.com' ./cmd/jobperf

This repo also has a goreleaser configuration file (.goreleaser.yaml) which sets these appropriately for the releases.
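
For readers unfamiliar with the -ldflags '-X ...' mechanism used above, the following is a minimal standalone sketch of how a Go program can receive such values. The linker's -X flag can only set package-level string variables, so a boolean-like setting has to be stored as a string and parsed at run time. The variable names mirror the parameter table, but the fallback values and the helper functions (docsLink, useOpenOnDemand) are assumptions for illustration and may not match how jobperf's own source is written.

```go
package main

import (
	"fmt"
	"strconv"
)

// Package-level string variables that the linker can overwrite with
// -ldflags '-X main.<name>=<value>'. The fallback values are assumptions.
var (
	buildVersion           = "dev"
	defaultDocsURL         = ""
	defaultUseOpenOnDemand = "false" // -X only sets strings, so parse later
)

// docsLink returns the documentation URL set at build time, or a
// placeholder when none was provided.
func docsLink() string {
	if defaultDocsURL == "" {
		return "(no documentation URL configured)"
	}
	return defaultDocsURL
}

// useOpenOnDemand interprets the embedded string as a boolean and treats
// anything unparsable as false.
func useOpenOnDemand() bool {
	v, err := strconv.ParseBool(defaultUseOpenOnDemand)
	return err == nil && v
}

func main() {
	fmt.Printf("version %s, docs: %s, Open OnDemand proxy: %t\n",
		buildVersion, docsLink(), useOpenOnDemand())
}
```

Built without any -ldflags, this sketch falls back to the values in the var block; each -X main.<name>=<value> pair passed at build time replaces the corresponding default.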