The fudgelet is a lightweight agent that runs as a background process on your nodes. It manages workloads, reports hardware metrics, and sends your application logs to the server.
You should install the fudgelet on every GPU node in your cluster. If you are running a Slurm cluster, you should also install the fudgelet on the login nodes.
Installer Scripts
To install the fudgelet using the standard package manager for your distribution, you can use the install.sh script.
Install the fudgelet using a Linux package manager
curl https://get.clusterfudge.com/install.sh | { export CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE; sudo -E bash; }
If you'd just like to download the latest fudgelet version and have your fudgelet.toml config file initialized, use the download.sh script.
Download the fudgelet binary using wget/curl
curl https://get.clusterfudge.com/download.sh | { export CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE; sudo -E bash; }
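Both commands above export `CLUSTERFUDGE_API_KEY` and then call `sudo -E bash`, where `-E` asks sudo to preserve the caller's environment so the script can read the key. A minimal sketch of a guard that could be run first to catch an empty key; `require_api_key` is a hypothetical helper, not part of the installer:

```shell
# Hypothetical helper: fail early when the API key is missing or empty.
require_api_key() {
  if [ -z "${CLUSTERFUDGE_API_KEY:-}" ]; then
    echo "CLUSTERFUDGE_API_KEY is not set" >&2
    return 1
  fi
}

# Possible usage (left commented out so nothing is downloaded here):
# export CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE
# require_api_key && curl https://get.clusterfudge.com/install.sh | sudo -E bash
```

Whether sudo actually passes the variable through also depends on the local sudoers policy, so treat this as a convenience check rather than a guarantee.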
Custom Install
If you want more control over how the fudgelet is installed, you can download the latest packages directly from our package repository.
Download the latest version directly
curl -fsSL -o $OUT https://storage.googleapis.com/clusterfudge-releases/fudgelet/$OS/$ARCH/latest/fudgelet
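The command above leaves `$OUT`, `$OS`, and `$ARCH` for you to fill in. As a rough sketch, they could be derived from `uname`. The value names used here (`linux`, `amd64`, `arm64`) are assumptions about how the release bucket is laid out, so check them against the real paths before relying on this. The helper names are hypothetical, and the script only prints the resulting command instead of running it:

```shell
# Map `uname -m` output to an architecture name (naming is an assumption).
detect_arch() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64) echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unrecognised architecture" >&2; return 1 ;;
  esac
}

# Lowercase `uname -s` output, e.g. "Linux" becomes "linux" (assumed naming).
detect_os() {
  printf '%s\n' "${1:-$(uname -s)}" | tr '[:upper:]' '[:lower:]'
}

# Build the download URL from the pattern shown above.
fudgelet_url() {
  echo "https://storage.googleapis.com/clusterfudge-releases/fudgelet/$1/$2/latest/fudgelet"
}

OS="$(detect_os)"
ARCH="$(detect_arch)" || ARCH="unknown"
OUT="${OUT:-./fudgelet}"

# Print the command for review; run it yourself once the values look right.
echo "curl -fsSL -o $OUT $(fudgelet_url "$OS" "$ARCH")"
```

After a real download, the file would typically also need `chmod +x "$OUT"` before it can be executed.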
Configure the Apt repo manually
curl https://europe-west2-apt.pkg.dev/doc/repo-signing-key.gpg | sudo gpg --dearmor -o /usr/share/keyrings/google-ar-archive-keyring.gpg
sudo chmod 0644 /usr/share/keyrings/google-ar-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/google-ar-archive-keyring.gpg] https://europe-west2-apt.pkg.dev/projects/clusterfudge-images fudgelet main" | sudo tee -a /etc/apt/sources.list.d/artifact-registry.list
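Adding the repository does not install anything by itself. A minimal sketch of the remaining step, assuming the Apt package is called `fudgelet` like the Yum package (the Apt package name is not stated here, so treat it as an assumption). `run` is a hypothetical dry-run wrapper so the commands can be reviewed before they execute:

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Refresh the package index, then install (package name is an assumption).
install_fudgelet_apt() {
  run sudo apt-get update &&
  run sudo apt-get install -y fudgelet
}

# Dry run by default in this sketch; set DRY_RUN=0 to actually install.
DRY_RUN=1
install_fudgelet_apt
```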
Configure the Yum repo manually
sudo tee -a /etc/yum.repos.d/artifact-registry.repo << EOL
[fudgelet]
name=fudgelet
baseurl=https://europe-west2-yum.pkg.dev/projects/clusterfudge-images/fudgelet-rpm
enabled=1
repo_gpgcheck=0
type=rpm
gpgcheck=0
metadata_expire=6h
EOL
sudo yum makecache
sudo yum install -y fudgelet
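Whichever route you take, it can be worth checking that the binary is reachable afterwards. A small sketch, assuming the package places an executable named `fudgelet` on your `PATH` (the install location is not documented here, so this is an assumption). `have_cmd` is a hypothetical helper:

```shell
# Hypothetical helper: succeed when the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd fudgelet; then
  echo "fudgelet found at $(command -v fudgelet)"
else
  echo "fudgelet not found on PATH"
fi
```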