PodWarden
Use Cases

Blender Render Farms

Manage homogeneous and heterogeneous render fleets for 3D animation and VFX production — all supporting infrastructure included

Rendering 3D scenes is embarrassingly parallel: every frame is independent. This makes render farms conceptually simple — split the frame range across workers, collect the output — but operationally messy. You have to provision workers consistently, share scene files across machines, route frames to the right render engine, and keep the farm healthy during a production crunch.
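The frame-splitting itself is trivial, which is exactly the point. A minimal sketch of the chunking a render manager performs (the function name is illustrative, not part of any API mentioned here):

```python
def split_frames(start: int, end: int, workers: int) -> list[range]:
    """Divide an inclusive frame range into near-even chunks, one per worker."""
    total = end - start + 1
    chunks, cursor = [], start
    for i in range(workers):
        # spread any remainder across the first few workers
        size = total // workers + (1 if i < total % workers else 0)
        if size:
            chunks.append(range(cursor, cursor + size))
            cursor += size
    return chunks
```

Every chunk renders independently, so the hard part is never the math; it is the provisioning, storage, and health-keeping described below.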

PodWarden manages the render workers and every supporting service they depend on. If you already have shared storage, a render manager, or a monitoring stack, bring it. If you don't, every piece is in the Hub catalog.


What You Need

| Component | Bring your own | Or deploy from Hub |
|---|---|---|
| Shared storage | Existing NFS server, S3 bucket | NFS-Ganesha (scene files + frames over NFS) or MinIO / RustFS (S3 object storage) |
| Render manager | Deadline, OpenCue, Flamenco, custom | Flamenco: open-source Blender render manager, deploys as a workload |
| Database | Existing PostgreSQL or SQLite | PostgreSQL: from Hub, used by Flamenco and other services |
| Monitoring | Existing Prometheus + Grafana | Prometheus + Grafana: cluster and GPU utilization dashboards |
| SSO / access control | Existing identity provider | Keycloak: from Hub, SSO for the render manager web UI and PodWarden itself |

Stack Architecture

Render workers mount shared storage for scene files and frame output. The render manager assigns frame ranges to those workers and keeps job state in PostgreSQL. Prometheus and Grafana watch node and GPU health, and Keycloak provides SSO in front of the web UIs.

Building the Foundation

1. Shared Storage

Render workers need to read scene files and write rendered frames. All workers must reach the same storage location.

If you have an NFS server or S3 bucket, register it as a storage connection in PodWarden. PodWarden checks that every node in your cluster can reach the storage before you deploy workers that depend on it.

If you don't:

  • NFS-Ganesha — Import from Hub. Deploy to a node with fast local disk (or attach a large volume). Workers mount /scenes and /output via NFS. Every node sees the same filesystem.
  • MinIO or RustFS — If your render pipeline supports S3, deploy either as a workload. Use separate buckets for inbound scene archives and outbound frame output.

For large teams with many render nodes, NFS is simpler for Blender (it reads .blend files with relative paths). For distributed pipelines or cloud burst nodes, S3 works well for archival and transfer.
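If you deploy NFS-Ganesha from Hub, workers reach it through a standard Kubernetes NFS volume. A trimmed sketch, assuming a service DNS name of `nfs-ganesha.storage` and the export paths above (neither is a PodWarden default):

```yaml
# Hypothetical PersistentVolume backed by the NFS-Ganesha workload
apiVersion: v1
kind: PersistentVolume
metadata:
  name: scenes
spec:
  capacity:
    storage: 2Ti
  accessModes: [ReadWriteMany]    # every worker reads the same scene files
  nfs:
    server: nfs-ganesha.storage   # assumed service name, adjust to yours
    path: /scenes
```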

2. Render Manager

The render manager assigns frame ranges to workers and collects output.

If you use Deadline or OpenCue, PodWarden manages the worker containers that connect to your existing manager.

If you don't have one: import Flamenco from Hub. Flamenco is the Blender Foundation's own render manager — it understands Blender jobs natively, follows a manager + worker model, and stores state in PostgreSQL.

  • Deploy Flamenco Manager (one instance, persistent deployment) pointing at your NFS storage
  • Deploy Flamenco Worker (the workload that runs on render nodes) as a DaemonSet or Deployment
  • Import PostgreSQL from Hub for Flamenco's database if you don't have one
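A trimmed Deployment sketch for the Manager piece; the image tag and port are illustrative assumptions, not the Hub template's actual values:

```yaml
# Sketch only: one persistent Flamenco Manager instance
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flamenco-manager
spec:
  replicas: 1                            # single Manager, persistent
  selector:
    matchLabels: { app: flamenco-manager }
  template:
    metadata:
      labels: { app: flamenco-manager }
    spec:
      containers:
        - name: manager
          image: flamenco/manager:latest # hypothetical image name
          ports:
            - containerPort: 8080        # web UI + worker API
          volumeMounts:
            - name: scenes
              mountPath: /scenes         # same NFS share the workers see
```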

3. Monitoring

Import Prometheus and Grafana from Hub. For GPU nodes, add DCGM Exporter as a DaemonSet — it runs automatically on every node and exposes per-GPU metrics: utilization, VRAM usage, temperature.
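A minimal Prometheus scrape job for the exporter might look like this; the target hostnames are placeholders, and 9400 is DCGM Exporter's usual default port:

```yaml
# Fragment of prometheus.yml: pull per-GPU metrics from each GPU node
scrape_configs:
  - job_name: dcgm-exporter
    static_configs:
      - targets: ['node-a:9400', 'node-b:9400']  # one entry per GPU node
```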

The Grafana dashboard shows at a glance which nodes are rendering, which are idle, and which are struggling (thermal throttle, VRAM pressure, stalled jobs).


Adding Render Nodes

Any machine with SSH access becomes a PodWarden host: gaming PCs, workstations, VMs, cloud instances. PodWarden detects GPU model and VRAM automatically at provisioning time.

Provision one host as the cluster control plane. Then join additional nodes — one click per node, PodWarden installs K3s and the GPU runtime via SSH.

Homogeneous farms

All nodes have the same hardware. Frame distribution is predictable and even. Create one render worker template, deploy to the cluster. The scheduler places workers across all nodes.

Heterogeneous farms

Mixed hardware — some A100 nodes, some older GTX 1080 workstations, some CPU-only machines. Two approaches:

Node selectors — Tag nodes with hardware labels during provisioning:

{ "gpu": "a100" }
{ "gpu": "rtx-4090" }
{ "cpu-only": "true" }

Create separate stacks for each hardware class, each with a matching node_selector. Deploy all definitions to the same cluster. Blender workers with CUDA targets land on GPU nodes; CPU workers land on everything else.
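In Kubernetes terms, the hardware classes become separate worker definitions that differ only in their node selector. A sketch with illustrative names:

```yaml
# GPU worker stack: scheduled only onto A100 nodes
kind: Deployment
metadata: { name: blender-worker-a100 }
spec:
  template:
    spec:
      nodeSelector: { gpu: a100 }
---
# CPU worker stack: scheduled only onto CPU-only machines
kind: Deployment
metadata: { name: blender-worker-cpu }
spec:
  template:
    spec:
      nodeSelector: { cpu-only: "true" }
```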

Separate clusters — Group high-end nodes into a "fast" cluster and older hardware into a "slow" cluster. Deploy different Flamenco worker templates to each. Your render manager submits heavy frames to the fast cluster and lighter frames to the slow one.


Render Worker Templates

GPU worker (CUDA/OptiX)

Kind:           Deployment   (or DaemonSet for one-per-node)
Image:          linuxserver/blender:latest
GPU count:      1
VRAM:           16Gi
CPU:            16
Memory:         32Gi
Node selector:  { "accelerator": "nvidia" }

| Variable | Example | Description |
|---|---|---|
| RENDER_DEVICE | OPTIX | Render device: OPTIX, CUDA, HIP, CPU |
| RENDER_ENGINE | CYCLES | Blender engine |
| FLAMENCO_MANAGER | http://flamenco.mesh:8080 | Render manager URL |
| WORKER_NAME | (hostname) | Worker display name in Flamenco |
| TILE_SIZE | 256 | Tile size: larger for GPU, smaller for CPU |
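These variables map directly onto Blender's command-line interface. A sketch of how a worker might assemble the render command; the wrapper function and frame arguments are illustrative, while the Blender flags themselves are standard:

```python
import os

def build_render_cmd(blend_file: str, start: int, end: int) -> list[str]:
    """Assemble a headless Blender render command from worker env vars."""
    device = os.environ.get("RENDER_DEVICE", "CPU")    # OPTIX, CUDA, HIP, CPU
    engine = os.environ.get("RENDER_ENGINE", "CYCLES")
    cmd = [
        "blender", "-b", blend_file,       # -b: render without a UI
        "-E", engine,                      # render engine
        "-o", "/output/frame_####",        # #### expands to the frame number
        "-s", str(start), "-e", str(end),  # frame range from the manager
        "-a",                              # render the whole range
    ]
    if device != "CPU":
        # Cycles device selection is passed after "--"
        cmd += ["--", "--cycles-device", device]
    return cmd
```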

CPU worker

Kind:           Deployment
Image:          linuxserver/blender:latest
GPU count:      0
CPU:            32
Memory:         64Gi
Node selector:  { "cpu-only": "true" }

CPU workers are useful for older hardware still capable of rendering, nodes without NVIDIA GPUs, or scenes that don't run efficiently on GPU (volumes, certain shaders).

Volume mounts

| Mount path | Volume type | Contents |
|---|---|---|
| /scenes | NFS | Source .blend files |
| /output | NFS | Rendered frame output |
| /cache | emptyDir | Blender texture and shader cache |

DaemonSet Mode

Use kind: DaemonSet to run exactly one render worker per node, automatically. Add a node to the cluster and it gets a worker within seconds — no manual assignment. Remove a node and the worker is gone. Ideal when you want maximum utilization with zero scheduling overhead.
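Put together, a one-worker-per-node declaration might look like the following trimmed sketch; the labels and NFS server name are assumptions, and the image matches the template above:

```yaml
apiVersion: apps/v1
kind: DaemonSet                 # exactly one worker per matching node
metadata:
  name: blender-worker
spec:
  selector:
    matchLabels: { app: blender-worker }
  template:
    metadata:
      labels: { app: blender-worker }
    spec:
      nodeSelector: { accelerator: nvidia }  # GPU nodes only
      containers:
        - name: worker
          image: linuxserver/blender:latest
          resources:
            limits: { nvidia.com/gpu: 1 }    # one GPU per worker
          volumeMounts:
            - { name: scenes, mountPath: /scenes }
            - { name: output, mountPath: /output }
            - { name: cache,  mountPath: /cache }
      volumes:
        - name: scenes
          nfs: { server: nfs-ganesha.storage, path: /scenes }  # assumed name
        - name: output
          nfs: { server: nfs-ganesha.storage, path: /output }
        - name: cache
          emptyDir: {}           # scratch cache, discarded with the pod
```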


Scaling for Deadline Crunches

When a production schedule compresses:

  • Rent cloud GPU instances and join them to the cluster; one click per node, and PodWarden installs K3s and the GPU runtime over SSH
  • With DaemonSet workers, each new node picks up a render worker within seconds of joining
  • The render manager sees the new workers and starts assigning frames, with no reconfiguration

When the crunch passes, remove the rented nodes. Your permanent studio machines stay in the cluster. Billed time on cloud nodes: only the hours they were rendering.


End-to-End Production Sequence

  1. Deploy shared storage from Hub, or register your existing NFS/S3 as a storage connection
  2. Deploy PostgreSQL and Flamenco Manager, pointed at the shared storage
  3. Join render nodes over SSH and deploy worker templates matched to each hardware class
  4. Upload scene files to /scenes and submit the job in Flamenco
  5. Watch progress and GPU health in Grafana; add cloud nodes if the schedule compresses
  6. Collect rendered frames from /output, then remove any rented nodes

Hub Templates for This Stack

| Template | Role |
|---|---|
| Blender render worker (CUDA/OptiX) | GPU render worker |
| Blender render worker (CPU) | CPU render worker |
| Flamenco Manager | Render job manager |
| Flamenco Worker | Alternative to native Blender worker |
| Deadline Worker | Worker for AWS Thinkbox Deadline |
| OpenCue Worker | Worker for Google OpenCue |
| NFS-Ganesha | NFS server for scene files and frames |
| MinIO | S3 object storage for archives and transfers |
| RustFS | High-performance S3 object storage |
| PostgreSQL | Database for Flamenco Manager |
| Prometheus | Metrics collection |
| Grafana | Render farm dashboards |
| DCGM Exporter | Per-GPU metrics (DaemonSet) |
| Keycloak | SSO for render manager and PodWarden access |

All of these are standard stacks. They live in the same cluster as your render workers, managed the same way, monitored together. A complete render farm — storage, render manager, workers, monitoring — deployed from a single catalog.