๐งฐ ezpz CLIโ๏ธ
Once installed, ezpz provides a CLI with a few useful utilities
to help launch distributed PyTorch applications.
Explicitly, these are ezpz <command>:
- ๐
ezpz launch: Launch commands with automatic job scheduler detection (PBS, Slurm) - ๐ฏ
ezpz test: Run simple distributed smoke test - ๐ฉบ
ezpz doctor: Health check your environment -
๐
ezpz.examples: Collection of distributed training examples-
Distributed Training Examples
See the Examples page for full details.
-
test: Simplest DDP training loop -
fsdp: FSDP for memory-efficient training -
vit: Vision Transformer with FSDP + optionaltorch.compile -
fsdp_tp: 2D parallelism (FSDP + Tensor Parallel) -
diffusion: Diffusion model training with FSDP -
hf: Fine-tune causal LM with explicit training loop (Accelerate + FSDP) -
hf_trainer: Hugging Face Trainer integration
-
-
-
Experimental
- ๐ฆ
ezpz tar-env: Package current Python environment as a tarball - ๐
ezpz yeet-env: Broadcast environment tarball to all worker nodes via MPI
- ๐ฆ
-
ezpz --helpTo see the list of available commands, run:
$ ezpz --help Usage: ezpz [OPTIONS] COMMAND [ARGS]... ezpz distributed utilities. Options: --version Show the version and exit. -h, --help Show this message and exit. Commands: doctor Inspect the environment for ezpz launch readiness. launch Launch a command across the active scheduler. test Run the distributed smoke test.