I am looking for a method to define a sequence of spot tasks that run sequentially. Can you provide me with a code example?
Chenggang Wu
Asked on Mar 25, 2024
Yes. SkyPilot's Spot Pipeline feature lets you define a sequence of spot tasks that run one after another. This is useful for jobs that depend on each other, such as training a model and then running inference on it. Each task can declare its own resource requirements, so a large training stage and a small evaluation stage do not have to use the same hardware, which can reduce cost. To define the sequence, list the tasks in a single YAML file, separated by --- lines. The example below defines a pipeline with two tasks, train and eval, which are executed sequentially:
name: pipeline

---

name: train

resources:
  accelerators: V100:8

file_mounts:
  /checkpoint:
    name: train-eval  # NOTE: Fill in your bucket name
    mode: MOUNT

setup: |
  echo setup for training

run: |
  echo run for training
  echo save checkpoints to /checkpoint

---

name: eval

resources:
  accelerators: T4:1

file_mounts:
  /checkpoint:
    name: train-eval  # NOTE: Fill in your bucket name
    mode: MOUNT

setup: |
  echo setup for eval

run: |
  echo load trained model from /checkpoint
  echo eval model on test set
To submit this pipeline, use the command sky spot launch -n pipeline pipeline.yaml. You can monitor the status of the pipeline with sky spot queue or sky spot dashboard.
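Before submitting, it can help to confirm the order in which the tasks appear in the file, since that is the order they run in. The following is a minimal sketch in Python using only the standard library. It assumes the layout shown above: a first document that carries only the pipeline name, followed by one document per task. The helper names (split_documents, top_level_value, describe_pipeline) are hypothetical and not part of SkyPilot, and the splitting is plain text matching rather than a full YAML parser.

```python
import re

# A trimmed copy of the pipeline above, kept inline so the sketch is self-contained.
PIPELINE_YAML = """\
name: pipeline

---

name: train
resources:
  accelerators: V100:8
run: |
  echo run for training

---

name: eval
resources:
  accelerators: T4:1
run: |
  echo eval model on test set
"""


def split_documents(text):
    """Split a multi-document YAML string on lines that contain only '---'."""
    docs = re.split(r"^---[ \t]*$", text, flags=re.MULTILINE)
    return [doc.strip() for doc in docs if doc.strip()]


def top_level_value(doc, key):
    """Return the value of an unindented 'key: value' line, or None if absent."""
    match = re.search(rf"^{re.escape(key)}:[ \t]*(.+?)[ \t]*$", doc, flags=re.MULTILINE)
    return match.group(1) if match else None


def describe_pipeline(text):
    """Return (pipeline_name, [task names in file order])."""
    header, *tasks = split_documents(text)
    return top_level_value(header, "name"), [top_level_value(t, "name") for t in tasks]


pipeline_name, task_order = describe_pipeline(PIPELINE_YAML)
print(pipeline_name, task_order)
```

Because only unindented keys are matched, the bucket name under file_mounts is not mistaken for a task name. For anything beyond a quick check, a real YAML parser would be the safer choice.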