I'm launching my training jobs from a Docker container using SkyPilot, but it seems that the Ray cluster needs to be running the exact same Python and Ray versions as the container. How can I specify the versions that SkyPilot installs?
Nathan Matare
Asked on Sep 05, 2023
You can create a new conda environment with the Python and Ray versions you need and start the Ray head node from inside it. Workers can then join the cluster from the same environment. Here's an example snippet:
setup: |
  # PYTHON_VERSION and RAY_COMMIT must be defined for this task.
  # The wheel below is a cp38 build, so PYTHON_VERSION has to be 3.8.
  conda activate myray
  if [ $? -ne 0 ]; then
    conda create -n myray -y python=${PYTHON_VERSION}
    conda activate myray
  fi
  pip install https://s3-us-west-2.amazonaws.com/ray-wheels/master/${RAY_COMMIT}/ray-3.0.0.dev0-cp38-cp38-manylinux2014_x86_64.whl
  # works for single node (if you need to do multiple nodes, we can talk more about it)
  ray start --head
run: |
  # you should be able to connect to your own ray cluster in your program
  conda activate myray
  python myprogram.py
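
To confirm that the container and the cluster really do match, a small check like the one below could be run on both sides (for example at the top of myprogram.py). This is only a sketch: runtime_versions and version_mismatches are names made up for illustration, and comparing Python at major.minor granularity is an assumption, since the thread doesn't say whether the patch version also has to match.

```python
import sys
from importlib import metadata


def runtime_versions():
    """Return (python "major.minor", installed ray version or None)."""
    python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    try:
        # Reads the installed package metadata without importing ray itself.
        ray_version = metadata.version("ray")
    except metadata.PackageNotFoundError:
        ray_version = None
    return python_version, ray_version


def version_mismatches(expected_python, expected_ray):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    python_version, ray_version = runtime_versions()
    problems = []
    if python_version != expected_python:
        problems.append(f"Python is {python_version}, expected {expected_python}")
    if ray_version != expected_ray:
        problems.append(f"Ray is {ray_version}, expected {expected_ray}")
    return problems


if __name__ == "__main__":
    # Print the local versions so they can be compared by eye across machines.
    python_version, ray_version = runtime_versions()
    print(f"python={python_version} ray={ray_version}")
```

Running it once in the container and once inside the myray environment on the cluster prints the two version pairs side by side; version_mismatches can then be called with the container's values to fail fast before the job starts.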