Ananth is trying to install a custom version of Ray in a Skypilot environment to run xgboost-ray. He is facing issues with the version compatibility and dashboard display. He is also concerned about multiple Ray clusters being created due to setup commands. An example configuration and setup script are provided in the discussion thread.
Ananth G
Asked on Feb 16, 2024
Yes, it is possible to install a custom version of Ray in a Skypilot environment.
Here is a complete, working YAML example:
num_nodes: 3
resources:
  ports:
    # Ray ports
    - 8265-8266
setup: |
  echo "Running setup."
  # Install a Ray environment separate from the one SkyPilot installs.
  conda deactivate
  conda activate custom_ray
  if [ $? -ne 0 ]; then
    # The environment does not exist yet, so create it and install Ray.
    conda create -n custom_ray python=3.10 -y
    conda activate custom_ray
    conda install pip -y
    pip install 'ray[default]==2.8.1'
  fi
run: |
  num_nodes=`echo "$SKYPILOT_NODE_IPS" | wc -l`
  head_ip=`echo "$SKYPILOT_NODE_IPS" | head -n1`
  conda deactivate
  # Activate the same environment that setup created.
  conda activate custom_ray
  if [ "$SKYPILOT_NODE_RANK" == "0" ]; then
    # Start the head only if nothing is already running on port 6379.
    ps aux | grep ray | grep 6379 &> /dev/null || ray start --head --disable-usage-stats --port 6379
    sleep 10
  else
    # Give the head time to come up, then join it.
    sleep 15
    ps aux | grep ray | grep 6379 &> /dev/null || ray start --address $head_ip:6379 --disable-usage-stats
  fi
Here are some key points to consider:
Use a setup script to install the custom Ray version:
  conda deactivate
  conda activate custom_ray
  if [ $? -ne 0 ]; then
    conda create -n custom_ray python=3.10 -y
    conda activate custom_ray
    conda install pip -y
    pip install 'ray[default]==2.8.1'
  fi
Activate the custom environment before installing, so that the pinned Ray version lands in custom_ray and does not conflict with the Ray version SkyPilot installs.
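As a rough sketch of a more explicit check, the setup step could test whether the environment exists instead of relying on the exit code of conda activate. The helper names env_in_listing and ensure_ray_env are hypothetical, and the sketch assumes that conda env list prints the environment name as the first field of each line:

```shell
# Hypothetical helper: succeeds if the named env appears as the first
# field of a line in `conda env list` output read from stdin.
env_in_listing() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Hypothetical helper: create the env only if it is missing, then
# install the pinned Ray version (pip skips it if already satisfied).
ensure_ray_env() {
  env_name="$1"
  ray_version="$2"
  if ! conda env list | env_in_listing "$env_name"; then
    conda create -n "$env_name" python=3.10 -y
  fi
  conda activate "$env_name"
  pip install "ray[default]==$ray_version"
}

# Example call inside setup (not executed here):
#   ensure_ray_env custom_ray 2.8.1
#   python -c 'import ray; print(ray.__version__)'
```

Printing ray.__version__ after activation is a quick way to confirm that the custom environment, not the SkyPilot one, is the one in use.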
Use a conditional on the node rank so that only the rank-0 node starts the Ray head:
  if [ "$SKYPILOT_NODE_RANK" == "0" ]; then
    ps aux | grep ray | grep 6379 &>/dev/null || ray start --head --disable-usage-stats --port 6379
    sleep 10
  fi
Avoid creating multiple Ray clusters by gating the start commands on node rank: only rank 0 starts a head, and every other node joins it at the head IP.
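To make the rank logic easier to check before running it on a cluster, here is a minimal sketch that only prints the command each node would run. The function name ray_start_command is hypothetical. The flags and the SKYPILOT_NODE_RANK and SKYPILOT_NODE_IPS variables are the ones used in the YAML above:

```shell
# Hypothetical helper: print the `ray start` command for a node, given
# its rank, the newline-separated node IP list, and an optional port.
ray_start_command() {
  rank="$1"
  node_ips="$2"
  port="${3:-6379}"
  # The first IP in the list is treated as the head node.
  head_ip=$(printf '%s\n' "$node_ips" | head -n1)
  if [ "$rank" = "0" ]; then
    echo "ray start --head --disable-usage-stats --port $port"
  else
    echo "ray start --address $head_ip:$port --disable-usage-stats"
  fi
}

# Example call inside run (not executed here):
#   ray_start_command "$SKYPILOT_NODE_RANK" "$SKYPILOT_NODE_IPS"
```

Since exactly one rank produces a --head command, a three-node task should end up with one cluster rather than three.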
Install Ray with the default extra, which includes the dashboard:
pip install 'ray[default]'
Watch the log messages for IP address conflicts, and confirm that each worker connects to the head node's address.
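The fixed sleep 15 on the workers can race with a slow head start. One possible alternative is a small retry wrapper, sketched below. The function name retry is hypothetical, and the sketch assumes that ray start exits with a non-zero status when it cannot reach the head, which is worth verifying for your Ray version:

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts. Returns 0 on first success,
# 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Example call on a worker node (not executed here):
#   retry 5 5 ray start --address "$head_ip:6379" --disable-usage-stats
```

If the retries run out, the non-zero return makes the failure visible in the task logs instead of leaving a worker silently disconnected.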
Adjust the setup script and installation commands as needed to achieve the desired custom Ray version installation and functionality in the Skypilot environment.