skypilot-users

How to troubleshoot a stuck spot job in STARTING status?

I have a spot job that appears to be stuck in STARTING status. I have checked for resource availability but couldn't find any. How can I troubleshoot this issue?


Jason Krone

Asked on Feb 29, 2024

  1. Check the controller logs for the job with sky spot logs --controller <job-id>. They give more detail on what is happening while the job is starting.
  2. Look through those logs for messages about resource availability.
  3. If the logs show that no resources are available, investigate the resource constraint further to resolve it.
  4. Make sure the resources the spot job needs can actually be allocated, so that the job can progress.
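
Step 2 can be tedious by hand when the controller log is long. Below is a minimal Python sketch, assuming the output of sky spot logs --controller <job-id> has already been saved to a text file. The function name find_resource_hints and the keyword list are hypothetical and are not taken from SkyPilot's actual log format, so adjust the keywords to whatever wording appears in your own logs.

```python
import sys
from typing import Iterable, List, Tuple

# Assumed keywords that often accompany capacity problems.
# These are guesses for illustration, not SkyPilot's real log messages.
DEFAULT_KEYWORDS = ("capacity", "quota", "unavailable", "failover", "retry")


def find_resource_hints(
    log_text: str, keywords: Iterable[str] = DEFAULT_KEYWORDS
) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs whose text contains any keyword.

    Matching is case-insensitive; line numbers start at 1.
    """
    lowered_keywords = [k.lower() for k in keywords]
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(k in lowered for k in lowered_keywords):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_hints.py controller.log
    # where controller.log holds the saved controller log output.
    with open(sys.argv[1], encoding="utf-8") as f:
        for lineno, line in find_resource_hints(f.read()):
            print(f"{lineno}: {line}")
```

If nothing matches, that does not prove resources are available. It may only mean the keywords do not match the log wording, so reading the last part of the log directly is still worthwhile.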
Feb 29, 2024