skypilot-users
Does SkyPilot Serve have a 'scale to 0' behavior for its services?
I'm wondering if SkyPilot Serve has any kind of 'scale to 0' behavior for its services, other than the controller. I would like to automatically spin down all services for very low-traffic models as a cost-saving measure.
Benjamin Botwin
Asked on Jan 11, 2024
Yes, SkyPilot Serve supports setting min_replicas to 0 in the service configuration. With that setting, the service scales down to 0 replicas when it sees no traffic for a sustained period. Once the service is at 0 replicas, a new request triggers an immediate scale-up.
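To make the effect of min_replicas concrete, here is a minimal sketch of a clamped replica calculation. It is only an illustration of the idea, not SkyPilot's actual autoscaler code: the function name desired_replicas and the parameters qps and target_qps_per_replica are hypothetical names chosen for this example.

```python
import math


def desired_replicas(qps: float, target_qps_per_replica: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Illustrative only: how many replicas a load-based autoscaler might want.

    All names here are hypothetical; this is not SkyPilot's implementation.
    """
    if target_qps_per_replica <= 0:
        raise ValueError("target_qps_per_replica must be positive")
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")

    # Replicas needed to serve the observed load, rounded up.
    needed = math.ceil(qps / target_qps_per_replica)

    # Clamp to the configured bounds. With min_replicas = 0 the floor
    # no longer forces a replica to stay up when there is no traffic.
    return max(min_replicas, min(max_replicas, needed))


# No traffic and min_replicas = 0: nothing needs to stay running.
print(desired_replicas(qps=0, target_qps_per_replica=1, min_replicas=0, max_replicas=3))

# Any traffic at all asks for at least one replica again.
print(desired_replicas(qps=0.2, target_qps_per_replica=1, min_replicas=0, max_replicas=3))
```

With min_replicas set to 1 instead, the first call would keep one replica running even with no traffic, which is the cost the question is trying to avoid.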
Jan 12, 2024