skypilot-users

Does SkyPilot Serve have a 'scale to 0' behavior for its services?

I'm wondering whether SkyPilot Serve has any kind of 'scale to 0' behavior for its services (other than the controller). I would like to automatically spin down all replicas for very low-traffic models as a cost-saving measure.

Benjamin Botwin

Asked on Jan 11, 2024

Yes. SkyPilot Serve supports setting min_replicas to 0 in the service configuration. With that setting, the autoscaler scales the service down to 0 replicas once it has seen no traffic for a sustained period. While the service is at 0 replicas, a new request triggers an immediate scale-up.
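
For illustration, a service YAML with scale-to-zero enabled might look roughly like the sketch below. Only min_replicas: 0 comes from the answer above. The other values (port, probe path, replica cap, QPS target, delays, and the placeholder run command) are assumptions chosen for the example, and field names can differ between SkyPilot versions, so check the service spec for the release you are running.

```yaml
# service.yaml: minimal sketch, not a tuned configuration.
service:
  readiness_probe: /          # assumed probe path; use your app's health endpoint
  replica_policy:
    min_replicas: 0           # allows the service to scale down to zero replicas
    max_replicas: 2           # assumed upper bound for this example
    target_qps_per_replica: 1 # assumed autoscaling target
    downscale_delay_seconds: 1200  # assumed: wait before removing idle replicas

resources:
  ports: 8080                 # assumed port the app listens on
  cpus: 2+

# Placeholder workload; replace with your model server.
run: python -m http.server 8080
```

The service would then be started with sky serve up service.yaml, and sky serve status shows the current replica count. The first request after a scale-down still has to wait for a replica to be provisioned, so expect a cold-start delay on very low-traffic models.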

Jan 12, 2024