I'm curious how the SkyServe readiness probe affects traffic routing across replicas. Specifically, is traffic immediately routed to a different replica when one replica returns a 'not ready' signal, for example when the max queue size is reached within a container? And does the probe only apply during startup, or does it continue to be checked during operation?
Aleks Smechov
Asked on Mar 19, 2024