451 points by cloudnativeml 5 months ago | 10 comments
user3 5 months ago
How do you handle resource allocation and management with K8s in a serverless environment?
user4 5 months ago
K8s's dynamic pod management works well for this. I'm using a combination of the Horizontal Pod Autoscaler and pod lifecycle policies to handle resource allocation.
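For anyone curious, the autoscaling half of that setup might look roughly like the sketch below (the `tf-serving` Deployment name and the 70% CPU target are made-up placeholders, not from the parent comment):

```yaml
# Hypothetical HPA: scale a TensorFlow Serving Deployment on CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tf-serving-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tf-serving            # hypothetical Deployment to scale
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70  # assumed threshold; tune per workload
```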
user1 5 months ago
This is really cool! I've been looking into serverless architectures lately.
user2 5 months ago
Glad you like it! Serverless TensorFlow with Kubernetes opens up some really interesting possibilities.
user5 5 months ago
What about data persistence between sessions? I can't imagine using TensorFlow without some kind of backend or storage.
user4 5 months ago
That's handled with a SERVING_ENDPOINT that persists the model metadata between sessions. When a new event/inference request arrives, I use Istio to route it to the correct TensorFlow Serving container.
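As a rough illustration of that routing step, an Istio VirtualService could send requests to a specific TensorFlow Serving backend like this (the host, header, and subset names are all hypothetical; the subset would be defined in a matching DestinationRule):

```yaml
# Hypothetical Istio route: pick a serving backend by model name
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: tf-serving-route                          # hypothetical name
spec:
  hosts:
  - tf-serving.default.svc.cluster.local          # assumed Service hostname
  http:
  - match:
    - headers:
        model-name:                               # assumed routing header
          exact: resnet50
    route:
    - destination:
        host: tf-serving.default.svc.cluster.local
        subset: resnet50                          # subset from a DestinationRule
```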
user6 5 months ago
What's the cold start latency like with a serverless approach like this?
user4 5 months ago
It's definitely still a challenge in the serverless space. That said, I've been able to minimize cold start latency by using pre-pulled container images and a divide-and-conquer strategy for data partitioning. On top of that, Istio's traffic management helps route requests to warm instances quickly.
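One common way to get pre-pulled images is a DaemonSet that keeps the serving image cached on every node, so new pods skip the pull on cold start; a minimal sketch (names are hypothetical, and `tensorflow/serving` stands in for whatever image is actually used):

```yaml
# Hypothetical DaemonSet: keep the serving image warm on every node
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: tf-serving-prepull          # hypothetical name
spec:
  selector:
    matchLabels:
      app: tf-serving-prepull
  template:
    metadata:
      labels:
        app: tf-serving-prepull
    spec:
      containers:
      - name: prepull
        image: tensorflow/serving   # image to keep cached on each node
        command: ["sh", "-c", "sleep infinity"]  # idle; the point is the cached image
```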
user1 5 months ago
How hard was it to set this infrastructure up?
user7 5 months ago
Setting everything up initially was definitely a bit challenging, especially if you're not familiar with Kubernetes, but after getting used to the K8s ecosystem and related tooling, the whole process became much smoother.