Great writeup on the core fundamentals, saved this to share with engineers who are new to k8s and need a quick primer.
Re: This piece -
> Given the Controller pattern, why isn't there support for "Cloud Native" architectures?
> I would like to have a ReplicaSet which scales the replicas based on some simple calculation for queue depth (eg: queue depth / 16 = # replicas)
> Defining interfaces for these types of events (queue depth, open connections, response latency) would be great
> Basically, Horizontal Pod Autoscaler but with sensors which are not just "CPU"
HPAs are actually still what you want here - you can configure HPAs to scale automatically based on custom metrics. If you run Prometheus (or a similar collector), you can define the metric you want (e.g. queue-depth) and the autoscaler will make its scaling decisions based on it.
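To make the "queue depth / 16 = # replicas" idea concrete, here is a rough Python sketch of the arithmetic the autoscaler does. It is not the HPA's actual code: the function names, the min/max defaults, and the sample numbers are made up for illustration. The ratio formula and the 0.1 tolerance follow the algorithm described in the Kubernetes HPA docs.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Documented HPA rule: ceil(currentReplicas * currentValue / targetValue).

    Names and defaults here are illustrative assumptions, not a real API.
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The autoscaler never goes outside the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


def replicas_for_queue(queue_depth, items_per_replica=16,
                       min_replicas=1, max_replicas=50):
    """The quoted example: one replica per 16 queued items, rounded up."""
    desired = math.ceil(queue_depth / items_per_replica)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    for depth in (0, 16, 160, 161, 10_000):
        print(depth, "->", replicas_for_queue(depth))
```

With an average-value target of 16 on a queue-depth metric, the general rule reduces to the second function, which is why an HPA plus a custom or external metric covers the case in the quote.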
CPU tracking is provided by the metrics API, which is backed either by a component that reads kubelet metrics directly (the original, old, but simplest way), or by a metrics adapter that reads the metrics from a third-party collector and implements the API.
The behavior is covered by the v1 API rules so no, it’s extremely unlikely to be moved out.
That said, with GPU workloads gaining steam I wouldn’t be surprised if we added new “supported everywhere” metrics at some point.
Karpenter is great for managing spot fleet nodes as well. Most of our clusters run a small AWS managed node group for Karpenter itself, and the rest of the nodes are spot fleet and managed by Karpenter.
^ this right here. We used KEDA to query DynamoDB to look at a queue depth we wrote to a table. If the number was X, we would scale on it. Was pretty slick.
Resources:
https://kubernetes.io/docs/tasks/run-application/horizontal-...
https://learnk8s.io/autoscaling-apps-kubernetes