10 Key Insights into Kubernetes v1.36's Server-Side Sharded List and Watch

Published: 2026-05-10 16:20:35 | Category: Cloud Computing

As Kubernetes clusters scale to tens of thousands of nodes, controllers that watch high-cardinality resources like Pods hit a performance bottleneck. Each replica of a horizontally scaled controller receives the full event stream from the API server, wasting CPU, memory, and network bandwidth on objects it doesn't own. Kubernetes v1.36 introduces a game-changing alpha feature: server-side sharded list and watch (KEP-5866). This moves filtering upstream, so each replica only receives the slice of data it needs. Below are 10 essential things you need to know about this feature.

1. The Scaling Wall Faced by Controllers

In large Kubernetes clusters, controllers that monitor high-cardinality resources (like Pods) struggle as the number of nodes grows. Each controller replica receives the entire event stream from the API server, even if it only manages a subset of the objects. This means every replica deserializes every event, consuming CPU and memory, only to discard the ones it doesn't own. Scaling out the controller doesn't help—it multiplies the wasted resources. The result is a scaling wall that limits cluster size and operational efficiency.

2. How Client-Side Sharding Falls Short

Some controllers, like kube-state-metrics, already support horizontal sharding by dividing the keyspace among replicas. Each replica discards objects it doesn't own. While this works logically, it doesn't reduce the data flow from the API server. The network bandwidth still scales with the number of replicas, not with the shard size. CPU cycles spent on deserialization are wasted for the discarded fraction. Client-side sharding is a band-aid, not a cure—the full event stream still hits every replica.
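
To make the waste concrete, here is a minimal Go sketch of the client-side pattern (illustrative, not kube-state-metrics' actual code; the owned callback stands in for a hash-mod-replicas ownership check). The filter runs inside the client, so every Pod has already crossed the network and been deserialized before it is dropped:

package shardclient

import (
    corev1 "k8s.io/api/core/v1"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/cache"
)

// buildShardedPodInformer wires up pre-v1.36 client-side sharding: the
// informer still receives and decodes every Pod in the cluster, and the
// filter simply discards the ones this replica does not own.
func buildShardedPodInformer(client kubernetes.Interface, owned func(uid string) bool, h cache.ResourceEventHandler) cache.SharedIndexInformer {
    informer := informers.NewSharedInformerFactory(client, 0).Core().V1().Pods().Informer()
    informer.AddEventHandler(cache.FilteringResourceEventHandler{
        FilterFunc: func(obj interface{}) bool {
            pod, ok := obj.(*corev1.Pod)
            // CPU, memory, and bandwidth are already spent by this point;
            // the discarded fraction is pure waste.
            return ok && owned(string(pod.UID))
        },
        Handler: h,
    })
    return informer
}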

3. The Core Idea: Server-Side Filtering

Server-side sharded list and watch solves the problem by moving filtering upstream into the API server itself. Instead of sending the entire event stream to every replica, the API server now only sends matching events based on the shard assignment. Each replica tells the API server which hash range it owns, and the API server returns only the objects that fall within that range. This dramatically reduces network, CPU, and memory overhead, making horizontal scaling efficient for the first time.

4. The New shardSelector Field

The feature adds a shardSelector field to ListOptions. Clients specify a hash range using the shardRange() function, for example: shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000'). This tells the API server to return objects whose deterministic hash falls within the specified range. The shardSelector applies to both list responses and watch event streams, ensuring consistent filtering across all API calls from a given replica.
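
As a minimal sketch, here is what a sharded list call might look like in Go, assuming the alpha field is exposed in client-go as metav1.ListOptions.ShardSelector (the same field name the snippet in section 7 uses; verify it against the v1.36 client-go release):

package shardlist

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// listLowerHalf lists only the Pods whose UID hash falls in the lower half
// of the 64-bit hash space. ShardSelector is the alpha field this article
// describes, so this only compiles against a client-go that includes it.
func listLowerHalf(ctx context.Context, client kubernetes.Interface) error {
    _, err := client.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{
        ShardSelector: "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')",
    })
    return err
}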

5. Deterministic Hashing with FNV-1a

The API server computes a deterministic 64-bit FNV-1a hash of the specified field (e.g., UID or namespace). This hash is consistent across all API server instances, meaning multiple replicas can safely coordinate shards. The hash space is divided into ranges, and each replica claims a range. Because the hash function is deterministic, the same object will always map to the same hash value, ensuring stable shard assignments even as the cluster scales or API servers restart.
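
Because the hash is standard 64-bit FNV-1a, you can reproduce it locally with Go's hash/fnv package and confirm that a given UID always lands in the same shard; the one assumption in this sketch is that the server hashes the raw field string:

package main

import (
    "fmt"
    "hash/fnv"
)

// fnv64a computes the 64-bit FNV-1a hash of a field value, mirroring the
// deterministic hash the article says the API server applies.
func fnv64a(field string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(field))
    return h.Sum64()
}

func main() {
    // The same UID maps to the same hash on every run and every API server,
    // so shard membership is stable across restarts.
    uid := "9bf56d5e-7a11-4c2a-9f3e-2d1f0c6b8a4d"
    fmt.Printf("%s -> 0x%016x\n", uid, fnv64a(uid))
}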

6. Supported Field Paths for Sharding

Currently, the feature supports two field paths for hashing: object.metadata.uid and object.metadata.namespace. Using UID gives a uniform distribution across all objects, while namespace-based sharding can be useful for multi-tenant or per-namespace controllers. The choice of field depends on the workload pattern. For example, a controller that manages Pods across all namespaces might use UID for even load distribution, while a namespace-specific operator might use namespace to naturally partition work.
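
For reference, here are the two selector shapes side by side, reusing the lower-half range from the earlier example:

// Per this article, only these two field paths are supported in the alpha.
// Hashing the UID spreads objects uniformly across shards; hashing the
// namespace keeps all of a namespace's objects in the same shard.
const (
    byUID       = "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')"
    byNamespace = "shardRange(object.metadata.namespace, '0x0000000000000000', '0x8000000000000000')"
)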

7. Implementing Sharded Watches in Controllers

Controllers using the informer pattern can integrate server-side sharding by injecting the shardSelector into ListOptions via WithTweakListOptions. In Go, you create a shardSelector string and pass it to a custom tweak function. This allows each replica to request a specific hash range. The informer will then only list and watch objects in that range, reducing the data each replica processes. Here's a snippet:

factory := informers.NewSharedInformerFactoryWithOptions(client, resyncPeriod,
    informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
        // Inject this replica's hash range into every list and watch call.
        opts.ShardSelector = shardSelector
    }),
)
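
From there the factory is used exactly as before: start it, wait for cache sync, and register event handlers. Note that NewSharedInformerFactoryWithOptions and WithTweakListOptions are existing client-go APIs; the ShardSelector field on ListOptions is the alpha addition, so this snippet only compiles against a client-go release that includes KEP-5866.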

8. Example: Splitting a Two-Replica Deployment

For a two-replica deployment, you split the 64-bit hash space in half. Replica 0 takes the range '0x0000000000000000' to '0x8000000000000000', and Replica 1 takes the remainder. Each replica sets its shardSelector accordingly. This ensures that every object's hash falls into exactly one shard, and no replica receives duplicate events. The result: network traffic is halved per replica, and CPU/memory usage drops proportionally. Scaling to more replicas divides the load further.
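
Here is a sketch that generalizes the split to N replicas and reproduces the two-replica ranges above; writing the last shard's upper bound inclusively as '0xffffffffffffffff' is an assumption to verify against KEP-5866:

package main

import (
    "fmt"
    "math"
)

// shardRangeExpr splits the 64-bit hash space into totalShards equal
// ranges and renders the shardSelector expression for one replica. How
// the real API treats range boundaries is not confirmed here.
func shardRangeExpr(field string, shardID, totalShards uint64) string {
    width := math.MaxUint64/totalShards + 1
    start := shardID * width
    if shardID == totalShards-1 {
        // Last shard takes everything up to the top of the hash space.
        return fmt.Sprintf("shardRange(%s, '0x%016x', '0xffffffffffffffff')", field, start)
    }
    return fmt.Sprintf("shardRange(%s, '0x%016x', '0x%016x')", field, start, start+width)
}

func main() {
    // The article's two-replica split: replica 0 takes the lower half,
    // replica 1 the remainder.
    fmt.Println(shardRangeExpr("object.metadata.uid", 0, 2))
    fmt.Println(shardRangeExpr("object.metadata.uid", 1, 2))
}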

9. Benefits Over Existing Solutions

Server-side sharding offers clear advantages over client-side approaches. Network bandwidth scales with shard size, not replicas. CPU usage drops because deserialization is only done for relevant objects. Memory is reduced as the cache holds only the shard's objects. Additionally, the feature works seamlessly with existing controllers that use standard informers—no major architectural changes required. It also plays well with multiple API server replicas, making it a robust solution for large clusters.

10. Alpha Status and What's Next

As of Kubernetes v1.36, server-side sharded list and watch is an alpha feature. It is not enabled by default. The community is actively testing it, and feedback is welcome. Future enhancements may include support for custom field paths, dynamic shard rebalancing, and integration with more controller frameworks. If you run large-scale clusters, try this feature in a non-production environment and share your experience. It could be the key to unlocking truly scalable controllers.
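
Since the feature is off by default, trying it means enabling its feature gate on the API server via the standard --feature-gates flag. The gate name below is a placeholder for illustration only; the KEP and the v1.36 release notes define the real one:

# Gate name is hypothetical; consult the v1.36 release notes / KEP-5866.
kube-apiserver --feature-gates=ShardedListAndWatch=true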

Conclusion

Server-side sharded list and watch marks a significant shift in how Kubernetes manages event streams at scale. By moving filtering to the API server, it eliminates the wasted resources inherent in client-side sharding. This feature paves the way for controllers that can efficiently monitor tens of thousands of nodes without overwhelming the cluster. As the feature matures, it will become an essential tool for anyone operating large Kubernetes deployments. Stay tuned for updates and start experimenting with v1.36 today.