Karpenter 1.0 isn’t just another version update; it’s a milestone in the evolution of autoscaling for Amazon EKS.

Karpenter adoption has exploded as the technology has proven itself to be the most advanced node scheduling technology available for EKS.

And now with Karpenter’s General Availability (GA), the technology has reached a level of maturity that many organizations have been waiting for, even as others have already been running it successfully in production. 

Here at nOps, we’re big fans of Karpenter. Find out what we’re most excited about in the key updates.

The four most important updates in Karpenter 1.0

First, let’s dive into the features with the biggest impact: disruption controls by reason, consolidateAfter for underutilized nodes, terminationGracePeriod, and the stable Drift feature.

1. Disruption Controls by Reason

Karpenter’s disruption controls were already an amazing feature for cost-efficiency and availability. Karpenter automatically discovers disruptable nodes and spins up replacements when needed. Disruption controls give users more control over how and when Karpenter terminates nodes.

Version 1.0 introduces disruption budgets that can be specified by reason—such as Underutilized, Empty, or Drifted. 

This feature is KEY in production environments where maintaining service availability is critical. Imagine you’re running an e-commerce platform during Black Friday. During peak traffic, underutilized nodes might be consolidated to save costs, but this could unintentionally disrupt ongoing transactions, leading to a poor customer experience. 

Without Karpenter’s disruption budgets by reason, there was no easy way to prevent this. Now, you can define policies that restrict node consolidation during critical periods, such as flash sales or holiday promotions, while allowing it during off-peak times, ensuring your service remains both uninterrupted and cost-efficient.
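As an illustrative sketch (the schedule, duration, and percentages here are hypothetical values you would tune for your own traffic patterns), a v1 NodePool could block consolidation of underutilized nodes during business hours while still allowing other disruptions:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    budgets:
      # Starting 09:00 UTC, for 8 hours, block consolidation of
      # underutilized nodes entirely ("0" nodes may be disrupted)
      - reasons: ["Underutilized"]
        nodes: "0"
        schedule: "0 9 * * *"
        duration: 8h
      # At all other times, allow up to 10% of nodes to be
      # disrupted at once, for any reason
      - nodes: "10%"
```

Budgets without a `reasons` field apply to all disruption reasons; budgets with `reasons` apply only to the listed ones (Underutilized, Empty, or Drifted).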

2. consolidateAfter can now be used for underutilized nodes

Traditional autoscaling methods often result in frequent node churn, especially in environments with fluctuating demand. This can lead to inefficiencies and increased operational costs. 

Karpenter 1.0 extends the consolidateAfter setting to underutilized nodes, giving you granular control over how quickly they are consolidated. By delaying consolidation during temporary traffic spikes, this feature minimizes node churn while ensuring that sufficient capacity is maintained. This is particularly valuable for dynamic workloads where demand can change rapidly, ensuring cost efficiency without sacrificing performance.

Consider a media streaming service like Netflix, where viewer demand fluctuates throughout the day. Without intelligent node management, the platform could experience frequent node churn as Karpenter quickly consolidates underutilized nodes during low-demand periods, only to scale them up again during evening peaks.

With consolidateAfter, you can set a delay before underutilized nodes are consolidated. This way, the service can handle sudden spikes in viewership without the inefficiencies of constant scaling, optimizing both performance and cost.
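In NodePool terms, this is a sketch of what that delay might look like (the 30-minute value is illustrative):

```yaml
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    # Wait 30 minutes after a node becomes underutilized before
    # consolidating it, so short-lived dips in traffic don't
    # trigger unnecessary node churn
    consolidateAfter: 30m
```

A longer value favors stability for spiky workloads; a shorter value favors faster cost savings for steady ones.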

3. New Termination Grace Period feature

Security and compliance are critical in maintaining a stable and secure Kubernetes environment. However, managing the lifecycle of nodes to ensure they remain compliant can be challenging. 

For example, if you need to comply with strict security and compliance regulations, you might want to ensure that no node runs for longer than a predefined period to avoid potential vulnerabilities. 

Before Karpenter 1.0, managing node lifecycles to meet these regulations required manual intervention or custom automation, because pods with blocking PodDisruptionBudgets or do-not-disrupt annotations could delay a node’s termination indefinitely. The new terminationGracePeriod setting puts an upper bound on how long a node can take to drain: once the grace period elapses, Karpenter forcibly terminates the node. Combined with a maximum node lifetime (expireAfter), this guarantees that nodes exceeding their predefined lifespan are actually terminated and replaced.

This prevents the use of outdated or potentially non-compliant nodes, ensuring that the infrastructure remains secure and compliant without manual oversight and reducing the risk of outdated software or configurations.
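As a sketch of how the two settings work together in a v1 NodePool (the durations are illustrative, not recommendations):

```yaml
spec:
  template:
    spec:
      # Expire and replace every node after 30 days, e.g. to meet
      # a compliance requirement on maximum node age
      expireAfter: 720h
      # Once termination begins, force-drain the node within 1 hour,
      # even if PodDisruptionBudgets or do-not-disrupt pods would
      # otherwise block eviction indefinitely
      terminationGracePeriod: 1h
```

Without terminationGracePeriod, a single blocking pod could keep an expired node alive past its compliance deadline.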

4. Drift Feature promoted to stable

With the drift feature now stable and the feature gate removed, Karpenter will automatically replace nodes that have drifted from the desired state. This helps ensure that environments remain consistent and secure, especially when dealing with AMI updates.

For example, imagine that your organization regularly updates its AMIs to include the latest security patches and configurations. However, over time, some nodes in your Kubernetes cluster might drift from the desired state due to manual changes, configuration drift, or outdated AMIs.

Before the drift feature was stabilized, ensuring that all nodes were aligned with the updated AMIs required manual intervention or complex automation scripts. Now, with the drift feature enabled and stable in Karpenter 1.0, any node that deviates from the desired AMI configuration is automatically replaced. This change simplifies the management of node states, ensuring that clusters remain up-to-date without requiring manual intervention.

More important updates in Karpenter 1.0

Here are three more key updates you need to know:

5. Required amiSelectorTerms for Better AMI Management

By making amiSelectorTerms a required field and introducing the alias term, Karpenter now provides a more robust way for users to manage AMI updates. This change is especially important for production environments, where users need more control over the specific AMI versions their nodes use. It helps avoid unintentional upgrades and ensures that nodes are running the expected version for more stability and predictability.

6. Default Restriction of Instance Metadata Service Access

Karpenter now restricts access to the Instance Metadata Service (IMDS) by default for new EC2NodeClass resources, aligning with Amazon EKS best practices. This helps prevent unauthorized access to IAM instance profiles from within pods. By enforcing this restriction, Karpenter reduces the risk of privilege escalation attacks.
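Concretely, the new EC2NodeClass defaults look roughly like this (shown explicitly here for illustration; you don’t need to set them yourself):

```yaml
spec:
  metadataOptions:
    httpEndpoint: enabled
    # Require IMDSv2 session tokens for all metadata requests
    httpTokens: required
    # A hop limit of 1 means metadata responses can't traverse the
    # extra network hop to a pod, cutting off IMDS access from pods
    httpPutResponseHopLimit: 1
```

Pods that legitimately need instance credentials should use IAM Roles for Service Accounts (IRSA) or EKS Pod Identity instead of the node’s instance profile.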

7. Moved Kubelet Configuration to EC2NodeClass

Kubelet configuration has been relocated to the EC2NodeClass API. Why is this change good? It ensures that different NodePools can have customized kubelet settings without causing configuration drift or conflicts. It also aids in maintaining consistency across different NodePools, which is essential for large, complex environments.
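A sketch of the new placement (field values are illustrative): kubelet settings now live on the EC2NodeClass, so every NodePool referencing this node class gets the same kubelet configuration:

```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: high-density
spec:
  kubelet:
    # Allow more pods per node than the default
    maxPods: 110
    # Reserve headroom for system daemons on each node
    systemReserved:
      cpu: 100m
      memory: 100Mi
```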

More on Karpenter 1.0

For a comprehensive overview of these features and additional changes introduced in version 1.0.0, you can refer to the Karpenter 1.0 launch blog. To delve deeper into Karpenter, visit the official website at karpenter.sh.

The future of autoscaling with Karpenter

Looking ahead, Karpenter 1.0 sets the stage for a future where autoscaling isn’t reactive; it’s proactive and strategic. The introduction of these new features means that Karpenter can anticipate the needs of your workloads and make intelligent decisions in real-time, adapting to changes in demand and infrastructure with minimal human intervention.

As organizations continue to adopt Karpenter, we’re moving towards a future where the complexities of cluster management are increasingly automated, allowing DevOps teams to focus on higher-level strategic initiatives.

If you’re looking to optimize your Kubernetes infrastructure with greater precision, efficiency and confidence, Karpenter + nOps is even better. 

nOps helps engineering teams to more easily and effectively leverage the power of Karpenter and Spot for cost savings and reliability, freeing them to focus on building and innovating.

nOps Compute Copilot enhances Karpenter with organization-wide awareness of your Reserved Instances (RIs) and Savings Plans (SPs), intelligent Spot instance selection, Spot instance diversification, instance lifespan optimization, real-time workload reconsideration and more.

[Image: nOps Compute Copilot’s capabilities]
nOps was recently ranked #1 in G2’s cloud cost management category. Book a demo to find out how to save in just 10 minutes.