To reduce the monthly AWS bill, a cloud admin can stop or terminate idle instances in the AWS account. By default, an EC2 instance is considered inactive when its average CPU utilization has been less than 2% and its average network I/O has been less than 5 MB over the last week.
Learn More about Optimize Your Cost by Scheduling Idle Resources
AWS EBS stands for Elastic Block Store. EBS lets you store large amounts of data of any kind, e.g. file system data, transactional data, relational databases, etc. An EBS volume is like a hard drive attached to an EC2 instance. EBS provides high availability and durability, and is suited to I/O-intensive applications.
Learn More about Monitoring Unattached EBS Volumes to follow best practices of FinOps
An Elastic IP (EIP) is an IP address that you can reserve for your AWS account. Static IP addresses are by nature associated with a particular machine. However, to keep up with the dynamic needs of the public cloud, users generally use Elastic IP addresses. These IP addresses are called ‘Elastic’ because they can be reassigned or remapped to another instance as an organization keeps launching and terminating resources.
Check for any unattached Elastic IP (EIP) addresses in your AWS account and release (remove) them in order to lower the cost of your monthly AWS bill.
This rule can help you with the following compliance standards:
This rule can help you work with the AWS Well-Architected Framework
Amazon Web Services applies a small hourly charge if an Elastic IP (EIP) address within your account is not associated with a running EC2 instance or an Elastic Network Interface (ENI). nOps recommends releasing any unassociated EIPs that are no longer needed to reduce your monthly AWS costs.
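As a rough illustration of the check described above, the sketch below filters a list of address records for those with no association. It assumes records shaped like the `Addresses` list returned by the EC2 DescribeAddresses API, where an associated address carries an `AssociationId` field; the helper name `unassociated_eips` and the sample data are hypothetical.

```python
def unassociated_eips(addresses):
    """Return the public IPs of Elastic IPs that have no association."""
    return [a["PublicIp"] for a in addresses if not a.get("AssociationId")]


# Hypothetical sample shaped like the "Addresses" list from DescribeAddresses.
sample_addresses = [
    {"PublicIp": "203.0.113.10", "AllocationId": "eipalloc-1", "AssociationId": "eipassoc-1"},
    {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-2"},
]

print(unassociated_eips(sample_addresses))
```

Review each reported address before releasing it, since a released EIP cannot always be recovered.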
Identify any Amazon EC2 instances that appear to be idle and stop or terminate them to help lower the cost of your monthly AWS bill. By default, an EC2 instance is considered ‘idle’ when it meets the following criteria (to declare the instance ‘idle’ both conditions must be true):
It is important that your EC2 instances carry correct tags, which provide visibility into their usage profile and help you decide whether it is safe to stop or terminate these resources. For example, knowing the role and the owner of an EC2 instance before you decide to stop or terminate it helps you avoid terminating workloads that are still in use.
This rule can help you with the following compliance standards:
This rule can also help you work with the AWS Well-Architected Framework.
Idle instances represent a good candidate to reduce your monthly AWS costs and avoid accumulating unnecessary EC2 usage charges.
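The idle test can be sketched as a small function over metric samples you have already collected. It uses the thresholds quoted on this page (average CPU below 2% and average network I/O below 5 MB over the last week); reading "5 MB" as 5 × 1024 × 1024 bytes, and the function and constant names, are assumptions made for illustration.

```python
from statistics import mean

CPU_IDLE_PERCENT = 2.0                  # threshold quoted on this page
NETWORK_IDLE_BYTES = 5 * 1024 * 1024    # assumption: "5 MB" read as mebibytes


def is_idle(cpu_percent_samples, network_bytes_samples):
    """Both conditions must hold for the instance to count as idle."""
    if not cpu_percent_samples or not network_bytes_samples:
        return False  # no data: do not flag the instance
    return (
        mean(cpu_percent_samples) < CPU_IDLE_PERCENT
        and mean(network_bytes_samples) < NETWORK_IDLE_BYTES
    )


print(is_idle([1.0, 1.5, 0.8], [1200, 800, 950]))
```

Treating missing data as "not idle" is a deliberately cautious choice, so an instance is never flagged purely because its metrics were unavailable.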
EBS (Elastic Block Store) volumes are attached to EC2 instances as storage devices. Unused (unattached) EBS volumes keep accruing costs even when their associated EC2 instances are no longer running.
This rule checks whether there are unused EBS volumes in your AWS account. nOps recommends you consider deleting unused EBS volumes to reduce your monthly AWS bill.
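A minimal sketch of this check, assuming volume records shaped like the `Volumes` list returned by the EC2 DescribeVolumes API, where an unattached volume reports the state `available`. The helper name and sample data are hypothetical.

```python
def unattached_volumes(volumes):
    """Return the IDs of volumes that are not attached to any instance."""
    return [v["VolumeId"] for v in volumes if v.get("State") == "available"]


# Hypothetical sample shaped like the "Volumes" list from DescribeVolumes.
sample_volumes = [
    {"VolumeId": "vol-0001", "State": "in-use", "Size": 100},
    {"VolumeId": "vol-0002", "State": "available", "Size": 500},
]

print(unattached_volumes(sample_volumes))
```

Consider taking a snapshot before deleting a volume whose contents might still be needed.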
This rule can help you with the following:
Compliance frameworks report
AWS Well-Architected Lens
on EC2 instances that appear to be under-utilised and downsize (resize) them to help lower the cost of your monthly AWS bill. By default, an EC2 instance is considered under-utilised when it matches the following criteria (to declare the instance under-utilised both conditions must be met):
By default, AWS CloudWatch doesn’t capture an EC2 instance's memory utilisation because the necessary metric cannot be implemented at the hypervisor level. In order to report memory utilisation using CloudWatch you need to install an agent (script) on the instance and create a custom metric (let’s name it EC2MemoryUtilization) on the AWS CloudWatch dashboard. The instructions required for installing the monitoring agent depend on the operating system used by the instance. Please refer to this URL for more details.
This rule can help you with the following compliance standards:
This rule can help you work with the AWS Well-Architected Framework
Downsizing under-utilised EC2 instances to meet the capacity needs at the lowest cost represents an efficient strategy to reduce your monthly AWS costs. For example, by resizing a c5.xlarge instance provisioned in the US-East (N. Virginia) region to a c5.large instance due to CPU and memory underuse, you can reduce your AWS costs by roughly half.
0 for the last 7 days.
The AWS CloudWatch metrics used to detect idle RDS instances are:
This rule can help you work with the AWS Well-Architected Framework
Check for idle instances regularly and terminate them to avoid unnecessary charges on your monthly AWS bill.
However, it is important to consider the following things:
We consider an RDS database instance as “underutilized” when it meets the following criteria:
Average CPU utilization has been less than 30% for the last 7 days.
Total number of ReadIOPS and WriteIOPS recorded per day for the previous 7 days has been less than 100 on average.
The following AWS CloudWatch metrics can be used to detect underutilized RDS instances:
nOps uses this rule in the AWS Well-Architected Framework Lens. It can also help you when checking workloads’ compliance in preparing the SOC 2 Readiness Report.
Downsizing underused RDS database instances can have a tremendous positive impact on your monthly AWS cost. For example, downgrading a db.m5.2xlarge RDS PostgreSQL database instance to a db.m5.large instance due to CPU and IOPS underuse allows you to save roughly 25% (as of September 2021).
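The underutilization criteria for RDS described above can be sketched as follows. The function takes per-day values you have already pulled from CloudWatch; the parameter names are hypothetical, and requiring both conditions to hold together is an assumption, since the page lists the criteria without saying how they combine.

```python
from statistics import mean


def is_rds_underutilized(daily_cpu_percent, daily_read_iops, daily_write_iops,
                         cpu_threshold=30.0, iops_threshold=100.0):
    """Apply the 7-day CPU and IOPS thresholds quoted on this page."""
    if not (daily_cpu_percent and daily_read_iops and daily_write_iops):
        return False  # no data: do not flag the instance
    # Total IOPS per day is the sum of read and write operations.
    total_iops = [r + w for r, w in zip(daily_read_iops, daily_write_iops)]
    return mean(daily_cpu_percent) < cpu_threshold and mean(total_iops) < iops_threshold


week_cpu = [12.0, 9.5, 14.2, 11.0, 8.7, 10.3, 13.1]
week_reads = [20, 25, 18, 22, 30, 19, 21]
week_writes = [15, 12, 20, 18, 16, 14, 17]
print(is_rds_underutilized(week_cpu, week_reads, week_writes))
```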
This rule can help you with the following
By default, nOps considers a DynamoDB table as “underutilized” when the number of read/write capacity units (RCUs and WCUs) consumed is 30% lower than the number of provisioned read/write capacity units set for a table over a specified time period.
The following AWS CloudWatch metrics can be helpful to detect such underused DynamoDB Tables:
ProvisionedReadCapacityUnits – the number of provisioned read capacity units for a DynamoDB table (Units: Count).
ConsumedReadCapacityUnits – the number of read capacity units consumed over the specified time period (Units: Count).
ProvisionedWriteCapacityUnits – the number of provisioned write capacity units for a DynamoDB table (Units: Count).
ConsumedWriteCapacityUnits – the number of write capacity units consumed over the specified time period (Units: Count).
When you create a DynamoDB table in Provisioned mode, you are charged for the provisioned read/write capacity regardless of whether you consume it or not. However, when you create a DynamoDB table in On-Demand mode, you pay only for the capacity you use.
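The comparison between consumed and provisioned capacity can be sketched like this. It reads "30% lower" as "consumed units are below 70% of provisioned units", which is one possible interpretation and should be treated as an assumption; the function name and `margin` parameter are hypothetical.

```python
def is_table_underutilized(provisioned_units, consumed_units, margin=0.30):
    """Flag a table whose consumption falls more than `margin` below provisioned capacity."""
    if provisioned_units <= 0:
        return False  # On-Demand tables or missing data: nothing to compare against
    return consumed_units < provisioned_units * (1 - margin)


# Apply the same test separately to read and write capacity.
print(is_table_underutilized(provisioned_units=100, consumed_units=60))
print(is_table_underutilized(provisioned_units=100, consumed_units=80))
```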
In Provisioned mode, you can also make use of the AutoScaling feature, where you specify a minimum capacity and a maximum capacity. DynamoDB can then scale your provisioned capacity units based on the scaling configuration. This feature is discussed in a separate rule page.
Auto-scaling is enabled by default for DynamoDB tables. With the help of AWS CloudWatch, it dynamically adjusts the throughput (read and write) capacity of your provisioned DB tables to meet traffic demands.
For performance and cost optimization, nOps recommends you consider enabling Autoscaling on the DB tables.
This rule can help you work with the following:
When dealing with production data that is crucial to your business, it is highly recommended to implement encryption in order to protect it from attackers or unauthorised personnel. With Elastic Block Store encryption enabled, the data stored on the volume, the disk I/O and the snapshots created from the volume are all encrypted. The EBS encryption keys use the AES-256 algorithm and are entirely managed and protected by the AWS key management infrastructure, through AWS Key Management Service (AWS KMS).
This rule can help you with the following compliance standards:
This rule can help you work with the AWS Well-Architected Framework
Ensure that all users with AWS Console access have Multi-Factor Authentication (MFA) enabled in order to secure your AWS environment and adhere to IAM security best practices.
This rule can help you with the following compliance standards:
This rule can also help you work with the AWS Well-Architected Framework
Having MFA-protected IAM users is the best way to protect your AWS resources and services against attackers. An MFA device signature adds an extra layer of protection on top of your existing IAM user credentials (username and password), making your AWS account virtually impossible to penetrate without the MFA generated passcode.
This rule can help you with the following compliance standards:
This rule can help you work with the AWS Well-Architected Framework
Having an MFA-protected root account is the best way to protect your AWS resources and services against attackers. An MFA device signature adds an extra layer of protection on top of your existing root credentials making your AWS root account virtually impossible to penetrate without the MFA generated passcode.
AWS S3-managed keys (SSE-S3) or AWS KMS-managed keys (SSE-KMS) for Server-Side Encryption.
This rule can help you with the following:
Compliance frameworks
AWS Well-Architected Lens
AWS S3 default encryption setting directs AWS to automatically encrypt your S3 data as it is stored in S3 buckets to prevent unauthorized attackers from accessing it.
maximum two active access keys but it is recommended only during the key rotation process. nOps strongly recommends deactivating the old key once the new one has been created so that only one access key remains active for a given IAM user.
This rule can help you with the following compliance standards:
This rule can help you work with the AWS Well-Architected Framework
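The access key recommendation above can be checked with a short sketch. It assumes key records shaped like the `AccessKeyMetadata` entries returned by the IAM ListAccessKeys API, whose `Status` is either `Active` or `Inactive`; the helper name and sample data are hypothetical.

```python
def users_with_multiple_active_keys(keys_by_user):
    """Return the users that have more than one active access key."""
    flagged = []
    for user, keys in keys_by_user.items():
        active = [k for k in keys if k.get("Status") == "Active"]
        if len(active) > 1:
            flagged.append(user)
    return sorted(flagged)


# Hypothetical sample: user name -> list of AccessKeyMetadata-like records.
sample_keys = {
    "alice": [{"AccessKeyId": "AKIAEXAMPLE1", "Status": "Active"},
              {"AccessKeyId": "AKIAEXAMPLE2", "Status": "Active"}],
    "bob": [{"AccessKeyId": "AKIAEXAMPLE3", "Status": "Active"},
            {"AccessKeyId": "AKIAEXAMPLE4", "Status": "Inactive"}],
}

print(users_with_multiple_active_keys(sample_keys))
```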
This rule can help you with the following compliance standards:
This rule can help you work with the AWS Well-Architected Framework
When dealing with production databases that hold sensitive and critical data, it is highly recommended to implement encryption in order to protect your data from unauthorised access. When you enable RDS encryption, the data stored on the instance, the underlying storage, the automated backups, Read Replicas, and snapshots are all encrypted. The RDS encryption keys use the AES-256 algorithm and are entirely managed and protected by the AWS key management infrastructure through AWS Key Management Service (AWS KMS).
90 days. It is highly recommended to remove these unused roles from your AWS account to prevent unauthorized access.
This rule can help you with the following:
Compliance Frameworks
AWS Well-Architected Lens
To help you identify these unused roles, IAM now reports the last-used timestamp that represents when a role was last used to make an AWS request. You or your security team can use this information to identify, analyze, and then confidently remove unused roles. This helps you improve the security posture of your AWS environments. Additionally, by removing unused roles, you can simplify your monitoring and auditing efforts by focusing only on roles that are in use.
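The last-used timestamp lends itself to a simple age check, sketched below. It assumes role records shaped like those returned by the IAM GetRole API, where `RoleLastUsed.LastUsedDate` is a timezone-aware datetime and is absent for a role with no recorded use. Treating a role with no timestamp as unused is an assumption (a newly created role would also match), and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def unused_roles(roles, now, max_idle_days=90):
    """Return names of roles not used within the last `max_idle_days` days."""
    cutoff = now - timedelta(days=max_idle_days)
    unused = []
    for role in roles:
        last_used = role.get("RoleLastUsed", {}).get("LastUsedDate")
        # Assumption: no timestamp means the role has no recorded use.
        if last_used is None or last_used < cutoff:
            unused.append(role["RoleName"])
    return unused


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample_roles = [
    {"RoleName": "deploy", "RoleLastUsed": {"LastUsedDate": datetime(2023, 12, 20, tzinfo=timezone.utc)}},
    {"RoleName": "legacy", "RoleLastUsed": {"LastUsedDate": datetime(2023, 6, 1, tzinfo=timezone.utc)}},
    {"RoleName": "never-used", "RoleLastUsed": {}},
]
print(unused_roles(sample_roles, now))
```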
To apply the principle of least privilege, traffic must be authorized only from known hosts, services, required IP addresses, or other security groups.
Allowing unrestricted access to an EC2 instance on port 22 lets an attacker attempt to brute-force their way into the system and potentially gain access to the entire network. This can result in malicious activities such as hacking and man-in-the-middle (MITM) attacks.
Port 22 is used to establish an SSH connection to an EC2 instance and access a shell.
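To make the "unrestricted access on port 22" condition concrete, here is a sketch that inspects a single inbound rule. It assumes rule records shaped like the `IpPermissions` entries returned by the EC2 DescribeSecurityGroups API (`IpProtocol`, `FromPort`, `ToPort`, `IpRanges`, `Ipv6Ranges`); the function name and sample rules are hypothetical.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def allows_open_access(permission, port=22):
    """Return True if the inbound rule exposes `port` to any address."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":
        covers_port = True  # "-1" means all protocols and all ports
    elif protocol == "tcp":
        covers_port = permission.get("FromPort", 0) <= port <= permission.get("ToPort", 65535)
    else:
        covers_port = False  # SSH runs over TCP, so other protocols are ignored
    if not covers_port:
        return False
    v4 = {r.get("CidrIp") for r in permission.get("IpRanges", [])}
    v6 = {r.get("CidrIpv6") for r in permission.get("Ipv6Ranges", [])}
    return bool((v4 | v6) & OPEN_CIDRS)


open_ssh = {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []}
office_ssh = {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
              "IpRanges": [{"CidrIp": "10.0.0.0/8"}], "Ipv6Ranges": []}
all_traffic_v6 = {"IpProtocol": "-1", "IpRanges": [], "Ipv6Ranges": [{"CidrIpv6": "::/0"}]}
open_https = {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
              "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []}

print(allows_open_access(open_ssh), allows_open_access(office_ssh))
```

The `port` parameter lets the same check be reused for other ports, such as a database listener.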
This rule can help you with the following:
Compliance Frameworks’ reports
AWS Well-Architected Lens
This rule can help you with the following:
The AWS account root user password and IAM user access keys are not covered by the IAM password policy. If a password expires, the IAM user can no longer sign in to the AWS Management Console but can still use their access keys.
If an administrator does not configure a custom password policy, IAM user passwords must adhere to the AWS default password policy. The default password policy enforces the following conditions:
Minimum of 8 characters and a maximum of 128 characters.
Minimum of three of the following character types: uppercase, lowercase, numbers, and the symbols ! @ # $ % ^ & * ( ) _ + - = [ ] { } | '
Must not be the same as your AWS account name or email address.
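The three default conditions above can be expressed as a small validator. This is only a sketch of the rules as listed on this page, not the check AWS itself runs; the symbol set is taken from the list shown on this page, and the function name is hypothetical.

```python
# Symbol set as listed on this page for IAM passwords.
SYMBOLS = set("!@#$%^&*()_+-=[]{}|'")


def meets_default_policy(password, account_name="", email=""):
    """Check a password against the default conditions listed above."""
    if not 8 <= len(password) <= 128:
        return False
    if password == account_name or password == email:
        return False
    character_types = [
        any(c.isupper() for c in password),
        any(c.islower() for c in password),
        any(c.isdigit() for c in password),
        any(c in SYMBOLS for c in password),
    ]
    # At least three of the four character types must be present.
    return sum(character_types) >= 3


print(meets_default_policy("Abcdefg1"))
print(meets_default_policy("abcdefgh"))
```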
nOps recommends that you configure a custom password policy for IAM users with the following conditions:
! @ # $ % ^ & * ( ) _ + - = [ ] { } | '
When you activate the log file integrity validation option, CloudTrail will generate a hash using industry-standard algorithms for each log file that it delivers to your specified S3 bucket.
This rule can help you with the following:
CloudTrail Event Log for all AWS regions.
This rule checks for and lists AWS accounts that don’t have the AWS CloudTrail Event log enabled.
CloudTrail is enabled by default when you create an AWS account. A CloudTrail event is recorded whenever activity occurs in the account. In the CloudTrail console, click Event history to see the events from the previous 90 days.
However, if you want to manage ongoing events efficiently, you should create a trail, which is a configuration that delivers events to a specified S3 bucket.
A trail can be regional or global. Regional trails record events only from a specified region, whereas global trails, which are recommended, record events from all regions.
Amazon GuardDuty is an intelligent threat detection service that continuously monitors your provisioned AWS workloads for malicious activities such as API requests from harmful IP addresses and unauthorized access to S3 data.
It also provides comprehensive security insights for visibility and remediation. To identify and prioritize potential threats, GuardDuty leverages various techniques, like machine learning (ML), anomaly detection, and integrated threat intelligence. GuardDuty can analyze tens of billions of events curated from AWS CloudTrail event logs, Amazon Virtual Private Cloud (VPC) flow logs, and DNS query logs, among many other data sources.
This rule can help you with the following:
AWS Config is a service that allows you to inspect, audit, and review your AWS resource configurations. Config monitors and records all AWS resource configurations in real-time, enabling you to match recorded configurations against desired configurations seamlessly.
AWS Config also helps to analyze changes in AWS resource configurations, dig into particular resource configuration histories, and evaluate compliance with the configuration defined in your internal policies.
nOps recommends you consider enabling AWS Config for better security.
This rule can help you with the following:
nOps suggests enforcing the least privilege principle by defining IAM users/roles and restricting them to only the actions they need to do their tasks.
This rule can help you with the following:
Compliance Frameworks
AWS Well-Architected Lens
You should provision AWS RDS instances in private subnets to shield them from direct internet traffic. If you must deploy an RDS instance in a public subnet for any reason, verify that no inbound rule in any security group permits unrestricted access (i.e., 0.0.0.0/0 or ::/0), particularly on the TCP port that your database listens on.
The table below lists the default endpoint ports for each RDS database engine:
Database Engine | Default Port |
---|---|
Aurora/MySQL/MariaDB | 3306 |
PostgreSQL | 5432 |
SQL Server | 1433 |
Oracle | 1521 |
To use the least privilege principle, only known hosts, services, IP addresses, or security groups should be permitted. Unrestricted access to an RDS instance allows malicious attackers to brute force their way in and potentially get network access. This can lead to harmful activities like hacking and man-in-the-middle (MITM) attacks.
This rule can help you with the following:
Compliance Frameworks’ reports
AWS Well-Architected Lens
This rule can help you with the following compliance standards:
Amazon RDS Multi-AZ deployments provide enhanced availability for databases within a single region. In the event of a planned or unplanned outage of your DB instance, Amazon RDS automatically switches to a standby replica in another Availability Zone if you have enabled Multi-AZ.
This rule can help you with the following compliance standards:
This rule can help you with the following compliance standards, which align with the AWS Well-Architected Framework:
This rule can help you with the following compliance standards:
Creating point-in-time EBS snapshots periodically allows you to handle your data recovery process efficiently in the event of a failure, save your data before shutting down an EC2 instance, back up data for geographical expansion, and keep your disaster recovery stack up to date.
This rule can help you with the following compliance standards:
ElastiCache resources have a Multi-AZ deployment configuration to enhance High Availability (HA). This ensures that the service can automatically fail over to a read replica when the primary cache node fails, for example, in case of planned maintenance or the unlikely event of a primary node or Availability Zone failure. ElastiCache will handle this failover transparently, and there is no need to create or provision a new primary node. You can resume writing to the new primary as soon as the promotion of the read replica to primary is complete.
This rule can help you with:
AWS Well-Architected Lens
Please note that:
Redis Cache Multi-AZ with automatic failover does not support T1 and T2 cache node types or cache clusters with the Redis engine version earlier than 2.8.6
Redis Cache Multi-AZ with automatic failover is only available if the cluster has at least one read replica.
This rule is used by the following:
This rule can help you with the following compliance standards:
Naming (tagging) your AWS EBS volumes logically and consistently has several advantages, such as providing additional information about the volume location and usage, promoting consistency within the selected environment, quickly distinguishing similar resources from one another, avoiding naming collisions, improving clarity in cases of potential ambiguity, and giving the environment a more professional appearance.
For example, if an AWS account is hosting production systems and critical workloads, it is highly recommended that your AWS Support Plan be Business or Enterprise.
Amazon Web Services provides the following support plans:
Basic – This plan is included for all AWS customers and includes the following:
Developer – This plan is recommended for customers who are experimenting or testing in AWS. It includes the following additional features on top of the Basic plan:
Business – This plan is recommended and suitable for most production workloads in AWS. It includes the following additional features on top of the Developer Support Plan:
Production System Impaired/Down cases, i.e. less than 4 hours for impaired production systems and less than 1 hour for production systems that are experiencing downtime
AWS Support API
Enterprise – This plan is recommended for business and/or mission-critical workloads in AWS. If you are an enterprise business running mission-critical workloads on AWS and require high-touch proactive/preventive support, then this plan is for you. It includes the following additional features on top of the Business Support Plan:
You can find up-to-date information and pricing on these AWS Support Plans here.
The purpose of this nOps rule is to validate the support plan required for your AWS account/environment.
This rule can help you with the following compliance standards:
Any AMI older than 180 days is considered obsolete and is likely to be missing important patches and security updates required for reliable operations.
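The 180-day rule can be sketched as a date comparison. It assumes the AMI's `CreationDate` is a UTC string in the form `2023-01-01T00:00:00.000Z`, as returned by the EC2 DescribeImages API; the function name is hypothetical, and `now` is passed in explicitly so the check is easy to test.

```python
from datetime import datetime, timedelta, timezone


def is_ami_obsolete(creation_date, now, max_age_days=180):
    """Return True if the AMI was created more than `max_age_days` days ago."""
    created = datetime.strptime(creation_date, "%Y-%m-%dT%H:%M:%S.%fZ")
    created = created.replace(tzinfo=timezone.utc)  # the trailing "Z" denotes UTC
    return now - created > timedelta(days=max_age_days)


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
print(is_ami_obsolete("2023-01-01T00:00:00.000Z", now))
print(is_ami_obsolete("2024-06-01T00:00:00.000Z", now))
```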
This rule can help you with the following:
An AWS ELBv2 load balancer is considered “unused” when the associated target group has no EC2 target instance registered or when the registered target instances are not healthy anymore.
This rule can help you work with the AWS Well-Architected Framework.
You are charged for each hour or partial hour that an AWS ELBv2 load balancer is running, regardless of whether you are using the resource or not. Removing unused AWS resources like an Application Load Balancer (ALB) or a Network Load Balancer (NLB) will help you avoid unexpected charges on your AWS bill.
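The "unused" definition given for ELBv2 load balancers (no registered targets, or no healthy ones) can be sketched as below. It assumes records shaped like the `TargetHealthDescriptions` list returned by the ELBv2 DescribeTargetHealth API, where each entry has a `TargetHealth.State`; the function name and sample data are hypothetical.

```python
def is_target_group_unused(target_health_descriptions):
    """A target group is unused if it has no targets or none of them is healthy."""
    states = [d["TargetHealth"]["State"] for d in target_health_descriptions]
    return not any(state == "healthy" for state in states)


no_targets = []
all_unhealthy = [
    {"Target": {"Id": "i-0001"}, "TargetHealth": {"State": "unhealthy"}},
    {"Target": {"Id": "i-0002"}, "TargetHealth": {"State": "unhealthy"}},
]
one_healthy = [
    {"Target": {"Id": "i-0003"}, "TargetHealth": {"State": "healthy"}},
    {"Target": {"Id": "i-0004"}, "TargetHealth": {"State": "unhealthy"}},
]

print(is_target_group_unused(no_targets), is_target_group_unused(one_healthy))
```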
By default, an AWS Redshift cluster is considered under-utilised when it matches the following criteria:
CPU utilization has been less than 60% for the last 30 days.
ReadIOPS and WriteIOPS registered per day for the last 30 days have been less than 100 on average.
The AWS CloudWatch metrics utilized to detect underused Redshift clusters are:
CPUUtilization – the percentage of CPU utilization (Units: Percent).
ReadIOPS – the average number of disk read operations per second (Units: Count/Second).
WriteIOPS – the average number of disk write operations per second (Units: Count/Second).
You can change the default threshold values for this rule on the nOps console and set your own values for CPU utilization and the total number of ReadIOPS and WriteIOPS to configure the underuse level for your Redshift clusters.
This rule can help you work with the AWS Well-Architected Framework
This rule can help you with the following:
Compliance Frameworks
AWS Well-Architected Lens
CloudWatch, the AWS monitoring service, can be used to monitor your NAT gateway via information it collects from the specified NAT gateway. This information is collected and presented as readable metrics at 1-minute intervals and is stored for 15 months. nOps uses one such metric to determine whether a NAT gateway is unused. This metric is BytesOutToDestination, which is the number of bytes sent out through the NAT gateway to the destination.
A NAT gateway is considered unused if the value of BytesOutToDestination is 0 for the last 7 days.
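The NAT gateway rule reduces to checking that every BytesOutToDestination value in the window is zero. The sketch below works on daily sums you have already retrieved from CloudWatch; the function name is hypothetical, and treating an empty series as inconclusive rather than "unused" is an assumption.

```python
def is_nat_gateway_unused(daily_bytes_out):
    """Return True if BytesOutToDestination was 0 on every day in the window."""
    if not daily_bytes_out:
        return False  # assumption: no datapoints is inconclusive, not "unused"
    return all(value == 0 for value in daily_bytes_out)


print(is_nat_gateway_unused([0, 0, 0, 0, 0, 0, 0]))
print(is_nat_gateway_unused([0, 0, 1024, 0, 0, 0, 0]))
```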
© nOps 2024. All Rights Reserved.