A common occurrence in Instance management is the risk of overutilization of disk space. Several factors can cause an increase in Diskutilization to go over 90%. For example, user-initiated heavy workloads, analytic queries, prolonged deadlocks & lock waits, multiple concurrent transactions, long-running transactions, or other processes that utilize CPU resources.
Over-utilized instances can incur several performance issues that later affect your budget. Having a simple, automated means of scaling the volume of your instances when necessary take off any management overhead from your side. This use case focuses on automatically increasing disk space by a defined amount when a DSU of above 90% is detected.
In this particular template, you instruct the workflow to increase the DB size by 20GB when the disk space utilization crosses 90%. This event (DSU > 90%) sets off a CloudWatch Alarm, which triggers the TotalCloud workflow. Even if your CloudWatch Alarm alerts you of overutilization in the middle of the night, the workflow would have handled it for you before you even think of having to respond. Since it’s automated, the fix is executed immediately - eliminating any response time delays. If you wish to approve of the action before it occurs, you can enable user approval as well.
The workflow increases your EBS volume by 20GB, as a default value. This value can be altered depending on your workload demands. When a CloudWatch Alarm goes off and sends an SNS alert for high disk space utilization, the workflow is automatically triggered and executes the action. Like we’ve pointed out, you can set the trigger to be anything - a CloudWatch Alarm, any other external system, platform, or ticketing system such as JIRA. In this case, you can also instruct the workflow to create the ticket on your ticketing platform when the Alarm goes off. It can then close the ticket once remediation is completed. This is helpful for logging purposes, and to enable end-to-end automation.
After the Workflow matches the instances to be modified, it requests for user approval. On receiving a green signal, it increases the EBS volume and sends an SSM command that will attach the new volume to its EBS, and inform the OS.
The workflow has two primary steps being achieved with a total of 8 nodes. The first step is to filter out the right instance(s) using simple conditional operations. The second is to modify the volume and apply it to your instance.