Cloud Infrastructure - Here's Why You are Paying Too Much

March 23, 2020




It's estimated that in 2020 organizations will spend over $266 Billion on public cloud services. Of that, $50 Billion will go on Infrastructure as a Service (IaaS), and $33 Billion of that IaaS spend will go on compute from vendors such as AWS, Azure and GCP.

The growth comes in part from a wider adoption of public cloud solutions coupled with a large proportion of existing cloud users scaling infrastructure and bringing more applications into the cloud.

No doubt many organizations have scaled their infrastructure well ahead of the growth of their businesses and their actual resource utilization.

A recent Gartner report estimated that IaaS spend will rise to $50 Billion in 2020. Given that compute typically accounts for around two thirds of IaaS spend, that places $33 Billion in the segment most vulnerable to wasted spend.

The Growth of Cloud Overspend

While management, and CFOs in particular, are acutely aware of the potential for wasted spending on unused resources, it's human nature to build some healthy buffers into compute and storage design. The trouble is that paying an additional $100 here and $200 there for underutilized resources can soon mount up to thousands, hundreds of thousands or even millions across a large enterprise multi-cloud environment.

The overspend stems from four main areas:

  • Oversized Resources
  • Unused Resources
  • Idle Resources
  • Unused Storage

Oversized Resources

Studies have shown that just under half of the configured instances in tested environments were at least one size larger than the workload they were handling, which means that simply stepping down one size could immediately save 50% of the cost of that instance. Where an instance is two sizes too large, stepping down both takes its bill to 25% of what you are paying now, with no discernible impact on the operation of your environment.

Even on the compute spend alone, if cloud users dropped underutilized resources down one size across the estimated 40% of instances that are oversized, that would save around $6 Billion per year (sorry Jeff, Thomas, Scott).
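The arithmetic behind these figures can be sketched in a few lines. This is only a back-of-envelope illustration: it assumes each step down in instance size halves the price, as the paragraphs above imply, and it uses the estimates quoted in this article. The function and constant names are hypothetical.

```python
# Back-of-envelope right-sizing estimate using the figures quoted above.
# Assumption: each step down in instance size halves the price.

COMPUTE_SPEND_USD = 33e9   # estimated 2020 public cloud compute spend
OVERSIZED_SHARE = 0.40     # estimated share of spend on oversized instances


def cost_fraction_after_downsizing(steps: int) -> float:
    """Fraction of the original price still paid after stepping down `steps` sizes."""
    return 0.5 ** steps


def rightsizing_savings(spend: float, oversized_share: float, steps: int = 1) -> float:
    """Estimated annual savings if the oversized share steps down `steps` sizes."""
    return spend * oversized_share * (1 - cost_fraction_after_downsizing(steps))


if __name__ == "__main__":
    savings = rightsizing_savings(COMPUTE_SPEND_USD, OVERSIZED_SHARE)
    print(f"One size down: ${savings / 1e9:.1f} billion saved per year")
```

With these inputs the estimate comes to roughly $6.6 Billion per year, in line with the "around $6 Billion" quoted above. Swap in your own monthly compute bill and oversized share to get a figure for your environment.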

Unused Resources

There is generally a phase when developing your applications and production environments where testing occurs in isolation. Maybe that's a whole test environment, QA staging or a separate staging area where all the magic happens. Either way, there comes a time in a project lifecycle when the production environment is stable and some of these resources are redundant.

We've encountered many client environments in our consulting business where orphaned resources from old test environments were still running and costing thousands of dollars per month for no good reason.

Idle Resources

There are of course lots of test environments, staging environments and QA staging infrastructure that are very much in use on a daily basis. What you have to ask yourself is "When are these resources being used?"

Typically, with an in-house or local contract consulting team, these test and development environments are in use Monday through Friday, 8am to 6pm. That's 45, maybe 50 hours out of the 168 hours in a week, which leaves a whopping 70% of the week where they are unused. So the other question to ask is: why are they still running and costing you money if they are just sitting idle?

The same cost saving calculation can be applied. If 40% of cloud infrastructure is non-production and those resources are switched off for 70% of the week, the entire industry compute spend of $33 Billion would be reduced by a further $9 Billion. That's money that would be far better spent on engineers or automated diagramming tools!
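The idle-time figure and the resulting saving can be checked with a short sketch. It assumes compute cost scales linearly with running hours (on-demand style billing) and ignores storage that is still billed while instances are stopped. The names are hypothetical and the inputs are the estimates from this article.

```python
# Rough estimate of the saving from switching off idle non-production
# resources outside working hours.
# Assumption: cost scales linearly with the hours a resource is running.

HOURS_PER_WEEK = 7 * 24  # 168


def idle_fraction(days_used: int, hours_per_day: float) -> float:
    """Share of the week during which an environment is not in use."""
    return 1 - (days_used * hours_per_day) / HOURS_PER_WEEK


def scheduling_savings(spend: float, non_prod_share: float, idle: float) -> float:
    """Estimated savings from stopping non-production resources while idle."""
    return spend * non_prod_share * idle


if __name__ == "__main__":
    idle = idle_fraction(days_used=5, hours_per_day=10)  # Mon-Fri, 8am-6pm
    savings = scheduling_savings(33e9, 0.40, idle)
    print(f"Idle for {idle:.0%} of the week")
    print(f"Estimated saving: ${savings / 1e9:.1f} billion per year")
```

A 50 hour working week leaves the environment idle for about 70% of the time, and applying that to 40% of a $33 Billion compute spend gives a little over $9 Billion.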

Unused Storage

Another typical source of wasted expense is orphaned volumes. These stand out prominently on automated infrastructure diagrams. Not so easy to spot, but also important to monitor, is database utilization. Severely underutilized databases are another pointless cost coming directly off your client's or organisation's bottom line.

Deleting unused volumes and superseded snapshots can deliver immediate one-off savings, while paring back bloated instances can deliver smaller but permanent ongoing monthly savings long into the future.
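As a minimal sketch of what spotting orphaned volumes can look like in code, the example below filters a list of volume records for ones that are not attached to anything. It assumes each record follows the shape of an entry in the AWS EC2 DescribeVolumes response ("VolumeId", "Size" in GiB, "State", "Attachments"). The sample inventory, the volume IDs and the price per GiB are made up for illustration, and this is not a description of how hava identifies them.

```python
from typing import Iterable, List


def find_unattached_volumes(volumes: Iterable[dict]) -> List[dict]:
    """Return volumes that are not attached to any instance.

    Each record is assumed to look like an EC2 DescribeVolumes entry,
    where an unattached volume has State "available" and no Attachments.
    """
    return [
        v for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


def monthly_cost(volumes: Iterable[dict], price_per_gib_month: float) -> float:
    """Estimated monthly storage cost for the given volumes."""
    return sum(v["Size"] for v in volumes) * price_per_gib_month


if __name__ == "__main__":
    # Hypothetical inventory; IDs and sizes are invented for this example.
    inventory = [
        {"VolumeId": "vol-aaa", "Size": 100, "State": "in-use",
         "Attachments": [{"InstanceId": "i-123"}]},
        {"VolumeId": "vol-bbb", "Size": 500, "State": "available", "Attachments": []},
        {"VolumeId": "vol-ccc", "Size": 100, "State": "available", "Attachments": []},
    ]
    price = 0.10  # placeholder USD per GiB-month; check your provider's pricing
    orphans = find_unattached_volumes(inventory)
    print("Unattached:", [v["VolumeId"] for v in orphans])
    print(f"Estimated monthly cost: ${monthly_cost(orphans, price):.2f}")
```

Before deleting anything flagged this way, confirm the volume really is redundant and consider taking a final snapshot, since an unattached volume may still hold data someone needs.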

How to reduce your cloud spend

The key to any well-managed business practice is to measure. You can't manage what you don't measure, as the old saying goes.

You need a method of identifying the outliers in your cloud infrastructure and seeing what's making up your cloud spend. 

Our organisation provided cloud consulting and DevOps engineering services for many years and we often faced the challenge of identifying exactly what a client was running, what instances and resources were in use, which ones were outliers and what the whole shooting match was estimated to be costing the client on a monthly basis.

This took days if not weeks, which was great if we were charging by the hour, but not so great for the client. We certainly had better things to be doing with our time than trawling through consoles identifying what was running where. We couldn't rely on manually created documentation, as it was never accurate or up to date, and in most cases it didn't exist.

There had to be a better way... There wasn't... So we built hava.io.

In a nutshell, you simply create some read-only cloud credentials or an AWS cross-account role, plug them into the hava dashboard and the software automatically pulls back your entire AWS, Azure and GCP infrastructure and renders it into logically laid out interactive diagrams.

You can instantly see EVERYTHING that is running including the estimated costs down to the individual instance or resource. 

This allows you to easily spot the outliers, identify unused instances and volumes, and see which databases are live so you can use your consoles to check the utilisation against the allocated storage you are paying for.

In a real world scenario, one of our clients, who shall remain nameless, started using hava a year or so ago and was checking out their freshly rendered infrastructure diagrams. There was a subnet with a rather large database instance sitting there, minding its own business. Nobody knew what it was, and it wasn't related to any production infrastructure. There were no connections showing up on the interactive diagram either. A classic outlier.

Clicking on the database, the Attributes tab to the side of the diagram displayed $1257/m. A little more digging and it turned out to be a really, really old test environment, where everything had been switched off or deleted except for the database instance.

It had been costing them over $15k a year for well over two years, so north of $30k.

The current team had no reason to go looking for it. The client was a large cloud user, so the expense was buried so deep in the AWS invoice that nobody noticed. An easy mistake to make, but at $30k+ also a very expensive one.

Needless to say, that client's modest investment in hava will be paying off for years to come.

Like to find out more?  Come say hi on our socially distanced chat widget in the bottom right of this page, email sales@hava.io, or hit up the contact us page below.

We look forward to seeing how much you can save your business with hava.io.



See also: 7 great reasons to visualize your cloud infrastructure



Written by Team Hava

The Hava content team