It’s common to believe that Cloud Services make everything cheaper. Not always so.
While it’s true that Cloud Services frequently offer business advantages that can enable rapid growth, signing up for AWS doesn’t just instantly cut costs. And it can increase them.
Symptom:
Client moved its database from a Co-Located Data Center to AWS. Despite provisioning similar server configurations, the AWS bill was enormous.
Diagnostics:
First step is to check the AWS billing detail. The charges relate to (1) CPU, (2) Server Memory, (3) Disk Storage, and (4) Data Transfer.
In this case, a very high rate of Data Transfer was responsible for the largest share of the billing. This was surprising because the data for this business would not normally be considered high-traffic. (I would have a different answer for a video streaming company, for instance, where high data transfer is expected.)
Technical Diligence:
Examine typical database queries. To protect my real client, let’s imagine a hypothetical one called “First Imaginary Bank of the US”, or “FIBUS”. A typical query is “show all of Zia Zuckerman’s checks that cleared last month”. The client’s product had a web interface that would allow queries like this.
When I looked at the SQL code generated by this Front End (web interface), I read the exact queries it generated. No matter how many search criteria were involved, the entire database was scanned for each query.
Translated into “pseudocode”, the instructions looked like:
- For every transaction this bank has ever had:
- Was this transaction for Zia Zuckerman?
- Was this transaction conducted this month?
- Was this transaction a check?
- Only output a result if the answer to all those questions is yes.
Do you see the problem? Why is every single transaction in the bank’s history being re-evaluated in response to every query?
There are so many ways the data could have been pre-simplified, in order to reduce data transfer requirements:
- When opening Zia Zuckerman’s account information, create an intermediate list containing only her transactions (millions of times more efficient), and then use that intermediate list for further queries
- Pre-cache transactions in groups by year
- and more.
Why did this inefficient code cause such huge cost?
It’s all about who owns the wire.
In a computer, relatively small amounts of data are held in RAM, but large amounts of data are stored on disk. The CPU does math, but to access the disk drive there’s a data transfer that takes place.
When my client owned the computer, in their own Data Center, they also owned the wire that connected the computer to the disk storage. Infinite use of that wire was free.
AWS charges for “IOPS” (inputs and outputs per second).
This grossly inefficient database was somewhat harmless in the Data Center, but became a debilitating expense on AWS.
Remedy:
The code had to be refactored. In the case of this particular client, a major customer delivery contract had already been signed with a stipulation for cloud services. (Perhaps the team should have done its own experiment before signing such a contract, but by the time I was engaged, it was too late.)
There was no choice but to embark on the 7-figure project of improving the core product, which had these SQL inefficiencies throughout.