Last week we witnessed the worst possible showing from the world's most prominent cloud provider: Amazon.

One of the best articles I saw in the aftermath of the event was written by Massimo Re Ferrè (@mrferre). In it he details the differences between “TCP” and “UDP” clouds and discusses the “design for fail” approach. It’s definitely worth a read, but I would like to add something about the cost of the “design for fail” approach, starting from more traditional environments.

theory

One of the drivers of cloud adoption is the ability to lower the overall TCO of your infrastructure. In a few words: moving your application stack and data to the cloud means cutting fixed IT costs (oversized infrastructure) and paying only for what you use in day-to-day operations.

reality

In theory it’s great, and it gives you an elastic approach to your IT needs, but theory is often a long way from practice!
If you want to design a new application stack or, in the worst case, migrate a legacy one, you need to keep in mind the availability and resiliency limitations of the cloud infrastructure, and that makes the problem bigger!

Moving high availability mechanisms up into the application logic, instead of leaving them at the infrastructure level, is very risky and can compromise your future development. Here’s why:

you need a more skilled development team

Think about it: building resiliency and availability into the application (while maintaining scalability and performance) requires a stronger development team and a more complex overall architecture that has to be maintained and supported. This will drive costs up in an unpredictable way.
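To give a feel for what "availability in the application logic" means in practice, here is a minimal Python sketch of a failover wrapper. It is only an illustration under assumptions: the names `call_with_failover`, `request_fn`, `AllEndpointsFailed` and the endpoint strings are hypothetical and do not correspond to any cloud provider's API. Real code would also have to deal with timeouts, partial writes, data consistency and health checks, which is where much of the extra skill and cost would go.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllEndpointsFailed(Exception):
    """Raised when every configured endpoint has been tried without success."""


def call_with_failover(
    endpoints: Sequence[str],
    request_fn: Callable[[str], T],
    retries_per_endpoint: int = 2,
    backoff_seconds: float = 0.0,
) -> T:
    """Try each endpoint in order, retrying a few times before moving on.

    `endpoints` and `request_fn` are hypothetical: `request_fn` stands for
    whatever call the application makes against one zone or region.
    """
    errors = []
    for endpoint in endpoints:
        for attempt in range(retries_per_endpoint):
            try:
                return request_fn(endpoint)
            except Exception as exc:  # illustration only: real code should be selective
                errors.append((endpoint, attempt, exc))
                # Linear backoff before the next attempt.
                time.sleep(backoff_seconds * (attempt + 1))
    raise AllEndpointsFailed(errors)


if __name__ == "__main__":
    def fake_request(endpoint: str) -> str:
        # Simulate an outage in the first (hypothetical) zone.
        if endpoint == "zone-a":
            raise ConnectionError("zone-a is down")
        return "ok from " + endpoint

    print(call_with_failover(["zone-a", "zone-b"], fake_request))
```

Even this toy version adds failure paths (retry counts, backoff, the "everything is down" case) that the team has to design, maintain and test with every release.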

you need a longer development process

After the development phase you need to test your work, and the more lines of code you write, the longer testing takes. Every new software release will need to be tested from many points of view and, if you add availability layers in between, you will need to test those too... and costs will keep growing.

you will be locked to the cloud provider

The risk in building a custom availability layer for your application stack is that you’ll have to integrate it deeply with your public cloud provider’s APIs, which results in a locked-in environment!
I see this as another big loss: you give up your freedom and your ability to negotiate better prices with different service providers.

is this the cloud we want?

Moving budget from hardware infrastructure to human resources, and locking that infrastructure to a single provider, isn’t the right way to adopt the cloud.
It reminds me of the old mainframe model: big skills, tons of custom software and lock-in!

Is the Amazon-like public cloud the next mainframe?