Saturday, August 25, 2018

Infrastructure as code - troposphere

Much is said about infrastructure as code but, in my experience, very few (if any) companies do it. Personally, I've never seen one. Companies work in the cloud as they used to with physical infrastructure: starting machines, defining subnets and configuring load balancers manually. Then they write a bunch of shell scripts on these machines and claim their infrastructure is automated.



We were no different and followed exactly that same recipe. There's nothing wrong with that, and it's a very efficient strategy when one has to manage just a few products running on a few dozen machines. But things start to get out of control when the number of resources being managed grows past a certain point (for us this point was around 10 products running in 3 AWS regions). That was when we decided to embrace infrastructure as code.

The first difficulty I faced was getting used to the CloudFormation API. I was used to interacting with AWS only through the web console and its fantastic wizards, so I never spent much time learning all the attributes of the resources being created and how they applied to different scenarios. With the CloudFormation API I didn't have the wizards, so knowing the resource attributes and how they were supposed to be used became necessary. It was not a big pain, for sure, but it took me some time to get used to the AWS documentation (very good, by the way!) and become reasonably fluent in building my stack templates.

I was then writing stack templates for our applications, parameterizing everything that could change and using CloudFormation functions, but it still didn't feel like code. It was more like writing a report in LibreOffice Writer. I could add formatting, use some functions, take some input parameters, but in the end it's just text (in my case, YAML!).

That, per se, was not a problem. The way I feel about something is not relevant if it's the best way to solve a problem. But as the stacks grew, the templates were becoming extremely confusing and full of duplicated "code". As for the templates being difficult to understand, one can solve that simply by breaking a large template into smaller ones, since it's possible to reference templates within templates. But that would not solve the "code" duplication issue. Enter troposphere!



Troposphere is a Python library that allows for the creation of CloudFormation templates programmatically. It really turns infrastructure into **code**. Real code. Python. One can encapsulate resource creation in functions, avoiding code duplication. One can organize resource creation hierarchically, making it easier to understand and maintain. And one can use any Python code to extend what is possible with CloudFormation templates. In the end, one just has to run the Python code to produce a pure JSON template that can be used in AWS CloudFormation.
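To make the "encapsulate resource creation in functions" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not troposphere's API: troposphere gives you typed classes for each resource instead of raw dictionaries, but the principle is the same. The helper names (`security_group`, `build_template`), the `VpcId` parameter and the example services are all made up for illustration.

```python
import json


def security_group(logical_id, description, port):
    """Return a (logical_id, resource) pair for a security group
    that allows inbound TCP traffic on a single port."""
    resource = {
        "Type": "AWS::EC2::SecurityGroup",
        "Properties": {
            "GroupDescription": description,
            # "VpcId" is a hypothetical template parameter declared below.
            "VpcId": {"Ref": "VpcId"},
            "SecurityGroupIngress": [
                {
                    "IpProtocol": "tcp",
                    "FromPort": port,
                    "ToPort": port,
                    "CidrIp": "0.0.0.0/0",
                }
            ],
        },
    }
    return logical_id, resource


def build_template(services):
    """Build a template dictionary with one security group per service.

    `services` maps a service name to the port it listens on.
    """
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {"VpcId": {"Type": "AWS::EC2::VPC::Id"}},
        "Resources": {},
    }
    for name, port in services.items():
        logical_id, resource = security_group(
            f"{name}SecurityGroup", f"Ingress for {name}", port
        )
        template["Resources"][logical_id] = resource
    return template


if __name__ == "__main__":
    # Running the script prints a JSON template, one resource per service.
    print(json.dumps(build_template({"Web": 443, "Api": 8080}), indent=2))
```

The security group is described once, in one function, and adding a new service is a one-line change to the dictionary passed to `build_template`. In a hand-written template the same change would mean copying and editing a whole resource block.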

My intention here is not to pitch troposphere. I'm not connected with the project in any way and I don't gain anything by making it more popular. I'm just suggesting that anyone who wants to treat infrastructure as code should really take a look at it since, in my personal opinion, it makes CloudFormation templating much more powerful, easier and more maintainable.

Troposphere:

https://github.com/cloudtools/troposphere

PS: there are other tools like troposphere available out there. I haven't evaluated them. If you like some other tool, please leave me a comment.