Intro to Cloud Foundry and BOSH

David Bishop

I’ve started looking into Cloud Foundry and BOSH for work, and one thing I’ve noticed is a lack of “mid-range” documentation. It’s quite possible I’m blind, but I’ve seen a lot of 30,000-foot material (“Cloud Foundry will accelerate your velocity!”) and some great docs for people who already know what they’re talking about (“The Director uses the CPI to tell the IaaS to launch a VM”), but I haven’t seen any introductions written for someone who is not a PaaS expert but is also not a manager. This is my attempt to fill that gap.

To start with, Cloud Foundry is an open-source version of Heroku (in intent, and it’s even compatible in some ways). It’s an implementation of a “Platform as a Service” (PaaS) product that you can run either in-house or on various cloud platforms that provide you with “normal” virtual machines. It’s been explicitly designed to be agnostic about what sort of machine it’s running on (as long as that machine runs Linux, for now). On traditional machines, if you want to deploy a website you have to do ALL THE THINGS: configure the webserver, the IP address, and the database (if you use one), take care of logging, and so on. A PaaS tries to abstract as much of that away as possible. All you have to do is push your code (along with a config file that determines things like how much RAM, CPU, and disk space you need and what services you will be talking to) and you get your own little slice of a machine that is isolated in its own container. Using a PaaS is the natural end-game for someone who is trying to follow the 12 Factor App manifesto. A good case can be made that using a PaaS should be the default for any newly written code, unless you have very good reasons why it won’t work. Of course, existing shops with their legacy code and processes are a different story - migrating even a medium-sized project from the “traditional” approach over to a PaaS can be quite a bit of work.
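To make the "config file" idea a bit more concrete, here is a rough Python sketch of the kind of information an application manifest declares. The real thing is a YAML file that you push alongside your code. The app name and service name below are made up, and the two helper functions are my own illustration, not part of Cloud Foundry.

```python
# Illustrative only: a Python dict mirroring the general shape of a
# Cloud Foundry application manifest. "my-web-app" and "my-database"
# are hypothetical names invented for this example.
app_manifest = {
    "applications": [
        {
            "name": "my-web-app",         # hypothetical app name
            "memory": "256M",             # RAM per instance
            "disk_quota": "512M",         # disk per instance
            "instances": 2,               # number of containers to run
            "services": ["my-database"],  # hypothetical service to bind
        }
    ]
}


def parse_size_mb(size):
    """Convert a size string such as '256M' or '1G' to megabytes."""
    units = {"M": 1, "G": 1024}
    number, unit = size[:-1], size[-1].upper()
    if unit not in units:
        raise ValueError(f"unsupported size unit in {size!r}")
    return int(number) * units[unit]


def total_footprint_mb(manifest):
    """Add up the RAM and disk needed across every instance of every app."""
    ram = disk = 0
    for app in manifest["applications"]:
        count = app.get("instances", 1)
        ram += count * parse_size_mb(app["memory"])
        disk += count * parse_size_mb(app["disk_quota"])
    return {"ram_mb": ram, "disk_mb": disk}
```

The point is that the developer only states what the app needs. Where it runs, which IP it gets, and how it is wired to the database are the platform's problem.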

To divert for a second, a container is kinda like a virtual machine, in that it gets its own IP address and disk space, but it’s not a “full-fledged” machine - it shares a kernel with the other containers and the host machine. Containers are managed by a software component of Cloud Foundry called Warden, which works very similarly to LXC or Solaris Zones (in fact, early versions of Warden used LXC as the base). If you want to scale up the performance of your app, you do it by adding instances. Cloud Foundry will take care of making sure that requests to your app get routed across all of the instances. It will also restart a container if it detects a problem, such as the application running out of RAM or crashing.
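Here is a toy Python model of those three behaviours: scaling by adding instances, spreading requests across them, and restarting anything that has crashed. Everything in it (the class names, the round-robin choice, the `crashed` flag) is an assumption I made for illustration. It says nothing about how Cloud Foundry's router or health checks are really implemented.

```python
class Instance:
    """Stand-in for one container running a copy of the app."""

    def __init__(self, index):
        self.index = index
        self.crashed = False
        self.restarts = 0

    def handle(self, request):
        return f"instance {self.index} served {request}"


class ToyPlatform:
    """Hypothetical platform: scales, routes round-robin, restarts crashes."""

    def __init__(self, instance_count):
        self.instances = [Instance(i) for i in range(instance_count)]
        self._cursor = 0

    def scale(self, instance_count):
        """Add or remove instances to reach the requested count."""
        if instance_count < 1:
            raise ValueError("need at least one instance")
        current = len(self.instances)
        if instance_count > current:
            self.instances += [Instance(i) for i in range(current, instance_count)]
        else:
            del self.instances[instance_count:]

    def route(self, request):
        """Send the request to the next instance in turn (round robin)."""
        instance = self.instances[self._cursor % len(self.instances)]
        self._cursor += 1
        return instance.handle(request)

    def health_check(self):
        """Restart every crashed instance and report which ones were restarted."""
        restarted = []
        for instance in self.instances:
            if instance.crashed:
                instance.crashed = False
                instance.restarts += 1
                restarted.append(instance.index)
        return restarted
```

If you create `ToyPlatform(1)`, call `scale(2)`, and then route three requests, they land on instance 0, instance 1, and instance 0 again. The app itself never has to know how many copies of it are running.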

BOSH is sort of like Puppet or Chef, but it works at the infrastructure layer rather than with individual resources on each machine. By that I mean you describe how you want your infrastructure to look and then BOSH will make it look that way. With Puppet you describe how you want a given machine to look, and it’s up to you to make the machines all work together. BOSH uses virtual-machine templates (called “stemcells”), code, configuration and scripts (called “releases”), and a manifest file that describes how the infrastructure, code, network architecture, and everything else combine. Those three components wrapped up together are a “deployment”, which can be pushed using BOSH. BOSH will provision machines, upload your code, install packages, and then monitor the machines to make sure everything is healthy, redeploying them if something breaks. If you want to do something like add more virtual machines, you modify the manifest and redeploy. You can also update the stemcell (maybe switch from Ubuntu to CentOS) without changing anything else and redeploy. And of course you can change the actual code you’re pushing and redeploy. Each time, if BOSH sees a difference between the running machines and what’s in the manifest, it will rectify the situation, whether that means shutting down a machine and redeploying it entirely (for a newly updated stemcell), just deploying new packages, or whatever else is needed.
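The "compare the manifest to what is running, then fix the difference" loop is easier to see in code. This is a minimal Python sketch of the idea only. The manifest keys, the `vm-N` naming, and the action names are all invented for the example, and a real BOSH manifest is a much richer YAML document.

```python
def plan_changes(manifest, running_vms):
    """Return the steps needed to make running_vms match the manifest.

    manifest:    {"stemcell": str, "release": str, "instances": int}
    running_vms: {vm_name: {"stemcell": str, "release": str}}
    Both shapes are hypothetical, chosen to keep the example small.
    """
    actions = []
    wanted = [f"vm-{i}" for i in range(manifest["instances"])]

    for name in wanted:
        vm = running_vms.get(name)
        if vm is None:
            # The manifest asks for more machines than exist.
            actions.append(("create", name))
        elif vm["stemcell"] != manifest["stemcell"]:
            # A new base image means rebuilding the whole machine.
            actions.append(("recreate", name))
        elif vm["release"] != manifest["release"]:
            # Same base image, new code: only the packages change.
            actions.append(("update_packages", name))

    for name in sorted(running_vms):
        if name not in wanted:
            # A machine is running that the manifest no longer mentions.
            actions.append(("delete", name))

    return actions


manifest = {"stemcell": "ubuntu", "release": "v2", "instances": 4}
running = {
    "vm-0": {"stemcell": "ubuntu", "release": "v2"},  # already correct
    "vm-1": {"stemcell": "centos", "release": "v2"},  # wrong stemcell
    "vm-2": {"stemcell": "ubuntu", "release": "v1"},  # old release
    "vm-7": {"stemcell": "ubuntu", "release": "v2"},  # not in the manifest
}
plan = plan_changes(manifest, running)
```

In this example `vm-0` is left alone, `vm-1` is recreated, `vm-2` gets new packages, `vm-3` is created, and `vm-7` is deleted. You never issue those steps yourself. You edit the manifest and the tool works out the steps.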

BOSH is used to deploy Cloud Foundry onto your infrastructure, and if it sounds like it kind of works like Cloud Foundry, that’s because it does. However, BOSH is aimed at building out infrastructure and is a bit unwieldy for J. Random Developer to use just to deploy their webapp. Deciding how to set up networking and managing packages are exactly the sorts of things that a PaaS abstracts away from you. That said, BOSH does a bang-up job of building out large, complicated software like Cloud Foundry.

One note about the difference between BOSH and bosh-lite: in production, you use BOSH to provision virtual machines. bosh-lite, on the other hand, is a way to build out Cloud Foundry in a sandbox, so that you can play around with it and understand it. As such, it provisions the various bits of Cloud Foundry as containers on the machine that is running the bosh-lite director.

The next article will go over installing BOSH, then using BOSH to install Cloud Foundry, and then using Cloud Foundry to host an app.