This post is about how you can automate the iterative process of code-test-deploy by using Docker, Git and Amazon Elastic Beanstalk based on our experience on a project which involved building self-contained micro-services. Each micro-service would run as an HTTP service and would be deployed on Amazon EC2 cloud.
Our team was small: three developers and one QA engineer. We had no operations team members to support us. What we did have was access to an Amazon account and the tools available in that ecosystem.
Among other things, two automation activities stood out and helped us deliver quality software. I will talk briefly about each of them:
- Using Docker for end-to-end tests
- Using AWS Elastic Beanstalk to deploy our code to production
Using Docker for end-to-end tests:
While we had a suite of unit tests to exercise the various code units within the service, we also needed to test the service end-to-end, just as any client would use it. These end-to-end tests needed to pass before a developer pushed any code to the remote GitHub repository. An obvious approach would be to bring up a local development instance of the service and run the end-to-end scripts against it. This approach, however, has one major problem: the local development instance may differ from the eventual deployment environment in the OS being used or in the version of some library or framework. Such discrepancies lead to situations where the end-to-end tests pass on the local development instance but eventually fail on the production instance.
Another approach would be to use a VM instance that mimics the production environment and run the end-to-end tests against that instance. While at first glance this seemed a fair solution, we soon realized it too had issues. The primary one was that in order to run the end-to-end tests against a remote VM, we would first need to push the code to the remote GitHub repository so that the testing VM could sync the latest code, rebuild, and redeploy before we ran our tests. This rendered our original objective of running the end-to-end tests before pushing the code moot. Also, sharing a single testing VM would mean multiple developers could not truly test their changes independently of each other.
A straightforward solution to this problem was to create one VM per developer. But this would bring provisioning problems as the team size scaled up, and creating one fully equipped VM per developer was also a relatively expensive option.
Docker to the rescue:
Docker is a containerization platform that gives you lightweight Linux containers running on top of your host OS. (The host OS is usually Linux, but can be Mac or Windows with a helper tool called Boot2Docker.) Docker containers are somewhat like virtual machines but much more lightweight: in essence they are just OS processes with complete isolation, whereas a virtual machine is a complete guest operating system in itself. The following diagram gives a good idea of the basic difference between a virtual machine and a Docker container. For more on Docker, I would suggest starting with www.Docker.com
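As a quick illustration of how lightweight this is (assuming Docker is installed on the host), the following commands pull a stock image and run a fully isolated command inside it; no guest operating system is booted:

```shell
# Pull a minimal Linux image from the public registry.
docker pull ubuntu:latest

# Run a command inside an isolated container; --rm removes it when done.
docker run --rm ubuntu:latest cat /etc/os-release

# The container is just a process: once the command exits, nothing
# heavyweight (no guest VM) is left running on the host.
docker ps
```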
We decided to use Docker containers to ship our service. This means our service would not be delivered as code that needed to be deployed on some hardware with a bunch of software dependencies. Instead, it would be delivered as the code, all of its libraries, and the underlying operating system, packaged together as a single unit of deployment, i.e. a Docker container.
We had to perform the following activities to enable the usage of Docker containers for our end to end tests:
- Configure our container: This basically involves providing instructions on how to set up our container. We need to specify things like which OS to use, which libraries to install, which code/binaries to deploy, which ports to expose, etc. All of this is specified via a single file, the Dockerfile, which is part of the project code base itself. A sample Dockerfile is illustrated below.
- Automate spinning up the container and end-to-end test execution: This was done by implementing a shell script, dockerize.sh, that builds a container from the configuration specified in the Dockerfile and runs it on the local Docker infrastructure. Note that even though our container runs on the local Docker infrastructure, it is effectively a completely isolated instance, the isolation being provided by Docker itself (which needs to be installed on each local development machine). So our container is in no way affected by the OS or any software installed on the local development machine; it runs the same way on the local machine as on the deployment instance (where a Docker infrastructure is also present). Finally, the end-to-end tests are invoked against the container that was spun up. The image below illustrates what a dockerize.sh script would look like.
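The original Dockerfile is not reproduced here, but a minimal sketch for a JVM-based HTTP service might look like the following. The base image, jar path, and port are illustrative assumptions, not the actual project values:

```dockerfile
# Illustrative Dockerfile: choose the base runtime, add the service
# binary, expose the HTTP port, and define the start command.
FROM openjdk:8-jre

# Copy the pre-built fat jar into the image (path/name is hypothetical).
COPY target/service-fat.jar /opt/service/service.jar

# Port the HTTP service listens on inside the container.
EXPOSE 8080

# Start the service when the container launches.
CMD ["java", "-jar", "/opt/service/service.jar"]
```

Because this file lives in the code base, every developer builds the exact same environment from it.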
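Similarly, a dockerize.sh along these lines would build the image from the Dockerfile, replace any previously running container, and invoke the end-to-end tests against it. The image name, container name, port, and test command are illustrative assumptions:

```shell
#!/usr/bin/env bash
set -e

IMAGE=my-service          # hypothetical image name
CONTAINER=my-service-e2e  # hypothetical container name

# Build a fresh image from the Dockerfile in the project root.
docker build -t "$IMAGE" .

# Remove any container left over from a previous run, then start a new one.
docker rm -f "$CONTAINER" 2>/dev/null || true
docker run -d --name "$CONTAINER" -p 8080:8080 "$IMAGE"

# Run the end-to-end test suite against the locally running container.
./run-e2e-tests.sh http://localhost:8080
```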
Once these scripts are in place, all a developer needs to do after making a code change is build the binaries and run the dockerize.sh script, which builds a new Docker instance for him or her. The developer can then run the end-to-end tests against that instance. That completes the automation of the dev-testing cycle. We will see how to automate the deployment to the cloud in the next section.
Using AWS Elastic Beanstalk to deploy our code to production
Our services, as mentioned earlier, were eventually deployed on the Amazon EC2 cloud. Deploying a service on EC2 typically involves setting up an EC2 VM instance, selecting the amount of memory, CPU, storage, and network options for the VM, and so on. Then we need to connect to the VM and deploy all the required software, including our binaries, to make the service usable for clients. The good news is that Amazon provides a service called AWS Elastic Beanstalk that makes it possible to automate all of these tasks. Elastic Beanstalk also integrates with Git, i.e. it can pull code from a Git repository and deploy it on the EC2 cloud. Additionally, and more importantly, Beanstalk supports deployment using Docker containers. We used all of these capabilities of Elastic Beanstalk to automate the deployment of our service to the Amazon cloud.
This was a two-step process:
- Step 1: Create an environment on EC2 using Elastic Beanstalk. The environment basically comprises the specifications of the hardware and software required to set up the VM, along with other details such as security-related setup. This can be done from the AWS Management Console.
- Step 2: Implement a shell script that builds our binaries and checks them into Git; let us call it eb-deploy.sh. In this script we use the Elastic Beanstalk command line tools to deploy to our AWS Elastic Beanstalk environment from our Git repository. Elastic Beanstalk then uses its Git integration to pull our code and binaries, uses the Dockerfile in our code base to construct our container, and deploys it to the EC2 environment created in the step above. The image below illustrates a sample eb-deploy.sh script. The script pulls the code from the Git branch “deploy”, merges from the master branch, builds a fat jar, and checks it into the deploy branch. Thereafter, it uses the Elastic Beanstalk commands eb terminate and eb create to bring down the running instance and bring up a new instance with the latest binaries.
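A sketch of such an eb-deploy.sh, following the steps described above, might look like this. The environment name and the Maven build command are illustrative assumptions; the exact jar path depends on the project's build setup:

```shell
#!/usr/bin/env bash
set -e

ENV_NAME=my-service-env   # hypothetical Elastic Beanstalk environment name

# Switch to the deploy branch and merge in the latest changes from master.
git checkout deploy
git merge master

# Build the fat jar and check it into the deploy branch.
mvn package              # assuming a Maven build; adjust for your tool
git add target/service-fat.jar
git commit -m "Deploy build $(date +%Y%m%d%H%M%S)"
git push origin deploy

# Bring down the running instance, then bring up a new one with the
# latest binaries; Beanstalk builds the container from our Dockerfile.
eb terminate "$ENV_NAME" --force
eb create "$ENV_NAME"
```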
To summarize, below is a high-level depiction of what we are doing here:
For a developer or operations person, deployment simply involves checking out the relevant Git branch and invoking the eb-deploy.sh script to deploy the service to a specific AWS Elastic Beanstalk environment, and that’s about it.
I hope this post gives you a good high-level understanding of how you can use Docker along with AWS Elastic Beanstalk to automate your code-test-deploy process and increase the overall efficiency of your team.
– Blog written by Abhay Sadadekar