From Managed Services to Self-Orchestrating Micro-Services

We recently added orchestration support to our Synchro platform, and we used that support to migrate our own Synchro API server from a managed-service deployment on Microsoft Azure to a self-orchestrating micro-services deployment on Joyent Triton. We presented our experiences at the August Seattle Docker Meetup, and that presentation is available here.

We had two main goals in undertaking orchestration support:

  • Provide a turn-key self-orchestrating solution for our customers/users that works with any orchestration platform
  • Use that support ourselves to migrate our production API server from a proprietary, vendor-specific managed-service deployment to an open-standards, portable, container-based deployment

Basic Container Support

Back in February we blogged about Deploying Synchro using Docker. At that time we created a Dockerfile, which we began bundling in our installation to make it easy to package an installed, configured Synchro installation into a Docker image.

That Dockerfile looked like this:

FROM node:argon 

# Create app directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

# Bundle app source
COPY . /usr/src/app

# Install deps
RUN npm install
RUN cd synchro-apps && npm install

ENV SYNCHRO__PORT 80
EXPOSE $SYNCHRO__PORT 

CMD [ "node", "app.js" ]

Orchestration Support

When we started the process of containerizing our production API server, we quickly learned that there is a lot more to containerizing a solution than just wrapping your app in a Docker image. Our app, like many apps, runs as one part of a system of interconnected services. When running in a pure managed environment, those services are provided for you, and your interface to/from those services abstracts you from their implementation/operations. When running an orchestration of micro-services in containers, you are responsible for the entire solution (creating, configuring, and running all of the micro-service containers and connecting them to each other).

Orchestration solutions, like Docker Compose, Mesosphere Marathon, Google Kubernetes, and others, provide tools to support these complex container deployments and interactions, and guidance for modifying your application to support that tooling. But as a solution provider ourselves, we were looking for a solution that supported whichever orchestration platform our customers had chosen, with little or no further integration required.

The AutoPilot Pattern using ContainerPilot

Our friends at Joyent made us aware of a solution that they promote called The AutoPilot Pattern using ContainerPilot, which is well documented here. Note that while Joyent actively promotes and supports this solution and its components, it is not Joyent-specific (everything involved is open-source and will work on any platform that can run containers).

This solution delivers "app-centric micro-orchestration", where containers collaborate to adjust to each other automatically (both as containers scale, and as they become healthy/unhealthy), without needing any help from your orchestration system. These containers run on "auto-pilot".

All the orchestration solution / scheduler has to do is run your containers, and the containers themselves will self-organize. As a solution provider ourselves, this feature was a must-have. Our customers can deploy our AutoPilot containers, from publicly published images, with no changes, regardless of the orchestration system they use, and everything will just work. And if a customer changes from one orchestration system / scheduler to another, no changes are required to any Synchro micro-service Docker images.

Implementing Micro-Orchestration

We aren't going to cover the basic implementation of the AutoPilot Pattern using ContainerPilot, as Joyent has excellent documentation and many implementation samples. That being said, we have published our solution on GitHub and you are free to reference it and use it as desired (more details below).

For some context, here is what our production API server deployment looked like on Microsoft Azure when we started this process:

Managed Solution on Microsoft Azure

And here is what our solution looks like now, running on Joyent Triton:

Self-Orchestrating Micro-Services on Joyent Triton

Note that we will always run two or more Nginx and Synchro containers, and we will run Consul in a raft of three containers, and Redis in a cluster of three containers, so a minimal deployment will consist of ten containers, all working together (automatically). And of course we can scale out Nginx or Synchro as needed to handle load and the system will handle that automatically.

Container Best-Practices

Flexible Configuration

Getting configuration into micro-services is one of the significant challenges in these types of solutions. With Docker containers, we have the option to build configuration into the container itself using environment variables or bundled files, or we can inject configuration at runtime using environment variables or Docker volumes. There are pros and cons to each of these mechanisms. As a solution provider, we want our own solution to be non-opinionated, which is a fancy way of saying that we want to support any of these mechanisms (including combinations).

To support the broadest range of configuration options, we recommend the following:

  • Support configuration via environment variables where possible - Environment variables can be set at build-time or injected at run-time, so support them as broadly as possible.
  • Any file-based configuration should have a configurable local path with a reasonable default - This is particularly important if you intend to inject configuration using Docker volumes. Having a configurable local file path allows you to map volumes to locations where they will not interfere with other configuration elements or potentially leak other local files from the container to the volume unintentionally.
  • Any file-based configuration should be able to get its contents from a base64-encoded environment variable - It can be tricky to pass binary contents via environment variables, and even for multi-line text files, the differences in support between the Docker command line, the Docker env file, and Docker Compose make this more complex than it needs to be. Supporting base64-encoded environment variables is easy to do and works in all modalities of environment configuration (whether build-time or run-time).

ContainerPilot makes it very easy to do the above, and to support flexible configuration in general, with its preStart entrypoint. Following is an example of the ContainerPilot preStart in our Nginx container:

#!/bin/sh
preStart() 
{
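  # Default location for the certificate; override with SSL_CERTS_PATH
  : "${SSL_CERTS_PATH:=/etc/ssl/certs/ssl.crt}"
  if [ -n "$SSL_CERTS_BASE64" ]; then
    # Quote expansions so the value and path survive word splitting
    printf '%s' "$SSL_CERTS_BASE64" | base64 -d > "$SSL_CERTS_PATH"
  fi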

  # >>> Process ssl.key, any other files as above

  # Rest of preStart logic (consul-template, etc)
}

With this kind of configuration, we can do any of the following:

  • Copy the ssl.crt file into our image at the default location with no environment variables (build-time)
  • Populate the ssl.crt file from an environment variable (build-time or run-time)
  • Supply the ssl.crt file from a Docker volume to a custom location (run-time)
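The same precedence (configurable path with a default, optionally populated from a base64-encoded variable) can also be applied inside a Node.js app rather than in a shell hook. The following is a minimal sketch under that assumption, not code from the Synchro solution; the helper name materializeConfigFile is hypothetical, and the variable names in the usage comment simply mirror the preStart script above.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: resolve where a config file lives and, if a
// base64-encoded environment variable is present, write its decoded
// contents there. Returns the path the app should read from.
function materializeConfigFile(env, pathVar, base64Var, defaultPath) {
  const target = env[pathVar] || defaultPath;
  const encoded = env[base64Var];
  if (encoded) {
    fs.mkdirSync(path.dirname(target), { recursive: true });
    // Buffer.from(..., 'base64') handles binary content as well as text.
    fs.writeFileSync(target, Buffer.from(encoded, 'base64'));
  }
  return target;
}

// Example call (names mirror the preStart script above):
// const certPath = materializeConfigFile(process.env,
//   'SSL_CERTS_PATH', 'SSL_CERTS_BASE64', '/etc/ssl/certs/ssl.crt');
```

When the base64 variable is absent, the helper leaves whatever is already at the path untouched, so a file copied into the image or mounted from a volume keeps working.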

Embrace the Proxy

Solutions consisting of micro-services will typically require a proxy server. We happen to use Nginx as our proxy. It is not unusual for comparable implementations to use HAProxy.

While managed-service solutions provide much of this functionality for you, micro-service solutions often require you to do some or all of the following yourself:

Load distribution

The primary function of the proxy is to distribute traffic among your services. ContainerPilot, through its use of Consul and consul-template, more or less handles this part for you.

If you have session affinity requirements (as Synchro does), make sure your proxy configuration accommodates them. With Nginx we use ip_hash for session affinity.

SSL termination

Another significant function of the proxy in micro-service orchestrations is to terminate SSL connections. This requires basic SSL/TLS configuration, as well as injection of SSL credentials into the proxy container. While this can be a cumbersome process, it does allow you to tune your SSL implementation for performance and security.

When you have configured SSL successfully, you should run an external validation test like the one from SSL Labs.

You will also likely want to forward non-SSL traffic to your SSL endpoint.

Static file caching

In our managed services deployment, we served all static files from a CDN (Content Distribution Network). Because we did not want to rely on target environments having CDN support, we instead implemented static file caching in Nginx (with support from Synchro). We found the performance to be comparable to a CDN, and the deployment/maintenance complexity to be much lower with this approach.

WebSocket support

Not all proxies that support HTTP traffic will support WebSockets out of the box. Nginx in particular requires specific configuration to support WebSockets.

Proxy Config in Practice

Our Nginx configuration template does all of the above. See: Synchro AutoPilot nginx.conf.ctmpl on GitHub

Prepare your App

Your application may require some modification in order to be well-behaved in a ContainerPilot deployment:

Health check

You should implement a health check if your app does not already have one. We recommend that the health check endpoint be configurable in your app. We also recommend that you take care not to expose the health check endpoint unintentionally (your app can check whether the connecting host is the local machine, or you can have your proxy deny access to the health check from external clients, as appropriate).
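As an illustration of the first option (checking that the caller is local), here is a minimal Node.js sketch. It is not taken from the Synchro code base: the HEALTH_CHECK_PATH variable, the /health default, and the function names are all assumptions made for the example, and it assumes the health check is issued from inside the same container.

```javascript
const http = require('http');

// Hypothetical setting name; the path defaults to /health if unset.
const HEALTH_PATH = process.env.HEALTH_CHECK_PATH || '/health';

// True when the peer address is a loopback address (IPv4, IPv6, or
// IPv4-mapped IPv6), i.e. the request came from the local machine.
function isLoopback(address) {
  return address === '::1' ||
    /^127\./.test(address || '') ||
    /^::ffff:127\./.test(address || '');
}

// Answers the health check for local callers only; returns false when
// the request is not a health check so the app can handle it normally.
function handleHealthCheck(req, res) {
  if (req.url !== HEALTH_PATH) return false;
  const local = isLoopback(req.socket.remoteAddress);
  res.statusCode = local ? 200 : 404;
  res.end(local ? 'OK' : 'Not found');
  return true;
}

// Wiring it into a server (calling listen() is left to the app):
const server = http.createServer((req, res) => {
  if (!handleHealthCheck(req, res)) {
    res.statusCode = 200;
    res.end('app response');
  }
});
```

Returning 404 rather than 403 to non-local callers is a design choice in this sketch: it avoids revealing that the endpoint exists at all.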

Signal handling

Your app should implement signal handling in order to be well behaved (and responsive) in an orchestrated/scheduled environment. The Node.js default handlers, in particular, are not well-behaved. You should handle:

  • SIGINT - Shutdown
  • SIGHUP - Reconfiguration (if your app can reconfigure itself, such as may be required by ContainerPilot if your app uses resources that may change over its execution)
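A minimal Node.js sketch of the two handlers listed above follows. The helper name installSignalHandlers and the shutdown/reload callbacks are hypothetical, and what each callback actually does is left to the app. The sketch also registers SIGTERM, which is not in the list above but is the default signal sent by docker stop.

```javascript
// Hypothetical helper: `shutdown` and `reload` are callbacks supplied
// by the app (for example, close the HTTP server / re-read config).
function installSignalHandlers({ shutdown, reload }) {
  // SIGINT per the list above; SIGTERM is an addition in this sketch
  // because it is the default signal sent by `docker stop`.
  for (const signal of ['SIGINT', 'SIGTERM']) {
    process.on(signal, () => shutdown(signal));
  }
  process.on('SIGHUP', () => reload());
}

// Example wiring for an http.Server named `server` (assumed to exist):
// installSignalHandlers({
//   shutdown: () => server.close(() => process.exit(0)),
//   reload: () => loadConfig(),   // loadConfig is hypothetical
// });
```

Once a listener is registered for a signal, Node.js no longer applies its default behavior for that signal, so the shutdown callback is responsible for actually exiting the process.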

Static file cache support

If your app serves static files, it probably already has a caching support implementation in place. Since our app previously relied on a CDN, we had to add static file caching support (our proxy now handles the cache, but our app still had to write the appropriate cache headers).
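To make "write the appropriate cache headers" concrete, here is a small Node.js sketch of choosing a Cache-Control value per request. It illustrates the general technique and is not Synchro's implementation; the extension list, the one-hour default, and the function name cacheControlFor are assumptions.

```javascript
const path = require('path');

// Assumed policy for illustration: these extensions are treated as
// cacheable static assets; everything else is marked no-cache.
const STATIC_EXTENSIONS = new Set(['.js', '.css', '.png', '.jpg', '.svg', '.woff']);

// Returns the Cache-Control value for a request path. maxAgeSeconds is
// how long a shared cache (such as the proxy) may reuse the response.
function cacheControlFor(requestPath, maxAgeSeconds = 3600) {
  const withoutQuery = requestPath.split('?')[0];
  const ext = path.extname(withoutQuery).toLowerCase();
  return STATIC_EXTENSIONS.has(ext)
    ? `public, max-age=${maxAgeSeconds}`
    : 'no-cache';
}

// In a request handler:
// res.setHeader('Cache-Control', cacheControlFor(req.url));
```

The proxy can then decide what to cache from these headers instead of needing its own list of static paths.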

The Synchro AutoPilot Solution

Our AutoPilot implementation is available on GitHub. You are free to review it and use it as desired. We have also published the images from our solution on DockerHub. Note that our own production API servers use these exact images. If you are deploying Synchro in a container environment, we recommend that you use these same images.

Comparing Solutions: Azure Managed Service vs Triton Orchestrated Containers

Cost

We were initially concerned with the cost of running all of the containers that our new solution required. Below we show the cost of our previous managed solution versus the cost to run our new orchestrated container solution on Joyent Triton:

You can see that even with running 12 containers in our baseline/minimal containerized configuration, our Triton cost is less than 50% of our previous Azure solution that only had a single managed app container.

That being said, the pricing is not apples-to-apples. With Azure, there are many arbitrary limitations put on your instances that force you to buy capacity that you don't need. For example, we did not need nearly as much memory or processor as the "S1 Standard" image provides on Azure, but the smaller images that would have been more appropriate had arbitrary limitations that made them unusable (poor networking, a maximum number of instances that prevented scaling, etc).

So part of the cost improvement is related to being able to pay for exactly the right size container on Triton (they provide a very wide variety of container profiles with no arbitrary limitations). But part of it is that Triton is just less expensive (the Triton "g4-general-4G" is comparable to the Azure "S1 Standard" except that it has more than twice as much RAM, and it's only 65% of the cost).

Performance

We have not done any detailed end-to-end performance testing. We did do some initial smoke testing, and we profiled some specific areas of concern (our app is very dependent on low latency, and on fast intra-solution transactions between the app and Redis). For everything that we did test, we saw marked improvement in performance. And our app "feels" faster, for whatever that's worth.

We can say with some confidence that the performance didn't get worse, and particularly with the cost improvement, we would have been happy with that.

Service and Support

In using Azure for the past year plus, we have suffered from being a small customer dealing with a giant service provider. We didn't run into too many issues, but when we did, support was non-existent. The model is essentially that to get reasonable support you either have to be a very large customer, or you have to subscribe and pay a non-trivial amount for the right to access support. This was not much different with our use of Amazon AWS services in the past (except that Amazon had much better self-support and peer-support, so it was easier to solve your own problems if you were willing to put in the work).

Our experience with Joyent has been a breath of fresh air. We had a couple of operational issues with our account provisioning and the support tickets were handled almost immediately with a high level of professionalism/competence. We opened a couple of GitHub issues against ContainerPilot and got thoughtful feedback from the maintainer (a Joyent employee). We even asked in advance if we could get a technical backstop in case we ran into porting issues, and an SE was assigned to us (and they knew in advance that we weren't going to be spending a lot of money with them in the near term). Joyent just seemed top-to-bottom to be committed to making sure we were successful.

Operations

Environmental: A major, unexpected benefit of moving to an orchestrated container solution is that we can now simulate our production environment in development with a high degree of fidelity, and we can directly use our production configuration and state from staging environments before deployment. For example, we can set up a deployment environment that points at the Joyent Docker daemon for production, and set up the exact same environment (same directory, same environment variables, etc) pointing to a local Docker daemon, allowing us to create the exact same Docker environment for local/dev or remote/prod deployment. This is something we couldn't ever really do with any confidence on Azure.

Portability: Because ContainerPilot is open source and will run anywhere containers will run, and because all of the elements of our solution now run in containers, we could port this solution to another provider very quickly and easily (it took two weeks to port from Azure to Joyent, but we're confident we could redeploy our current solution on another provider in a matter of hours at most). We are currently using Docker Compose as our scheduler with Triton, but again, we could change to a different orchestration solution / scheduler, even on a different cloud provider, very easily and with a high degree of confidence.

Summary

We are very happy with our new AutoPilot implementation of Synchro and its components. It has helped us consolidate our application state and move to a production environment that is a much better fit for us (in terms of cost, performance, tooling, environmental fidelity, and other factors).

We are confident that our AutoPilot support will make it easier for customers deploying Synchro to integrate it into their operating environments, regardless of their cloud provider or orchestration solution. And it means that we are now able to offer a turn-key solution instead of just an "integrate it yourself" Docker container.

More to Come

If you saw our Docker Meetup presentation, you know that our actual deployment uses an additional configuration processing container called StashBox. We didn't want to confuse the story of our moving to AutoPilot/ContainerPilot and to Joyent, so we decided to leave that out of this blog entry. Our next blog entry focuses on StashBox (how we use it, and how it pairs with ContainerPilot). See: http://blog.synchro.io/stashbox/