
HAProxy Service Discovery and Load Balancing

Introduction

I use AWS ECS to run a number of web-facing services, including my web landing page https://www.richardjameskendall.com (although this sits behind CloudFront).

I use a spot fleet for my ECS instances, which helps minimise cost (you can see my Terraform module to deploy a cluster here).

As part of building this I wanted a way to automatically expose my services without building new Load Balancers for each one and without needing to add rules to an existing Load Balancer.

I decided to use HAProxy with a small add-on that queries the AWS CloudMap Service Discovery API and builds a config from the results of those queries. This is available on Docker Hub here and on GitHub here.

The Setup

The setup is shown at a high-level in the diagram below. In summary:

  • HAProxy runs as a service on the ECS cluster
  • The service is linked to an Application Load Balancer (ALB)
  • This ALB does SSL offload and forwards requests to the instances of HAProxy
  • HAProxy matches the hostname in the requests (based on rules built from the Service Discovery API)
  • The requests are forwarded to the Application Service instances based on the hostname matched

[Figure: HAProxy architecture diagram]
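To make the hostname matching more concrete, here is a minimal sketch of how a set of discovered services could be turned into HAProxy routing rules. It is written in Python purely for illustration (the actual config builder is a Node.js program), and the names `render_haproxy_config`, `backend_name` and `http_in`, the timeouts and the example addresses are all assumptions rather than what the published image generates.

```python
from typing import Dict, List


def backend_name(hostname: str) -> str:
    """Turn a hostname into an identifier that is safe to use as a backend name."""
    return "be_" + "".join(c if c.isalnum() else "_" for c in hostname.lower())


def render_haproxy_config(routes: Dict[str, List[str]], listen_port: int = 80) -> str:
    """Render one Host-header ACL per hostname, plus a backend for each.

    `routes` maps a hostname to the "ip:port" addresses of its instances.
    The layout is hypothetical and only illustrates hostname-based routing.
    """
    lines = [
        "defaults",
        "    mode http",
        "    timeout connect 5s",
        "    timeout client 30s",
        "    timeout server 30s",
        "",
        "frontend http_in",
        f"    bind *:{listen_port}",
    ]
    # Sort everything so the same input always produces the same text.
    for hostname in sorted(routes):
        name = backend_name(hostname)
        lines.append(f"    acl host_{name} hdr(host) -i {hostname}")
        lines.append(f"    use_backend {name} if host_{name}")
    for hostname in sorted(routes):
        lines += ["", f"backend {backend_name(hostname)}"]
        for index, address in enumerate(sorted(routes[hostname])):
            lines.append(f"    server srv{index} {address} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_haproxy_config({
        "www.example.com": ["10.0.1.12:8080", "10.0.2.7:8080"],
        "api.example.com": ["10.0.1.30:3000"],
    }))
```

Sorting the hostnames and addresses keeps the output stable between runs, which matters if the generated text is later compared against the previous config to decide whether a reload is needed.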

HAProxy

HAProxy is a free, high-performance, high-availability load balancer for HTTP and TCP. It is written in C and has a reputation for being very conservative with CPU and memory, even under high load.

AWS CloudMap Service Discovery

CloudMap is a cloud resource discovery service that lets services and their instances be registered and then queried by other applications. It integrates with both DNS and ECS.

Config Builder

This is a Node.js program which runs in the same container as HAProxy. At a configurable interval it queries the service registry and builds an HAProxy config from the results. When it detects that the config has changed, it reloads HAProxy (SIGHUP) so that the new config is picked up.

By default it refreshes the config every 30 seconds, but it only reloads HAProxy if a change is detected.
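The refresh-then-reload-only-on-change behaviour can be sketched roughly as follows. This is an illustrative Python version, not the Node.js code from the repository; `apply_if_changed`, `sighup_reloader` and `run` are hypothetical names, and comparing a SHA-256 digest of the rendered text is just one reasonable way to detect a change.

```python
import hashlib
import os
import signal
import time
from typing import Callable, Optional


def config_digest(config_text: str) -> str:
    """Fingerprint the rendered config so two versions can be compared cheaply."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def apply_if_changed(
    new_config: str,
    last_digest: Optional[str],
    config_path: str,
    reload: Callable[[], None],
) -> str:
    """Write the config and call `reload` only when the content has changed.

    Returns the digest of `new_config` so the caller can pass it back in next time.
    """
    digest = config_digest(new_config)
    if digest == last_digest:
        return digest  # nothing changed, leave HAProxy alone
    tmp_path = config_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(new_config)
    os.replace(tmp_path, config_path)  # swap the file in one step
    reload()
    return digest


def sighup_reloader(pid: int) -> Callable[[], None]:
    """Build a reload callback that sends SIGHUP to the given process."""
    def reload() -> None:
        os.kill(pid, signal.SIGHUP)
    return reload


def run(
    build_config: Callable[[], str],
    config_path: str,
    haproxy_pid: int,
    interval_seconds: float = 30.0,
) -> None:
    """Rebuild the config on a fixed interval (30 seconds by default)."""
    last_digest: Optional[str] = None
    reload = sighup_reloader(haproxy_pid)
    while True:
        last_digest = apply_if_changed(build_config(), last_digest, config_path, reload)
        time.sleep(interval_seconds)
```

Passing the reload step in as a callback keeps the change detection testable without a running HAProxy process. Note that in this sketch the very first pass always writes the file and triggers a reload, because there is no previous digest to compare against.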

The following IAM permissions are needed to allow the CloudMap queries to work. I recommend using IAM Roles for your tasks, as documented here.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ServiceDiscoveryStmt",
      "Action": [
        "servicediscovery:ListInstances",
        "servicediscovery:ListNamespaces",
        "servicediscovery:ListServices"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
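The three permissions map onto the three API calls needed to walk the registry: namespaces, then the services in each namespace, then the instances of each service. The Python sketch below shows that walk against an object that behaves like `boto3.client("servicediscovery")`. It is not the Node.js code from the repository; `discover_routes` is a hypothetical name, and deriving the hostname as `<service name>.<namespace name>` is an assumption made for the example, since the real builder may derive hostnames differently.

```python
from typing import Any, Dict, List


def discover_routes(client: Any) -> Dict[str, List[str]]:
    """Walk namespaces -> services -> instances and collect "ip:port" addresses.

    `client` is expected to behave like boto3.client("servicediscovery").
    Assumption: each service is exposed as "<service name>.<namespace name>".
    Pagination (NextToken) is ignored to keep the sketch short.
    """
    routes: Dict[str, List[str]] = {}
    for namespace in client.list_namespaces()["Namespaces"]:
        services = client.list_services(
            Filters=[{
                "Name": "NAMESPACE_ID",
                "Values": [namespace["Id"]],
                "Condition": "EQ",
            }]
        )["Services"]
        for service in services:
            hostname = f"{service['Name']}.{namespace['Name']}"
            instances = client.list_instances(ServiceId=service["Id"])["Instances"]
            for instance in instances:
                attributes = instance.get("Attributes", {})
                ip = attributes.get("AWS_INSTANCE_IPV4")
                port = attributes.get("AWS_INSTANCE_PORT")
                # Skip instances that do not advertise both an address and a port.
                if ip and port:
                    routes.setdefault(hostname, []).append(f"{ip}:{port}")
    return routes
```

With boto3 installed and a task role carrying the policy above, this could be called as `discover_routes(boto3.client("servicediscovery"))`. A real implementation would also need to follow `NextToken` when a response is paginated.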

Deploying

I use Terraform modules to deploy most things on AWS. I have modules which build the various dependencies including:

  • the ECS cluster
  • dependencies for the cluster including security groups and IAM roles
  • the Service Registry instance

I've also created modules for deploying services on the ECS cluster, and a variant of that module which specifically deploys the HAProxy service and creates the associated Application Load Balancer. You can find the ecs-haproxy module here.

What else it can do

The configuration builder script also enables the following features:

  • HAProxy Stats Page (with a username and password)
  • Prometheus Metrics Endpoint (with a username and password)

You can see the stats page by appending /stats to the address of any site hosted on your cluster, e.g. www.example.com/stats

You can get more details about the HAProxy Stats Page here, and I will write another article about monitoring HAProxy with Prometheus and Grafana.

Example

See my gist here for an example of how to deploy the Terraform module using Terragrunt.

-- Richard, Feb 2020