Hashi-ing things out: Consul

The quest to find a replacement for Hashicorp products, part 1: Consul.

We can’t have nice things

Hashicorp recently changed the licensing for all of their products. The new Business Source License isn’t open-source, which made me think about my usage of their products. With this change, I’ve decided it is time to move on, as it’s unclear where those products are heading. Furthermore, some distributions like NixOS have already marked the first versions released after the license change as non-free.

Since I am currently using Consul, Nomad and Vault, I have to find alternatives for each of them.

Since then, I’ve put together a replacement for Consul. Let’s look at it.

The quest

The main goals are rather simple yet specific:

  • Be supported as a configuration provider for Traefik
  • Be a cluster so multiple nodes can use it.
  • Built-in support for Master/Slave constructs would be nice, so I can build on top of it.
  • Good API and/or SDK.
  • Supported as a NixOS module to quickly deploy/maintain would be a massive bonus.

Current usage

Most of my usage of Consul is for service discovery through the catalog service, and pretty much all of that usage goes through Traefik. I run 2 groups of Traefik instances, each under a different “prefix”. The prefix is the leading part of every configuration entry, which is how multiple Traefik instances sharing the same configuration providers (Consul, KV stores, Kubernetes…) are told apart. One group is for the public (using the public prefix) and the other is for private usage with an internal DNS zone (using the default traefik prefix).

The alternatives

Looking at what Traefik supports gave a rather short list of choices:

  • Docker: This seems to only work on the same instance as it reads the docker labels. Useful but doesn’t fit the requirements.
  • Consul K/V: Not an option since we are trying to get rid of Consul.
  • BoltDB: Cool idea but doesn’t really work in a multiple instance cluster (unless NFS can play ball).
  • Zookeeper: Seems hard to configure and is Java based, so expect lots of memory usage (it’s not just me, it’s in the manual!).
  • ECS: Not running on AWS.
  • Etcd: Winner winner, chicken dinner.
  • Consul Catalog: Currently what I am using.
  • Rancher: More Kubernetes, not something I want to deploy.
  • Marathon: Cool project but seems abandoned?
  • Kubernetes: Killing a fly with a nuke.

The clear winner here is Etcd. Not only is it a stable clustered key/value store, most notably used to store the state of Kubernetes clusters, it is also a graduated project of the Cloud Native Computing Foundation. Being in the CNCF should mean we won’t see another rug pull, which is very comforting.

Etcd?

The official web site describes Etcd as “a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. It gracefully handles leader elections during network partitions and can tolerate machine failure, even in the leader node”.

Apples to bananas

Comparing Consul to Etcd isn’t fair as they don’t aim to do the same thing. They do overlap in some areas: both offer a key/value store, and both expose primitives through their API to build software on top of them (leader election, among other things).

Not my first rodeo

It’s not the first time I’ve used Etcd. Before I switched everything to Consul and tried to sell it to pretty much everyone, whether they wished to hear about it or not, I built something called Overseer that was used to configure services built in Go. The software would use the library to fetch its configuration (a struct stored in JSON format). I ended up abandoning this project because I wasn’t super pleased with how rigid the configuration had to be. I would like to revisit it someday, as I think it would have some merit in an environment where DNS is heavily used and finding a host by name would be simple.

What now?

At this point I know I’m going to use Etcd, and I know I’m going to have to write some software that writes the right keys in Etcd for Traefik to see the routers, middlewares and backends.
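To give a rough idea of what “the right keys” looks like, here is a minimal sketch in Go that builds the key/value pairs for one HTTP router and its load-balanced service, following the key layout documented for Traefik’s KV providers. It is only an illustration: the function name traefikKeys, the blog service and the example host and backend addresses are made up, and actually writing the pairs to Etcd is left out so the example stays self-contained.

```go
package main

import (
	"fmt"
	"sort"
)

// traefikKeys returns the key/value pairs describing one HTTP router and the
// service it points to. prefix is the root key the Traefik instance watches
// ("traefik" by default). The function name and parameters are hypothetical.
func traefikKeys(prefix, name, rule string, backends []string) map[string]string {
	kv := map[string]string{
		// The router matches requests with the rule and forwards them to the
		// service of the same name.
		fmt.Sprintf("%s/http/routers/%s/rule", prefix, name):    rule,
		fmt.Sprintf("%s/http/routers/%s/service", prefix, name): name,
	}
	// Each backend becomes one numbered server entry of the load balancer.
	for i, url := range backends {
		key := fmt.Sprintf("%s/http/services/%s/loadBalancer/servers/%d/url", prefix, name, i)
		kv[key] = url
	}
	return kv
}

func main() {
	// Example values only: a "public" prefix and two made-up backends.
	kv := traefikKeys("public", "blog", "Host(`blog.example.org`)",
		[]string{"http://10.0.0.10:8080", "http://10.0.0.11:8080"})

	// Print in sorted order so the output is stable.
	keys := make([]string, 0, len(kv))
	for k := range kv {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s = %s\n", k, kv[k])
	}
}
```

Using a different prefix per group of Traefik instances (public and traefik in my case) keeps each group’s routers separate inside the same Etcd cluster.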

What I came up with is Traffikey. I’ll cover my plans for it and what I have already implemented. This is what I’ve been using and what is powering the routing for this website.