Working with Hostnames in a Dynamically Scaled Environment

… or how Chef and Route53 helped humanize machine names.

Working with servers in a virtualized environment is a funny thing. They all have names like i-01234567, ec2-123-45-67-89.compute-1.amazonaws.com, and so on. And that works! Servers seem just as happy if you tell them their memcache server is at 10.2.3.4 as if you had said it’s at memcache3.parse.com. Sadly, we poor humans working with these machines are not as flexible. I have no idea if the machine at i-abcd9876 is running memcache or nginx, or just sitting there imagining rainbows. I need some help figuring out who’s doing what.

We use Chef here at Parse to manage what software runs on each of our machines, so adding a cookbook and a recipe to every machine is something that’s pretty easy to do. We were able to take advantage of a combination of EC2 tagging and the Chef Route53 cookbook to make a small change that made a world of difference to how easy it is to keep track of all of our servers.

We launch new servers in one of two ways: by hand or through an AWS auto-scaling group. When we launch an instance by hand, we get to assign its node name; when an auto-scaling group launches one, the instance has to figure out its own name. Our solution for giving hosts human-usable names must accommodate both scenarios. The first is easier, so let’s start there.

This code snippet first loads up the route53 cookbook and our AWS credentials. It then figures out the fully qualified domain name for our server (using a special subdomain indicated by the int_domain attribute to keep our publicly accessible names like www.parse.com safe from accidents), and creates the record in Route53. Nice and simple, right?

# register the dns name in route53 based on the hostname
include_recipe "route53"
aws_creds = Chef::EncryptedDataBagItem.load("passwords", "aws")
full_nodename = "#{node.name}.#{node[:route53][:int_domain]}"
route53_record "create CNAME record" do
  name                   full_nodename
  value                  node[:ec2][:public_hostname]
  type                   "CNAME"
  zone_id                node[:route53][:zone_id]
  aws_access_key_id      aws_creds["aws_access_key_id"]
  aws_secret_access_key  aws_creds["aws_secret_access_key"]
  ttl                    600
  action :create
end

Chef recipes are supposed to be idempotent, meaning that you can run one multiple times and get the same result. Here, that means that if the correct CNAME record already exists for this host, we shouldn’t do anything. The route53 cookbook’s route53_record LWRP takes care of this for us with a simple check:

record = zone.records.get(name, type)
if record.nil?
  create
  Chef::Log.info "Record created: #{name}"
elsif value != record.value.first
  record.destroy
  create
  Chef::Log.info "Record modified: #{name}"
end

Try to retrieve the record. If it doesn’t exist, create it. If it’s different from what it should be, fix it (this happens when you stop and start an instance and its IP address, and with it its public hostname, changes). If it already exists and is correct, do nothing.
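The three cases can be exercised without touching AWS. The sketch below is only an illustration: ensure_record, the Hash standing in for the Route53 zone, and the example.com names are all hypothetical and are not part of the route53 cookbook.

```ruby
# Hypothetical stand-in for a Route53 zone: a Hash of name => value.
# ensure_record is an illustrative helper, not the cookbook's code.
def ensure_record(zone, name, value)
  current = zone[name]
  if current.nil?
    zone[name] = value   # record missing: create it
    :created
  elsif current != value
    zone[name] = value   # record stale: replace it
    :modified
  else
    :unchanged           # record already correct: do nothing
  end
end

zone   = {}
host   = "memcache3.int.example.com"
first  = ensure_record(zone, host, "ec2-1-2-3-4.compute-1.amazonaws.com")
second = ensure_record(zone, host, "ec2-1-2-3-4.compute-1.amazonaws.com")
third  = ensure_record(zone, host, "ec2-5-6-7-8.compute-1.amazonaws.com")
puts [first, second, third].inspect
```

Running the same call twice leaves the zone untouched the second time, which is the property that makes it safe to run on every Chef run.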

Now that we have manual host naming under control, managing auto-scaling groups becomes simple; all we need to do is figure out an alternate source for node.name. In the AMI we give the auto-scaling group, we include a record of the role with which the AMI was created. This role is the prefix for the hostname we want assigned. That prefix is used by the following snippet as an alternate source for node.name.

zone = r53.zones.get(zone_id)
records = zone.records.all!
# collect the numeric suffix of every record that starts with prefix
pattern = /\A#{Regexp.escape(prefix)}([0-9]+)/
nums = records.map { |r| r.name[pattern, 1] }.compact.map(&:to_i)
# take the highest number in use (0 if there are none) and increment;
# compare as integers, since sorted names put "web10" before "web9"
incr = (nums.max || 0) + 1
puts "#{prefix}#{incr}"
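The numbering step can be tried without a Route53 connection. This is a minimal sketch, assuming the record names have already been fetched into a plain array; next_hostname and the example.com names are hypothetical. It compares suffixes as integers, which matters once a role has ten or more hosts, because in a plain string sort "memcache10" comes before "memcache9".

```ruby
# Hypothetical helper: given existing record names and a role prefix,
# return the next unused hostname for that prefix.
def next_hostname(record_names, prefix)
  pattern = /\A#{Regexp.escape(prefix)}(\d+)/
  # Pull the numeric suffix out of each matching name, as integers.
  numbers = record_names.map { |n| n[pattern, 1] }.compact.map(&:to_i)
  # Highest suffix in use (0 when the prefix is new), plus one.
  "#{prefix}#{(numbers.max || 0) + 1}"
end

names = [
  "memcache10.int.example.com.",
  "memcache9.int.example.com.",
  "nginx2.int.example.com."
]
puts next_hostname(names, "memcache")
puts next_hostname(names, "redis")
```

A prefix with no existing records starts at 1, so the first host launched for a new role needs no special handling.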

With these recipes and the route53 cookbook in place, Chef verifies and corrects the DNS entries for all of our servers automatically with every run, ensuring that we can safely use human-readable host names to reference all of our machines.

Ben Hartshorne
February 28, 2013