At Kumina we maintain a Kubernetes setup running on Amazon EC2. For the low-level networking between containers, we make use of Calico. Calico configures all of our EC2 systems to form a mesh network. The systems in this mesh network all run an instance of the BIRD Internet Routing Daemon.
One of the problems we ran into with Calico is that it’s sometimes hard to get a holistic view of the state of the system. Calico ships with a utility called calicoctl that can be used to print the state of a single node in the mesh, but using this utility can easily become laborious as the number of EC2 instances increases.
Given that we already make strong use of Prometheus for our monitoring, we’ve solved this by writing a tool called Birdwatcher that exports the metrics generated by BIRD in Prometheus’ format. This allows us to put alerts in place for when an excessive number of changes to routes occur, or when routes simply fail to work for a prolonged period of time.
Today we’re happy to announce that Birdwatcher is now available on our company’s GitHub page. If you’re a user of both Calico and Prometheus, be sure to give it a try. Enjoy!