Posts Tagged ‘trending’

The Collectd encrypted packet format

Friday, March 21st, 2014

Yesterday, Logstash 1.4.0 was released containing many improvements, one of which was contributed by us. We’ve implemented signature verification and packet decryption in the collectd input plugin. This blog post gives an overview of how encryption and signing are used in the collectd binary protocol.

We’re currently deploying a Logstash infrastructure that will eventually extend our monitoring and trending capabilities. At the same time, we want to move from pull-based trending (Munin) to push-based trending (Collectd). Logstash recently added a Collectd input plugin, but it didn’t support decryption and signature verification of collectd packets. Since we send some of this data over the public internet, we need to encrypt this traffic, so we decided to implement this ourselves.

During implementation, we discovered that the documentation was scarce and the comments in the collectd source code appeared incomplete. This post describes the collectd signed and encrypted packet formats. It assumes that you’re familiar with the collectd binary protocol.
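As a rough illustration of the signing half, here is a minimal Python sketch. It assumes the layout from the collectd binary protocol documentation as we understand it, which is not spelled out in this excerpt: a signature part with type 0x0200, a 16-bit length, a 32-byte HMAC-SHA-256 digest and the username, where the HMAC is keyed with the user's password and computed over the username followed by the rest of the packet. The function names and the `passwords` mapping are made up for this example.

```python
import hashlib
import hmac
import struct

TYPE_SIGN_SHA256 = 0x0200  # assumed part type of the signature part
HASH_LEN = 32              # size of an HMAC-SHA-256 digest
HEADER_LEN = 4             # 16-bit type + 16-bit length, network byte order


def sign_packet(payload: bytes, username: bytes, password: bytes) -> bytes:
    """Prepend a signature part to an already encoded payload."""
    digest = hmac.new(password, username + payload, hashlib.sha256).digest()
    length = HEADER_LEN + HASH_LEN + len(username)
    header = struct.pack("!HH", TYPE_SIGN_SHA256, length)
    return header + digest + username + payload


def verify_packet(packet: bytes, passwords: dict) -> bytes:
    """Return the payload if the signature is valid, else raise ValueError."""
    if len(packet) < HEADER_LEN + HASH_LEN:
        raise ValueError("packet too short")
    part_type, length = struct.unpack("!HH", packet[:HEADER_LEN])
    if part_type != TYPE_SIGN_SHA256:
        raise ValueError("not a signed packet")
    if not (HEADER_LEN + HASH_LEN <= length <= len(packet)):
        raise ValueError("bad signature part length")
    digest = packet[HEADER_LEN:HEADER_LEN + HASH_LEN]
    username = packet[HEADER_LEN + HASH_LEN:length]
    payload = packet[length:]
    password = passwords.get(username)
    if password is None:
        raise ValueError("unknown user")
    expected = hmac.new(password, username + payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(digest, expected):
        raise ValueError("bad signature")
    return payload
```

Decryption is not shown here, since it needs a cipher library outside the Python standard library.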


Automated monitoring? Easy!

Tuesday, February 11th, 2014

One of the things we take very seriously here at Kumina is our monitoring. We’ve always done so, but even we must admit that during the early years, we sometimes forgot to include all possible checks for a new service or host. And it sucks when you forget to set up the monitoring for a specific item, because you generally only find out about it when it’s already down…

We like to check as much as possible (if not everything). For example, we check if a service is up and running, we check if a vhost is returning the expected response, if an SSL certificate is still valid or if it will expire within 30 days, we check if OpenVPN certificates are close to expiration and if all loaded Apache modules actually come from a Debian package. And we check often, generally every 30 seconds, but we would prefer to do it even more often. However, these are not things you want to configure manually over and over again.
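To give an idea of what one of these checks boils down to, here is a minimal Python sketch of the "expires within 30 days" test for a certificate. It is not our actual plugin: the function name and messages are invented for this example, and it only classifies an expiry date that has already been read from a certificate. The return codes follow the usual Nagios/Icinga plugin convention (0 is OK, 1 is WARNING, 2 is CRITICAL).

```python
import ssl

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios/Icinga plugin exit codes


def check_cert_expiry(not_after: str, now: float, warn_days: int = 30):
    """Classify a certificate by its notAfter date.

    not_after uses the textual form found in certificates, for example
    "Mar 21 12:00:00 2014 GMT"; now is a Unix timestamp.
    Returns a (status, message) tuple.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    days_left = (expires - now) / 86400.0
    if days_left < 0:
        return CRITICAL, "certificate expired %d days ago" % -days_left
    if days_left < warn_days:
        return WARNING, "certificate expires in %d days" % days_left
    return OK, "certificate valid for another %d days" % days_left
```

A real check would obtain `not_after` from the live service and pass `time.time()` as `now`; keeping those outside the function makes the logic easy to test.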

Automate everything

We’re using Icinga in two datacenters in failover mode: the second node takes over if the primary is unreachable. We currently monitor 319 hosts (including some failover virtual hosts) with a grand total of nearly 10000 checks, although this fluctuates daily, since most changes on a server also add or remove checks. It is all done automatically. This prevents us from forgetting to set up monitoring for a specific item or host and also allows us to quickly deploy new checks on the entire infrastructure. Consider the Fokirtor check we created last year: it’s very easy for us to simply deploy it on all those machines.

Using the tools at hand

We’re currently pretty heavy Puppet users, so we leverage the infrastructure we already have in place for that.

Since a Puppet agent runs on our monitoring hosts every few hours, it’ll deploy new configuration a few times per day. It’s not exactly continuous delivery, but close enough for our needs for now. Equally important, it removes checks we no longer need. For instance, if we’ve created a redirect that is later changed into a full-fledged site, the check is automatically changed to no longer expect a 301 response but a 200 with a correct string (which we provide, of course; it’s not that automated).

We started out using the power of Puppet’s exported resources, but as our config grew over time, it started to take way too long for Puppet to deploy new configuration on the monitoring hosts. We now deploy the configuration for both Icinga instances using a script that reads the stored config from the Puppet database.
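The generating half of such a script can be sketched in a few lines of Python. This is not our actual script: the record layout (`host`, `description`, `command`, `args`) and the `check_http_vhost` command name are hypothetical, and fetching the records from the Puppet database is left out. The output uses the Icinga 1.x object syntax.

```python
def render_service(record: dict) -> str:
    """Render one Icinga 1.x style service object from a check record."""
    # Icinga passes arguments to a check command separated by "!".
    parts = [record["command"]] + [str(a) for a in record.get("args", [])]
    lines = [
        "define service {",
        "    host_name           %s" % record["host"],
        "    service_description %s" % record["description"],
        "    check_command       %s" % "!".join(parts),
        "}",
    ]
    return "\n".join(lines)


def render_config(records) -> str:
    """Render all records, sorted so the file is stable between runs."""
    ordered = sorted(records, key=lambda r: (r["host"], r["description"]))
    return "\n\n".join(render_service(r) for r in ordered) + "\n"


# Hypothetical records: a redirect vhost and a full site with a string match.
EXAMPLE = [
    {"host": "web1", "description": "vhost old.example.com",
     "command": "check_http_vhost", "args": ["old.example.com", 301]},
    {"host": "web1", "description": "vhost www.example.com",
     "command": "check_http_vhost", "args": ["www.example.com", 200, "Welcome"]},
]
```

Because the whole file is regenerated from the stored records on every run, a check that disappears from the records also disappears from the monitoring configuration. A real service definition needs more attributes (check intervals, contacts), which would typically come from a template.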

Other uses

As you might imagine, we also do this for trending with Munin. We automatically deploy the Munin plugins on the clients when we deploy a new service, and we automatically deploy the host configuration on the Munin server, as well as the required firewall rules on the client side.

Using munin to trend puppetmaster

Wednesday, April 27th, 2011

We wanted to trend our puppetmaster to give us an idea of the number of nodes and the time it takes to compile a catalog. Searching the web didn’t yield the results we needed, so we made our own.

We use Munin to trend our machines, and our puppetmaster is no different. We could not get a clear picture of the number of nodes that connect over a period of time, however. There is a Munin plugin to monitor the memory usage of the puppetmaster, but as we have a dedicated machine as our puppetmaster, we have less need for it. After some looking around we found a plugin on munin-exchange. This plugin had some bugs and oddities (finding them I leave as an exercise to the reader), but nonetheless it served as a starting point for our own plugin.

The plugin is called puppet_ and is a wildcard plugin. It can be called in two ways: as puppet_nodes and as puppet_totals.
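For readers unfamiliar with wildcard plugins: Munin runs the same file under different names, and the plugin decides what to graph from the name it was called with. The Python sketch below shows only that dispatch. It is not the actual plugin: the graph titles, field names and what each mode measures are guesses made for this example.

```python
import os

PREFIX = "puppet_"

# Hypothetical graph metadata: (title, vertical label, field name, field label).
GRAPHS = {
    "nodes": ("Puppet nodes", "nodes", "nodes", "Nodes seen"),
    "totals": ("Puppet totals", "seconds", "compile", "Compile time"),
}


def plugin_mode(argv0: str) -> str:
    """Return the mode encoded in the file name the plugin was called as."""
    name = os.path.basename(argv0)
    mode = name[len(PREFIX):] if name.startswith(PREFIX) else ""
    if mode not in GRAPHS:
        wanted = " or ".join(PREFIX + m for m in sorted(GRAPHS))
        raise ValueError("call this plugin as %s" % wanted)
    return mode


def output(argv, value):
    """Return the lines Munin expects for a 'config' or a fetch run."""
    title, vlabel, field, label = GRAPHS[plugin_mode(argv[0])]
    if len(argv) > 1 and argv[1] == "config":
        return [
            "graph_title %s" % title,
            "graph_vlabel %s" % vlabel,
            "%s.label %s" % (field, label),
        ]
    return ["%s.value %s" % (field, value)]
```

An actual plugin would compute `value` for the selected mode and print the lines returned by `output(sys.argv, value)`.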