Posts Tagged ‘puppet’

Job Opening: Debian Linux System Administrator at Kumina

Friday, March 28th, 2014

Thank you for all the responses and the interest. We are currently evaluating the candidates and we’re hopeful that we will find our new coworker among them!

We’re growing and are looking for Debian Linux System Administrators who would like to grow with us and provide sysadmin services to our customers.

TL;DR: (in random order) Debian Linux / Apache / PHP / MySQL / Python / Puppet / KVM / Qemu / PostgreSQL / Tomcat / GlassFish / Logstash / ElasticSearch / HAProxy / Graphite / Heartbeat / Pacemaker / Postfix / Icinga / NFS / DRBD / OCFS2 / Ext4 / Varnish / Unbound / POSIX File Permissions / Git / Kibana / Docker / Awesomeness

Who We Are

Kumina has been around since 2007, but we’re slow growers. Our ideal is to grow no larger than 10-12 people. Expansion should be done by automating the hell out of our work and making ourselves obsolete so we can pursue new knowledge and improve our own workflow. We stand behind the ideals of open source software and prefer working with packages from Debian. Our customers are very diverse, from large magazines to stock photo sellers, from large multinational corporations that cater to telecom companies to a small foundation that helps people who kicked a bad habit get back to work. The biggest common denominator among our customers is that they all run web applications. We want to provide top quality services, whether it’s on our own VM platform, a customer’s own hardware or somewhere else entirely. We’re always innovating and trying to find new solutions to old problems. We take the time to implement proper solutions and tend to dislike workarounds for problems.

Your coworkers’ Linux experience ranges from almost 2 years to 15+ years, so there’s always something to learn or teach.

Who We Are Looking For

The most important quality we’re looking for is someone who fits into our team. We can be pretty chaotic, direct, obnoxious, distracted, distractive, busy and hyperactive, all at the same time, no kidding. You need to be able to work in this environment and although there are lots of moments in which everyone is focussed on their task, it can change in the blink of an eye.

We’re looking for someone with Linux/Unix experience. Debian Linux experience is preferred, but mostly because that’s what we use. If you’re a CentOS person and do not mind switching and learning about Debian, that’s fine too. The top of this post has a list of software we interact with on a regular basis; knowledge of and experience with a few of them would be appreciated. We do not have actual LPI certifications (although we have some good books in our bookcase regarding LPI), but having knowledge equivalent to LPI level 2 is definitely preferred.

Currently, we do most of our configuration management via Puppet. But we’re planning massive changes in our workflow, so experience with other configuration systems is appreciated.

Willingness to learn and teach is another important quality we’re looking for. We try to have weekly presentations regarding technology, varying from an explanation about Apache MPMs to building Kibana dashboards. You’re expected to pass on knowledge to your coworkers and document as much (and a bit more) as needed. We do consider most of our Puppet code to be documentation as well (and have separate documentation for classes and defines, of course).

We’re not really interested in your formal education or math skills, but we do want to see some of your scripting work in Shell script, Python or Ruby. Being fluent in English is a must and speaking/writing excellent Dutch is definitely a plus.

The Job

Most of our time is spent on innovating our own business process and this is a continuous process, so expect a lot of changes. We spend time on testing new technologies and integrating them into our processes.

That said, we always make time to help customers. Requests vary from adding ServerAliases (which we should automate!) to setting up entire new environments for customers. We tend to work with customers, so expect a lot of communication, but we do prefer email and IRC over other forms of contact.

Another part of our work is troubleshooting problems. We have a 24×7 service for which we operate with rotating weekly shifts. Once you have enough experience with the way we work, you’ll be scheduled into those shifts as well (don’t expect too many calls outside of business hours, though; one call during a single rotation is considered a lot). Once you’re part of the rotating shifts, you’ll get a phone from us as well.

We work from the office in Eindhoven 4 days a week, generally. Since we prefer traveling by public transit and none of us actually live near Eindhoven, we arrive at the office somewhere between 8:50 and 9:10. Although we try to be flexible in working hours, it’s important that we can depend on you being there at the times we’ve agreed upon!

Lunch is on Kumina when we’re working at the office (we’ve got a grocery store next door, where we simply get bread and cheese and fruits and the like) and we’ve got our own coffee machine which grinds beans and makes awesome coffee. Should problems or projects require us to stay in the office late, dinner is on Kumina as well (but the last time that happened is almost a year ago).

Everything we do is a team effort and we expect you to be a team player. We do not have room for isolated individuals, we need you to complement our team. We expect you to be pro-active and pick up requests and problems from customers when they arrive (or even before) without being told to.

The Location

Our office is situated in the centre of Eindhoven, next to the train station. We’re on the 12th floor of De Groene Toren (The Green Tower, a pretty well known sight in Eindhoven) and it’s an office space with a nice view.

We expect you to be able to work from our office several days a week so if you live in close proximity to Eindhoven, that’s convenient, but of course it’s up to you how far you’re willing to travel. Working from home is possible (requires an internet connection, which you need to get yourself and a computer, which we can supply) but is not mandatory.

What Do We Offer

We offer a slightly chaotic but very open environment in which you are encouraged to keep exploring new technologies and ideas. Contributing to the open source community is highly encouraged. We are proud of our work and the things we do and we encourage you to improve on our processes so you can be proud as well.

Depending on the level of your skills, we can offer you a salary between €35k and €45k. If you’re truly exceptional, we might be willing to pay you even more, but you’ll have to prove your exceptionality. Your income will grow with the company and pay raises can happen often, depending on our success. If the company is doing well financially, so will you. A 10% increase is not unheard of.

Based on a 40 hour work week, you’ll get 25 days of paid holiday. This excludes national holidays, so those are extra. If you prefer to work less, those 25 days are scaled to match.

Interested?

Let us know! Mail us at jobs@kumina.nl with some details about you. Links to online repositories that contain your commits are appreciated, but we recognise that not all sysadmins are programmers. Tell us about you and why you’re interested in the job. A resume is interesting, as is a link to your LinkedIn account. If you have questions, you can mail those as well.

Or join us on IRC (channel #kumina on irc.kumina.nl). If you type ‘kumina’ in a sentence, we’ll get a highlight, just keep in mind that it might take a little while before we react, depending on who’s online and how busy it is.

We’re looking forward to meeting you!

Automated monitoring? Easy!

Tuesday, February 11th, 2014

One of the things we take very seriously here at Kumina is our monitoring. We’ve always done so, but even we must admit that during the starting years, we sometimes forgot to include all possible checks for a new service or host. And it sucks when you forget to set up the monitoring for a specific item, because you generally only find out about it when it’s actually down already…

We like to check as much as possible (if not everything). For example, we check if a service is up and running, we check if a vhost is returning the expected response, if an SSL certificate is still valid or if it will expire within 30 days, we check if OpenVPN certificates are close to expiration and if all loaded Apache modules actually come from a Debian package. And we check often, generally every 30 seconds, but we would prefer to do it even more often. However, these are not things you want to configure manually over and over again.
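As a rough illustration of the certificate check mentioned above, here is a minimal Python sketch of the decision logic only. It is not one of our actual check plugins: the function name, the status strings and the way the expiry date is obtained are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def classify_certificate(not_after, now, warn_days=30):
    """Classify a certificate by its expiry date (hypothetical helper).

    Returns "CRITICAL" if the certificate has already expired, "WARNING" if
    it expires within warn_days, and "OK" otherwise.
    """
    if not_after <= now:
        return "CRITICAL"
    if not_after - now <= timedelta(days=warn_days):
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    now = datetime(2014, 2, 11, tzinfo=timezone.utc)
    # Made-up expiry date, 18 days after "now".
    print(classify_certificate(datetime(2014, 3, 1, tzinfo=timezone.utc), now))
```

A real check would still have to fetch the certificate from the service and parse its expiry date before calling something like this.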

Automate everything

We’re using Icinga in two datacenters in failover mode: the second node takes over if the primary is unreachable. We currently monitor 319 hosts (including some failover virtual hosts) with a grand total of nearly 10000 checks, although this fluctuates daily, since most changes on a server also add or remove checks. It is all done automatically. This prevents us from forgetting to set up monitoring for a specific item or host and also allows us to quickly deploy new checks on the entire infrastructure. Consider the Fokirtor check we created last year: it’s very easy for us to simply deploy it on all those machines.

Using the tools at hand

We’re currently pretty heavy Puppet users, so we leverage the infrastructure we already have in place for that.

Since a puppet agent runs on our monitoring hosts every few hours, it’ll deploy new configuration a few times per day. It’s not exactly continuous delivery, but close enough for our needs for now. Equally important, it removes checks we no longer need. For instance, if we’ve created a redirect that was later changed into a full-fledged site, the check is automatically changed to no longer expect a 301 response but a 200 with a correct string (that we provided, of course, it’s not that automated).

We started out using the power of puppet’s exported resources but over time as our config grew, it started to take way too long for Puppet to deploy new configuration on the monitoring hosts. We now deploy the configuration for both Icinga instances using a script that reads the stored config from the Puppet database.
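To give an idea of what such a generator does, here is a minimal Python sketch. It is not our actual script: reading from the Puppet database is left out entirely, and the input format (a list of dictionaries), the function names and the check commands are assumptions made for the example. Only the `define service { ... }` object syntax is Icinga's own.

```python
def render_service(check):
    """Render one Icinga service object from a dict of directives."""
    lines = ["define service {"]
    for key in ("host_name", "service_description", "check_command"):
        lines.append("    %-24s%s" % (key, check[key]))
    lines.append("}")
    return "\n".join(lines)


def render_config(checks):
    """Render all checks, sorted so the output is stable between runs."""
    ordered = sorted(
        checks, key=lambda c: (c["host_name"], c["service_description"])
    )
    return "\n\n".join(render_service(c) for c in ordered) + "\n"


if __name__ == "__main__":
    # Hypothetical data; in practice this would come from the stored config.
    checks = [
        {"host_name": "web1.ourserver.com",
         "service_description": "vhost www.example.com",
         "check_command": "check_http_vhost!www.example.com!200"},
        {"host_name": "web1.ourserver.com",
         "service_description": "ssh",
         "check_command": "check_ssh"},
    ]
    print(render_config(checks))
```

Because the whole file is regenerated from the stored data on every run, a check that no longer exists in the data simply disappears from the output. A usable configuration would need more directives (a template via `use`, contacts, intervals) than this sketch emits.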

Other uses

As you might imagine, we also do this for trending with Munin. We automatically deploy the Munin plugins on the clients when we deploy a new service and we automatically deploy the host configuration on the Munin server, as well as the required firewall rules on the client side.

On our new DNS infrastructure (and DNSSEC)

Thursday, August 15th, 2013

Recently we’ve been busy implementing a new DNS infrastructure for our resolvers as well as our authoritative servers. We wanted to be ready for future developments like DNSSEC and we had wanted to renew this important part of our infrastructure for a while. This blog post gives an overview of our new setup.


Debugging puppet queueing

Friday, December 9th, 2011

Today we ran into a problem where the data put in ActiveMQ by the puppetmaster seemed corrupted in some way. When running the puppet queue daemon in the foreground (with --debug --verbose --no-daemonize), we noticed messages like these:

info: Loaded queued catalog in 22.16 seconds
debug: Searched for resources in 0.31 seconds
err: Could not save queued catalog for web1.ourserver.com: syntax error on line 68, col 34: `  serverversion: 2.7.6  sshdsakey: [long string]'
notice: Processing queued catalog for web1.ourserver.com in 0.41 seconds

It seemed like a newline was missing there for some reason, but what exactly was it trying to do? It would be helpful if we could inspect the complete message, to see which resource is doing this. Python to the rescue!

On the machine that’s running the ActiveMQ, install python-stompy (we’re on Debian Squeeze). Open a python interactive shell and do this:

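>>> from stompy.simple import Client
>>> # Connect to the local ActiveMQ broker over STOMP using the client defaults
>>> stomp = Client()
>>> stomp.connect()
>>> # Fetch one message from the catalog queue without blocking
>>> stomp.subscribe("/queue/catalog")
>>> message = stomp.get_nowait()
>>> # Save the raw message body to a file named "message" for inspection
>>> f = open("message","w")
>>> f.write(message.body)
>>> f.close()
>>> # Clean up the subscription and the connection
>>> stomp.unsubscribe("/queue/catalog")
>>> stomp.disconnect()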

You now have a file called message that contains the message. You might want to make the file a little bit easier to read by executing the following: sed -i 's/{/\n{/g' message, which adds a newline in front of each opening curly brace. Now you can search for the problem and the resource that causes it.
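If you would rather stay in Python for this step, the same idea can be sketched as follows. The function names are made up for the example and the sample string is invented; it does not reflect the real format of a queued catalog.

```python
def split_before_braces(text):
    """Insert a newline before every opening curly brace (like the sed call)."""
    return text.replace("{", "\n{")


def find_fragments(text, needle):
    """Return the brace-delimited fragments that contain the given string."""
    return [
        line for line in split_before_braces(text).splitlines()
        if needle in line
    ]


if __name__ == "__main__":
    # Invented sample data, only meant to show the mechanics.
    sample = '{"title":"/etc/motd"}{"title":"web1","serverversion":"2.7.6"}'
    for fragment in find_fragments(sample, "serverversion"):
        print(fragment)
```

Searching for a word from the error message (serverversion in our case) narrows the file down to the fragments worth reading.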

I hope this helps someone!

Facter facts for PCI devices

Friday, June 17th, 2011

We are in the process of building the configuration for our monitoring system from exported resources (more on that in the future). To accomplish one of the checks we needed a way to identify the brand of RAID controller in our physical servers. The best way to do this is facter.

We’ve written some custom facts before, but never facts for hardware. We could have taken the lazy route and used lspci -m | awk -F '" "' '/RAID/ { print $2 }'. But maybe we need more hardware facts in the future. So we built a script that parses the output of lspci -vmmk and builds a multidimensional hash for it.

We then iterate over the hash and add facts for every RAID controller we find. Because we have a hash of all the data from lspci, we can add facts for every type of controller on the PCI bus by adding a regex, a counter (there may be more than one controller of a certain type) and a factname (with embedded counter).
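The approach can be illustrated with a short Python sketch. This is not the released script (which is a Ruby fact, pci_devices.rb); the function names, the fact names and the sample device data below are assumptions made for the example. It relies only on the machine-readable lspci -vmm format of one "Tag: value" pair per line, with a blank line between devices.

```python
import re

# Made-up sample in the "Tag:<tab>value" format printed by lspci -vmmk.
SAMPLE = """\
Slot:\t00:1f.2
Class:\tSATA controller
Vendor:\tIntel Corporation
Device:\t82801 SATA Controller
Driver:\tahci

Slot:\t03:00.0
Class:\tRAID bus controller
Vendor:\tLSI Logic / Symbios Logic
Device:\tMegaRAID SAS 1078
Driver:\tmegaraid_sas
"""


def parse_lspci(output):
    """Build a nested dict: slot -> {tag: value} for every device record."""
    devices = {}
    for record in re.split(r"\n\s*\n", output.strip()):
        fields = {}
        for line in record.splitlines():
            # Split on the first colon only; slot values contain colons too.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()  # a repeated tag keeps its last value
        if "Slot" in fields:
            devices[fields["Slot"]] = fields
    return devices


def controller_facts(devices, pattern, fact_prefix):
    """Create numbered facts for every device whose class matches pattern."""
    facts = {}
    count = 0
    for slot in sorted(devices):
        device = devices[slot]
        if re.search(pattern, device.get("Class", "")):
            facts["%s%d_vendor" % (fact_prefix, count)] = device.get("Vendor", "")
            facts["%s%d_device" % (fact_prefix, count)] = device.get("Device", "")
            count += 1
    return facts


if __name__ == "__main__":
    print(controller_facts(parse_lspci(SAMPLE), r"RAID", "raidcontroller"))
```

Supporting another type of controller then only takes another call with a different regex and fact prefix, which is the point of parsing everything into one hash first.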

We are releasing this script in the hope that it will be useful to somebody. Installation is a breeze: just add pci_devices.rb to your Puppet module in module_name/lib/facter and start using it.

The code is on GitHub. If you have any improvements or questions, send us a pull request or a ticket.