How to develop Kubernetes-friendly containerised applications, part 3

In this blog post series about designing Kubernetes-friendly Docker containers, we hope to provide you with some guidance on how to develop for Kubernetes. In case you missed the first two parts: part 1, where we created a Python 3 webserver, a Pod and a Service on a minikube Kubernetes ‘cluster’, can be found here, and part 2, where we looked at logging, can be found here.

So, last time we looked at logging; this time I want to expand on that and add metrics as well. But first, let’s make some assumptions.

JSON logging

Although Kubernetes does keep track of logs, once they are cleared from the disk on the node, their contents are gone. It is therefore considered good practice to have a log aggregation mechanism in place for your Kubernetes cluster. The Elasticsearch stack with fluentd (also known as EFK: Elasticsearch/Fluentd/Kibana) is pretty common and it is the one we prefer as well. This setup, like many others, prefers to receive log entries as JSON-formatted strings. That allows the log aggregator to better organise the data, which in turn allows you to ask something like “give me all log lines associated with this web application, for all Pods, that did not return an HTTP status code of 200”. You can probably imagine how useful this is when you have 100 replicas of a Pod running.

Let’s modify the output of the application to emit JSON log entries. I’m not going to completely dissect the response, but leave that as an exercise for the reader. Change the code to the following:

import http.server
import sys
import threading
import signal
import json

class MyWebpage(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.end_headers()
        self.wfile.write(b'''
<!DOCTYPE html>
<html>
    <head>
        <title>Hello!</title>
    </head>
    <body>
        <p>This is a demo page!</p>
    </body>
</html>''')

    def log_message(self, format, *args):
        # Emit a JSON object per request instead of the default
        # Apache-style log line.
        log = {
            'my_webserver': {
                'client_ip': self.address_string(),
                'timestamp': self.log_date_time_string(),
                'message': format % args
            }
        }
        print(json.dumps(log))


if __name__ == '__main__':
    def do_shutdown(signum, frame):
        # shutdown() blocks until serve_forever() has stopped, so it
        # must run in a separate thread.
        threading.Thread(target=httpd.shutdown).start()
        sys.exit(0)

    signal.signal(signal.SIGTERM, do_shutdown)

    httpd = http.server.HTTPServer(('0.0.0.0', 8080), MyWebpage)
    httpd.serve_forever()

Now build and run it:

$ docker build -t my_webserver:0.6 .
Sending build context to Docker daemon  4.096kB
Step 1 : FROM python:alpine3.6
 ---> 1b1ac8f23f73
Step 2 : ENV PYTHONUNBUFFERED 1
 ---> Using cache
 ---> 6099cbb7db7c
Step 3 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 0518dfef5e4f
Step 4 : COPY my_webserver.py ./
 ---> 6aa1b76fa2b3
Removing intermediate container 1f91cdabfe1b
Step 5 : CMD python ./my_webserver.py
 ---> Running in 5727d0a6d980
 ---> d3d78d4e4f45
Removing intermediate container 5727d0a6d980
Successfully built d3d78d4e4f45
$ kubectl delete pod mywebserver
pod "mywebserver" deleted
$ kubectl run mywebserver --image=my_webserver:0.6 --image-pull-policy=Never --restart=Never --port=8080
pod "mywebserver" created

We’re already at version 0.6! How about that… Maybe we should fix the way we deploy this, but let’s leave that for now and start by testing our output:

$ curl $(minikube service mywebserver --url)
<!DOCTYPE html>
<html>
    <head>
        <title>Hello!</title>
    </head>
    <body>
        <p>This is a demo page!</p>
    </body>
</html>
$ kubectl logs mywebserver
{"my_webserver": {"client_ip": "172.17.0.1", "timestamp": "09/Oct/2017 13:21:00", "message": "\"GET / HTTP/1.1\" 200 -"}}

Nice, it works. An important thing to note here is that we use a nested namespace. This matters because Elasticsearch requires every key to have a single, consistent datatype; it needs that for proper indexing of those keys. So we consider it good practice to give each application a separate key group (not sure if there’s a better name for them). The above example would give you the keys:

- my_webserver.client_ip
- my_webserver.timestamp
- my_webserver.message

If we decide to have another app, called other_server, which stores an array in its message subkey, that will not cause problems, as its key would be other_server.message. This won’t conflict with Elasticsearch’s tender internals.
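To illustrate, here’s a hypothetical log line from that second app (not something our code produces): other_server.message holds an array, my_webserver.message stays a string, and Elasticsearch can index the two independently:

{"timestamp": "...", "other_server": {"message": ["first entry", "second entry"]}}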

That said, our my_webserver example would be even better if we moved the timestamp to the top level, as that’s the main key Elasticsearch uses to index the data, and it should always be a timestamp in some form or another. Let’s fix that:

...
    def log_message(self, format, *args):
        log = {
            'timestamp': self.log_date_time_string(),
            'my_webserver': {
                'client_ip': self.address_string(),
                'message': format % args
            }
        }
        print(json.dumps(log))
...
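As an optional aside (not part of the image we’re about to build): if you also want the HTTP status code as its own key, which is what makes the “did not return 200” query from earlier possible, you could override log_request() instead, since BaseHTTPRequestHandler hands it the response code directly. A sketch, where the status_code key is our own invention:

    def log_request(self, code='-', size='-'):
        # code is normally an int (or an http.HTTPStatus); int() handles both.
        log = {
            'timestamp': self.log_date_time_string(),
            'my_webserver': {
                'client_ip': self.address_string(),
                'status_code': int(code),
                'message': '"%s" %s %s' % (self.requestline, code, size)
            }
        }
        print(json.dumps(log))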

Rebuild the app, run it, and check the output:

$ docker build -t my_webserver:0.7 .
Sending build context to Docker daemon  3.584kB
Step 1 : FROM python:alpine3.6
 ---> 1b1ac8f23f73
Step 2 : ENV PYTHONUNBUFFERED 1
 ---> Using cache
 ---> 6099cbb7db7c
Step 3 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 0518dfef5e4f
Step 4 : COPY my_webserver.py ./
 ---> fefa4c2a3da5
Removing intermediate container 62ec5724be8e
Step 5 : CMD python ./my_webserver.py
 ---> Running in a3b8c02cec53
 ---> 96e5f342196c
Removing intermediate container a3b8c02cec53
Successfully built 96e5f342196c
$ kubectl delete pod mywebserver
pod "mywebserver" deleted
$ kubectl run mywebserver --image=my_webserver:0.7 --image-pull-policy=Never --restart=Never --port=8080
pod "mywebserver" created
$ curl $(minikube service mywebserver --url)
...
$ kubectl logs mywebserver
{"timestamp": "09/Oct/2017 20:13:05", "my_webserver": {"client_ip": "172.17.0.1", "message": "\"GET / HTTP/1.1\" 200 -"}}

And to make the last line a little more readable (you’ll need to install jq):

$ kubectl logs mywebserver | jq '.'
{
  "timestamp": "09/Oct/2017 20:13:05",
  "my_webserver": {
    "client_ip": "172.17.0.1",
    "message": "\"GET / HTTP/1.1\" 200 -"
  }
}

That seems legit! Now that we have this, let’s move on to the other part of gathering information about a running application: metrics.

Prometheus metrics

There really is just one solution for gathering metrics within Kubernetes: Prometheus. I know there are other solutions, but they all seem kind of cobbled together. Moreover, where Kubernetes is an offspring of Google’s Borg, Prometheus is an offspring of Google’s Borgmon. The two fit together so easily that it’s almost a shame not to use this strong coupling.

In Python, adding support for Prometheus is pretty easy, but it does entail installing an additional library. For that, we create a requirements.txt file (you can name it whatever you like, but this filename is the de-facto default in the Python world) and give it the following content:

prometheus-client==0.0.20

There might be newer versions by the time you read this, but let’s make sure that for this tutorial you’re using the same version as I am.

The image that we use ships with pip; we only need to tell the Dockerfile to install the requirements listed in the requirements.txt file. Note that we copy requirements.txt into the image before the application code: that way Docker can reuse the cached layer with the installed library whenever only the application code changes. The Dockerfile ends up looking like this:

FROM python:alpine3.6
ENV PYTHONUNBUFFERED=1
WORKDIR /usr/src/app
COPY requirements.txt /requirements.txt
RUN pip3 install -r /requirements.txt
COPY my_webserver.py ./
CMD ["python", "./my_webserver.py"]

When you build the Docker image now, you’ll notice the extra steps and a few seconds delay as the library is installed:

$ docker build -t my_webserver:0.8 .
Sending build context to Docker daemon  4.608kB
Step 1 : FROM python:alpine3.6
 ---> 1b1ac8f23f73
Step 2 : ENV PYTHONUNBUFFERED 1
 ---> Using cache
 ---> 6099cbb7db7c
Step 3 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 0518dfef5e4f
Step 4 : COPY requirements.txt /requirements.txt
 ---> 221e52f8825c
Removing intermediate container db1f42afad0e
Step 5 : RUN pip3 install -r /requirements.txt
 ---> Running in 32ed19f9740d
Collecting prometheus-client==0.0.20 (from -r /requirements.txt (line 1))
  Downloading prometheus_client-0.0.20.tar.gz
Building wheels for collected packages: prometheus-client
  Running setup.py bdist_wheel for prometheus-client: started
  Running setup.py bdist_wheel for prometheus-client: finished with status 'done'
  Stored in directory: /root/.cache/pip/wheels/b5/13/d7/d8fca1b2db87bd1f9e5894722d6b7db24d27f617d6f040498c
Successfully built prometheus-client
Installing collected packages: prometheus-client
Successfully installed prometheus-client-0.0.20
 ---> 02c86904ebd5
Removing intermediate container 32ed19f9740d
Step 6 : COPY my_webserver.py ./
 ---> 5951ccbe14a1
Removing intermediate container 27373d8add71
Step 7 : CMD python ./my_webserver.py
 ---> Running in 594c8fb528af
 ---> e7bad8a8ee00
Removing intermediate container 594c8fb528af
Successfully built e7bad8a8ee00

Awesome. Now let’s use it in our code. For this example we’re going to collect the time it takes for the webserver to respond to HTTP requests. For that we first need to actually import the library. Then we create a special object for keeping track of these numbers, which we’ll call REQUEST_LATENCY. As you can see in the example, the object is a Histogram which we’re calling my_webpage_request_latency_seconds. The second argument of the constructor acts as documentation.

Prometheus metrics are very simple: they just present a key with a value next to it. The value is always a floating point number. Using the Histogram object causes the app to export multiple metrics, each showing the number of requests that completed within a certain time range; the buckets are cumulative, so every bucket counts all requests at or below its upper bound. We’ll look at the actual output in a bit and you’ll see what I mean.
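By default the Histogram uses buckets ranging from 5 milliseconds up to 10 seconds. If those don’t match your application’s latency profile, the client library lets you supply your own; a minimal sketch, assuming the buckets keyword argument of the Histogram constructor:

import prometheus_client

# A sketch: supply your own bucket upper bounds (in seconds); the
# +Inf bucket is appended automatically.
REQUEST_LATENCY = prometheus_client.Histogram(
    'my_webpage_request_latency_seconds',
    'Time it took to process incoming HTTP requests, in seconds.',
    buckets=(0.01, 0.05, 0.1, 0.5, 1.0, 5.0))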

Sampling of metrics in Python is easy as well: you just add a decorator to the method you’d like to track; in our case, that’s the do_GET method. That’s all you need to do to start collecting metrics. Easy, right?
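The Histogram is not the only metric type, by the way. If all you want is to count requests, a Counter (which only ever goes up) is enough; a sketch, with a metric name of our own choosing:

import prometheus_client

# A sketch: a Counter is the simplest metric type, suitable for
# anything that only increases, like a request count.
REQUEST_COUNT = prometheus_client.Counter(
    'my_webpage_requests_total',
    'Total number of HTTP requests handled.')

# ...and inside do_GET you would simply call:
#     REQUEST_COUNT.inc()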

The last step is actually starting the listener. Prometheus tries to keep things dead simple, so all the client does is create a listening HTTP service on a designated port. You can see it being started right before the actual webserver object that we created. I’ve assigned the arbitrary port number 9999 to it, but honestly, choose whatever you want (we’ll actually get back to that in a later installment of this series!).

The entire application now looks like this:

import http.server
import sys
import threading
import signal
import json
import prometheus_client

# Histogram tracking how long do_GET takes, in seconds.
REQUEST_LATENCY = prometheus_client.Histogram(
    'my_webpage_request_latency_seconds',
    'Time it took to process incoming HTTP requests, in seconds.')

class MyWebpage(http.server.BaseHTTPRequestHandler):
    @REQUEST_LATENCY.time()
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.end_headers()
        self.wfile.write(b'''
<!DOCTYPE html>
<html>
    <head>
        <title>Hello!</title>
    </head>
    <body>
        <p>This is a demo page!</p>
    </body>
</html>''')

    def log_message(self, format, *args):
        # Emit a JSON object per request instead of the default
        # Apache-style log line.
        log = {
            'timestamp': self.log_date_time_string(),
            'my_webserver': {
                'client_ip': self.address_string(),
                'message': format % args
            }
        }
        print(json.dumps(log))


if __name__ == '__main__':
    def do_shutdown(signum, frame):
        # shutdown() blocks until serve_forever() has stopped, so it
        # must run in a separate thread.
        threading.Thread(target=httpd.shutdown).start()
        sys.exit(0)

    signal.signal(signal.SIGTERM, do_shutdown)

    # Expose the Prometheus metrics on a separate port.
    prometheus_client.start_http_server(9999)
    httpd = http.server.HTTPServer(('0.0.0.0', 8080), MyWebpage)
    httpd.serve_forever()

Are you still with me? At some point I’ll only show you the snippets I’m actually changing, as the application is becoming quite large. Can you spot all the changes I’ve made? There aren’t many, but you get a huge amount of value out of them! Let’s check it out: rebuild and redeploy:

$ docker build -t my_webserver:0.9 .
...
$ kubectl delete pod mywebserver
pod "mywebserver" deleted
$ kubectl run mywebserver --image=my_webserver:0.9 --image-pull-policy=Never --restart=Never --port=8080
pod "mywebserver" created

But how do we access those metrics? Remember the kubectl expose command we used in the very first installment of this series? We’re going to use that one again to expose port 9999 as well!

$ kubectl expose pod mywebserver --port 9999 --type=NodePort --name=mywebserver-metrics
service "mywebserver-metrics" exposed

First we need to generate some statistics; try something like this:

$ for x in $(seq 100); do curl -s $(minikube service mywebserver --url) > /dev/null; done

This visits the webpage 100 times in a row. It’s not a good test, but good enough for now. When it’s done, we can check the metrics like so:

$ curl $(minikube service mywebserver-metrics --url)
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 79060992.0
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 17100800.0
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1507581925.57
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.19
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 7.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 65536.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="6",patchlevel="2",version="3.6.2"} 1.0
# HELP my_webpage_request_latency_seconds Time it took to process incoming HTTP requests, in seconds.
# TYPE my_webpage_request_latency_seconds histogram
my_webpage_request_latency_seconds_bucket{le="0.005"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.01"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.025"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.05"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.075"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.1"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.25"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.5"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.75"} 100.0
my_webpage_request_latency_seconds_bucket{le="1.0"} 100.0
my_webpage_request_latency_seconds_bucket{le="2.5"} 100.0
my_webpage_request_latency_seconds_bucket{le="5.0"} 100.0
my_webpage_request_latency_seconds_bucket{le="7.5"} 100.0
my_webpage_request_latency_seconds_bucket{le="10.0"} 100.0
my_webpage_request_latency_seconds_bucket{le="+Inf"} 100.0
my_webpage_request_latency_seconds_count 100.0
my_webpage_request_latency_seconds_sum 0.038089960002253065

Wow! That’s a very long list of metrics, created by adding just a few lines of code… I’m not going to explain all the lines (why would I need to, they are documented!), but do note the Histogram at the bottom: that’s the one we created. And it’s not very interesting; all the requests completed within 0.005 seconds. Nice, but boring.

So to illustrate what’s happening here, I modified my own version of the code to introduce some random delays, just so you can see how it would look in a more life-like situation:

# HELP my_webpage_request_latency_seconds Time it took to process incoming HTTP requests, in seconds.
# TYPE my_webpage_request_latency_seconds histogram
my_webpage_request_latency_seconds_bucket{le="0.005"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.01"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.025"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.05"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.075"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.1"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.25"} 49.0
my_webpage_request_latency_seconds_bucket{le="0.5"} 49.0
my_webpage_request_latency_seconds_bucket{le="0.75"} 52.0
my_webpage_request_latency_seconds_bucket{le="1.0"} 54.0
my_webpage_request_latency_seconds_bucket{le="2.5"} 67.0
my_webpage_request_latency_seconds_bucket{le="5.0"} 89.0
my_webpage_request_latency_seconds_bucket{le="7.5"} 98.0
my_webpage_request_latency_seconds_bucket{le="10.0"} 99.0
my_webpage_request_latency_seconds_bucket{le="+Inf"} 100.0
my_webpage_request_latency_seconds_count 100.0
my_webpage_request_latency_seconds_sum 178.68813334600782

As you can see, the Histogram nicely shows that 54 out of the 100 requests completed within one second, and 89 out of 100 within 5 seconds. Sadly, although Kubernetes and Prometheus work nicely together out of the box, actually setting up Prometheus to scrape our little application is a bit more work than I want to cover in this series. So I’m going to leave that as an exercise for the reader! But if you’d like a sidebar-like post explaining it, let me know (details below) and I’ll consider it.
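To give you a taste of why this matters: once Prometheus does scrape these metrics, the cumulative buckets let you compute latency percentiles. A sketch of a PromQL query estimating the 95th-percentile request latency over the last five minutes, using the standard histogram_quantile function:

histogram_quantile(0.95, rate(my_webpage_request_latency_seconds_bucket[5m]))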

Wrapping up

For now, let’s recap:

- Emit log entries as JSON, so a log aggregation stack like EFK can properly index them.
- Nest your application’s log keys under a key group of their own, to avoid datatype conflicts in Elasticsearch, and keep the timestamp at the top level.
- Expose Prometheus metrics from your application; in Python, the prometheus-client library reduces this to a handful of lines.
- Serve the metrics on their own port (we picked 9999) so they can be scraped separately from your application’s traffic.

As always, please let me know (via @timstoop) or via our company Twitter (@kumina_bv) if you liked it!
