In this blog post series about designing Kubernetes-friendly Docker containers, we hope to provide you with some guidance on how you should be developing for Kubernetes. In case you missed the first two parts: Part 1, where we started with creating a Python 3 webserver, a Pod and a Service on a minikube Kubernetes ‘cluster’, can be found here, and Part 2, where we looked at logging, can be found here.
So, last time we looked at logging; this time I want to expand on that and add metrics as well. But first, let’s make some assumptions.
JSON logging
Although Kubernetes does keep track of logs, once they are cleared from the disk on the node, their contents are gone. Therefore it is considered good practice to have a log aggregation mechanism in place for your Kubernetes cluster. The Elasticsearch stack (also known as EFK: Elasticsearch/Fluentd/Kibana) seems pretty common and it is the one we prefer as well. This setup, like many others, prefers to receive log entries as JSON-formatted strings. That allows the log aggregator to better organise the data, which in turn allows you to say something like “give me all log lines that are associated with this web application for all Pods that did not return an HTTP 200 status code”. You can probably imagine how useful this is when you have 100 replicas of a Pod running.
Let’s modify the output of the application to emit JSON log entries. I’m not going to completely dissect the changes, but leave that as an exercise for the reader. Change the code to the following:
import http.server
import sys
import threading
import signal
import json


class MyWebpage(http.server.BaseHTTPRequestHandler):
    def do_GET(s):
        s.send_response(200)
        s.send_header('Content-Type', 'text/html')
        s.end_headers()
        s.wfile.write(b'''
            <!DOCTYPE html>
            <html>
            <head>
                <title>Hello!</title>
            </head>
            <body>
                <p>This is a demo page!</p>
            </body>
            </html>''')

    def log_message(self, format, *args):
        log = {
            'my_webserver': {
                'client_ip': self.address_string(),
                'timestamp': self.log_date_time_string(),
                'message': format % args
            }
        }
        print(json.dumps(log))


if __name__ == '__main__':
    def do_shutdown(signum, frame):
        threading.Thread(target=httpd.shutdown).start()
        sys.exit(0)

    signal.signal(signal.SIGTERM, do_shutdown)

    httpd = http.server.HTTPServer(('0.0.0.0', 8080), MyWebpage)
    httpd.serve_forever()
Now build and run it:
$ docker build -t my_webserver:0.6 .
Sending build context to Docker daemon 4.096kB
Step 1 : FROM python:alpine3.6
 ---> 1b1ac8f23f73
Step 2 : ENV PYTHONUNBUFFERED 1
 ---> Using cache
 ---> 6099cbb7db7c
Step 3 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 0518dfef5e4f
Step 4 : COPY my_webserver.py ./
 ---> 6aa1b76fa2b3
Removing intermediate container 1f91cdabfe1b
Step 5 : CMD python ./my_webserver.py
 ---> Running in 5727d0a6d980
 ---> d3d78d4e4f45
Removing intermediate container 5727d0a6d980
Successfully built d3d78d4e4f45

$ kubectl delete pod mywebserver
pod "mywebserver" deleted

$ kubectl run mywebserver --image=my_webserver:0.6 --image-pull-policy=Never --restart=Never --port=8080
pod "mywebserver" created
We’re already at version 0.6! How about that… Maybe we should fix the way we deploy this. But let’s leave that for now and start by testing our output:
$ curl $(minikube service mywebserver --url)

<!DOCTYPE html>
<html>
<head>
    <title>Hello!</title>
</head>
<body>
    <p>This is a demo page!</p>
</body>
</html>

$ kubectl logs mywebserver
{"my_webserver": {"client_ip": "172.17.0.1", "timestamp": "09/Oct/2017 13:21:00", "message": "\"GET / HTTP/1.1\" 200 -"}}
Nice, it works. An important thing to note here is that we use a nested namespace. This matters because Elasticsearch requires every key to have a single, consistent datatype, which it needs for proper indexing of those keys. So we consider it good practice to create a separate key group (not sure if there’s a better name for them) per application. The above example would give you the keys:
- my_webserver.client_ip
- my_webserver.timestamp
- my_webserver.message
If we decide to have another app, called other_server, which stores an array in the message subkey, it will not cause problems, as that key would be other_server.message. This won’t conflict with Elasticsearch’s tender internals.
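To illustrate, the first line below is our actual log output, while the second is what an imaginary other_server might emit (a purely hypothetical example). Both can live in the same Elasticsearch index without mapping conflicts, because my_webserver.message (a string) and other_server.message (an array) are separate keys:

{"my_webserver": {"client_ip": "172.17.0.1", "timestamp": "09/Oct/2017 13:21:00", "message": "\"GET / HTTP/1.1\" 200 -"}}
{"other_server": {"timestamp": "09/Oct/2017 13:21:04", "message": ["queue flushed", "42 items"]}}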
That said, the example above would be even better if we moved the timestamp to the top level, as that’s the main key used by Elasticsearch to index the data and it should always be a timestamp in some form or another. Let’s fix that:
...
    def log_message(self, format, *args):
        log = {
            'timestamp': self.log_date_time_string(),
            'my_webserver': {
                'client_ip': self.address_string(),
                'message': format % args
            }
        }
        print(json.dumps(log))
...
Rebuild the app, run it and check the output:
$ docker build -t my_webserver:0.7 .
Sending build context to Docker daemon 3.584kB
Step 1 : FROM python:alpine3.6
 ---> 1b1ac8f23f73
Step 2 : ENV PYTHONUNBUFFERED 1
 ---> Using cache
 ---> 6099cbb7db7c
Step 3 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 0518dfef5e4f
Step 4 : COPY my_webserver.py ./
 ---> fefa4c2a3da5
Removing intermediate container 62ec5724be8e
Step 5 : CMD python ./my_webserver.py
 ---> Running in a3b8c02cec53
 ---> 96e5f342196c
Removing intermediate container a3b8c02cec53
Successfully built 96e5f342196c

$ kubectl delete pod mywebserver
pod "mywebserver" deleted

$ kubectl run mywebserver --image=my_webserver:0.7 --image-pull-policy=Never --restart=Never --port=8080
pod "mywebserver" created

$ curl $(minikube service mywebserver --url)
...

$ kubectl logs mywebserver
{"timestamp": "09/Oct/2017 20:13:05", "my_webserver": {"client_ip": "172.17.0.1", "message": "\"GET / HTTP/1.1\" 200 -"}}
And to make that last line a little more readable (you’ll need to install jq):
$ kubectl logs mywebserver | jq '.'
{
  "timestamp": "09/Oct/2017 20:13:05",
  "my_webserver": {
    "client_ip": "172.17.0.1",
    "message": "\"GET / HTTP/1.1\" 200 -"
  }
}
That seems legit! Now that we have this, let’s move on to the other part of gathering information about a running application: metrics.
Prometheus metrics
There really is just one solution for gathering metrics within Kubernetes: Prometheus. I know, there are other solutions, but they all seem kind of cobbled together. And where Kubernetes is an offspring of Google’s Borg, Prometheus is an offspring of Google’s Borgmon. The two fit together so easily that it’s almost a shame not to use this strong coupling.
In Python, adding support for Prometheus is pretty easy, but it does entail installing an additional library. For that, we create a requirements.txt file (you can name it whatever you like, but this filename is the de facto standard in the Python world). Add to it the following content:
prometheus-client==0.0.20
There might be newer versions by the time you read this, but let’s make sure that for this tutorial you’re using the same version as I am.
The image that we use already ships with pip; we only need to tell the Dockerfile to actually install the requirements as listed in the requirements.txt file. The Dockerfile ends up looking like this:
FROM python:alpine3.6

ENV PYTHONUNBUFFERED=1

WORKDIR /usr/src/app

COPY requirements.txt /requirements.txt
RUN pip3 install -r /requirements.txt

COPY my_webserver.py ./

CMD ["python", "./my_webserver.py"]
When you build the Docker image now, you’ll notice the extra steps and a few seconds delay as the library is installed:
$ docker build -t my_webserver:0.8 .
Sending build context to Docker daemon 4.608kB
Step 1 : FROM python:alpine3.6
 ---> 1b1ac8f23f73
Step 2 : ENV PYTHONUNBUFFERED 1
 ---> Using cache
 ---> 6099cbb7db7c
Step 3 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 0518dfef5e4f
Step 4 : COPY requirements.txt /requirements.txt
 ---> 221e52f8825c
Removing intermediate container db1f42afad0e
Step 5 : RUN pip3 install -r /requirements.txt
 ---> Running in 32ed19f9740d
Collecting prometheus-client==0.0.20 (from -r /requirements.txt (line 1))
  Downloading prometheus_client-0.0.20.tar.gz
Building wheels for collected packages: prometheus-client
  Running setup.py bdist_wheel for prometheus-client: started
  Running setup.py bdist_wheel for prometheus-client: finished with status 'done'
  Stored in directory: /root/.cache/pip/wheels/b5/13/d7/d8fca1b2db87bd1f9e5894722d6b7db24d27f617d6f040498c
Successfully built prometheus-client
Installing collected packages: prometheus-client
Successfully installed prometheus-client-0.0.20
 ---> 02c86904ebd5
Removing intermediate container 32ed19f9740d
Step 6 : COPY my_webserver.py ./
 ---> 5951ccbe14a1
Removing intermediate container 27373d8add71
Step 7 : CMD python ./my_webserver.py
 ---> Running in 594c8fb528af
 ---> e7bad8a8ee00
Removing intermediate container 594c8fb528af
Successfully built e7bad8a8ee00
Awesome. Now let’s use it in our code. For this example we’re going to collect the time it takes for the webserver to respond to HTTP requests. For that we first need to actually import the library. Then we create a special object for keeping track of these numbers, which we’ll call REQUEST_LATENCY. As you can see in the example, the object is a Histogram which we’re calling my_webpage_request_latency_seconds. The second argument of the constructor acts as documentation.
Prometheus metrics are very simple; they just present a key with a value next to it. The value is always a floating-point number. Using the Histogram object causes the app to create multiple metrics that each show the number of requests that fall within a certain range (in this case latency in seconds). We’ll look at the actual output in a bit and you’ll see what I mean.
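As an aside: the ranges (buckets) you will see later on are the client library’s defaults. The library also lets you pass your own boundaries when creating the Histogram. This is just a sketch with made-up boundaries; we’ll stick to the defaults in this tutorial:

REQUEST_LATENCY = prometheus_client.Histogram(
    'my_webpage_request_latency_seconds',
    'Time it took to process incoming HTTP requests, in seconds.',
    # Hypothetical bucket boundaries, in seconds; pick ones that match your app.
    buckets=(0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0))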
Sampling of metrics in Python is easy as well. You just add a decorator to the method you’d like to track. In our case, that’s the do_GET
method. That’s it. That’s all you need to do to start collecting metrics. Easy, right?
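The client library offers more metric types than just Histogram, by the way. For instance, a hypothetical Counter of handled requests could be bumped inside do_GET like this; note that the Histogram above already exposes a _count for us, so this sketch is purely to illustrate the API:

REQUEST_COUNT = prometheus_client.Counter(
    'my_webpage_requests_total',
    'Total number of HTTP GET requests handled.')

...
    def do_GET(s):
        # Increment the counter on every incoming GET request.
        REQUEST_COUNT.inc()
...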
The last step is actually starting the listener. Prometheus tries to keep things dead simple, so all it does is create a listening HTTP service on a designated port. You can see us starting that server right before we start the actual webserver object that we created. I’ve assigned the arbitrary port number 9999 to it, but honestly, choose whatever you want (we’ll actually get back to that in a later installment of this series!).
The entire application now looks like this:
import http.server
import sys
import threading
import signal
import json
import prometheus_client

# Histogram that tracks how long do_GET takes, in seconds.
REQUEST_LATENCY = prometheus_client.Histogram(
    'my_webpage_request_latency_seconds',
    'Time it took to process incoming HTTP requests, in seconds.')


class MyWebpage(http.server.BaseHTTPRequestHandler):
    @REQUEST_LATENCY.time()
    def do_GET(s):
        s.send_response(200)
        s.send_header('Content-Type', 'text/html')
        s.end_headers()
        s.wfile.write(b'''
            <!DOCTYPE html>
            <html>
            <head>
                <title>Hello!</title>
            </head>
            <body>
                <p>This is a demo page!</p>
            </body>
            </html>''')

    def log_message(self, format, *args):
        log = {
            'timestamp': self.log_date_time_string(),
            'my_webserver': {
                'client_ip': self.address_string(),
                'message': format % args
            }
        }
        print(json.dumps(log))


if __name__ == '__main__':
    def do_shutdown(signum, frame):
        threading.Thread(target=httpd.shutdown).start()
        sys.exit(0)

    signal.signal(signal.SIGTERM, do_shutdown)

    # Serve the Prometheus metrics on their own port.
    prometheus_client.start_http_server(9999)

    httpd = http.server.HTTPServer(('0.0.0.0', 8080), MyWebpage)
    httpd.serve_forever()
Are you still with me? At some point I’m only going to show you the snippets I’m actually changing, as the application is becoming quite large. Can you see all the changes I’ve made? It’s not a lot, but you get a huge amount of value out of this! Let’s check it out, rebuild and redeploy:
$ docker build -t my_webserver:0.9 .
...

$ kubectl delete pod mywebserver
pod "mywebserver" deleted

$ kubectl run mywebserver --image=my_webserver:0.9 --image-pull-policy=Never --restart=Never --port=8080
pod "mywebserver" created
But how do we access those metrics? Remember the kubectl expose
command we used in the very first installment of this series? We’re going to use that one again to expose port 9999 as well!
$ kubectl expose pod mywebserver --port 9999 --type=NodePort --name=mywebserver-metrics
service "mywebserver-metrics" exposed
First we need to generate some statistics; try something like this:
$ for x in $(seq 100); do curl -s $(minikube service mywebserver --url) > /dev/null; done
This visits the webpage 100 times in a row. It’s not a good test, but good enough for now. When it’s done, we can check the metrics like so:
$ curl $(minikube service mywebserver-metrics --url)
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 79060992.0
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 17100800.0
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1507581925.57
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.19
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 7.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 65536.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="6",patchlevel="2",version="3.6.2"} 1.0
# HELP my_webpage_request_latency_seconds Time it took to process incoming HTTP requests, in seconds.
# TYPE my_webpage_request_latency_seconds histogram
my_webpage_request_latency_seconds_bucket{le="0.005"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.01"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.025"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.05"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.075"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.1"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.25"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.5"} 100.0
my_webpage_request_latency_seconds_bucket{le="0.75"} 100.0
my_webpage_request_latency_seconds_bucket{le="1.0"} 100.0
my_webpage_request_latency_seconds_bucket{le="2.5"} 100.0
my_webpage_request_latency_seconds_bucket{le="5.0"} 100.0
my_webpage_request_latency_seconds_bucket{le="7.5"} 100.0
my_webpage_request_latency_seconds_bucket{le="10.0"} 100.0
my_webpage_request_latency_seconds_bucket{le="+Inf"} 100.0
my_webpage_request_latency_seconds_count 100.0
my_webpage_request_latency_seconds_sum 0.038089960002253065
Wow! That’s a very long list of metrics created by adding just a few lines of code… I’m not going to explain all the lines (why would I need to, they are documented!), but I will point out that you can see the Histogram at the bottom. That’s the one we created. And it’s not very interesting: all the requests completed within 0.005 seconds. Nice, but boring.
So to illustrate what’s happening here, I modified my own version of the code to introduce some random delays, just so you can see how it would look in a more life-like situation:
# HELP my_webpage_request_latency_seconds Time it took to process incoming HTTP requests, in seconds.
# TYPE my_webpage_request_latency_seconds histogram
my_webpage_request_latency_seconds_bucket{le="0.005"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.01"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.025"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.05"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.075"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.1"} 48.0
my_webpage_request_latency_seconds_bucket{le="0.25"} 49.0
my_webpage_request_latency_seconds_bucket{le="0.5"} 49.0
my_webpage_request_latency_seconds_bucket{le="0.75"} 52.0
my_webpage_request_latency_seconds_bucket{le="1.0"} 54.0
my_webpage_request_latency_seconds_bucket{le="2.5"} 67.0
my_webpage_request_latency_seconds_bucket{le="5.0"} 89.0
my_webpage_request_latency_seconds_bucket{le="7.5"} 98.0
my_webpage_request_latency_seconds_bucket{le="10.0"} 99.0
my_webpage_request_latency_seconds_bucket{le="+Inf"} 100.0
my_webpage_request_latency_seconds_count 100.0
my_webpage_request_latency_seconds_sum 178.68813334600782
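If you want to reproduce a similar spread yourself, a tweak roughly along these lines will do it; the delay range is arbitrary and only serves to stretch out the latencies:

...
import random
import time
...
    @REQUEST_LATENCY.time()
    def do_GET(s):
        # Artificial random delay, purely for demonstration purposes.
        time.sleep(random.uniform(0, 10))
        s.send_response(200)
...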
As you can see from that histogram, 54 out of the 100 requests were completed in one second or less, and 89 out of 100 completed within 5 seconds. Note that the buckets are cumulative: each le bucket counts every request at or below that latency, so for example 67 - 54 = 13 requests took between 1 and 2.5 seconds. Sadly, although Kubernetes and Prometheus work nicely together out of the box, actually setting up Prometheus to scrape our little application is a bit more work than what I want to cover in this series. So I’m going to leave that as an exercise for the reader! But if you’d like a sidebar-like post explaining it, let me know (details below) and I’ll consider it.
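For completeness: once you do have Prometheus scraping the Pod, a query along these lines would give you, say, the 95th-percentile request latency over the last five minutes (this is just a sketch; with multiple replicas you would typically aggregate the per-Pod rates by the le label first):

histogram_quantile(0.95, rate(my_webpage_request_latency_seconds_bucket[5m]))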
Wrapping up
For now, let’s recap:
- As logs for Pods are fairly short-lived (and for other reasons we only mentioned briefly), it’s good practice to aggregate the logs in a central location
- As the EFK stack (Elasticsearch/Fluentd/Kibana) is commonly used for that, it makes sense to present the log data as JSON
- Prometheus is very good at collecting metrics from Kubernetes Pods and we can add it to our Python application fairly easily by exposing another HTTP port for serving those metrics
As always, please let me know (via @timstoop or via our company Twitter @kumina_bv) if you liked it!