Wednesday, February 17, 2016

Visualization of live data streams with gnuplot and bash


This article describes a tiny framework, consisting mainly of bash scripts, for feeding gnuplot with live data. Several examples are presented to cover general use cases and usage scenarios.

 

Foreword


This article was written and formatted for publication in Linux Journal, because they showed interest after I sent them a draft. I sent them the final version more than a month ago and they didn't even bother to reply - not a single word! It's a bit frustrating because I spent so much time formatting it the way LJ wanted, generating the images as PDFs to make them LJ conformant, adjusting the code to fit into 52 columns and so on. Well, duck them, lesson learned. I'm not going to wait any longer, so I'm publishing it here. Enjoy.

 

Introduction


It all started with an attempt to solve strange aperiodic Wi-Fi throughput drops between my Linux computer and my router. At first I thought the problem was related to the crowded 2.4GHz radio band, but after I switched my client to the 5GHz band, where I was alone, the problem persisted. My next thought was that the router's CPU was too weak to keep up with the transfer rate, so that the network throughput dropped whenever it started some CPU intensive operation. So I needed a way to visualize the router's CPU load and Wi-Fi interface throughput at the same time over a relatively long period of time, say 10-15 minutes.

Since I have busybox and dropbear installed on my router, I could run simple scripts on it to measure its CPU load and network throughput and transfer the results over an open ssh session to my client PC for visualization. Of course the ssh session runs over the same Wi-Fi channel and adds to the router's overall throughput, but its contribution is negligible.

The scripts to measure network interface throughput and CPU load were relatively easy to write, but how do I visualize the data streams in real time on my Linux host? Well, of course the first thing that came to my mind was gnuplot. The program is well known, powerful and supported on different platforms. The only problem is that gnuplot doesn't offer a dynamic plot update capability out of the box, so additional tools are needed. A perl implementation for feeding gnuplot with live data that I found on the Internet was not exactly what I needed, so I decided to write my own.

 

Prerequisites and statements

  • This article is not a bash or gnuplot tutorial; the reader should be familiar with bash shell syntax and with gnuplot basics. Knowledge of the AWK scripting language is desirable but not necessary to use the scripts described in the article.
  • The gnuplot installation must support one of the 'x11', 'wxt' or 'qt' terminal device types. On Ubuntu based systems it is enough to install the gnuplot-x11 or gnuplot5-x11 package. Alternatively one could install the gnuplot-qt or gnuplot5-qt package and use the 'qt' terminal device type by adjusting the gnuplotwindow.sh script described in the article.
  • An AWK interpreter must be installed. All scripts are proven to work with GNU AWK.
  • All scripts were only tested on GNU/Linux platforms. Support for other UNIX platforms is possible but isn't guaranteed.

 

CPU load and a network interface throughput scripts


I will start by presenting two scripts that measure CPU load and network interface throughput, because I will use them for demonstration purposes throughout the article.

 

CPU load measurement script

#!/bin/sh

ec='echo -e';[ -n "$($ec)" ] && ec='echo'
cpu="cpu$1 "
oe=0

while [ $oe -eq 0 ]; do
  res=$(awk -v cpu="$cpu" '{if(match($0,cpu)) {
                             for(i=2;i<=NF;++i){s+=$i}
                             printf "%.0f;%.0f\n",s,$5+$6;
                            }}' /proc/stat)
  total=${res%%;*}
  idle=${res#*;}
  [ -n "$prevtotal" ] && {
    totaldiff=$((total-prevtotal))
    s=$((100*(totaldiff-idle+previdle)))
    [ $totaldiff -ne 0 ] && s=$((s/totaldiff))
    if [ -t 1 ]; then
      $ec -n "\033[s$s\033[K\033[u"
    else
      echo "$s"
    fi
    oe=$?
  }

  prevtotal=$total
  previdle=$idle
  sleep 1
done
Listing 1 - cpustat.sh

The script takes a CPU core number as its input parameter and outputs that core's load in percent once per second. If no argument is given, the overall CPU load is reported.
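
The arithmetic inside the loop of Listing 1 is easier to follow with fixed numbers. Below is a minimal sketch of it; the function name cpu_load and the sample values are made up for illustration, and "idle" stands for the sum of the idle and iowait counters that the script reads from /proc/stat.

```shell
#!/bin/sh
# cpu_load PREV_TOTAL PREV_IDLE TOTAL IDLE   (hypothetical helper)
# Prints the load in percent between two samples of the cpu counters.
cpu_load() {
  totaldiff=$(($3 - $1))   # all jiffies elapsed between the samples
  idlediff=$(($4 - $2))    # jiffies spent idle in the same interval
  # no jiffies elapsed: report 0 instead of dividing by zero
  [ "$totaldiff" -gt 0 ] || { echo 0; return; }
  echo $((100 * (totaldiff - idlediff) / totaldiff))
}

# 100 jiffies elapsed, 50 of them idle
cpu_load 1000 800 1100 850
```

With these numbers half of the interval was spent idle, so the call prints 50.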

 

Network interface throughput measurement script

#!/bin/sh

# workaround for the dash not handling -e correctly
ec='echo -e';[ -n "$($ec)" ] && ec='echo'

f="/sys/class/net/${1:-wlan0}/statistics"
otx=$(cat "$f/tx_bytes")
orx=$(cat "$f/rx_bytes")
oe=0

while [ $oe -eq 0 ]; do
  tx=$(cat "$f/tx_bytes")
  rx=$(cat "$f/rx_bytes")
  txd=$((tx-otx))
  rxd=$((rx-orx))
  [ $txd -lt 0 ] && txd=$((txd+4294967296))
  [ $rxd -lt 0 ] && rxd=$((rxd+4294967296))
  if [ -t 1 ]; then
    $ec -n "\033[s$((txd+rxd)) ${2+$txd} ${2+$rxd}\033[K\033[u"
  else
    echo "$((txd+rxd)) ${2+$txd} ${2+$rxd}"
  fi
  oe=$?
  otx=$tx
  orx=$rx
  sleep 1
done
Listing 2 - ifacestat.sh

The script takes a network interface name as its first input parameter and outputs the overall interface throughput (in bytes) once per second. If a second argument (any value) is given, it additionally outputs the transmit and receive statistics.

Both scripts check whether they are connected directly to a terminal and, if so, always write their output onto the same line.

There are two things worth mentioning here:
  1. To use terminal control escape sequences I need echo's '-e' option. At the same time I'm forced to use the '#!/bin/sh' interpreter so that the scripts can run on the router with busybox. But the default 'sh' interpreter on my computer is 'dash', whose echo does not understand the '-e' flag. So a workaround for the 'echo' command is added to make the scripts work on both platforms.
  2. The 'while' loop termination condition is a check for an output error. At first I just defined an infinite loop with the 'while true' condition, which works perfectly if the script is started locally. But when I started the script remotely over an ssh session, it would keep running there even after the ssh connection was closed. Previously I used a check for the existence of the script's standard output descriptor, but it turned out not to work reliably in all cases. A much easier solution is just to check the return status of the echo command: if it cannot output, the script should stop.
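
The second point can be tried out locally without a router. The following is only a sketch: the function name count_until_write_fails and the safety cap are made up for the demo, and SIGPIPE is ignored on purpose so that the loop is ended by the echo status rather than by the signal.

```shell
#!/bin/sh
# Hypothetical demo: print a counter until the write fails.
# $1 is a safety cap on the number of iterations (default 1000).
count_until_write_fails() {
  n=0
  oe=0
  while [ "$oe" -eq 0 ] && [ "$n" -lt "${1:-1000}" ]; do
    n=$((n + 1))
    echo "$n" 2>/dev/null
    oe=$?    # becomes non-zero once nobody reads the output anymore
  done
}

# head exits after three lines; the next failed write stops the loop
(trap '' PIPE; count_until_write_fails) | head -n 3
```

The pipeline prints 1, 2 and 3 and then returns to the prompt instead of running on.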

 

Feed gnuplot with live data


Some implementations I've seen use a temp file to store the live data to be displayed by gnuplot. This is really not a straightforward solution because two processes would access the same file simultaneously, which will eventually cause conflicts. A better solution is to take a data source, pipe it through a filter or a chain of filters, format the data so that gnuplot understands it and finally pipe it to gnuplot. The gnuplotwindow.sh script presented below formats the incoming data and pipes the specified number of last data samples to gnuplot:
#!/bin/bash

terminal="wxt"     # terminal type (x11,wxt,qt)
winsize=${1:-60}   # number of samples to show
yrange=${2:-0:100} # min:max values of displayed y range.
                   # ":" for +/- infinity. Default "0:100"
shift;shift        # the rest are the titles

styles_def=( "filledcurves x1" "boxes" "lines" )
# remove the color adjustment line below to get
# default gnuplot colors for the first six plots
colors_def=("red" "blue" "green" "yellow" "cyan" "magenta")
colors=( "${colors_def[@]}" )

# parsing input plots descriptions
i=0
IFS=$';'
while [ -n "$1" ]; do
  tmparr=( $1 )
  titles[$i]=${tmparr[0]}
  styles[$i]=${styles_def[${tmparr[1]}]-${styles_def[0]}}
  colors[$i]=${tmparr[2]-${colors_def[$i]}}
  i=$((i+1))
  shift
done

IFS=$'\n'
samples=0          # samples counter
(while read newLine; do
  [ -n "$newLine" ] && {
    nf=$(echo "$newLine"|awk '{print NF}')
    a=("${a[@]}" "$newLine") # add to the end
    [ "${#a[@]}" -gt $winsize ] && {
      a=("${a[@]:1}") # pop from the front
      samples=$((samples + 1))
    }
    echo "set term $(echo $terminal) noraise"
    echo "set yrange [$yrange]"
    echo "set xrange [${samples}:$((samples+${#a[@]}-1))]"
    echo "set style fill transparent solid 0.5"
    echo -n "plot "
    for ((j=0;j < $nf;++j)); do
      echo -n " '-' u 1:$((j+2)) t '${titles[$j]}' "
      echo -n "w ${styles[$j]-${styles_def[0]}} "
      [ -n "${colors[$j]}" ] && echo -n "lc rgb '${colors[$j]}'"
      echo -n ","
    done
    echo
    for ((j=0;j < $nf;++j)); do
      tc=0 # temp counter
      for i in ${a[@]}; do
        echo "$((samples+tc)) $i"
        tc=$((tc+1))
      done
      echo e # gnuplot's end of dataset marker
    done
  }
done) | gnuplot 2>/dev/null
Listing 3 - gnuplotwindow.sh

The script has the following input parameters:
  • the number of last samples to show (default is 60). If data is fed once per second then the x axis represents the number of seconds since the script was started
  • y axis min and max values formatted as "min:max". Either min or max, or both, can be omitted
  • the rest are the descriptions of each data column in the form "Title;Style_index;Color". Here:
    • Title -- the legend displayed on the graph
    • Style_index -- the index into an array of predefined line styles. Currently the following styles are supported:
      Index  Line style     Default
      0      Filled curves  yes
      1      Boxes          no
      2      Lines          no
    • Color -- a color given either as a color name supported by gnuplot (e.g. red) or as a hex code (e.g. #ff007d)
Note: since the semicolon is used as the separator, it cannot appear in the Title.

If colors aren't given explicitly as input parameters, a predefined color sequence is used for the first six plots because I didn't like the default one. To get the gnuplot defaults, either change the colors_def array or remove the color adjustment line completely, as explained in the script.

A note from the gnuplot change log:
"In version 5 a default overall color sequence can be selected using "set colors {default|classic|podo}". The "classic" sequence is red/green/blue/magenta/cyan/yellow as used by older gnuplot versions. The default and podo colors are chosen to be more easily distinguished in print and in particular by people with color vision problems."
So if, for example, you want to use the 'podo' color sequence, add "set colors podo" to your gnuplot initialization file (~/.gnuplot on Unix-like systems) and remove the color adjustment line from the script.

The number of input data streams is not limited by the script but may be limited by the maximum number of plots gnuplot can put onto one graph. Simply put, the script reads a new line from its standard input and creates as many plots as there are whitespace-separated (i.e. tab or space) records in that line.

Some parameters, like gnuplot's terminal type, can be adjusted by editing the script.
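
To see what actually travels down the pipe to gnuplot, here is a stripped-down sketch that prints a single frame for a few rows. The helper name print_frame and its argument (the x value of the first row) are made up for illustration; the terminal, range and style settings of Listing 3 are left out, and every plot simply uses lines.

```shell
#!/bin/sh
# print_frame FIRST_X   (hypothetical helper)
# Reads rows from stdin and prints one gnuplot frame for them.
print_frame() {
  awk -v first="${1:-0}" '
    { row[NR] = $0; if (NF > nf) nf = NF }
    END {
      # one plot per data column, all reading inline data
      printf "plot"
      for (j = 1; j <= nf; j++)
        printf "%s \047-\047 u 1:%d w lines", (j > 1 ? "," : ""), j + 1
      printf "\n"
      # the inline data is repeated once per plot, each copy closed by "e"
      for (j = 1; j <= nf; j++) {
        for (i = 1; i <= NR; i++) print first + i - 1, row[i]
        print "e"
      }
    }'
}

printf '10 1\n20 2\n' | print_frame 5
```

The output starts with a plot command naming '-' twice, followed by two identical copies of the rows (prefixed with x values 5 and 6), each terminated by a line containing only e.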

gnuplotwindow.sh usage examples and other useful scripts


To show how to use the script I'll start with easy examples and continue with more advanced ones. I'll present additional scripts for different use cases that make gnuplotwindow.sh even more powerful.

 

Plot 'wlan0' interface overall throughput

$ ifacestat.sh 'wlan0' | gnuplotwindow.sh 60 "0:" "wlan0"
Figure 1 - wlan0 interface overall throughput
In the same way I can start ifacestat.sh on a remote host over an ssh session and display the stream locally:
$ ssh root@router 'ifacestat.sh "wlan0"' | \
  gnuplotwindow.sh 60 "0:" "wlan0"

 

Plot 'wlan0' interface overall throughput and its average


To add an average of a data column an additional script is needed:
#!/bin/bash

avs=${1:-10}    # average over so many last samples,
                # default is 10 
column=${2:-1}  # read new value from this column

awk -v avs=$avs -v col=$column '
BEGIN {
  sum=0;start=0;end=0
}
{
  if (match($0,/^.+$/)) {
    a[end] = $col
    end++;
    if (end>start+avs) {
      sum-=a[start];
      delete a[start];
      start++;
    } 
    sum+=$col;
    print $0 " " sum/avs;
    fflush()
  }
}' -
Listing 4 - addaverage.sh

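The script's input parameters are:
  • the number of last samples to average over. Default is 10. The sum is always divided by this number, so the average ramps up until that many samples have arrived
  • the column to calculate the average for. Default is 1 (the first column)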
$ ifacestat.sh 'wlan0' | addaverage.sh | \
  gnuplotwindow.sh 60 "0:" "wlan0" "wlan0 average" 
Figure 2 - wlan0 interface overall throughput and its average
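
The behaviour of the average while the window is still filling is easiest to see on a tiny input. This is only a condensed sketch of the same running-sum idea with a ring buffer; the function name moving_average is made up, it always averages the first column, and like Listing 4 it divides by the full window size from the very first sample.

```shell
#!/bin/sh
# moving_average [WINDOW]   (hypothetical helper)
# Appends the average of the last WINDOW values of column 1 to each line.
moving_average() {
  awk -v n="${1:-10}" '
    NF {
      i = count++ % n       # slot of the oldest sample in the ring buffer
      sum += $1 - buf[i]    # add the new value, drop the oldest (0 while filling)
      buf[i] = $1
      print $0, sum / n
      fflush()
    }'
}

printf '3\n6\n9\n12\n' | moving_average 3
```

For a window of 3 this prints "3 1", "6 3", "9 6" and "12 9": the first two averages are still ramping up, and from the third line on the value is the true mean of the last three samples.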

 

Remove unwanted data columns from a stream


There are situations when a stream contains undesired data columns along with the information I'm interested in. In this case the unwanted columns can be removed using the following script:
#!/bin/bash

# input parameter is the columns to be removed
# separated by comma - "1,3,5"

awk -v cols="$1" '
BEGIN {split(cols,a,",")}
{
  for(c in a)
    $a[c]="";
  print $0;
  fflush();
}' -
Listing 5 - removecolumns.sh

The script's input parameter is the list of columns to be removed, separated by commas.

As an example I will display the 'wlan0' interface Tx and Rx throughput averages. This time I'm not interested in the original data and only want to display the averages for Tx and Rx.

But first, for testing purposes, I'll run a part of the pipeline:
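
Listing 5 blanks the unwanted fields, which leaves extra spaces in the line; that is harmless here because the consumers split on whitespace. If a tidier output is preferred, a variant could rebuild the line from the kept fields only. The following is a sketch of that alternative, not the script from the repository, and the function name remove_columns is made up:

```shell
#!/bin/sh
# remove_columns "1,3,5"   (hypothetical variant of Listing 5)
# Prints every input line without the listed columns.
remove_columns() {
  awk -v cols="$1" '
    BEGIN { n = split(cols, a, ","); for (i = 1; i <= n; i++) drop[a[i]] = 1 }
    {
      out = ""
      for (f = 1; f <= NF; f++)
        if (!(f in drop)) out = out (out == "" ? "" : " ") $f
      print out
      fflush()
    }'
}

echo "a b c d e" | remove_columns 1,2,3
```

The example prints "d e" with single spaces between the remaining fields.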
$ ifacestat.sh 'wlan0' any | addaverage.sh 10 2 | \
  addaverage.sh 10 3
0 0 0 0 0
280493 5577 274916 557.7 27491.6
278470 3686 274784 926.3 54970
279330 4546 274784 1380.9 82448.4
209992 3904 206088 2628.4 149587
^C
The first column is the overall throughput, followed by Tx, Rx, Tx average and Rx average. Here I'm only interested in the last two columns, so the first three columns need to be removed:
$ ifacestat.sh 'wlan0' any | addaverage.sh 10 2 | \
 addaverage.sh 10 3 | removecolumns.sh '1,2,3' | \
 gnuplotwindow.sh 300 "0:" "Tx average" "Rx average"
[Image]
Figure 3 - wlan0 interface tx and rx average
This time I'm displaying the last 300 samples or, in this case, seconds. Rx throughput drops with a period of 120 seconds are plainly visible.

 

Plot asynchronous streams of data


What if one wants to display several independent data streams? In this case the streams' data has to be combined into one line somehow.

Using the shell's ampersand operator it is possible to run several processes asynchronously. As an example I'll combine the CPU0 and CPU1 loads:
$ (cpustat.sh 0 & cpustat.sh 1) | cat -
17
17
22
16
^C
But it is not clear from this output which stream is which. The solution is easy - just mark all streams with the following one-liner:
#!/bin/bash

awk -v c="$1" '$0 ~ /^.+$/ {print c":"$0;fflush()}' -
Listing 6 - markpipe.sh

The script's input parameter is an index to be assigned to the data stream. Indices must start at 0:
$ ((cpustat.sh 0 | markpipe.sh 0) & \
  (cpustat.sh 1 | markpipe.sh 1)) | cat -
0:21
1:18
0:20
1:16
0:17
1:19
^C
Now I know which line belongs to which stream. But there's no guarantee that all streams output at the same frequency! Taking this into account, the combinepipes.sh script presented below combines marked data streams into one line which can be fed to the gnuplotwindow.sh script:
#!/bin/bash

# by default 2 pipes are combined
pipes=${1:-2}

awk -F: -v pipes=$pipes '
{
  a[$1]=$2;
  cnt=0;
  for (i in a) cnt++;
  if(cnt==pipes) {
    i=0;
    while(i < cnt) {
      printf("%s ", a[i])
      delete a[i++]
    }
    printf("\n")
    fflush()
  }
}' -
Listing 7 - combinepipes.sh

The script's input parameter is the number of streams to combine. The script waits until data from all streams is available and then outputs a line. Data from the stream marked with 0 goes to the first column and so on:
$ ((cpustat.sh 0 | markpipe.sh 0) & \
   (cpustat.sh 1 | markpipe.sh 1)) | \
  combinepipes.sh 2
17 20 
17 18 
21 23 
13 19 
^C
And now this output can be piped to the gnuplotwindow.sh:
$ ((cpustat.sh 0 | markpipe.sh 0) & \
   (cpustat.sh 1 | markpipe.sh 1)) | \
  combinepipes.sh 2 | \
  gnuplotwindow.sh 60 "0:100" "CPU0 load" "CPU1 load"
 
[Image]
Figure 4 - CPU0 and CPU1 load
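
What happens when one marked stream is faster than the other can be checked with a fixed input instead of live data. The sketch below follows the same idea as Listing 7 (keep the latest value per stream, emit a line once every stream has reported); the function name combine_marked is made up for illustration.

```shell
#!/bin/sh
# combine_marked [STREAMS]   (hypothetical helper)
# Combines "index:value" lines into one line per complete set of streams.
combine_marked() {
  awk -F: -v pipes="${1:-2}" '
    {
      if (!($1 in last)) seen++
      last[$1] = $2           # a newer value overwrites an unsent older one
      if (seen == pipes) {
        line = ""
        for (i = 0; i < pipes; i++) line = line last[i] " "
        print line
        fflush()
        split("", last)       # forget everything and wait for the next set
        seen = 0
      }
    }'
}

# stream 0 reports twice before stream 1 reports for the first time
printf '0:1\n0:2\n1:9\n0:3\n1:8\n' | combine_marked 2
```

This prints "2 9 " and "3 8 ": the value 1 from stream 0 never makes it to the output because it was overwritten before stream 1 delivered its first sample.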

 

Synchronize streams of data


In the example above both streams output once per second, but what would happen if one of the streams output only once per five seconds? In this case combinepipes.sh would output once per five seconds too -- at the frequency of the slowest stream! This would mean that some data from the faster streams would be lost! To "synchronize" streams of data (i.e. make a stream output at the desired frequency) the following perl script can be used:
#!/usr/bin/env perl

use strict;
use warnings;
use Time::HiRes qw(setitimer ITIMER_REAL);

use IO::File;
STDOUT->autoflush(1);
STDERR->autoflush(1);

die "Usage: $0 timeout_in_sec [default value]\n" if @ARGV < 1;

my $lastStr;
if (defined $ARGV[1]) {$lastStr = "$ARGV[1]\n";}

$SIG{ALRM} = sub {
  if (defined $lastStr) {print "$lastStr";}
};

setitimer(ITIMER_REAL, $ARGV[0], $ARGV[0]);

while(<STDIN>)
{
  $lastStr = $_;
}
Listing 8 - syncpipe.pl

The script's input parameters are:
  • the interval in seconds (can be fractional) at which the cached value is printed
  • a default value, printed until the first line arrives from the stream
The default value parameter is needed because if a stream outputs for the first time after, say, one minute, combinepipes.sh has to wait one minute until data from all streams is available. That means that data from the faster processes would be lost for the first minute. This script is written in perl simply because it would not be possible to do it in bash or AWK.

To demonstrate the synchronization of several streams I'll do the following: I'll combine outputs of three processes that output once per second, three times per second and once per five seconds. Let's assume that the minimum output frequency I'm interested in is once per second:
$ ((cpustat.sh|markpipe.sh 0) & (while sleep 0.33; \
   do echo 40;done|markpipe.sh 1) & \
   (while sleep 5;do echo 60;done|syncpipe.pl 1 20|\
   markpipe.sh 2))|combinepipes.sh 3 | \
  gnuplotwindow.sh 60 "0:100" "CPU load" "40" "60" 

[Image]
Figure 5 - Synchronized data streams

The slowest process, which outputs once per five seconds, is synchronized to output the value 60 once per second, with 20 as the default value. If no default value were specified, the first five seconds would not appear on the plot at all. The faster process, which outputs the value 40 three times per second, is synchronized automatically by the combinepipes.sh script because it outputs at the frequency of the slowest stream, which is once per second in this case. That's why it is not necessary to synchronize it additionally using syncpipe.pl. As a result the final output frequency is once per second.

 

Plot with different line styles and colors


In this example I'll show how to plot streams with different line styles and colors. I'll also show how to perform additional computations on a data stream and add the result as a new data column:
$ (while sleep 1; do \
   cat /sys/class/thermal/thermal_zone0/temp; done)\
   |awk 'BEGIN{f=0} {x=$1/1000; \
         if(f==1){print x" "x-xold}; \
         f=1;xold=x;fflush()}'| \
   bin/gp/gnuplotwindow.sh 60 "-5:60" "CPU Temp;2" \
   "CPU Temp derivative;1;#008000"
In the example above I'm displaying the CPU temperature as a normal line with the default color (red), and I'm adding the temperature derivative, displayed as bars with the color #008000. The AWK script implements a simple first order high pass FIR filter and, since the temperature is reported in millidegrees Celsius, divides the input value by 1000. Here is the result:

[Image]
Figure 6 - CPU temperature and its derivative
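
The inline AWK filter from this example can be tried on fixed values without a thermal sensor. Below is a small sketch of the same computation wrapped in a function; the name with_derivative and the sample temperatures are made up.

```shell
#!/bin/sh
# with_derivative   (hypothetical helper)
# Reads millidegree values and prints "temperature difference"
# starting with the second sample, like the AWK filter above.
with_derivative() {
  awk '{
    x = $1 / 1000                   # millidegrees to degrees
    if (NR > 1) print x, x - prev   # first order difference
    prev = x
    fflush()
  }'
}

printf '40000\n41000\n40500\n' | with_derivative
```

The three samples give two output lines, "41 1" and "40.5 -0.5", because the first sample only initializes the previous value.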

 

Record and replay data streams


In this last example I'll show how to record data streams into a file and then replay them:
$ ifacestat.sh wlan0 any | \
  awk '{printf("%s %d\n",$0,$2+$3);fflush()}' | \
  removecolumns.sh 2,3 | tee /tmp/out.txt | \
  gnuplotwindow.sh 60 "0:" "wlan overall" "wlan tx+rx"
In the example above the sum of wlan0's Tx and Rx throughputs is calculated by the AWK script and added as an additional column. Since I only want to compare the calculated overall throughput with the one reported by the ifacestat.sh script, I remove the Tx and Rx columns and pipe the result to gnuplotwindow.sh through the tee command, which saves it into the /tmp/out.txt file. If my calculation is correct then both plots should align perfectly:

[Image]
Figure 7 - Calculated and reported 'wlan0' interface overall throughput

As you can see, the calculated and reported overall throughput plots match perfectly.
To replay the result the following command can be used:
$ cat /tmp/out.txt | while read l; do echo $l; sleep 0.1; done | \
gnuplotwindow.sh 60 "0:" "wlan overall" "wlan tx+rx"
I recorded at a frequency of once per second but replay at ten times per second. An animated GIF of the command above is shown below:
Figure 8 - Replayed record
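
If recordings are replayed often, the loop can be wrapped into a small function. This is just a sketch; the name replay and its arguments (file name and delay between lines in seconds) are made up. Using read -r with an empty IFS keeps backslashes and leading spaces of the recorded lines intact.

```shell
#!/bin/sh
# replay FILE [DELAY]   (hypothetical helper)
# Prints FILE line by line, sleeping DELAY seconds after each line.
replay() {
  while IFS= read -r line; do
    printf '%s\n' "$line"
    sleep "${2:-0.1}"
  done < "$1"
}

# usage, assuming the recording from the example above exists:
#   replay /tmp/out.txt 0.1 | gnuplotwindow.sh 60 "0:" "wlan overall" "wlan tx+rx"
```

Fractional delays rely on a sleep implementation that accepts them, which is the case for the GNU and busybox versions.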

 

Known issues


The gnuplot 5.0 'wxt' terminal can crash with a segmentation fault if the incoming stream delivers data too fast. A workaround is to use the 'qt' terminal device type. Additionally, it is not possible to plot faster than 60 frames per second anyway, so if your data stream is faster than that, add 'syncpipe.pl 0.016' to the end of your pipe before feeding gnuplotwindow.sh with data.

 

Conclusion


The presented approach to feeding gnuplot with live data is straightforward, robust and easy to understand. Still, it can be improved by adding different plot modes like 3D or scatter.

In case you are still curious whether I found the reason for the aperiodic Wi-Fi throughput drops I mentioned at the beginning of the article: yes, I actually did. Using the aforementioned ifacestat.sh and cpustat.sh scripts I found out that there are periodic Wi-Fi throughput drops every 120 seconds caused by the Network Manager. I had been suffering from this problem for several years and even filed a bug a year ago, but only recently did I find out that filling in the BSSID value in the wireless station settings of the Network Manager fixes it! Who would know, huh?

All scripts described in the article are also available from my GitHub repository - https://github.com/flux242/dotfiles/tree/master/.bin
