October 26th, 2014 | Tags: ,

I have just installed the Crayon Syntax Highlighter plugin on this blog. Obviously a lot of the posts that I put here contain code snippets, so this great plugin will make them a little neater with features such as a line number toggle, syntax highlighting, copy to clipboard and more. Over time I hope to update the existing content on the site to make use of the plugin.

October 18th, 2014 | Tags: , , ,

In this post we look at how text data can be transposed in a shell script. Suppose you have a comma-delimited text file (csv) which looks like this:


2014-10-01,Reading1,20.3
2014-10-01,Reading2,21.5
2014-10-01,Reading3,24.0
2014-10-01,Reading4,22.2
2014-10-02,Reading1,20.5
2014-10-02,Reading2,21.5
2014-10-02,Reading3,24.1
2014-10-02,Reading4,22.4
2014-10-03,Reading1,20.5
2014-10-03,Reading2,21.7
2014-10-03,Reading3,24.2
2014-10-03,Reading4,22.5

…and so on. Perhaps this is a set of sensor readings over a period of time, and in this case there are four readings per day. For further analysis it might be more suitable to store each date on a single line with the four readings as columns. In other words we want to transpose rows to columns, i.e. pivot the values on date. The file should look like this:


2014-10-01,20.3,21.5,24.0,22.2
2014-10-02,20.5,21.5,24.1,22.4
2014-10-03,20.5,21.7,24.2,22.5

Since this needs to process multiple input rows to produce one output row, sed will not be suitable. Instead we need to use awk. A tiny script along the lines shown below will do the trick.
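
Assuming the data lives in a file called readings.csv, something like the following should do it:

awk -F, '{
  val = val "," $3          # append this row's reading (third field) to the running list
  if (NR % 4 == 0) {        # every fourth row completes one day
    print $1 val            # print the date followed by the four readings
    val = ""                # reset for the next day
  }
}' readings.csv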

Now, what exactly is this doing? First, we need to tell awk that the file is comma delimited, which is what -F, does. Next, the main principle is that the code between the curly braces is executed once for each row, but variables persist across the processing of the entire input. So val is a variable into which we append the third field of each row ($3), prepended by a comma. The if statement checks whether the row number (NR is a special built-in variable holding the number of the row currently being processed) is divisible by four (% is the modulo operator, as in most languages). If it is, we print the date, which is the first column ($1), followed by the val variable, which at that point holds the values from the previous three rows as well as the current one, separated by commas. The variable is then reset.

Obviously, we are making an assumption here that the data is uniform, i.e. that there are exactly four readings available for each day; otherwise the script would be a little more complex.
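
For instance, a sketch that does not rely on a fixed number of readings per day, but instead prints a line whenever the date changes (the input still needs to be sorted by date), could look something like this:

awk -F, '
  $1 != prev {                        # the date has changed
    if (prev != "") print prev vals   # flush the previous date and its readings
    prev = $1; vals = ""
  }
  { vals = vals "," $3 }              # accumulate readings for the current date
  END { if (prev != "") print prev vals }
' readings.csv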

October 18th, 2014 | Tags: , ,

Suppose you have a monthly process to archive some data, such as log files. Each month a separate archive file is created, so after a few months you will have several archive files – for example as shown below:


archive.2014-08.tar.gz
archive.2014-09.tar.gz
archive.2014-10.tar.gz

Now if you wish to extract your data from all three files, you could run individual commands such as:


tar -zxvf archive.2014-08.tar.gz
tar -zxvf archive.2014-09.tar.gz
tar -zxvf archive.2014-10.tar.gz

This works fine. However, the following won’t work if you want a one-liner which does them all in one shot:


tar -xzvf *.tar.gz
tar: archive.2014-09.tar.gz: Not found in archive
tar: archive.2014-10.tar.gz: Not found in archive

This is down to the way the tar command works: if more than one filename (or a wildcard expansion) is passed as an argument, tar looks for the second, third, etc. files inside the first file. So the error is saying that it cannot find the second and third archive files inside the first one (archive.2014-08.tar.gz). In essence, the tar command really does have to be called individually for each file. The way to squeeze this into a one-liner is to use the xargs command, which will do exactly that:


ls *.tar.gz | xargs -n1 tar -xzvf
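
Alternatively, if you prefer not to use xargs, a simple shell loop does the same job:

for f in *.tar.gz; do
  tar -xzvf "$f"        # extract each archive in turn
done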

October 25th, 2013 | Tags: , ,

FPrint is a PPA with packages for fingerprint-based authentication. The website includes good documentation on how to install and set it up.

October 16th, 2013 | Tags: , , ,

This is a great one-liner which removes old kernel images and frees up space in your boot partition:


sudo apt-get purge $(dpkg -l linux-{image,headers}-"[0-9]*" | awk '/ii/{print $2}' | grep -ve "$(uname -r | sed -r 's/-[a-z]+//')")

This comes from the top answer to a question on Ask Ubuntu.
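
If you would like to check which kernel packages would be removed before actually purging anything, the inner part of the command can be run on its own first:

dpkg -l linux-{image,headers}-"[0-9]*" | awk '/ii/{print $2}' | grep -ve "$(uname -r | sed -r 's/-[a-z]+//')"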

September 20th, 2013 | Tags:

Here’s an article on Lego’s Land on how to reorder accounts in the left pane of Thunderbird.

August 16th, 2012 | Tags: , ,

The Nokia N900 is getting a little old now, but it is still an amazing piece of kit. This post has a few pointers for making the most of the command line usage it enables.

  • Command Line Execution Widget: This widget lets you run commands from the desktop and outputs the results.
  • Cmd Shortcuts: Allows you to quickly run your own defined commands.
  • gPodder: This excellent podcast catcher comes with a command line interface. Run gpo from the terminal to get the full list of options.
  • FeedingIt: This RSS aggregator can be run from the command line. The known options are /usr/bin/FeedingIt update and /usr/bin/FeedingIt status.
  • Alarmed: This is a graphical interface to the cron scheduler. Apart from neat access to phone functionality (e.g. switching profiles, networking and, yes, alarms) it also allows arbitrary shell commands to be scheduled, which is particularly useful with gPodder and FeedingIt mentioned above (see the example below). To use this tool you currently need to enable the testing repository.
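
For example, the FeedingIt update command mentioned above could be scheduled to run every morning. Alarmed sets this up through its GUI, but the equivalent crontab entry (shown here purely for illustration) would be:

0 6 * * * /usr/bin/FeedingIt update    # refresh the RSS feeds at 06:00 every day
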
August 11th, 2012 | Tags: , , ,

Most digital cameras store Exif data in the JPEG photo files. This includes things like date and time, camera model, camera settings and in some cases even GPS coordinates. jhead is a very useful command line utility which can read and edit the Exif data. For example, you may wish to remove the data from photos published online.
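
For example, viewing and stripping the metadata can be done along these lines (it is worth checking the option names against the jhead documentation for your version):

jhead photo.jpg         # display the Exif data of a single file
jhead -purejpg *.jpg    # strip all non-image metadata from every JPEG in the directory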

Another useful thing is to rename photos using the date and time information stored in Exif (note: read the documentation before running):

jhead -n%Y%m%d-%H%M%S *.jpg

And best of all jhead is in the Ubuntu universe repository, so installing it is as simple as:

sudo apt-get install jhead

April 20th, 2012 | Tags:

If you are evaluating reporting or analytics tools, or just like to mess about with them, it’s always good to get your hands on some “real world” data sets. It sure beats using the Steel Wheels, Classic Models, eFashion etc. sample databases typically shipped with the products.

For a truly comprehensive list of free data sets, check out this page on Quora. Below are some of my favourites:

The following sites are also worth following on a more ongoing basis:

February 4th, 2012 | Tags: ,

Steps to install Oracle Express Edition (XE) database 10g on Ubuntu 11.10 (Oneiric).

  1. Download the Oracle XE deb package (free registration is required).
  2. Double click the downloaded file and select to install it.
  3. In terminal run sudo /etc/init.d/oracle-xe configure.
  4. You will be prompted to enter the following parameters: HTTP port number, database listener port number, a password for the SYSTEM and SYS database accounts, and whether the service should be started on boot.
  5. The configuration itself might then take a few minutes. That’s it. To start the service in the future run sudo /etc/init.d/oracle-xe start, and to stop it run sudo /etc/init.d/oracle-xe stop (see the summary below).
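
For reference, here are the service commands from the steps above in one place:

sudo /etc/init.d/oracle-xe configure   # one-off interactive setup (ports, passwords, start on boot)
sudo /etc/init.d/oracle-xe start       # start the database
sudo /etc/init.d/oracle-xe stop        # stop the database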