CSV data manipulation in Bash

I used the akw, join and pr to do some csv data manipulation in bash.

I was in the desperate need for converting the csv file of my actual time tracking tool Toggl to a different format I can import. To relieve my weekend pain of manually editing spreadsheets, I wrote a bash script to do this for me.

This is the csv export of Toggl:

User,Email,Client,Project,Task,Description,Billable,Start date,Start time,End date,End time,Duration,Tags,Amount ()
stefan,mail@mail.com,goodclient,299 3021 mytask,"","",No,2013-04-29,08:24:08,2013-04-29,09:18:58,00:54:50,,

This is the format I need:


There is some information I can exclude, some information I need to reformat (eg. time, date) and add some extra stuff like the client id. Here is the script.

[code language=”bash”]

awk -F’,’ ‘NR==1 {$15="Mitarbeiter"} NR>1 {split($8,a,"-");$8=a[2]"/"a[3]"/"a[1];$15="55"}1′ OFS=’;’ input.csv > a.csv;
awk -F’,’ ‘NR==1 {print "Projekt Tät"} NR>1 {print $4}’ input.csv | awk ‘{print $1,$2,""}’ OFS=’;’ > b.csv;
awk -F’,’ ‘NR==1 {print "Von"} NR>1 {print $9}’ input.csv | awk -F’:’ ‘{print $1,$2;}’ OFS=":" > c.csv;
awk -F’,’ ‘NR==1 {print "Bis"} NR>1 {print $11}’ input.csv | awk -F’:’ ‘{print $1,$2;}’ OFS=":" > d.csv;
awk -F’,’ ‘NR==1 {print "Dauer"} NR>1 {print $12}’ input.csv | awk -F’:’ ‘{print $1,$2}’ OFS="." > e.csv;
pr -m -t -s";" a.csv b.csv c.csv d.csv e.csv > f.csv;
LANG=en_EN sort -t ";" -k 2,2 projektliste.csv > projektliste_cens.csv;
LANG=en_EN sort -t ";" -k 16 f.csv > f_cens.csv ;
LANG=en_EN join -t ";" -o 2.8,2.15,2.16,1.4,2.17,2.19,2.20,2.21,2.6 -1 2 -2 16 projektliste_cens.csv f_cens.csv > output.csv;

  • Awk is the perfect command for these kind of tasks. Howevere, I needed to save each step in its own temp file.
  • The first line adds the column “Mitarbeiter” with the according headline and my id. I also reformat the date with “/” rather than “-“.
  • The second line extracts the project and task id from the ‘project column”, which is delimited by spaces.
  • The third and fourth line removes the seconds from the start and end times.
  • The fifth line, basically, does the same thing but uses a dot instead of a colon for the duration (hh.mm).
  • Then, the whole csv files are merged using pr.

The grand final is done by join, because i needed to get some more information from an additional csv file, the client id. Join is pretty much the same as a ‘vlookup’ in spreadsheets. Before this works, however, the matching columns need to be sorted. I had big troubles getting this to work, since there was some general language fuckup. I found the solution here. Join also rearranges the columns.

Run shell scripts in gedit

Running scripts in gedit with the plugin external tools.

With the plugin External Tools you can comfortably run the script while you write it in gedit. After installing the plugin, go to Tools > Manage External Tools and create your own entrys and keyboard shortcuts.

For running a shell script I simply added:


and chose the Current Document as input. You’ll see the output in the bottom panel (Ctrl+F9).

For running an R script you simply add

/usr/bin/R --no-save

and chose the Current Document as input. The output is as well shown in the bottom panel.


Sophie und Nardos Kässpätzle



  • 2 eggs per person
  • 2 tablespoon of flour per egg
  • cheese in the amount of dough, three types
    • hard Bergkaese (25%)
    • soft Bergkaese (25%)
    • Edamer or Gouda (50%)
  • a pinch of salt
  • some onions


  • prepare the dough by mixing flour and eggs
  • stir the dough until it starts to blister
  • add the salt
  • meanwhile grind the cheese
  • roast the onions for the ultimate upgrade of Kässpätzle
  • press the dough into boiling water
  • get the Spätzle out as soon as you see foam
  • combine the Spätzle and the cheese layer by layer
  • put the whole thing in the oven at 120 °C
  • as soon as the cheese is fully melted -> bon apetit!