Shell Scripting

Error Handling

There are several ways to save the output of a shell script. The easiest way doing so is by simply appending the output to a textfile via ‘>’:

script.sh > logfile.log

The problem with this solution is that not every output is saved to the log file. If a program like ‘curl’ is called, it’s outputs are not appended to the logfile.

This has to do with the types of information a program is sending to the screen/terminal. One type regarding standard output information is called ‘STDOUT’ the other regarding standard error information is called ‘STDERR’. There are three ways in getting this information and writing it into a log file. The script function to do so is called ‘exec’ and has three output options

  1. Standard Output Information: The logfile output of ‘exec 1’ looks basically the same as appending a logfile to a script.

    #!/bin/bash
    exec 1>> logfile.log
    
    # script
    # ...
  2. Standard Error Information: By calling ‘exec 2’ just error messages are written to the log file.

    #!/bin/bash
    exec 2>> logfile.log
    
    # script
    # ...
  3. Combined Output and Error Information: Writing both output and error information to the logfile, call ‘exec &’.

    #!/bin/bash
    exec &>> logfile.log
    
    # script
    # ...

Link to script

just make a symbolic link in /bin to the custom script you want to launch…

References

Remember The Milk as a Ubuntu Widget

RTM is a nice task managing tool. Here is one solution to integrate it with your Ubuntu desktop as a widget.

If not already done install the Compiz Config Settings Manager as follows:

sudo apt-get install compizconfig-settings-manager

In the compiz settings by activating the “Widget Layer” (under category “Desktop”) it is possible to show by shortcut F9 any desired information or small tools like a calendar, weather information, a calculator, your tomboy notes, etc. right on top of your desktop. In other words it’s handy for all the small tools You need during a hard working day.

First of all you need to install the screenlets package:

sudo apt-get install screenlets

After that, start the screenlets daemon and click “install” on the left pane. Select “Convert Web Widget” from the drop box and proceed. If not selected, chose “Google Gadgets” and paste following HTML code into the field below:

<script src="http://www.gmodules.com/ig/ifr?url=http://www.rememberthemilk.com/services/modules/gmail/rtm.xml&amp;synd=open&amp;w=320&amp;h=300&amp;title=Remember+The+Milk&amp;border=%23ffffff%7C3px%2C1px+solid+%23999999&amp;output=js"></script>

If it doesen’t work, get the code directly from the Google Gadgets Webpage.

Name the screenlet the way you want to and confirm the settings. Now Your new screenlet should now be listed in the Screenlets Manager. Select it, go to the options in the left pane and check “Widget”.

Start it by double-clicking. Press F9 to show your Widgets. In Your RTM Widget You now should be able to log in and manage your tasks.

References

  • More solutions for desktop integration

Deluxe Ski Jump 2.1 in Ubuntu

Yes, it works. DSJ 2, the ingenious ski jumping game, bringing hours of joy to thousands of enthusiasts, runs under DOSBox. Big deal for everyone having a newer OS than Windows 95.

First, get and install DOSBox from the repositories:

sudo apt-get install dosbox

AFTER starting DOSBox, you find and open the DOSBox config file, usually stored under /home/USER/.dosbox/dosbox-XXX.conf. In order to run DSJ with proper speed, modify the following parameters:

fullscreen=false
output=overlay
scaler=normal2x
core=dynamic
cycles=100000
cycleup=2000
cycledown=2000

Open a shell and start DOSBox. Now, you will have to mount the directory where DSJ is installed and start the game:

mount c /path-to-dsj
C:
dsj.exe

Screenshot

dsj-screenshot

Important Shortcuts
  • [Alt]+[Enter]: Enter/exit fullscreen mode.
  • [Ctrl]+[F10]: Release mouse from DOSBox.
  • [Ctrl]+[F11]: Decrease speed.
  • [Ctrl]+[F12]: Increase speed.
Resources

Create User and Grant Rights

Create User and Roles

To create a user type following in the psql shell:

 CREATE USER myuser WITH PASSWORD 'mypassword';

Changes are made by exchanging CREATE with ALTER.

Grant Rights to User and Roles

In order to fulfill some tasks a user has to be granted rights on a database like:

 GRANT ALL PRIVILEGES ON DATABASE database TO user;

Rights can be granted on more specific actions (e.g., ‘INSERT’-statements) and to more specific objects than databases. Type the following to see further instructions:

 \h GRANT

Create a ‘READ-ONLY-USER’

In most situations there are some users only allowed to read data instead of making any modifications. Since the standard GRANT function only offers to specify a single table here is a script to grant ‘select’-rights to all tables of various schemas.

Just adapt myuser and myschema and run the script in your psql shell:

select 'grant select on '||schemaname||'.'||tablename||' to myuser;' from pg_tables 
where schemaname in ('myschema1', 'myschema2', 'myschema3') order by schemaname, tablename;

By substituting ‘grant select on’ to ‘grant all on’, ‘grant update on’, etc. you can easily apply this little script for other purposes.

Set Search Path

Setting a search path spares writing the database schema in every query, as postgres will then just search all schemata in the search path for the stated objects (e.g. tables)

 SET search_path TO myschema1, myschema2;

References

Create an UPSERT Trigger

Create an UPSERT Trigger

A common problem you’ll probably have to face when inserting new rows in your database is a conditional update if the row already exists or insert if the row doesen’t exist.

In this example ther is a table called “fsq_venues” containing the fields “fsq_id” (not the primary key but the unique identifier we’ll use in the trigger), “lat”, “lon”. The trigger “orderlines_insert_before_T” is called every time an INSERT is done to the table “fsq_venues” and the trigger runs the function “orderlines_insert_before_F()” which acually checks if the new row has the same “fsq_id” identifier as any row existing in the table.

DROP TABLE fsq_venues;

CREATE TABLE fsq_venues (
  ID SERIAL PRIMARY KEY,
  fsq_id varchar(255),
  lat numeric,
  lon numeric,
);

SELECT AddGeometryColumn(
  'fsq_venues',
  'geom',
  -1,
  'POINT',
  2
);

CREATE OR REPLACE FUNCTION orderlines_insert_before_F()
RETURNS TRIGGER
 AS $BODY$
DECLARE
    result INTEGER; 
BEGIN
    SET SEARCH_PATH TO PUBLIC;

    -- Find out if there is a row
    result = (select count(*) from fsq_venues
                where fsq_id = new.fsq_id
                --  and sku      = new.sku
               );

    -- On the update branch, perform the update
    -- and then return NULL to prevent the 
    -- original insert from occurring
    IF result = 1 THEN
        UPDATE fsq_venues 
           SET name = lat = new.lat, lon = new.lon
         WHERE fsq_id = new.fsq_id;
           -- AND sku      = new.sku;

        RETURN null;
    END IF;

    -- The default branch is to return "NEW" which
    -- causes the original INSERT to go forward
    RETURN new;

END; $BODY$
LANGUAGE 'plpgsql' SECURITY DEFINER;

-- That extremely annoying second command you always
-- need for Postgres triggers.
CREATE TRIGGER orderlines_insert_before_T
   before insert
   ON fsq_venues
   FOR EACH ROW
   EXECUTE PROCEDURE orderlines_insert_before_F();

References

Original code by Kenneth Downs via: The Database Programmer (blog): Approaches to “UPSERT”

Basics

Install

Ubuntu users find a straight forward explanation at Ubuntu Community Documentation.

Export & Import Data Dumps

Export dump by:

 pg_dump dbname > dbname.out

Or, to dump all databases:

 pg_dumpall > outfile

To import, create a DB and insert the dump:

 createdb -D [table_space] dbname
 psql -f dbname.out -U postgres dbname

Or, to load the entire databases:

 psql -f outfile

postgres

Have a look at optional function parameters for further specifications.

Import Data from Files

PostgreSQL offers built in functionality for data import from several file types, the function syntax is as follows:

copy my_table from '/path/to/csv/my_data.txt' DELIMITERS ',' CSV;

Make sure the postgresql user has according rights to read path and file!

Export Tables and Queries

The easiest way to export data from a PostgreSQL table or view is to just open the data view, mark the whole table, copy it and paste it again in a text or spreadsheet file. The advantage of this solution discloses in avoiding problems with encoding and German Umlauts.

The more complicated way is to open a shell, login to your DB and type the following:

dbname=> \o /home/user/file.csv
SELECT date, avg(sales) FROM sales_data GROUP BY data;

It is handy if do not have a X-server at hand or accessing remote data bases.

Start, Stop & Restart PostgreSQL Server

All this is done as follows, provding either start, stop or restart as parameter

 sudo /etc/init.d/postgresql-8.4 restart

Miscellaneous

Show Data Directory

Get the data directory within psql:

 SHOW data_directory;

Cast Data Types

How to change the data type of a field, e.g. when inserting it into another table? There is a simple way to do so by adding ‘::’ and the data typeafter the column name. Lets assume zip is of type character and we want to transform it into an integer:

 INSERT INTO cities (id, zip, city) SELECT id, zip::int, city FROM city_raw;

Same works for various date, numeric or character data types.

References

PostgreSQL Additional Facilities

Besides the main functionalities PostgreSQL provides a lot of additional functions not yet in the core distribution, but in the so called contribution distribution.

Download and install them by:

 sudo apt-get install postgresql-contrib

dblink enables to set up links to other PostgreSQL data bases, which comes very handy if you need to exchange data or want to do a join with data not in your current DB.

Install

Before use, you have to run an install sql script which is located in the contrib folder of your PG distribution.

Either copy the script into any folder you have permission to run sql

psql mydb < dblink.sql

or apply it directly with sudo

sudo psql mydb < /usr/share/postgresql/8.4/contrib/dblink.sql -U myuser

There is also an uninstall_dblink.sql in case you want to remove it again.

Apply

Shrink pdfs

If you want to reduce PDF file size using command line you can use ghostscript.Ghostscript is used for PostScript/PDF preview and printing. Usually as a back-end to a program such as ghostview, it can display PostScript and PDF documents in an X11 environment.

Furthermore, it can render PostScript and PDF files as graphics to be printed on non-PostScript printers. Supported printers include common dot-matrix, inkjet and laser models.

Package gsfonts contains a set of standard fonts for Ghostscript.

Use the following command syntax to reduce your pdf files

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

Replace your filenames accordingly and you are done!

References

Time Dimension for OLAP Cubes

When working with OLAP cubes there comes a point in time where you add a time dimension and therefore need a constant list of discrete time points, e.g. days, weeks etc. Moreover you have/want to define a hierarchy for this time dimension. Hierarchy levels can be years, quarters, months, weeks and the like.

In order to provide a complete list of time points and levels here is a small R script to create a data frame holding different time measures.

Define Function for Calendar Week Calculation

Since I could not find an appropriate function to calculate the calendar week from a date, not even for the US definition, I wrote one myself. It calculates the calendar week according to the European understanding, defined by industry standards DIN 1355 and ISO 8601, which is explained on Wikipedia in more detail.

calendarweek <- function(x){
  if(!class(x)[2] == "POSIXt") stop("Insert objects of class POSIXt!")
  year <- format(x, "%Y")
  fdys <- weekdays(as.Date(paste(year, "-01-0", 1:3, sep=""), "%Y-%m-%d")) # weekday of first day in the year
  cw_st <- ifelse(is.numeric(which(fdys == "Monday")), which(fdys == "Monday"), 4) # is a monday in the 3 first days?
  as.numeric(round(difftime(x, as.Date(paste(year, "-01-01", cw_st, sep=""), "%Y-%m-%d"), units="weeks")+.51))
}

Generate Time Dimension Data

By running the code below you generate a data frame holding the most used discrete time measures. Adjust the start and end date as well as the date format to you personal needs and copy the code into your R shell. No further libraries needed.

format_string	<- '%Y-%m-%d'
start_date	<- as.POSIXct('2000-01-01', format_string)
end_date	<- as.POSIXct('2020-12-31', format_string)

dates	<- seq(start_date, end_date, by="1 day")
wdays	<- weekdays(dates)
yrs	<- format(dates, "%Y")
qs	<- quarters(dates)
mnths	<- months(dates)
cws	<- calendarweek(dates)
label_cws	<- paste("CW", substring(as.character(cws+100),2,4)) # force weeks to double-digits

dim_date	<- cbind(as.character(dates), wdays, yrs, qs, mnths, cws, label_cws)
colnames(dim_date)	<- c("date", "weekday", "year", "quarter", "month", "calendarweek", "label_calendarweek")

head(dim_date)

Load the data frame into your data base by either using packages like RODBC, RJDBC, etc., or export a csv with ‘write.table()’ and import this.