Wrappers

Writing a wrapper to go around executables

The Problem

You've created a few commands which use some shared libraries. These might be binaries using .so shared libraries, Java commands looking for .jar files, or Perl or TCL scripts using packages. Most people will have a fairly uniform directory layout with a bin directory and a lib directory. In order that your commands know where to find their libraries you'll have set LD_LIBRARY_PATH, CLASSPATH, PERL5LIB or TCLLIBPATH as appropriate.

Then you move the installation area or someone copies it and everything stops while you patch up your environment variables.

Another example might be that you need a consistent set of environment variables for your code to use third party utilities, but your users have a hotchpotch of inconsistent values, if they have any at all.

This all seems a bit too much like hard work. Surely the distribution can take care of this itself?

Of course it can.

The Solution

The solution is to wrapper all of your commands with something that works out where the distribution is and then sets the environment variables as necessary before finally running the real command.

What do we mean by wrappering? Well, we aren't going to be altering the command in any way. What we want to do is stash the real command away somewhere and have the user run us (the wrapper). We'll calculate the environment required, then we'll run the real command that was stashed away earlier.

As a neat side-effect, we can also handle multi-architecture binaries: one distribution can contain compiled executables for multiple architectures and the user is none the wiser.

Directory Layout

Part of the key is understanding the directory layout. In the beginning you had a directory layout something like:

-----bin--foo.exe
  |       bar.exe
  |
  |--lib--libfoo.so

we're going to change that to:

-----bin-----foo.exe (a link to the wrapper)
  |       |  bar.exe (a link to the wrapper)
  |       |
  |       |--executables--foo.exe (the real binary)
  |                       bar.exe (the real binary)
  |
  |--lib--libfoo.so

which isn't such a big change. The user will still have .../bin on their PATH.
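Converting an existing installation is mechanical. A minimal sketch, assuming the wrapper script has already been installed as bin/wrapper (that name, and the function name wrap_bindir, are made up for illustration) and that hard links work on your filesystem:

```shell
# Hypothetical helper: convert a plain bin directory into the wrapped layout.
# Assumes the shared wrapper script already exists as ${bindir}/wrapper
# (an assumed name, nothing in the layout above dictates it).
wrap_bindir ()
{
    bindir="$1"
    mkdir -p "${bindir}/executables" || return 1
    for cmd in "${bindir}"/*; do
        name="${cmd##*/}"
        # Skip the wrapper itself and anything that has already been stashed
        [ "${name}" = wrapper ] && continue
        [ -e "${bindir}/executables/${name}" ] && continue
        # Only plain files are commands; this also skips the stash directory
        [ -f "${cmd}" ] && [ ! -L "${cmd}" ] || continue
        # Stash the real command, then put a hard link to the wrapper in its place
        mv "${cmd}" "${bindir}/executables/${name}" &&
        ln "${bindir}/wrapper" "${bindir}/${name}"
    done
}
```

Running wrap_bindir /full/path/to/bin a second time should be harmless because anything already present in executables is skipped.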

Wrapper Basics

When the user runs foo.exe they'll be running the wrapper. The wrapper is the same for all the commands so not only will it be working out what directory it is in but it will also have to determine the name of the command the user was trying to run.

Pretty much all of that can be worked out by looking at $0, the name the script was invoked as. People invoke commands in all sorts of funny ways:

/full/path/to/bin/foo.exe - they typed the whole pathname in
./path/to/bin/foo.exe - they typed a relative pathname in
../path/to/bin/foo.exe - they typed a relative pathname in, though this differs because we have to handle ..
foo.exe - they simply typed the command name and the shell found it on their PATH.

but they're fairly easy to handle.

Note

You always get the command name!

The wrapper's goal is to recreate the full pathname to itself, /full/path/to/bin/foo.exe, after which we can work out the following:

BINDIR="${0%/*}"
EXENAME="${0##*/}"

which would be /full/path/to/bin and foo.exe respectively.

We can then calculate:

TOPDIR="${BINDIR%/*}"
LIBDIR="${TOPDIR}/lib"

and suddenly calculating LD_LIBRARY_PATH, CLASSPATH, PERL5LIB and TCLLIBPATH seems a whole lot easier.

Even easier still if we've included some simple functions to manipulate paths.
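As a rough sketch of how the pieces fit together (calc_env is a made-up name, and only two of the variables are shown as the others follow their own rules for what goes in them):

```shell
# Sketch: derive the directories from a full pathname and export library paths.
# calc_env is a hypothetical name; which variables you export depends on
# what your commands actually need.
calc_env ()
{
    BINDIR="${1%/*}"
    EXENAME="${1##*/}"
    TOPDIR="${BINDIR%/*}"
    LIBDIR="${TOPDIR}/lib"
    # ${VAR:+:${VAR}} appends the user's existing value only if there is one,
    # which avoids leaving a stray trailing colon
    LD_LIBRARY_PATH="${LIBDIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    PERL5LIB="${LIBDIR}${PERL5LIB:+:${PERL5LIB}}"
    export LD_LIBRARY_PATH PERL5LIB
}
```

Calling calc_env /full/path/to/bin/foo.exe should leave LIBDIR as /full/path/to/lib and put that directory at the front of both variables.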

Those functions will come in extremely handy if, for example, foo.exe invokes bar.exe and thus adds ${BINDIR}, say, to PATH repeatedly as we'd want to trim the path of repeated clutter.
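One such function might look like the following. This is only a sketch: path_prepend is a hypothetical name, and it drops empty path elements (which mean "the current directory" in PATH) rather than preserving them.

```shell
# Hypothetical helper: print a colon-separated path with DIR at the front
# and every other occurrence of DIR removed.
# Usage: path_prepend DIR PATHLIST
path_prepend ()
{
    (
        # Runs in a subshell so the IFS and globbing changes don't leak out
        IFS=:
        set -f
        result="$1"
        for elem in $2; do
            # Drop empty elements and earlier copies of the directory
            if [ -n "${elem}" ] && [ "${elem}" != "$1" ]; then
                result="${result}:${elem}"
            fi
        done
        printf '%s\n' "${result}"
    )
}
```

It would be used as PATH="$(path_prepend "${BINDIR}" "${PATH}")", so that however many times foo.exe and bar.exe call each other ${BINDIR} only appears once.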

Recreating the Full Path

Of the four ways we would normally expect to be invoked:

/full/path/to/bin/foo.exe - we can leave alone
./path/to/bin/foo.exe and ../path/to/bin/foo.exe - we can mix with ${PWD}, the current working directory
foo.exe - we'll have to find it on the PATH ourselves.

./

If the script was invoked with a leading ./ then we should be able to prepend ${PWD}. That will leave the slightly messy looking result of ${PWD}/./path/to/bin/foo.exe, i.e. /full/./path/to/bin/foo.exe, with library environment variables getting /full/./path/to/lib, which isn't the end of the world (it is correct) but doesn't look pretty. It's easy to fix by stripping the ./.
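A minimal sketch of that stripping, with the working directory passed in as an argument so the function is easy to try out (abs_from_dot is a hypothetical name):

```shell
# Hypothetical helper: turn a ./-relative invocation name into a full pathname.
# Usage: abs_from_dot NAME WORKING_DIRECTORY
abs_from_dot ()
{
    case "$1" in
    ./*)
        # ${1#./} removes the leading ./ before the directory is prepended
        printf '%s\n' "$2/${1#./}"
        ;;
    *)
        # anything else is passed through untouched
        printf '%s\n' "$1"
        ;;
    esac
}
```

In the wrapper this would be called as abs_from_dot "$0" "${PWD}".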

../

As noted, whilst this is the same as for ./ above, we can't just strip ../ as .. means we need to remove a directory. The problem is that we don't know (yet) whether the user started in /full/path/to/bin or / or somewhere else when they ran the command.

We'll assume a function to flatten pathnames exists.
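If you don't have one to hand, something along these lines would do. It is a sketch under two assumptions: the name flatten_path is made up, and the flattening is purely textual so a .. that crosses a symlinked directory will not be resolved the way the kernel would resolve it.

```shell
# Hypothetical flatten function: remove "." components and resolve ".."
# components in an absolute pathname, working on the text alone.
flatten_path ()
{
    (
        # Subshell so the IFS and globbing changes stay local
        IFS=/
        set -f
        result=
        for part in $1; do
            case "${part}" in
            ''|.) ;;                            # skip empty and "." components
            ..)   result="${result%/*}" ;;      # drop the previous component
            *)    result="${result}/${part}" ;;
            esac
        done
        # An empty result means we climbed back to the root
        printf '%s\n' "${result:-/}"
    )
}
```

For a ../ invocation the wrapper would then use flatten_path "${PWD}/$0".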

Found on the PATH

If $0 just contains the command name then the shell found the command on the PATH. The easiest solution to finding out where the shell found the command is to ask it:

type -p "$0"
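Putting the three cases side by side gives something like the sketch below. The function name full_path is hypothetical, and because type -p is a bash feature the sketch assumes the wrapper is run by bash.

```shell
# Hypothetical helper: print a full pathname for an invocation name.
# Assumes bash, since type -p is not available in every shell.
full_path ()
{
    case "$1" in
    /*)
        # already a full pathname, leave it alone
        printf '%s\n' "$1"
        ;;
    */*)
        # relative pathname: correct, but may still contain ./ or ../
        printf '%s\n' "${PWD}/$1"
        ;;
    *)
        # bare command name: ask the shell where it is on the PATH
        type -p "$1"
        ;;
    esac
}
```

The wrapper would call it as full_path "$0".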

Caveat

There is at least one caveat in the above and it concerns symlinks. If you create a symlink to .../bin/foo.exe called gotcha then the wrapper will not only calculate the directory containing the gotcha symlink, it will think that the EXENAME is gotcha as well.

It is possible to work around this by testing to see if the full pathname $0 is a symlink and, if so, reading the symlink's value and starting again.

That is left as an exercise for the reader.
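For readers who would like a starting point, here is one possible sketch. It assumes a readlink command is available, that it is given a full pathname, and that the entries in bin are hard links to the wrapper; if they were symlinks the loop would carry on through to the wrapper itself and lose the command name. The name resolve_link and the limit of 32 hops are both arbitrary.

```shell
# Hypothetical helper: follow symlinks until something that isn't one is reached.
# Assumes readlink exists and that $1 is a full pathname.
resolve_link ()
{
    path="$1"
    hops=0
    # The hop limit guards against symlinks that point at each other
    while [ -L "${path}" ] && [ "${hops}" -lt 32 ]; do
        target="$(readlink "${path}")"
        case "${target}" in
        /*) path="${target}" ;;               # absolute link target
        *)  path="${path%/*}/${target}" ;;    # relative to the link's directory
        esac
        hops=$((hops + 1))
    done
    printf '%s\n' "${path}"
}
```

The result can still contain .. components if a link's target did, so it may need tidying before BINDIR and EXENAME are calculated from it.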

Running the Real Command

Finally, we need to run the real command. We know we stashed it in bin/executables so we should be able to execute it, passing it the arguments we were originally given:

exec "${BINDIR}"/executables/"${EXENAME}" ${1+"$@"}

Fat Wrappers

Suppose we've written some portable code and it's available on Linux, Solaris (SPARC and x86) and others. How can we differentiate the binaries in .../bin/executables?

Easy. Don't use executables as the stash directory, instead use something along the lines of Linux-x86_64 or SunOS-5.10-i386 depending on what Operating Systems, revisions and architectures you've ported your code to.

Binaries

Your friend here is uname, in particular the -mrps flags. uname prints the fields in a consistent order (system name, release, machine, processor) so you might see results like:

SunOS 5.11 i86pc i386
Darwin 9.8.0 i386 i386
Linux 2.6.18-164.el5 x86_64 x86_64

and you can quickly produce a useful little function like:

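_uname ()
{
    # uname prints: system name, release, machine, processor
    set -- $(uname -mrps)
    OS="$1"
    OS_REV="$2"
    case "${OS}" in
    SunOS)
       # the machine field is a platform name such as i86pc, so use the processor
       OS_ARCH="$4"
       ;;
    Darwin|Linux)
       OS_ARCH="$3"
       ;;
    *)
       # any other OS: fall back to the machine field rather than leave it unset
       OS_ARCH="$3"
       ;;
    esac
}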

from which your executable directory would be:

${OS}-${OS_REV}-${OS_ARCH}

or

${OS}/${OS_REV}/${OS_ARCH}

or some other combination.

Backwards Compatibility

Some operating systems support backwards compatibility, that is, a command compiled for an older release of the OS should be able to run on the current release. With that in mind you could consider generating a list of possible binary directories to look in rather than just the one associated with the current release. For example, SunOS 5.11 should be able to run a binary compiled against SunOS 5.10, 5.9, 5.8, ...
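A sketch of generating such a list follows. It only suits revisions of the form MAJOR.MINOR, as in the SunOS example, and the oldest minor revision worth looking for is something you have to supply yourself. The function name compat_dirs is hypothetical.

```shell
# Hypothetical helper: list candidate stash directories, newest first.
# Usage: compat_dirs OS OS_REV OS_ARCH OLDEST_MINOR
# Only handles revisions that look like MAJOR.MINOR (e.g. 5.11).
compat_dirs ()
{
    os="$1"
    arch="$3"
    oldest="$4"
    major="${2%%.*}"
    minor="${2#*.}"
    # Count down from the current minor revision to the oldest one shipped
    while [ "${minor}" -ge "${oldest}" ]; do
        printf '%s\n' "${os}-${major}.${minor}-${arch}"
        minor=$((minor - 1))
    done
}
```

The wrapper would then try each directory in turn and use the first one that contains the command.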

Scripts

That's good for binaries but what if I've written a script? Put it in a script subdirectory.

You could put it in a different subdirectory if you wanted to invoke a specific interpreter on the script. You have plenty of choice.

More Than One Place to Look?

With a possible binary directory (or directories) and a script directory to look in for your real command, the wrapper has to become a little more complicated: it must check whether the real command exists in each of the calculated subdirectories.

The order in which you search (primarily, do you look in the scripts directories before the binaries directories or vice versa) is a matter of personal taste. Of course, if you have both a script and a binary of the same name you might want to question what on earth you thought was going to happen.
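Whichever order you settle on, the search itself is short. A minimal sketch, in which find_real is a hypothetical name and the subdirectory names are whatever you chose for your stash directories:

```shell
# Hypothetical helper: print the first executable NAME found under BINDIR,
# trying the given subdirectories in the order they are listed.
# Usage: find_real BINDIR NAME SUBDIR...
find_real ()
{
    bindir="$1"
    name="$2"
    shift 2
    for sub in "$@"; do
        if [ -f "${bindir}/${sub}/${name}" ] && [ -x "${bindir}/${sub}/${name}" ]; then
            printf '%s\n' "${bindir}/${sub}/${name}"
            return 0
        fi
    done
    # Nothing suitable in any of the subdirectories
    return 1
}

# Possible use in the wrapper, binaries first and then scripts:
#   REAL="$(find_real "${BINDIR}" "${EXENAME}" "${OS}-${OS_REV}-${OS_ARCH}" script)" || exit 1
#   exec "${REAL}" ${1+"$@"}
```

Swapping the order of the subdirectory arguments is all it takes to prefer scripts over binaries.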
