Python, argparse, and custom actions and types

2017-07-19

I really like Python’s argparse module (which, if you are interested, has also been ported to JavaScript). It’s very featureful and easy to understand, and even though it might be called a bit verbose, I’ve found it to be useful, readable, and well-documented ❤️🐍.

Here’s an example program:

#!/usr/bin/env python3

import argparse
import sys

def main(*args, **kwargs):
    parser = argparse.ArgumentParser(description="This is our program")
    parser.add_argument(
        "--object", action='store', default='BEAR HEARTS', help="Do something with this object")

    parsed = parser.parse_args()

    print("I'm going to do something with:", parsed.object)

if __name__ == '__main__':
    sys.exit(main(*sys.argv))

You might call it like this:

./test.py --object "BEES"

And it might return:

I'm going to do something with: BEES

The argparse module automatically handles help too, so you can call ./test.py --help and it will print a help message that can serve as sufficient documentation for a lot of cases, depending on how much care you put into your help keyword arguments. This is what the help for the above program looks like, without writing any additional code:

usage: test.py [-h] [--object OBJECT]

This is our program

optional arguments:
  -h, --help       show this help message and exit
  --object OBJECT  Do something with this object

Modifying arguments after they are passed

I wrote a program where I had to pass lots of arguments that happened to be filesystem paths. A simplistic example:

parser.add_argument("--path1", help="The first path")
parser.add_argument("--path2", help="The second path")

Initially, I had a function like this:

import os

def resolvepath(path):
    """Resolve and normalize a path

    1.  Handle tilde expansion; turn ~/.ssh into /home/user/.ssh and
        ~otheruser/bin to /home/otheruser/bin
    2.  Normalize the path so that it doesn't contain relative segments, turning
        e.g. /usr/local/../bin to /usr/bin
    3.  Get the real path of the actual file, resolving symbolic links
    """
    return os.path.realpath(os.path.normpath(os.path.expanduser(path)))

And I would apply it individually to all the arguments I was using after parsing them, like this:

parsed = parser.parse_args()
parsed.path1 = resolvepath(parsed.path1)
parsed.path2 = resolvepath(parsed.path2)

This works, but it feels wrong and is easy to forget. I wanted something that worked like the type= parameter that can be passed to add_argument().
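To see what each step of resolvepath() contributes, here is a minimal sketch that runs the three calls one at a time. The stages() helper and the sample path are made up for illustration. One caveat worth knowing: os.path.normpath() collapses ".." segments textually, before symbolic links are resolved, so the result can differ from what the filesystem would do if a component of the path is a symlink.

```python
import os


def stages(path):
    """Return the path after each step that resolvepath() applies (illustrative helper)."""
    expanded = os.path.expanduser(path)       # "~" becomes the home directory
    normalized = os.path.normpath(expanded)   # "a/../b" collapses to "b", purely as text
    resolved = os.path.realpath(normalized)   # symlinks followed, result is absolute
    return expanded, normalized, resolved


# The sample path is hypothetical; the output depends on your home directory.
for step in stages("~/projects/../.ssh"):
    print(step)
```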

A custom action

As it turns out, you can write a custom action that does this modification at argument parsing time. Here’s what I ended up with (combined with my resolvepath() function listed above):

class StorePathAction(argparse.Action):
    """Resolve paths during argument parsing"""

    def __call__(self, parser, namespace, values, option_string=None):
        if isinstance(values, list):
            paths = [resolvepath(v) for v in values]
        else:
            paths = resolvepath(values)
        setattr(namespace, self.dest, paths)

def main(*args, **kwargs):
    # ...
    parser.add_argument("--path1", action=StorePathAction, help="The first path")
    parser.add_argument("--path2", action=StorePathAction, help="The second path")

What this means is that an action is just a callable that takes three required parameters plus an optional one. Under the hood, argparse calls it and passes it parser (the parser itself), namespace (the object that collects the parsed arguments and is eventually returned by parse_args()), values (the value or values passed to the argument), and option_string (the option string that triggered the action, such as --path1; it is None for positional arguments). If you do not pass nargs to add_argument(), values will be a single value. If you pass a numeric nargs (even if it is set to 1), or '*' or '+', it will be a list of all the passed values; nargs='?' still gives a single value.
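A quick way to check what argparse hands to an action is to record it. The sketch below uses a hypothetical RecordingAction and made-up argument names (they are not part of my program) to store values and option_string for a few nargs settings.

```python
import argparse


class RecordingAction(argparse.Action):
    """Hypothetical action that stores exactly what argparse passes to __call__()."""

    def __call__(self, parser, namespace, values, option_string=None):
        # Keep both arguments so they can be inspected after parsing.
        setattr(namespace, self.dest, (values, option_string))


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("positional", action=RecordingAction)
    parser.add_argument("--single", action=RecordingAction)
    parser.add_argument("--one", nargs=1, action=RecordingAction)
    parser.add_argument("--many", nargs="+", action=RecordingAction)
    return parser


# An explicit argument list is used here instead of sys.argv.
parsed = build_parser().parse_args(
    ["here", "--single", "a", "--one", "b", "--many", "c", "d"]
)
print(parsed.positional)  # ('here', None)
print(parsed.single)      # ('a', '--single')
print(parsed.one)         # (['b'], '--one')
print(parsed.many)        # (['c', 'd'], '--many')
```

Note that nargs=1 yields a one-element list rather than a bare string, which is why StorePathAction has to handle both shapes.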

All of my new functionality comes from the if/else statement inside __call__(). We call resolvepath() on all paths, whether values is singular or plural. That’s it.

A custom type

After doing all of this, I had gotten much more familiar with argparse’s documentation, and I found something that I hadn’t noticed before. It’s a more idiomatic, shorter, and easier to understand method of accomplishing the same end result - you can pass a callable as type= to add_argument() instead.

In fact, I could use my resolvepath() function directly, although I ended up renaming it to use the past participle (resolvedpath()) so that the code read more like English.

parser.add_argument("--path1", type=resolvedpath, help="The first path")
parser.add_argument("--path2", type=resolvedpath, help="The second path")
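Put together as a runnable sketch, it might look like the following. The --paths argument and the default of "~" are additions for illustration only. They show two behaviors of type= that the custom action had to handle by hand or did not handle at all: with nargs, the callable is applied to each value individually, and a string default is passed through the callable as well.

```python
import argparse
import os


def resolvedpath(path):
    """Expand ~, normalize relative segments, and resolve symbolic links."""
    return os.path.realpath(os.path.normpath(os.path.expanduser(path)))


def build_parser():
    parser = argparse.ArgumentParser(description="This is our program")
    # A string default is converted by the type callable too.
    parser.add_argument("--path1", type=resolvedpath, default="~",
                        help="The first path")
    # Hypothetical multi-value argument: each value is converted on its own.
    parser.add_argument("--paths", type=resolvedpath, nargs="+", default=[],
                        help="Any number of additional paths")
    return parser


# An explicit argument list is used here instead of sys.argv.
parsed = build_parser().parse_args(["--paths", "/usr/local/../bin", "/tmp/./x"])
print(parsed.path1)
print(parsed.paths)
```

The printed values depend on the machine, so no expected output is shown.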

Use cases

Custom actions and types can be a powerful way to transform or validate command line arguments at argument parsing time. What I’ve done is a simple transformation; a more complex one might involve creating a custom object from command line input (which is of course always going to come in as a string). argparse works out of the box with built-in converters like int and float for numbers, and a custom callable lets you add support for any number of custom classes whenever you like. It’s also useful for parameter validation, like ensuring an argument is a valid IP address or XML document.
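As a sketch of the validation case, here is a type callable that accepts only valid IP addresses, using the standard library's ipaddress module. The --host argument and the function name are made up for illustration. Raising argparse.ArgumentTypeError lets you supply your own error message; argparse also catches a plain ValueError from a type callable, so passing ipaddress.ip_address directly as type= would work too, with a more generic message.

```python
import argparse
import ipaddress


def ip_address(text):
    """Convert a command line string to an ipaddress object, or reject it."""
    try:
        return ipaddress.ip_address(text)
    except ValueError:
        # argparse reports this message and exits with a usage error.
        raise argparse.ArgumentTypeError(
            f"{text!r} is not a valid IP address") from None


def build_parser():
    parser = argparse.ArgumentParser()
    # "--host" is a hypothetical argument name.
    parser.add_argument("--host", type=ip_address, help="Address to connect to")
    return parser


parsed = build_parser().parse_args(["--host", "192.0.2.10"])
print(repr(parsed.host))  # IPv4Address('192.0.2.10')
```

The parsed value is an IPv4Address (or IPv6Address) object rather than a string, so the rest of the program never sees unvalidated input.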

That said, a word of caution: not every type of validation belongs in the argument parser. A classic Python pattern is to write code and wrap it in try/except blocks where it might fail; one reason for this is that checking everything in advance is frequently not as useful as it seems. For example, if you need to write to a file, you might first check to see if the file is locked by some other process, which you could do with a custom action or type at argument parsing time, and then proceed with your write if it isn’t. But what if it becomes locked between the check and the write attempt? The pythonic alternative might be to wrap a try/except block in a loop around the write attempt, so the program simply tries to write until the file isn’t locked. Validating that the file is unlocked at argument parsing time is not very useful.
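A minimal sketch of that retry loop follows. It assumes a locked file shows up as an OSError when opening or writing, which depends on the platform and on how the other process locks the file. The function name, the attempt limit, and the delay are all illustrative choices; I've bounded the number of attempts so the program cannot spin forever.

```python
import os
import tempfile
import time


def write_with_retry(path, data, attempts=5, delay=0.5):
    """Write data to path, retrying on OSError; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "w") as handle:
                handle.write(data)
            return attempt
        except OSError:
            # Give up after the last attempt and let the caller see the error.
            if attempt == attempts:
                raise
            time.sleep(delay)


# Demonstration against a temporary file that nothing else is using.
target = os.path.join(tempfile.mkdtemp(), "out.txt")
print(write_with_retry(target, "BEES"))  # 1
```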

Still, custom actions and types are very useful, and when employed correctly, they can make code more readable with fewer duplications.