Using Python decorators to be a lazy programmer: a case study

Decorators are considered one of the more advanced features of python and it will often be the last topic in a python class or introductory book. It will, unfortunately, also be one that trips up many beginning or even intermediate python programmers. Those who stick it out and work through it, though, will be handsomely rewarded for their hard work.

Known by those in-the-know, decorators are tools to make your python code beautiful, more concise, well-written, and elegant--but did you know you can use decorators to be a lazy bum of a programmer?!

I just recently whipped up a script that I'm using to help my expand and organize my photo library of my favorite pieces of art. This script that takes the URL of an art piece from artsy.net, downloads the image of the artwork and renames the downloaded file to follow a filename template (given as a CLI arg) based on the artist's name, the title of the piece, and the date of completion. For example, if we wanted to download an image of the Fountain, and have the file automatically named: Marcel Duchamp - Fountain - 1917.jpg, one can use the following command:

python artsy-dl.py "https://www.artsy.net/artwork/marcel-duchamp-fountain-1" 
                   "%a - %t - %d"

because %a is automatically replaced with the artist's name (that is extracted from the webpage), %t is automatically replaced with the title of the piece, and %d is the date of the piece.

In this post, I'll be demonstrating the use of decorators to the end doing the bare minimum and how I saved myself from having to write tedious (but important) error-checking code. But before that, you, dear reader, have to be clear on what a decorator is. The section that follows it perhaps the greatest intro to decorators ever.

(If you are already familiar with decorators, you can skip to the section called "The problem"--though you may want to at least skim this section.)

What are decorators?

Here's the skinny on Python decorators. Grokking decorators necessitates an intuitive understanding of three concepts:

  • People often speak of functions as being "first-class citizens" in Python. By this they mean that functions are values that can be assigned to variables, returned from (other) functions, and passed as an argument to (still other) functions.
  • When a function (we’ll call it outer) returns another function (we'll call this inner), the inner function "closes over" (remembers) variables defined in the enclosing scope (outer's scope). If the returned function is stored in a variable and called at a later time, it still remembers the variable(s) from the enclosing scope—even if it is called long after the outer function finishes running and it's variables otherwise lost. The inner function that is returned is known as a "closure".
  • A language that supports closures affords us the unique opportunity to easily add or modify the behavior of a function a by creating a function b that takes function a as an argument, and returns a function c which does something, and then calls function a. Function c can now be used in place of function a--it is essentially function a plus some extra functionality: it is a "decorated" version of function a.

In order to concretize these concepts, let's see an example of a decorator complete with an illustration of the motivation behind creating it and the cognitive steps taken toward it's finished state. Unlike some decorator tutorials, this lesson will not patronize you, dear reader, by designing a overly-simple decorator with no practical worth (it's always been my thought that this pedagogical strategy most often backfires). Instead, we'll create, you and I, a decorator of actual utility.

Suppose we wanted to time the execution of a function. Wanting something with a little more precision than a stopwatch, we decide to use the time module:

import time

def sleep_for_a_second():
    time.sleep(1)

start_time = time.time()
sleep_for_a_second()
end_time = time.time()

print("It took {0:.2f} seconds".format(end_time-start_time))
#> It took 1.00 seconds

This is ok, but if we want to time the execution of many different functions, this will result in a lot of repeated code. Being champions of the DRY principle, we decide it would be better to put this in a function:

def time_a_function(func):
    start_time = time.time()
    func()
    end_time = time.time()
    print("It took {0:.2f} seconds".format(end_time-start_time))

def sleep_for_a_second():
    time.sleep(1)

def sleep_for_two_seconds():
    time.sleep(2)

time_a_function(sleep_for_a_second)
#> It took 1.00 seconds
time_a_function(sleep_for_two_seconds)
#> It took 2.00 seconds

time_a_function is a function that takes the function we want to time as an argument.

We just timed two functions. Notice how we can now time an arbitrary number of functions with no extra code.

But there's an issue with this approach. We've hitherto been timing functions that take no arguments. How would we time a function that takes one or more arguments?

def sleep_for_n_seconds(n):
    time.sleep(n)

time_a_function(sleep_for_n_seconds(5))
#> TypeError: 'NoneType' object is not callable

Nope. Before, we were passing the variable that holds the function to time_a_function, but the above incantation evaluates sleep_for_n_seconds(5), passes it's None return value to time_a_function and because time_a_function can't call it (because it's not a function), we get an error. So how are we going to time sleep_for_n_seconds?

The solution is to make a function that takes a function that returns a function that takes an argument (n) and performs the function and times it and use the returned function in place of the original (whew!).

In other words:

def timer_decoration(func):
    def new_fn(n):
        start_time = time.time()
        func(n)
        end_time = time.time()
        print("It took {0:.2f} seconds".format(end_time-start_time))
    return new_fn

def sleep_for_n_seconds(n):
    time.sleep(n)

sleep_for_n_seconds = timer_decoration(sleep_for_n_seconds)
sleep_for_n_seconds(3)
#> It took 3.01 seconds

Study this code carefully. You've just wrote a decorator.

If you are confused it--as always with programming--helps if you type the code out yourself (no copy-and-pasting!) and play around with it.

Though it's not terrible unwieldy otherwise, python gives us a nice elegant way to tag a particular function with a decorator so that the function is automatically decorated (i.e. doesn't require us to replace the original function).

@timer_decoration
def sleep_for_n_seconds(n):
    time.sleep(n)

# now the reassignment of 'sleep_for_n_seconds' is unnecessary
sleep_for_n_seconds(3)
#> It took 3.00 seconds

But what happens if we try to decorate the original sleep_for_a_second function?

@timer_decoration
def sleep_for_a_second():
    time.sleep(1)

# sleep_for_a_second()
#> TypeError: new_fn() missing 1 required positional argument: 'n'

sleep_for_a_second is now expecting 1 argument :(. We can generalize our decorator to handle functions that take an arbitrary number of arguments with *args and **kargs...

def timer_decoration(func):
    def new_fn(*args, **kargs):
        start_time = time.time()
        func(*args, **kargs)
        end_time = time.time()
        print("It took {0:.2f} seconds".format(end_time-start_time))
    return new_fn

@timer_decoration
def sleep_for_n_seconds(n):
    time.sleep(n)

@timer_decoration
def sleep_for_a_second():
    time.sleep(1)

@timer_decoration
def sleep_for_k_seconds(k=1):
    time.sleep(k)

sleep_for_n_seconds(3)
#> It took 3.00 seconds
sleep_for_a_second()
#> It took 1.00 seconds
sleep_for_k_seconds(k=4)
#> It took 4.00 seconds

Ace!

Finally, let's rewrite our decorator to support returning the return value of the decorated function.

def timer_decoration(func):
    def new_fn(*args, **kargs):
        start_time = time.time()
        ret_val = func(*args, **kargs)
        end_time = time.time()
        print("It took {0:.2f} seconds".format(end_time-start_time))
        return ret_val
    return new_fn

@timer_decoration
def sleep_for_a_second_p():
    time.sleep(1)
    return True

print(sleep_for_a_second_p())
#> It took 1.01 seconds
#> True

Note that our decorator is now generalized enough to be used with any function... no matter what it's return type is... no matter what arguments it takes...

It doesn't matter what the function is, the function's behavior remains the same except now it is "decorated" with functionality that times it.

This is just one example of a decorator with obvious generalized utility. You can also use decorators to perform memoization, static-typing-like type enforcement of function signatures, automatically retry functions that failed, and simulate non-strict evaluation.

The problem

To review, I wrote a script that takes the URL of an artwork on artsy.net, downloads the image, and then names the file in accordance with a user-supplied format string that uses info about the artwork. With the help of the requests, lxml, and wget modules, a script to do this can be coded relatively quickly. The problem, though--which is common for scripts that talk to the web that don't do error-checking--is that the script is brittle. Without error-checking, any malformed URL, network interruption, invalid output path, or weird edge case like an artwork without a title, will result in an unsightly error message and lengthy stack trace. Besides being aesthetically objectionable, if anyone else is using your script, you will look like an incompetent software engineer. So you have to bite the bullet and error-check.

The problem with error-checking is

  • If all possible errors are checked for separately and individually and handled appropriately (this is good practice), it will result in code often many times longer than the original code. Only a small fraction of the code will be the actual interesting logic of the program--most of it will now be mindless conditionals.
  • It's difficult for someone without training (me) to anticipate every possible error.
  • It takes a lot of work and I'm lazy

So my usual M.O. is to wrap each component in a try/except block (with no specificity in the exception), print an error message, and terminate execution...

try:
    <brittle code>
except:
    sys.exit("<brittle code> broke")

Except I don't even do that. Instead of wrapping each component in a try/except with its own error message, I just wind up try/excepting main once. This cuts down typing and carpal tunnel is a real thing...

def main():
    <literally everything>

try:
    main()
except:
    sys.exit("whoopsie daisy")

Decorators to the rescue

Okkkaaaayyyyy... if we absolutely must do some modicum of error checking around each component (so the user has some kind of clue as to why it the script's usage failed) we can write a decorator to do this for us. The following is an excerpt of the script as of commit d6be4956543:

...
# the decorator
def cop_out(f):
    def inner(*args, **kargs):
        try:
            return f(*args, **kargs)
        except:
            sys.exit("\nThe function <{}> failed\n".format(f.__name__))
    return inner

@cop_out
def get_command_line_arguments():
    return sys.argv[1], sys.argv[2]

@cop_out
def download_webpage(url):
    r = requests.get(url)
    if r.status_code != 200:
        raise Exception
    return r

@cop_out
def parse_webpage(requests_object):
    return lxml.html.fromstring(requests_object.text)
....

Note how we use f.__name__ to get the name of the function that was decorated. This allows us to add the support for specialized (at the function level, at least) error messages for free!

Now, if the user calls the script with too few arguments, the program will print The function <get_command_line_arguments> failed. If you give it a real URL but not to an artsy.net artwork, it'll say The function <extract_artist_name_from_webpage> failed. If you give it a made-up URL, it'll say The function <download_webpage> failed, etc...

Sure, beyond the function level, you don't know why it failed, but anything is better than nothing and your users shouldn't be so bossy and entitled.

But one more thing... if you looked at the code, you'll notice that my function names are descriptive... maybe too long and descriptive. The use of prose-like descriptive function names (certainly by the standards of Haskell programmers) was no accident. Although it may seem like an uncharacteristically diligent and conscientious decision on my part, it was actually to facilitate further laziness. Consider the following tweak to the decorator:

def cop_out(f):
    def inner(*args, **kargs):
        try:
            return f(*args, **kargs)
        except:
            message = f.__name__.replace("_", " ")
            sys.exit("\nFailed to {}\n".format(message))
    return inner

Consider how this generates error messages that appear to individualized...

$ python artsy-dl.py "https://www.artsy.net/artwork/jean-michel-basquiat-untitled-33211237"
Failed to get command line arguments

$ python3.4 artsy-dl.py "https://www.artsy.net/artwork/jean-michel-BAHHSKIIIAAHT-untitled-33211237"
                        "%a/%t - %d"
Failed to download webpage

$ python2.7 artsy-dl.py "https://www.artsy.net/artist/jean-michel-basquiat"
                         "%a/%t - %d"
Failed to extract artist name from webpage

So there you have it! Decorators can be used for legitimate, elegant solutions but can also be employed--virtually for free--to give the illusion that you are a caring software engineer and meticulous with your error checking.

PS

If you're a potential employer or client, I'm just kidding--I'm very diligent about error checking. This piece is satire. I promise.

share this: Facebooktwittergoogle_plusredditpinterestlinkedintumblrmail