Using Python decorators to be a lazy programmer: a case study

Decorators are considered one of the more advanced features of python and it will often be the last topic in a python class or introductory book. It will, unfortunately, also be one that trips up many beginning or even intermediate python programmers. Those who stick it out and work through it, though, will be handsomely rewarded for their hard work.

Known by those in-the-know, decorators are tools to make your python code beautiful, more concise, well-written, and elegant--but did you know you can use decorators to be a lazy bum of a programmer?!

I just recently whipped up a script that I'm using to help my expand and organize my photo library of my favorite pieces of art. This script that takes the URL of an art piece from artsy.net, downloads the image of the artwork and renames the downloaded file to follow a filename template (given as a CLI arg) based on the artist's name, the title of the piece, and the date of completion. For example, if we wanted to download an image of the Fountain, and have the file automatically named: Marcel Duchamp - Fountain - 1917.jpg, one can use the following command:

python artsy-dl.py "https://www.artsy.net/artwork/marcel-duchamp-fountain-1" 
                   "%a - %t - %d"

because %a is automatically replaced with the artist's name (that is extracted from the webpage), %t is automatically replaced with the title of the piece, and %d is the date of the piece.

In this post, I'll be demonstrating the use of decorators to the end doing the bare minimum and how I saved myself from having to write tedious (but important) error-checking code. But before that, you, dear reader, have to be clear on what a decorator is. The section that follows it perhaps the greatest intro to decorators ever.

(If you are already familiar with decorators, you can skip to the section called "The problem"--though you may want to at least skim this section.)

What are decorators?

Here's the skinny on Python decorators. Grokking decorators necessitates an intuitive understanding of three concepts:

  • People often speak of functions as being "first-class citizens" in Python. By this they mean that functions are values that can be assigned to variables, returned from (other) functions, and passed as an argument to (still other) functions.
  • When a function (we’ll call it outer) returns another function (we'll call this inner), the inner function "closes over" (remembers) variables defined in the enclosing scope (outer's scope). If the returned function is stored in a variable and called at a later time, it still remembers the variable(s) from the enclosing scope—even if it is called long after the outer function finishes running and it's variables otherwise lost. The inner function that is returned is known as a "closure".
  • A language that supports closures affords us the unique opportunity to easily add or modify the behavior of a function a by creating a function b that takes function a as an argument, and returns a function c which does something, and then calls function a. Function c can now be used in place of function a--it is essentially function a plus some extra functionality: it is a "decorated" version of function a.

In order to concretize these concepts, let's see an example of a decorator complete with an illustration of the motivation behind creating it and the cognitive steps taken toward it's finished state. Unlike some decorator tutorials, this lesson will not patronize you, dear reader, by designing a overly-simple decorator with no practical worth (it's always been my thought that this pedagogical strategy most often backfires). Instead, we'll create, you and I, a decorator of actual utility.

Suppose we wanted to time the execution of a function. Wanting something with a little more precision than a stopwatch, we decide to use the time module:

import time

def sleep_for_a_second():
    time.sleep(1)

start_time = time.time()
sleep_for_a_second()
end_time = time.time()

print("It took {0:.2f} seconds".format(end_time-start_time))
#> It took 1.00 seconds

This is ok, but if we want to time the execution of many different functions, this will result in a lot of repeated code. Being champions of the DRY principle, we decide it would be better to put this in a function:

def time_a_function(func):
    start_time = time.time()
    func()
    end_time = time.time()
    print("It took {0:.2f} seconds".format(end_time-start_time))

def sleep_for_a_second():
    time.sleep(1)

def sleep_for_two_seconds():
    time.sleep(2)

time_a_function(sleep_for_a_second)
#> It took 1.00 seconds
time_a_function(sleep_for_two_seconds)
#> It took 2.00 seconds

time_a_function is a function that takes the function we want to time as an argument.

We just timed two functions. Notice how we can now time an arbitrary number of functions with no extra code.

But there's an issue with this approach. We've hitherto been timing functions that take no arguments. How would we time a function that takes one or more arguments?

def sleep_for_n_seconds(n):
    time.sleep(n)

time_a_function(sleep_for_n_seconds(5))
#> TypeError: 'NoneType' object is not callable

Nope. Before, we were passing the variable that holds the function to time_a_function, but the above incantation evaluates sleep_for_n_seconds(5), passes it's None return value to time_a_function and because time_a_function can't call it (because it's not a function), we get an error. So how are we going to time sleep_for_n_seconds?

The solution is to make a function that takes a function that returns a function that takes an argument (n) and performs the function and times it and use the returned function in place of the original (whew!).

In other words:

def timer_decoration(func):
    def new_fn(n):
        start_time = time.time()
        func(n)
        end_time = time.time()
        print("It took {0:.2f} seconds".format(end_time-start_time))
    return new_fn

def sleep_for_n_seconds(n):
    time.sleep(n)

sleep_for_n_seconds = timer_decoration(sleep_for_n_seconds)
sleep_for_n_seconds(3)
#> It took 3.01 seconds

Study this code carefully. You've just wrote a decorator.

If you are confused it--as always with programming--helps if you type the code out yourself (no copy-and-pasting!) and play around with it.

Though it's not terrible unwieldy otherwise, python gives us a nice elegant way to tag a particular function with a decorator so that the function is automatically decorated (i.e. doesn't require us to replace the original function).

@timer_decoration
def sleep_for_n_seconds(n):
    time.sleep(n)

# now the reassignment of 'sleep_for_n_seconds' is unnecessary
sleep_for_n_seconds(3)
#> It took 3.00 seconds

But what happens if we try to decorate the original sleep_for_a_second function?

@timer_decoration
def sleep_for_a_second():
    time.sleep(1)

# sleep_for_a_second()
#> TypeError: new_fn() missing 1 required positional argument: 'n'

sleep_for_a_second is now expecting 1 argument :(. We can generalize our decorator to handle functions that take an arbitrary number of arguments with *args and **kargs...

def timer_decoration(func):
    def new_fn(*args, **kargs):
        start_time = time.time()
        func(*args, **kargs)
        end_time = time.time()
        print("It took {0:.2f} seconds".format(end_time-start_time))
    return new_fn

@timer_decoration
def sleep_for_n_seconds(n):
    time.sleep(n)

@timer_decoration
def sleep_for_a_second():
    time.sleep(1)

@timer_decoration
def sleep_for_k_seconds(k=1):
    time.sleep(k)

sleep_for_n_seconds(3)
#> It took 3.00 seconds
sleep_for_a_second()
#> It took 1.00 seconds
sleep_for_k_seconds(k=4)
#> It took 4.00 seconds

Ace!

Finally, let's rewrite our decorator to support returning the return value of the decorated function.

def timer_decoration(func):
    def new_fn(*args, **kargs):
        start_time = time.time()
        ret_val = func(*args, **kargs)
        end_time = time.time()
        print("It took {0:.2f} seconds".format(end_time-start_time))
        return ret_val
    return new_fn

@timer_decoration
def sleep_for_a_second_p():
    time.sleep(1)
    return True

print(sleep_for_a_second_p())
#> It took 1.01 seconds
#> True

Note that our decorator is now generalized enough to be used with any function... no matter what it's return type is... no matter what arguments it takes...

It doesn't matter what the function is, the function's behavior remains the same except now it is "decorated" with functionality that times it.

This is just one example of a decorator with obvious generalized utility. You can also use decorators to perform memoization, static-typing-like type enforcement of function signatures, automatically retry functions that failed, and simulate non-strict evaluation.

The problem

To review, I wrote a script that takes the URL of an artwork on artsy.net, downloads the image, and then names the file in accordance with a user-supplied format string that uses info about the artwork. With the help of the requests, lxml, and wget modules, a script to do this can be coded relatively quickly. The problem, though--which is common for scripts that talk to the web that don't do error-checking--is that the script is brittle. Without error-checking, any malformed URL, network interruption, invalid output path, or weird edge case like an artwork without a title, will result in an unsightly error message and lengthy stack trace. Besides being aesthetically objectionable, if anyone else is using your script, you will look like an incompetent software engineer. So you have to bite the bullet and error-check.

The problem with error-checking is

  • If all possible errors are checked for separately and individually and handled appropriately (this is good practice), it will result in code often many times longer than the original code. Only a small fraction of the code will be the actual interesting logic of the program--most of it will now be mindless conditionals.
  • It's difficult for someone without training (me) to anticipate every possible error.
  • It takes a lot of work and I'm lazy

So my usual M.O. is to wrap each component in a try/except block (with no specificity in the exception), print an error message, and terminate execution...

try:
    <brittle code>
except:
    sys.exit("<brittle code> broke")

Except I don't even do that. Instead of wrapping each component in a try/except with its own error message, I just wind up try/excepting main once. This cuts down typing and carpal tunnel is a real thing...

def main():
    <literally everything>

try:
    main()
except:
    sys.exit("whoopsie daisy")

Decorators to the rescue

Okkkaaaayyyyy... if we absolutely must do some modicum of error checking around each component (so the user has some kind of clue as to why it the script's usage failed) we can write a decorator to do this for us. The following is an excerpt of the script as of commit d6be4956543:

...
# the decorator
def cop_out(f):
    def inner(*args, **kargs):
        try:
            return f(*args, **kargs)
        except:
            sys.exit("\nThe function <{}> failed\n".format(f.__name__))
    return inner

@cop_out
def get_command_line_arguments():
    return sys.argv[1], sys.argv[2]

@cop_out
def download_webpage(url):
    r = requests.get(url)
    if r.status_code != 200:
        raise Exception
    return r

@cop_out
def parse_webpage(requests_object):
    return lxml.html.fromstring(requests_object.text)
....

Note how we use f.__name__ to get the name of the function that was decorated. This allows us to add the support for specialized (at the function level, at least) error messages for free!

Now, if the user calls the script with too few arguments, the program will print The function <get_command_line_arguments> failed. If you give it a real URL but not to an artsy.net artwork, it'll say The function <extract_artist_name_from_webpage> failed. If you give it a made-up URL, it'll say The function <download_webpage> failed, etc...

Sure, beyond the function level, you don't know why it failed, but anything is better than nothing and your users shouldn't be so bossy and entitled.

But one more thing... if you looked at the code, you'll notice that my function names are descriptive... maybe too long and descriptive. The use of prose-like descriptive function names (certainly by the standards of Haskell programmers) was no accident. Although it may seem like an uncharacteristically diligent and conscientious decision on my part, it was actually to facilitate further laziness. Consider the following tweak to the decorator:

def cop_out(f):
    def inner(*args, **kargs):
        try:
            return f(*args, **kargs)
        except:
            message = f.__name__.replace("_", " ")
            sys.exit("\nFailed to {}\n".format(message))
    return inner

Consider how this generates error messages that appear to individualized...

$ python artsy-dl.py "https://www.artsy.net/artwork/jean-michel-basquiat-untitled-33211237"
Failed to get command line arguments

$ python3.4 artsy-dl.py "https://www.artsy.net/artwork/jean-michel-BAHHSKIIIAAHT-untitled-33211237"
                        "%a/%t - %d"
Failed to download webpage

$ python2.7 artsy-dl.py "https://www.artsy.net/artist/jean-michel-basquiat"
                         "%a/%t - %d"
Failed to extract artist name from webpage

So there you have it! Decorators can be used for legitimate, elegant solutions but can also be employed--virtually for free--to give the illusion that you are a caring software engineer and meticulous with your error checking.

PS

If you're a potential employer or client, I'm just kidding--I'm very diligent about error checking. This piece is satire. I promise.

share this: Facebooktwittergoogle_plusredditpinterestlinkedintumblrmail

The hardest thing about teaching statistics

(Note: this post should probably be titled "Quantitative Methods of Curricula Planning" but I thought the current title would draw more interest–though they would both lose out to "These Weird Approaches To Lesson Planning Will Leave You Speechless")

Suppose you were tasked with teaching a course about a field of study. There would be, of course, several topics that you are expected to cover by the course end date; how would you decide the order in which to teach them?

Most people would say that the topics should build on one another, with monotonically increasing levels of difficulty. Further, no topic should be brought up that requires comprehension of another topic yet unlearned.

Planning the syllabus under these constraints would, perhaps, come naturally to skilled and empathetic lecturers. But,

  • not all lecturers are skilled and empathetic
  • even satisfying all of these constraints, there are objectively superior and inferior lesson plans
  • there are some subjects for which these constraints cannot be satisfied (statistics)

For these reasons, having a suite of quantitative methods for choosing the best order of topics in teaching a field of study would be valuable to pedagogy (not to mention providing challenging problems for me to focus on instead of writing).

--

I started thinking about this topic as I began to plan writing my book about learning introductory statistics with R. There are, of course, myriad other very good books on this very topic, so I figured that one way I can stand out is to organize the topics in a way that best facilitates mastering the material. I thought that this would be especially appreciated with a field of study that is notoriously scary and difficult to the uninitiated (like statistics is.)

Anyone, anywhere, teaching introductory statistics will be expected to touch on the common topics: measures of central tendency, measures of dispersion, probability, the central limit theorem, sampling theory, etc… I know how everyone else have arranged the topics, but what's the best way?

It might seem strange, but answering that question was probably the hardest thing about putting together this book and in all of my (admittedly limited) experience designing statistics curricula.

Let's speak of graph theory

To explore optimal paths through the topics, we can represent the subject of statistics as a big graph, or network. Each topic would be a node and there would be directed edges indicating when knowledge of a particular topic is a prerequisite to understanding another. Specifically, if there is a edge connecting topic "a" to topic "b", topic "b" requires an understanding of "a"–like long division requires knowledge of subtraction.

This is what a topic network of an excerpt of introductory stats topics might look like.

statistics topics knowledge dependency diagram

In graph theory, this is known as a directed acyclic graph (DAG). DAGs have the property that there exists at least one ordering of nodes such that no node in the ordering is connected to ("pointing to") a node earlier in the ordering. This is called a topological sort. For most DAGs, there are a number of different orderings that satisfy the ‘dependency’ constraints.

Now that I have your attention, let's now speak of monads

To get a list of all of them, I wrote a small library and set of algorithms in Haskell. You can view it here but the "meat" of the algorithm is in the following snippet that recursively adds all nodes with no children (topics that have no topics that depend on them) to a list of possible alternatives and removes the childless nodes. This is repeated until there are no nodes left to remove. A potential snag is that the function only takes one path but each function call may generate multiple alternate paths. However, if we view the output of the "gatherAllChildless" function as a non-deterministic computation, we can exploit the fact that the path of nodes is a monad and have the function recursively call itself inside of a monadic bind.

This has a sub-quadratic time complexity (< O(n^2))… not too bad. There are 26 possible orderings of the topics that satisfy these “knowledge dependencies” including:

probability -> central tendency -> measures of dispersion -> sampling theory -> sampling distributions -> probability distributions -> central limit theorem -> statistical inference -> NHST

central tendency -> probability -> measures of dispersion -> probability distributions -> sampling theory -> sampling distributions -> central limit theorem -> statistical inference -> NHST

There are a few of the ordering that intuitively seem like poor choices. Taking the first one, for example: it might be strange to start a book on statistics with probability when readers may want to get starting with univariate analysis right away. Looking at the second one, it seems strange to stick "probability" in between "central tendency" and "measures of dispersion", even though it can technically be done, because most people expect highly related topics to be positioned next to each other.

One way of cutting down on the list is to label each topic node with a difficulty level, and choose the ordering which causes the fewest backwards jumps in difficulty level. This should represent the path that has the most gentle level-of-difficulty slope.

Given the algorithms from lines 67 to 78 of TopoSort.hs and the following (subjective) difficulty mapping:

"central tendency": "1"
"measures of dispersion": "2"
"sampling theory": "3"
"sampling distributions": "3"
"central limit theorem": "5"
"probability": "4"
"probability distributions": "3"
"statistical inference": "5"
"NHST": "5"

the “optimal” ordering is:

central tendency -> measures of dispersion -> sampling theory -> probability -> sampling distributions -> probability distributions -> central limit theorem -> statistical inference -> NHST

Yay! This is pretty close to the ordering I chose.

--

The most truly difficult thing about sorting this out is that the statistics topic network diagram is not a DAG. This means that there is no ordering possible that doesn’t appeal to topics yet unlearned. For example, explaining why sample standard deviation divides by n-1 instead of n requires appealing to sampling theory, which requires a good foundation in measures of dispersion to understand. There are a few more of these cyclical relationships in the field.

All of these instances require some hand-waving on the part of the writer or lecturer ("don't worry about why we divide by 'n-1', we’ll get to that later") and adds to the learner's perceived difficulty of grasping the field.

The best way to reconcile these circular knowledge dependencies is to introduce weight to the edges that represent the extent to which a topic requires knowledge of another. Then, a cycle detection algorithm can be run on the graph. Once all the cycles are detected, the edges in the cycles with the lowest weight can be systematically removed until there are no more cycles and the graph is a DAG. At that point, the specialized topo sort from above may be used. I plan on implementing this when I have more time :)

--

It's my hope that these and other qualitative methods for planning curricula can be applied to other legendarily confusing fields of study. These methods can even be applied to entire undergraduate course catalogues and major requirements to guide students over 4+ years of undergraduate study.

share this: Facebooktwittergoogle_plusredditpinterestlinkedintumblrmail

What does Flatland have to do with Haskell?

Edwin Abbott's 1884 novella, Flatland, recounts the misadventures of a square that lives in a two-dimensional world called "Flatland". In this story, the square has a dream where he visits a one-dimensional world (Lineland) and unsuccessfully tries to educate the populace about Flatland's existence. Shortly thereafter, a sphere visits Flatland to introduce our protagonist to his own home, Spaceland. The sphere looks like a circle to the square, because the square can only see the part of the sphere that intersects Flatland's plane. The square can't fathom Spaceland until the sphere actually brings him into the third dimension.

Having his view of reality sufficiently rocked, the square postulates the existence of still higher dimensional lands. The sphere denies this possibility and returns the square to Flatland on bad terms.

The irony here is that, in spite of being aware of the square's previous insistence that Flatland is the only reality there is, the square's corresponding limitation of thought, and the square's subsequent enlightenment; the sphere arrogantly asserted that his own land represented the limits of dimensionality and that there can be no more dimensions.

I begin with this summary not as an impromptu book report, but because this is a perfect analogy for why I've decided to learn Haskell.
-
It isn't hard to imagine the inveterate C programmer being hesitant to embrace higher-level languages like Python and Perl. "Imagine the whole world that opens up to you when you don't have to worry about memory management and resolving pointers," requests the proselytizing Python programmer.

But the C programmer got on fine without knowing Python all this time. Besides the (standard) Python interpreter is written in C.

"Why? So I can change types on a whim?" retorts the C programmer. "...so I can pass functions as arguments to other functions? Big deal! I can do that with function pointers in C."

While it is technically true that, in C, functions can be passed as arguments to other functions, and can be returned by functions, this is only a small part of the functional programming techniques that Python allows. But it is the only part that most only trained in C can understand. This is like when the square thought that he saw the sphere because he saw a circle where the sphere intersected Flatland. There's a bigger picture (lambda functions, currying, closures) that only appears when the constraints imposed by a programming paradigm are pushed back.

Shifting tides in industry eventually compel the C programmer to give Python a shot. Once she is fluent in Python and its idioms (becomes a "pythonista", as they say) things begin to change. The OOP of Python allows the programmer to gain a new perspective on programming that had been, up to this point, unavailable to her simply because the concept doesn't exist in C.

Resolved to follow the road to higher abstraction, the (former) C programmer asks the Python programmer if there are other languages that will provoke a similar shift in thought relative to even Python. "Haskell, perhaps?" she offers.

The Python programmer scoffs, "Nah, Python is as good as it gets. Besides, purely functional programmers are a bunch of weirdos."

Lands

Haskell is a relatively obscure programming language (20th place or lower on most popularity indices), but accounts for a disproportionate number of the "this-language-will-change-the-way-you-look-at-programming" claims. By the accounts of Haskell aficionados, finally understanding Haskell is a transformative experience.

Up until recently, I found myself mirroring the thoughts and behavior of the Python programer in this story; I keep hearing about how great Haskell is, and how it'll facilitate an enormous shift in how I'll view computer programming, but I largely dismissed these claims as the rantings of eccentrics that are more concerned with mathematical elegance than practical concerns.

Is it any wonder that I didn't believe the Haskell programmers? I can't see the benefits of Haskell within the constraints of how I view computer programming—constraints subtly imposed by my language of choice. (Lisp programmer Paul Graham describes this as the "Blub Paradox" in this essay Beating The Averages)

But just like the sphere and the arrogant Python programmer should, I have to remain open to the idea that there's a whole world out there that I'm not availing myself of just because I’m too obstinate to try it.

Any language that is Turing-complete will let you do anything. It's trivially true that there is nothing that you can do in one language that you can't do in another. What sets languages apart (from most trivial to least) is the elegance of the syntax, it's community and third party packages, the ease in which you can perform certain tasks, and whether you’ll achieve enlightenment as a result of using it. I don't know if Haskell will do this for me but I think I ought to give it a shot.

---
If the ideas of relativity when applied to the domain of programming languages interests you, you might also be interested in the Sapir–Whorf hypothesis, which applies similar ideas mentioned in this post to the realm of natural languages.

share this: Facebooktwittergoogle_plusredditpinterestlinkedintumblrmail