Chapter 12: Functions

In software, a surprising amount of program source code is very general, and copying and pasting code is an extremely common programming practice. For example, when code is written to get a number from the user, that same code might get reused in a large number of future programs. These common blocks of Python code might include a block to verify that numbers are in acceptable ranges, a block to check that strings fit a certain requirement, or complicated print statements that work with general data structures. Instead of copying and pasting over and over again, duplicating code and making things more complicated, it is possible to write functions to handle repeated tasks. For example, getting input from the user doesn't require writing complicated keyboard handling code with a termination condition that occurs when the user hits enter. You just call input, and it does the magic. You can write your own functions in the same way, abstracting away the details of an otherwise common set of program code.

Building simple functions

Let's start with a very simple example. Let's look at the max function, and specifically max with two parameters. The max function accepts two values, usually numbers, and returns the maximum value out of those two numbers.

>>> max(3, 5)
5
>>> max(7, -2)
7

It's been easy so far to just accept that this function takes some parameters, makes some magic happen, and then returns a value that fits the function's purpose. In the first example, the two numbers are 3 and 5. Since 5 is clearly the maximum number out of those two, max should return a 5. So how could we actually write this function for ourselves?

Recall that when we've talked about these functions previously, we've used the idea that they accept parameters as input and return a value as output. A convenient paradigm that might help is simply asking another person for the answer. Let's say you walk up to me and ask me to tell you whether 3 or 5 is bigger. I stand there, think about it for a second, and tell you that 5 is bigger. You seem satisfied, and walk away. It isn't important how I came up with the answer, only that it's right.

In my head, I might have remembered that you first gave me the number 3, and that you next gave me the number 5. In a function, you specify the variable names that the input parameters are going to take.

firstNumber = 3
secondNumber = 5

Next, I probably used the equivalent of an if-statement to see if the first number was larger than the second number. If it was, I returned the first number to you. If it wasn't bigger, I returned the second. In Python, the return statement does this. The way that max, input, and other familiar functions give a value back isn't with print, but with return.

if firstNumber > secondNumber:
    return firstNumber
else:
    return secondNumber

If you try to run this on its own, Python will complain about a SyntaxError, and inform you that you can't return outside of a function. To allow us to make calls like myMax(3, 5) in the same way that we called max(3, 5), we'll use a keyword called def, which is short for define. It tells Python that we are defining a function to be used just like the other built-in functions. In the same statement, we'll indicate whether or not we're expecting any variables to be passed as parameters to the function, and what we'd like to call them in the context of our new code.

def myMax(firstNumber, secondNumber):
    if firstNumber > secondNumber:
        return firstNumber
    else:
        return secondNumber

The new myMax function assumes that the user is going to provide two numbers, and that we're going to refer to them as firstNumber and secondNumber. We take the same if-statement given above and return a value based on which input is bigger. For this function, that's really all that we need. Let's test it out, just like we tried the max function.

>>> max(3, 5)
5
>>> myMax(3, 5)
5
>>> max(7, -2)
7
>>> myMax(7, -2)
7

The new myMax looks exactly the same as max, except that it has a different name. Each time we call it, the function takes the two values and assigns them to the variable names in the function definition. It's as if we're doing something like the following:

def myMax():
    firstNumber = 3
    secondNumber = 5
    if firstNumber > secondNumber:
        return firstNumber
    else:
        return secondNumber

Of course, we can't write the function that way, since we wouldn't be able to run it with the 7 and -2 in the second case (or any other case, for that matter). Python knows that when you specify the function parameters in the definition and call the function later on, it should take those values and substitute them into the variable names you give.
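As a small illustration of that substitution, the sketch below calls one definition with literal numbers, with variables, and with the result of another call. The names my_max, low, high, and largest_of_three are hypothetical and chosen only for this example; the logic is the same as myMax.

```python
def my_max(first_number, second_number):
    """Return the larger of two values (same logic as myMax)."""
    if first_number > second_number:
        return first_number
    else:
        return second_number

# Literal values are substituted for the parameter names.
print(my_max(7, -2))        # first_number = 7, second_number = -2

# Variables work too: their values are what get passed in.
low = 12
high = 40
print(my_max(low, high))    # first_number = 12, second_number = 40

# The result of one call can be an argument to another call.
largest_of_three = my_max(my_max(3, 5), 4)
print(largest_of_three)
```

Each call gets its own fresh firstNumber-style variables, which is why one definition can serve every pair of values.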

One important thing to note is that these variable names only exist in the context of the function itself. There is a fundamental mechanism in Python called variable scope, which determines where in a program a variable name can be used and how long the variable lives while the program is running. A variable created inside a function exists only while that particular call to the function is running. Once the function returns a value, all the other values it created disappear. We can show this with an example.

>>> myMax(4, 9)
9
>>> firstNumber
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    firstNumber
NameError: name 'firstNumber' is not defined

If myMax gets called, we know that a firstNumber variable is created and set to 4. So why do we get a NameError when trying to reference firstNumber after the function is completed? It's thanks to the variable scope just referenced. Once the function is done executing, it cleans up after itself and gets rid of the other unnecessary variables.

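In a similar vein, variables that you define in the main body of your program shouldn't be relied upon inside the function. A function can read such a variable, but assigning to the same name inside the function creates a separate local variable instead of changing the outer one. If you want a function to be able to use a value, you really should be passing it in as a function parameter.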
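A minimal sketch of this behaviour follows. The names counter, bump_local, and bump_param are made up for illustration. The first function assigns to a name that also exists outside it, and the second receives the value as a parameter and returns a result.

```python
counter = 10  # defined in the main body of the program

def bump_local():
    # This assignment creates a new local variable named counter.
    # The outer counter is left untouched.
    counter = 99
    return counter

def bump_param(value):
    # The value arrives as a parameter and the result is returned.
    return value + 1

local_result = bump_local()
new_counter = bump_param(counter)

print("counter = {0}".format(counter))            # still 10
print("local_result = {0}".format(local_result))  # 99
print("new_counter = {0}".format(new_counter))    # 11
```

Passing the value in and returning the result keeps it obvious, at the point of the call, which data the function depends on.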

Time-saving functions

As stated earlier, a big advantage of functions is that they allow you to write efficient pieces of code that you can use again and again. One example of this occurred where we used a try block to repeatedly ask for input until a valid number was received. So how can we turn this code into a function?

First, ask yourself what the input and output conditions of this function should be. Are there any values to consider as input? In this simple case, there aren't. We know that we want to get a number from the user, so we can write a function called getNumber with no function parameters.

def getNumber():

Next, what are we expecting as output from this function? In this case, we'd like to get a single int value returned. For example, it would be nice to use our function like this:

print(getNumber())
age = getNumber()
x = getNumber() + y

It should ideally look like any of the other previous built-in functions that we've come across, and should operate as a function that gets a value from input and returns the int representation of that value. If the code from the previous chapter is copy-pasted and modified to include the def statement above and a return statement that provides the value retrieved, it can be used in the following way.

def getNumber():
    done = False
    while not done:
        try:
            user_number = int(input("Enter a number other than zero: "))
            done = True
        except ValueError:
            print("You didn't enter a number! Shame on you.")
    return user_number

Again, the only changes that have been made in this block of code are the def statement at the start, and the return statement at the end. All of the other code is the same. With this function, we now have access to a reusable piece of code that hides away all the error checking and verification, and just does the right thing.

>>> print(getNumber())
Enter a number other than zero: Alexander
You didn't enter a number! Shame on you.
Enter a number other than zero: 5
5
>>> print("Your number is: {0}".format(getNumber()))
Enter a number other than zero: 5
Your number is: 5

In the contacts_obj example from the dictionaries chapter, we asked the user for a nickname and printed out the full contact information based on the presence of a key, and gave an appropriate message if the key wasn't found. Let's pull out the full contact section as a function, and see what that changes.

def printContact(contact_dictionary):
    for key in contact_dictionary:
        print("    {0}: {1}".format(key, contact_dictionary[key]))

nickname = input("Enter the contact you would like to see: ")
if nickname in contacts_obj:
    print("Contact: {0}".format(nickname))
    printContact(contacts_obj[nickname])
else:
    print("Sorry, I don't know {0}!".format(nickname))

Enter the contact you would like to see: alexander
Contact: alexander
    town: Kingston
    phone: 5551763
    first_name: Alexander
    last_name: Coder

Nothing about the behaviour of the program has changed. It still runs exactly as it did before, and we're able to search based on contact information. What is new is the printContact function, which takes care of all the key-based output for a contact dictionary. Instead of referencing contacts_obj[nickname], as was the case for the previous example, we pass in the contacts_obj[nickname] dictionary as a function parameter to the printContact function. Now printContact expects that there is a dictionary in the contact_dictionary variable, and runs the exact same code with the new variable.
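A possible variation, sketched here under assumptions, is a function that returns the formatted lines instead of printing them, so the caller decides what to do with the text. The name format_contact and the sample dictionary are hypothetical; the sample values are modelled on the output shown earlier.

```python
def format_contact(contact_dictionary):
    """Build the indented lines for one contact and return them as a list."""
    lines = []
    for key in contact_dictionary:
        lines.append("    {0}: {1}".format(key, contact_dictionary[key]))
    return lines

# Assumed sample data, not the real contacts_obj from the earlier chapter.
sample_contact = {"town": "Kingston", "phone": 5551763}

# The caller chooses to print each returned line.
for line in format_contact(sample_contact):
    print(line)
```

Returning a value rather than printing follows the same idea as max and input: the function hands something back, and the calling code chooses whether to print it, store it, or combine it with something else.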

Function parameters can be any variable type. Whether you want to pass numbers, strings, lists, dictionaries, or other types, you'll have no problem giving the information to your function.

Parameter passing by value

An important distinction in Python's treatment of function parameters that differs from standard variables is that the function parameters are passed by value. If you walked up to me on the street and asked me to tell you whether or not 3 was bigger than 5, I probably wouldn't be able to trick you into believing that you asked if 6 is bigger than 3. Whether or not it's true that 6 is bigger than 3 (last time I checked, it was), that wasn't the question that you asked me. You asked about 5, and it shouldn't be possible for me to switch that value on you.

This is the sort of thing meant when speaking about passing by value. With basic variable types, you're not passing a mutable piece of data to me that can be modified and given back to you. Function parameters are sacred. If you give me an int, you're essentially giving me a copy of that variable with the same value. You're giving me the value in that variable, and not access to the variable itself.

def modifyValue(firstNumber, secondNumber):
    firstNumber = firstNumber + secondNumber
    secondNumber = secondNumber - 10
    return True

a = 3
b = 5
modifyValue(a, b)
print("a = {0}, b = {1}".format(a, b))

a = 3, b = 5

If it were true that you'd literally given modifyValue access to the a and b variables and not the values stored in a and b, the function could have actually modified a and b. That sort of behaviour would be very bad, and might allow me to trick you into believing that your data wasn't what it actually was in the first place. Imagine if we had a verifyPassword function where the function could change your password inside of itself. That's not okay.

Passing by value is Python's way of making sure that functions have a minimal ability to be malicious with your data, whether intentional or not. When you call the modifyValue function above, the value in a is copied into firstNumber, and the value in b is copied into secondNumber. When the function finishes, a doesn't have the value 8 (firstNumber + secondNumber), and b isn't equal to -5 (secondNumber - 10). They retain their original values.
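If the caller does want a changed value, the usual route is for the function to return it and for the caller to do the assignment. The sketch below uses a hypothetical add_numbers function to show this.

```python
def add_numbers(first_number, second_number):
    # Reassigning the parameter changes only the function's own copy.
    first_number = first_number + second_number
    return first_number

a = 3
b = 5
total = add_numbers(a, b)
print("a = {0}, b = {1}, total = {2}".format(a, b, total))

# The caller decides whether a should change, by assigning the result.
a = add_numbers(a, b)
print("a = {0}, b = {1}".format(a, b))
```

The first print shows a still equal to 3 and total equal to 8. Only the explicit assignment on the caller's side changes a.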

Surprisingly, not all variables hold completely true to this concept. Even passing by value doesn't mean that the copy that you get is guaranteed to be exactly what it was when you made the copy the first time. I know, I just said... Well, hear me out. Imagine that you found a treasure map while walking around town and you wanted to show a copy to me. To make sure that you didn't lose the map, you photocopied it, and gave the copy to me. You still have your own copy of the map. Even if I tear it up or change the route, nothing happens to your copy. However, if I go and steal the treasure, your map doesn't really have the same value that it did before, since there's no treasure at the end. Thanks for the treasure though!

Lists and dictionaries are examples of data structures that exhibit this type of behaviour. When you pass a list or a dictionary to the function, it doesn't change the fact that you're pointing to the same list that you were the first time. However, the function can insert or delete elements inside the data structure. For example,

def noChangeList(yourList):
    yourList = []

def addToList(yourList):
    yourList.append(25)
    yourList.append(70)
    yourList.remove(yourList[0])

myList = [3, 5, 9, 12, 99]
print("Original myList: {0}".format(myList))
noChangeList(myList)
print("Unchanged myList: {0}".format(myList))
addToList(myList)
print("Modified myList: {0}".format(myList))

Original myList: [3, 5, 9, 12, 99]
Unchanged myList: [3, 5, 9, 12, 99]
Modified myList: [5, 9, 12, 99, 25, 70]

The noChangeList function attempts to set yourList to an empty list. If Python weren't using pass by value, you might expect myList to become an empty list. It doesn't though, just as you wouldn't be fooled by me handing you an empty treasure map. However, addToList is able to take the list data structure and make changes to the values inside of it. The pass by value requirement only applies at the top-most level of the data structure, and the values inside that data structure can be shifted around.

A dictionary will exhibit the same behaviour. Let's modify addToList and create addToDict to insert and delete elements from a dictionary.

def noChangeDict(yourDict):
    yourDict = {}

def addToDict(yourDict):
    yourDict.clear()
    yourDict["Alexander"] = True

myDict = {"test": 35, "foo": 21, "bar": 5}
print("Original myDict: {0}".format(myDict))
noChangeDict(myDict)
print("Unchanged myDict: {0}".format(myDict))
addToDict(myDict)
print("Modified myDict: {0}".format(myDict))

Original myDict: {'test': 35, 'foo': 21, 'bar': 5}
Unchanged myDict: {'test': 35, 'foo': 21, 'bar': 5}
Modified myDict: {'Alexander': True}

These changes are acceptable because lists and dictionaries are mutable types, while strings and numbers are immutable. A list and a dictionary are data structures that can be modified in-place; adding an element to a list doesn't return a new list, but changes the very list that you are actually adding the element to. When you try to modify a string, you're not able to do so, and have to resort to getting a new string based on the old one. You can't change numbers or strings normally, so you also can't modify them as function parameters.
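The contrast between the two kinds of types can be sketched side by side. The function and variable names below (shout, add_marker, and so on) are invented for this example. The last call also shows one way a caller might protect a list, by passing a copy made with list().

```python
def shout(text):
    # upper() returns a new string; the original string is not changed.
    text = text.upper()
    return text

def add_marker(items):
    # append() changes the list in place and returns nothing useful.
    items.append("!")

name = "alexander"
loud_name = shout(name)
print(name, loud_name)

letters = ["a", "b"]
add_marker(letters)
print(letters)

protected = ["a", "b"]
add_marker(list(protected))   # the function only sees a copy
print(protected)
```

Here name keeps its lowercase value, letters gains a new element, and protected is unchanged because the function only ever modified the copy.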

Default function parameters

The print function has a couple of optional parameters that don't need to be specified if the defaults are acceptable. The sep parameter acts as a separator when multiple values are being passed in. By default, the separator is a single space. The end parameter indicates how print should terminate the string being printed to the screen, and the default is terminating by moving to the next line.

a = "Alexander"
b = "Coder"
print(a, b)
print(a, b, sep="")
print(a, b, sep="     ")
print(a, b, sep=":)")

Alexander Coder
AlexanderCoder
Alexander     Coder
Alexander:)Coder

print(a)
print(b)
print(a, end="")
print(b)

Alexander
Coder
AlexanderCoder

We've come an awfully long way since print was introduced, but the behaviour exhibited here might be a little unfamiliar. The idea of specifying optional function parameters to modify the function in a particular way is different than forcing the user to explicitly say what the default separator and the default end string are.

So what's different about print? How did the code define an optional default parameter that isn't required, but is there if the programmer wants to access it? Let's modify getNumber from earlier to allow an optional minimum value for the number. At first, we'll set this as a variable inside the function itself. If the value of the variable is False, we'll assume the user doesn't have an actual minimum requirement. Otherwise, the code should check to see if the number is in the acceptable range, and repeat the loop if the number is too low.

def getNumber():
    minimum_value = False
    done = False
    while not done:
        try:
            user_number = int(input("Enter a number: "))
        except ValueError:
            print("You didn't enter a number! Shame on you.")
            continue
        # "is False" rather than "== False", so a minimum of 0 still counts as a minimum.
        if minimum_value is False or user_number >= minimum_value:
            done = True
        else:
            print("Your number was too low!")
            print("The minimum value is {0}.".format(minimum_value))
    return user_number

>>> getNumber()
Enter a number: Alexander
You didn't enter a number! Shame on you.
Enter a number: -500
-500

Now modify minimum_value to have the value 5. Any number less than 5 should be rejected.

def getNumber():
    minimum_value = 5
    # The rest of the previous code follows here..

>>> getNumber()
Enter a number: -500
Your number was too low!
The minimum value is 5.
Enter a number: 5
5

Excellent! Now this code has a variable to check whether or not a value is too low, and if so, forces the user to enter in a new value. We were modifying this variable manually in the function, but it seems like the perfect candidate for a function parameter. Let's move it right up into the function definition itself and see how this changes things.

def getNumber(minimum_value):
    done = False
    while not done:
        try:
            user_number = int(input("Enter a number: "))
        except ValueError:
            print("You didn't enter a number! Shame on you.")
            continue
        if minimum_value is False or user_number >= minimum_value:
            done = True
        else:
            print("Your number was too low!")
            print("The minimum value is {0}.".format(minimum_value))
    return user_number

The only things that are actually different here are the minimum_value statement being moved up to the function parameter list, and the lack of a default starting value for the minimum_value variable. When this function gets called now, we'll have to specify the value that should be placed in that variable.

>>> getNumber(5)
Enter a number: -500
Your number was too low!
The minimum value is 5.
Enter a number: 5
5
>>> getNumber(False)
Enter a number: -500
-500

This is working exactly as we'd expect. If we give it a number value, it'll treat that as the minimum acceptable value for input. If it gets False instead, there is no minimum requirement.

To make this code really useful, it would be nice if the programmer didn't need to know that False was the magic value that allowed the code to ignore the minimum value check. Why didn't we use None? Or True for "allow any value"? By default, the programmer probably doesn't want to specify a minimum value, and we'd like to save them the trouble of having to specify False each time they want to get a number.

If you include an equals sign in the parameter definition, the function will treat that as the default value if it isn't provided by the programmer. Also, if the programmer wants to set that value, they can give the variable name when they're calling it, just like in the sep and end examples with print. To get this in our new function, we only have to make one simple change.

def getNumber(minimum_value=False):

The default we'd like to use is False, so we advise the function that if the variable isn't set when the function is called, the default is already in place. The function is then free to use that variable, even if the programmer doesn't know or doesn't care about it.

>>> getNumber()
Enter a number: -500
-500
>>> getNumber(minimum_value=5)
Enter a number: -500
Your number was too low!
The minimum value is 5.
Enter a number: 5
5
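A sentinel such as False needs a little care. In Python, 0 == False evaluates to True, so a check written with == cannot tell a minimum of 0 apart from "no minimum". One common alternative, sketched below, is to use None as the default and test for it with "is None". The helper is_acceptable is a hypothetical name and covers only the range check, not the input loop.

```python
def is_acceptable(user_number, minimum_value=None):
    """Return True if user_number satisfies the optional minimum."""
    if minimum_value is None:
        # No minimum was requested, so every number is fine.
        return True
    return user_number >= minimum_value

print(0 == False)                            # True: the catch with ==
print(is_acceptable(-500))                   # no minimum given
print(is_acceptable(-1, minimum_value=0))    # 0 is treated as a real minimum
print(is_acceptable(7, minimum_value=5))
```

With None as the sentinel, a minimum of 0 behaves like any other number, and the programmer calling the function never has to know what the magic value is.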

You can omit these default parameter values entirely, or you can have one or more, as long as they all show up at the end of the parameter list. You aren't necessarily required to give the variable name when calling the function. We could have just written getNumber(5) instead of specifying explicitly that we were modifying minimum_value. However, once you get into multiple defaults, you need some way of telling Python exactly which default you're changing, so it's good practice to include the variable name when you're writing code like this.
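The difference between positional and named arguments can be sketched with a small function that has two defaults. The greet function and its parameter names are hypothetical, chosen to keep the example free of keyboard input.

```python
def greet(name, greeting="Hello", punctuation="!"):
    """Build a greeting; greeting and punctuation are optional."""
    return "{0}, {1}{2}".format(greeting, name, punctuation)

# Both defaults are used.
print(greet("Alexander"))

# A second positional argument fills the first default, greeting.
print(greet("Alexander", "Good morning"))

# Naming the parameter lets us skip greeting and change only punctuation.
print(greet("Alexander", punctuation="?"))

# Named arguments can be given in any order.
print(greet("Alexander", punctuation=".", greeting="Goodbye"))
```

Without the name, there would be no way to change punctuation while leaving greeting at its default, which is why naming the parameter is the safer habit once a function has several defaults.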

To show multiple default parameter values, let's make the input string a customizable value. Instead of the boring old "Enter a number" string, let's make it something that the programmer specifies. Maybe they want an age, or a year, or some other relevant numeric value that may or may not have a minimum. We'll call it input_text.

def getNumber(minimum_value=False, input_text="Enter a number: "):
    done = False
    while not done:
        try:
            user_number = int(input(input_text))
        except ValueError:
            print("You didn't enter a number! Shame on you.")
            continue
        if minimum_value is False or user_number >= minimum_value:
            done = True
        else:
            print("Your number was too low!")
            print("The minimum value is {0}.".format(minimum_value))
    return user_number

>>> getNumber()
Enter a number: -500
-500
>>> getNumber(minimum_value=10)
Enter a number: 5
Your number was too low!
The minimum value is 10.
Enter a number: 15
15
>>> getNumber(minimum_value=10, input_text="Enter your age: ")
Enter your age: 3
Your number was too low!
The minimum value is 10.
Enter your age: 18
18

Now the input function is a little more customizable. The input statements have more character, and you can probably imagine using them in a wider range of situations. What about other default variables that a general function like getNumber might want to include, like a maximum value in addition to the minimum value? Consider this a challenge; this example is actually left as an exercise at the end of the chapter, and you'll get a lot of good practice by going through and coding it up. In fact, you might gain a new function that you use in many of your own programs going forward!

Advanced sorting

We've taken some time to discuss lists, and given a simple built-in way to sort the elements of a list using sorted. Unfortunately for us, the only elements that can be sorted are ones that have implemented comparisons between one another. Dictionaries have no such ordering, so we can't do something like this:

my_obj = [
    {"name": "Alexander", "age": 30},
    {"name": "Old Guy", "age": 85},
    {"name": "Young Kid", "age": 2},
]

print(sorted(my_obj))

Traceback (most recent call last):
  File "C:\Python33\sandbox.py", line 10, in <module>
    print(sorted(my_obj))
TypeError: unorderable types: dict() < dict()

It would be nice to get a list of those dictionaries, sorted by age or even by name, but there is no built-in way to order one dictionary against another.

The sort and sorted functions have an optional function parameter called key that you can specify. The key value, rather than being a list or a string, is actually a reference to a function that returns a comparable value. What that means is that you'll tell sort how exactly to sort the dictionaries by suggesting that it call key(d) for dictionary d and using the returned value as the ordering. If key is a function that accepts a dictionary and returns the value for the "age" key, we can obtain a sorted list of the elements in my_obj.

def getAge(d):
    return d["age"]

my_obj = [
    {"name": "Alexander", "age": 30},
    {"name": "Old Guy", "age": 85},
    {"name": "Young Kid", "age": 2},
]

print(sorted(my_obj, key=getAge))

[{'age': 2, 'name': 'Young Kid'}, {'age': 30, 'name': 'Alexander'}, {'age': 85, 'name': 'Old Guy'}]

The syntax is a little peculiar, and requires a bit of explanation. It isn't the case that we're calling the getAge function ourselves. We simply define a function that can be used on our dictionaries, and give a reference to that function to the sort functions so they can use it to handle our data. It's like providing an instruction manual for a gadget that you might not otherwise know how to use. The sort functions can't sort arbitrary dictionaries on their own, but they can do it if you tell them how.
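The same mechanism works for any kind of element, not just dictionaries. As a sketch, the example below sorts a list of strings in two different ways: once with the built-in len as the key, and once with a small hypothetical function called last_letter. The word list is made up for illustration.

```python
def last_letter(word):
    """Return the final character of a word, used as the sort key."""
    return word[-1]

words = ["kiwi", "fig", "banana", "plum"]

# No parentheses after len or last_letter: we pass the function itself.
by_length = sorted(words, key=len)
by_last_letter = sorted(words, key=last_letter)

print(by_length)        # ['fig', 'kiwi', 'plum', 'banana']
print(by_last_letter)   # ['banana', 'fig', 'kiwi', 'plum']
```

Notice that kiwi and plum have the same length and keep their original relative order. Python's sort is stable, so elements with equal keys are not shuffled.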

Passing a function as a value is a funny idea too, but it is important to realize that function names are actually variables that refer to the function's code. By declaring key=getAge, we're allowing sorted to call key(d) instead of having to write a new sorted function that explicitly uses getAge(d). Function implementations are small programs, and the names are simply references to those pieces of code.

You can show this by using a function as a variable in your own code, although you are definitely advised to use this as an experiment and not as an actual coding practice!

alexander = sorted
print(alexander(my_obj, key=getAge))

[{'age': 2, 'name': 'Young Kid'}, {'age': 30, 'name': 'Alexander'}, {'age': 85, 'name': 'Old Guy'}]

Holy smokes.

This idea of functions as variables can give sorted the ability to sort arbitrary lists of data, as long as you're able to give it a way to compare the elements against each other. A function like getAge that accepts a single parameter empowers sorted to make sense of the data and to move things around in a sensible way.

Breaking Stuff

Functions are clever ways of compartmentalizing code blocks so that they can be called from other parts of the code. They are a core part of programming, and are so widely used that you're probably starting to think of them as just another part of the language. However, there's one thing we haven't touched on yet. Can a function call itself?

Let's start with a problem: we want to write a piece of code that, when given an integer as input, calculates the total sum of the squares less than or equal to that integer. For example, the sum of squares for the value 3 should return 14 because the square of three is nine, the square of two is four, the square of one is one, and the sum of those squares is fourteen. Now consider the following piece of code.

def sum_squares(x):
    return x * x + sum_squares(x - 1)

print(sum_squares(3))

At first glance, that looks fair enough. We define a new function called sum_squares that takes a single variable called x as input. To calculate the sum of squares, we could write a loop that iterates over the values from 1 to x. However, for the sake of trying something new, we consider the fact that calculating the sum of squares for a variable x is the same as calculating the sum of the square of x plus the sum of the squares of all the values less than x. That clever insight allows us to consider the problem recursively, or in other words, allows us to consider the solution in terms of the original problem. We write a function that phrases the solution in relation to itself, and write the code such that it calls itself.

In the example code, we solve this problem by returning the sum of the square of the current value stored in x and the value returned by our new sum_squares function when called with a value of x-1.

Now, if we try to run this code, we get a surprising result:

...
RuntimeError: maximum recursion depth exceeded

What on Earth does that mean? What's the maximum recursion depth, and why did we exceed it?

Recursion, as we stated above, is a way of writing code that calls itself. This is a wonderful way of writing code, but it requires one important truth. To be useful as a piece of recursive code, the function must, at some point, stop calling itself. If it doesn't, the code will call itself forever.

Consider the code that we wrote above. It manages to state the problem of finding the sum of squares in terms of itself, and that's fine. However, it never stops doing this! When called with the value 3, it tries to find the sum of squares of the value 2. When called with the value 2, it tries to find the sum of squares of the value 1. When called with the value 1, it tries to find the sum of squares of the value 0, and so on. This causes a problem when the computer gets so deep in calls to sum_squares that it runs out of room. This is the maximum recursion depth error; we've called the function too many times, and we've run out of space to call it again.
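The limit itself can be inspected with sys.getrecursionlimit(), and the error can be caught like any other exception. The sketch below uses a hypothetical copy of the function, named endless_sum_squares, so that the failure is deliberate. Recent Python versions report this error as RecursionError, which is a kind of RuntimeError, so catching RuntimeError covers both.

```python
import sys

def endless_sum_squares(x):
    # Deliberately has no stopping condition.
    return x * x + endless_sum_squares(x - 1)

print("Recursion limit: {0}".format(sys.getrecursionlimit()))

try:
    endless_sum_squares(3)
    hit_limit = False
except RuntimeError:
    # RecursionError is a subclass of RuntimeError in recent versions.
    hit_limit = True

print("Hit the limit: {0}".format(hit_limit))
```

The exact limit depends on the Python installation, which is why it is printed rather than assumed.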

Our code needs a stopping condition, or what is also called a base case. At some point, we have to return a value that doesn't depend on sum_squares. One way to do this is to write something like the following:

def sum_squares(x):
    if x == 1:
        return 1
    else:
        return x * x + sum_squares(x - 1)

Now when we ask for the value, we should eventually hit that anchor.
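To check that the recursive version gives the same answers as the loop mentioned earlier, the sketch below puts both side by side. The name sum_squares_loop is hypothetical, and the comments trace how the call for the value 3 unwinds.

```python
def sum_squares(x):
    if x == 1:
        return 1
    else:
        return x * x + sum_squares(x - 1)

def sum_squares_loop(x):
    """Add up the squares of 1 through x with an ordinary loop."""
    total = 0
    for value in range(1, x + 1):
        total = total + value * value
    return total

# sum_squares(3) = 9 + sum_squares(2)
#                = 9 + 4 + sum_squares(1)
#                = 9 + 4 + 1
print(sum_squares(3))        # 14
print(sum_squares_loop(3))   # 14
```

Both approaches are valid. The loop keeps a running total, while the recursive version lets each call handle one square and hand the rest of the problem to the next call.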

To be thorough, you'll need to add some additional checks to see if the value is less than or equal to zero, if the value isn't an integer, or any other number of strange things. However, be aware of recursion, and have fun trying to break it in other ways!
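One possible shape for those checks is sketched below. The name safe_sum_squares is hypothetical, and two design decisions here are assumptions rather than the only reasonable choices: non-integers are rejected with a TypeError, and any value less than or equal to zero returns 0.

```python
def safe_sum_squares(x):
    """Sum of squares from 1 to x, with basic input checks."""
    if not isinstance(x, int):
        # Assumption: refuse anything that is not an integer.
        raise TypeError("safe_sum_squares expects an integer")
    if x <= 0:
        # Assumption: there are no squares to add, so the sum is 0.
        # This also acts as the base case for the recursion.
        return 0
    return x * x + safe_sum_squares(x - 1)

print(safe_sum_squares(3))    # 14
print(safe_sum_squares(0))    # 0
print(safe_sum_squares(-4))   # 0
```

Very large inputs can still exceed the maximum recursion depth, since each value of x needs one more nested call.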

Summary

Python functions are an ideal way to make your code more efficient, readable, and understandable. They can help chop up longer programs into smaller, more manageable pieces, and can provide you with individual components that can be brought into your other programs. The art of designing functions can help you understand how other built-in functions actually operate, and will quickly become a major building block in your arsenal of programming tools.

Exercises

1) Add the maximum value as an optional parameter value to the getNumber example from this chapter. Test it out by writing a program that asks the user what year they were born in, and come up with reasonable minimum and maximum values.

2) Revisit an older and reasonably large piece of code that you've already written, and try to refactor out a piece of the code into a function. In particular, look for a piece of code that is used in multiple places in your code, like an input function. How many functions are you able to identify without making every individual line its own function?

It is an interesting exercise to look at a piece of code to determine how many functions you can identify without going too far. You certainly don't want to replace a statement like x = a + b with a new function, as you'd just be duplicating the work of the addition operation. However, if you're doing a more complicated mathematical function, it might make sense to pull the entire line out and to replace it with a function. Where do you intuitively feel the line lies in the code you've written so far? There is no right answer for all cases, so feel free to experiment to see what feels right for you.

3) Add a getName function to the sorting example, and write a sorting call that sorts my_obj by the name of the person instead of the age. Add some new values to the dictionaries and some more entries, and build more complicated sorts.