Chapter 7: Loops

As we saw in the last chapter, an if-statement is an incredibly useful programming tool. However, one of the problems with an if-statement is that it is still essentially a one-time operation. For example, in our password program, it would be nice to return to the top and ask the user again and again until they either gave up or got it right.

Loops have the same type of functionality as if-statements, but they repeat until you tell them to end. In the password example, when the user's password is entered incorrectly, instead of just ending, a loop could return to the input statement and start again. We gain a lot of power from this new statement, and this chapter will explain how Python implements it.

History repeating

The idea of testing for a necessary condition before executing some code is a fundamental part of computer programming. It is impractical to write useful code that doesn't depend on the input it receives. The input may be strings of text from the prompt, files on a hard drive, or incoming data from the internet. Regardless of the context, checks need to be performed against the values of the variables, and programs get directed to the appropriate path when certain preconditions are satisfied.

With the examples that we saw in the previous chapter, we have gained the ability to route the user through several paths based on their input. Unfortunately, there are still some limitations. Say that a user enters a number outside the valid range, like a negative integer value for their age. The ideal program would realize that the age was invalid, and simply ask them again. After all, a friend doesn't turn us away when we tell them that we're negative five years old (well, I suppose that it depends on the friend). So why should our program end when it reaches an invalid value when we could simply ask again?

One solution is to check to see if the user gave us a bad value first, and ask them again if they broke the rules. An example from the previous chapter can be expanded in the following way.

age = int(input("Enter your age: "))
if age < 0 or age > 150:
    print("I think you're trying to trick me.")
    age = int(input("Enter your age: "))
print("Alright, you're {0} years old.".format(age))

Enter your age: -5
I think you're trying to trick me.
Enter your age: 30
Alright, you're 30 years old.

It looks good at first glance. But this doesn't really solve our problem, because the user can lie a second time and trick our program again.

Enter your age: -5
I think you're trying to trick me.
Enter your age: -5
Alright, you're -5 years old.

So how do we get around this problem? We can't just keep embedding input statements inside of if-statements again and again. ... Can we?

age = int(input("Enter your age: "))
if age < 0 or age > 150:
    print("I think you're trying to trick me.")
    age = int(input("Enter your age: "))
    if age < 0 or age > 150:
        print("I think you're trying to trick me.")
        age = int(input("Enter your age: "))
        if age < 0 or age > 150:
            print("I think you're trying to trick me.")
            age = int(input("Enter your age: "))
            if age < 0 or age > 150:
                print("I think you're trying to trick me.")
                age = int(input("Enter your age: "))
                # hmm.. again?
print("Alright, you're {0} years old.".format(age))

Eventually we're going to hit the limit of our copy-paste ninjitsu, and a troublesome user is going to be able to convince our program that they're negative five years old. What we need is a more robust solution. How about the while statement?

age = int(input("Enter your age: "))
while age < 0 or age > 150:
    print("I think you're trying to trick me.")
    age = int(input("Enter your age: "))
print("Alright, you're {0} years old.".format(age))

Enter your age: -5
I think you're trying to trick me.
Enter your age: -5
I think you're trying to trick me.
Enter your age: -5
I think you're trying to trick me.
Enter your age: -5
I think you're trying to trick me.
Enter your age: -5
I think you're trying to trick me.
Enter your age: 30
Alright, you're 30 years old.

The while keyword works like an if-statement and performs the necessary check to see if the inner block of code needs to be executed. The major difference is that when Python reaches the end of that block of code, it does the expression check again. If the check still returns True, the while loop goes back to the beginning of the block and runs it again, and it keeps doing so until the expression finally evaluates to False.
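As a minimal sketch of that re-checking behaviour, the loop below counts down from 3. It leaves out input entirely, and the names n and checks are made up for this illustration. Python tests n > 0 before every pass through the block, and the loop ends the first time that test comes back False.

```python
n = 3
checks = 0  # how many times the block has run so far
while n > 0:  # tested before every pass, including the first one
    checks = checks + 1
    print("n is {0}.".format(n))
    n = n - 1  # without this line the test would never become False
print("The block ran {0} times.".format(checks))
```

Running it prints three "n is" lines, followed by "The block ran 3 times."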

Let's rework the password example from the previous chapter using a loop. Here's the new code.

password = "spam"
done = False
while done == False:
    user_password = input("Enter the password: ")
    if password == user_password:
        done = True
    else:
        print("That's not the password! I can't let you in.")
print("SECRET ACCESS GRANTED.")

Enter the password: turkey
That's not the password! I can't let you in.
Enter the password: spam
SECRET ACCESS GRANTED.

The code adds a new loop, along with a boolean variable called done that is initialized to False. The done variable will only get set to True when the password entered by the user matches the one defined at the start of the program. If the password doesn't match, we print out the same old error message. If they do match, the loop will terminate because done is no longer False, and our secret message and other hidden program code will start to execute.

In the age example above, we have a really clear example of looping that only occurs when necessary. What happens when the user actually enters in a good value the first time instead of entering some invalid negative age that doesn't pass the expression test? In that case, the while loop never enters into the block that asks the user for their age again. The input was acceptable, and we move right past the indented block to the print statement below. The while loop only gets triggered when the data is invalid, and only stays active when the input remains unacceptable.
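Here is a small sketch of that "never enters the block" case. To keep it self-contained, the age is set directly instead of coming from input, and times_in_loop is a made-up counter that only exists to show whether the block ever ran.

```python
age = 30  # stand-in for a valid first answer; no input() call here
times_in_loop = 0  # hypothetical counter, only used to watch the loop
while age < 0 or age > 150:
    # This block is skipped entirely, because the test is False right away.
    times_in_loop = times_in_loop + 1
    age = 30
print("Alright, you're {0} years old.".format(age))
print("The loop block ran {0} times.".format(times_in_loop))
```

The last line reports that the block ran 0 times: the expression was False on the very first check, so Python went straight to the print statements.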

This type of example can be augmented with counter variables. These are standard variables that are increased based on the number of times a certain type of input has occurred. (Okay, it's really just an int that we increase by 1 every time something interesting happens. I'm just going to call it a counter variable here.) Let's change the above example slightly to add a counter variable that checks how many times the user has been difficult.

age = int(input("Enter your age: "))
counter = 1
while age < 0 or age > 150:
    print("That was input #{0}, and you tried to trick me.".format(counter))
    age = int(input("Enter your age: "))
    counter = counter + 1
print("Alright, you're {0} years old.".format(age))
print("Number of times that I had to ask you: {0}".format(counter))

Enter your age: 30
Alright, you're 30 years old.
Number of times that I had to ask you: 1

Enter your age: -5
That was input #1, and you tried to trick me.
Enter your age: -4
That was input #2, and you tried to trick me.
Enter your age: -7
That was input #3, and you tried to trick me.
Enter your age: 30
Alright, you're 30 years old.
Number of times that I had to ask you: 4

The counter gives a visible record of how many times the loop ran, which helps show how the loop actually works. The counter starts at 1 to match the first time we present the input statement and ask the user for their age. If they give an acceptable value, we're happy, and the counter never changes. However, every time the loop hits an error condition, defined in the code as one that causes the expression in the while-statement to evaluate to True, the block runs, we ask again, and the counter is incremented by one.

It is also interesting to note that the while loop never uses the counter variable in its expression. It's an extra variable that is used for information, and something that changes inside the loop block itself but never gets tested in the loop expression.

The break and continue statements

Let's say that we hit a case where the user has lied ten times in a row. We might get suspicious, and we might even think that the user is never going to tell us the truth. If that happens, we'll end up stuck inside our loop forever, never to return to normal program execution. How can we get out of this tricky situation?

Loops are pretty neat. They give us the ability to run for an arbitrary amount of time, processing data and collecting input until we hit some programmer-defined stopping point. If we have invalid ages, it's possible to keep forcing the user to give new data until they finally comply and give up their actual age. However, we might have edge cases where it would be better to end the loop early, or even to just stop the current execution and move on to the next iteration of the loop itself. If we ever hit a point in a loop where we'd like to break out for some reason, Python provides a couple of handy statements to allow us to do this.

age = int(input("Enter your age: "))
counter = 1
while age < 0 or age > 150:
    print("That was input #{0}, and you tried to trick me.".format(counter))
    age = int(input("Enter your age: "))
    counter = counter + 1
    # Give up after the fifth answer, whether or not it was valid.
    if counter >= 5:
        break
print("Number of times that I had to ask you: {0}".format(counter))
# Re-test the age itself, so a valid fifth answer (or an honest 0) still counts.
if age < 0 or age > 150:
    print("You didn't follow the rules, and I am sad.")
else:
    print("Alright, you're {0} years old.".format(age))

Enter your age: -5
That was input #1, and you tried to trick me.
Enter your age: -4
That was input #2, and you tried to trick me.
Enter your age: -3
That was input #3, and you tried to trick me.
Enter your age: -2
That was input #4, and you tried to trick me.
Enter your age: -1
Number of times that I had to ask you: 5
You didn't follow the rules, and I am sad.

In the example above, a condition is set up to make sure that a user who is intent on giving bad data can't succeed in hogging all of the program time. If they give five bad values, we assume they're playing games and get out of there using the break statement. When you break out of a loop, you tell the loop that there's no need to keep testing for the truth condition, and that you're satisfied with the state of the code. By using break, the loop terminates right there, and the rest of the program after the loop starts running.

Let's use a simple example to just show break on its own.

i = 1
while i > 0:
    print(i)
    i = i + 1
    if i > 5:
        break

1
2
3
4
5

Exiting the loop isn't the only action that might be needed. We might also want to stop execution of the current loop block while remaining in the loop itself. This is different from break, which halts the loop entirely. Here's an example that shows the difference between the two.

while True:
    num = int(input("Enter a positive number, or 0 to quit: "))
    if num < 0:
        print("Positive numbers only, please!")
        continue
    elif num == 0:
        print("OK, quitting!")
        break
    num_squared = num * num
    print("{0} times {0} is {1}.".format(num, num_squared))
print("All done!")

Enter a positive number, or 0 to quit: -1
Positive numbers only, please!
Enter a positive number, or 0 to quit: 3
3 times 3 is 9.
Enter a positive number, or 0 to quit: 5
5 times 5 is 25.
Enter a positive number, or 0 to quit: -4
Positive numbers only, please!
Enter a positive number, or 0 to quit: 0
OK, quitting!
All done!

We want to get a number from the user so that we can give them the square of that number. In this example, we're only asking for positive numbers, so if the user gives us a negative one, we don't want to quit, but we don't want to actually give them the square. We asked for a positive number, and we'd like to enforce that without quitting in the case of invalid input.

The continue statement is like a break that applies only to the current iteration of the loop. Continuing out of a block of code inside a loop ends the current iteration, returns to the top of the loop, checks the loop expression again, and carries on from there. Breaking out of a block of code ends the loop altogether.
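The difference can also be seen without any user input. The sketch below adds up the odd numbers from 1 to 10. The names n and total are made up for this illustration, and the % operator (remainder after division) is assumed to be familiar. The continue skips the rest of the block for even numbers, while the break ends the loop once n passes 10.

```python
total = 0
n = 0
while True:
    n = n + 1
    if n > 10:
        break  # leave the loop entirely
    if n % 2 == 0:
        continue  # even number: skip the addition and go back to the top
    total = total + n  # only reached when n is odd
print("The odd numbers from 1 to 10 add up to {0}.".format(total))
```

The total printed is 25, which is 1 + 3 + 5 + 7 + 9.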

To make sure that the loop runs until we actually tell it to stop, we use an expression that might be a little counter-intuitive at first glance. What does it mean to have a while loop that uses the True expression? Look at an if-statement for an example.

if True:
    print("Hello, world!")
else:
    print("Goodbye.")

Hello, world!

Of course it must always be the case that "Hello, world!" gets printed and that "Goodbye." is never seen. By definition, if the if-statement expression evaluates to True, we run the code in the if-statement's code block. True is always True, and True is never False, so it's impossible for us to get to the "Goodbye." print statement. However, when blocks of code are repeating, like with while loops, Python is asked to run the code forever until explicitly told to stop.

while True:
    print("Hello, world!")

Hello, world!
Hello, world!
Hello, world!
Hello, world!
[repeat forever..]

With break and continue, you gain the ability to write loops that grant you a greater level of control. You aren't bound to just allow the loops to run as they see fit. You have the power to either terminate the entire loop, or to simply terminate the current iteration in the loop by moving to the next one.

Let's build an example with a counter variable.

counter = 1
while True:
    print("Hello, world!")
    counter = counter + 1
    if counter > 5:
        break

Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!

This example still uses the True expression, asking Python to run forever until we tell it otherwise. With the break statement, the kill command is given to the while loop. Based on the number of times we've already printed our statement, we can ask Python to cut the loop short and continue on with the rest of the program code.

The for statement

A for statement differs from a while statement by actually requiring a set of elements that it will iterate over. For example, we might want a loop that runs over a range of numbers, like the integer values from 1 to 10. It is possible to set up a counter with a while loop to make this happen, but the for statement is set up to handle this type of situation easily and efficiently. Let's revisit the "Hello, world!" example from the previous section to see how it can be modified to print out the string five times using a for loop. To do that, we'll learn about a new built-in function called range.

for x in range(5):
    print("Hello, world!")

Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!

The range function provides an efficient way to access a sequence of numbers. When used with a for statement, it performs a loop that runs a fixed number of times. The example above asks Python to run the "Hello, world!" print statement five times. It does this by assigning the new variable x a particular value in the requested range for each iteration of the loop. If you remember how character positions in a string are counted, you'll recall that they start at zero, and range counts the same way. To show this, look at the following code:

for x in range(5):
    print("The value of x is {0}.".format(x))

The value of x is 0.
The value of x is 1.
The value of x is 2.
The value of x is 3.
The value of x is 4.

When the for statement is combined with the range built-in function, each iteration of the loop sets x to the next value in the range we requested. It's as if we were managing a counter by hand in a while loop. The for loop takes the next value automatically, sets the variable x to the new value, and executes the inner code block. With a while loop, the code might look like this:

x = 0
while x < 5:
    print("The value of x is {0}.".format(x))
    x = x + 1

The value of x is 0.
The value of x is 1.
The value of x is 2.
The value of x is 3.
The value of x is 4.

Since the range function gives us this counter behaviour implicitly, we can write code that is a little more readable by combining for and range when the actual number range is known. It is important to note that range(5) gives the first five numbers in the range, starting at 0, and that this range doesn't include the actual number 5. This fits the zero-based counting system: five values, numbered 0 through 4.

The range function is versatile enough to allow ranges that don't actually start at zero (for example, the numbers from 2 to 4), and ranges that increase or decrease in other ways than incrementing by one. If we wanted to calculate the squares of the first five positive integers, we could specify the starting and ending values in range and use a similar for loop to get our code. In this case, we can use range(1, 6). As mentioned earlier, we must consider that the end point that is given is never part of the actual range returned. In the range(5) example, the value 5 is never part of the range, and when we use range(1, 6), the value 6 is never part of the range. The ending value is the stopping point: the range stops just before it, and the loop variable never takes that value.

for x in range(1, 6):
    print("The square of {0} is {1}.".format(x, x * x))

The square of 1 is 1.
The square of 2 is 4.
The square of 3 is 9.
The square of 4 is 16.
The square of 5 is 25.

If a starting value is provided, you also have the option of providing a step value. The step is the amount that the value is modified by at each iteration of the loop. For example, if you want to consider the even numbers instead of every one along the way, you specify a starting value of 0 and a step of 2.

for x in range(0, 11, 2):
    print("The square of {0} is {1}.".format(x, x * x))

The square of 0 is 0.
The square of 2 is 4.
The square of 4 is 16.
The square of 6 is 36.
The square of 8 is 64.
The square of 10 is 100.
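The step can also be negative, which makes the range count downwards. The following is a small sketch of that idea; the name countdown is made up here, and the call to list (a topic for the next chapter) is only used to show all of the values in the range at once. Notice that the ending value, 0, is still left out.

```python
# Count down from 10 in steps of 2; the end value 0 is never reached.
for x in range(10, 0, -2):
    print("Counting down: {0}".format(x))

# list() is used here only to display every value in the range together.
countdown = list(range(10, 0, -2))
print(countdown)
```

The loop prints the values 10, 8, 6, 4 and 2, and the final line prints [10, 8, 6, 4, 2].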

In general, a for statement can be thought of as a way to define a new variable that is given a set of possible values to exist within the context of a particular block of code. The squares example has some very general code to output the original variable and its square, and the code makes no assumption on what the variable is, except for the fact that it must be a number. We didn't change the inner block between the range(1, 6) and range(0, 11, 2) examples at all. All that was modified was the range that the code was iterated over. Even the variable name remained the same.
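The values don't have to come from range, either. As a hedged illustration that goes slightly beyond the examples so far, a for statement can also step through the characters of a string, one at a time. The names word, letter and letters_seen are made up for this sketch.

```python
word = "spam"
letters_seen = 0
for letter in word:  # letter takes each character of the string in turn
    print("The next letter is {0}.".format(letter))
    letters_seen = letters_seen + 1
print("Letters seen: {0}".format(letters_seen))
```

The inner block is the same kind of code as before; only the set of values being iterated over has changed.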

We can use variables in the range function as well. The squares example can be extended to use a user-provided range of numbers to output.

lower = int(input("Enter the lower value of the range: "))
upper = int(input("Enter the upper value of the range: "))
for x in range(lower, upper):
    print("The square of {0} is {1}.".format(x, x * x))

Enter the lower value of the range: 8
Enter the upper value of the range: 14
The square of 8 is 64.
The square of 9 is 81.
The square of 10 is 100.
The square of 11 is 121.
The square of 12 is 144.
The square of 13 is 169.

Neat trick, eh? Give it a shot! What happens if you enter some different numbers, or if you start with the larger number instead of the smaller one? Can you break this code?

Nested loops

One powerful technique using loops involves nesting them inside of other loops. If you have two pieces of data that are changing in relation to one another, you can define a new loop inside of the original loop to process your data. Let's use multiplication tables as an example of this.

for x in range(1, 6):
    for y in range(1, 6):
        print("{0} * {1} is {2}".format(x, y, x * y))

1 * 1 is 1
1 * 2 is 2
1 * 3 is 3
1 * 4 is 4
1 * 5 is 5
2 * 1 is 2
2 * 2 is 4
[...]
5 * 3 is 15
5 * 4 is 20
5 * 5 is 25

What happened in this example? The for loop using the y variable is sitting inside of the for loop using the x variable. Every time the x loop happens, five iterations of the y loop occur. At first, x is set to 1, and y is set to 1. When the print statement finishes, the y block also finishes, and the next iteration of the y loop begins. This bumps up the y value to 2, while x remains at 1, and the print statement occurs again. Once the print statement finishes showing the result of 1 * 5, the y loop is finished, and the first x block finally finishes. The x variable jumps up to 2, and the y loop starts all over again with y set to 1. The important point here is that the entire y loop starts afresh each time the x loop moves to its next value.

Let's use some print statements to provide a clearer example of the way that inner loops actually evaluate in the code.

for x in range(3):
    print("x: {0}".format(x))
    for y in range(3):
        print("    y: {0}".format(y))

x: 0
    y: 0
    y: 1
    y: 2
x: 1
    y: 0
    y: 1
    y: 2
x: 2
    y: 0
    y: 1
    y: 2

The inner loop runs multiple times, and each time the y loop starts up, it has a different value for x. You can use this to build complicated examples with different data sets. When we look at list variables in the next chapter, you'll see how you can combine things like sets of names, locations, or other variables in ways like this.
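One way to convince yourself of how often the inner block runs is to count it. This is only a sketch, and the name count is made up for the illustration: an outer loop of 3 iterations with an inner loop of 4 iterations runs the innermost line 3 * 4 times.

```python
count = 0
for x in range(3):  # outer loop: 3 iterations
    for y in range(4):  # inner loop: restarts for every value of x
        count = count + 1
print("The inner block ran {0} times.".format(count))
```

The program reports 12, the product of the two range sizes.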

Breaking Stuff

Want to know one of my favourite ways of breaking stuff? Or, at least, one of the ways that I most frequently break stuff? It's a little bit embarrassing, but it happens all the time, so I might as well fess up to it.

In Python, as in many other programming languages, variable names are important. You should think very carefully about the names you use: how they're capitalized, what information they convey, and the styles of names used for particular purposes. Case in point: loops are often written using the identifier i. Why i? Well, just because. And if i is taken in one of the outer loops, just use j. Lots of loops will default to i or j, just as many simple variables will default to x.

Where we get bitten by this is when we forget about the variables that have already been defined, and we reuse them in some other context. Let's look at a simple, and perfectly obvious, example of a broken piece of code.

for int in range(10):
    print("{0} times {0} is {1}.".format(int, int * int))

0 times 0 is 0.
1 times 1 is 1.
2 times 2 is 4.
3 times 3 is 9.
4 times 4 is 16.
5 times 5 is 25.
6 times 6 is 36.
7 times 7 is 49.
8 times 8 is 64.
9 times 9 is 81.

Seems legit, right? We define a variable called int and call the range function to give us 10 different values, each of which is assigned to int and used to print out a string showing what the square of the value is. The code even seems to do the right thing, and ends without any errors.

What matters in this piece of code is the assignment to int. The int identifier is currently assigned to a function--in this case, the function that converts another value into an integer. If we assign a new value to int, we actually tell Python to throw away the reference to the function that converts values into integers, and to instead track the value returned by range. For example, look at this code:

for int in range(10):
    print("{0} times {0} is {1}.".format(int, int * int))

x = int(input("How high should I go? "))

for int in range(x):
    print("{0} times {0} is {1}.".format(int, int * int))

That code should, at first glance, show us the squares of the values from 0 to 9, then ask us what value we should pass to range, and then do the same thing. The int call, when used with input, has previously converted string representations of numbers into the actual number value itself. However, by telling Python that int--the identifier int, which is a variable name--is used to store a value now, we can no longer use it to access the function!

0 times 0 is 0.
1 times 1 is 1.
2 times 2 is 4.
3 times 3 is 9.
4 times 4 is 16.
5 times 5 is 25.
6 times 6 is 36.
7 times 7 is 49.
8 times 8 is 64.
9 times 9 is 81.
How high should I go? 15
Traceback (most recent call last):
  File "sample.py", line 4, in <module>
    x = int(input("How high should I go? "))
TypeError: 'int' object is not callable

We'll get into more detail about why this happens later, so don't worry. For the time being, try not to reuse identifiers. You might not get what you expect.
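For comparison, here is a sketch of the same program with the loop variable renamed so that it no longer collides with int. The names n and limit are made up, and a fixed string stands in for the user's answer so that the example runs without any typing.

```python
for n in range(3):
    print("{0} times {0} is {1}.".format(n, n * n))

# int was never reassigned, so it still refers to the built-in converter.
limit = int("4")  # stand-in for int(input("How high should I go? "))

for n in range(limit):
    print("{0} times {0} is {1}.".format(n, n * n))
```

This version runs to the end without a TypeError, because int still means what Python originally defined it to mean.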

Summary

Loops are a fundamental part of programming. Adding control flow gives your program the flexibility to run sections of code multiple times, and allows your code to perform complicated tasks that are otherwise impossible or unwieldy.

In the next chapter, we'll look at a data structure called a list that is used by Python to enable some very diverse functionality.

Exercises

1. Write a program that prints all the multiples of 7 from 0 to 100. Experiment with printing the squares, the reciprocals (one divided by the number), and other expressions. Be careful with the reciprocal of 0, since Python cannot divide by zero.

2. Print out the multiplication tables from 1 to 10 in a table on the screen. You can use the end parameter in print statements to stop Python from printing each string on its own line. For example,

for x in range(1, 11):
    print("{0:4}".format(x), end="")

   1   2   3   4   5   6   7   8   9  10

Set up an outer loop and an inner loop to keep track of the values you're multiplying together, and let each row and column correspond to increasing values from 1 to 10.