Chapter 2 Variables, expressions, and statements
2.1 Values and types
A value is one of the basic things a program works with, like a letter or a number. The values we have seen so far are 1, 2, and 'Hello, World!'.
These values belong to different types: 2 is an integer, and 'Hello, World!' is a string, so called because it contains a “string” of letters. You (and the interpreter) can identify strings because they are enclosed in quotation marks.
The print statement also works for integers. We use the python command to start the interpreter.
    python
    >>> print 4
    4
If you are not sure what type a value has, the interpreter can tell you.
    >>> type('Hello, World!')
    <type 'str'>
    >>> type(17)
    <type 'int'>
Not surprisingly, strings belong to the type str and integers belong to the type int. Less obviously, numbers with a decimal point belong to a type called float, because these numbers are represented in a format called floating point.
    >>> type(3.2)
    <type 'float'>
What about values like '17' and '3.2'? They look like numbers, but they are in quotation marks like strings.
    >>> type('17')
    <type 'str'>
    >>> type('3.2')
    <type 'str'>
They are strings.
When you type a large integer, you might be tempted to use commas between groups of three digits, as in 1,000,000. This is not a legal integer in Python, but it is a legal statement:
    >>> print 1,000,000
    1 0 0
Well, that’s not what we expected at all! Python interprets 1,000,000 as a comma-separated sequence of integers, which it prints with spaces between.
This is the first example we have seen of a semantic error: the code runs without producing an error message, but it doesn’t do the “right” thing.
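The comma behavior can also be checked without print. The following is a minimal sketch, written so that it should run under both Python 2 and Python 3; the variable name grouped is only illustrative and does not come from the text.

```python
# 1,000,000 is read as three comma-separated integers, not as one million.
grouped = 1,000,000
print(grouped == (1, 0, 0))    # the commas build a sequence of three integers

# type() reports the type of any value, including digits inside quotes.
print(type(17) is int)
print(type(3.2) is float)
print(type('17') is str)
```

Each line should print True, which confirms that no error is raised even though the first statement does not store one million.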
2.2 Variables
One of the most powerful features of a programming language is the ability to manipulate variables. A variable is a name that refers to a value.
An assignment statement creates new variables and gives them values:
    >>> message = 'And now for something completely different'
    >>> n = 17
    >>> pi = 3.1415926535897931
This example makes three assignments. The first assigns a string to a new variable named message; the second assigns the integer 17 to n; the third assigns the (approximate) value of π to pi.
To display the value of a variable, you can use a print statement:
    >>> print n
    17
    >>> print pi
    3.14159265359
The type of a variable is the type of the value it refers to.
    >>> type(message)
    <type 'str'>
    >>> type(n)
    <type 'int'>
    >>> type(pi)
    <type 'float'>
2.3 Variable names and keywords
Programmers generally choose names for their variables that are meaningful and document what the variable is used for.
Variable names can be arbitrarily long. They can contain both letters and numbers, but they cannot start with a number. It is legal to use uppercase letters, but it is a good idea to begin variable names with a lowercase letter (you’ll see why later).
The underscore character (_) can appear in a name. It is often used in names with multiple words, such as my_name or airspeed_of_unladen_swallow. Variable names can start with an underscore character, but we generally avoid doing this unless we are writing library code for others to use.
If you give a variable an illegal name, you get a syntax error:
    >>> 76trombones = 'big parade'
    SyntaxError: invalid syntax
    >>> more@ = 1000000
    SyntaxError: invalid syntax
    >>> class = 'Advanced Theoretical Zymurgy'
    SyntaxError: invalid syntax
76trombones is illegal because it begins with a number. more@ is illegal because it contains an illegal character, @. But what’s wrong with class?
It turns out that class is one of Python’s keywords. The interpreter uses keywords to recognize the structure of the program, and they cannot be used as variable names.
Python reserves 31 keywords for its use:
    and       del       from      not       while
    as        elif      global    or        with
    assert    else      if        pass      yield
    break     except    import    print
    class     exec      in        raise
    continue  finally   is        return
    def       for       lambda    try
You might want to keep this list handy. If the interpreter complains about one of your variable names and you don’t know why, see if it is on this list.
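If you would rather ask Python than consult the list, the standard library's keyword module can answer the question. This is a small sketch rather than part of the original text; the names being tested are just examples, and the number of keywords reported depends on the Python version you run, so it may not be exactly 31.

```python
import keyword

# keyword.iskeyword() reports whether a name is reserved by the interpreter.
for name in ['class', 'yield', 'message', 'pay']:
    print(name + ' -> ' + str(keyword.iskeyword(name)))

# keyword.kwlist holds the full list for the running Python version.
reserved_count = len(keyword.kwlist)
print(reserved_count)
```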
2.4 Statements
A statement is a unit of code that the Python interpreter can execute. We have seen two kinds of statements: print and assignment.
When you type a statement in interactive mode, the interpreter executes it and displays the result, if there is one.
A script usually contains a sequence of statements. If there is more than one statement, the results appear one at a time as the statements execute.
For example, the script
    print 1
    x = 2
    print x
produces the output
    1
    2
The assignment statement produces no output.
2.5 Operators and operands
Operators are special symbols that represent computations like addition and multiplication. The values the operator is applied to are called operands.
The operators +, -, *, /, and ** perform addition, subtraction, multiplication, division, and exponentiation, as in the following examples:
    20+32   hour-1   hour*60+minute   minute/60   5**2   (5+9)*(15-7)
The division operator might not do what you expect:
    >>> minute = 59
    >>> minute/60
    0
The value of minute is 59, and in conventional arithmetic 59 divided by 60 is 0.98333, not 0. The reason for the discrepancy is that Python is performing floor division.
When both of the operands are integers, the result is also an integer; floor division chops off the fractional part, so in this example it truncates the answer to zero.
If either of the operands is a floating-point number, Python performs floating-point division, and the result is a float:
    >>> minute/60.0
    0.98333333333333328
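The two kinds of division can be put side by side. The sketch below goes slightly beyond the text by using the // operator, which requests floor division explicitly in both Python 2 and Python 3; the names whole_hours and fraction_of_hour are made up for the example.

```python
minute = 59

# With two integer operands, // chops off the fractional part.
whole_hours = minute // 60

# Making one operand a float gives floating-point division.
fraction_of_hour = minute / 60.0

print(whole_hours)
print(fraction_of_hour > 0.98)
```

Writing 60.0 rather than 60 is what keeps the fractional part in the second calculation.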
2.6 Expressions
An expression is a combination of values, variables, and operators. A value all by itself is considered an expression, and so is a variable, so the following are all legal expressions (assuming that the variable x has been assigned a value):
    17
    x
    x + 17
If you type an expression in interactive mode, the interpreter evaluates it and displays the result:
    >>> 1 + 1
    2
But in a script, an expression all by itself doesn’t do anything! This is a common source of confusion for beginners.
2.7 Order of operations
When more than one operator appears in an expression, the order of evaluation depends on the rules of precedence. For mathematical operators, Python follows mathematical convention. The acronym PEMDAS is a useful way to remember the rules:
- Parentheses have the highest precedence and can be used to force an expression to evaluate in the order you want. Since expressions in parentheses are evaluated first, 2 * (3-1) is 4, and (1+1)**(5-2) is 8. You can also use parentheses to make an expression easier to read, as in (minute * 100) / 60, even if it doesn’t change the result.
- Exponentiation has the next highest precedence, so 2**1+1 is 3, not 4, and 3*1**3 is 3, not 27.
- Multiplication and Division have the same precedence, which is higher than Addition and Subtraction, which also have the same precedence. So 2*3-1 is 5, not 4, and 6+4/2 is 8, not 5.
- Operators with the same precedence are evaluated from left to right. So the expression 5-3-1 is 1, not 3, because the 5-3 happens first and then 1 is subtracted from 2.
When in doubt, always put parentheses in your expressions to make sure the computations are performed in the order you intend.
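The examples in the list above can be checked mechanically. This is a minimal sketch that restates each one as a comparison; the list name checks is only illustrative.

```python
# Each comparison restates one of the PEMDAS examples from the list above.
checks = [
    2 * (3 - 1) == 4,           # parentheses first
    (1 + 1) ** (5 - 2) == 8,
    2 ** 1 + 1 == 3,            # exponentiation before addition
    3 * 1 ** 3 == 3,            # exponentiation before multiplication
    2 * 3 - 1 == 5,             # multiplication before subtraction
    6 + 4 / 2 == 8,             # division before addition
    5 - 3 - 1 == 1,             # same precedence: left to right
]
print(all(checks))
```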
2.8 Modulus operator
The modulus operator works on integers and yields the remainder when the first operand is divided by the second. In Python, the modulus operator is a percent sign (%). The syntax is the same as for other operators:
    >>> quotient = 7 / 3
    >>> print quotient
    2
    >>> remainder = 7 % 3
    >>> print remainder
    1
So 7 divided by 3 is 2 with 1 left over.
The modulus operator turns out to be surprisingly useful. For example, you can check whether one number is divisible by another—if x % y is zero, then x is divisible by y.
You can also extract the right-most digit or digits from a number. For example, x % 10 yields the right-most digit of x (in base 10). Similarly, x % 100 yields the last two digits.
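Both uses of the modulus operator can be seen in a short sketch. The value 1234 and the variable names are chosen only for illustration.

```python
x = 1234

is_even = (x % 2 == 0)       # divisible by 2 when the remainder is zero
last_digit = x % 10          # right-most digit
last_two_digits = x % 100    # right-most two digits

print(is_even)
print(last_digit)
print(last_two_digits)
```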
2.9 String operations
The + operator works with strings, but it is not addition in the mathematical sense. Instead it performs concatenation, which means joining the strings by linking them end to end. For example:
    >>> first = 10
    >>> second = 15
    >>> print first+second
    25
    >>> first = '100'
    >>> second = '150'
    >>> print first + second
    100150
The output of this program is 100150.
2.10 Asking the user for input
Sometimes we would like to take the value for a variable from the user via their keyboard. Python provides a built-in function called raw_input that gets input from the keyboard. When this function is called, the program stops and waits for the user to type something. When the user presses Return or Enter, the program resumes and raw_input returns what the user typed as a string.
    >>> input = raw_input()
    Some silly stuff
    >>> print input
    Some silly stuff
Before getting input from the user, it is a good idea to print a prompt telling the user what to input. You can pass a string to raw_input to be displayed to the user before pausing for input:
    >>> name = raw_input('What is your name?\n')
    What is your name?
    Chuck
    >>> print name
    Chuck
The sequence \n at the end of the prompt represents a newline, which is a special character that causes a line break. That’s why the user’s input appears below the prompt.
If you expect the user to type an integer, you can try to convert the return value to int using the int() function:
    >>> prompt = 'What...is the airspeed velocity of an unladen swallow?\n'
    >>> speed = raw_input(prompt)
    What...is the airspeed velocity of an unladen swallow?
    17
    >>> int(speed)
    17
    >>> int(speed) + 5
    22
But if the user types something other than a string of digits, you get an error:
    >>> speed = raw_input(prompt)
    What...is the airspeed velocity of an unladen swallow?
    What do you mean, an African or a European swallow?
    >>> int(speed)
    ValueError: invalid literal for int()
We will see how to handle this kind of error later.
2.11 Comments
As programs get bigger and more complicated, they get more difficult to read. Formal languages are dense, and it is often difficult to look at a piece of code and figure out what it is doing, or why.
For this reason, it is a good idea to add notes to your programs to explain in natural language what the program is doing. These notes are called comments, and in Python they start with the # symbol:
    # compute the percentage of the hour that has elapsed
    percentage = (minute * 100) / 60
In this case, the comment appears on a line by itself. You can also put comments at the end of a line:
    percentage = (minute * 100) / 60     # percentage of an hour
Everything from the # to the end of the line is ignored—it has no effect on the program.
Comments are most useful when they document non-obvious features of the code. It is reasonable to assume that the reader can figure out what the code does; it is much more useful to explain why.
This comment is redundant with the code and useless:
    v = 5     # assign 5 to v
This comment contains useful information that is not in the code:
    v = 5     # velocity in meters/second.
Good variable names can reduce the need for comments, but long names can make complex expressions hard to read, so there is a trade-off.
2.12 Choosing mnemonic variable names
As long as you follow the simple rules of variable naming, and avoid reserved words, you have a lot of choice when you name your variables. In the beginning, this choice can be confusing both when you read a program and when you write your own programs. For example, the following three programs are identical in terms of what they accomplish, but very different when you read them and try to understand them.
    a = 35.0
    b = 12.50
    c = a * b
    print c

    hours = 35.0
    rate = 12.50
    pay = hours * rate
    print pay

    x1q3z9ahd = 35.0
    x1q3z9afd = 12.50
    x1q3p9afd = x1q3z9ahd * x1q3z9afd
    print x1q3p9afd
The Python interpreter sees all three of these programs as exactly the same but humans see and understand these programs quite differently. Humans will most quickly understand the intent of the second program because the programmer has chosen variable names that reflect their intent regarding what data will be stored in each variable.
We call these wisely chosen variable names “mnemonic variable names”. The word mnemonic means “memory aid”. We choose mnemonic variable names to help us remember why we created the variable in the first place.
While this all sounds great, and it is a very good idea to use mnemonic variable names, mnemonic variable names can get in the way of a beginning programmer’s ability to parse and understand code. This is because beginning programmers have not yet memorized the reserved words (there are only 31 of them) and sometimes variables with names that are too descriptive start to look like part of the language and not just well-chosen variable names.
Take a quick look at the following Python sample code which loops through some data. We will cover loops soon, but for now try to just puzzle through what this means:
    for word in words:
        print word
What is happening here? Which of the tokens (for, word, in, etc.) are reserved words and which are just variable names? Does Python understand at a fundamental level the notion of words? Beginning programmers have trouble separating what parts of the code must be the same as this example and what parts of the code are simply choices made by the programmer.
The following code is equivalent to the above code:
    for slice in pizza:
        print slice
It is easier for the beginning programmer to look at this code and know which parts are reserved words defined by Python and which parts are simply variable names chosen by the programmer. It is pretty clear that Python has no fundamental understanding of pizza and slices and the fact that a pizza consists of a set of one or more slices.
But if our program is truly about reading data and looking for words in the data, pizza and slice are very un-mnemonic variable names. Choosing them as variable names distracts from the meaning of the program.
After a pretty short period of time, you will know the most common reserved words and you will start to see the reserved words jumping out at you:
    for word in words:
        print word
The parts of the code that are defined by Python are for, in, print, and the colon (:); the names word and words are variables chosen by the programmer. Many text editors are aware of Python syntax and will color reserved words differently to give you clues to keep your variables and reserved words separate. After a while you will begin to read Python and quickly determine what is a variable and what is a reserved word.
2.13 Debugging
At this point, the syntax error you are most likely to make is an illegal variable name, like class and yield, which are keywords, or odd~job and US$, which contain illegal characters.
If you put a space in a variable name, Python thinks it is two operands without an operator:
    >>> bad name = 5
    SyntaxError: invalid syntax
For syntax errors, the error messages don’t help much. The most common messages are SyntaxError: invalid syntax and SyntaxError: invalid token, neither of which is very informative.
The runtime error you are most likely to make is a “use before def”; that is, trying to use a variable before you have assigned a value. This can happen if you spell a variable name wrong:
    >>> principal = 327.68
    >>> interest = principle * rate
    NameError: name 'principle' is not defined
Variable names are case sensitive, so LaTeX is not the same as latex.
At this point, the most likely cause of a semantic error is the order of operations. For example, to evaluate 1/(2π), you might be tempted to write
    >>> 1.0 / 2.0 * pi
But the division happens first, so you would get π / 2, which is not the same thing! There is no way for Python to know what you meant to write, so in this case you don’t get an error message; you just get the wrong answer.
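The difference is easy to see by computing both versions. This sketch takes π from the standard math module rather than from the pi variable assigned earlier in the chapter; the names wrong and right are just labels for the two results.

```python
import math

wrong = 1.0 / 2.0 * math.pi      # evaluated as (1.0 / 2.0) * pi, which is pi / 2
right = 1.0 / (2.0 * math.pi)    # parentheses force the intended 1 / (2 * pi)

print(wrong)
print(right)
```

Neither line produces an error message, which is exactly what makes this kind of mistake hard to spot.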
2.14 Glossary
- assignment:
- A statement that assigns a value to a variable.
- concatenate:
- To join two operands end to end.
- comment:
- Information in a program that is meant for other programmers (or anyone reading the source code) and has no effect on the execution of the program.
- evaluate:
- To simplify an expression by performing the operations in order to yield a single value.
- expression:
- A combination of variables, operators, and values that represents a single result value.
- floating point:
- A type that represents numbers with fractional parts.
- floor division:
- The operation that divides two numbers and chops off the fractional part.
- integer:
- A type that represents whole numbers.
- keyword:
- A reserved word that is used by the compiler to parse a program; you cannot use keywords like if, def, and while as variable names.
- mnemonic:
- A memory aid. We often give variables mnemonic names to help us remember what is stored in the variable.
- modulus operator:
- An operator, denoted with a percent sign (%), that works on integers and yields the remainder when one number is divided by another.
- operand:
- One of the values on which an operator operates.
- operator:
- A special symbol that represents a simple computation like addition, multiplication, or string concatenation.
- rules of precedence:
- The set of rules governing the order in which expressions involving multiple operators and operands are evaluated.
- statement:
- A section of code that represents a command or action. So far, the statements we have seen are assignments and print statements.
- string:
- A type that represents sequences of characters.
- type:
- A category of values. The types we have seen so far are integers (type int), floating-point numbers (type float), and strings (type str).
- value:
- One of the basic units of data, like a number or string, that a program manipulates.
- variable:
- A name that refers to a value.
2.15 Exercises
We won’t worry about making sure our pay has exactly two digits after the decimal place for now. If you want, you can play with the built-in Python round function to properly round the resulting pay to two decimal places.
For each of the following expressions, write the value of the expression and the type (of the value of the expression).
- 1 + 2 * 5
Use the Python interpreter to check your answers.
Exercise 5 Write a program which prompts the user for a Celsius temperature, converts the temperature to Fahrenheit, and prints out the converted temperature.
Chapter 3 Conditional execution
3.1 Boolean expressions
A boolean expression is an expression that is either true or false. The following examples use the operator ==, which compares two operands and produces True if they are equal and False otherwise:
    >>> 5 == 5
    True
    >>> 5 == 6
    False
True and False are special values that belong to the type bool; they are not strings:
    >>> type(True)
    <type 'bool'>
    >>> type(False)
    <type 'bool'>
The == operator is one of the comparison operators; the others are:
    x != y               # x is not equal to y
    x > y                # x is greater than y
    x < y                # x is less than y
    x >= y               # x is greater than or equal to y
    x <= y               # x is less than or equal to y
    x is y               # x is the same as y
    x is not y           # x is not the same as y
Although these operations are probably familiar to you, the Python symbols are different from the mathematical symbols. A common error is to use a single equal sign (=) instead of a double equal sign (==). Remember that = is an assignment operator and == is a comparison operator. There is no such thing as =< or =>.
3.2 Logical operators
There are three logical operators: and, or, and not. The semantics (meaning) of these operators is similar to their meaning in English. For example,
x > 0 and x < 10 is true only if x is greater than 0 and less than 10.
n%2 == 0 or n%3 == 0 is true if either of the conditions is true, that is, if the number is divisible by 2 or 3.
Finally, the not operator negates a boolean expression, so not (x > y) is true if x > y is false, that is, if x is less than or equal to y.
Strictly speaking, the operands of the logical operators should be boolean expressions, but Python is not very strict. Any nonzero number is interpreted as "true."
    >>> 17 and True
    True
This flexibility can be useful, but there are some subtleties to it that might be confusing. You might want to avoid it (unless you know what you are doing).
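The three logical operators can be tried on concrete values. In this sketch the values 5 and 9 and the result names are chosen only for illustration.

```python
x = 5
n = 9

in_range = x > 0 and x < 10               # both conditions must hold
divisible = n % 2 == 0 or n % 3 == 0      # either condition is enough
not_greater = not (x > n)                 # true because x > n is false

print(in_range)
print(divisible)
print(not_greater)

# A nonzero number counts as true, as in the 17 and True example.
print(17 and True)
```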
3.3 Conditional execution
In order to write useful programs, we almost always need the ability to check conditions and change the behavior of the program accordingly. Conditional statements give us this ability. The simplest form is the if statement:
    if x > 0 :
        print 'x is positive'
The boolean expression after the if statement is called the condition. We end the if statement with a colon character (:) and the line(s) after the if statement are indented.
If the logical condition is true, then the indented statement gets executed. If the logical condition is false, the indented statement is skipped.
if statements have the same structure as function definitions or for loops. The statement consists of a header line that ends with the colon character (:) followed by an indented block. Statements like this are called compound statements because they stretch across more than one line.
There is no limit on the number of statements that can appear in the body, but there has to be at least one. Occasionally, it is useful to have a body with no statements (usually as a place keeper for code you haven't written yet). In that case, you can use the pass statement, which does nothing.
    if x < 0 :
        pass          # need to handle negative values!
If you enter an if statement in the Python interpreter, the prompt will change from three chevrons to three dots to indicate you are in the middle of a block of statements as shown below:
    >>> x = 3
    >>> if x < 10:
    ...    print 'Small'
    ...
    Small
    >>>
3.4 Alternative execution
A second form of the if statement is alternative execution, in which there are two possibilities and the condition determines which one gets executed. The syntax looks like this:
    if x%2 == 0 :
        print 'x is even'
    else :
        print 'x is odd'
If the remainder when x is divided by 2 is 0, then we know that x is even, and the program displays a message to that effect. If the condition is false, the second set of statements is executed.
Since the condition must be true or false, exactly one of the alternatives will be executed. The alternatives are called branches, because they are branches in the flow of execution.
3.5 Chained conditionals
Sometimes there are more than two possibilities and we need more than two branches. One way to express a computation like that is a chained conditional:
    if x < y:
        print 'x is less than y'
    elif x > y:
        print 'x is greater than y'
    else:
        print 'x and y are equal'
elif is an abbreviation of "else if." Again, exactly one branch will be executed.
There is no limit on the number of elif statements. If there is an else clause, it has to be at the end, but there doesn't have to be one.
    if choice == 'a':
        print 'Bad guess'
    elif choice == 'b':
        print 'Good guess'
    elif choice == 'c':
        print 'Close, but not correct'
Each condition is checked in order. If the first is false, the next is checked, and so on. If one of them is true, the corresponding branch executes, and the statement ends. Even if more than one condition is true, only the first true branch executes.
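To watch a chained conditional pick exactly one branch, it helps to run it on several pairs of values. The sketch below wraps the comparison in a function, a feature covered later in the book; the name compare is hypothetical and used only so the same code can be called three times.

```python
def compare(x, y):
    # Exactly one branch runs: the first condition that is true wins.
    if x < y:
        return 'x is less than y'
    elif x > y:
        return 'x is greater than y'
    else:
        return 'x and y are equal'

print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```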
3.6 Nested conditionals
One conditional can also be nested within another. We could have written the trichotomy example like this:
    if x == y:
        print 'x and y are equal'
    else:
        if x < y:
            print 'x is less than y'
        else:
            print 'x is greater than y'
The outer conditional contains two branches. The first branch contains a simple statement. The second branch contains another if statement, which has two branches of its own. Those two branches are both simple statements, although they could have been conditional statements as well.
Although the indentation of the statements makes the structure apparent, nested conditionals become difficult to read very quickly. In general, it is a good idea to avoid them when you can.
Logical operators often provide a way to simplify nested conditional statements. For example, we can rewrite the following code using a single conditional:
    if 0 < x:
        if x < 10:
            print 'x is a positive single-digit number.'
The print statement is executed only if we make it past both conditionals, so we can get the same effect with the and operator:
    if 0 < x and x < 10:
        print 'x is a positive single-digit number.'
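One way to convince yourself that the two forms agree is to run both on the same inputs. This is a sketch only: the function names nested and combined and the test values are invented for the comparison, and functions themselves are introduced later in the book.

```python
def nested(x):
    # Two conditionals, one inside the other.
    if 0 < x:
        if x < 10:
            return True
    return False

def combined(x):
    # A single conditional using the and operator.
    if 0 < x and x < 10:
        return True
    return False

values = [-3, 0, 1, 9, 10, 42]
same = all(nested(v) == combined(v) for v in values)
print(same)
```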
3.7 Catching exceptions using try and except
Earlier we saw a code segment where we used the raw_input and int functions to read and parse an integer number entered by the user. We also saw how treacherous doing this could be:
    >>> speed = raw_input(prompt)
    What...is the airspeed velocity of an unladen swallow?
    What do you mean, an African or a European swallow?
    >>> int(speed)
    ValueError: invalid literal for int()
    >>>
When we are executing these statements in the Python interpreter, we get a new prompt from the interpreter, think "oops", and move on to our next statement.
However if this code is placed in a Python script and this error occurs, your script immediately stops in its tracks with a traceback. It does not execute the following statement.
Here is a sample program to convert a Fahrenheit temperature to a Celsius temperature:
    inp = raw_input('Enter Fahrenheit Temperature:')
    fahr = float(inp)
    cel = (fahr - 32.0) * 5.0 / 9.0
    print cel
If we execute this code and give it invalid input, it simply fails with an unfriendly error message:
    python fahren.py
    Enter Fahrenheit Temperature:72
    22.2222222222
    python fahren.py
    Enter Fahrenheit Temperature:fred
    Traceback (most recent call last):
      File "fahren.py", line 2, in <module>
        fahr = float(inp)
    ValueError: invalid literal for float(): fred
There is a conditional execution structure built into Python to handle these types of expected and unexpected errors called "try / except". The idea of try and except is that you know that some sequence of instruction(s) may have a problem and you want to add some statements to be executed if an error occurs. These extra statements (the except block) are ignored if there is no error.
You can think of the try and except feature in Python as an "insurance policy" on a sequence of statements.
We can rewrite our temperature converter as follows:
    inp = raw_input('Enter Fahrenheit Temperature:')
    try:
        fahr = float(inp)
        cel = (fahr - 32.0) * 5.0 / 9.0
        print cel
    except:
        print 'Please enter a number'
Python starts by executing the sequence of statements in the try block. If all goes well, it skips the except block and proceeds. If an exception occurs in the try block, Python jumps out of the try block and executes the sequence of statements in the except block.
    python fahren2.py
    Enter Fahrenheit Temperature:72
    22.2222222222
    python fahren2.py
    Enter Fahrenheit Temperature:fred
    Please enter a number
Handling an exception with a try statement is called catching an exception. In this example, the except clause prints an error message. In general, catching an exception gives you a chance to fix the problem, or try again, or at least end the program gracefully.
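The same idea can be tried without typing at a prompt. The sketch below is a variation on the converter, not the program from the text: it takes the input as a string argument, catches only ValueError instead of using a bare except, and returns None for bad input. The function name fahrenheit_to_celsius is hypothetical.

```python
def fahrenheit_to_celsius(text):
    # Return the Celsius value, or None when text is not a number.
    try:
        fahr = float(text)
    except ValueError:
        return None
    return (fahr - 32.0) * 5.0 / 9.0

print(fahrenheit_to_celsius('72'))
print(fahrenheit_to_celsius('fred'))
```

Naming the exception keeps the except block from hiding unrelated mistakes, such as a misspelled variable name inside the try block.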
3.8 Short circuit evaluation of logical expressions
When Python is processing a logical expression such as x >= 2 and (x/y) > 2, it evaluates the expression from left-to-right. Because of the definition of and, if x is less than 2, the expression x >= 2 is False and so the whole expression is False regardless of whether (x/y) > 2 evaluates to True or False.
When Python detects that there is nothing to be gained by evaluating the rest of a logical expression, it stops its evaluation and does not do the computations in the rest of the logical expression. When the evaluation of a logical expression stops because the overall value is already known, it is called short-circuiting the evaluation.
While this may seem like a fine point, the short circuit behavior leads to a clever technique called the guardian pattern. Consider the following code sequence in the Python interpreter:
    >>> x = 6
    >>> y = 2
    >>> x >= 2 and (x/y) > 2
    True
    >>> x = 1
    >>> y = 0
    >>> x >= 2 and (x/y) > 2
    False
    >>> x = 6
    >>> y = 0
    >>> x >= 2 and (x/y) > 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: integer division or modulo by zero
    >>>
The third calculation failed because Python was evaluating (x/y) and y was zero, which causes a runtime error. But the second example did not fail because the first part of the expression, x >= 2, evaluated to False, so (x/y) was never executed due to the short circuit rule and there was no error.
We can construct the logical expression to strategically place a guard evaluation just before the evaluation that might cause an error as follows:
    >>> x = 1
    >>> y = 0
    >>> x >= 2 and y != 0 and (x/y) > 2
    False
    >>> x = 6
    >>> y = 0
    >>> x >= 2 and y != 0 and (x/y) > 2
    False
    >>> x >= 2 and (x/y) > 2 and y != 0
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: integer division or modulo by zero
    >>>
In the first logical expression, x >= 2 is False so the evaluation stops at the and. In the second logical expression x >= 2 is True but y != 0 is False so we never reach (x/y).
In the third logical expression, the y != 0 is after the (x/y) calculation so the expression fails with an error.
In the second expression, we say that y != 0 acts as a guard to ensure that we only execute (x/y) if y is non-zero.
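The guarded expression can be reused on several pairs of values by placing it in a function. This is a minimal sketch; the function name ratio_exceeds_two is hypothetical, and functions are covered later in the book.

```python
def ratio_exceeds_two(x, y):
    # y != 0 is the guard: when it is false, (x / y) is never evaluated.
    return x >= 2 and y != 0 and (x / y) > 2

print(ratio_exceeds_two(6, 2))
print(ratio_exceeds_two(1, 0))
print(ratio_exceeds_two(6, 0))
```

None of the three calls raises ZeroDivisionError, because the division is only reached when y is non-zero.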
3.9 Debugging
The traceback Python displays when an error occurs contains a lot of information, but it can be overwhelming, especially when there are many frames on the stack. The most useful parts are usually:
- What kind of error it was, and
- Where it occurred.
    >>> x = 5
    >>>  y = 6
      File "<stdin>", line 1
        y = 6
        ^
    SyntaxError: invalid syntax
In this example, the problem is that the second line is indented by one space. But the error message points to y, which is misleading. In general, error messages indicate where the problem was discovered, but the actual error might be earlier in the code, sometimes on a previous line.
The same is true of runtime errors. Suppose you are trying to compute a signal-to-noise ratio in decibels. The formula is SNR_db = 10 log10(P_signal / P_noise). In Python, you might write something like this:
    import math
    signal_power = 9
    noise_power = 10
    ratio = signal_power / noise_power
    decibels = 10 * math.log10(ratio)
    print decibels
But when you run it, you get an error message [1]:
    Traceback (most recent call last):
      File "snr.py", line 5, in ?
        decibels = 10 * math.log10(ratio)
    OverflowError: math range error
The error message indicates line 5, but there is nothing wrong with that line. To find the real error, it might be useful to print the value of ratio, which turns out to be 0. The problem is in line 4, because dividing two integers does floor division. The solution is to represent signal power and noise power with floating-point values.
In general, error messages tell you where the problem was discovered, but that is often not where it was caused.
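The fix suggested above can be sketched as follows. The only change from the failing script is that the two power values are written as floating-point numbers, and print is written with parentheses so that the sketch should also run under Python 3.

```python
import math

# Floating-point values avoid floor division in Python 2.
signal_power = 9.0
noise_power = 10.0

ratio = signal_power / noise_power      # 0.9 rather than 0
decibels = 10 * math.log10(ratio)

print(ratio)
print(decibels)
```

Printing ratio before the logarithm is the same diagnostic step described in the text; here it shows a value that math.log10 can handle.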
3.10 Glossary
- body:
- The sequence of statements within a compound statement.
- boolean expression:
- An expression whose value is either True or False.
- branch:
- One of the alternative sequences of statements in a conditional statement.
- chained conditional:
- A conditional statement with a series of alternative branches.
- comparison operator:
- One of the operators that compares its operands: ==, !=, >, <, >=, and <=.
- conditional statement:
- A statement that controls the flow of execution depending on some condition.
- condition:
- The boolean expression in a conditional statement that determines which branch is executed.
- compound statement:
- A statement that consists of a header and a body. The header ends with a colon (:). The body is indented relative to the header.
- guardian pattern:
- Where we construct a logical expression with additional comparisons to take advantage of the short circuit behavior.
- logical operator:
- One of the operators that combines boolean expressions: and, or, and not.
- nested conditional:
- A conditional statement that appears in one of the branches of another conditional statement.
- traceback:
- A list of the functions that are executing, printed when an exception occurs.
- short circuit:
- When Python is part-way through evaluating a logical expression and stops the evaluation because Python knows the final value for the expression without needing to evaluate the rest of the expression.
[1] In Python 3.0, you no longer get an error message; the division operator performs floating-point division even with integer operands.