(A very brief)
Introduction to Python

Lecture 02

Dr. Colin Rundel

Basic types

Type system basics

Like R, Python is a dynamically typed language, but the implementation details are very different: Python builds its types on an object-oriented class system.

True
True
1
1
1.0
1.0
1+1j
(1+1j)
"string"
'string'
type(True)
<class 'bool'>
type(1)
<class 'int'>
type(1.0)
<class 'float'>
type(1+1j)
<class 'complex'>
type("string")
<class 'str'>

Dynamic types

Because Python is dynamically typed, most basic operations will attempt to coerce objects to a consistent type appropriate for the operation.

Boolean operations:

1 and True
True
0 or 1
1
not 0
True
not (0+0j)
True
not (0+1j)
False
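One detail in the results above is that and and or return one of their operands rather than always returning a bool, which is why 0 or 1 gives 1. A minimal sketch (the variable names are illustrative only):

```python
# and / or return one of their operands; not always returns a bool.
a = 1 and True        # both truthy, so and returns the last operand
b = 0 or 1            # 0 is falsy, so or returns the second operand
c = 0 and 1           # and short-circuits on the first falsy operand
d = "" or "default"   # the empty string is falsy
e = not 0             # not always produces a bool

print(a, b, c, repr(d), e)
# True 1 0 'default' True
```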

Comparisons:

5. > 1
True
5. == 5
True
1 > True
False
(1+0j) == 1
True
"abc" < "ABC"
False
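Two of the comparisons above may look surprising. The sketch below suggests why: strings are compared by Unicode code point, and bool is a subclass of int, so True compares equal to 1.

```python
# Strings compare by Unicode code point, one character at a time.
code_a, code_A = ord("a"), ord("A")
print(code_a, code_A)                 # 97 65
print("abc" < "ABC")                  # False, since 97 > 65
print("abc".lower() < "ABD".lower())  # True, case-insensitive comparison

# bool is a subclass of int, so True behaves like 1 in comparisons.
print(isinstance(True, int))          # True
print(True == 1, 1 > True)            # True False
```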

Mathematical operations:

1 + 5
6
1 + 5.
6.0
1 * 5.
5.0
True * 5
5
(1+0j) - (1+1j)
-1j
5 / 1.
5.0
5 / 2
2.5
5 // 2
2
5 % 2
1
7 ** 2
49

Note that the default numeric type in Python is an integer rather than a double, but when an operation mixes integers and floats the result is generally a float.
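As a rough illustration of how result types follow from the operator and operands, the sketch below collects a few expressions and prints the type of each result (the dictionary name examples is just for illustration):

```python
# The type of a result depends on the operator and the operand types.
examples = {
    "1 + 5":       1 + 5,        # int + int
    "1 + 5.":      1 + 5.,       # int + float
    "5 / 5":       5 / 5,        # true division always returns a float
    "5 // 2":      5 // 2,       # floor division of two ints stays an int
    "5. // 2":     5. // 2,      # floor division involving a float returns a float
    "True + True": True + True,  # bools are treated as ints
    "1 + 1j":      1 + 1j,       # int + complex
}
for expr, value in examples.items():
    print(f"{expr:12} -> {value!r:8} ({type(value).__name__})")
```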

Coercion errors

Python is not quite as liberal as R when it comes to type coercion,

"abc" > 5
TypeError: '>' not supported between instances of 'str' and 'int'
"abc" + 5
TypeError: can only concatenate str (not "int") to str
"abc" + str(5)
'abc5'
"abc" ** 2
TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'
"abc" * 3
'abcabcabc'

Casting

Explicit casting between types can be achieved using the types as functions, e.g. int(), float(), bool(), or str().

float("0.5")
0.5
float(True)
1.0
int(1.1)
1
int("2")
2
int("2.1")
ValueError: invalid literal for int() with base 10: '2.1'
bool(0)
False
bool("hello")
True
str(3.14159)
'3.14159'
str(True)
'True'
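A few casting behaviors are easy to trip over. The following sketch shows that int() truncates rather than rounds, that decimal strings need to pass through float() first, and that a failed cast can be caught as an exception:

```python
# int() truncates toward zero; it does not round.
print(int(1.9), int(-1.9))      # 1 -1
print(round(1.9), round(-1.9))  # 2 -2

# A string holding a decimal number has to go through float() first.
two = int(float("2.1"))
print(two)                      # 2

# bool() of a string only checks whether the string is empty.
print(bool("False"), bool(""))  # True False

# A failed cast raises an exception, which can be handled explicitly.
try:
    int("2.1")
except ValueError as err:
    print("could not convert:", err)
```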

Variable assignment

When using Python it is important to think of variable assignment as the process of attaching a name to an object (literal, data structure, etc.).

x = 100
x
100
x = "hello"
x
'hello'
ß = 1 + 2 / 3
ß
1.6666666666666665
a = b = 5
a
5
b
5

string literals

Strings can be defined using several different approaches,

'allows embedded "double" quotes'
'allows embedded "double" quotes'
"allows embedded 'single' quotes"
"allows embedded 'single' quotes"

Strings can also be triple quoted, using single or double quotes, which allows the string to span multiple lines and contain quote characters,

"""line one
line "two"
line 'three'"""
'line one\nline "two"\nline \'three\''

Multiline strings

A single \ at the end of a line can also be used to continue a long string over multiple lines without including the newline characters.

"line one \
not line two \
not line three"
'line one not line two not line three'
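An alternative that is sometimes preferred is to rely on the fact that adjacent string literals are concatenated automatically, wrapping them in parentheses. A minimal sketch:

```python
# Adjacent string literals are joined into a single string.
s = ("line one "
     "not line two "
     "not line three")
print(repr(s))
# 'line one not line two not line three'
```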

f strings

As of Python 3.6 you can use f strings for string interpolation formatting (as opposed to %-formatting and the format() method).

x = [0,1,2,3,4]
f"{x[::2]}"
'[0, 2, 4]'
f'{x[0]}, {x[1]}, ...'
'0, 1, ...'
f"From {min(x)} to {max(x)}"
'From 0 to 4'
f"{x} has {len(x)} elements"
'[0, 1, 2, 3, 4] has 5 elements'
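f strings also accept a format specification after a colon inside the braces. The sketch below shows a few common cases; the variable names are illustrative only.

```python
pi = 3.14159
n = 42
frac = 0.256
name = "abc"

print(f"{pi:.2f}")     # 3.14     (two decimal places)
print(f"{n:05d}")      # 00042    (zero-padded to width 5)
print(f"{frac:.1%}")   # 25.6%    (percentage with one decimal)
print(f"[{name:>6}]")  # [   abc] (right-aligned in width 6)
print(f"{name!r}")     # 'abc'    (uses repr() instead of str())
```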

raw strings

One other special type of string literal you will come across is the raw string (prefixed with r). Raw strings are like regular strings except that \ is treated as a literal character rather than an escape character.

print("ab\\cd")
ab\cd
print("ab\ncd")
ab
cd
print("ab\tcd")
ab  cd
print(r"ab\\cd")
ab\\cd
print(r"ab\ncd")
ab\ncd
print(r"ab\tcd")
ab\tcd
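Raw strings are most often seen with Windows-style paths and regular expressions, where backslashes are common. A small sketch, assuming a made-up path purely for illustration:

```python
import re

plain_path = "C:\new\table.txt"   # \n and \t are interpreted as escapes
raw_path = r"C:\new\table.txt"    # backslashes are kept as written

print(len(plain_path), len(raw_path))  # 14 16
print(raw_path)                        # C:\new\table.txt

# Raw strings keep regular expression patterns readable.
digits = re.findall(r"\d+", "a1b22c333")
print(digits)                          # ['1', '22', '333']
```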

Special values

Base Python does not support missing values. Non-finite floating point values are available but somewhat awkward to use. There is also a None type which is similar in usage and functionality to NULL in R.

1/0
ZeroDivisionError: division by zero
1./0
ZeroDivisionError: float division by zero
nan
NameError: name 'nan' is not defined
float("nan")
nan
inf
NameError: name 'inf' is not defined
float("-inf")
-inf
5 > float("inf")
False
5 > float("-inf")
True
None
type(None)
<class 'NoneType'>
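The standard library math module provides named constants and tests for the non-finite values, which can be less awkward than calling float() on a string. A minimal sketch:

```python
import math

nan = math.nan   # same value as float("nan")
inf = math.inf   # same value as float("inf")

print(nan == nan)             # False: nan is not equal to anything, itself included
print(math.isnan(nan))        # True: the reliable way to test for nan
print(math.isinf(-inf))       # True
print(math.isnan(inf - inf))  # True: inf - inf is undefined, so the result is nan

# None is a singleton, so identity (is) is the idiomatic test.
value = None
print(value is None)          # True
```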

Sequence types

lists

Python lists are heterogeneous, ordered, mutable containers of objects (they are very similar to lists in R).

[0,1,1,0]
[0, 1, 1, 0]
[0, True, "abc"]
[0, True, 'abc']
[0, [1,2], [3,[4]]]
[0, [1, 2], [3, [4]]]
x = [0,1,1,0]
type(x)
<class 'list'>
y = [0, True, "abc"]
type(y)
<class 'list'>

Common operations

x = [0,1,1,0]
2 in x
False
2 not in x
True
x + [3,4,5]
[0, 1, 1, 0, 3, 4, 5]
x * 2
[0, 1, 1, 0, 0, 1, 1, 0]
len(x)
4
max(x)
1
x.count(1)
2
x.count("1")
0

See the Python standard library documentation on sequence types for a more complete listing of functions and methods.

list subsetting

Elements of a list can be accessed using []. Element position is indicated using 0-based indexing, and ranges of values can be specified using slices (start:stop:step), where the stop position is not included.

x = [1,2,3,4,5,6,7,8,9]
x[0]
1
x[3]
4
x[0:3]
[1, 2, 3]
x[3:]
[4, 5, 6, 7, 8, 9]
x[-3:]
[7, 8, 9]
x[:3]
[1, 2, 3]

slice with a step

x = [1,2,3,4,5,6,7,8,9]
x[0:5:2]
[1, 3, 5]
x[0:6:3]
[1, 4]
x[0:len(x):2]
[1, 3, 5, 7, 9]
x[0::2]
[1, 3, 5, 7, 9]
x[::2]
[1, 3, 5, 7, 9]
x[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1]

Exercise 1

Come up with a slice that will subset the following list to obtain the elements requested:

d = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  • Select only the odd values in this list

  • Select every 3rd value starting from the 2nd element.

  • Select every other value, in reverse order, starting from the 9th element.

  • Select the 3rd element, the 5th element, and the 10th element

mutability

Since lists are mutable the stored values can be changed,

x = [1,2,3,4,5]
x[0] = -1
x
[-1, 2, 3, 4, 5]
del x[0]
x
[2, 3, 4, 5]
x.append(7)
x
[2, 3, 4, 5, 7]
x.insert(3, -5)
x
[2, 3, 4, -5, 5, 7]
x.pop()
7
x
[2, 3, 4, -5, 5]
x.clear()
x
[]

lists, assignment, and mutability

When assigning an object a name (x = ...) you do not necessarily end up with an entirely new object. In the example below, both x and y are names attached to the same underlying object in memory.

x = [0,1,1,0]
y = x

x.append(2)

What are the values of x and y now?

x
[0, 1, 1, 0, 2]
y
[0, 1, 1, 0, 2]

lists, assignment, and mutability

To avoid this we need to make an explicit copy of the object pointed to by x and point to it with the name y.

x = [0,1,1,0]
y = x.copy()

x.append(2)

What are the values of x and y now?

x
[0, 1, 1, 0, 2]
y
[0, 1, 1, 0]
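One way to check whether two names refer to the same object is the is operator (or comparing id() values), as opposed to ==, which compares contents. A short sketch of the two cases above:

```python
x = [0, 1, 1, 0]
y = x          # a second name for the same list
z = x.copy()   # a new list with the same elements

print(y is x)            # True: same object
print(z is x)            # False: different objects
print(z == x)            # True: equal contents
print(id(x) == id(y))    # True: id() returns the object's identity

x.append(2)
print(y == x, z == x)    # True False
```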

Nested lists

Now let's look at what happens when we have a list inside a list and make a change at either level.

x = [0, [1,2], [3,4]]
y = x
z = x.copy()

x[0] = -1
x[1][0] = 5

What are the values of x, y, and z now?

x
[-1, [5, 2], [3, 4]]
y
[-1, [5, 2], [3, 4]]
z
[0, [5, 2], [3, 4]]
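The result for z reflects that copy() makes a shallow copy: the outer list is new, but the inner lists are still shared. If fully independent nested data is needed, one option is copy.deepcopy() from the standard library, sketched here:

```python
import copy

x = [0, [1, 2], [3, 4]]
shallow = x.copy()         # new outer list, shared inner lists
deep = copy.deepcopy(x)    # new outer list and new inner lists

x[0] = -1
x[1][0] = 5

print(shallow)             # [0, [5, 2], [3, 4]]
print(deep)                # [0, [1, 2], [3, 4]]
print(shallow[1] is x[1])  # True
print(deep[1] is x[1])     # False
```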

Value unpacking

Lists (and other sequence types) can be unpacked into multiple variables when doing assignment,

x, y = [1,2]
x
1
y
2
x, y = [1, [2, 3]]
x
1
y
[2, 3]
x, y = [[0,1], [2, 3]]
x
[0, 1]
y
[2, 3]
(x1,y1), (x2,y2) = [[0,1], [2, 3]]
x1
0
y1
1
x2
2
y2
3

Extended unpacking

It is also possible to use extended unpacking via the * operator (in Python 3)

x, *y = [1,2,3]
x
1
y
[2, 3]
*x, y = [1,2,3]
x
[1, 2]
y
3


If * is not used here, we get an error:

x, y = [1,2,3]
ValueError: too many values to unpack (expected 2)
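A few further unpacking patterns, sketched under the same rules: the starred name can sit in the middle, it may end up empty, and unpacking is what makes the common swap idiom work.

```python
first, *middle, last = [1, 2, 3, 4, 5]
print(first, middle, last)   # 1 [2, 3, 4] 5

head, *rest = [1]
print(head, rest)            # 1 []

# Unpacking works with any sequence, including strings.
a, b, c = "xyz"
print(a, b, c)               # x y z

# Swapping two names uses tuple packing and unpacking.
a, b = b, a
print(a, b)                  # y x
```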

tuples

Python tuples are heterogeneous, ordered, immutable containers of values.

They are nearly identical to lists except that their values cannot be changed - you will most often encounter them as a tool for packaging multiple objects when returning from a function.

(1, 2, 3)
(1, 2, 3)
(1, True, "abc")
(1, True, 'abc')
(1, (2,3))
(1, (2, 3))
(1, [2,3])
(1, [2, 3])

tuples are immutable

x = (1,2,3)
x[2] = 5
TypeError: 'tuple' object does not support item assignment
del x[2]
TypeError: 'tuple' object doesn't support item deletion
x.clear()
AttributeError: 'tuple' object has no attribute 'clear'
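Two points about tuples can be sketched briefly: returning several values from a function produces a tuple, and immutability applies only to the tuple itself, not to mutable objects stored inside it. The function min_max below is a hypothetical helper written for this example.

```python
def min_max(values):
    """Hypothetical helper: return the smallest and largest value as a tuple."""
    return min(values), max(values)   # the comma builds a tuple

result = min_max([3, 1, 4, 1, 5])
print(result)        # (1, 5)
lo, hi = result      # tuples unpack just like lists
print(lo, hi)        # 1 5

# Immutability applies to the tuple itself, not to the objects it holds.
t = (1, [2, 3])
t[1].append(4)       # allowed: the inner list is mutable
print(t)             # (1, [2, 3, 4])
```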

Casting sequences

It is possible to cast between sequence types

x = [1,2,3]
y = (3,2,1)
tuple(x)
(1, 2, 3)
list(y)
[3, 2, 1]
tuple(x) == x
False
list(tuple(x)) == x
True

Ranges

These are the last common sequence type and are a bit special - ranges are homogeneous, ordered, immutable "containers" of integers.

range(10)
range(0, 10)
range(0,10)
range(0, 10)
range(0,10,2)
range(0, 10, 2)
range(10,0,-1)
range(10, 0, -1)
list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
list(range(0,10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
list(range(0,10,2))
[0, 2, 4, 6, 8]
list(range(10,0,-1))
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

What makes ranges special is that range(1000000) does not store 1 million integers in memory but rather just three values: the start, the stop, and the step.
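The sketch below gives a rough sense of this. The exact byte sizes reported by sys.getsizeof() are CPython implementation details, so only the comparison is shown rather than specific numbers.

```python
import sys

r = range(1_000_000)
print(r.start, r.stop, r.step)       # 0 1000000 1

# Exact sizes are implementation details; only the ordering matters here.
size_range = sys.getsizeof(r)
size_list = sys.getsizeof(list(r))
print(size_range < size_list)        # True

# Length, indexing, slicing, and membership are computed arithmetically.
print(len(r))                        # 1000000
print(r[10], r[-1])                  # 10 999999
print(r[::250_000])                  # range(0, 1000000, 250000)
print(999_999 in r, 1_000_000 in r)  # True False
```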

Strings as sequences

In most of the ways that count, we can think of Python strings as ordered, immutable containers of Unicode characters, so much of the sequence type functionality we just saw can be applied to them.

x = "abc"
x[0]
'a'
x[-1]
'c'
x[2:]
'c'
x[::-1]
'cba'
len(x)
3
"a" in x
True
"bc" in x
True
x[0] + x[2] 
'ac'
x[2] = "c"
TypeError: 'str' object does not support item assignment

String Methods

Because string processing is a common and important programming task, the str class implements a number of additional methods for these specific tasks. See the Python documentation for a full list of string methods.

x = "Hello world! 1234"
x.find("!")
11
x.isalnum()
False
x.isascii()
True
x.lower()
'hello world! 1234'
x.swapcase()
'hELLO WORLD! 1234'
x.title()
'Hello World! 1234'
x.split(" ")
['Hello', 'world!', '1234']
"|".join(x.split(" "))
'Hello|world!|1234'

Exercise 2

String processing - take the string given below and apply the necessary methods to create the target string.

Source:

"the quick  Brown   fox Jumped  over   a Lazy  dog"

Target:

"The quick brown fox jumped over a lazy dog."

Set and Mapping types

We will discuss sets (set) and dictionaries (dict) in more detail next week.

Specifically we will discuss the underlying data structure behind these types (as well as lists and tuples) and when it is most appropriate to use each.