CS 12 24.2 Lecture 1

Static Type Checking in Python

Lecture 1 — CS 12 (Computer Programming II)
Second Semester, AY 2024-2025
Department of Computer Science
University of the Philippines Diliman

Lecture 1 outline

Administrivia

Statically typed Python

Reading assignment: Review of Pythonic features

Administrivia

To be up by tomorrow:

UVLe section
Lecture 0 slide deck
Lecture 1 slide deck
Student information form
Pyright installation guide

Statically typed Python

Motivating example

Problem: Find the bug in the code below:

from collections import Counter
 
def mode(xs):
    if not xs:
        return None
 
    ctr = Counter(xs)
    largest_count = max(ctr.values())
    mode_candidate_count = list(ctr.values()).count(largest_count)
 
    return ctr.most_common()[0][0] if mode_candidate_count == 1 else None
 
answer = mode([1, 2, 3, 1])
square_of_mode = answer**2

Is there an easier way to identify this?

Type hints [1/2]

Type hint/annotation: Explicit declaration of the types of variables, function parameters, and return values

Code without type hints:

students = 93
nums = []
names = []

Syntax for annotating variable declarations:

<variable-name>: <type> = <value>

students: int = 93
nums: list[int] = []
names: list[str] = []

Must specify type of elements of list inside square brackets
Type of variable must not change

Type hints [2/2]

Code without type hints:

def is_prime(n):
    ...
 
def print_larger_len(a, b):
    ...

Syntax for annotating function parameters and return type:

def <function-name>(<arg>: <type>[, <arg>: <type>]) -> <return-type>:
    <block>

def is_prime(n: int) -> bool:
    ...
 
def print_larger_len(a: str, b: str) -> None:
    ...

Why bother adding type hints? [1/2]

Suppose this is in a codebase you are to modify:

def update(spaceship):
    ...

What is the expected type of spaceship?
What type does update return?

Need to go through function body to find out

Why bother adding type hints? [2/2]

Same code, but with type hints:

def update(spaceship: dict[str, int]) -> None:   # Note: dict[str, int]
    ...                                          # means the keys are strs
                                                 # while the values are ints

What is the expected type of spaceship?
What type does update return?

Above can be answered without going through function body; saves time and effort

Toy example; imagine larger codebase
Type-mentioning comments can also address this
What do type hints provide that comments do not?

Type checking

Type checking: Process of checking whether the type rules of a programming language (PL) are violated by code

Type rule violation is called a type error

Python type rule example: str and int cannot be added together

When is this rule checked?

Kind	When	Python
Static type checking	Before execution	Not done by default
Dynamic type checking	During execution	Done by default

Dynamic type checking in Python

Type rule: str and int cannot be added together

Code below violates the above type rule:

def f(a, b):      # Intention is for both params to be ints
    return a + b
 
f(1, 2)           # Passes type checking
 
f(12, "hello")    # Does not pass type checking;
                  # raises TypeError during execution (crash!)

In general, we do not want our programs to crash during runtime

Python allows running of programs with type errors
Better if we can flag the above error before runtime; how?
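
A minimal sketch of the runtime behavior described above: the type error only surfaces once the offending call actually executes. The try/except wrapper and the message variable are added here purely for illustration so the script keeps running.

```python
def f(a, b):  # Intention is for both params to be ints
    return a + b


print(f(1, 2))  # 3

try:
    f(12, "hello")  # int + str violates the type rule
except TypeError as err:
    # Python detects the violation only when this line runs
    message = str(err)
    print(f"TypeError: {message}")
```

Nothing stops the program from starting; the failure appears only on the code path that reaches the bad call.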

Static type checking and type hints

Same code, but with type hints:

def f(a: int, b: int) -> int:
    return a + b
 
f(1, 2)         # Should pass type checking
 
f(12, "hello")  # Should not pass type checking

b is annotated to be an int
f(12, "hello") passes a str to b

Should be a type error (inconsistency between hint and usage)

Pyright

Pyright: Static type checker by Microsoft (GitHub repo)

Powers Pylance, Microsoft's Python language extension for VS Code
Installation guide to be uploaded

Pyright is able to automatically flag the type error (sandbox):

def f(a: int, b: int) -> int:
    return a + b
 
f(1, 2)
 
# Argument of type "Literal['hello']" cannot be assigned to
# parameter "b" of type "int" in function "f";
# -> "Literal['hello']" is incompatible with "int"
f(12, "hello")

Ideally, Pyright should report no errors before we run our code

Please practice reading through type error messages

Pyright strict mode rules

CS 12 24.2 requirement: Pyright 1.1.392 strict mode

All function parameters must be annotated (optional for return type)

def f(a: int, b: int):  # Return type is optional
    return a + b

Variables initialized with empty collections must be annotated

# Flagged by Pyright; must be `nums: list[int] = []`
nums = []
 
for num in range(1, 11):
    nums.append(num)

Ensure your code passes Pyright strict mode before submission!

Type inference [1/2]

What is the return type of the function below?

def f(n: int):
    if n > 0:
        return "POSITIVE"
 
    elif n < 0:
        return "NEGATIVE"

Type inference [2/2]

Note: Python functions that terminate without encountering a return statement automatically return None

f can return a str ("POSITIVE" or "NEGATIVE") or None:

def f(n: int):
    if n > 0:
        return "POSITIVE"
 
    elif n < 0:
        return "NEGATIVE"   # None will be returned when n == 0

Pyright is able to infer this from the given code and have your IDE report it (very useful!):

# f(n: int) -> (Literal['POSITIVE', 'NEGATIVE'] | None)

A | B is the union type that covers values of type A or of type B
f returning None might be a bug (likely unintentional); Pyright helps detect these

Limitations of type inference

f below returns "POSITIVE", "NEGATIVE", or "ZERO":

def f(n: int):
    if n > 0:
        return "POSITIVE"
    elif n < 0:
        return "NEGATIVE"
    elif n == 0:
        return "ZERO"

Pyright infers the return type to include None even though f never actually returns None

Cannot infer that the three cases cover all possible ints
Smart enough to make inferences based on else branch:

def f(n: int):             # Pyright infers the return type
    if n > 0:              # of f correctly due to `else`
        return "POSITIVE"
    elif n < 0:
        return "NEGATIVE"
    else:
        return "ZERO"

Type narrowing

Pyright can rule out possibilities of union types using conditionals:

def f(nums: list[int] | None) -> int | None:
    if nums is None:      # How does Pyright behave when
        return None       # condition is removed?
 
    nums.append(100)      # What more specific type does
                          # Pyright infer here?
    return sum(nums)

Very useful in ensuring types are as expected

Static type rules of collections

Type	Dynamic typing	Static typing (homogeneity)	Static type annotation
list	Elements can be of any type	Elements must have same type	list[int], list[str], list[list[int]]
tuple	–	–	tuple[int,int,int], tuple[str,bool]
set	Elements can be of any type	Elements must have same type	set[int], set[str]
dict	Keys and values can be of any type	Keys must have same type; values must have same type	dict[str,int], dict[int,dict[int,str]]

Back to motivating example

Problem: Find the bug in the code below:

from collections import Counter
 
def mode(xs: list[int]):
    if not xs:
        return None
 
    ctr = Counter(xs)
    largest_count = max(ctr.values())
    mode_candidate_count = list(ctr.values()).count(largest_count)
 
    return ctr.most_common()[0][0] if mode_candidate_count == 1 else None
 
answer = mode([1, 2, 3, 1])
square_of_mode = answer**2

Pyright can help with this: it infers that mode returns int | None and flags answer**2, since ** is not defined for None

Type alias

Type alias: Alternative name for existing type

Equivalent to existing type
Simplifies writing of complex types and enables domain-specific type names:

Matrix = list[list[int]]
 
def mul_matrices(a: Matrix, b: Matrix) -> Matrix:
    nrow = len(a)
    ncol = len(b[0])
    ret = [[0 for _ in range(ncol)] for _ in range(nrow)]
 
    for i in range(nrow):
        for j in range(ncol):
            for k in range(len(b)):  # Inner dimension: columns of a == rows of b
                ret[i][j] += a[i][k] * b[k][j]
 
    return ret
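
A short sketch showing that an alias and the type it names are interchangeable. The transpose helper is hypothetical and not part of the lecture.

```python
Matrix = list[list[int]]


def transpose(m: Matrix) -> Matrix:
    # Hypothetical helper: rows become columns
    return [list(row) for row in zip(*m)]


# Annotated with the underlying type, yet accepted wherever Matrix is expected
grid: list[list[int]] = [[1, 2, 3], [4, 5, 6]]

print(transpose(grid))  # [[1, 4], [2, 5], [3, 6]]
```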

Literal type

Literal type: Type having specific constant values

Literal[1, 2] is same as Literal[1] | Literal[2]
Valid types for literal values: int, str, bytes, bool, None, Enum values (not float; see PEP 586)

Cannot use variables; must be constant values

Useful for expressing that only specific values are allowed:

from typing import Literal, get_args  # Must import from typing module
 
MyLiteral = Literal['a', 'b']
 
def f(x: MyLiteral):        # Pyright can infer that the return type
    if x == 'a':            # of f is Literal[1, 2]
        return 1
    else:
        return 2
 
print(get_args(MyLiteral))  # ('a', 'b')
 
f('c')                      # Invalid; Literal['c'] is not in Literal['a', 'b']
 
if f('a') == 0:             # Invalid; Literal[0] is not in Literal[1, 2]
    ...

Exhaustiveness checking [1/2]

Suppose you have forgotten to handle 'CS' and 'IE':

from typing import Literal
 
EnggProgram = Literal['ChE', 'CE', 'CoE', 'CS', 'EcE', 'EE', 'GE',
                      'IE', 'MatE', 'ME', 'MetE', 'EM']
EnggUnit = Literal['DChE', 'DCS', 'DGE', 'DIEOR', 'EEE', 'ICE', 'DME', 'DMMME']
 
def get_offering_unit(program: EnggProgram) -> EnggUnit:
    if program == 'ChE':
        return 'DChE'
    elif program == 'CE':
        return 'ICE'
    elif program == 'CoE' or program == 'EcE' or program == 'EE':
        return 'EEE'
    elif program == 'GE':
        return 'DGE'
    elif program == 'MatE' or program == 'MetE' or program == 'EM':
        return 'DMMME'
    elif program == 'ME':
        return 'DME'
 
    # Recall that functions return None if allowed to reach the end

Pyright complains with: "None" cannot be assigned to type "EnggUnit"

Adding a reveal_type(program) (debug stub used by Pyright) after the if chain...
...says: Type of "program" is "Literal['CS', 'IE']"

Pyright can tell you exactly which cases you missed!
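
A common way to make exhaustiveness explicit is a final else branch that only type-checks when no cases remain. The sketch below uses a hypothetical Light type and a hand-written unreachable helper; Python 3.11+ ships typing.assert_never for the same purpose. The claim about Pyright's behavior is an expectation, not something shown in the lecture.

```python
from typing import Literal, NoReturn

Light = Literal['red', 'yellow', 'green']


def unreachable(value: NoReturn) -> NoReturn:
    # Hypothetical helper: only reached at runtime if a case was missed
    raise AssertionError(f'Unhandled value: {value!r}')


def action(light: Light) -> str:
    if light == 'red':
        return 'stop'
    elif light == 'yellow':
        return 'slow down'
    elif light == 'green':
        return 'go'
    else:
        # All literals are handled, so nothing is left for light to be here
        unreachable(light)


print(action('green'))  # go
```

If one of the branches is deleted, a checker such as Pyright should flag the unreachable(light) call, because the leftover literal cannot be passed to a NoReturn parameter.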

Exhaustiveness checking [2/2]

Also works with pattern matching:

from typing import Literal

EnggProgram = Literal['ChE', 'CE', 'CoE', 'CS', 'EcE', 'EE', 'GE',
                      'IE', 'MatE', 'ME', 'MetE', 'EM']
EnggUnit = Literal['DChE', 'DCS', 'DGE', 'DIEOR', 'EEE', 'ICE', 'DME', 'DMMME']

def get_offering_unit(program: EnggProgram) -> EnggUnit:
    match program:
        case 'ChE':
            return 'DChE'
        case 'CE':
            return 'ICE'
        case 'CoE' | 'EcE' | 'EE':  # Value must be one of them;
            return 'EEE'            # more concise than if version
        case 'CS':
            return 'DCS'
        case 'GE':
            return 'DGE'
        case 'IE':
            return 'DIEOR'
        case 'MatE' | 'MetE' | 'EM':
            return 'DMMME'
        case 'ME':
            return 'DME'

To be covered in later lectures

Subtyping
Generics
Type variance

Reading assignment: Review of Pythonic features

Truthy and falsy collections

Instead of using len to check emptiness:

if len(elems) > 0:
    ...

Use collection itself as truthy/falsy value:

if elems:      # Easier to read
    ...        # and write

Nonempty collections are truthy
Empty collections are falsy

Empty string is falsy

Python collections can be used as boolean values
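
A quick sketch of these rules. The label function and variable names are made up for illustration. Note that truthiness depends on whether the collection is empty, not on what it contains.

```python
def label(elems: list[int]) -> str:
    return "nonempty" if elems else "empty"


print(label([]))   # empty
print(label([0]))  # nonempty (one element, even though 0 itself is falsy)

empty_dict: dict[str, int] = {}
one_tuple: tuple[int] = (0,)

print(bool(""), bool("a"))               # False True
print(bool(empty_dict), bool(one_tuple)) # False True
```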

Negative indexing

Negative indices start from the tail end:

elems = [ 'a', 'b', 'c', 'd' ]
#          0    1    2    3
#         -4   -3   -2   -1
 
# elems[-1] == 'd'
# elems[-len(elems)] == 'a'

Index -1 is last element (if nonempty)
Be wary of empty lists (check emptiness before indexing)

elems[-i] is shorthand for elems[len(elems)-i]:

elems = [ 'a', 'b', 'c', 'd' ]  # len(elems) == 4
#          0    1    2    3     # index
#        4-4  4-3  4-2  4-1     # len(elems)-i

Shorthand is easier to read and write

Aside: Boolean shortcircuiting

and shortcircuiting:

a and b
If a is False (or any falsy value), evaluation of b is skipped

False and ??? is always False

or shortcircuiting:

a or b
If a is True (or any truthy value), evaluation of b is skipped

True or ??? is always True
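
A sketch that makes the skipped evaluation visible. The check function (hypothetical) records each operand that actually runs; ends_with_s shows the usual practical use, guarding an index access.

```python
calls: list[str] = []


def check(name: str, result: bool) -> bool:
    calls.append(name)  # Record that this operand was actually evaluated
    return result


r1 = check('a', False) and check('b', True)  # 'b' is skipped
r2 = check('c', True) or check('d', False)   # 'd' is skipped

print(calls)   # ['a', 'c']
print(r1, r2)  # False True


def ends_with_s(word: str) -> bool:
    # word[-1] runs only when word is nonempty, so '' cannot raise IndexError
    return bool(word) and word[-1] == 's'


print(ends_with_s(''))       # False
print(ends_with_s('Cyrus'))  # True
```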

String interpolation via fstrings

Instead of manual concatenation:

if word and word[-1] == 's':     # Nonempty string is truthy
    suffix = "'"
else:
    suffix = "'s"
 
possessive_form = word + suffix  # "student" -> "student's"
                                 # "Cyrus" -> "Cyrus'"

Use an f-string instead:

possessive_form = f"{word}{suffix}"  # Explicitly a string

f-string: All {}s inside string are evaluated and made into strings

Do not forget to prefix f to the opening ' or ": with num = 12, f"{num}" gives "12" while "{num}" stays as the literal text "{num}"
Use {{ and }} inside an f-string to add regular braces
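
The two rules above, sketched with num = 12 (variable names are for illustration only):

```python
num = 12

with_prefix = f"num is {num}"
without_prefix = "num is {num}"        # No f prefix: braces are kept as-is
expression = f"{num} squared is {num**2}"
escaped = f"{{num}} is {num}"          # Doubled braces give literal braces

print(with_prefix)     # num is 12
print(without_prefix)  # num is {num}
print(expression)      # 12 squared is 144
print(escaped)         # {num} is 12
```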

Conditional expressions

Previous example using an if statement:

if word and word[-1] == 's':
    suffix = "'"
else:
    suffix = "'s"

Indentation-free version using an if expression:

suffix = "'" if word and word[-1] == 's' else "'s"

Indentation may make code harder to read

Slicing

Slicing syntax (works on sequences such as lists, tuples, and strings):

# Indexes [start,stop) of sequence in increments of step
# similar to range
<sequence>[<start>:<stop>:<step>]

lst = [10, 11, 12, 13, 14]
print(lst[1:3])   # [11, 12]
 
tup = (10, 11, 12, 13, 14)
print(tup[1:3])   # (11, 12)
 
s = "abcde"
print(s[1:3])     # "bc"

lst = [10, 11, 12, 13, 14]
print(lst[1:4])   # [11, 12, 13]
print(lst[1:])    # [11, 12, 13, 14]
print(lst[:4])    # [10, 11, 12, 13]
print(lst[:])     # [10, 11, 12, 13, 14] (effectively copies list)
print(lst[::-1])  # [14, 13, 12, 11, 10] (same as list(reversed(lst)))
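
A few more cases, sketched to show the step argument, negative bounds, and the fact that a slice is a new list:

```python
lst = [10, 11, 12, 13, 14]

print(lst[::2])    # [10, 12, 14]  (every second element)
print(lst[1:4:2])  # [11, 13]      (indices 1 and 3)
print(lst[-2:])    # [13, 14]      (last two elements)
print(lst[3:100])  # [13, 14]      (out-of-range stop is clamped; no IndexError)

copy = lst[:]
copy.append(15)
print(lst)         # [10, 11, 12, 13, 14]  (original is unchanged)
```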

Iterable unpacking [1/2]

Instead of assigning individual elements via indexing:

coords = (10, 20)
x = coords[0]
y = coords[1]

Use tuple unpacking instead:

coords = (10, 20)
x, y = coords

Temporary variable swap vs. one-line swap:

temp = x
x = y
y = temp

x, y = y, x
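
Unpacking targets can also be starred or nested. A brief sketch, with names chosen only for illustration:

```python
first, *rest = [10, 20, 30, 40]    # Starred target collects the leftovers
print(first, rest)                 # 10 [20, 30, 40]

head, *middle, last = "abcde"      # Works on any iterable, including strings
print(head, middle, last)          # a ['b', 'c', 'd'] e

(x, y), label = (10, 20), "origin" # Nested pattern mirrors the nested value
print(x, y, label)                 # 10 20 origin
```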

Iterable unpacking [2/2]

Instead of multiple list concatenation via +:

a = [20, 30]
b = [50, 60]
c = [10] + a + [40] + b  # [10, 20, 30, 40, 50, 60]

Use the unpacking star operator instead:

a = [20, 30]
b = [50, 60]
c = [10, *a, 40, *b]  # [10, 20, 30, 40, 50, 60]

Use ** for unpacking dicts (dictionary unpacking operator):

x = {'a': 1, 'b': 2}
y = {'c': 3}
z = {**x, **y, 'd': 4}  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}

Index-free, read-only iteration

Task: Print all elements of a given list

elems = ['Poring', 'Fabre', 'Lunatic', 'Chonchon']

Naive approach: Iteration using list indexing:

for i in range(len(elems)):  # [0, 1, 2, 3]
    print(elems[i])

Index-free list iteration via for:

for elem in elems:
    print(elem)

Use range(len(elems)) only when index is necessary

Numbered iteration

Task: Print each element of a given list with its 1-indexed position

elems = ['Tressa', 'Cyrus', 'Olberic', 'Primrose']

Naive approach: Using incremented index:

for i in range(len(elems)):                # "#1: Tressa";
    print(f'#{i+1}: {elems[i]}')           # must index

Better approach: Use enumerate instead (yields (index, element) tuples):

for i, elem in enumerate(elems):           # i starts at 0;
    print(f'#{i+1}: {elem}')               # no need to index

for n, elem in enumerate(elems, start=1):  # n starts at 1; no need
    print(f'#{n}: {elem}')                 # to index and increment
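
A small sketch of what enumerate produces. It is a lazy iterator, so it has to be wrapped in list() to be inspected, and it can only be consumed once.

```python
elems = ['Tressa', 'Cyrus']

pairs = enumerate(elems, start=1)

first_pass = list(pairs)
second_pass = list(pairs)   # The iterator is already exhausted

print(first_pass)   # [(1, 'Tressa'), (2, 'Cyrus')]
print(second_pass)  # []
```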

Key-value dict enumeration

Instead of iterating by keys only:

d = {'a': 1, 'b': 2, 'c': 3}
 
for key in d:  # No need to do d.keys()
    print(f'Key: {key} has value {d[key]}')

Use .items() instead:

d = {'a': 1, 'b': 2, 'c': 3}
 
for key, value in d.items():
    print(f'Key: {key} has value {value}')

Reverse iteration [1/2]

Task: Print all elements of a given list in reverse

elems = ['Poring', 'Fabre', 'Lunatic', 'Chonchon']

Naive approach #1: Decreasing range sequence:

for i in range(len(elems) - 1, -1, -1):  # i: [3, 2, 1, 0]
    print(elems[i])

Naive approach #2: Reverse index via subtraction:

for i in range(len(elems)):       #     i: [0, 1, 2, 3]
    print(elems[len(elems)-1-i])  # 3 - i: [3, 2, 1, 0]

Prone to off-by-one errors; can we do better?

Reverse iteration [2/2]

Approach #1: Use built-in reversed function:

for elem in reversed(elems):
    print(elem)

Approach #2: Use reverse slice

for elem in elems[::-1]:
    print(elem)

Not as readable as above

Populating a list via for

Instead of using a for to build a list from an existing iterable:

nums = []
for n in range(1, 11):
    nums.append(n**2)

Use a list comprehension instead:

# [<expr> for <variable> in <iterable>] builds a new list
nums = [n**2 for n in range(1, 11)]

Can also be nested to form list of lists:

matrix = [[((r * 3) + c) for c in range(3)] for r in range(5)]
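
To see what the nested comprehension builds, the sketch below spells out the equivalent loops (the expanded and row names are for illustration) and compares the results.

```python
matrix = [[((r * 3) + c) for c in range(3)] for r in range(5)]

# Equivalent explicit loops: the outer comprehension is the outer loop
expanded: list[list[int]] = []
for r in range(5):
    row: list[int] = []
    for c in range(3):
        row.append((r * 3) + c)
    expanded.append(row)

print(matrix == expanded)    # True
print(matrix[0], matrix[4])  # [0, 1, 2] [12, 13, 14]
```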

Conditionally populating a list

Instead of using a for to build a subset of an existing iterable:

primes = []
for n in range(2, 101):
    if is_prime(n):
        primes.append(n)

Use a list comprehension instead:

# [<expr> for <variable> in <iterable> if <condition>];
# <expr> is only appended if <condition> is True
primes = [n for n in range(2, 101) if is_prime(n)]
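
The lecture leaves is_prime undefined. The sketch below supplies a simple trial-division version (an assumption made for illustration) so the filtered comprehension can be run end to end.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    # Hypothetical helper: trial division up to the integer square root
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


primes = [n for n in range(2, 101) if is_prime(n)]

print(len(primes))  # 25
print(primes[:5])   # [2, 3, 5, 7, 11]
```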

Set and dict comprehensions

Set comprehension:

strs = ['a', 'aa', 'b', 'abc', 'c', 'bbb']
 
lengths = set()
for s in strs:
    lengths.add(len(s))

lengths = {len(s) for s in strs}  # {1, 2, 3}

Dict comprehension:

strs = ['a', 'aa', 'b', 'abc', 'c', 'bbb']
 
lengths = {}
for s in strs:             # lengths['aa'] == 2
    lengths[s] = len(s)    # lengths['abc'] == 3

lengths = {s: len(s) for s in strs}