Disclaimer: These are notes I took while using MOOC 2022.
Pretty much everything you need to know about Python in one place, for an intermediate beginner.
Use of f strings to concatenate
1 | limit = int(input("Limit: ")) |
1 | The consecutive sum: 1 + 2 + 3 = 6 |
Use of a < b < c
is legal and has the intended mathematical effect in Python. But why don’t we use this?
- Consider other programming languages: C#, for example, does not have this feature. It is best to avoid features that are not widely used, since they can be confusing.
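As a quick illustration (the variable values are made up for this sketch), the chained form gives the same result as two comparisons joined with and:

```python
a, b, c = 1, 5, 10

chained = a < b < c               # chained comparison
explicit = (a < b) and (b < c)    # the explicit form other languages require

print(chained, explicit)
```

The explicit form is longer, but it reads the same way to someone coming from a language without chained comparisons.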
Strings use the same indexing as Perl, including negative indexing.
- You can think of input_string[-1] as shorthand for input_string[len(input_string) - 1].
Searching for substrings
The in operator can tell us if a string contains a particular substring. The Boolean expression a in b is true, if b contains the substring a.
For example, this bit of code
1 | input_string = "test" |
1 | Sample output |
The in operator only tells us whether a substring is present, not where it is. To find the position, the Python string method find can be used. It takes the substring searched for as an argument, and returns either the first index where it is found, or -1 if the substring is not found within the string.
The example below illustrates how it is used:
1 | input_string = "test" |
1 | Sample output |
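A minimal sketch contrasting in and find on the same string (the substrings searched for are chosen only for illustration):

```python
input_string = "test"

contains_es = "es" in input_string    # in only answers yes or no
first_t = input_string.find("t")      # index of the first match
missing = input_string.find("x")      # -1 when there is no match

print(contains_es, first_t, missing)
```

Because find returns -1 rather than raising an error, the result should be checked before it is used as an index.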
Substrings and slices
A substring of a string is a sequence of characters that forms a part of the string. For example, the string example contains the substrings exam, amp and ple, among others. In Python programming, the process of selecting substrings is usually called slicing, and a substring is often referred to as a slice of the string. The two terms can often be used interchangeably.
If you know the beginning and end indexes of the slice you wish to extract, you can do so with the notation [a:b]. This means the slice begins at the index a and ends at the last character before index b - that is, including the first, but excluding the last. You can think of the indexes as separator lines drawn on the left side of the indexed character, as the example below illustrates:
1 | input_string = "presumptious" |
1 | pre |
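The substrings of example mentioned above can be extracted with slices. This is a small sketch; the variable names are assumptions made for the illustration:

```python
word = "example"

first = word[0:4]    # characters at indexes 0, 1, 2, 3
middle = word[2:5]   # characters at indexes 2, 3, 4
last = word[4:7]     # characters at indexes 4, 5, 6

print(first, middle, last)
```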
Type Hinting
1 | def print_many_times(message : str, times : int): |
1 | def ask_for_name() -> str: |
Lists
Adding items to a list
The append method adds items to the end of a list. It works like this:
1 | numbers = [] |
1 | [5, 10, 3] |
Adding to a specific location
If you want to specify a location in the list where an item should be added, you can use the insert method. The method adds an item at the specified index. All the items already in the list with an index equal to or higher than the specified index are moved one index further “to the right”. Here’s an example:
1 | numbers = [1, 2, 3, 4] |
1 | # Program |
1 | Sample output |
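A short sketch of how insert shifts the existing items (the inserted values 10 and 20 are arbitrary):

```python
numbers = [1, 2, 3, 4]

numbers.insert(0, 10)   # 10 becomes the new first item
numbers.insert(2, 20)   # items from index 2 onwards move one step right

print(numbers)
```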
Removing items from a list
There are two different approaches to removing an item from a list:
- If the index of the item is known, you can use the method pop.
- If the contents of the item are known, you can use the method remove.
So, the method pop takes the index of the item you want to remove as its argument. The following program removes items at indexes 2 and 3 from the list. Notice how the indexes of the remaining items change when one is removed.
1 | my_list = [1, 2, 3, 4, 5, 6] |
Sample output:
1 | [1, 2, 4, 5, 6] |
It’s useful to remember that the method pop also returns the removed item:
1 | my_list = [4, 2, 7, 2, 5] |
Sample output:
1 | 7 |
Remove
The remove method, on the other hand, takes the value of the item to be removed as its argument. For example, consider this program:
1 | my_list = [1, 2, 3, 4, 5, 6] |
The above code produces the following output:
1 | [1, 3, 4, 5, 6] |
The remove method removes the FIRST occurrence of the value in the list, similar to how the string method find returns the index of the first occurrence of a substring. Here’s another example:
1 | my_list = [1, 2, 1, 2] |
The output of the above code is:
1 | [2, 1, 2] |
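To put the two approaches side by side, here is a small sketch (the list contents are only an example):

```python
my_list = [1, 2, 1, 2]

removed = my_list.pop(1)   # removes the item at index 1 and returns it
my_list.remove(1)          # removes the first item whose value is 1

print(removed, my_list)
```

pop works with a position and hands the value back; remove works with a value and returns nothing.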
in vs not in
Using in for lists
If the specified item is not present in the list, the remove method raises a ValueError. As with strings, you can check for the presence of an item using the in operator. Here’s an example:
1 | my_list = [1, 3, 4] |
The output of the above code is:
1 | The list contains item 1 |
Sorting Lists (sort vs sorted)
To sort a list from smallest to greatest, you can use the sort method:
1 | my_list = [2, 5, 1, 2, 4] |
To create a new sorted copy of the list without modifying the original, you can use the sorted function:
1 | my_list = [2, 5, 1, 2, 4] |
Remember, sort modifies the original list, while sorted creates a new sorted list.
1 | original = [2, 5, 1, 2, 4] |
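A common slip is to assign the result of sort to a variable. A minimal sketch of the difference (the variable names are assumptions for this illustration):

```python
original = [2, 5, 1, 2, 4]

new_list = sorted(original)   # returns a new list, original is untouched
snapshot = original[:]        # keep a copy to show that nothing changed yet

result = original.sort()      # sorts in place and returns None

print(new_list, snapshot, original, result)
```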
Example of sort vs sorted issue
1 | # Write your solution here |
Use of in-range
1 | def list_sum(my_list1 : list, my_list2 : list): |
To add two lists index by index, use zip
The zip function is used to iterate over two or more lists (and not only lists, but any iterables) in parallel.
1 | # for item1, item2 in zip(list1, list2): |
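As a sketch of the idea, here is one way to sum two lists pairwise with zip. The function name list_sum_zip is hypothetical and not part of the course material:

```python
def list_sum_zip(list1: list, list2: list) -> list:
    # zip pairs up items that share an index; it stops at the shorter list
    return [item1 + item2 for item1, item2 in zip(list1, list2)]

print(list_sum_zip([1, 2, 3], [7, 8, 9]))
```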
Print Statement Formatting
We have learned three methods for formatting the arguments of the print function.
- The first method is using the + operator for string concatenation:
1 | name = "Mark" |
- The second method is separating the segments with commas:
1 | print("Hi", name, "your age is", age, "years") |
- To remove the automatically added spaces, you can use the sep keyword argument:
1 | print("Hi", name, "your age is", age, "years", sep="") |
You can specify any string as the separator. For example, using "\n" as the separator will print each argument on a separate line:
1 | print("Hi", name, "your age is", age, "years", sep="\n") |
You can also modify the end of the line using the end keyword argument. Setting end="" removes the newline character:
1 | print("Hi ", end="") |
.f modifier in formatted values
The third method to prepare strings is f-strings. The previous example with the name and the age would look like this formulated with f-strings:
1 | name = "Erkki" |
The output is:
1 | Hi Erkki your age is 39 years |
Thus far, we have only used very simple f-strings, but they can be very versatile in formatting string-type content. One very common use case is setting the number of decimals that are printed out with a floating-point number. By default, the number is quite high:
1 | number = 1/3 |
The output is:
1 | The number is 0.3333333333333333 |
The specific format we want the number to be displayed in can be set within the curly brackets of the variable expression. Let’s add a colon character and a format specifier after the variable name:
1 | number = 1/3 |
The output is:
1 | The number is 0.33 |
The format specifier .2f states that we want to display 2 decimals. The letter f at the end means that we want the variable to be displayed as a float, i.e., a floating-point number.
White-Space Format
Here’s another example, where we specify the amount of whitespace reserved for the variable in the printout. Both times the variable name is included in the resulting string, it has a space of 15 characters reserved. First, the names are justified to the left, and then they are justified to the right:
1 | names = ["Steve", "Jean", "Katherine", "Paul"] |
{name:15}: Here, :15 means that 15 spaces are reserved for the variable name. If the name is less than 15 characters, the extra spaces will be filled with whitespace. By default, the text is left-justified, meaning the extra spaces will be on the right side of the name.
{name:>15}: This is similar to the previous one, but with an extra > character. The > means that the text should be right-justified. So if the name is less than 15 characters, the extra spaces will be on the left side of the name.
The output is:
1 | Steve centre Steve |
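The format specifiers discussed here can be tried out in isolation. In this sketch the | characters are added only to make the reserved width visible:

```python
number = 1 / 3
two_decimals = f"{number:.2f}"    # round to two decimals for display

name = "Jean"
left = f"{name:15}|"              # left-justified in a 15-character field
right = f"{name:>15}|"            # right-justified in a 15-character field

print(two_decimals)
print(left)
print(right)
```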
The uses of f-strings are not restricted to print commands. They can be assigned to variables and combined with other strings:
1 | name = "Larry" |
The output is:
1 | Hi Larry, you are 48 years of age, and you live in Palo Alto |
You can think of an f-string as a sort of function that creates a normal string based on the “arguments” within the curly brackets.
Format float list to str list with 2 decimal places
1 | def formatted(my_list : list[float]) -> list[str]: |
More String and List
Slicing Strings
You are already familiar with the [] syntax for accessing a part of a string:
1 | my_string = "exemplary" |
The output is:
1 | mpla |
Slicing Lists
The same syntax works with lists. Lists can be sliced just like strings:
1 | my_list = [3, 4, 2, 4, 6, 1, 2, 4, 2] |
The output is:
1 | [4, 6, 1, 2] |
Slicing with Step
In fact, the [] syntax works very similarly to the range function, which means we can also give it a step:
1 | my_string = "exemplary" |
The output is:
1 | eepa |
Reverse a String (easy)
If we omit either of the indexes, the operator defaults to including everything. Among other things, this allows us to write a very short program to reverse a string:
1 | my_string = input("Please type in a string: ") |
Sample output:
1 | Please type in a string: exemplary |
Count Method
The count method counts the number of times the specified item or substring occurs in the target. The method works similarly with both strings and lists:
1 | my_string = "How much wood would a woodchuck chuck if a woodchuck could chuck wood" |
The output is:
1 | 5 |
The method will not count overlapping occurrences. For example, in the string “aaaa”, the method counts only two occurrences of the substring “aa”, even though there would actually be three if overlapping occurrences were allowed.
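The non-overlapping behaviour is easy to check directly. A small sketch (the list contents are made up for the example):

```python
overlapping = "aaaa".count("aa")      # matches at indexes 0 and 2 only
in_list = [1, 2, 1, 3].count(1)       # count also works on lists

print(overlapping, in_list)
```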
Replace Method
Basic Replacement
The replace method creates a new string where a specified substring is replaced with another string:
1 | my_string = "Hi there" |
The output is:
1 | Hey there |
The method will replace all occurrences of the substring.
Multiple Replacements
1 | sentence = "sheila sells seashells on the seashore" |
The output is:
1 | SHEila sells seaSHElls on the seashore |
Immutable Strings and Variable Assignment
When using the replace method, a typical mistake is forgetting that strings are immutable. If the old string is no longer needed, the new string can be assigned to the same variable:
1 | my_string = "Python is fun" |
The output is:
1 | Java is fun |
If the string returned by replace is not stored in a variable, the original string remains unchanged and the new string is lost.
in Operator
The in operator is used to check if a value exists in a sequence, such as a string or a list. It returns True if the value is present and False otherwise.
Example 1: Checking Membership in a String
1 | my_string = "Hello, World!" |
The output is:
1 | True |
In the first example, 'l' is present in the string, so True is returned. In the second example, 'z' is not found in the string, so False is returned.
Example 2: Checking Membership in a List
1 | my_list = [1, 2, 3, 4, 5] |
The output is:
1 | True |
In the first example, 3 is present in the list, so True is returned. In the second example, 6 is not found in the list, so False is returned.
not in Operator
The not in operator is used to check if a value does not exist in a sequence. It returns True if the value is not present and False if it is.
Example 1: Checking Non-Membership in a String
1 | my_string = "Hello, World!" |
The output is:
1 | True |
In the first example, 'z' is not found in the string, so True is returned. In the second example, 'o' is present in the string, so False is returned.
Example 2: Checking Non-Membership in a List
1 | my_list = [1, 2, 3, 4, 5] |
The output is:
1 | True |
In the first example, 6 is not found in the list, so True is returned. In the second example, 3 is present in the list, so False is returned.
if vs. if not
if variant
1 | def no_shouting(my_list : list[str]) -> list[str]: |
if not variant (concise)
1 | def no_shouting(my_list: list): |
Original Code that didn’t compile
1 | # Write your solution here |
Issue #1
1 | for num in my_exercise: |
for loops do not alter the data as shown: num is a temporary variable that receives each item in turn, so assigning a new value to it does not change the list.
Solution to Issue #1 v_1
1 | weighted_my_exercise_list = [] |
- ended up creating a new list and storing the values in there
Solution to Issue #1 v_2 (LIST COMPREHENSION)
1 | my_exercise = [num // 10 for num in my_exercise] |
- This is an example of a list comprehension, which is a Python feature that provides a concise way to create lists based on existing lists (or other iterable objects).
Issue #2 (code conciseness)
1 | i = 0 # counter to track exam automatic fails, regardless of points |
- this is more of a C++ way of handling code.
Solution to Issue #2 (enumeration function)
1 | for i, total_points in enumerate(combined_points): |
enumerate() is a built-in Python function that allows you to loop over something and have an automatic counter.
When you use enumerate(), it gives you two values for each iteration of the loop: the count (or index) and the value of the item at that index.
So when you write for i, total_points in enumerate(combined_points):, the variable i is set to the index of the current item in the loop, and total_points is set to the value of the item at that index.
In the original code, the index i was maintained manually by initializing i = 0 before the loop and incrementing it with i += 1 inside the loop. enumerate() does this automatically.
Let’s say combined_points is [14, 15, 20, 25, 30].
Here’s what i and total_points would be on each iteration of the loop:
- On the first iteration, i would be 0 and total_points would be 14.
- On the second iteration, i would be 1 and total_points would be 15.
- And so on, until the end of the list.
The benefit of using enumerate() is that it makes the code more readable and Pythonic (idiomatic Python). It’s also safer, because there is no manual increment of i to forget.
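The iterations described above can be reproduced with a short sketch. The pairs list is an assumption added only to collect the values for inspection:

```python
combined_points = [14, 15, 20, 25, 30]

pairs = []
for i, total_points in enumerate(combined_points):
    # i is the index, total_points is the item at that index
    pairs.append((i, total_points))

print(pairs)
```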
Issue #3 (this loops over each element as opposed to each collection of unique grades)
1 | for i in range(len(sort_grade_list), 0, -1): |
- Issue: loops over each element, this is not what was intended.
- Issue#2: Recall the use of range() and its parameters, we want to include grade “0”, but that isn’t included in the given parameters.
Solution to Issue#3 V_1
1 | for i in range(5, -1, -1): # can use set() function to get unique set of grades, but not discussed. |
- Hard-coded the grading range (not ideal, but works)
- Fixed parameter to be -1, so that 0 is included.
Solution to Issue#3 V_2 (better), use of set() function
1 | for i in range(len(set(grade_list)), 0, -1): |
- set(grade_list) to get a collection of unique grades.
Issue #4: return values on certain functions
1 | # show grade distribution, formatted |
- The list grade_list is reversed using [::-1], but this doesn’t actually change grade_list itself, it just returns a reversed copy.
Solution to Issue#4
1 | ``` |
In this revised code, the split() function creates a list of two strings from the user’s input. The int(user_input_list[0]) and int(user_input_list[1]) expressions then convert the first and second elements of that list to integers, respectively. The rest of the code is the same as in the previous example.
Compiled Code (Passed All Tests)
1 | # Write your solution here |
Model Solution
- This solution and the way it solves problems was eye-opening.
def exam_and_exercise_completed(inpt):
    space = inpt.find(" ")
    exam = int(inpt[:space])
    exercise = int(inpt[space+1:])
    return [exam, exercise]

def exercise_points(amount):
    return amount // 10

def grade(points):
    boundary = [0, 15, 18, 21, 24, 28]
    for i in range(5, -1, -1):
        if points >= boundary[i]:
            return i

def mean(points):
    return sum(points) / len(points)

def main():
    points = []
    grades = [0] * 6
    while True:
        inpt = input("Exam points and exercises completed: ")
        if len(inpt) == 0:
            break

        exam_and_exercises = exam_and_exercise_completed(inpt)
        exercise_pnts = exercise_points(exam_and_exercises[1])
        total_points = exam_and_exercises[0] + exercise_pnts
        points.append(total_points)
        grd = grade(total_points)
        if exam_and_exercises[0] < 10:
            grd = 0
        grades[grd] += 1

    pass_pros = 100 * (len(points) - grades[0]) / len(points)
    print("Statistics:")
    print(f"Points average: {mean(points):.1f}")
    print(f"Pass percentage: {pass_pros:.1f}")
    print("Grade distribution:")
    for i in range(5, -1, -1):
        stars = "*" * grades[i]
        print(f" {i}: {stars}")

main()
Lists within Lists
The items in a list can be lists themselves:
1 | my_list = [[5, 2, 3], [4, 1], [2, 2, 5, 1]] |
The output is:
1 | [[5, 2, 3], [4, 1], [2, 2, 5, 1]] |
Lists within lists can be useful for storing structured data. For example, you could store information about a person in a list. Each person’s information can be represented as a sublist within the main list:
1 | persons = [["Betty", 10, 1.37], ["Peter", 7, 1.25], ["Emily", 32, 1.64], ["Alan", 39, 1.78]] |
The output is:
1 | Betty: age 10 years, height 1.37 meters |
In this example, each sublist represents a person, with the first item being the name, the second item being the age, and the third item being the height.
Matrices
A two-dimensional array, or a matrix, can be represented using a list within a list. Each sublist corresponds to a row in the matrix.
For example, consider the following matrix:
1 | 5 1 1 |
It can be represented as a two-dimensional list in Python:
1 | my_matrix = [[1, 2, 3], [3, 2, 1], [4, 5, 6]] |
To access individual elements within the matrix, use consecutive square brackets. The first index refers to the row, and the second index refers to the column. Indexing starts from zero. For example, my_matrix[0][1] refers to the second item on the first row.
1 | my_matrix = [[1, 2, 3], [3, 2, 1], [4, 5, 6]] |
The output is:
1 | 2 |
Traversing the rows of the matrix can be done with a for loop:
1 | my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] |
The output is:
1 | [1, 2, 3] |
To access individual elements, use nested loops:
1 | my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] |
The output is:
1 | A new row |
Visualizing Code with Lists within Lists
Understanding programs that involve lists within lists can be challenging. The Python Tutor visualisation tool can help in visualizing how they work.
Working with matrices and nested lists can be easier to grasp using visualizations. For example, a 3x3 matrix technically consists of four lists. The first list represents the entire matrix, while the remaining lists represent the rows.
The visualisation tool helps in understanding the reference relationships between the main list and the nested lists representing the rows.
Accessing Items in a Matrix
Accessing a single row within a matrix is straightforward by selecting the desired row. The following function calculates the sum of the elements in a chosen row:
1 | def sum_of_row(my_matrix, row_no: int): |
Working with columns within a matrix requires iterating through each row and selecting the item at the chosen position:
1 | def sum_of_column(my_matrix, column_no: int): |
Changing the value of a single element within the matrix can be done by selecting the desired row and column:
1 | def change_value(my_matrix, row_no: int, column_no: int, new_value: int): |
The output is:
1 | [[4, 2, 3, 2], [9, 1, 12, 11], [7, 8, 9, 5], [2, 9, 15, 1]] |
To modify the contents of a matrix, it is necessary to access elements by their indexes rather than using a simple loop. A loop using the range function can be used to iterate over the indexes.
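To see why indexes are needed, compare the two loops in this sketch (the matrix values are made up for the example):

```python
matrix = [[1, 2], [3, 4]]

# A simple loop: item is only a temporary name, the matrix is not changed
for row in matrix:
    for item in row:
        item += 1

unchanged = [row[:] for row in matrix]   # snapshot after the first loop

# Looping over indexes with range: the elements themselves are updated
for i in range(len(matrix)):
    for j in range(len(matrix[i])):
        matrix[i][j] += 1

print(unchanged, matrix)
```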
Understanding Python Variables and References
Variables as References in Python
Instead of considering a variable as a “box” containing its value, it is more accurate to think of a Python variable as a reference to the actual object, which can be a number, a string, or a list, among others. This reference information points to the location in computer memory where the value can be found, but it is not the value itself.
1 | a = [1, 2, 3] |
Using References to Identify Variables
The function id can be used to find out the exact location the variable points to, returning an integer that can be thought of as the memory address. For instance, if you execute the code above on your own computer, the result will likely differ as your variables will point to different locations - the references will be different.
The Python Tutor visualization tool also presents references as arrows from the variable to the actual value. However, it simplifies how strings are represented by displaying them as if they are stored in the variables themselves, even though Python handles strings much like lists with references to locations in memory.
Understanding Immutable and Mutable Types in Python
Many of Python’s built-in types like str are immutable, meaning their value or any part of it cannot change. Conversely, some types like list are mutable, meaning their content can change without needing to create an entirely new object.
Surprisingly, the basic data types int, float, and bool are also immutable in Python. Although it may seem like you’re changing the value stored in the variable, each operation creates a new object in memory.
1 | number = 1 |
Multiple References and List Assignment
When you assign a list variable to a new variable, what gets copied is the reference, not the list itself. This means there are now two references to the same memory location containing the list, and changes made to the list through one reference affect the other reference too, as they both point to the same location.
1 | list1 = [1, 2, 3, 4] |
Copying a List
To create an actual separate copy of a list, you can either create a new list and add each item from the original list, or use the bracket operator [:] to select all items in the original list, which creates a copy.
1 | my_list = [1,2,3,4] |
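The difference between copying a reference and copying the list can be checked with the is operator. A minimal sketch, with variable names chosen for the illustration:

```python
list1 = [1, 2, 3, 4]
list2 = list1       # a second reference to the same list
list3 = list1[:]    # a separate copy of the list

list1[0] = 10       # visible through list2, but not through list3

print(list2, list3)
print(list2 is list1, list3 is list1)
```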
Lists as Parameters in Functions
When passing a list as an argument to a function, you’re passing a reference to that list, meaning the function can modify the list directly. Note that changes made to the list inside the function persist outside the function because they’re modifying the same list referenced in the calling scope.
1 | def add_item(my_list: list): |
Editing Lists Given as Arguments
When you try to modify a list argument within a function by assigning a new list to the argument variable, the original list is not affected because the argument variable now points to a new memory location. A better approach is to directly modify the original list or copy items from the new list into the original one.
1 | def augment_all(my_list: list): |
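A sketch of the two cases described above. Both function names are hypothetical and exist only for this illustration:

```python
def augment_by_rebinding(my_list: list):
    # Assigning a new list only re-points the local name my_list;
    # the caller's list is left as it was.
    my_list = [item + 10 for item in my_list]

def augment_in_place(my_list: list):
    # Assigning through indexes changes the list the caller passed in.
    for i in range(len(my_list)):
        my_list[i] += 10

numbers = [1, 2, 3]
augment_by_rebinding(numbers)
after_rebinding = numbers[:]

augment_in_place(numbers)
print(after_rebinding, numbers)
```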
1 | # Write your solution here |
.join()
.join() is a string method in Python which concatenates (or “joins”) all the elements of an iterable (like a list or a tuple) into a string. The string on which this method is called is used as a separator between the elements.
Here’s an example:
1 | separator = ', ' |
In this example, result will be 'apple, banana, cherry'. The ', ' string is used to join the words in the words list.
Let’s look at a few more examples:
- Joining a list of strings with no separator:
1 | "".join(["H", "e", "l", "l", "o"]) |
This will result in the string 'Hello'.
- Joining a list of numbers:
1 | ", ".join([str(num) for num in [1, 2, 3, 4, 5]]) |
This will result in the string '1, 2, 3, 4, 5'. Note that we had to convert the numbers to strings first, because .join() only works on iterables of strings.
- Joining a tuple of strings with a separator:
1 | " - ".join(("apple", "banana", "cherry")) |
This will result in the string 'apple - banana - cherry'.
Let’s execute these examples in Python to see the results.
1 | # Example 1: Joining a list of strings with no separator |
1 | ('Hello', '1, 2, 3, 4, 5', 'apple - banana - cherry') |
Here are the results from our examples:
- Joining a list of strings with no separator: 'Hello'
- Joining a list of numbers: '1, 2, 3, 4, 5'
- Joining a tuple of strings with a separator: 'apple - banana - cherry'
As you can see, the .join() method is quite versatile and can be used in various scenarios to combine elements of an iterable into a single string.
Shallow Copy Problem
1 | # Write your solution here |
The line this_sudoku = sudoku[:] is creating a shallow copy of the list sudoku, which means it creates a new list containing all of the elements in the old list. However, it’s important to note that it’s a “shallow” copy. This means that while the top-level list is duplicated, the inner lists (the rows in your Sudoku grid) are still the same lists. If you modify one of these inner lists, the change will be reflected in both this_sudoku and sudoku.
Here is a small example that demonstrates this behavior:
1 | sudoku = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] |
If you need a deep copy of the list, where changes to one list do not affect the other, you should use the copy module’s deepcopy function:
1 | import copy |
With deepcopy, changes to the inner lists in sudoku will not affect this_sudoku, and vice versa.
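The shallow and deep copies can be compared side by side. This sketch uses a 3x3 grid as a stand-in for a full Sudoku, and the names shallow and deep are assumptions:

```python
import copy

sudoku = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

shallow = sudoku[:]             # new outer list, same row lists
deep = copy.deepcopy(sudoku)    # new outer list and new row lists

sudoku[0][0] = 100              # modify a row through the original

print(shallow[0][0], deep[0][0])
print(shallow[0] is sudoku[0], deep[0] is sudoku[0])
```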
My Solution vs. Model Solution vs. deepcopy Solution
# shallow copy of the rows, but the inner row lists remain referenced
# shallow copy of the inner row lists, but this double shallow copy effectively creates an entire copy of the list of lists
1 | def add_number(sudoku: list, row_no: int, column_no: int, number:int): |
The model solution of copy and add sidelines this issue by initializing a new list to begin with. Then it adds the rows after. It’s the same as what I did, but with more steps.
1 | def copy_and_add(sudoku: list, row_no: int, column_no: int, number:int): |
List Comprehension Solution (limited to a list of lists)
1 | def copy_and_add(sudoku: list, row_no: int, column_no: int, number: int): |
More “Sophisticated” Solution
1 | import copy |
1 | if __name__ == "__main__": |
Transposing Elements
1 | The following matrix |
Solution 1 (creating a temp matrix) – inefficient
The initial code is my take on creating a deep copy of a matrix, or in this case a list[list[int]]. A temporary matrix is created and used as a temp variable. However, the problem specifies that the original list is modified. (So yes, while this solution works, it somewhat defeats the original goal.)
1 | def transpose(matrix: list): |
Copying a matrix: deepcopy() vs. list comprehension vs. manually
deepcopy
import copy

def transpose(matrix: list):
    temp_list = copy.deepcopy(matrix)
list comprehension
def transpose(matrix: list):
    temp_list = [row[:] for row in matrix]

""" manually, but too many lines
def transpose(matrix: list):
    temp_list = matrix[:] # copies row references
    for i, row in enumerate(matrix): # copies information from each row
        temp_list[i] = row[:]
"""
Solution 2: Model Solution
1 | def transpose(matrix: list): |
Solution 3: Pythonic
- The right-hand side is evaluated first, creating a tuple, which is then unpacked into the left-hand targets from left to right. This is why the swap works without a temporary variable.
def transpose(matrix: list):
    n = len(matrix)
    for i in range(n):
        for j in range(i, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
Side Effects
The following program has an unintended side effect:
1 | def second_smallest(my_list: list) -> int: |
Since my_list is passed as a reference, calling my_list.sort() affects the original list.
We can avoid the side effect by making a small change to the function:
1 | def second_smallest(my_list: list) -> int: |
sorted() returns a new list, whereas sort() alters the original list.
Functions free of side effects are also called pure functions. Especially when adhering to a functional programming style, this is a common ideal to follow.
In a dictionary, items are indexed by keys, and each key maps to a value.
1 | my_dictionary = {} # creates empty dictionary |
1 | 3 |
Dictionary with User Input
1 | word = input("Please type in a word: ") |
What can be stored in a dictionary?
- Looks to me like any data type. Here are a few examples
string -> int
results = {}
results["Mary"] = 4
results["Alice"] = 5
results["Larry"] = 2
int -> list[int]
lists = {}
lists[5] = [1, 2, 3]
lists[42] = [5, 4, 5, 4, 5]
lists[100] = [5, 2, 3]
Keys are unique
If you use an existing key, the value mapped to that key is replaced with the new value.
1 | my_dictionary["suuri"] = "big" |
1 | large |
Keys must be immutable!
This means that keys CANNOT be lists.
1 | my_dictionary[[1, 2, 3]] = 5 |
1 | TypeError: unhashable type: 'list' |
Traversing a dictionary
Use for item in collection, as we have been, to traverse a dictionary. The loop variable receives each key in turn.
1 | my_dictionary = {} |
1 | key: apina |
A more Pythonic way to achieve this is to use the .items() method. It returns a view object over the dictionary’s key-value pairs; each pair is a tuple, which is unpacked into key and value.
- This is more efficient and readable because, unlike with for item in collection, we don’t need to do a key lookup to access each value.
1 | my_dictionary = {} |
1 | key: apina |
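Both traversal styles produce the same pairs. In this sketch only the key apina comes from the notes above; the other entries and the variable names are made up for the illustration:

```python
my_dictionary = {"apina": "monkey", "banaani": "banana", "cembalo": "harpsichord"}

# Traversing the keys and looking up each value
by_key = []
for key in my_dictionary:
    by_key.append((key, my_dictionary[key]))

# Traversing key-value pairs directly with items()
by_items = []
for key, value in my_dictionary.items():
    by_items.append((key, value))

print(by_key == by_items)
```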
As the keys are processed based on a hash value, the order should not usually matter in applications. In fact, before Python 3.7 the order was not guaranteed to follow the order of insertion.
Some more advanced ways to use dictionaries
Count the number of times a word appears in a list
1 | def counts(my_list: list): |
1 | {'banana': 1, 'milk': 1, 'beer': 3, 'cheese': 2, 'sourmilk': 3, 'juice': 1, 'sausage': 2, 'tomato': 1, 'cucumber': 1, 'butter': 2, 'margarine': 1, 'chocolate': 1} |
What if we wanted to categorize the words based on the initial letter in each word? One way to accomplish this would be to use dictionaries:
1 | def categorize_by_initial(my_list: list): |
1 | words beginning with b: |
Phone Book that can store multiple numbers for one name
1 | # Write your solution here |
Removing keys and values from a dictionary (Two Methods)
- Method One: del
1 | staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"} |
1 | {'Alan': 'lecturer', 'Emily': 'professor'} |
If you try to use del on a key that does not exist in the dictionary, Python raises a KeyError:
1 | staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"} |
1 | >>> del staff["Paul"] |
So, before deleting a key you should check if it is present in the dictionary:
1 | staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"} |
- Method Two: pop()
Unlike del, pop returns the value removed from the dictionary.
1 | staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"} |
1 | {'Alan': 'lecturer', 'Emily': 'professor'} |
By default, pop will also cause an error if you try to delete a key which is not present in the dictionary. It is possible to avoid this by giving the method a second argument, which contains a default return value. This value is returned in case the key is not found in the dictionary. The special Python value None will work here:
1 | staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"} |
1 | This person is not a staff member |
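A compact sketch of both pop calls on the staff dictionary used above (the names removed and missing are assumptions for the illustration):

```python
staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}

removed = staff.pop("David")         # key exists: returns its value
missing = staff.pop("Paul", None)    # key absent: returns the default

print(removed, missing, staff)
```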
NB: if you need to delete the contents of the entire dictionary, and try to do it with a for loop, like so
1 | staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"} |
1 | RuntimeError: dictionary changed size during iteration |
Why?
When you iterate over a dictionary, Python creates an iterator that expects the dictionary to stay the same size. If you change the dictionary’s size (by adding or removing items) during the loop, Python detects the change and raises a RuntimeError.
When traversing a collection with a for loop, the contents may not change while the loop is in progress.
Instead, use clear()
1 | staff.clear() |
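clear() empties the whole dictionary. If only some entries should go, one common workaround is to loop over a snapshot of the keys instead of the dictionary itself. A sketch, assuming we want to drop every lecturer:

```python
staff = {"Alan": "lecturer", "Emily": "professor", "David": "lecturer"}

# list(staff) copies the keys first, so deleting inside the loop is safe
for name in list(staff):
    if staff[name] == "lecturer":
        del staff[name]

print(staff)
```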
Invert a Dictionary
So my original approach was as follows:
1 | for key, value in dictionary.items(): |
The above did not work because I was deleting from the dictionary (and therefore changing its size) while iterating over that same dictionary.
To avoid this, the model code made a copy first. So the iterator is of the copy, which means we can freely change the dictionary. Recall that dictionaries, like lists, are referenced.
1 | def invert(dictionary: dict): |
My original approach:
- I just stored both keys and values in separate lists, and then emptied the dictionary using clear()
```python
def invert(dictionary: dict):
    list_key = []
    list_value = []
    for key, value in dictionary.items():
        list_key.append(key)
        list_value.append(value)
    dictionary.clear()
    for key, value in zip(list_key, list_value):
        dictionary[value] = key
```
Movie Database - Structured Data
The advantage of a dictionary is that it is a collection. It collects related data under one variable, so it is easy to access the different components. This same advantage is offered by a list. However, list indexes such as [1] or [2] do not tell the programmer anything about what is stored at that position. When using a dictionary this problem is avoided, as each bit of data is accessed through a named key.
1 | person1 = {"name": "Pippa Python", "height": 154, "weight": 61, "age": 44} |
1 | def add_movie(database: list, name: str, director: str, year: int, runtime: int): |
Search for a movie title, case-insensitive (since in is case-sensitive)
1 | # Write your solution here |
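A possible shape for the movie database, as a hedged sketch. The add_movie signature comes from the notes above; the function name find_movies, the dictionary keys and the sample movies are assumptions made for illustration.

```python
def add_movie(database: list, name: str, director: str, year: int, runtime: int):
    # Each movie is a dictionary, so every field is reached through a named key
    database.append({"name": name, "director": director, "year": year, "runtime": runtime})


def find_movies(database: list, search_term: str) -> list:
    # Lowercase both sides, because the in operator is case-sensitive
    term = search_term.lower()
    return [movie for movie in database if term in movie["name"].lower()]


database = []
add_movie(database, "Gone with the Python", "Director One", 2017, 116)
add_movie(database, "Pythons on a Plane", "Director Two", 2001, 94)
add_movie(database, "Dawn of the Dead Programmers", "Director Three", 2011, 101)

for movie in find_movies(database, "PYTHON"):
    print(movie["name"])
```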
A tuple is a data structure which is, in many ways, similar to a list. The most important differences between the two are:
- Tuples are enclosed in parentheses (), while lists are enclosed in square brackets [].
- Tuples are immutable, while the contents of a list may change.
The following bit of code creates a tuple containing the coordinates of a point:
1 | point = (10, 20) |
The items stored in a tuple are accessed by index, just like the items stored in a list:
1 | point = (10, 20) |
Sample output:
1 | x coordinate: 10 |
The values stored in a tuple cannot be changed after the tuple has been defined. The following will not work:
1 | point = (10, 20) |
Sample output:
1 | TypeError: 'tuple' object does not support item assignment |
Programming exercise: The oldest person
Solution:
1 | def oldest_person(people: list): |
The problem requires you to write a function named oldest_person that takes a list of tuples as its argument. Each tuple contains the name of a person as the first element and their year of birth as the second element. The function should find the oldest person in the list and return their name.
In the provided example, four tuples representing people’s names and birth years are created: p1, p2, p3, and p4. These tuples are then added to the people list. The oldest_person function is called with the people list as the argument, and the result is printed.
The oldest_person function starts by initializing the oldest variable with the first tuple from the people list. It then iterates over each tuple in the people list and compares the birth year of each person with the birth year of the oldest person found so far. If a person has a lower birth year, the oldest variable is updated with that person’s tuple.
Finally, the function returns the name of the oldest person (oldest[0]).
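The description above can be sketched as follows. The function follows the steps as described; the names and birth years in p1 to p4 are sample values assumed for illustration.

```python
def oldest_person(people: list):
    # Start with the first tuple as the oldest person found so far
    oldest = people[0]
    for person in people:
        # A lower birth year means an older person
        if person[1] < oldest[1]:
            oldest = person
    return oldest[0]


p1 = ("Adam", 1977)
p2 = ("Ellen", 1985)
p3 = ("Mary", 1953)
p4 = ("Ernest", 1997)
people = [p1, p2, p3, p4]

print(oldest_person(people))  # Mary
```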
Purpose of a Tuple
Tuples serve a specific purpose in Python programming. They are particularly useful when dealing with a fixed set of values that are somehow related. For instance, when working with coordinates like x and y, tuples are a natural choice because coordinates always consist of two values.
Example:
1 | point = (10, 20) |
While it is technically possible to use a list to store coordinates, it is not ideal. Lists are collections of consecutive items that can change in size. When storing coordinates, it is preferable to have a specific structure that represents the x and y values directly, rather than an arbitrary list.
An important characteristic of tuples is that they are immutable, unlike lists. This immutability allows tuples to be used as keys in dictionaries. Consider the following example, where a dictionary is created with coordinate points as keys:
1 | points = {} |
Output:
1 | monkey |
If we attempt to use lists instead of tuples as keys in the dictionary, it would result in an error:
1 | points = {} |
Output:
1 | TypeError: unhashable type: 'list' |
This error occurs because dictionary keys must be hashable, and mutable built-in types such as lists are not. Tuples, being immutable, can be hashed (as long as the items inside them are hashable too) and used as keys.
In summary, tuples are beneficial for representing fixed sets of related values and can be used as keys in dictionaries due to their immutability.
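A small sketch of both cases. Only the value "monkey" appears in the output above; the coordinates and the other values are assumed for illustration.

```python
points = {}
points[(3, 5)] = "monkey"
points[(5, 0)] = "banana"
points[(1, 2)] = "harpsichord"
print(points[(3, 5)])  # monkey

# A list as a key fails, because lists are not hashable
try:
    points[[3, 5]] = "monkey"
except TypeError as error:
    print("TypeError:", error)
```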
Tuples without parentheses
The parentheses are not strictly necessary when defining tuples. The following two variable assignments are identical in their results:
1 | numbers = (1, 2, 3) |
This means we can also easily return multiple values using tuples. Let’s have a look at the following example:
1 | def minmax(my_list): |
Sample output:
The smallest item is 5 and the greatest item is 312
This function returns two values in a tuple. The return value is assigned to two variables at once:
1 | min_value, max_value = minmax(my_list) |
Using parentheses may make the notation more clear. On the left-hand side of the assignment statement, we also have a tuple, which contains two variable names. The values contained within the tuple returned by the function are assigned to these two variables.
1 | (min_value, max_value) = minmax(my_list) |
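For reference, a minimal version of such a function could look like this. The contents of my_list are an assumption, chosen so that the result matches the sample output above.

```python
def minmax(my_list):
    # The comma builds a tuple, so two values are returned at once
    return min(my_list), max(my_list)


my_list = [33, 5, 21, 7, 88, 312, 5]

# Tuple unpacking: the two items of the returned tuple go to two variables
min_value, max_value = minmax(my_list)
print(f"The smallest item is {min_value} and the greatest item is {max_value}")
```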
You may remember the dictionary method items in the previous section. We used it to access all the keys and values stored in a dictionary:
1 | my_dictionary = {} |
Tuples are at work here, too. The method my_dictionary.items() returns each key-value pair as a tuple, where the first item is the key and the second item is the value.
Another common use case for tuples is swapping the values of two variables:
1 | number1, number2 = number2, number1 |
The assignment statement above swaps the values stored in the variables number1 and number2. The result is identical to what is achieved with the following bit of code, using a helper variable:
1 | helper_var = number1 |
1 | # Write your solution here |
Working with Text Files in Python
Overview
- Python programming often involves reading and writing data to files.
- This provides a simple and effective way to handle large datasets.
- In this guide, we focus only on text files, which contain lines of text.
Text Files vs Word Processor Documents
- Text Files: Simple to handle; used with programs like Visual Studio Code.
- Word Documents: Contain text but aren’t text files. Includes formatting info; complex to handle programmatically.
Reading from Text Files
- Example file used: example.txt, with content:
Hello there!
This example file contains three lines of text.
This is the last line.
Including Files in Python
- Python’s with statement is used to open and handle files.
- It opens the file, allows operations within its block, then automatically closes it.
Example Code
1 | with open("example.txt") as new_file: |
- Output:
Hello there!
This example file contains three lines of text.
This is the last line.
File Handle and read Method
- new_file is a file handle that allows access to the file.
- The read method returns the file contents as a single string.
- String returned by read: "Hello there!\nThis example file contains three lines of text.\nThis is the last line."
Processing Text Files Line-by-Line in Python
read Method vs Line-by-Line Iteration
- read method: returns the entire file content as a single string.
- Line-by-line iteration: more common and flexible; the file handle can be treated like a list of strings (lines).
Iterating Through Lines with a for Loop
- Each string represents a single line in the file.
- Use a for loop to traverse the file.
Code Example: Counting and Printing Lines
1 | with open("example.txt") as new_file: |
- Output:
Line 1 Hello there!
Line 2 This example file contains three lines of text.
Line 3 This is the last line.
Total length of lines: 81
Understanding the Code
- Line breaks (\n) are removed from each line with the replace method.
- replace method: Replaces line break characters with an empty string, thereby allowing accurate calculation of line lengths.
- The for loop counts the lines, prints each line with its line number, and sums the lengths of all lines.
Key Takeaway
- replace Method: Used for manipulating strings; very useful for cleaning data in files.
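To tie read and line-by-line iteration together, here is a self-contained sketch. It first writes example.txt into a temporary directory (an assumption made only so the example runs anywhere), then reads it both ways.

```python
import os
import tempfile

# Create the example file so the sketch does not depend on existing files
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w") as new_file:
    new_file.write("Hello there!\nThis example file contains three lines of text.\nThis is the last line.")

# read returns the whole file as one string, line breaks included
with open(path) as new_file:
    contents = new_file.read()

# Iterating over the file handle gives one line at a time
count = 0
total_length = 0
with open(path) as new_file:
    for line in new_file:
        line = line.replace("\n", "")
        count += 1
        print("Line", count, line)
        total_length += len(line)

print("Total length of lines:", total_length)  # 81
```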
Working with CSV Files in Python
CSV File Basics
- CSV: Comma-separated values.
- A type of text file with data separated by a predetermined character, usually a comma (,) or semicolon (;).
- Used for storing different kinds of records; easy data exchange between systems.
Python split Method
- Splits a string into a list of substrings based on a separator character.
- Used to separate different fields on a line.
Code Example: Processing CSV Data
- Assume the grades.csv file contains student names and their grades, separated by semicolons.
1 | with open("grades.csv") as new_file: |
- Output:
Name: Paul
Grades: ['5', '4', '5', '3', '4', '5', '5', '4', '2', '4']
Name: Beth
Grades: ['3', '4', '2', '4', '4', '2', '3', '1', '3', '3']
Name: Ruth
Grades: ['4', '5', '5', '4', '5', '5', '4', '5', '4', '4']
Understanding the Code
- For each line, remove the line break (\n) and split the line into parts at each semicolon using split.
- The first part is the student’s name, the rest are the grades.
Key Takeaway
- split Method: Powerful tool for parsing and processing structured data in files, such as CSVs.
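A runnable sketch of the same idea. The file contents are reconstructed from the output shown above and written to a temporary directory; converting the grades to integers is an extra step added here, since split always returns strings.

```python
import os
import tempfile

# Recreate grades.csv so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "grades.csv")
with open(path, "w") as new_file:
    new_file.write("Paul;5;4;5;3;4;5;5;4;2;4\n")
    new_file.write("Beth;3;4;2;4;4;2;3;1;3;3\n")
    new_file.write("Ruth;4;5;5;4;5;5;4;5;4;4\n")

grades = {}
with open(path) as new_file:
    for line in new_file:
        parts = line.replace("\n", "").split(";")
        name = parts[0]
        # Everything after the name is a grade; convert each one to int
        grades[name] = [int(grade) for grade in parts[1:]]

print(grades["Paul"])  # [5, 4, 5, 3, 4, 5, 5, 4, 2, 4]
```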
Reading a File Multiple Times in Python
Problem Statement
- Need to process contents of a file more than once in a single program.
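- Attempting to loop over the same file handle twice does not work: after the first pass the file handle rests at the end of the file, so the second loop reads nothing. In the code below this leads to an error, because a variable assigned inside the second loop is never created.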
Incorrect Code
- Tries to read the file twice, leading to an error.
```python
with open("people.csv") as new_file:
    # Print out the names
    for line in new_file:
        parts = line.split(";")
        print("Name:", parts[0])

    # Find the oldest
    age_of_oldest = -1
    for line in new_file:
        parts = line.split(";")
        name = parts[0]
        age = int(parts[1])
        if age > age_of_oldest:
            age_of_oldest = age
            oldest = name
    print("The oldest is", oldest)
```
- Error encountered:
UnboundLocalError: local variable 'oldest' referenced before assignment
Inefficient Solution
- Open and read the file twice.
```python
with open("people.csv") as new_file:
    # Print out the names
    for line in new_file:
        parts = line.split(";")
        print("Name:", parts[0])

with open("people.csv") as new_file:
    # Find the oldest
    age_of_oldest = -1
    for line in new_file:
        parts = line.split(";")
        name = parts[0]
        age = int(parts[1])
        if age > age_of_oldest:
            age_of_oldest = age
            oldest = name
    print("The oldest is", oldest)
```
- Cons: Unnecessary repetition, inefficiency as the file is read twice.
Efficient Solution
- Read the file once, store its contents in a suitable data structure for future use.
```python
people = []
with open("people.csv") as new_file:
    for line in new_file:
        parts = line.split(";")
        people.append((parts[0], int(parts[1]), parts[2]))

# Print out the names
for person in people:
    print("Name:", person[0])

# Find the oldest
age_of_oldest = -1
for person in people:
    name = person[0]
    age = person[1]
    if age > age_of_oldest:
        age_of_oldest = age
        oldest = name
print("The oldest is", oldest)
```
- Pros: More efficient, as the file is only read once. Contents are stored in a list for further processing.
Key Takeaway
- File Reading Strategy: Avoid reading the same file multiple times, instead store contents in memory for efficient processing.
Processing CSV Files in Python
Context
- Working with the file grades.csv, which contains student names and their grades.
- Aim: Create a dictionary grades where keys are student names and values are lists of grades.
Reading and Storing Data
1 | grades = {} |
- Output:
{'Paul': [5, 4, 5, 3, 4, 5, 5, 4, 2, 4], 'Beth': [3, 4, 2, 4, 4, 2, 3, 1, 3, 3], 'Ruth': [4, 5, 5, 4, 5, 5, 4, 5, 4, 4]}
Compute and Display Statistics
- Compute best grade and average grade for each student.
```python
for name, grade_list in grades.items():
    best = max(grade_list)
    average = sum(grade_list) / len(grade_list)
    print(f"{name}: best grade {best}, average {average:.2f}")
```
- Output:
Paul: best grade 5, average 4.10
Beth: best grade 4, average 2.90
Ruth: best grade 5, average 4.50
Key Concepts
- Dictionary in Python: Powerful data structure for storing key-value pairs. In this case, used for mapping student names to their grades.
- File Processing: Used the with open statement to read the CSV file line by line, then split each line into parts and stored them in the dictionary.
- Statistics Calculation: Computed the maximum (best grade) and average grade for each student using built-in Python functions max() and sum().
Note
This technique is applicable for processing many different types of data contained in files, not just for grade lists.
CSV File Processing in Python with Whitespace Handling
Context
- A CSV file people.csv with unnecessary white spaces and line breaks, exported from Excel.
- Each line contains a first and last name separated by a semicolon, and extra spaces.
Before Cleanup
When we read from the CSV file and print the names without using any strip functions, we see extra spaces and line breaks in the output.
1 | last_names = [] |
Sample output:
1 | [' Python\n', ' Java\n', ' Haskell'] |
Reading and Storing Data (After Cleanup)
1 | last_names = [] |
- Output:
['Python', 'Java', 'Haskell']
Key Concepts
- Whitespace Removal: Utilizing the strip() method to remove unnecessary white spaces from the beginning and end of a string. In the example above, the strip method is used to remove the leading and trailing white spaces from the last names.
Other Whitespace Handling Techniques
- lstrip(): Removes leading white spaces (from the left).
- rstrip(): Removes trailing white spaces (from the right).
Demonstration:
1 | " teststring ".rstrip() |
Note
strip(), lstrip(), and rstrip() are powerful string methods that can be particularly useful when cleaning up data in Python. Their usage can be extended beyond the given example to handle data from various sources which might not always be cleanly formatted.
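The three methods side by side, as a short sketch. repr is used so the remaining spaces are visible; the sample line "Paul; Python\n" is made up for illustration.

```python
text = "   teststring  "
print(repr(text.strip()))   # 'teststring'
print(repr(text.lstrip()))  # 'teststring  '
print(repr(text.rstrip()))  # '   teststring'

# Cleaning one field of a CSV line: split first, then strip the part you need
line = "Paul; Python\n"
last_name = line.split(";")[1].strip()
print(last_name)  # Python
```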
Combining Data from Different Files in Python
In many scenarios, the data required for processing by a program could be scattered across multiple files. Here’s an example that illustrates how you can connect data from multiple CSV files using a common identifier.
Context
- The personal details of a company’s employees are stored in a file employees.csv.
- The employees’ salaries are stored in another file salaries.csv.
- Each data line in both files contains a Personal Identity Code (PIC) that can be used as a common identifier.
Sample Data
employees.csv
1 | pic;name;address;city |
salaries.csv
1 | pic;salary;bonus |
Code
1 | names = {} |
1 | """First the program produces the dictionaries **names** and salaries. They have the following contents: |
Output
1 | incomes: |
Key Concepts
- Dictionaries: The program uses two dictionaries, names and salaries, which use the Personal Identity Code (PIC) as the key.
- Data Linking: The PIC is used as a common identifier to link the employee’s name from employees.csv to their corresponding salary in salaries.csv.
Note
- The program also handles cases where an employee’s PIC is not present in the salary file: it prints 0 euros as the salary for those employees.
- The order in which items are stored in the dictionaries doesn’t matter here, because values are looked up by key rather than by position.
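The linking step can be sketched as below. The header rows match the sample data above, but every data row (PIC codes, names, addresses, salaries) is invented for illustration, and the files are written to a temporary directory so the example is self-contained.

```python
import os
import tempfile

directory = tempfile.mkdtemp()
employees_path = os.path.join(directory, "employees.csv")
salaries_path = os.path.join(directory, "salaries.csv")

# Invented sample data; PIC-003 deliberately has no salary row
with open(employees_path, "w") as new_file:
    new_file.write("pic;name;address;city\n")
    new_file.write("PIC-001;Alice Example;1 Sample Street;Helsinki\n")
    new_file.write("PIC-002;Bob Example;2 Sample Street;Espoo\n")
    new_file.write("PIC-003;Carol Example;3 Sample Street;Vantaa\n")
with open(salaries_path, "w") as new_file:
    new_file.write("pic;salary;bonus\n")
    new_file.write("PIC-001;3000;200\n")
    new_file.write("PIC-002;2500;0\n")

names = {}
with open(employees_path) as new_file:
    for line in new_file:
        parts = line.strip().split(";")
        if parts[0] == "pic":  # skip the header row
            continue
        names[parts[0]] = parts[1]

salaries = {}
with open(salaries_path) as new_file:
    for line in new_file:
        parts = line.strip().split(";")
        if parts[0] == "pic":  # skip the header row
            continue
        salaries[parts[0]] = int(parts[1]) + int(parts[2])

print("incomes:")
for pic, name in names.items():
    # get returns 0 when the PIC is missing from the salary file
    income = salaries.get(pic, 0)
    print(f"{name:16} {income} euros")
```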
```python
inpt = int(input("Layers: "))
alphabet = [
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
    'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'
]

num_char = inpt * 2 - 1
my_string = alphabet[inpt - 1] * num_char
splice_index_start = 0
splice_index_end = num_char
in_char_counter = num_char - 2

my_list = []
my_list.append(my_string)
print(my_string)

for i in range(1, inpt):
    holder_string_start = my_string[:i]
    holder_string_end = my_string[num_char - i:]
    my_string = holder_string_start + (alphabet[inpt - i - 1] * in_char_counter) + holder_string_end
    my_list.append(my_string)
    in_char_counter -= 2
    print(my_string)

for line in my_list[::-1][1:]:  # reverse, then skip first line
    print(line)
```
1 | # write your solution here |
my condensed revised ver.
1 | exercises = {} |
1 | def grade(points): |
1 | # write your solution here |
1 | my_input = input() |
Write some functions for working on a file containing location data from the stations for city bikes in Helsinki.
Each file will follow this format:
Longitude;Latitude;FID;name;total_slot;operative;id
24.950292890004903;60.155444793742276;1;Kaivopuisto;30;Yes;001
24.956347471358754;60.160959093887129;2;Laivasillankatu;12;Yes;002
24.944927399779715;60.158189199971673;3;Kapteeninpuistikko;16;Yes;003
Each station has a single line in the file. The line contains the coordinates, name, and other identifying information for the station.
Distance between stations
First, write a function named get_station_data(filename: str). This function should read the names and locations of all the stations in the file, and return them in a dictionary with the following format:
Sample output
{
"Kaivopuisto": (24.950292890004903, 60.155444793742276),
"Laivasillankatu": (24.956347471358754, 60.160959093887129),
"Kapteeninpuistikko": (24.944927399779715, 60.158189199971673)
}
Dictionary keys are the names of the stations, and the value attached is a tuple containing the location coordinates of the station. The first element in the tuple is the Longitude field, and the second is the Latitude field.
Next, write a function named distance(stations: dict, station1: str, station2: str), which returns the distance between the two stations given as arguments.
The distance is calculated using the Pythagorean theorem. The multiplication factors below are approximate values for converting latitudes and longitudes to distances in kilometres in the Helsinki region.
# we will need the function sqrt from the math module
import math
x_km = (longitude1 - longitude2) * 55.26
y_km = (latitude1 - latitude2) * 111.2
distance_km = math.sqrt(x_km**2 + y_km**2)
Some examples of the function in action:
stations = get_station_data('stations1.csv')
d = distance(stations, "Designmuseo", "Hietalahdentori")
print(d)
d = distance(stations, "Viiskulma", "Kaivopuisto")
print(d)
Sample output
0.9032737292463177
0.7753594392019532
NB: If Visual Studio can’t find the file and you have checked that there are no spelling errors, take a look at these instructions.
The greatest distance
Please write a function named greatest_distance(stations: dict), which works out the two stations on the list with the greatest distance from each other. The function should return a tuple, where the first two elements are the names of the two stations, and the third element is the distance between the two.
stations = get_station_data('stations1.csv')
station1, station2, greatest = greatest_distance(stations)
print(station1, station2, greatest)
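One way the three functions could be written is sketched below. This is my own attempt, not the model solution. The three data rows are the sample rows from the format description above; the file name stations_sample.csv and the temporary directory are assumptions so that the sketch runs without the real stations1.csv.

```python
import math
import os
import tempfile


def get_station_data(filename: str) -> dict:
    stations = {}
    with open(filename) as new_file:
        for line in new_file:
            parts = line.strip().split(";")
            if parts[0] == "Longitude":  # skip the header row
                continue
            # name -> (longitude, latitude)
            stations[parts[3]] = (float(parts[0]), float(parts[1]))
    return stations


def distance(stations: dict, station1: str, station2: str) -> float:
    longitude1, latitude1 = stations[station1]
    longitude2, latitude2 = stations[station2]
    x_km = (longitude1 - longitude2) * 55.26
    y_km = (latitude1 - latitude2) * 111.2
    return math.sqrt(x_km**2 + y_km**2)


def greatest_distance(stations: dict) -> tuple:
    best = ("", "", 0.0)
    names = list(stations)
    # Compare every pair of stations once
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = distance(stations, first, second)
            if d > best[2]:
                best = (first, second, d)
    return best


path = os.path.join(tempfile.mkdtemp(), "stations_sample.csv")
with open(path, "w") as new_file:
    new_file.write("Longitude;Latitude;FID;name;total_slot;operative;id\n")
    new_file.write("24.950292890004903;60.155444793742276;1;Kaivopuisto;30;Yes;001\n")
    new_file.write("24.956347471358754;60.160959093887129;2;Laivasillankatu;12;Yes;002\n")
    new_file.write("24.944927399779715;60.158189199971673;3;Kapteeninpuistikko;16;Yes;003\n")

stations = get_station_data(path)
print(distance(stations, "Kaivopuisto", "Laivasillankatu"))
station1, station2, greatest = greatest_distance(stations)
print(station1, station2, greatest)
```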
1 | program which allows the user to search for recipes based on their names, preparation times, or ingredients used. The program should read the recipes from a file submitted by the user. |
1 | def open_file(filename: str) -> list: |
Writing Data to Files
We can create a new file every time we want to write data to a file, but we can also append new data to the end of an existing file. In both cases, we use the open function from the previous section. For writing files, the function requires a second argument.
Creating a New File
If you want to create a new file, you would call the open function with the additional argument "w" to signify that the file should be opened in write mode. So, the function call could look like this:
1 | with open("new_file.txt", "w") as my_file: |
NB: If the file already exists, all the contents will be overwritten. It’s important to be very careful when creating new files.
With the file open, you can write data to it. You can use the write method, which takes the string that is to be written as its argument.
1 | with open("new_file.txt", "w") as my_file: |
When you execute the program, a new file named new_file.txt will appear in the directory. The contents would look like this:
Sample data:
1 | Hello there! |
If you want line breaks in the file, you will have to add them manually. The write method doesn’t work exactly like the more familiar print function, although they are similar. So, the following program:
1 | with open("new_file.txt", "w") as my_file: |
Would result in a file with these contents:
Sample data:
1 | Hello there!This is the second lineThis is the last line |
Line breaks are achieved by adding new line characters (\n) to the argument strings:
1 | with open("new_file.txt", "w") as my_file: |
Now the contents of new_file.txt would look like this:
Sample data:
1 | Hello there! |
Appending Data to an Existing File
To append data to the end of an existing file, you can open the file in append mode by passing "a" as the second argument to the open() function.
1 | with open("new_file.txt", "a") as my_file: |
Appending data to a file allows you to add new content without overwriting the existing contents. If the file doesn’t exist, it will be created. However, it’s important to note that appending data to files is not a common practice in programming. In most cases, files are read, processed, and overwritten entirely when needed.
Remember to include the appropriate newline characters (\n) if you want to separate the appended content into separate lines within the file.
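Write mode and append mode can be compared in one short sketch. The file is created in a temporary directory (an assumption, so nothing in the working directory is overwritten), and the appended line is made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "new_file.txt")

# "w" creates the file, or overwrites it if it already exists
with open(path, "w") as my_file:
    my_file.write("Hello there!\n")
    my_file.write("This is the second line\n")
    my_file.write("This is the last line\n")

# "a" keeps the existing contents and adds to the end
with open(path, "a") as my_file:
    my_file.write("This line was appended\n")

with open(path) as my_file:
    contents = my_file.read()
print(contents)
```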
Writing CSV Files
Writing CSV files line by line
CSV files can be written line by line using the write method, similar to writing any other file. Each line in the CSV file represents a record, with fields separated by a specific delimiter, such as a semicolon or a comma.
Here’s an example that creates a file called coders.csv and writes programmer data to it:
1 | with open("coders.csv", "w") as my_file: |
Executing this program would result in a CSV file with the following contents:
1 | Eric;Windows;Pascal;10 |
Writing CSV files from a list
If the data to be written is stored in a list in computer memory, you can iterate over the list and construct the lines using an f-string:
1 | coders = [ |
If the list contains a large number of items, building the string manually may become cumbersome. In such cases, you can use nested loops to construct the line:
1 | with open("coders.csv", "w") as my_file: |
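Both variants are sketched below. Only the first row (Eric;Windows;Pascal;10) comes from the file contents shown above; the other rows are assumed sample data, and the file goes into a temporary directory.

```python
import os
import tempfile

coders = [
    ["Eric", "Windows", "Pascal", 10],
    ["Matt", "Linux", "PHP", 2],
    ["Alan", "Linux", "Java", 17],
]
path = os.path.join(tempfile.mkdtemp(), "coders.csv")

# Variant 1: build each line with an f-string
with open(path, "w") as my_file:
    for coder in coders:
        line = f"{coder[0]};{coder[1]};{coder[2]};{coder[3]}"
        my_file.write(line + "\n")

# Variant 2: nested loops, which also work when a row has many fields
with open(path, "w") as my_file:
    for coder in coders:
        line = ""
        for value in coder:
            line += f"{value};"
        line = line[:-1]  # drop the trailing separator
        my_file.write(line + "\n")

with open(path) as my_file:
    written = my_file.read()
print(written)
```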
Clearing file contents and deleting files
To clear the contents of an existing file, you can open it in write mode and close it immediately. This can be achieved using a pass statement within a with block:
1 | with open("file_to_be_cleared.txt", "w") as my_file: |
Alternatively, you can use a one-liner to bypass the with block:
1 | open("file_to_be_cleared.txt", "w").close() |
To delete a file entirely, you can use the os module:
1 | import os |
This will delete the file called “unnecessary_file.csv” from the filesystem.
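Clearing and deleting can be checked with a throwaway file. This sketch assumes a temporary directory and some made-up file contents, so no real file is touched.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "unnecessary_file.csv")
with open(path, "w") as my_file:
    my_file.write("some;data\n")

# Opening in write mode and doing nothing empties the file
with open(path, "w") as my_file:
    pass
size_after_clearing = os.path.getsize(path)
print(size_after_clearing)  # 0

# os.remove deletes the file itself
os.remove(path)
print(os.path.exists(path))  # False
```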
1 | The file solutions.csv contains some solutions to mathematics problems: |
1 | def filter_solutions(): |
1 | Kirka;79-15;22 |
Here is a summary of the typical errors in Python and how they can be handled:
- ValueError: This occurs when an argument passed to a function is invalid. For instance, calling float("1,23") will raise a ValueError because in Python, decimals are written with a period, not a comma.
- TypeError: This error arises when an operation is applied to an object of inappropriate type. For instance, calling len(10) results in a TypeError because len expects an object that has a length (like a string or list), but receives an integer.
- IndexError: This occurs when trying to access an index that doesn’t exist in a sequence. For instance, "abc"[5] will raise an IndexError because there’s no element at index 5 in the string "abc".
- ZeroDivisionError: This error is raised when there is an attempt to divide by zero. For example, calculating the mean of a list using sum(my_list) / len(my_list) will raise a ZeroDivisionError if the list is empty.
- File Handling Exceptions:
a. FileNotFoundError: Raised when trying to access a file that doesn’t exist.
b. io.UnsupportedOperation: Occurs when an operation is not supported in the mode the file is opened in.
c. PermissionError: Raised when the program doesn’t have the necessary permissions to access the file.
- Handling Multiple Exceptions: Python allows handling multiple exceptions using more than one except block attached to a try block. For instance, you can separately handle FileNotFoundError and PermissionError by using two different except blocks.
- Generic Exception Handling: Sometimes it might not be necessary to know the specific error. In such cases, a generic except block can be used which doesn’t specify the error. However, it’s usually good practice to specify the exception type, as generic handling can mask the real issue.
- Passing Exceptions: If a function raises an exception and it’s not handled within that function, the exception is passed to the calling code. This continues up the call stack until it’s either handled or causes the program to exit.
- Raising Exceptions: Exceptions can be explicitly triggered using the raise statement. This is helpful, for example, when validating input parameters in a function. Raising an exception can signal that something is wrong, which is particularly useful when the function is called from elsewhere.
Here’s an example where an exception is raised explicitly:
1 | def factorial(n): |
In this example, the factorial function raises a ValueError if the input number is negative.
This summary covers the typical errors you may encounter in Python and how they can be handled or raised intentionally. Understanding these errors and handling mechanisms is crucial for writing robust and error-resistant code.
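A few of these mechanisms in one sketch. The body of factorial is my own guess at the idea described above, and the helper names read_first_line and to_float, the error message text and the missing file path are all assumptions for illustration.

```python
import os
import tempfile


def factorial(n):
    # Validate the parameter and signal a problem to the caller
    if n < 0:
        raise ValueError("The input was negative: " + str(n))
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def read_first_line(filename: str) -> str:
    # Two except blocks handle two different file errors separately
    try:
        with open(filename) as new_file:
            return new_file.readline().strip()
    except FileNotFoundError:
        return "file not found"
    except PermissionError:
        return "no permission"


def to_float(text: str):
    # float raises ValueError for input such as "1,23"
    try:
        return float(text)
    except ValueError:
        return None


print(factorial(5))  # 120
try:
    factorial(-1)
except ValueError as error:
    print("Error:", error)

missing_path = os.path.join(tempfile.mkdtemp(), "missing.txt")
print(read_first_line(missing_path))  # file not found
print(to_float("1,23"))               # None
```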
1 | def filter_incorrect(): |
Variable Scope in Python
The scope of a variable refers to the sections of a program where a variable is accessible. In Python, variables can have local or global scope.
Local Variables
Variables defined within a function have local scope, meaning they are only accessible within that function. This includes function parameters and other variables defined within the function. Local variables do not exist outside the function.
In the following example, the variable x is defined within the testing function:
1 | def testing(): |
Here, x is only accessible within the testing function. Trying to access x outside the function results in an error.
Global Variables
Variables defined outside any function, typically in the main section of the program, have global scope. Global variables can be accessed from any part of the program, including other functions.
1 | def testing(): |
In this example, x is a global variable defined in the main section of the program. It can be accessed and used within the testing function.
However, a global variable cannot be reassigned from within a function unless it is declared there with the global keyword:
1 | def testing(): |
Here, the testing function creates a new local variable x that masks the global variable. The local variable x has a value of 5, but it is a separate variable from the global x.
To modify the global variable within a function, you need to use the global keyword:
1 | def testing(): |
By using global x within the testing function, the assignment x = 3 affects the global variable x as well.
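The difference between masking and modifying can be seen in a small sketch. The function names masking and changing and the values are assumptions chosen for illustration.

```python
x = 3


def masking():
    x = 5      # creates a new local x; the global x is untouched
    return x


def changing():
    global x
    x = 10     # now the assignment targets the global x


print(masking(), x)  # 5 3
changing()
print(x)             # 10
```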
When to Use Global Variables
Global variables should be used judiciously and not as a way to bypass function parameters or return values. It is generally better to use function parameters and return values to pass data between functions.
Global variables are useful in situations where you need to have common information available to multiple functions throughout the program. For example:
1 | def calculate_sum(a, b): |
Here, the global variable count keeps track of how many times the functions calculate_sum and calculate_difference are called.
However, it’s important to use global variables sparingly and consider other alternatives, such as function parameters and return values, whenever possible. Overusing global variables can make it difficult to track program state and lead to unpredictable behavior.
Passing data between functions is best achieved through function arguments and return values.
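As a rough illustration, the call counter could be handled without any global variable. This is a sketch under my own assumptions: the extra count parameter and the tuple return value are not from the course material.

```python
def calculate_sum(a, b, count):
    # Return the result together with the updated call count
    return a + b, count + 1


def calculate_difference(a, b, count):
    return a - b, count + 1


count = 0
total, count = calculate_sum(2, 3, count)
difference, count = calculate_difference(5, 2, count)
print(total, difference, count)  # 5 3 2
```

Everything each function needs comes in through its parameters, so the program state is visible at the call site.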
Debugging Methods in Python
Recap of Debugging Methods
- Visualization tools and debugging print outs are common methods.
- Visual Studio Code built-in debugger is an effective tool. Problems with file location are covered in the previous section.
Introduction to Breakpoint Command
- Python version 3.7 introduced the breakpoint() command for debugging.
- The command halts the program execution at the point where it is inserted.
- An interactive console opens upon halting, enabling the user to experiment with the code.
Use Cases and Instructions
- The command is useful in identifying the cause of an error in a particular line.
- Execution can be resumed using the command continue, or c, in the debugging console.
- Other commands for the console can be found through the help command.
- The exit command concludes the program execution.
- Users must remember to remove breakpoint commands after debugging.
Python Modules
Introduction to Modules
- Python’s language definition includes useful functions, but more complex programs often require additional functionalities provided by the Python standard library.
- The standard library consists of modules, each containing functions and classes around different themes.
- The import command allows the use of a given module’s contents in the current program.
Using the Math Module
- The math module provides functions for mathematical operations.
- Functions in a module are referred to by prefixing them with the module name (e.g., math.sqrt).
Importing Specific Module Sections
- Select parts of a module can be imported using the from command, which eliminates the need for prefixing.
- The star notation imports all contents of a module.
- The star notation can be handy in testing and small projects but may also pose problems.
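The two import styles in a short sketch, with the hypotenuse calculation as the example (the 3-4-5 triangle is just a convenient assumption):

```python
import math

# Prefixed style: the module name shows where sqrt comes from
print(math.sqrt(3**2 + 4**2))  # 5.0

from math import sqrt, log

# Imported names can be used without the prefix
print(sqrt(16))  # 4.0
print(log(1))    # 0.0
```

With the star notation every name in the module is imported this way, which is where the problems come from: an imported name can silently replace a function of the same name defined elsewhere in the program.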
Programming Exercises
- An exercise on calculating the hypotenuse of a triangle using the math module.
- Another exercise on separating different character types using the string module.
- A third exercise on creating fractions using the fractions module.
Understanding Module Contents
- Python documentation provides detailed resources on each module.
- The dir function lists all names defined by a module.
- The names can represent classes, constant values, or functions.
Randomness
Learning objectives
After this section, you will be able to:
- Understand and utilize the functions in the random module
- Generate random numbers in your programs
- Shuffle data structures using the shuffle function
- Pick random items from a data structure using the choice function
- Generate unique sets of random numbers
Generating a random number
The random module in Python’s standard library provides tools for generating random numbers and implementing other randomized functionality.
To generate a random integer value between a and b (inclusive), you can use the randint(a, b) function. For example:
1 | from random import randint |
Output:
1 | The result of the throw: 4 |
You can also generate multiple random numbers by using a loop:
1 | from random import randint |
More randomizing functions
The random module provides other functions for randomizing data structures. The shuffle function shuffles a list in-place:
1 | from random import shuffle |
Output:
1 | ['banana', 'atlas', 'carrot'] |
The choice function returns a randomly selected item from a data structure:
1 | from random import choice |
Output:
1 | 'carrot' |
Lottery numbers
Generating lottery numbers involves selecting a set of unique random numbers within a specified range. Here are a few approaches to achieve this:
Approach 1: List and loop
1 | from random import randint |
Approach 2: Shuffle and slice
1 | from random import shuffle |
Approach 3: Sample function
1 | from random import sample |
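The three approaches can be sketched side by side. The draw size (7 numbers) and the range (1 to 40) are assumptions for the example, and the variable names are hypothetical.

```python
from random import randint, sample, shuffle

# Approach 1: draw numbers in a loop and skip duplicates
weekly_draw = []
while len(weekly_draw) < 7:
    new_number = randint(1, 40)
    if new_number not in weekly_draw:
        weekly_draw.append(new_number)

# Approach 2: shuffle the whole pool and slice off the first 7
number_pool = list(range(1, 41))
shuffle(number_pool)
shuffled_draw = number_pool[0:7]

# Approach 3: sample picks 7 unique items directly
sampled_draw = sample(range(1, 41), 7)

print(weekly_draw)
print(shuffled_draw)
print(sampled_draw)
```

Approach 3 is the shortest and avoids both the duplicate check and shuffling the whole pool.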
True randomness
The random module in Python generates pseudorandom numbers, which are not truly random but rather based on an algorithm and a seed value. To reproduce the same sequence of pseudorandom numbers on every run, you can set the seed value using the seed function:
1 | from random import randint, seed |
For true randomness, external sources such as background radiation or noise levels are used to generate the seed value.
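A minimal sketch of the effect of seeding. The seed value 1337 and the range are arbitrary choices for the example.

```python
from random import randint, seed

# Seeding with the same value restarts the same pseudorandom sequence
seed(1337)
first_run = [randint(1, 100) for i in range(5)]

seed(1337)
second_run = [randint(1, 100) for i in range(5)]

print(first_run == second_run)  # True
```

This is mainly useful for testing, where a program that uses random numbers needs to behave the same way on every run.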
Programming exercise: Password generator
You can use the random module to create a password generator. Here’s an example of generating passwords consisting of lowercase characters ‘a’ to ‘z’:
1 | from random import choice |
Output:
1 | lttehepy |
The choice function is used to randomly select a character from the lowercase alphabet. The generated password length is specified as an argument to the generate_password function.
Remember, this is a simple example using only lowercase characters. You can extend it to include uppercase letters, digits, and special characters based on your requirements.
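One way the description above could look in code. This is a sketch, not necessarily the original solution: only the function name generate_password comes from the text, and using string.ascii_lowercase for the alphabet is my assumption.

```python
from random import choice
from string import ascii_lowercase

def generate_password(length: int) -> str:
    # Pick one random lowercase letter per position and join them
    password = ""
    for i in range(length):
        password += choice(ascii_lowercase)
    return password

print(generate_password(8))
```

For real credentials the standard library's secrets module is the safer choice, since random is not designed for security purposes.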
Code
1 | from random import choice, shuffle |
1 | Programming exercise: |
1 | from random import sample |
Using startswith() (Python strings have no beginswith() method)
1 | import random |
startswith()
The datetime object
The Python datetime module provides functionalities for working with dates and times. One of the key components of this module is the datetime object, which represents a specific date and time.
Obtaining the current date and time
To get the current date and time, you can use the datetime.now() function:
1 | from datetime import datetime |
Output:
1 | 2023-06-19 12:30:45.123456 |
Creating a datetime object
You can also create a datetime object for a specific date and time by providing the year, month, and day, optionally followed by the hour, minute, second, and microsecond values:
1 | from datetime import datetime |
Output:
1 | 2021-12-24 18:30:00 |
If you don’t provide the time components, the default time will be set to midnight (00:00:00).
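A short sketch of both cases, with and without time components. The first date matches the sample output above; the second date and the variable names are hypothetical.

```python
from datetime import datetime

# Year, month and day are required; the time components are optional
christmas_eve = datetime(2021, 12, 24, 18, 30)
print(christmas_eve)  # 2021-12-24 18:30:00

# Without time components the time defaults to midnight
christmas_day = datetime(2021, 12, 25)
print(christmas_day)  # 2021-12-25 00:00:00
```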
Accessing datetime components
You can access the individual components of a datetime object using the corresponding attributes:
1 | from datetime import datetime |
Output:
1 | Day: 24 |
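A sketch of attribute access, assuming the same date as in the previous section. The variable name my_time is hypothetical.

```python
from datetime import datetime

my_time = datetime(2021, 12, 24, 18, 30)

print("Day:", my_time.day)      # Day: 24
print("Month:", my_time.month)  # Month: 12
print("Year:", my_time.year)    # Year: 2021
print("Hour:", my_time.hour)    # Hour: 18
```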
Comparing datetime objects
Datetime objects can be compared using the standard comparison operators, such as <, >, ==, etc. This allows you to check if one date is before, after, or equal to another date:
1 | from datetime import datetime |
Output:
1 | It is not yet Midsummer |
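A sketch of such a comparison. Both dates are assumptions inferred from the sample outputs in these notes, and a fixed "today" is used instead of datetime.now() so that the result is reproducible.

```python
from datetime import datetime

# Fixed dates keep the output reproducible; a real program would use datetime.now()
today = datetime(2023, 6, 19)
midsummer = datetime(2023, 6, 21)

if today < midsummer:
    print("It is not yet Midsummer")
elif today == midsummer:
    print("Happy Midsummer!")
else:
    print("Midsummer has passed")
```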
Calculating the difference between datetime objects
You can calculate the difference between two datetime objects using the subtraction operator. The result is a timedelta object representing the time difference:
1 | from datetime import datetime |
Output:
1 | Midsummer is 2 days away |
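A minimal sketch of the subtraction, again assuming fixed dates chosen to match the sample output.

```python
from datetime import datetime

today = datetime(2023, 6, 19)
midsummer = datetime(2023, 6, 21)

# Subtracting two datetime objects yields a timedelta
difference = midsummer - today
print("Midsummer is", difference.days, "days away")  # Midsummer is 2 days away
```

The days attribute holds only whole days. Any remaining hours, minutes and seconds are available through difference.seconds.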
Performing arithmetic operations with datetime objects
You can perform arithmetic operations involving datetime and timedelta objects. Adding a timedelta object to a datetime object results in a new datetime object:
1 | from datetime import datetime, timedelta |
Output:
1 | A week after Midsummer it will be 2023-06-28 00:00:00 |
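A sketch of the addition, assuming Midsummer falls on 2023-06-21 as the sample output implies. The variable names are hypothetical.

```python
from datetime import datetime, timedelta

midsummer = datetime(2023, 6, 21)

# A timedelta represents a length of time, here seven days
one_week = timedelta(days=7)
week_from_date = midsummer + one_week

print("A week after Midsummer it will be", week_from_date)
# A week after Midsummer it will be 2023-06-28 00:00:00
```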
Conclusion
The datetime object in Python’s datetime module allows you to work with dates and times effectively. You can obtain the current date and time, create custom datetime objects, compare dates, calculate differences, and perform arithmetic operations. Understanding and utilizing the datetime object is essential for working with time-related data and operations in Python.
Programming Exercise: How old
Please write a program that asks the user for their date of birth and then prints out how old the user was on the eve of the new millennium. The program should ask for the day, month, and year separately and print out the age in days. Please refer to the examples below:
Sample Output:
1 | Day: 10 |
Sample Output:
1 | Day: 28 |
You may assume that all day-month-year combinations given as arguments will be valid dates. That is, there will not be a date like February 31st.
1 | # Write your solution here |
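The core calculation of this exercise can be sketched as a helper function. The function name and the example birthday are hypothetical, and I am assuming that "the eve of the new millennium" means 31 December 1999. Reading the input from the user is left out.

```python
from datetime import datetime

def age_in_days_at_millennium(day: int, month: int, year: int) -> int:
    birthday = datetime(year, month, day)
    millennium_eve = datetime(1999, 12, 31)
    # Subtraction gives a timedelta; its days attribute is the age in days
    return (millennium_eve - birthday).days

print(age_in_days_at_millennium(10, 9, 1979))  # 7417
```

A negative result means the person was born after the eve of the new millennium, which a full solution would need to handle separately.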
Programming exercise: Valid PIC?
In this exercise you will validate Finnish Personal Identity Codes (PIC).
Please write a function named is_it_valid(pic: str), which returns True or False based on whether the PIC given as an argument is valid or not. Finnish PICs follow the format ddmmyyXyyyz, where ddmmyy contains the date of birth, X is the marker for century, yyy is the personal identifier, and z is a control character.
The program should check the validity by these three criteria:
- The first half of the code is a valid, existing date in the format ddmmyy.
- The century marker is either + (1800s), - (1900s), or A (2000s).
- The control character is valid.
The control character is calculated by taking the nine-digit number formed by the date of birth and the personal identifier, dividing it by 31, and selecting the character at the index given by the remainder from the string 0123456789ABCDEFHJKLMNPRSTUVWXY. For example, if the remainder was 12, the control character would be C.
More examples and explanations of the uses of the PIC are available at the Digital and Population Data Services Agency.
NB! Please make sure you do not share your own PIC, for example in the code you use for testing or through the course support channels.
Here are some valid PICs you can use for testing:
- 230827-906F
- 120488+246L
- 310823A9877
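The three criteria can be sketched as follows. This is one possible approach under the rules stated above, not the course's model solution; the helper variable names and the length check are my own.

```python
from datetime import datetime

def is_it_valid(pic: str) -> bool:
    if len(pic) != 11:
        return False

    centuries = {"+": 1800, "-": 1900, "A": 2000}
    date_part, marker = pic[0:6], pic[6]
    identifier, control = pic[7:10], pic[10]

    if marker not in centuries:
        return False
    if not (date_part + identifier).isdecimal():
        return False

    # datetime raises ValueError for dates that do not exist
    day, month = int(date_part[0:2]), int(date_part[2:4])
    year = centuries[marker] + int(date_part[4:6])
    try:
        datetime(year, month, day)
    except ValueError:
        return False

    # The remainder of the nine-digit number selects the control character
    control_characters = "0123456789ABCDEFHJKLMNPRSTUVWXY"
    remainder = int(date_part + identifier) % 31
    return control == control_characters[remainder]

print(is_it_valid("230827-906F"))  # True
print(is_it_valid("230827-906G"))  # False
```

Letting datetime do the date check avoids writing separate rules for month lengths and leap years.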
1 | Programming exercise: |
1 | # Write your solution here |
Model Code…
1 | from datetime import datetime, timedelta |
This section explains handling dates and times in Python, and it introduces several key functionalities:
- datetime object: The Python datetime module includes the now function, which returns a datetime object containing the current date and time.
1 | from datetime import datetime |
You can also create the datetime object yourself:
1 | from datetime import datetime |
- Accessing elements of datetime object: You can access different elements of a datetime object, like the day, month, and year.
1 | from datetime import datetime |
- Time of day: You can also specify the time of day when creating a datetime object.
1 | from datetime import datetime |
- Comparison and calculation of differences between datetime objects: The familiar comparison operators also work on datetime objects.
1 | from datetime import datetime |
The difference between two datetime objects can be calculated simply with the subtraction operator:
1 | from datetime import datetime |
- Datetime and timedelta: Addition is available between datetime and timedelta objects.
1 | from datetime import datetime, timedelta |
- strftime method: The strftime method allows you to format the string representation of a datetime object.
1 | from datetime import datetime |
- strptime function: The strptime function creates a datetime object by parsing a string, for example one given by the user, according to a given format.
1 | from datetime import datetime |
1 | Notation Significance |
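A sketch of both directions, formatting and parsing. The date and the format strings are example choices; %d, %m, %Y, %H and %M are standard format codes for day, month, four-digit year, hour and minute.

```python
from datetime import datetime

# strftime: datetime -> string, using format codes such as %d, %m, %Y
my_time = datetime(2021, 12, 24, 18, 30)
print(my_time.strftime("%d.%m.%Y"))        # 24.12.2021
print(my_time.strftime("%d/%m/%Y %H:%M"))  # 24/12/2021 18:30

# strptime: string -> datetime; the format must match the input exactly
parsed = datetime.strptime("24.12.2021", "%d.%m.%Y")
print(parsed)  # 2021-12-24 00:00:00
```

If the string does not match the format, strptime raises a ValueError, which is handy for validating user input.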
1 | # Write your solution here |
1 | [{"week":7,"exercises":[17,13,13,8,6,5,11],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"59f883227655fe0034b4dfe5","year":2017,"term":"syksy","fullName":"Ohjelmistotuotanto","name":"ohtus17","url":"https://github.com/mluukkai/ohjelmistotuotanto2017/wiki/Ohjelmistotuotanto-syksy-2017","__v":7},{"week":8,"exercises":[6,14,19,22,21,21,23,23],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5a576ac24d91600059c09180","year":1970,"term":"Unknown term","fullName":"Full stack -websovelluskehitys","name":"fs","url":"https://fullstack-hy.github.io","__v":9},{"week":8,"exercises":[6,14,19,22,21,21,23,23],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5a7f50aa9b73740051c69898","year":2018,"term":"Unknown term","fullName":"Open Full Stack 2018","name":"ofs","url":"http://fullstackopen.github.io","__v":8},{"week":7,"exercises":[0,17,13,13,8,6,6,11],"enabled":false,"miniproject":true,"peerReviewOpen":false,"extension":false,"_id":"5bb48ca56ec4c800e33cb76f","year":2018,"term":"syksy","fullName":"Ohjelmistotuotanto","name":"ohtu2018","url":"https://github.com/mluukkai/ohjelmistotuotanto2018/wiki/Ohjelmistotuotanto-syksy-2018","__v":7},{"week":4,"exercises":[0,8,6,7,0,0,0,0],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5be43839e90ef000b62e8ca4","year":2018,"term":"fall","fullName":"Beta DevOps with Docker","name":"docker-beta","url":"https://docker-hy.github.io","__v":3},{"week":7,"exercises":[0,11,16,16,15,15,15,15],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5be5dfaeca8b21009ac43d35","year":2018,"term":"syksy","fullName":"Web-palvelinohjelmointi Ruby on 
Rails","name":"rails2018","url":"https://github.com/mluukkai/WebPalvelinohjelmointi2018","__v":7},{"week":1,"exercises":[0,9,6,7,0,0,0,0],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5c17f2fdcccfd100f9c6a260","year":2018,"term":"christmas","fullName":"DevOps with Docker","name":"docker18","url":"https://docker-hy.github.io/","__v":3},{"week":8,"exercises":[6,14,20,22,21,21,21,20,0],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":true,"_id":"5c39d27776e25b01007e7a12","year":2019,"term":"kevät","fullName":"Full stack websovelluskehitys","name":"fullstack2019","url":"https://fullstack-hy2019.github.io/","__v":11},{"week":8,"exercises":[0,4,4,4,5,3,3,4],"enabled":false,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5c3dd379e2ecb8022bb75407","year":2019,"term":"Fall","fullName":"Cloud Computing Fundamentals","name":"CCFUN","url":"https://ccfun.fi/home","__v":8},{"week":0,"exercises":[6,14,20,22,22,22,21,21,26,27],"enabled":true,"miniproject":false,"peerReviewOpen":false,"extension":true,"_id":"5c7f97d3b7e42b00495261de","year":2020,"term":"Year","fullName":"Full Stack Open 2020","name":"ofs2019","url":"https://fullstackopen.com/","__v":16},{"week":4,"exercises":[1,17,10,8,0,0,0,0],"enabled":true,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5cb5bcd65e4c2f005281f7e7","year":2019,"term":"Year","fullName":"DevOps with Docker 2019","name":"docker2019","url":"https://docker-hy.github.io/","__v":4},{"week":1,"exercises":[1,17,10,8],"enabled":true,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5e8ae0d2d9979700193caed4","name":"docker2020","url":"https://devopswithdocker.com/","term":"Year","year":2020,"fullName":"DevOps with Docker 
2020","__v":0},{"week":1,"exercises":[0,13,8,7],"enabled":true,"miniproject":false,"peerReviewOpen":false,"extension":false,"_id":"5ebe6a8f54e7f10019becc15","name":"beta-dwk-20","url":"https://devopswithkubernetes.com","term":"Summer","year":2020,"fullName":"Beta DevOps with Kubernetes","__v":1}] |
Who cheated
Points: 1/1
The file start_times.csv contains individual start times for a programming exam, in the format name;hh:mm. An example:
1 | jarmo;09:00 |
Additionally, the file submissions.csv contains points and hand-in times for individual exercises. The format here is name;task;points;hh:mm. An example:
1 | jarmo;1;8;16:05 |
Your task is to find the students who spent over 3 hours on the exam tasks. That is, any student who handed in any task more than 3 hours after their exam start time is labeled a cheater. There may be more than one submission for the same task for each student. You may assume all times are within the same day.
Please write a function named cheaters(), which returns a list containing the names of the students who cheated.
My Code
- Issues: I used a time object instead of a datetime object, which forced me to convert to timedelta or datetime later, when I should’ve used datetime to begin with.
import csv
from datetime import datetime, timedelta, time

def cheaters():
    with open("start_times.csv") as file_one, open("submissions.csv") as file_two:
        students_start_time = {}
        # csv.reader separates each line into a list based on the delimiter
        for line in csv.reader(file_one, delimiter=";"):
            students_start_time[line[0]] = datetime.strptime(line[1], "%H:%M").time()
        students_end_time = {}
        for line in csv.reader(file_two, delimiter=";"):
            if line[0] in students_end_time:
                # checks if new_time is later than the current recorded time
                if datetime.strptime(line[3], "%H:%M").time() > students_end_time[line[0]]:
                    students_end_time[line[0]] = datetime.strptime(line[3], "%H:%M").time()
            # student name doesn't exist, add it
            else:
                students_end_time[line[0]] = datetime.strptime(line[3], "%H:%M").time()
    #print(students_start_time)
    #print(students_end_time)
    cheaters_list = []
    for student in students_start_time:
        # time object can't subtract, convert to either datetime or timedelta object
        end_time_delta = timedelta(hours=students_end_time[student].hour,
                                   minutes=students_end_time[student].minute)
        print("end_time_delta", end_time_delta)
        start_time_delta = timedelta(hours=students_start_time[student].hour,
                                     minutes=students_start_time[student].minute)
        difference_time = end_time_delta - start_time_delta
        # 3 hour time_delta
        cut_off_time = timedelta(hours=3)
        if difference_time > cut_off_time:
            cheaters_list.append(student)
    return cheaters_list
Model Code: just cleaner, compare.
1 | import csv |
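The issue noted under My Code can be avoided by parsing the times straight into datetime objects, which can be subtracted directly. The following is a sketch of that idea, not the course's model solution. The function name find_cheaters, its parameters and the maija rows are hypothetical, and it takes iterables of lines instead of opening the files so it can be run without them.

```python
import csv
from datetime import datetime, timedelta

def find_cheaters(start_lines, submission_lines):
    # Parse start times straight into datetime objects so they can be subtracted
    start_times = {}
    for name, start in csv.reader(start_lines, delimiter=";"):
        start_times[name] = datetime.strptime(start, "%H:%M")

    cheaters = []
    for name, task, points, handed_in in csv.reader(submission_lines, delimiter=";"):
        hand_in_time = datetime.strptime(handed_in, "%H:%M")
        too_late = hand_in_time - start_times[name] > timedelta(hours=3)
        if too_late and name not in cheaters:
            cheaters.append(name)
    return cheaters

# Sample data in the two formats described above (maija rows are made up)
starts = ["jarmo;09:00", "maija;10:00"]
submissions = ["jarmo;1;8;16:05", "maija;1;6;12:30", "maija;2;4;12:59"]
print(find_cheaters(starts, submissions))  # ['jarmo']
```

Because csv.reader accepts any iterable of strings, the same function works with open file objects in place of the lists.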
Who cheated, version 2
Points: 1/1
You have the CSV files from the previous exercise at your disposal again. Please write a function named final_points(), which returns the final exam points received by the students, in a dictionary format, following these criteria:
- If there are multiple submissions for the same task, the submission with the highest number of points is taken into account.
- If the submission was made over 3 hours after the start time, the submission is ignored.
- The tasks are numbered 1 to 8, and each submission is graded with 0 to 6 points.
In the dictionary returned, the key should be the name of the student, and the value should be the total points received by the student.
Hint: Nested dictionaries might be a good approach when processing the tasks and submission times of each student.
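The nested dictionary idea from the hint can be sketched like this. It is one possible approach, not the model solution; the function name final_points_from, its parameters and the sample rows are hypothetical, and it takes iterables of lines instead of opening the CSV files.

```python
import csv
from datetime import datetime, timedelta

def final_points_from(start_lines, submission_lines):
    start_times = {}
    for name, start in csv.reader(start_lines, delimiter=";"):
        start_times[name] = datetime.strptime(start, "%H:%M")

    # Nested dictionary: student name -> {task number -> best accepted points}
    best_points = {name: {} for name in start_times}
    for name, task, points, handed_in in csv.reader(submission_lines, delimiter=";"):
        hand_in_time = datetime.strptime(handed_in, "%H:%M")
        if hand_in_time - start_times[name] > timedelta(hours=3):
            continue  # submitted too late, so the submission is ignored
        previous = best_points[name].get(task, 0)
        best_points[name][task] = max(previous, int(points))

    # Sum the best points of each task into one total per student
    return {name: sum(tasks.values()) for name, tasks in best_points.items()}

starts = ["maija;10:00"]
submissions = ["maija;1;4;10:30", "maija;1;6;11:00", "maija;2;5;13:01"]
print(final_points_from(starts, submissions))  # {'maija': 6}
```

In the sample data, task 1 counts with its best score of 6, while the task 2 submission arrives 3 hours and 1 minute after the start and is dropped.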
1 | # Write your solution here |
**BOTH MY SOLUTION AND THE IDEALIZED SOLUTION ARE PRETTY MUCH THE SAME, NO DIFFERENCE IN METHOD. I WILL SAY INITIALIZING VARIABLES BEFOREHAND WITH PROPER NAMES DOES MAKE CODE MORE READABLE. HE ALSO INITIALIZES A LOT OF OTHER ASPECTS AS WELL…**
Model Solution
1 | import csv |
1 | from difflib import get_close_matches |
1 | def change_case(orig_string: str): |
This section aims to familiarize you with some additional Python features that you may find useful:
- Single line conditionals: Python offers a way to create conditional logic in a single line of code using the structure a if [condition] else b. This is sometimes referred to as a ternary operator.
1 | x = 10 |
This can be especially useful for conditional assignments:
1 | y = 5 |
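A sketch of both uses of the single line conditional. The starting values x = 10 and y = 5 come from the cells above; the conditions and strings are my own examples.

```python
x = 10
# The expression evaluates to "even" when the condition holds, otherwise "odd"
print("even" if x % 2 == 0 else "odd")  # even

y = 5
# Conditional assignment: z gets one of two values depending on y
z = "small" if y < 10 else "large"
print(z)  # small
```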
- “Empty” block: Python does not allow for empty blocks of code. In instances where you need to have a block of code which does nothing (perhaps for testing), you can use the pass statement.
1 | def testing(): |
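A minimal sketch of a function whose body is only a placeholder.

```python
def testing():
    pass  # placeholder body: the function exists but does nothing yet

# A function without a return statement returns None
print(testing())  # None
```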
- Loops with else blocks: In Python, loops can have else blocks. These blocks execute when the loop finishes normally, without encountering any break statements.
1 | my_list = [3,5,2,8,1] |
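A sketch using the list from the cell above. The search for an even number and the found_even flag are assumptions made for the example.

```python
my_list = [3, 5, 2, 8, 1]

found_even = False
for n in my_list:
    if n % 2 == 0:
        print("An even number was found:", n)
        found_even = True
        break
else:
    # Runs only if the loop finished without hitting break
    print("There were no even numbers")
```

Here the loop stops at 2, so the else block is skipped. With a list of only odd numbers the else block would run instead.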
- Default parameter value: Python allows function parameters to have default values. These are used whenever no argument is passed for that parameter.
1 | def say_hello(name="Emily"): |
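A sketch of a default parameter value in use. The function signature is from the cell above, while the body and the name Eric are hypothetical.

```python
def say_hello(name="Emily"):
    # The default value is used only when no argument is given
    return "Hi there, " + name

print(say_hello())        # Hi there, Emily
print(say_hello("Eric"))  # Hi there, Eric
```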
- A variable number of parameters: Python also allows functions to be defined with a variable number of parameters, by adding a star (*) before the parameter name.
1 | def testing(*my_args): |
In this case, all arguments passed to the function are contained in a tuple and can be accessed via the named parameter.
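A sketch of the tuple behaviour described above. The function signature is from the cell above; the body is my own example.

```python
def testing(*my_args):
    # my_args is a tuple holding every positional argument
    print("You passed", len(my_args), "arguments")
    return sum(my_args)

print(testing(1, 2, 3))  # prints "You passed 3 arguments", then 6
print(testing())         # prints "You passed 0 arguments", then 0
```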