Welcome to day 4:
def fun():
    """Just an example for a single line doc string."""
help(fun) shows the docstring in the Python Console

def check_input_type(input):
    """
    Checks the type that is passed and returns type name as string.

    # Example usages:

    >>> check_input_type("Hello World")
    'str'
    >>> check_input_type(1)
    'int'
    >>> check_input_type(0.0)
    'float'
    >>> check_input_type([])
    'something else'
    """
    if isinstance(input, int):
        return 'int'
    elif isinstance(input, float):
        return 'float'
    elif isinstance(input, str):
        return 'str'
    else:
        return 'something else'
• Right-click the function with doctest and click on "Run 'Doctest ...'"
• Or: add the following lines and run the whole module with all doctests
import doctest

# ...
# ... put your functions with doctests here
# ...

# this invokes the doctests for the actual module
def test_it():
    doctest.testmod(verbose=True)

# this is a "main" function
if __name__ == '__main__':
    test_it()
• Other ways to run → advanced course
• Skip a single doctest example by adding # doctest: +SKIP at the end of its line

Write a function that checks that an expression of opening and closing parentheses, "(" and ")", is correct. A parenthesis can only close if one has been opened before. For example, "()()" and "((()())())" are correct, while "())(" is not. Test your function using a doctest.
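One possible way to approach this exercise, as a minimal sketch rather than the reference solution: count the currently open parentheses and fail as soon as the count would drop below zero. The function name is_balanced is hypothetical, and the sketch assumes that an expression with unclosed parentheses such as "(()" also counts as incorrect and that all other characters are ignored.

```python
import doctest


def is_balanced(expression):
    """
    Check that every ")" closes a previously opened "(".

    >>> is_balanced("()()")
    True
    >>> is_balanced("((()())())")
    True
    >>> is_balanced("())(")
    False
    >>> is_balanced("(()")
    False
    """
    open_count = 0
    for char in expression:
        if char == "(":
            open_count += 1
        elif char == ")":
            open_count -= 1
            # a ")" without a matching "(" before it
            if open_count < 0:
                return False
    # assumption: leftover open parentheses make the expression incorrect
    return open_count == 0


if __name__ == "__main__":
    doctest.testmod()
```

A counter is enough here because there is only one kind of bracket; with several bracket types a stack (for example a list) would be the usual choice.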
pandas
h5py
pickle
• Open: file_p = open(<FILENAME>, <MODE>), MODE: a (append), r (read), w (write)
• Close: file_p.close()
• Read: file_p.readline(), file_p.read()
• Write: file_p.write(some_data)
• print: print(exp, file=file_p)
  Note: uses the str representation of exp
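The calls listed above can be tried out together in a short round trip. This is only a sketch: the file name demo.txt and its contents are made up, and the file is placed in a temporary directory so that nothing existing is overwritten.

```python
import os
import tempfile

# Temporary location so the sketch does not touch existing files.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# 'w' creates the file (or truncates an existing one).
file_p = open(path, 'w')
file_p.write('first line\n')
print([1, 2, 3], file=file_p)   # writes str([1, 2, 3]) plus a newline
file_p.close()

# 'a' appends to the end of the existing file.
file_p = open(path, 'a')
file_p.write('last line\n')
file_p.close()

# 'r' opens the file for reading.
file_p = open(path, 'r')
first = file_p.readline()   # one line, including the trailing '\n'
rest = file_p.read()        # everything that is left
file_p.close()

print(repr(first))  # 'first line\n'
print(repr(rest))   # '[1, 2, 3]\nlast line\n'
```

Note that write() does not add a newline by itself, while print() does.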
# looping over all lines delimited by newline characters ('\n')
input_file = open('inFile.txt', 'r')
for line_str in input_file:
    # process line_str, the string of characters up to and including
    # the next '\n' in the file attached to input_file
    print(line_str, end='')
input_file.close()

# processing and writing to the output file
output_file = open('outFile.txt', 'w')
for next_output in ['first result\n', 'second result\n']:  # placeholder data
    # calculate the next output next_output, then write it
    output_file.write(next_output)
output_file.close()
The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases. CSV format was used for many years prior to attempts to describe the format in a standardized way in RFC 4180. The lack of a well-defined standard means that subtle differences often exist in the data produced and consumed by different applications. These differences can make it annoying to process CSV files from multiple sources. Still, while the delimiters and quoting characters vary, the overall format is similar enough that it is possible to write a single module which can efficiently manipulate such data, hiding the details of reading and writing the data from the programmer.
-- https://docs.python.org/3/library/csv.html
• Python module: csv
• csv allows other delimiters, too

import csv
with open('sample.csv', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',', quotechar='"')
    for row in csv_reader:
        # each row is a list of column elements (strings)
        print(row)
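To illustrate the point about other delimiters, here is a small sketch that reads semicolon-separated text and writes it back tab-separated. The data is made up, and io.StringIO is used as an in-memory stand-in for real files.

```python
import csv
import io

# In-memory stand-in for a file that uses ';' as delimiter (made-up data).
text = 'name;comment\nsensor_a;"contains; a semicolon"\n'

rows = list(csv.reader(io.StringIO(text), delimiter=';', quotechar='"'))
print(rows)
# [['name', 'comment'], ['sensor_a', 'contains; a semicolon']]

# csv.writer accepts the same formatting parameters.
out = io.StringIO()
writer = csv.writer(out, delimiter='\t')
writer.writerows(rows)
print(repr(out.getvalue()))
```

The quoted field keeps its semicolon because the reader treats everything between the quote characters as one column.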
• Copy the example.csv file in the day-3/assets/ directory to your current directory
• Read it with the csv module and calculate the per column sum and mean values

1) virtual environment with pip: pip or pip3
2) with Anaconda: conda
in both cases: Remember the PTB Web-Proxy (webproxy.berlin.ptb.de:8080)!
# create a virtual environment
python -m venv my_venv
# source environment (Windows)
. my_venv/Scripts/Activate
# source environment (Linux)
. my_venv/bin/activate
# shell now shows the environment folder in prompt
# install a python package NOTE: proxy is required
pip install --proxy=webproxy.berlin.ptb.de:8080 numpy
# show what is installed
pip freeze
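
Besides the changed prompt, the active environment can also be checked from within Python. A minimal sketch, assuming an environment created with python -m venv; the helper name in_virtual_environment is hypothetical, and the check is not meant for conda environments.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


print(sys.prefix)
print(in_virtual_environment())
```
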
# create the environment
conda create -n my_conda_env
# possibly list your environments and check
conda env list
# activate the environment
conda activate my_conda_env
# install some packages
conda install biopython
# Again: Set the proxy!
conda config --set proxy_servers.http <PROXY>
conda config --set proxy_servers.https <PROXY>
• Install numpy in your environment

collections.deque
• from collections import deque
• append and appendleft
• pop and popleft
• clear, copy, count, reverse, rotate
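A short sketch of the deque methods named above, using made-up values:

```python
from collections import deque

d = deque([1, 2, 3])

d.append(4)        # add on the right  -> deque([1, 2, 3, 4])
d.appendleft(0)    # add on the left   -> deque([0, 1, 2, 3, 4])

right = d.pop()        # removes and returns 4
left = d.popleft()     # removes and returns 0

d.rotate(1)        # shift right by one -> deque([3, 1, 2])
d.reverse()        # in place           -> deque([2, 1, 3])

print(d, left, right)   # deque([2, 1, 3]) 0 4
```

Unlike a list, a deque can add and remove elements efficiently at both ends.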
queue module
• import queue
• queue.Queue
• queue.LifoQueue
queue example

import threading
import queue

q = queue.Queue()

def worker():
    while True:
        it = q.get()
        print(f'Working on {it}')  # real work here
        q.task_done()

# Turn on the worker thread.
threading.Thread(target=worker, daemon=True).start()

# Send thirty task requests to the worker.
for item in range(30):
    q.put(item)

# Block until all tasks are done.
q.join()
print('All work completed')
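
The difference between queue.Queue and queue.LifoQueue is only the order in which items come back out. A minimal single-threaded sketch with made-up items:

```python
import queue

fifo = queue.Queue()       # first in, first out
lifo = queue.LifoQueue()   # last in, first out (a stack)

for item in ['a', 'b', 'c']:
    fifo.put(item)
    lifo.put(item)

fifo_order = [fifo.get() for _ in range(3)]
lifo_order = [lifo.get() for _ in range(3)]

print(fifo_order)   # ['a', 'b', 'c']
print(lifo_order)   # ['c', 'b', 'a']
```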
Simulate the following:
• ... PandemicQueue class elegantly.

• list: flexible but SLOW
• nested list: my_3d_list[x][y][z] (no slicing)
• np.array (1d), np.matrix (2d) and np.ndarray (arbitrary dimension)

Check out the numpy quickstart
• C-library of n-dimensional arrays of fixed size
• fast reductions (np.sum, np.mean)
• element-wise math functions: np.exp, np.sin
• basis for other packages (pandas, sklearn etc.)

# always needed
import numpy as np

# creation of a linear array from input list (or tuple)
arr = np.array([i for i in range(200)])

# creation of an array with 200 zeros (floats)
float_zeros = np.zeros(200)

# creation of an array with 200 zeros (ints)
int_zeros = np.zeros(200, dtype=np.int32)

# creation of an array initialized to 1
int_ones = np.ones(200, dtype=np.int32)

# similar to Python's range: integers ascending from 0...N-1
int_range = np.arange(10)
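
The "no slicing" limitation of nested lists can be seen in a small comparison. This is a sketch with made-up data of shape 2 x 3 x 4; the variable names are chosen for illustration only.

```python
import numpy as np

# Nested list with shape 2 x 3 x 4 (made-up data).
my_3d_list = [[[100 * x + 10 * y + z for z in range(4)]
               for y in range(3)]
              for x in range(2)]

# Single elements are easy, but a "column" needs an explicit loop.
column_from_list = [my_3d_list[x][1][2] for x in range(2)]

# The same data as an n-dimensional array supports slicing on every axis.
arr = np.array(my_3d_list)
column_from_array = arr[:, 1, 2]

print(arr.shape)            # (2, 3, 4)
print(column_from_list)     # [12, 112]
print(column_from_array)    # [ 12 112]
```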
• dtype= selects the element type: np.int32, np.int64, np.float32, np.float64
• or the short strings 'i4', 'i8', 'f4', 'f8'
• convert with the astype(new_type) method

# array of ones with dtype float32 ('f4')
a = np.ones(10, dtype='f4')

# conversion to int32
a_int = a.astype('i4')

# creation of a float64 array
b = np.array([1.234, 3.876, 3.54], dtype='f8')  # array([1.234, 3.876, 3.54 ])

# conversion to i4 - values are truncated (NOT rounded!)
b_int = b.astype('i4')  # array([1, 3, 3])

# rounding to nearest integer
b_int_rounded = b.round().astype('i4')  # array([1, 4, 4])
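
Two details of the conversion are easy to miss and can be checked with a few made-up values: astype truncates toward zero (so negative numbers move up), and round() sends values exactly halfway between two integers to the nearest even one. A minimal sketch:

```python
import numpy as np

values = np.array([-1.7, -0.5, 0.5, 1.5, 2.5], dtype='f8')

truncated = values.astype('i4')          # cuts off the fractional part
rounded = values.round().astype('i4')    # rounds first, then converts

print(truncated)   # [-1  0  0  1  2]
print(rounded)     # [-2  0  0  2  2]
```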
# addition (subtraction)
a = np.array([1., 2.])
b = np.array([0.5, 0.5])
a + b    # array([1.5, 2.5])

# multiplication (element-wise)
a * b    # array([0.5, 1. ])

# division (element-wise)
b / a    # array([0.5 , 0.25])

# multiplication with scalar
3.0 * a  # array([3., 6.])
# indexing as with lists
a = np.arange(10)
a[0]     # 0
a[1]     # 1

# slicing as with lists, the return value is a view on the
# original array (NOT a copy)
sub_a = a[2:4]
sub_a    # array([2, 3])
sub_a[0] = 5
a        # array([0, 1, 5, 3, 4, 5, 6, 7, 8, 9])

# slicing in strides of 2, syntax: [start:end:stride]
a[0::2]  # array([0, 5, 4, 6, 8])
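
When a slice should not write through to the original array, an explicit copy is one way to decouple it. A minimal sketch; the variable names view and independent are chosen for illustration.

```python
import numpy as np

a = np.arange(10)

view = a[2:4]                  # shares memory with a
independent = a[2:4].copy()    # owns its data

view[0] = 50                   # also changes a[2]
independent[1] = 99            # leaves a untouched

print(a)                                 # [ 0  1 50  3  4  5  6  7  8  9]
print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False
```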
• Lots of math functions available
• Trigonometric, sums, etc.
• Check https://numpy.org/doc/stable/reference/routines.math.html
• All act on the complete array:

np.exp(np.array([0.0, 1.0, 2.0]))  # array([1., 2.71828183, 7.3890561 ])
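
"Acting on the complete array" means that one call replaces an explicit Python loop. A small sketch comparing np.exp with an element-by-element loop over math.exp; the variable names are illustrative.

```python
import math
import numpy as np

x = np.array([0.0, 1.0, 2.0])

vectorized = np.exp(x)                         # one call for the whole array
looped = np.array([math.exp(v) for v in x])    # element by element

print(np.allclose(vectorized, looped))   # True
print(np.sum(x), np.mean(x))             # 3.0 1.0
```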