Welcome to this Workshop!

  • This workshop is brought to you by PTB's IT department (Q.4)
  • I work for the HPC department at PTB (Q.45)
  • I am a physicist (QCD), data analyst (DNA), software architect (HPC)



Table of Contents

  • Day 1: Adaptability & Extendibility

    * Modularity in Python
    * DRY
    * SOLID
    * Design Patterns
    
  • Day 2: Testing, Documentation & Readability

    * Packaging
    * Unit Tests
    * Doctest
    * Sphinx
    * GitLab Integration
    
  • Day 3: Scalability & Performance

    * Amdahl's law
    * Data Structures
    * Don'ts
    * Profiling & Benchmarking
    * Parallelisation with `joblib` and `mpi4py`
    


Modularity



Why Modularity?

Goals:

  • Split software into small, weakly dependent units (modules)
  • Separation of Concerns
  • Make software manageable and maintainable
  • "Divide and Conquer" principle
  • Make the software understandable using mind-sized chunks
  • Abstraction can be incorporated at ease
  • Reusability
  • Concurrent execution


Development Cost

  • Generally, the goal is to minimize the total development cost
  • Cost per module vs. integration cost: more, smaller modules lower the cost per module but raise the integration cost → in real life the optimum depends on the case



Modularity in Python

  • Packages & Libraries
    • "the ecosystem": pypi.org, conda, pip, etc.
  • Packages with modules, e.g. my_module.py
    • structure at the file-system level
  • Classes, e.g. class MyClass
    • (multiple) inheritance, interfaces, traits, method specialization
  • Functions, e.g. def my_add(x, y)
    • may also be local (nested inside another function)
  • Additionally: decorators, lambdas, etc.
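
The building blocks listed above can be sketched in a single module. Apart from `my_add`, which is taken from the slide, all names (`logged`, `Shape`, `Square`, `double`) are hypothetical and only serve as an illustration.

```python
import functools


def logged(func):
    """Decorator: wraps a function and records each call in `logged.calls`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logged.calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


logged.calls = []  # hypothetical call log, only for this illustration


@logged
def my_add(x, y):
    """A plain function is the smallest reusable unit."""
    return x + y


class Shape:
    """Base class: subclasses specialize `area`."""

    def area(self):
        raise NotImplementedError("subclasses must implement area")


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        def squared(value):  # local (nested) function
            return value * value
        return squared(self.side)


double = lambda value: 2 * value  # lambda: a small anonymous function
```

Each unit can be used and tested on its own, e.g. `my_add(1, 2)` or `Square(3).area()`.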


DRY - Don't Repeat Yourself


Every piece of knowledge must have a single, unambiguous, authoritative representation within a system

-- A. Hunt & D. Thomas in The Pragmatic Programmer


DRY - Motivation

Why?

  • duplicated code (logic) is difficult to change consistently
  • and is therefore hard to maintain
  • slight changes that sneak into the copies make it even worse
  • once that has happened, refactoring is almost impossible (hope you have good test coverage!)

Further, in a team:

  • Gives rise to unmotivated local adaptations of repeated code → clutter

Thus:

  • The very first <Ctrl>-<v> is usually already the bad one!
  • Abstract early (but not prematurely!)


Example 1 - DRY?

def check_area(b):
    if (1 / 2 * 3.14 * b * b) < 0.2:
        return True
    elif (1 / 2 * 3.14 * b * b) < 0.8:
        return False
    elif (1 / 2 * 3.14 * b * b) < 2.0:
        return True
    else:
        return False

Presenter Notes

This code is not DRY.
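
One possible way to remove the repetition, as a minimal sketch: compute the area once and give the constant a name. The names `HALF_PI` and `half_circle_area` are assumptions about what the formula is meant to express; the value 3.14 is kept from the slide so the behavior does not change.

```python
HALF_PI = 1 / 2 * 3.14  # same approximation as in the slide, defined once


def half_circle_area(b):
    """Single place that knows the area formula (assumed meaning)."""
    return HALF_PI * b * b


def check_area(b):
    area = half_circle_area(b)  # computed once instead of three times
    if area < 0.2:
        return True
    if area < 0.8:
        return False
    return area < 2.0
```

A change to the formula or to the approximation of pi now happens in exactly one place.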

Example 2 - DRY?

class ProductInterface:

    def display_price(self):
        raise NotImplementedError(
                "You forgot to implement the display_price method")

class PlasticDuck(ProductInterface):

    def __init__(self, price):
        self._price = price

    def display_price(self):
        print("The price of this plastic "
              "duck is {} euros!".format(self._price))

plast_duck = PlasticDuck(2)
plast_duck.display_price()

Think about the fact that the word 'price' appears 9 times (!)

Presenter Notes

This code is DRY.

Example 3 - DRY?

class CsvValidation:

    def validate_product(self, product: dict):
        if 'color' not in product:
            raise RuntimeError(
                'Import fail: the product attribute color is missing')

        if 'size' not in product:
            raise RuntimeError(
                'Import fail: the product attribute size is missing')

        if 'type' not in product:
            raise RuntimeError(
                'Import fail: the product attribute type is missing')

Presenter Notes

This code is DRY. But there is some code duplication.

Example 4 - DRY?

class CsvValidation:

    def validate_product(self, product: dict):
        # 'attribute' instead of 'property' to avoid shadowing the built-in
        for attribute in ['color', 'size', 'type']:
            if attribute not in product:
                raise RuntimeError(
                    'Import fail: the product attribute {} is missing'.format(attribute))

Presenter Notes

This code is DRY. No duplications - but more difficult to understand! There is a tradeoff.

DRY Summary

  • Try to keep your code DRY
  • Note that there is a tradeoff
  • Don't make your code WET since this would mean:
    • Write Everything Twice
    • We Edit Terribly
    • Waste Everyone's Time


Hands On

Exercise Day-1/exercise-1.py

Tasks: Read a bitmap image

  • Refactor the sequential unstructured code - use the provided skeleton
  • Write and implement a class Bitmap
  • DRY the code: no hard-coded constants
  • Separate the concerns
  • Read width, height and the offset of the pixel array from the bitmap


SOLID

Principles of object-oriented design by Robert C. Martin

  • Single Responsibility Principle
  • Open-Closed Principle
  • Liskov Substitution Principle
  • Interface Segregation Principle
  • Dependency Inversion Principle

Presenter Notes

https://gist.github.com/dmmeteo/f630fa04c7a79d3c132b9e9e5d037bfd

Single Responsibility Principle

There should never be more than one reason for a class to change.

-- Robert C. Martin in "SRP: The Single Responsibility Principle"

A class (method, module) should have exactly one purpose

Separation of Concerns

Avoid the "egg-laying wool-milk-sow":


Source: de:User:Pixelrausch Wikipedia


Single Responsibility Principle - Example

def percentage_of_word_in_localfile(search, file):
    search = search.lower()
    content = open(file, "r").read()
    words = content.split()
    number_of_words = len(words)
    occurrences = 0
    for word in words:
        if word.lower() == search:
            occurrences += 1
    return occurrences / number_of_words

What do you think of this?

  • Is this code reusable?
  • Is this code testable?


Single Responsibility Principle - Example

def read_localfile(file):
    # the context manager closes the file again
    with open(file, "r") as handle:
        return handle.read()

def number_of_words(content):
    return len(content.split())

def count_word_occurrences(word, content):
    counter = 0
    for e in content.split():
        if word.lower() == e.lower():
            counter += 1
    return counter

def percentage_of_word(word, content):
    total_words = number_of_words(content)
    word_occurrences = count_word_occurrences(word, content)
    return word_occurrences / total_words

def percentage_of_word_in_localfile(word, file):
    content = read_localfile(file)
    return percentage_of_word(word, content)
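
The split pays off in testing: the text-processing functions can be checked with an in-memory string, without touching the file system. The following compact sketch reuses the function names from the slide; the sample sentence is made up.

```python
def number_of_words(content):
    return len(content.split())


def count_word_occurrences(word, content):
    # case-insensitive comparison, as in the slide
    return sum(1 for e in content.split() if e.lower() == word.lower())


def percentage_of_word(word, content):
    return count_word_occurrences(word, content) / number_of_words(content)


# No file needed: the pure functions work on any string.
sample = "The cat and the dog"
occurrences = count_word_occurrences("the", sample)
fraction = percentage_of_word("the", sample)
```

Note that, as in the original, the returned value is a fraction between 0 and 1, not a percentage in the strict sense.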


Open-Closed Principle

Modules should be both open (for extension) and closed (for modification).

-- Bertrand Meyer in Object-Oriented Software Construction


Source: Photo by Dan Cristian Paduret from Pexels


Open-Closed Principle - Example

class Animal:
    def __init__(self, name: str):
        self.name = name

    def get_name(self) -> str:
        return self.name

def animal_sound(animals: list):
    # every new animal requires another branch here
    for animal in animals:
        if animal.name == 'lion':
            print('roar')

        elif animal.name == 'mouse':
            print('squeak')

# client code
animals = [
    Animal('lion'),
    Animal('mouse')
]

animal_sound(animals)


Open-Closed Principle - Example

class Animal:
    # __init__ and get_name same as before
    def __init__(self, name: str):
        self.name = name

    def get_name(self) -> str:
        return self.name

    def make_sound(self) -> str:
        raise NotImplementedError("subclasses must implement make_sound")

class Lion(Animal):
    def make_sound(self) -> str:
        return 'roar'

class Mouse(Animal):
    def make_sound(self) -> str:
        return 'squeak'

def animal_sound(animals: list):
    for animal in animals:
        print(animal.make_sound())

# client code
animals = [Lion('lion'), Mouse('mouse')]
animal_sound(animals)
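
To see the "open for extension, closed for modification" idea at work, a new animal can be added without touching the existing function. This is a minimal sketch: `Snake` and the list-returning helper `animal_sounds` are hypothetical additions, used here so that the result can be checked instead of printed.

```python
class Animal:
    def __init__(self, name: str):
        self.name = name

    def make_sound(self) -> str:
        raise NotImplementedError("subclasses must implement make_sound")


class Lion(Animal):
    def make_sound(self) -> str:
        return 'roar'


def animal_sounds(animals: list) -> list:
    """Closed for modification: never needs to know the concrete animals."""
    return [animal.make_sound() for animal in animals]


# Extension: a new class, no change to Animal, Lion or animal_sounds.
class Snake(Animal):
    def make_sound(self) -> str:
        return 'hiss'


sounds = animal_sounds([Lion('lion'), Snake('snake')])
```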


Liskov Substitution Principle

Subtype Requirement: Let Φ(x) be a property provable about objects x of type T. Then Φ(y) should be true for objects y of type S where S is a subtype of T.

-- Liskov, B. H. & Wing, J. M. in A behavioral notion of subtyping. (1994)

In other words: an object of a base class must be replaceable by an object of any of its subclasses (anywhere!)

  • strong requirement
  • Inheritance: strong coupling between code parts
  • Keep inheritance chains short
  • When in doubt: prefer composition over inheritance
  • Always ask for the type of relation: "is a" vs. "has a"
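
A classic illustration of a violated substitution requirement is the rectangle/square pair. The sketch below is not part of the workshop material; the class names and the client function `widen` are chosen for illustration only.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def set_width(self, width):
        self.width = width

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    """Mathematically 'is a' rectangle, but behaves differently."""

    def __init__(self, side):
        super().__init__(side, side)

    def set_width(self, width):
        # keeps the square invariant, but silently changes the height too
        self.width = width
        self.height = width


def widen(rectangle):
    """Client written against Rectangle: expects the height to stay fixed."""
    old_height = rectangle.height
    rectangle.set_width(10)
    return rectangle.height == old_height
```

`widen(Rectangle(2, 3))` holds, while `widen(Square(3))` does not: the subclass cannot stand in for its base class, which hints that composition may fit better than inheritance here.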


Liskov Substitution Principle

Implications on method parameters and return types


Source: Wikipedia
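
Assuming the slide refers to the usual variance rules (an overriding method may return a more specific type and may accept a more general parameter type), the implication can be sketched as follows. All class names are hypothetical. Python does not enforce the annotations at runtime; a static type checker such as mypy can check overrides against these rules.

```python
class Animal:
    pass


class Cat(Animal):
    pass


class Shelter:
    def adopt(self) -> Animal:
        return Animal()

    def admit(self, animal: Cat) -> str:
        return "admitted a cat"


class CatShelter(Shelter):
    def adopt(self) -> Cat:                  # return type narrowed: fine
        return Cat()

    def admit(self, animal: Animal) -> str:  # parameter type widened: fine
        return "admitted an animal"


def client(shelter: Shelter) -> Animal:
    """Written against Shelter; works unchanged with CatShelter."""
    shelter.admit(Cat())
    return shelter.adopt()
```

The reverse direction (returning something more general, or demanding a more specific argument) would break callers that only know `Shelter`.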


Interface Segregation Principle


Clients should not be forced to depend on methods that they do not use.

-- Robert C. Martin in "Agile Software Development: Principles, Patterns, and Practices"

  • Implementing an interface is a contract: we guarantee that all required methods are properly implemented
  • Therefore: make interfaces small & specific
  • Let the client decide which (small) interfaces to implement
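
As a minimal sketch of small, specific interfaces: instead of one large device interface, each capability gets its own abstract base class, and a concrete class implements only what it needs. The names `Printer`, `Scanner`, `SimplePrinter` and `MultiFunctionDevice` are hypothetical.

```python
import abc


class Printer(abc.ABC):
    @abc.abstractmethod
    def print_document(self, text: str) -> str:
        ...


class Scanner(abc.ABC):
    @abc.abstractmethod
    def scan(self) -> str:
        ...


class SimplePrinter(Printer):
    """Only prints, so it is not forced to provide a dummy scan()."""

    def print_document(self, text: str) -> str:
        return f"printed: {text}"


class MultiFunctionDevice(Printer, Scanner):
    """Chooses to implement both small interfaces."""

    def print_document(self, text: str) -> str:
        return f"printed: {text}"

    def scan(self) -> str:
        return "scanned page"
```

Code that only needs printing can accept any `Printer` and never depends on scanning.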


Interfaces in Python

A simple approach:

class Interface:

    def method1(self):
        raise NotImplementedError("this is not implemented") # -> provoke runtime error

class Concrete(Interface):

    def method1(self):
        print("method 1 is implemented here")
        # ...


Interfaces in Python with abc

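abc == "Abstract Base Classes"

A better approach: a subclass with a missing implementation cannot even be instantiated (TypeError), so the error surfaces at object creation rather than when the missing method is first called: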

import abc

class Interface(metaclass=abc.ABCMeta):

    @abc.abstractmethod
    def method1(self):
        pass

class Concrete(Interface):

    def method1(self):
        print("method 1 is implemented here")
        # ...
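
What happens when an implementation is forgotten can be shown with a short sketch. The class `Incomplete` is a hypothetical example of a subclass that does not implement `method1`.

```python
import abc


class Interface(metaclass=abc.ABCMeta):

    @abc.abstractmethod
    def method1(self):
        pass


class Incomplete(Interface):
    pass  # method1 is missing


try:
    Incomplete()  # fails here, before any method is called
    error_message = None
except TypeError as error:
    error_message = str(error)
```

Defining `Incomplete` is still allowed; only creating an object of it raises the `TypeError`, whose message names the missing method.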


Hands On

Exercise Day-1/exercise-2.py

Tasks:

  • Use the namedtuple Pixel as an interface to the pixel concept in class Bitmap
  • Write an abc based interface class for a 2d addressable object
  • Make class Bitmap implement this interface
  • Note the lines that have been indicated to be changed


Dependency Inversion Principle (DIP)

  • High-level classes should not depend on low-level classes; both should depend on abstractions (interfaces)
    • high-level: defines control flow
    • low-level: does specialized tasks
  • Otherwise stated: depend on abstraction, not on specialization.
  • Avoids:
    • frequent updates and changes of high-level classes
    • cyclic dependencies


DIP (Counter-)Example

class Lamp:
    def __init__(self):
        self._is_shining = False

    def turn_on(self):
        self._is_shining = True

    def turn_off(self):
        self._is_shining = False

class Switch:
    def __init__(self, lamp: Lamp):
        self._lamp = lamp
        self._pressed = False

    def press(self):
        self._pressed = not self._pressed
        if self._pressed:
            self._lamp.turn_on()        # <---- PROBLEM IS HERE
        else:
            self._lamp.turn_off()       # <---- AND HERE


DIP Example

#### module: lamp.py
from switch import Switchable  # the low-level module depends on the abstraction

class Lamp(Switchable):
    def __init__(self):
        self._is_shining = False

    def turn_on(self):
        self._is_shining = True

    def turn_off(self):
        self._is_shining = False


DIP Example

#### module: switch.py
import abc
# note: no import of lamp here, the switch only knows the abstraction

class Switchable(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def turn_on(self):
        pass

    @abc.abstractmethod
    def turn_off(self):
        pass

class Switch:
    def __init__(self, switched: Switchable):
        self._switched = switched
        self._pressed = False

    def press(self):
        self._pressed = not self._pressed
        if self._pressed:
            self._switched.turn_on()        # <---- No Ref to concrete implementation
        else:
            self._switched.turn_off()       # <---- No Ref to concrete implementation
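
A benefit of the inverted dependency is that the same `Switch` can drive devices it has never heard of. The sketch below puts everything into one file for brevity (in the slides the code is split into two modules) and adds a hypothetical second device, `Fan`.

```python
import abc


class Switchable(abc.ABC):
    @abc.abstractmethod
    def turn_on(self):
        pass

    @abc.abstractmethod
    def turn_off(self):
        pass


class Switch:
    def __init__(self, switched: Switchable):
        self._switched = switched
        self._pressed = False

    def press(self):
        self._pressed = not self._pressed
        if self._pressed:
            self._switched.turn_on()
        else:
            self._switched.turn_off()


class Fan(Switchable):
    """Hypothetical new device: Switch needs no change to support it."""

    def __init__(self):
        self.is_spinning = False

    def turn_on(self):
        self.is_spinning = True

    def turn_off(self):
        self.is_spinning = False


fan = Fan()
fan_switch = Switch(fan)
fan_switch.press()            # first press turns the fan on
state_after_first_press = fan.is_spinning
fan_switch.press()            # second press turns it off again
state_after_second_press = fan.is_spinning
```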


Hands On

Exercise Day-1/exercise-3.py

Tasks:

We will write a smoother (filter) that replaces each pixel value with the arithmetic mean of itself and its eight direct neighbors. We design it directly as an implementation of a more generic FilterInterface.

  • Think about how to interact with the Bitmap (signature of the filter method)
  • Implement the kernel of the nearest neighbor smoother
  • Edit and implement at every "<EDIT HERE>" tag
  • Verify that the produced output bitmap is smoothed


Design Patterns

  • general & reusable solution to a common type of problem
  • formalized best practices
  • the pattern has to be implemented anew for each specific case
  • Book: "Design Patterns: Elements of Reusable Object-Oriented Software" by E. Gamma, R. Helm, R. Johnson, J. Vlissides (aka the "Gang of Four")


Design Patterns List


Behavioral
Structural
Creational


The "Template Method"

  • Behavioral Pattern
  • Goal: Compose complex algorithms from various parts
  • Fixed, optional, dynamic, mandatory (required) parts possible
  • Single interface exposed to client code
  • Question: Have you already seen this pattern today?


The "Template Method"

import abc

class AbstractAlgorithm(metaclass=abc.ABCMeta):
    """
    Defines an algorithm template, sub-classes implement part of it.
    """

    def template_method(self) -> None:
        """
        Algorithm skeleton.
        """
        self.base_operation()
        self.required_operation1()
        self.hook()
        self.required_operation2()

    # These operations already have implementations.

    def base_operation(self) -> None:
        print("AbstractAlgorithm says: I am doing some always necessary preparation work")


The "Template Method"

    # (continued: body of class AbstractAlgorithm)

    # These operations have to be implemented in subclasses.

    @abc.abstractmethod
    def required_operation1(self) -> None:
        pass

    @abc.abstractmethod
    def required_operation2(self) -> None:
        pass

    # These are "hooks": subclasses may optionally override them, which adds flexibility.

    def hook(self) -> None:
        pass


The "Template Method"

class ConcreteAlgo1(AbstractAlgorithm):
    """
    Concrete classes need to implement all abstract operations of the base
    class. They can also override operations that have a default implementation.
    """

    def required_operation1(self) -> None:
        print("ConcreteAlgo1 says: Implemented Operation1")

    def required_operation2(self) -> None:
        print("ConcreteAlgo1 says: Implemented Operation2")

class ConcreteAlgo2(AbstractAlgorithm):
    ...  # implemented accordingly


The "Template Method"

def client_code(abstract_class: AbstractAlgorithm) -> None:
    """
    The client code calls the template method to execute the algorithm. Note
    that we operate with the interface.
    """
    # ...
    abstract_class.template_method()
    # ...


if __name__ == "__main__":
    print("Same client code can work with different subclasses:")
    client_code(ConcreteAlgo1()) # Or: client_code(ConcreteAlgo2())
    print("")


The "Factory" Pattern

  • Creational Pattern
  • Goal: Creation of similar objects in one place
  • New objects can be added for creation easily
  • Example: dynamically create any of the algorithms defined above with the "Template Method"
  • You definitely do not want complicated nested if/elif blocks
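
One common way to avoid the if/elif chain is a registry that maps a key to a class, sketched below. This is only one possible design, not necessarily the one used in the exercises; `Algorithm`, `FastAlgorithm`, `PreciseAlgorithm` and `AlgorithmFactory` are hypothetical names.

```python
class Algorithm:
    def run(self) -> str:
        raise NotImplementedError("subclasses must implement run")


class FastAlgorithm(Algorithm):
    def run(self) -> str:
        return "fast result"


class PreciseAlgorithm(Algorithm):
    def run(self) -> str:
        return "precise result"


class AlgorithmFactory:
    """Creates algorithms by key; new ones are added via register()."""

    def __init__(self):
        self._creators = {}

    def register(self, key, creator):
        self._creators[key] = creator

    def create(self, key, *args, **kwargs):
        try:
            creator = self._creators[key]
        except KeyError:
            raise ValueError(f"unknown algorithm: {key!r}") from None
        # extra arguments are forwarded to the constructor
        return creator(*args, **kwargs)


factory = AlgorithmFactory()
factory.register("fast", FastAlgorithm)
factory.register("precise", PreciseAlgorithm)

result = factory.create("fast").run()
```

Adding a new algorithm then means one new class plus one `register` call, with no change to `create`.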


Hands On

Exercise Day-1/exercise-4.py

Tasks: Study a simple factory

  • Study the code and try to understand it
  • Add a new Triangle type of shape
  • Think about the required presence of the class in class (Factory). How can we avoid it?
  • How can we pass arguments to the constructors of the Shape subclasses?


The "Singleton" Pattern

  • Creational Pattern
  • Goal: a unique single global scope object (instance)
  • Useful e.g. for a database connector or a global factory
  • Instance created and initialized at first call
  • Much better than a global variable!
  • Double initialization prevented
  • Use this pattern wisely!


The "Singleton" Pattern

class Singleton:
    __instance = None  # NOTE: class variable!

    @staticmethod
    def get_instance():
        if Singleton.__instance is None:
            Singleton()  # first time initialization
        return Singleton.__instance

    def __init__(self):
        if Singleton.__instance is not None:
            # in Python we have to raise:
            raise Exception("Do not call the constructor!! This is a Singleton")
        else:
            Singleton.__instance = self
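
How client code would use such a class can be sketched with a hypothetical `DatabaseConnector` that follows the same pattern. The attribute `queries` is only a stand-in for real connection state.

```python
class DatabaseConnector:
    """Hypothetical singleton following the pattern shown above."""
    _instance = None  # class variable shared by all callers

    @staticmethod
    def get_instance():
        if DatabaseConnector._instance is None:
            DatabaseConnector()  # first call creates and registers the instance
        return DatabaseConnector._instance

    def __init__(self):
        if DatabaseConnector._instance is not None:
            raise RuntimeError("Use DatabaseConnector.get_instance() instead")
        self.queries = []  # stand-in for real connection state
        DatabaseConnector._instance = self


first = DatabaseConnector.get_instance()
second = DatabaseConnector.get_instance()
first.queries.append("SELECT 1")  # visible through every reference
```

Both names refer to the same object, and a direct call to `DatabaseConnector()` after the first initialization raises an error.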


The "Iterator" Pattern

  • Behavioral
  • Goal: access the elements of an object sequentially and without knowledge of its internal representation
  • In Python we can add the __iter__ and __next__ magic methods
  • Let us check a simple example together: writing a class that produces all odd numbers up to N
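
A minimal sketch of such a class is given below. The name `OddNumbers` and the choice to include N itself when it is odd are assumptions; the version developed in the workshop may differ.

```python
class OddNumbers:
    """Iterates over all odd numbers from 1 up to and including n."""

    def __init__(self, n):
        self._n = n
        self._current = 1

    def __iter__(self):
        # the object is its own iterator
        return self

    def __next__(self):
        if self._current > self._n:
            raise StopIteration  # signals the end of the sequence
        value = self._current
        self._current += 2
        return value


odds = list(OddNumbers(9))
```

Because the object is its own iterator, it is exhausted after one pass; a second loop over the same instance yields nothing.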


Hands On

Exercise Day-1/exercise-5.py

Tasks: Study a Python iterator example

  • Study the code and try to understand it
  • Uncomment the commented line, understand why it raises an Exception
  • Bonus: Implement a "countdown" iterator


Day-1 Recap

  • Modularity
  • DRY
  • SOLID principles
  • Design Patterns

Thanks!
