programming and coding¶

These two concepts are often mixed up. We could use these two informal descriptions:

Programming is about thinking how to formalize a certain sequence of operations.

Coding is writing a computer program that actually performs the operations we have in mind.

Coding is strictly tied to a specific programming language, while most programming skills are identical across different languages.

As a matter of fact, one does not even need a computer to be able to "program".

how do you prepare coffee?¶

Try describing the steps necessary to prepare a coffee.

pointers¶

  • are you describing a moka or a capsule machine?
  • how expert is the other person?
  • is the coffee already available?

programming concepts¶

There are really only 5 basic concepts to programming; anything else, in all languages, is basically just rehashing of these ideas.

  • storing and retrieving things
  • Conditional execution
  • functions and routines
  • iteration
  • data structure(s)

We will discuss these ideas (and all that follows) using Python.
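As a first taste, here is a minimal sketch that touches each of the five ideas at once. The names (temperatures, describe, labels) and the threshold are invented for illustration only.

```python
# storing and retrieving things: a name bound to a value
# data structure: the value is a list of numbers
temperatures = [18.5, 21.0, 25.5, 30.0]

# function: takes an input and returns a new value
def describe(temperature):
    # conditional execution: choose what to do depending on a test
    if temperature > 24:
        return "warm"
    return "mild"

# iteration: repeat the same operation for every element
labels = []
for temperature in temperatures:
    labels.append(describe(temperature))

print(labels)   # ['mild', 'mild', 'warm', 'warm']
```

Each of these pieces is discussed in its own section.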

Python was not chosen because it's the best language. It's a solid all-rounder that will let you experiment easily with many different facets of programming.

Python is not the best language at anything, but it's the second best for most things

(someone on Twitter)

Python is a high-level, interpreted language.

High level means that it tries to express its concepts in a way closer to the human than to the machine.

Interpreted means that the code is not compiled ahead of time into machine instructions: it is translated into commands for the Python interpreter, which executes them.

For the sake of completeness, it is also a dynamically typed language.
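A small sketch of what dynamically typed means in practice: the type belongs to the value, not to the name, so the same name can later refer to a value of a different type. The variable names are illustrative.

```python
value = 42
first_type = type(value).__name__     # the name currently refers to an int

value = "forty-two"                   # the same name is rebound to a string
second_type = type(value).__name__

print(first_type, second_type)        # int str
```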

Python tutor¶

To see what Python is doing on the inside, you can experiment on this website.

It allows you to write python code and see what the python interpreter is doing with each piece of code.

http://pythontutor.com/visualize.html#mode=edit

In [93]:
a = 0+1j
b = 1 + a**2 / 2
print(b)
(0.5+0j)

notes:¶

we did not need to declare the existence of variables a and b beforehand

we did not need to declare their type beforehand

Python handles complex numbers naturally (not important, but nice)

In [95]:
a = 1
b = '2'
c = a + b
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-95-0be4553b3466> in <module>
      1 a = 1
      2 b = '2'
----> 3 c = a + b

TypeError: unsupported operand type(s) for +: 'int' and 'str'

notes:¶

Python will complain loudly if you make a mistake, and does not try to guess what you want to do if it's not uniquely defined. It's for your own good. Really.

Before panicking, read the error message: they are usually very clear!
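For the error above, the message says that + is not defined between an int and a str. A possible way out, sketched here, is to state explicitly which of the two meanings is wanted. The try/except is only there so the example runs to the end.

```python
a = 1
b = '2'

# Python refuses to guess whether we want a number or a string...
try:
    c = a + b
except TypeError as error:
    message = str(error)
    print("error:", message)

# ...so we say explicitly what we mean
as_number = a + int(b)      # arithmetic sum
as_text = str(a) + b        # string concatenation
print(as_number, as_text)   # 3 12
```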

If you are not sure what to look for, download and print this flowchart: https://www.dropbox.com/s/cqsxfws52gulkyx/drawing.pdf

In [8]:
import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

conditional execution¶

do something only if a certain condition is met

In [97]:
x = 5
if x>0:
    print("positive number")
else:
    print("negative number")
positive number
In [4]:
y = 1 if x>0 else -1
print(y)
1

important note:¶

Python uses whitespace (indentation) to represent code scoping: what is indented after the if and the else belongs to those two commands, and is only executed when you are inside them.

This is referred to as a code block.

The line that introduces a new block ends with a :
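Blocks can be nested by indenting further, and a block ends when the indentation goes back to the previous level. A minimal sketch (the list messages is just a convenient way to record which lines were executed):

```python
x = 7
messages = []

if x > 0:
    messages.append("positive")      # inside the if block
    if x % 2 == 0:
        messages.append("even")      # nested block: needs both conditions
    else:
        messages.append("odd")
messages.append("done")              # back at the outer level: always executed

print(messages)                      # ['positive', 'odd', 'done']
```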

In [7]:
x = 0
if x>0:
    print("positive number")
elif x<0:
    print("negative number")
else:
    print("x is zero")
x is zero

functions and routines¶

From the point of view of the Python language they are the same thing, but conceptually they are very different.

  • routines do something to the system, and usually don't return anything interesting (like printing).
  • functions create new objects from the inputs, but don't modify the inputs themselves

They are both referred to as functions, and can do both, but try to know which is which and plan accordingly.

And try to avoid mixing the two.
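The difference can be sketched with sorting, which Python offers in both flavours: the built-in sorted returns a new list, while the list method sort changes the list in place. The two wrappers sorted_copy and sort_in_place are hypothetical names used only for this illustration.

```python
def sorted_copy(values):
    """function: builds and returns a new list, the input is untouched"""
    return sorted(values)

def sort_in_place(values):
    """routine: modifies its input and returns nothing interesting"""
    values.sort()

data = [3, 1, 2]
new_data = sorted_copy(data)
print(data, new_data)       # [3, 1, 2] [1, 2, 3]

result = sort_in_place(data)
print(data, result)         # [1, 2, 3] None
```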

In [5]:
def get_sign_of_number(number_to_be_tested):
    sign = 1 if number_to_be_tested>0 else -1
    return sign

get_sign_of_number(-3)
Out[5]:
-1

An important concept related to functions is the concept of scoping.

When we execute a function we can use variables that exist only inside the function.

This allows us to write simpler code.

In [101]:
def my_adder(x):
    global y
    g  = 3
    result = x + y + g
    y += 1
    return result

y = 4
print(my_adder(1))
8
In [104]:
print(my_adder(1))
11
In [8]:
print(g)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-8-e1cdc681402c> in <module>()
----> 1 print(g)

NameError: name 'g' is not defined
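Without the global keyword, assigning to a name inside a function creates a new local variable, even when a variable with the same name exists outside. A minimal sketch with hypothetical names; it also shows the usual alternative to global, which is passing the value in and returning the result.

```python
counter = 10

def reset_counter():
    counter = 0            # a new local name: the outer counter is untouched
    return counter

inside = reset_counter()
print(inside, counter)     # 0 10

def increment(value):
    return value + 1       # no hidden state: input in, result out

counter = increment(counter)
print(counter)             # 11
```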

libraries¶

Libraries are collections of functions and classes that allow you to reuse well-written code (from you or from somebody else) in your new code.

A library can be imported in several ways.

Whatever you need to implement, it is usually worth checking whether there is already a library doing it for you.

In [9]:
import math
math.log(math.e)
Out[9]:
1.0
In [10]:
from math import log, e
log(e)
Out[10]:
1.0
In [11]:
from math import log as take_logarithm
take_logarithm(math.e)
Out[11]:
1.0

there is another way of importing functions from libraries:

from math import *

don't do this

every time you import *, a puppy dies.

don't kill the puppies.
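One concrete reason: a star import can silently replace names you were already using. The sketch below imports a single name explicitly to stand in for what from math import * would do behind your back (among many other names); the variables modular and shadowed are illustrative.

```python
modular = pow(2, 5, 7)        # built-in pow with three arguments: (2 ** 5) % 7

# the star import would bring in this name (and many others) without telling us
from math import pow

try:
    pow(2, 5, 7)              # math.pow only accepts two arguments
    shadowed = False
except TypeError:
    shadowed = True           # the built-in pow is no longer reachable by name

print(modular, shadowed)      # 4 True
```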

The most commonly imported libraries are usually imported with standard shorter names, and you will see most code using these conventions. Don't stray from conventions unless you know why you are doing it.

in scientific code, the most common ones are:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

DRY principle¶

DRY: Do not Repeat Yourself.

Repeating code (or copy-pasting it) can lead to severe errors:

  • dependencies on surrounding code
  • forgetting to change a variable name
  • visual clutter

Functions are the primary way of implementing DRY principles.

The rule of thumb is that if you are going to do it more than twice, write a function instead... Sometimes even if you are going to do it once!

In [15]:
import statistics

x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
x_med = statistics.median(x)
x_min = min(x)
x_max = max(x)
x_top = x_max - x_med
x_bot = x_med - x_min
x_ratio = x_top/x_bot
x_ratio
Out[15]:
1.25

This code is obscure and pollutes the namespace!

In [19]:
def interquantile_asymmetry(data):
    """calculate the asymmetry between min, max and median"""
    x_med = statistics.median(data)
    x_min = min(data)
    x_max = max(data)
    x_top = x_max - x_med
    x_bot = x_med - x_min
    x_ratio = x_top / x_bot
    return x_ratio

interquantile_asymmetry(x)
Out[19]:
1.25

iterations¶

Repeating an operation over and over, typically with a different input each time, or with an exit condition.

Typically they underpin the usage of words like: while, for/for each, until.

There are two main approaches to iteration: bounded and unbounded.

In the bounded kind of iteration, the program has an idea of how many iteration cycles it will need. It usually corresponds to saying: for every one of these objects, do this.

Unbounded iterations are more related to exploration. The program has no idea beforehand of how many steps might be necessary. It usually corresponds to saying: do this thing until something happens.

Unbounded iterations usually require some form of internal state, which is used to decide whether to continue based on what has been seen previously.

In Python, bounded iterations are written with a for, unbounded ones with a while.

It is important to notice that:

  • These are not clear-cut distinctions, and sometimes things can get murky
  • bounded does not mean finite
  • the bounds might be known to the program but not to you
  • Python often transforms unbounded loops into bounded ones under the hood
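As a sketch of "bounded does not mean finite": itertools.count from the standard library produces 0, 1, 2, ... forever, and a for loop over it only stops because of an explicit exit condition. The limit of 50 is arbitrary.

```python
import itertools

squares = []
for n in itertools.count():       # an endless sequence of integers
    if n * n > 50:
        break                     # the exit has to be provided explicitly
    squares.append(n * n)

print(squares)                    # [0, 1, 4, 9, 16, 25, 36, 49]
```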
In [20]:
years = [2010, 2011, 2012]
for year in years:
    print("The year is", year)
The year is 2010
The year is 2011
The year is 2012
In [107]:
import random
x = 0
while x<3:
    print("position=", x)
    step = random.choice([-1, 1])
    x = x + step
position= 0
position= 1
position= 2
position= 1
position= 0
position= 1
position= 2
position= 1
position= 2

Sometimes one needs to break out of a loop early.

This can be done using the break keyword, which terminates the loop immediately and completely.

Similarly, one can skip a single iteration of the loop with the continue keyword.

In [3]:
# the `break` keyword is useful to perform a do-while loop

import random
x = 0
while True:
    print("position=", x)
    step = random.choice([-1, 1])
    x = x + step
    if x>=3:
        break
position= 0
position= 1
position= 2
position= 1
position= 0
position= 1
position= 2
In [4]:
# continue can be used to skip data that we don't want to process

years = [2018, 2019, 2020, 2021, 2022, 2023]
for year in years:
    if year<2020:
        continue
    print(year)
2020
2021
2022
2023

data structures¶

These are containers that hold information and make it more or less easy to retrieve and manipulate.

Programming can be described as creating and modifying data structures with the previous concepts until the desired result is obtained.

There are a great number of data structures, and libraries introduce more of them into a language.

I will use the Python names for these concepts, but they can be found one way or another in basically all programming languages.

basics¶

  • strings
  • maps
  • list
  • tuples

more advanced¶

  • objects
  • enum

Fundamental structures¶

Strings¶

They contain text, and allow for some (relatively) complex manipulation of it.

strings can be enclosed in single quotes, double quotes, triple single and triple double.

"text to be used"
'text to be used'
'''text to be used'''
"""text to be used"""

Single and double quotes are identical, just choose the one you like the most.

triple quotes (both single and double) can span several lines

"""this string starts here
keeps going
and ends here"""

In [29]:
a = "hello WORLD"

"hello" in a
Out[29]:
True
In [30]:
a.title()
Out[30]:
'Hello World'
In [31]:
a.lower()
Out[31]:
'hello world'
In [32]:
a.upper()
Out[32]:
'HELLO WORLD'
In [33]:
a.startswith("hello")
Out[33]:
True
In [34]:
a.endswith("world")
Out[34]:
False
In [36]:
a.replace("hello", "ciao")
Out[36]:
'ciao WORLD'
In [109]:
pi = 3.141592653589793
"value of π: {:.3e}".format(pi)
Out[109]:
'value of π: 3.142e+00'
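From Python 3.6 onward the same format specification can be written inline with an f-string, which many find easier to read. A short sketch comparing the two spellings (the variable names are illustrative):

```python
pi = 3.141592653589793

with_format = "value of π: {:.3e}".format(pi)
with_fstring = f"value of π: {pi:.3e}"     # same specification, written inline

print(with_fstring)                         # value of π: 3.142e+00
two_decimals = f"{pi:.2f}"                  # fixed-point notation, 2 decimals
print(two_decimals)                         # 3.14
```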

special characters¶

by default, special characters that cannot be rendered directly are represented with \ followed by some letters or codes that uniquely identify them

  • \n newline
  • \t tabulation
  • \U0001f604 unicode point for smiley face
In [48]:
print('\U0001f604')
😄

Python can also recognize the full Unicode character name, if you want a more human-readable description of it.

In [55]:
print('\N{grinning face with smiling eyes}')
print('\N{GREEK SMALL LETTER ALPHA}')
😁
α
In [110]:
print("α")
α
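A short sketch of \n and \t in action. It also uses a raw string (prefix r), which is not covered above: in a raw string the backslash is kept as an ordinary character instead of starting a special sequence.

```python
table = "name\tage\nTom\t21"     # one tab per row, one newline between rows
print(table)

escaped = "a\nb"     # three characters: a, newline, b
raw = r"a\nb"        # four characters: a, backslash, n, b
print(len(escaped), len(raw))    # 3 4
```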

Indexing of objects¶

Several objects in python allow extracting subsequences by using the indexing operator [].

Each object specifies how it reacts to the specific index you pass it.

Strings, lists and most other iterable objects use 0-based indexing

In [56]:
a = "hello world"
a[0]
Out[56]:
'h'
In [57]:
a[-1]
Out[57]:
'd'

It is also possible to specify ranges, with a start and a stop position (by default the beginning and the end of the sequence).

In a range, the element at the stop position is excluded from the result.

In [58]:
a[:5]
Out[58]:
'hello'
In [59]:
a[-5:]
Out[59]:
'world'
In [60]:
a[:5] + a[5:] == a
Out[60]:
True

It is also possible to specify a step, which indicates whether some elements of the sequence have to be skipped.

In [61]:
a[::2]
Out[61]:
'hlowrd'
In [62]:
a[::-1]
Out[62]:
'dlrow olleh'
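The same start:stop:step notation works on lists (and on tuples). A minimal sketch with an illustrative list of letters:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

first_three = letters[:3]        # ['a', 'b', 'c'], index 3 is excluded
middle = letters[2:4]            # ['c', 'd']
every_other = letters[::2]       # ['a', 'c', 'e']
backwards = letters[::-1]        # ['f', 'e', 'd', 'c', 'b', 'a']

print(first_three, middle, every_other, backwards)
```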

Lists¶

Lists are collections of items all with some common characteristic.

[1, 2, 3]

is a list of integer numbers

['hello', 'world']

is a list of strings

['hello', 1, 'world']

is a list of generic python objects

Python will not enforce the kind of objects you can keep in a list, but you should in general be careful of the meaning of what you are doing.

Other programming languages can be more strict about it

The typical usage is to store objects on which you want to perform some specific operation.

Therefore, the only restriction is that this operation is allowed on the data contained.

In [12]:
data = ['hello', 1, 'world']

for datum in data:
    print(datum*2)
hellohello
2
worldworld
In [13]:
data = [1, 2, 3]
2 in data
Out[13]:
True

Tuples¶

They store heterogeneous data, where the order is relevant.

Sometimes you will see them referred to as immutable lists. This is an improper use.

A common use is to pack together multiple return values from a function.

In [4]:
info = ("enrico", "giampieri", "FIS/07")
In [17]:
info[0]
Out[17]:
'enrico'
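Since each position carries its own meaning, a tuple is often unpacked into separate variables in one go. The sketch below also shows that a tuple cannot be changed after creation; the variable names are illustrative, and the try/except only keeps the example running.

```python
info = ("enrico", "giampieri", "FIS/07")

# unpacking: one variable per position
name, surname, sector = info
print(name, surname, sector)

# item assignment is not allowed on tuples
try:
    info[0] = "claudia"
    modified = True
except TypeError:
    modified = False

print(modified)   # False
```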

For example, the function below returns the minimum and the maximum of its input at the same time.

In [71]:
def min_max(data):
    min_ = min(data)
    max_ = max(data)
    return (min_, max_)

data = [0, 1, 2, 3, 4, 5, 6]
min_, max_ = min_max(data)
print(min_, max_)
0 6

Their greatest limitation is their limited expressivity: each position has a meaning, but there is nothing directly expressing that meaning.

To correct this, Python introduced namedtuples (in the collections module), where the meanings are explicitly referenced by names.

In [72]:
from collections import namedtuple

Person = namedtuple("Person", ['age', 'name'])

person = Person(age=21, name="Tom")
print(person)
print(person.age)
print(person[0])
Person(age=21, name='Tom')
21
21

Even more information can be packed in using the NamedTuple factory from the typing module, from Python 3.6 onward.

This allows us to include a class description and information about the expected type of each field (the types are not enforced).

In [78]:
from typing import NamedTuple

class Person(NamedTuple):
    "a person with age and name"
    age: int
    name: str

person = Person(age=21, name="Tom")
print(person)
Person(age=21, name='Tom')
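To see that the declared types are only documentation, the sketch below builds a Person with deliberately wrong types: Python accepts it without complaint. The name odd_person is illustrative.

```python
from typing import NamedTuple

class Person(NamedTuple):
    "a person with age and name"
    age: int
    name: str

# nothing checks the annotations at run time
odd_person = Person(age="twenty-one", name=42)
print(odd_person)          # Person(age='twenty-one', name=42)
print(Person._fields)      # ('age', 'name')
```

External tools (type checkers) can use these annotations to spot such mistakes before the code is run.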

Dictionaries (maps)¶

Dictionaries encode relationships between two different groups of values.

A dictionary can also be seen as a list indexed by names instead of by position.

The paired elements are usually referred to as keys and values.

Values can be arbitrary Python objects.

Keys must be hashable objects, such as strings, numbers, booleans, or tuples (if all their elements are hashable).
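A small sketch of the hashability requirement: a tuple of numbers works as a key, while a list (which can be modified, and is therefore not hashable) is rejected. The dictionary and its content are invented for illustration.

```python
label_by_position = {(0, 0): "origin", (1, 0): "right"}   # tuple keys are fine
print(label_by_position[(1, 0)])     # right

try:
    broken = {[0, 0]: "origin"}      # a list cannot be used as a key
    list_key_accepted = True
except TypeError as error:
    list_key_accepted = False
    print("error:", error)
```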

In [79]:
office_by_name = {'Enrico': 'd012', 
                  'Claudia': 'd08'}
office_by_name = dict(Enrico='d012', Claudia='d08')
In [80]:
office_by_name.keys()
Out[80]:
dict_keys(['Enrico', 'Claudia'])
In [81]:
office_by_name.values()
Out[81]:
dict_values(['d012', 'd08'])
In [21]:
for person, office in office_by_name.items():
    print(person, office)
Enrico d012
Claudia d08
In [22]:
office_by_name['Enrico']
Out[22]:
'd012'
In [23]:
'Giulia' in office_by_name
Out[23]:
False
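Asking for a key that is not in the dictionary raises a KeyError. When a missing key is an expected situation, the get method returns a default value instead. A minimal sketch ('unknown' is an arbitrary default chosen for the example):

```python
office_by_name = {'Enrico': 'd012', 'Claudia': 'd08'}

try:
    office_by_name['Giulia']          # the key is missing
    found = True
except KeyError:
    found = False

office = office_by_name.get('Giulia', 'unknown')   # no error, default returned
print(found, office)                                # False unknown
```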
In [24]:
info_by_ID = {'001': {'name': 'Enrico', 
                      'office': 'd012'}, 
              '002': {'name': 'Claudia', 
                      'office': 'd08'}
             } 
info_by_ID['001']['office']
Out[24]:
'd012'

Objects¶

they store data and provide a layer to interact with these data.

We will not get into object-oriented programming right now.

In [85]:
class Particle:
    def __init__(self, name, mass, charge=0):
        self.name = name
        self.mass = mass
        self.charge = charge
        
    def invert_charge(self):
        self.charge = - self.charge
        
    def weight(self, g=9.8):
        return self.mass * g
    
my_particle = Particle("mystery particle", 1.4)
my_particle.weight()
Out[85]:
13.72

IPython¶

The main way we will interact with Python is the IPython shell.

ipython special pre-commands¶

  • %<magic function> use magic functions

  • %%<cell magic> use cell magic mode

  • !<function> call system programs (non-interactive)

  • /<function> activate autocalling

    /print "a"

  • ,<function> (comma) activate autocalling and autoquoting (divide by whitespace)

  • ;<function> (semicolon) activate autocalling and autoquoting (ignore whitespaces)

    ,my_function a b c # becomes my_function("a","b","c")
    ;my_function a b c # becomes my_function("a b c")

All of these should be used as the first character of the line, the only exception being !, which can follow the = assignment.

  • ? general help
  • <object>? help of an object
  • pattern? return all the variables in the namespace that match the pattern
In [87]:
%whos
Variable         Type                          Data/Info
--------------------------------------------------------
NamedTuple       NamedTupleMeta                <class 'typing.NamedTuple'>
Particle         type                          <class '__main__.Particle'>
Person           type                          <class '__main__.Person'>
a                str                           hello world
b                str                           2
data             list                          n=7
datum            str                           world
e                float                         2.718281828459045
log              builtin_function_or_method    <built-in function log>
math             module                        <module 'math' from '/hom<...>36m-x86_64-linux-gnu.so'>
max_             int                           6
min_             int                           0
min_max          function                      <function min_max at 0x7f66401528c8>
my_particle      Particle                      <__main__.Particle object at 0x7f662b7e7dd8>
namedtuple       function                      <function namedtuple at 0x7f665d969400>
np               module                        <module 'numpy' from '/ho<...>kages/numpy/__init__.py'>
office_by_name   dict                          n=2
person           Person                        Person(age=21, name='Tom')
pi               float                         3.141592653589793
random           module                        <module 'random' from '/h<...>lib/python3.6/random.py'>
step             int                           1
take_logarithm   builtin_function_or_method    <built-in function log>
text             list                          n=11
this             module                        <module 'this' from '/hom<...>3/lib/python3.6/this.py'>
x                int                           3
In [89]:
%whos list
Variable   Type    Data/Info
----------------------------
data       list    n=7
text       list    n=11
In [88]:
%who_ls int
Out[88]:
['max_', 'min_', 'step', 'x']
In [91]:
def myfun():
    import time
    time.sleep(0.1)
    
%timeit myfun()
100 ms ± 8.86 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
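%timeit only exists inside IPython. In a plain Python script, a similar measurement can be sketched with the timeit module of the standard library. The number of calls and repetitions below are arbitrary choices, and the sleep is shortened to keep the example quick.

```python
import time
import timeit

def myfun():
    time.sleep(0.01)

# call myfun 5 times per measurement, repeat the measurement 3 times
totals = timeit.repeat(myfun, number=5, repeat=3)

# each total covers 5 calls; keep the best one and divide
best_per_call = min(totals) / 5
print(f"{best_per_call * 1000:.1f} ms per call")
```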