=====================================
Introduction to Python for Scientists
=====================================

Stefan Schwarzer <sschwarzer@sschwarzer.com>

EuroSciPy 2012

Brussels, 2012-08-23


Tutorial content is at
http://sschwarzer.com/download/python_tutorial_euroscipy2012.zip


Overview
--------

* About this tutorial

* First steps in the interpreter

* Names

* Data types

* Conditional statements (`if ... elif ... else`)

* Loops (`for`, `while`)

* Exceptions (`try ... except ... else`, `try ... finally`)

* Modules

* Built-in functions

* Object-Oriented Programming (OOP)


About This Workshop
-------------------

* Introduction to the Python programming language

* No prior Python knowledge necessary

* ... but programming knowledge is assumed (ideally including
  object-oriented programming)

* Objective of this workshop: participants should be able to
  write (at least) small programs

* There's always too little time. :-/

  Consequences:

  - Only the most important concepts and libraries are explained.
  - Possibly there's not enough time to finish the exercises.
  - You don't need to understand every detail.
  - Please give me feedback about how it goes during the break (or
    even earlier)

* We use Python 2.x (better supported at this time, i. e. there are
  far more libraries working with Python 2.x)

* If you have questions during the exercises, please ask!

* Remarks:

  - Code embedded in the text is enclosed in backticks, for example
    `number = 1`.
  - (G)Vim users get syntax highlighting for this text file if they
    load ("source") `talk.vim`.
  - Most of the example code in this file is executed with the help of
    a small script `example.py`. This is in the zip file for this
    tutorial.


Important Literature / Links
----------------------------

* Python homepage: http://www.python.org/

* Official Python documentation

  - Homepage: http://docs.python.org/
  - Tutorial: http://docs.python.org/tutorial/
  - Library Reference: http://docs.python.org/library/
  - Language Reference: http://docs.python.org/reference/

* Python Style Guides

  - http://www.python.org/dev/peps/pep-0008/
  - http://www.python.org/dev/peps/pep-0257/

* Python Package Index: http://pypi.python.org/pypi

* Common Python mistakes:
  http://sschwarzer.com/download/robust_python_programs_europython2010.pdf

* Book "Learning Python"
  http://shop.oreilly.com/product/9780596158071.do
  describes Python 2.x *and* 3.x ! :-)


The Python Language
-------------------

* Allows compact, highly readable programs

* Is relatively easy to learn

* Very useful for scientific software, but also for

  - system administration
  - web applications
  - connecting different systems (a very good "glue language")

* Python supports these programming paradigms:

  - procedural
  - object-oriented
  - some functional programming elements (but not really a functional
    language like, say, Haskell)

* Pre-installed on many Linux systems; easily installable on
  Windows. There are installer packages ("distributions") especially
  for scientific use:

  - http://code.google.com/p/pythonxy/
  - http://www.enthought.com/products/epd.php


First Steps in the Interactive Interpreter
------------------------------------------

* The Python interpreter can be used interactively. Just start
  `python` on the command line.

* Interactive mode is excellent for trying things out.

* It's the best way to get a first impression of Python. Example:
.
    $ python
    Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 1
    1
    >>> 2 * (3 + 4)
    14
    >>> print u"Hello world!"
    Hello world!
    >>> 1 + "2"
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>> def output(arg):
    ...   print arg
    ...
    >>> output(7)
    7
    >>> help("print")
    The ``print`` statement
    ***********************

       print_stmt ::= "print" ([expression ("," expression)* [","]]
                      | ">>" expression [("," expression)+ [","]])

    ``print`` evaluates each expression in turn and writes the resulting
    object to standard output (see below).  If an object is not a string,
    it is first converted to a string using the rules for string
    [many more lines omitted]
    >>>
..
* Interactive mode is also handy for advanced Python users who want
  to try out modules (libraries). Example:
.
    $ python
    Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import urllib
    >>> url_file_obj = urllib.urlopen("http://sschwarzer.com")
    >>> data = url_file_obj.read()
    >>> url_file_obj.close()
    >>> data[:50]
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Stric'
..
* Exit the interpreter at the prompt `>>>` with ctrl-d on Linux/Unix,
  ctrl-z on Windows.


Highly Recommended: The Extended IPython Interpreter
----------------------------------------------------

* Available on Linux via the package system. IPython is part of
  the special scientific Python distributions.

* Tab completion
.
    In [8]: import math
    In [9]: math. <- (press <tab> right after typing the dot)
    math.__class__         math.asin              math.fsum
    math.__delattr__       math.asinh             math.gamma
    math.__dict__          math.atan              math.hypot
    math.__doc__           math.atan2             math.isinf
    math.__format__        math.atanh             math.isnan
    math.__getattribute__  math.ceil              math.ldexp
    math.__hash__          math.copysign          math.lgamma
    math.__init__          math.cos               math.log
    math.__name__          math.cosh              math.log10
    math.__new__           math.degrees           math.log1p
    math.__package__       math.e                 math.modf
    math.__reduce__        math.erf               math.pi
    math.__reduce_ex__     math.erfc              math.pow
    math.__repr__          math.exp               math.radians
    math.__setattr__       math.expm1             math.sin
    math.__sizeof__        math.fabs              math.sinh
    math.__str__           math.factorial         math.sqrt
    math.__subclasshook__  math.floor             math.tan
    math.acos              math.fmod              math.tanh
    math.acosh             math.frexp             math.trunc
..
* Command history: you can use the arrow keys to see previous
  commands, but you can also retrieve previous input and output with
  `_i<number>` and `_<number>`, respectively. Example:
.
    $ ipython
    Total number of aliases: 15
    Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
    Type "copyright", "credits" or "license" for more information.

    IPython 0.10.1 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object'. ?object also works, ?? prints more.
    In [1]: import math
    In [2]: math.sin(1.0)
    Out[2]: 0.8414709848078965
    In [3]: _i2
    Out[3]: u'math.sin(1.0)\n'
    In [4]: _2
    Out[4]: 0.8414709848078965
..
* Get help with `<item>?` or even the source code with `<item>??`:
.
    In [5]: import urllib
    In [6]: urllib.urlopen?
    Type:           function
    Base Class:     <type 'function'>
    String Form:    <function urlopen at 0x235db18>
    Namespace:      Interactive
    File:           /usr/lib/python2.7/urllib.py
    Definition:     urllib.urlopen(url, data=None, proxies=None)
    Docstring:
        Create a file-like object for the specified URL to read from.

    In [7]: urllib.urlopen??
    Type:           function
    Base Class:     <type 'function'>
    String Form:    <function urlopen at 0x235db18>
    Namespace:      Interactive
    File:           /usr/lib/python2.7/urllib.py
    Definition:     urllib.urlopen(url, data=None, proxies=None)
    Source:
    def urlopen(url, data=None, proxies=None):
        """Create a file-like object for the specified URL to read from."""
        from warnings import warnpy3k
        warnpy3k("urllib.urlopen() has been removed in Python 3.0 in "
                 "favor of urllib2.urlopen()", stacklevel=2)

        global _urlopener
        if proxies is not None:
            opener = FancyURLopener(proxies=proxies)
        elif not _urlopener:
            opener = FancyURLopener()
            _urlopener = opener
        else:
            opener = _urlopener
        if data is None:
            return opener.open(url)
        else:
            return opener.open(url, data)
..
  Getting the source code will only work for items which are
  programmed in pure Python, not in compiled languages like C or
  Fortran.

* So-called "magic" commands, prefixed with `%`. For example:

  - `%alias`: create a simple command to save keystrokes
  - `%edit`: edit previously input lines or a file from inside IPython
  - `%macro`: combine several input commands into one for easy use
  - `%pwd`, `%cd`: get or set current working directory
  - `%timeit`: benchmark some code

* Invocation with `ipython -pylab` provides a Matlab-like environment
  where NumPy (matrix calculations) and matplotlib (diagramming) are
  already loaded.


Identifiers (Names)
-------------------

* Names consist of A-Z, a-z, 0-9 and _

* First character must not be a digit.

* Names are case-sensitive; `hello` and `Hello` are different
  identifiers.

* Keywords *cannot* be used as names.

* Keywords are
.
    and         del         from        not         while
    as          elif        global      or          with
    assert      else        if          pass        yield
    break       except      import      print
    class       exec        in          raise
    continue    finally     is          return
    def         for         lambda      try
..
* Built-in functions *should* *not* be used as names; see
  http://docs.python.org/library/functions.html

* PEP 8 lists naming conventions (more on this below). Examples:

  - constants: `CAPITALS_WITH_UNDERSCORES`
  - classes: `OneClass`
  - modules: preferably `amodule`, else `a_module`
  - everything else: `my_cool_identifier`

  Use of underscores:

  - non-public name: `_internal_name`
  - "private" attribute in a class: `__totally_internal`
  - special method names: `__str__`


Datatypes
---------

* Simple datatypes

  - numbers (`int`, `long`, `float`, `complex`)
  - boolean values (`bool`)
  - strings (`str`, `unicode`)

* Container datatypes

  - lists (`list`)
  - tuples (`tuple`)
  - dictionaries = associative arrays = hashes (`dict`)
  - sets (`set`)

* Functions, classes, methods

* Files

* Modules

* Some other, more exotic types


Numbers
-------

* Examples
.
    print 2 + 3 * 7  # 23
    print 8 / 4      # 2
    # integer division: the result is rounded down to an integer
    print 9 / 2      # 4
    print 9. / 2     # 4.5
..
* There are `int` and `long` types, but you can usually ignore the
  difference. If necessary, `int`s are converted to `long`s
  implicitly.
.
    small = 2 ** 5
    print type(small), small
    large = 2 ** 1000
    print type(large), large
..
* `float` values correspond to C's `double` type with a precision
  of about 16 significant digits.
.
    print repr(1./7)
..
* `complex` values consist of real and imaginary part:
.
    z = (0+1j) ** 2
    print z, z.real, z.imag
..
* If different numeric datatypes are used in an expression, the values
  are implicitly converted to the "highest" type (`int` -> `long` ->
  `float` -> `complex`). Examples:
.
    print 1 + 2.1
    print 1 == 1.0
    print 1 == (1+0j)
..
* Interpreter playtime
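
* A small sketch of the division rules above, not part of the original
  examples. It's written so that it should run unchanged on Python 2.7
  and Python 3, which is why it uses the `//` operator and
  `print(...)` with a single argument. The variable names are made up
  for illustration.

```python
# `//` always rounds down (floor division). In Python 2, `/` behaves
# the same way if both operands are integers.
quotient = 9 // 2        # 4
negative = -9 // 2       # -5, not -4: rounded down, not towards zero
remainder = 9 % 2        # 1

# If one operand is a float, the result is a float.
exact = 9 / 2.0          # 4.5

print("%d %d %d %.1f" % (quotient, negative, remainder, exact))
```

  Writing `2.0` (or `9.`) is the simplest way to get the fractional
  result when both values would otherwise be integers.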


Boolean Values
--------------

* Boolean values are the constants `True` and `False`.
.
    t = (1 == 1)
    print type(t), t
    f = (1 == 0)
    print type(f), f
..
* Actually, `True` and `False` are integer values in disguise.
.
    two = 1 + True
    print type(two), two
    null = False * 9
    print type(null), null
..
  Warning: Such expressions can impair the readability of the code.
  Think a bit about whether you can use other code that's easier to
  understand.

* Boolean values can be combined with `or`, `and` and `not`
  (in order of increasing priority).
.
    a = 1
    b = 2
    print (a >= 1) and not (b < 2)
..
* `and` and `or` are "short-circuit operators". This means the
  second operand won't even be evaluated if the result is already
  known after evaluating the first one. In that case the result is
  the first operand, otherwise it's the second operand (so it's not
  necessarily `True` or `False`).

* More on `bool` values below
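
* The short-circuit behavior of `and` and `or` can be made visible
  with a helper that records which operands were evaluated. This is
  only a sketch; the helper `noisy` and the list `calls` are made up
  for illustration. The code should run on Python 2.7 and Python 3.

```python
calls = []

def noisy(value):
    """Record that `value` was evaluated, then return it unchanged."""
    calls.append(value)
    return value

# `or`: the first operand is false, so the second one decides.
name = noisy("") or noisy("anonymous")
# `and`: the first operand is false, so the second is never evaluated.
skipped = noisy(0) and noisy("never evaluated")

print(repr(name))     # 'anonymous'
print(repr(skipped))  # 0
print(calls)          # ['', 'anonymous', 0]
```

  Note that neither result is `True` or `False`; each is one of the
  operands.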


Strings
-------

* Python doesn't have a special type for single characters (unlike
  `char` in C).

* Strings are enclosed in quote characters:
.
  "test"

  'He said, "Hello!"'

  "That's it!"

  """A
    "multi-line"
      string"""

  '''And
      another one'''
..
* These "triple-quoted" strings can contain all other kinds of quotes,
  just not "themselves". So the following is *not* possible:
.
    """Triple-quoted strings """of the same kind""" can't
    be nested."""
..
* Quote characters can be "escaped" with backslashes if desired:
.
    print "He said, \"Hello!\""
..
  Usually this isn't necessary though.

* All strings so far are of type `str`. There are also unicode strings
  (type `unicode`).

* As in other languages, there are a few special character sequences
  ("escape sequences"):

    \"  escaped double quote
    \'  escaped single quote
    \n  line feed (newline)
    \t  tab
    \r  carriage return ("\r\n" is the usual line end string under
        Windows)
    \b  backspace

  Even characters which aren't on any keyboard can be expressed by
  their numerical codes:

    \xhh        byte with hexadecimal code hh
    \uhhhh      unicode character of code point hhhh
    \Uhhhhhhhh  ditto, with 8 hexadecimal digits

  See
  http://docs.python.org/reference/lexical_analysis.html#string-literals

* If there's an `r` before a quote character that introduces a string,
  the string is a "raw string". In it, escapes are ignored. Example:
.
    print """abc\ndef"""
    print
    print r"""abc\ndef"""
..
  The latter string actually doesn't contain a newline character but a
  `\` and an `n`.

* Strings can be compared with comparison operators like
  `==`, `!=`, `<` and `>=`.

* Interpreter playtime
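
* To see the difference between a normal and a raw string, it helps
  to compare their lengths. The following sketch isn't from the
  original examples; the variable names are arbitrary, and the code
  should run on Python 2.7 and Python 3.

```python
cooked = "abc\ndef"    # \n is a single newline character
raw = r"abc\ndef"      # backslash and "n" are two separate characters

cooked_length = len(cooked)
raw_length = len(raw)
print(cooked_length)   # 7
print(raw_length)      # 8

# An escaped backslash in a normal string gives the same result.
same = (raw == "abc\\ndef")
print(same)            # True

# Comparisons go by character code, so capital letters sort first.
capital_first = ("Zebra" < "apple")
print(capital_first)   # True
```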


The Basics of Basics about Unicode
----------------------------------

* The times of "plain text" are long gone. Without knowing the
  encoding of a text file, you can't possibly know which characters
  are in it.

* Unicode defines more than 100,000 characters and assigns a number
  ("code point") to each character. This is an abstract value and
  says nothing about the bytes used to store the character.

* To physically store or transfer characters, you have to "encode"
  them to a byte string. Common encodings are UTF-8, ISO-8859-1
  (Latin 1), ISO-8859-15 (Latin 9) and CP-1252 (on Windows).

* Thus: unicode --- encode ---> bytes
        bytes   --- decode ---> unicode

* Recommended links:

  - http://docs.python.org/howto/unicode.html
  - http://www.joelonsoftware.com/articles/Unicode.html
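
* A minimal sketch of why the encoding matters: the same character
  becomes different bytes in different encodings. The character and
  the variable names are chosen only for illustration. The `u"..."`
  literal is valid in Python 2 and again in Python 3.3 and later.

```python
# One abstract character: LATIN SMALL LETTER A WITH DIAERESIS,
# code point U+00E4.
text = u"\u00e4"

utf8_bytes = text.encode("UTF-8")         # two bytes: 0xC3 0xA4
latin1_bytes = text.encode("ISO-8859-1")  # one byte: 0xE4

print(len(text))          # 1 (character)
print(len(utf8_bytes))    # 2 (bytes)
print(len(latin1_bytes))  # 1 (byte)

# Decoding gives back the original text only with the right encoding.
round_trip_ok = (utf8_bytes.decode("UTF-8") == text)
wrong_guess_ok = (utf8_bytes.decode("ISO-8859-1") == text)
print(round_trip_ok)      # True
print(wrong_guess_ok)     # False
```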


Unicode Strings
---------------

* The strings discussed so far were byte strings (type `str`),
  sequences of bytes.

* In Python, there are also unicode strings (type `unicode`).

* To write a unicode literal, put a `u` in front of the leading quote
  character:
.
    byte_string = "byte string"
    print type(byte_string), byte_string
    unicode_string = u"unicode string"
    print type(unicode_string), unicode_string

    print

    # For the example's sake, assume the byte string contains
    # characters encoded in UTF-8.
    decoded_string = byte_string.decode("UTF-8")
    print type(decoded_string), decoded_string
    # Encode the unicode string to UTF-8.
    encoded_string = unicode_string.encode("UTF-8")
    print type(encoded_string), encoded_string
..
* If an expression contains both byte strings *and* unicode strings,
  byte strings are implicitly converted to unicode strings.
.
    s = "abc"
    u = u"def"
    print s < u
    together = s + u
    print type(together), together
..
* If you want to use non-ASCII characters in Python source code (e. g.
  in a unicode literal for the name of a person), you have to put a
  special comment in one of the first two lines of the file. Example:

.
    #! /usr/bin/env python
    # encoding: UTF-8
    ...
..
  If there's no such encoding comment, ASCII is assumed. In this case,
  you aren't allowed to use any non-ASCII characters in the file.

* *Remark* Python 3 always uses unicode for character strings and
  UTF-8 as default encoding for Python source code.


Byte Strings and Unicode Strings (Again)
----------------------------------------

* Accessing single characters
.
    uni = u"This is a string"
    print uni[0]    # Indices start with 0
    print uni[3]
    print uni[-1]   # Negative indices count from the end
    print uni[-3]
    print uni[100]  # Error
..
* Strings are "immutable":
.
    uni = u"test"
    uni[0] = u"r"   # Error
..
* Ranges of characters can be accessed with "slices":
.
    uni = u"This is a test."
    print uni[5:7]  # First index : last index plus 1.
    # You could combine the individual words of the sentence
    # (with trailing spaces or the dot) like this:
    print uni[0:5] + uni[5:8] + uni[8:10] + uni[10:len(uni)]

    print

    # If an index is omitted, the start or end of the string is assumed.
    print uni[:3]
    print uni[10:]
    print uni[:]    # A copy of the string
    print uni[:-2]  # The string without the last two characters
..
* Slices allow a step as a third value:
.
    print u"reversed"[::-1]
..
* Playtime. Suggestions:
.
    uni = u"abcdef"
    print 1, uni[2:2]
    print 2, uni[:100]
    print 3, uni[4:2]
    print 4, uni[:len(uni)]
    print 5, uni[:-len(uni)+1]
    print 6, uni[:-len(uni)]
    print 7, uni[:-100]
..

Lists
-----

* Lists can contain any number (including 0) of objects.
.
    L = []
    print "Empty list:", L
    L = [1, "abc", u"def", [9]]
    print "List with list:", L
..
* You can access individual elements in a list via indices.
  Since lists are mutable, you can also modify their contents.
  This isn't possible with strings.
.
    L = [1, 2, "abc"]
    print L[1]  # Indices count from 0.
    L[2] = 7
    print L     # [1, 2, 7]
..
* Slices work as with strings:
.
    L = range(10)      # The list [0, 1, 2, ..., 8, 9] (without 10!)
    print L[:3]        # [0, 1, 2]
    L[2:6] = range(3)  # Left and right side can have different lengths.
    print L            # [0, 1, 0, 1, 2, 6, 7, 8, 9]

    print

    print u"Again, in slow motion ..."
    L = range(10)
    print u"Original list:", L
    print u"Part to replace:", L[2:6]
    print u"New part:", range(3)
    L[2:6] = range(3)
    print u"Result:", L
..
* Use the `append` and `extend` methods to add elements to a list:
.
    L = [1, 2]
    print u"Original list:", L
    L.append(3)
    print L
    L.extend([4, 5])
    print L
..
* `insert` lets you insert elements "in the middle":
.
    L = [1, 2, 3]
    L.insert(1, 7)
    print L
..
* Lists are equal if they contain the same elements in the same order.
.
    L1 = [1, 2, 3]
    L2 = [1, 2]
    print L1 == L2
    L2 = [1, 3, 2]
    print L1 == L2
    L2 = [1, 2, 3.0]
    print L1 == L2
..
* Playtime. What happens here?
.
    L = range(8)
    print L[:]
    print L[:-2]
    print L[3:3]
    L[:] = [3, 2, 1]
    print L

    print

    L = range(8)
    L[3:2] = [-1, -2]
    print L
..

Tuples
------

* Tuples are very much like lists, but are immutable.
.
    t = (1, 2, 3)
    print t[1]
    print t[:2]
    t[1] = 7  # Error
..
* Empty and single-element tuples:
.
    empty_tuple = ()
    single_element_tuple = (1,)  # Use comma to distinguish from expression.
..
* "Tuple unpacking" is often used, either explicitly or implicitly.
  It also works with lists on the right hand side.
.
    a, b = 1, 2
    print a, b

    a, b = b, a
    print a, b

    greeting = u"Hello world!"
    first_word, second_word = greeting.split()
    print first_word
    print second_word
..

Dictionaries
------------

* Dictionaries contain key/value pairs.
.
    d = {}
    d = {2: "abc", 1: 2}
    print d
    print d[1]
    print d[2]
    print d[3]
..
  The pairs have *no* guaranteed order!

* Every key can only exist once in a dictionary. If you assign to the
  key again, the old value is overwritten with the new one.
.
    d = {2: "abc", 1: 2}
    print d[1]
    d[1] = "def"
    print d
..
* An assignment to a key which didn't exist before creates a new
  dictionary entry (key/value pair).
.
    d = {1: 2}
    d["a"] = 17
    print d
..
* Keys must be immutable.
.
    d = {}
    d[(1, 2)] = 1
    print d
    d[[1, 2]] = 2
..
* Dictionaries are equal if they contain the same key/value pairs.
  Note that "same" here means that the keys or values compare as
  equal.
.
    d1 = {1: 2, 3: 4}
    d2 = {3: 4.0, 1.0: 2}
    print d1 == d2
..
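* Because keys count as "the same" if they compare equal, `1` and
  `1.0` address the same dictionary entry. A short sketch, not from
  the original examples, that should run on Python 2.7 and Python 3:

```python
d = {1: "stored under the int 1"}
d[1.0] = "stored under the float 1.0"   # 1 == 1.0, so this overwrites

print(len(d))    # 1
print(d[1])      # stored under the float 1.0
```

  There's still only one entry afterwards; the assignment replaced
  the value instead of adding a second key.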

Sets
----

* Sets are similar to dictionaries.
.
    s = set([1, 2, 3])
    s = {1, 2, 3}  # Since Python 2.7
    print s        # This is *not* a list.
    s.add(4)
    print s
..
* Sets contain only values, no key/value pairs.

* There's *no* guaranteed order.

* Contained elements must be immutable.
.
    s = set()
    s.add((1, 2))
    print s
    s.add([1, 2])
..
* Sets can be used for set operations. Example: Which values are in
  set 1, but *not* in set 2?
.
    s1 = set(range(5))     # 0, 1, 2, 3, 4
    s2 = set(range(3, 8))  # 3, 4, 5, 6, 7
    print s1 - s2
    print s1.difference(s2)
..
  Which values are in set 1 *and* set 2?
.
    s1 = set(range(5))
    s2 = set(range(3, 8))
    print s1 & s2
    print s1.intersection(s2)
..
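* Two more set operations in the same style, union and symmetric
  difference. This is an additional sketch with made-up variable
  names; it should run on Python 2.7 and Python 3.

```python
s1 = set(range(5))       # 0, 1, 2, 3, 4
s2 = set(range(3, 8))    # 3, 4, 5, 6, 7

in_either = s1 | s2        # same as s1.union(s2)
in_exactly_one = s1 ^ s2   # same as s1.symmetric_difference(s2)

# Sets have no guaranteed order, so sort before printing.
print(sorted(in_either))        # [0, 1, 2, 3, 4, 5, 6, 7]
print(sorted(in_exactly_one))   # [0, 1, 2, 5, 6, 7]
```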

Names and Assignment
--------------------

* Assignment in Python does *not* copy objects (copies are made only
  if requested explicitly)!

* An assignment binds a name (left) to an object (right).

* Tip: The `is` operator tells if two objects are *identical*, not
  just if their values compare equal.
.
    x = 1.0
    y = x
    print x is y
    y = 1.0
    print x is y  # Result depends on Python implementation.
..
* The following code is legal (but not recommended):
.
    a = 1       # assign int
    a = "abc"   # assign str
    a = [1, 2]  # assign list
..
  Each of these assignments connects an object of a different type
  with the same name. The previously assigned object is discarded if
  it's not used anywhere else.


Names and Assignment - Immutable Objects
----------------------------------------

* Assignments
.
    x = 1.0
    y = x       # x and y refer to the same object.
    y = 1.0     # Create a new object 1.0 and let y point at it.
..
* After the first assignment:

  x ---> 1.0

* After the second assignment:

  x ---+
       +---> 1.0
  y ---+

* After the third assignment:

  x ---> 1.0  (the object created first)

  y ---> 1.0  (the object created in the third line)


Names and Assignment - Mutable Objects
--------------------------------------

* What happens here?
.
    L1 = [1, 2]
    L2 = L1
    L1.append(3)
    print L1
    print L2  # Also modified!
    print L1 is L2
..
* After the first assignment, `L1` is a name referring to the list
  `[1, 2]` *which is created before the assignment*.

  L1 ---> [1, 2]

* After the second assignment, the *single* list object is known under
  two names, `L1` and `L2`.

  L1 ---+
        +---> [1, 2]
  L2 ---+

* The `append` call modifies this single list.

* This "anomaly" isn't possible for immutable objects (like numbers or
  strings). After all, by definition, you can't modify them.

* Playtime. What happens here?
.
    L1 = [1, 2]
    L2 = L1
    t1 = (L1,)
    t2 = t1
    L1.append(3)
    print L1, L2
    print L1 is L2
    print t1, t2
    print t1 is t2
..
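* If you do want an independent object, you have to request a copy
  explicitly. The following sketch shows two common ways, a slice
  copy and `copy.deepcopy` from the standard library. The variable
  names are made up, and the code should run on Python 2.7 and
  Python 3.

```python
import copy

original = [1, [2, 3]]
alias = original                  # second name, same object
shallow = original[:]             # new outer list, same inner list
deep = copy.deepcopy(original)    # inner list is copied as well

original.append(4)
original[1].append(99)

print(alias)     # [1, [2, 3, 99], 4]
print(shallow)   # [1, [2, 3, 99]]
print(deep)      # [1, [2, 3]]
```

  The slice copy is enough for flat lists. For nested containers,
  only the deep copy is fully independent of the original.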

String Methods
--------------

* Some useful methods:
.
    s = u"This is a string"
    print s.count(u"i")
    print s.encode("UTF-8")  # See `decode`.
    print s.startswith(u"This")
    print s.endswith("ing")
    print s.lower()
    print s.upper()
    parts = s.split()
    print parts
    print u"*".join(parts)
    print u"abc\ndef\nghi".splitlines()
..
* Other useful operations on strings:
.
    print u"abc" + u"def"
    print 10 * "=-"

    s = u"This is a string"
    print len(s)
    print u"is" in s
..

String Formatting
-----------------

* Strings can be combined with each other and with other
  objects in rather flexible ways.

* Syntax: `string_object % object`

* Examples:
.
    print u"one: %d" % 1

    print u"%s %s" % (u"one", u"two")
    print u"%d plus %d is %d" % (2, 3, 2+3)
    print u"right-adjusted with two fractional digits: %10.2f" % (1./3)

    # Beware! There's an anomaly for tuples with one element:
    print u"an integer: %s"   % 3
    print u"a string: %s"     % u"string"
    print u"a list: %s"       % [1]
    print u"a dictionary: %s" % {1: 2}
    print u"a tuple: %s"      % (u"1",)
..
* Variant: named placeholders and a dictionary
.
    name = {"first_name": u"Stefan", "last_name": u"Schwarzer"}
    print u"%(first_name)s %(last_name)s" % name
    print u"%(last_name)s, %(first_name)s" % name
..
  See
  http://docs.python.org/library/stdtypes.html#string-formatting-operations
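
* The tuple anomaly mentioned above shows up as soon as the value to
  format is itself a tuple with more than one element. A minimal
  sketch (the name `point` is only an example) that should run on
  Python 2.7 and Python 3.3 or later:

```python
point = (1, 2)

try:
    # The tuple is taken as *two* arguments for *one* placeholder.
    text = u"point: %s" % point
except TypeError:
    text = None

# Wrapping the value in a one-element tuple avoids the problem.
safe_text = u"point: %s" % (point,)

print(text)        # None
print(safe_text)   # point: (1, 2)
```

  If you can't be sure what kind of object you're formatting, the
  one-element tuple is the safe choice.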


List Methods
------------

* Already seen: `append(value)` and `extend(iterable)`

* `iterable`: everything you can iterate over. This will be explained
  in more detail later. In `extend`, the `iterable` is usually a list.

* `sort` sorts a list "in place", i. e. the list itself is modified.
  The result (return value) is the special value `None`.
.
    L = [3, -1, 9, 7]
    print L.sort()  # This is really the value `None`, but `print`
                    # converts this to the string "None".
    print L
..

  `sort` takes an optional argument `key`. This is a function which is
  applied to all list elements. The list elements are sorted according
  to these function results. The list elements themselves don't change.
.
    L = ["def", "ABC", "Def", "abc"]
    L.sort(key=str.lower)
    print L
..
* `reverse` reverses the list order in-place and returns `None`.

* `index(value)` returns the index of the first element which has the
  value `value`.
.
    L = [1, 7, 3, 2]
    print L.index(3)
    print L.index(-1)
..
* Also useful:
.
    L = [1, 2, 3]
    del L[1]
    print L
    L = [1, 2, 3]
    del L[1:]
    print L

    print

    L = [1, 7, 3, 2]
    print len(L)

    L1 = [1, 2]
    L2 = [2, 3]
    print L1 + L2

    L = [0]
    print 10 * L
..
  *Beware*: When "multiplying" a list with an integer, the elements of
  the list are *not* copied. This means the resulting list contains
  references to the same objects several times.
.
    inner_list = [1, 2]
    outer_list = 5 * [inner_list]
    print outer_list
    inner_list.append(3)
    print outer_list
..
* Interpreter playtime
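
* One way around the "multiplication" pitfall is to create each inner
  list separately, for instance with a list comprehension (these are
  introduced in the section on loops). The following is a sketch with
  made-up names; it should run on Python 2.7 and Python 3.

```python
rows, columns = 2, 3

# One inner list, referenced twice.
shared = rows * [columns * [0]]
# A new inner list for each row.
independent = [columns * [0] for _ in range(rows)]

shared[0][0] = 1
independent[0][0] = 1

print(shared)        # [[1, 0, 0], [1, 0, 0]]
print(independent)   # [[1, 0, 0], [0, 0, 0]]
```

  `columns * [0]` itself is harmless because integers are immutable.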


Tuple Methods
-------------

Tuples have some of the methods of lists, but no methods that can
change the tuple. After all, tuples are immutable.


Dictionary Methods
------------------

* `keys` returns a list of all dictionary keys, `values`
  the values and `items` a list of `(key, value)` pairs. The orders of
  the methods' results are consistent *as long as the dictionary is
  not modified between the calls*.
.
    d = {"str": "abc", "int": 1, "list": [1, 2, 3]}
    print d
    print d.keys()
    print d.values()
    print d.items()

    print

    d["tuple"] = (2, 3, 4)
    print d
    print d.keys()
    print d.values()
    print d.items()
..
* `get(key, default)` returns the value for key `key` if it's in the
  dictionary, otherwise the method returns `default`. The second
  argument is optional; if it's omitted, the default is `None`.
.
    d = {1: 2}
    print d.get(1)
    print d.get(2)  # Not in dictionary
    print d.get(2, 3)
    print d
..
* `setdefault(key, default)` is very similar to `get`, but if the
  key doesn't exist yet, the default is inserted in the dictionary
  as this key's value. Again, the `default` argument is optional.
.
    d = {1: 2}
    print d.setdefault(1)
    print d.setdefault(2)  # Not in dictionary
    print d.setdefault(2, 3)
    print d
..
* `dict1.update(dict2)` updates the dictionary `dict1` with the
  key/value pairs of `dict2`. If a key exists in both dictionaries,
  the value from the second dictionary is used.
.
    d1 = {1: 2, 3: 4}
    d2 = {3: 17, 5: 6}
    d1.update(d2)
    print d1
..

* Also useful:
.
    d = {1: 2, 3: 4}
    print (3 in d)
    print (5 in d)

    print len(d)  # Number of keys/values/items, but not keys _and_
                  # values

    del d[1]
    print d
..
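* Typical uses of `get` and `setdefault` are counting and grouping.
  The following is only a sketch: the word list is invented and the
  `for` loop is explained later in this text. The code should run on
  Python 2.7 and Python 3.

```python
words = "the quick brown fox jumps over the lazy dog the end".split()

# Counting: `get` supplies 0 for words not seen before.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Grouping: `setdefault` inserts an empty list for a new key and
# returns the list stored under the key in any case.
by_length = {}
for word in words:
    by_length.setdefault(len(word), []).append(word)

print(counts["the"])    # 3
print(by_length[4])     # ['over', 'lazy']
print(by_length[5])     # ['quick', 'brown', 'jumps']
```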


Exercise
--------

What are the contents of `d` after running the following code?
.
    L = range(5)
    L[:-1] = range(10, 12)
    t = (1, 2, 3)
    d = {1: L, 2: t}
    L.append(u"Python is great".split()[0])
    L = 2 * L
    t.append(2)
..

Code Blocks
-----------

* Code blocks are *not* wrapped in `{ ... }` or `begin ... end`.

* Code blocks are created by indenting all statements in them by the
  same number of spaces. Example:
.
    for value in list_:
        if something:
            a = value + 1
            b = value + 2
        else:
            a = value + 3
            b = value + 4
        a_function(a, b)
..
  Note on `list_`: The trailing underscore avoids a clash with the
  built-in `list`. It's recommended not to introduce a name which is
  also the name of a built-in.

* The Python Style Guide (PEP 8) requires *four* *spaces* per
  indentation level.

* Most publicly available Python code is formatted according to this
  convention.

* Most programming editors automatically adapt their settings
  accordingly when they "notice" that a Python file is edited.
  Often, pressing the tab key then inserts four spaces, not a tab
  character.

* If your editor of choice doesn't make these changes automatically,
  configure it so that it does. :-)


Conditional Statements
----------------------

* The structure of the `if` statement is
.
    if condition1:
        block1
    elif condition2:
        block2
    elif condition3:
        block3
    else:
        block4
..
* The `elif` and `else` branches are optional. `elif` without
  `else` is also possible.

* The condition expressions don't need to be written in parentheses.

* After each condition and after `else` follows a colon.

* The condition expressions don't have to evaluate to a boolean
  value.


Boolean Values
--------------

* Not only `True` and `False` are usable as results of a condition
  expression.

* In Python, the following expressions are considered false:

  - `False`
  - `None`
  - all values with the numerical value 0: 0  0.0  0+0j
  - empty strings: ""  u""
  - empty containers: []  ()  {}  set()  frozenset()
  - for completeness:
    - objects of user-defined classes which define the `__nonzero__`
      method and return `False` from it.
    - objects of user-defined classes which define `__len__`, but
      not `__nonzero__` and calling `__len__` returns 0.

* All other values are considered true.

* If you're unsure, check with `bool(value)`:
.
    print bool([])
    print bool("")
    print bool(set())
    print bool(0)
    print
    print bool([1])
    print bool(1e-50)
..
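* These rules are often used to test whether a container is empty.
  The following sketch uses a made-up function `describe`; it should
  run on Python 2.7 and Python 3.

```python
def describe(items):
    """Return a short description; an empty container counts as false."""
    if not items:
        return "nothing"
    return "%d item(s)" % len(items)

print(describe([]))       # nothing
print(describe([0]))      # 1 item(s), because the list isn't empty
print(describe({}))       # nothing
print(describe("abc"))    # 3 item(s)

# Pitfall: 0 and None are both false. If 0 is a legitimate value,
# test for `None` explicitly.
value = 0
value_is_missing = value is None
print(value_is_missing)   # False
print(not value)          # True
```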

Statements Spanning More Than One Line
--------------------------------------

* "Usually" a Python statement ends at the end of the line.

* Exceptions:

  - The line ends with a backslash `\` (which is not part of a
    comment).
  - At the end of a line there are still "open brackets".

* Examples:
.
    a = 1    # One line, one statement

    b = 2 + \
        3

    c = 2 +  # This does *not* work
        3

    d = (2 +
         3)

    e = [1,
         2,
         3,  # (All containers allow a comma after the last element.)
        ]
..

Loops
-----

* `while` loops
.
    while condition:
        block1
    else:
        block2
..
  Write colons after the condition and `else`.

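  The `else` branch is optional and is rarely used. It's executed
  when the loop ends because the condition has become false, but not
  when the loop is left with `break`.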

* You can skip an iteration with the `continue` statement. When
  `continue` is executed, the condition at the top of the loop is
  evaluated again.

* You can leave a loop with the `break` statement.

* If you have nested loops (`while` and `for` loops), `continue` and
  `break` refer to the innermost loop that contains these statements.

* `for` loop:
.
    for value in iterable:
        block1
    else:
        block2
..
  Use colons after the iterable and the `else`.

  Again, the `else` part is optional and rarely seen.

* The `value` can be a tuple. In this case, each element of the
  iterable is "unpacked" so that it can be assigned to the names
  after the `for`.
.
    for first, second in [(1, "a"), (2, "b")]:
        print first
        print second
        print
..
* If you use `continue`, the loop continues with the next element
  of the iterable.

* Because the `for` loop is so flexible, you seldom need to work
  with indices. If you *do* need indices, for example to overwrite
  list elements, use the `enumerate` function:
.
    L = range(5)
    print L
    for index, value in enumerate(L):
        L[index] = value + 1
    print L
..
* Loops also come in the form of expressions, so-called "list
  comprehensions". This example creates a list of the squares
  of all even numbers between 0 and 10:
.
    squares = [i**2 for i in xrange(10+1) if i % 2 == 0]
    print squares
..
  You can nest `for` and `if` fragments in list comprehensions.
  However, a good rule of thumb is to limit this to

  - one `for` and one `if` fragment or
  - two `for` fragments.

  Otherwise, it's better to use explicit `for` and `if` statements, or your
  code will be difficult to understand for others (and maybe for
  you as well a few months later ;-) ).
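
The `else` branch of a loop is executed only if the loop ends without
running into a `break`. Here's a minimal sketch of how this might be
used for a search; the function name `first_negative` is made up for
this illustration.

```python
def first_negative(numbers):
    """Return a message about the first negative number in `numbers`."""
    for number in numbers:
        if number < 0:
            message = "found %d" % number
            break
    else:
        # Runs only if the loop finished without `break`. This includes
        # the case of an empty iterable.
        message = "no negative number"
    return message


print(first_negative([3, 1, -4, 1, -5]))
print(first_negative([3, 1, 4]))
print(first_negative([]))
```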


Exercise
--------

Write a script to calculate the factorial of a number. The argument
should be passed in via the command line.

Factorial `n!` definition:

    n! = 1 * 2 * 3 * ... * n
    0! = 1

The factorial function is undefined for negative values, so you
should check for this error condition.

Tip: You can get the first command line argument as an integer
with
.
    import sys

    n = int(sys.argv[1])
..

Here's a possible solution. I use `n` as a constant because I can't
use command line arguments with the editor invocation trick.
.
    n = 7

    if n < 0:
        print u"Argument %d is invalid!" % n
    else:
        factorial = 1
        # `range(1, 1)` is an empty list - therefore we don't need
        # special handling of the case n == 0 .
        for i in range(1, n+1):
            factorial = factorial * i
        print factorial
..
With small "optimizations":
.
    n = 7

    if n < 0:
        print u"Argument %d is invalid!" % n
    else:
        factorial = 1
        # `xrange` doesn't create a list but an iterable that needs
        # very little memory.
        for i in xrange(1, n+1):
            # So-called "augmented assignment"
            factorial *= i
        print factorial
..

Functions
---------

* Syntax
.
    def name(argument_list):
        block
..
* Probably the simplest function:
.
    def empty():
        # `pass` does nothing, but it's needed to have a block at all.
        pass

    print empty()
..
* With one argument:
.
    def add_one(n):
        return n + 1

    print add_one(2)
..
* With two arguments, one of them a default argument:
.
    def add(n, summand=1):
        return n + summand

    print add(2)
    print add(2, 4)
..
* The argument list can have an arbitrary number of arguments.

* Default arguments have to go at the end, so this is *invalid*:
.
    def add(summand=1, n):
        return n + summand

    print add(4, 2)
..
* Default arguments are evaluated *only once* - when the function is defined!
.
    def append_two(list_=[]):
        list_.append(2)
        return list_

    print append_two([1, 2, 3])
    print append_two()
    print append_two()
    print append_two()
..
* The `return` statement returns a value from the function. The return
  value can be a tuple. This way, you basically can return multiple
  values.

* A function can have more than one `return` statement. They can return
  values of different types, but usually it's better not to do this.

* If there's no `return` statement or it doesn't have a value, the
  returned value is `None`. However, it's probably better style to
  express your intention by writing `return None`.

* If you use names in the call, the order of the arguments can be
  different from the order in the function definition. Example:
.
    def my_append(list_, value):
        list_.append(value)
        return None

    L = [1, 2, 3]
    my_append(L, 4)
    print L

    L = [1, 2, 3]
    my_append(value=4, list_=L)
    print L
..
* A string immediately after the `def` line is a "docstring" and can
  be accessed programmatically.
.
    def stupid_function():
        """Return `None`.

        This function isn't really useful. It's only supposed to
        show how a docstring is defined and used.
        """
        return None

    print stupid_function.__doc__
..
* With the syntax `*args` and `**kwargs` it's possible to pass
  arbitrary positional and keyword arguments.
.
    def show_arguments(*args, **kwargs):
        """Show positional and keyword arguments."""
        print u"Positional arguments:", args
        print u"Keyword arguments:", kwargs

    show_arguments(1, 2, "abc", a=7, x=(3, 4), b=[1, 2])
..
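
Regarding the `append_two` example with the default argument `[]`: a
common way around the shared list is to use `None` as the default and
to create the list inside the function. This is a sketch of that idiom,
not the only possible approach.

```python
def append_two(list_=None):
    # `None` is only a marker for "no list passed in". The new list is
    # created inside the function, that is, on every call.
    if list_ is None:
        list_ = []
    list_.append(2)
    return list_


print(append_two([1, 2, 3]))
print(append_two())
print(append_two())
```

With this version, each call without an argument starts with a fresh,
empty list.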

Exercise
--------

Write a function to calculate a factorial. The function should accept
an optional string argument which is the error text shown if the
argument is less than zero.
.
    def factorial(n, error_text=u"argument %s is invalid"):
        """Return the factorial of `n`.

        If `error_text` is given, it must be a string for the
        error message if `n` is less than 0. The error string
        can contain a placeholder `%s` which is replaced by the
        passed-in value of `n`.
        """
        if n < 0:
            print error_text % n
            # "Real" error handling follows.
            return None
        factorial = 1
        for i in xrange(1, n+1):
            factorial *= i
        return factorial

    print factorial(0)
    print factorial(1)
    print factorial(7)
    print factorial(-1)
    print factorial(-1, error_text=u"can't calculate factorial of %s")
..

Exception Handling
------------------

* To write sensible programs, you need one more feature: exception
  handling.

* Without exception handling, this code would abort the program.
.
    fobj = open("/not_there")
..
* Syntax for exception handling:
.
    try:
        block1
    except exception_class1[, exception_object1]:
        block2
    except exception_class2[, exception_object2]:
        block3
    else:
        block4
..
* Syntactically, the part `, exception_object` is optional. You'll
  need it if you want to access attributes of the concrete exception
  object. For more on exception classes and objects see below.

* There must be at least one `except` block. There can be as many as
  you like.

* The `else` branch is optional. It's executed if *no* exception was
  raised in the `try` block.

* Example:
.
    try:
        fobj = open("/not_there")
    except IOError, exc:
        # `exc` isn't a string, but is converted to one implicitly.
        print u"Directory or file not accessible: %s" % exc
    else:
        data = fobj.read()
        fobj.close()
..
* Unconditional execution: With `try ... finally`, a code block is
  executed in any case, i. e. regardless of whether there's an
  exception or not.
.
    try:
        block1
    finally:
        block2
..
* Example:
.
    try:
        fobj = open("python_tutorial.txt")
    except IOError, exc:
        print u"Directory or file not accessible: %s" % exc
    else:
        # Only this is new ...
        try:
            data = fobj.read()
        finally:
            # Also executed if there's an error in `fobj.read`.
            print "Closing ..."
            fobj.close()
..
* To re-raise an exception after some error handling, use `raise`:
.
    try:
        fobj = open("/not_there")
    except IOError, exc:
        print u"Directory or file not accessible: %s" % exc
        raise
    else:
        data = fobj.read()
        fobj.close()
..
* You can also raise exceptions yourself:
.
    raise ValueError("invalid value")
..
* You can combine `try ... except` and `try ... finally` like this:
.
    try:
        ...
    except ...:
        ...
    except ...:
        ...
    finally:
        ...
..
* There's also
.
    try:
        ...
    except:  # No exception class.
        ...
..
  This catches (almost) all errors - including some you probably don't
  want to catch. ;-)
.
    try:
        fobj = opne(a_file)
    except:
        print u"File not found!"
..
  *This hides a `NameError` exception!*
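
As mentioned above, you need the exception object if you want to look
at its attributes. The following sketch reads `errno` and `strerror`
from an `IOError`. It uses the spelling `except IOError as exc`, which
Python 2.6 and later accept as an alternative to the comma form. The
file name is made up and assumed not to exist.

```python
import errno

try:
    fobj = open("/not_there/at_all.txt")
except IOError as exc:
    # `errno` and `strerror` are attributes of the exception object.
    if exc.errno == errno.ENOENT:
        reason = "missing"
    else:
        reason = "other (%s)" % exc.strerror
else:
    reason = None
    fobj.close()

print(reason)
```

Catching only `IOError` here means that a typo like `opne` would still
show up as a `NameError`.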


Exercise
--------

Rewrite the factorial function so that it uses exception handling.
In case of an invalid argument, a `ValueError` should be raised.
(I'll mention a better approach below.)
.
    def factorial(n, error_text=u"argument %s is invalid"):
        """Return the factorial of `n`.

        If `error_text` is given, it must be a string for the
        error message if `n` is less than 0. The error string
        can contain a placeholder `%s` which is replaced by the
        passed-in value of `n`.

        If the argument `n` is invalid, raise a `ValueError`.
        """
        if n < 0:
            raise ValueError(error_text % n)
        factorial = 1
        for i in xrange(1, n+1):
            factorial *= i
        return factorial

    print factorial(0)
    print factorial(1)
    print factorial(7)
    try:
        print factorial(-1)
    except ValueError:
        print u"Continue ..."
    print factorial(-1, error_text=u"can't calculate factorial of %s")
..

Custom Exception Classes
------------------------

* Normally you define custom exception classes so that another
  user of the code can handle them specially. Imagine everyone
  raised `ValueError`s everywhere. You wouldn't be able to tell
  which `ValueError` means what.

* Recipe for your own exception class:
.
    # usually `Error` at the end of the name
    class EuroSciPyLocationError(Exception):
        pass
..
* Use like:
.
    def euroscipy(location):
        ...
        # Hopefully not ...
        if is_burnt_down(location):
            raise EuroSciPyLocationError("invalid location")
..
  A caller could then react:
.
    try:
        euroscipy("ULB")
    except EuroSciPyLocationError:
        euroscipy("Atomium")
..
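
If a module defines several exception classes, it can help to derive
them from a common base class. Callers can then handle either one
specific error or all errors of the module. The class and function
names below are hypothetical and only serve this sketch.

```python
class ConferenceError(Exception):
    """Hypothetical base class for all errors in this example."""


class LocationError(ConferenceError):
    pass


class ScheduleError(ConferenceError):
    pass


def book(location):
    if location != "ULB":
        raise LocationError("invalid location: %s" % location)
    return "booked %s" % location


def try_booking(location):
    try:
        return book(location)
    except LocationError as exc:
        # Special handling for one kind of error
        return "location problem (%s)" % exc
    except ConferenceError:
        # Any other error derived from the base class
        return "other conference problem"


print(try_booking("ULB"))
print(try_booking("Atomium"))
```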

Modules
-------

* Modules are files containing Python code.

* The "normal" Python distribution contains several hundred modules.
  Many more are contained in the scientific Python distributions
  and in the Python Package Index (PyPI).

* A file ending in `.py` can be used both as a module and as a
  standalone program.

* Modules are loaded with the `import` statement. You can access the
  names in the module with `module_name.name`.
.
    import example

    # Modules can have docstrings, too.
    print example.__doc__
    print example.EXAMPLE_FILENAME
..
* Of course, there are usually other things in a module than just
  constants.

* `import` variations:
.
    from example import EXAMPLE_FILENAME

    print EXAMPLE_FILENAME
..
  This saves a bit of typing, but isn't recommended: The previously
  described form is better in that it shows the module each name comes
  from.

* There's also `from example import *`. This is even worse. Better
  use it only for shortcuts in Python's interactive mode.

* Modules can only be imported if they

  - are in the same directory as the importing module, or
  - are in one of the directories in Python's module search path
    (`sys.path`, can be extended with the `PYTHONPATH` environment
    variable).
.
    $ export PYTHONPATH=/home/me/python/my_modules
..
* Many Python programs have the code
.
    if __name__ == "__main__":
        do_something()
..
  at the end. Every module has a `__name__` attribute. If a module is
  imported, `__name__` refers to the name of the module. If the file
  is executed as a Python program, the name is "__main__". So the same
  file can be imported as a module (without running `do_something`) or
  be run as a script.

* As an extension of the module concept, there are "packages". Usually
  a package is a directory with a file `__init__.py` in it.
.
    package_directory
        __init__.py
        module1.py
        module2.py
..
  Modules from the package can be imported like this:
.
    import package_directory  # Implicitly imports __init__.py
    import package_directory.module1
    # or
    from package_directory import module1
    # or
    from package_directory import module1 as other_name
    # Better not (see above)
    from package_directory.module1 import name1, name2
..
* If you have more than one package containing a module with the
  same name, you can rename the modules to avoid naming conflicts.
.
    from package1 import module as package1_module
    from package2 import module as package2_module
..
* *Beware!* Make sure you do *not* import the same module from
  different directories in `sys.path`. This can lead to strange bugs.
  Example:
.
    $ pwd
    /home/schwa/test
    $ mkdir subdir
    $ export PYTHONPATH=/home/schwa/test
    $ touch subdir/__init__.py  # makes `subdir` a package (see above)
    $ echo "a = 1" > subdir/my_module.py
    $ cd subdir
    $ python
    ...
    >>> import my_module         # From current directory
    >>> my_module
    <module 'my_module' from 'my_module.py'>
    >>> import subdir.my_module  # from PYTHONPATH
    >>> subdir.my_module
    <module 'subdir.my_module' from '/home/schwa/test/subdir/my_module.pyc'>
    >>> my_module is subdir.my_module
    False
    >>> my_module.a
    1
    >>> subdir.my_module.a
    1
    >>> my_module.a = 2
    >>> my_module.a
    2
    >>> subdir.my_module.a
    1
..
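
The session above ends up with two module objects because the same
file was imported under two different names (`my_module` and
`subdir.my_module`). Python stores imported modules in `sys.modules`
under their names, so importing the *same name* again gives you the
same object. A small sketch with the standard module `math`; the
attribute `my_marker` is made up for the demonstration.

```python
import sys
import math
import math as math_again   # Second import of the same module name

# Both names refer to one and the same module object ...
print(math is math_again)
# ... which is the one stored in `sys.modules` under the name "math".
print(sys.modules["math"] is math)

# Consequently, a change made via one name is visible via the other.
math.my_marker = 42
print(math_again.my_marker)
```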

Exercise
--------

Move the factorial function into a module. Write another Python module
which imports the factorial module and uses the function.
.
    # encoding: UTF-8
    # factorial.py


    def factorial(n, error_text=u"argument %s is invalid"):
        """Return the factorial of `n`.

        If `error_text` is given, it must be a string for the
        error message if `n` is less than 0. The error string
        can contain a placeholder `%s` which is replaced by the
        passed-in value of `n`.

        If the argument `n` is invalid, raise a `ValueError`.
        """
        if n < 0:
            raise ValueError(error_text % n)
        factorial = 1
        for i in xrange(1, n+1):
            factorial *= i
        return factorial
..
.
    # encoding: UTF-8
    # factorial_test.py

    import factorial


    arguments = [0, 1, 2, 5, 10]
    expected_results = [1, 1, 2, 120, 3628800]

    for argument, expected in zip(arguments, expected_results):
        result = factorial.factorial(argument)
        if result != expected:
            print u"Error! %s != %s" % (result, expected)

    try:
        result = factorial.factorial(-1)
    except ValueError:
        pass
    else:
        print "Error! `factorial` should have raised a `ValueError`."
..

Important Modules
-----------------

* A list of the modules of a standard Python installation is at
  http://docs.python.org/modindex.html .

  The modules `sys` and `os` are used in almost every program.
  The remaining listed modules are sorted alphabetically.

* `sys`: system information, for example `sys.path`, `sys.modules`,
  `sys.argv`

* `os`: various file and process operations, for example, `os.listdir`
  returns a list of directory and file names in a directory.

  `os.path`: file operations, for example
  `os.path.join("dir1", "dir2", "file")` -> `"dir1/dir2/file"`
  with the appropriate separators. `os.path` is available as soon as
  you import `os`, so the latter is enough.

* `cgi`: simple tools for web interfaces. There are _lots_ of web
  frameworks, but still, `cgi` provides some helpers like `cgi.escape`
  to convert HTML special characters to their entity form ("<" ->
  "&lt;" etc.)

* `ConfigParser`: read and write ini style configuration files

* `csv`: support for comma separated values files

* `email`: parse and create e-mails, including support for attachments

* `fnmatch`, `glob`: filter directory or file names by pattern

* `bz2`, `gzip`, `tarfile`, `zipfile`, `zlib`: archive file support

* `logging`: very flexible logging framework

* `math`, `cmath`: mathematical operations

* `operator`: functions equivalent to operators; useful for list
  comprehensions and callback functions

* `pickle`, `shelve`: persistence for Python objects

* `profile`: profiler for runtime analysis

* `random`: random numbers

* `re`: regular expressions (see example `contents.py`)

* `shutil`: high-level directory/file operations

* `StringIO`: provide strings with a file interface

* `subprocess`: call external programs, including redirection of
  standard input, output, error

* `time`, `datetime`, `calendar`: date/time operations

* `threading`, `multiprocessing`: support for concurrency on
  thread and process level, respectively. I guess you'll hear
  a lot about concurrency at this conference. :-)

* `unittest`, `doctest`: frameworks for automated tests

* `xml`: different XML parsers; `ElementTree` is the easiest one most
  of the time
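
To give an impression of how some of these modules are used, here's a
short sketch with `os.path` and `fnmatch`. The directory and file names
are made up; nothing is read from or written to disk.

```python
import fnmatch
import os

path = os.path.join("dir1", "dir2", "data.csv")
print(path)
# Split the path into its parts again.
print(os.path.dirname(path))
print(os.path.basename(path))
print(os.path.splitext(path))

# Filter names by a shell-like pattern.
names = ["a.py", "b.txt", "c.py"]
python_files = [name for name in names if fnmatch.fnmatch(name, "*.py")]
print(python_files)
```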


Built-in Functions
------------------

* Apart from the functions and classes defined in modules, Python
  contains quite a few built-in functions that can be accessed without
  importing anything.

* `complex`, `float`, `int`, `str` and `unicode`, to "convert"
  objects to the respective type. (The original objects actually
  aren't changed.)
.
    print complex("17")
    print float("7.7")
    print int("17"), int("0xf", 16)
    print str(17), str(str), type(str(17))
    print unicode(17), unicode(unicode), type(unicode(17))
..
  `str` is also implicitly called by the `print` statement and when
  using `%s` in format strings.

  `int` can also convert to the `long` type. (Actually, the difference
  between `int` and `long` rarely matters because the two are
  converted to each other on the fly when necessary.)

  Besides `str` and `unicode`, `repr` also creates `str` objects.
  Contrary to the result of `str`, `repr` creates rather low-level
  representations more useful for debugging.
.
    print str("Test"), str("abc\ndef")
    print
    print repr("Test"), repr("abc\ndef")
..
* `range` returns a list of numerical values. The function takes up to
  three parameters. More information is at
  http://docs.python.org/library/functions.html#range .

* `xrange` accepts the same arguments, but returns an `xrange` object
  which doesn't calculate a list in advance. `xrange` is mostly used
  in loops.

* `len` returns the length of its argument. _For example_, this works
  for strings, lists and dictionaries (where the "length" is the
  number of key/value pairs).

* `open(name, mode)` opens a file. `name` is the file name (possibly
  with a directory part); `mode` is a mode like

  - `r`, the default, stands for reading, `w` for writing and `a`
    for appending.

  - A `b` after one of the previous letters opens the file in "binary
    mode". Use this when you process binary data like image files.

* `sorted`, applied to an iterable, returns a sorted list from the
  iterable. The argument is *not* modified. Like the `sort` method
  of lists, `sorted` can take a `key` argument.

* `getattr`, `setattr` and `delattr` are used to read, write and delete
  attributes of objects. `hasattr` checks if an object has an attribute.
  Interestingly, the names of the attributes are passed in as
  strings, so these functions can work on *calculated* attribute
  names.

* `globals` returns a dictionary with the names and values of the
  top-level module namespace.

* `locals` works similarly. It returns a dictionary for the local
  names and values (for example, within a function).

  Note: While you can modify `globals()` to change the actual module
  globals, this doesn't work with `locals()`.

* `zip` pair-wise combines two or more iterables. See
  http://docs.python.org/library/functions.html#zip .

* `raw_input` reads a string from standard input (stdin).

* `abs` calculates the absolute value of its argument.

* `max`, `min` and `sum` calculate the maximum, minimum and sum,
  respectively, of their argument. Many more mathematical operations
  are in the `math` and `cmath` modules (see above).

* More built-in functions are described under
  http://docs.python.org/library/functions.html .

* Playtime
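
A few things you might try during playtime. This is only a sketch with
made-up example values; it shows `sorted` with a `key` argument, `zip`
and `getattr` with a calculated attribute name.

```python
words = ["pear", "Apple", "fig"]
# `sorted` returns a new list; `words` keeps its order.
by_length = sorted(words, key=len)
ignoring_case = sorted(words, key=str.lower)
print(by_length)
print(ignoring_case)
print(words)

# `zip` pairs up elements. The `list(...)` around it doesn't hurt in
# Python 2 and makes the result printable in later versions, too.
pairs = list(zip([1, 2, 3], "abc"))
print(pairs)

# Attribute name calculated at runtime
method_name = "up" + "per"
print(getattr("abc", method_name)())
print(hasattr("abc", "no_such_attribute"))
```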


Name Lookup
-----------

* When you use a name in the code, Python tries to find the name
  and the associated object in the following order. If the name is
  found, the later locations aren't tried.

  - L - local (for example, in a function which contains the name)
  - E - enclosing (in "enclosing" functions/methods; you can nest
        functions!)
  - G - global (in the module-global namespace)
  - B - built-in (in the builtin namespace)

* An example:
.
    # Module my_module.py

    G = 1                          # Module-global

    def outer_function(list_):
        E = list_                  # Local from the point of view of the
                                   # outer function, enclosing from the
                                   # point of view of the inner function
        def inner_function():
            L = 3                  # Local
            return G + L + len(E)  # `len` is found in the built-ins
        return inner_function
..
* If you want to *assign* to a module-global name from inside a
  function, you need to use the `global` statement (not to be confused
  with the `globals()` built-in function). Merely reading a global
  name works without it.
.
    # Module namespace

    value = 1

    def f():
        global value  # Without `global` the assignment would create a local
        value = 2     # name `value` and leave the module value untouched.

    f()
..
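
The local, enclosing and built-in levels of the lookup order can be
seen at work in a function that returns another function. This is a
minimal sketch; `make_scaler` and `scale` are names invented for the
illustration.

```python
def make_scaler(factor):
    # `factor` is local here and "enclosing" for `scale` below.
    def scale(values):
        # `total` and `value` are local names; `factor` is found in
        # the enclosing function; `len` is found in the built-ins.
        total = 0
        for value in values:
            total += value * factor
        return total, len(values)
    return scale


double = make_scaler(2)
triple = make_scaler(3)
print(double([1, 2, 3]))
print(triple([1, 2, 3]))
```

Each returned function remembers the `factor` of "its" call of
`make_scaler`.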

File Objects
------------

* File objects are created with the `open` function (see above).
  The following are the (in my opinion) most important methods of
  file objects.

* `__iter__` and `next` normally aren't used directly, but they're
  used implicitly to iterate over the lines of a file:
.
    fobj = open("example.py")
    try:
        for line in fobj:
            print line,
    finally:
        fobj.close()
..
  Note: A shorter and more modern way to write the above code is
.
    with open("example.py") as fobj:
        for line in fobj:
            print line,
..
* `close` closes a file. Note that you need to write `close()`; you
  have to call functions explicitly in Python.

* `readlines` reads all lines of the file into a list. The strings
  in the list contain a line ending character (`\n`).

* `read` without an argument reads the "rest" of the file and returns
  a corresponding string. With an argument, it reads *at most* as many
  bytes.

* `write` writes a string object into the file. This works only if the
  file was opened for writing (modes `w` or `a`).

* `writelines` is the "opposite" operation of `readlines` in that
  it writes a list of strings into a file. Despite the name of the
  method, you have to provide the line break characters yourself in
  the strings.
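
Here's a sketch that combines several of these methods. It writes a few
lines to a file in a temporary directory (so no existing file is
touched) and reads them back. The file name `lines.txt` is arbitrary.

```python
import os
import tempfile

# Use a temporary directory so that no existing file is overwritten.
directory = tempfile.mkdtemp()
path = os.path.join(directory, "lines.txt")

with open(path, "w") as fobj:
    fobj.write("first\n")
    # `writelines` doesn't add line endings itself.
    fobj.writelines(["second\n", "third\n"])

with open(path) as fobj:
    first_part = fobj.read(3)       # At most 3 characters
    rest_lines = fobj.readlines()   # Everything after that, as a list

print(repr(first_part))
print(rest_lines)

# Clean up.
os.remove(path)
os.rmdir(directory)
```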


Exercise
--------

Extend the outcome of the previous exercise by reading the arguments
line by line from a file. Write the results to another file.

So, if the input file looks like
.
0
1
5
7
..
you should generate a file containing
.
1
1
120
5040
..
A possible solution:
.
    # encoding: UTF-8

    import factorial


    # Files are automatically closed if you use the `with` statement
    # like this.
    with open("factorial_input.txt") as input_fobj:
        with open("factorial_output.txt", 'w') as output_fobj:
            for line in input_fobj:
                argument = int(line)
                result = factorial.factorial(argument)
                output_fobj.write("%d\n" % result)
..
In Python 2.7, you can combine the `with` statements like this:
.
    with open("factorial_input.txt") as input_fobj, \
         open("factorial_output.txt", 'w') as output_fobj:
        ...
..

Object-Oriented Programming (OOP)
---------------------------------

* Overview: Who in this room has developed object-oriented software?
  (For the purpose of this question, I don't only mean using
  classes but implementing them yourself.)

* Object-orientation is easier to show with an example than to
  describe. Nevertheless, here's an attempt.

* Objects are a tool for software design. Objects represent "things".

* Objects have "attributes", which represent the current state of an
  object. Objects usually also have "methods", which change or query
  the state of the object.

* In most object-oriented languages, as in Python, objects are
  created from a template, a "class".

* Example:
.
    class Person(object):
        """`Person` describes a person."""

        def __init__(self, first_name, last_name):
            self.first_name = first_name
            self.last_name = last_name

        def name(self):
            return "%s %s" % (self.first_name, self.last_name)

        def marry(self, new_last_name):
            self.last_name = new_last_name

        def __str__(self):
            # Special method that returns a string. This string is
            # used implicitly by the `print` statement.
            return self.name()


    william = Person("William", "Walker")
    print type(william)
    # `william.__class__` is `william`'s datatype.
    print william.__class__
    # `william.__class__.__name__` is a string.
    print william.__class__.__name__
    print
    # `print` prints the string returned by `name`.
    print william.name()
    # `print` implicitly calls `__str__` to print the object.
    print william
    william.marry("Smith")
    print william
..
Some important points ...

* In contrast to the usual OOP terminology, methods in Python are also
  attributes, just callable ones.

* All attributes are accessible and changeable. There are no
  "protected" or "private" attributes, only conventions (for example,
  prepending an "internal" attribute name with "_").
.
    class Person(object):
        """`Person` describes a person."""

        def __init__(self, first_name, last_name):
            self.first_name = first_name
            self.last_name = last_name

        def name(self):
            return "%s %s" % (self.first_name, self.last_name)

        def marry(self, new_last_name):
            self.last_name = new_last_name

        def __str__(self):
            # Special method that returns a string. This string is
            # used implicitly by the `print` statement.
            return self.name()


    william = Person("William", "Walker")
    william.last_name = "Potter"
    print william

    def anonymous_name():      # Attribute of the *object*, so no `self`
        return "<anonymous>"   # argument; it stays a plain function
    william.name = anonymous_name
    print william

    # It's also possible to change the class instead of the object.
    def anonymous_name(self):  # Method of *class*, with `self`;
        return "<anonymous>"   # "unbound method"
    Person.name = anonymous_name

    stefan = Person("Stefan", "Schwarzer")
    print stefan
..
  *Note:* That something is *possible* doesn't necessarily mean it
  should be done. :-) Also, it's rather the exception to replace
  methods in Python programs at runtime. More common, however, is the
  addition of non-callable attributes like numbers, strings or lists.

* Every method defined in a class has `self` as its first argument
  (with the exception of "class methods" and "static methods"). When
  the method is called, `self` is the instance it was called on.

* Like in other object-oriented languages you can use inheritance,
  that is, you create a "derived class" from a "base class" and add
  attributes (callable or not). Example:
.
    class Person(object):
        """`Person` describes a person."""

        def __init__(self, first_name, last_name):
            self.first_name = first_name
            self.last_name = last_name

        def name(self):
            return "%s %s" % (self.first_name, self.last_name)

        def marry(self, new_last_name):
            self.last_name = new_last_name

        def __str__(self):
            # Special method that returns a string. This string is
            # used implicitly by the `print` statement.
            return self.name()


    class Speaker(Person):

        def __init__(self, first_name, last_name):
            # Call base class constructor.
            super(Speaker, self).__init__(first_name, last_name)
            self.topic = "<unknown>"

        def give_talk(self, topic):
            self.topic = topic
            print "%s gives a talk on %s." % (self.name(), topic)

        def __str__(self):
            name = super(Speaker, self).__str__()
            return '%s (topic "%s")' % (name, self.topic)


    stefan = Speaker("Stefan", "Schwarzer")
    stefan.give_talk("Python")
    print stefan
..
* Inheritance should only be used if the objects of the derived class
  conceptually are also objects of the base class. For example,
  talking animals aside ;-) , `Speaker`s are always `Person`s.

  In other words, you should *not* inherit from a class just to be
  able to access the base class attributes. In that case, it's better
  to use "aggregation". That is, the new class gets an object of the
  existing class as an attribute.

  Example:
.
    class Tire(object):

        def pump_up(self):
            print "pumping up a tire ..."


    class Car(object):

        def __init__(self):
            self.tires = [Tire() for i in xrange(4)]

        def maintain(self):
            for tire in self.tires:
                tire.pump_up()


    car = Car()
    car.maintain()
..
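
The difference between "is a" (inheritance) and "has a" (aggregation)
can be checked with the built-in functions `isinstance` and
`issubclass`. The classes below are stripped-down variants of the ones
above, reduced to what this sketch needs.

```python
class Person(object):

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name


class Speaker(Person):
    pass


class Tire(object):
    pass


class Car(object):

    def __init__(self):
        self.tires = [Tire() for i in range(4)]


stefan = Speaker("Stefan", "Schwarzer")
car = Car()

# "Is a" relationship: a speaker is also a person.
print(isinstance(stefan, Speaker))
print(isinstance(stefan, Person))
print(issubclass(Speaker, Person))
# "Has a" relationship: a car has tires, but isn't one.
print(isinstance(car, Tire))
print(len(car.tires))
```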

Special Methods
---------------

* As already shown, classes can have special methods like `__init__`
  and `__str__` that influence the behavior of certain statements and
  functions in Python. Here are some more examples:

* `__call__` defines `obj(...)`.

* `__len__` defines `len(obj)`.

* `__getattr__`, `__setattr__` and `__delattr__` influence
.
    obj.attr            # calls __getattr__ if the normal lookup fails
    obj.attr = value    # calls __setattr__
    del obj.attr        # calls __delattr__
..
* `__getitem__`, `__setitem__`, `__delitem__` control indexing
  operations.
.
    obj[index]
    obj[index] = value
    del obj[index]
..
* `__contains__` defines `value in obj`.

* `__repr__` is called implicitly by the `repr` builtin function.
  It's also used when a format string contains a `%r` placeholder.

* `__unicode__` is called when applying the `unicode` builtin function
  to the object.

* `__del__` is called when an object is destroyed by Python's garbage
  collector.

* `__lt__`, `__le__` etc. control the result of comparison operators
  (here `<` and `<=`).

* `__add__`, `__sub__` and many more control mathematical operators
  like `+` and `-`.

* `__nonzero__` controls how an object is converted to a `bool`
  object, for example, for use in `if` and `while` conditions.

* An extensive list is at
  http://docs.python.org/reference/datamodel.html#special-method-names .
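
To see several special methods working together, here's a sketch of a
small class. `Polynomial` is a hypothetical example, not a class from
the standard library, and its `__add__` is deliberately simplistic.

```python
class Polynomial(object):
    """Hypothetical example class: polynomial with number coefficients.

    `coefficients[i]` is the coefficient of `x**i`.
    """

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)

    def __len__(self):
        return len(self.coefficients)

    def __getitem__(self, index):
        return self.coefficients[index]

    def __contains__(self, value):
        return value in self.coefficients

    def __call__(self, x):
        return sum(c * x**i for i, c in enumerate(self.coefficients))

    def __add__(self, other):
        # For brevity, this assumes both polynomials have the same length.
        pairs = zip(self.coefficients, other.coefficients)
        return Polynomial([a + b for a, b in pairs])

    def __repr__(self):
        return "Polynomial(%r)" % self.coefficients


p = Polynomial([1, 0, 2])    # 1 + 2*x**2
q = Polynomial([0, 3, 1])    # 3*x + x**2
print(len(p))       # __len__
print(p[2])         # __getitem__
print(0 in p)       # __contains__
print(p(3))         # __call__
print(p + q)        # __add__, then __repr__ (no __str__ defined)
```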


Exercise
--------

Derive from the `Person` class and add a method `walk`. This method is
supposed to set an instance variable `speed`. The output of `print person`
should contain the current speed of the person.

Test the class with the following code:
.
    william = MovablePerson("William", "Walker")
    print william.speed  # Should output 0
    william.walk(3)      # Should set the speed to 3 km/h
    print william.speed  # Should output 3
    print william        # Should contain the current speed
..
One possibility:
.
    class Person(object):
        """`Person` describes a person."""

        def __init__(self, first_name, last_name):
            self.first_name = first_name
            self.last_name = last_name

        def name(self):
            return "%s %s" % (self.first_name, self.last_name)

        def marry(self, new_last_name):
            self.last_name = new_last_name

        def __str__(self):
            # Special method that returns a string. This string is
            # used implicitly by the `print` statement.
            return self.name()


    class MovablePerson(Person):

        def __init__(self, first_name, last_name):
            super(MovablePerson, self).__init__(first_name, last_name)
            self.speed = 0.0

        def walk(self, speed):
            self.speed = float(speed)

        def __str__(self):
            name = super(MovablePerson, self).__str__()
            return "%s walks at %s km/h." % (name, self.speed)


    william = MovablePerson("William", "Walker")
    print william.speed
    william.walk(3)
    print william.speed
    print william
..

Exercise
--------

Use the module `urllib` to fetch any HTML page you want. Then use the
`re` module or string operations to find and print the links on the
page line by line. If you're unfamiliar with regular expressions, the
code in `contents.py` might help.

For this exercise, it's unimportant whether the links occur in HTML
`a` tags or not.

Use functions, classes and/or methods to structure the code so that
it's easy to understand.

Here are a few links to help you:

* `urllib`: http://docs.python.org/library/urllib.html#module-urllib

* `re`: http://docs.python.org/library/re.html#module-re

* valid characters in URLs: http://www.ietf.org/rfc/rfc2396.txt

  To simplify things a bit, assume that a link, after the leading
  "http://", contains only the following characters:

  - Uppercase and lowercase letters from the ASCII characters
  - Digits
  - `;  /  ?  :  @  &  =  + $  , -  _  .  !  ~  *  '  (  ) %`

*Optional* enhancements to make the exercise more interesting:

* Sort the URLs alphabetically before printing them.

* Remove duplicate URLs, so no link is printed twice.

* Pass in the starting URL on the command line.

A possible solution:
.
    import os
    import re
    import string
    import sys
    import urllib


    URL = "http://sschwarzer.com/publications"
    VALID_URL_CHARS = string.ascii_letters + string.digits + \
                      ";/?:@&=+$,-_.!~*'()%"

    def url_regex():
        """Return a regular expression for matching URLs."""
        return re.compile(r"http://[%s]+" % re.escape(VALID_URL_CHARS))

    def find_urls(text):
        """Return a list of URLs contained in the text `text`."""
        regex = url_regex()
        return regex.findall(text)

    def get_text(url):
        """Return the text found at the URL `url`."""
        fobj = urllib.urlopen(url)
        try:
            text = fobj.read()
        finally:
            fobj.close()
        return text

    def print_urls(url_list):
        """Print each of the URLs in the list `url_list` to stdout."""
        # Remove duplicates.
        url_list = list(set(url_list))
        for url in sorted(url_list):
            print url

    def main(url):
        """Print links in URL `url`."""
        text = get_text(url)
        urls = find_urls(text)
        print_urls(urls)


    if __name__ == '__main__':
        # Make this runnable via `example.py`. Remove the following two
        # lines to read the URL from the command line instead.
        main(URL)
        sys.exit()

        # Get URL from command line.
        try:
            url = sys.argv[1]
        except IndexError:
            # `sys.argv[0]` contains the script name.
            print "Usage: %s <url>" % os.path.basename(sys.argv[0])
            sys.exit(1)
        main(url)
..
In practice you probably wouldn't use this many small functions. The
approach shown is only meant to give you an idea of which functions
could be extracted.
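
To try the regular expression and two of the optional enhancements
(sorting and removing duplicates) without a network connection, here's
a minimal sketch that applies the same pattern to a made-up HTML
snippet. The names `SAMPLE_HTML`, `URL_REGEX` and `unique_sorted_urls`
are invented for this illustration. The code should run under both
Python 2.x and Python 3.

```python
import re
import string

# Same character set as in the exercise description.
VALID_URL_CHARS = string.ascii_letters + string.digits + \
                  ";/?:@&=+$,-_.!~*'()%"
URL_REGEX = re.compile(r"http://[%s]+" % re.escape(VALID_URL_CHARS))

# Made-up test data; one link occurs twice.
SAMPLE_HTML = """
<a href="http://www.python.org/">Python</a>
<p>Docs: http://docs.python.org/library/re.html#module-re</p>
<a href="http://www.python.org/">Python again</a>
"""

def unique_sorted_urls(text):
    """Return the URLs in `text`, sorted and without duplicates."""
    return sorted(set(URL_REGEX.findall(text)))


urls = unique_sorted_urls(SAMPLE_HTML)
for url in urls:
    print(url)
```

Note that `#` isn't in the simplified list of valid characters, so the
second URL is cut off before `#module-re`.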


Python "Philosophy"
-------------------

* Get some tips with
.
    import this
..
* EAFP instead of LBYL (EAFP = "it's easier to ask for forgiveness
  than permission", LBYL = "look before you leap")

  Example:
.
    # LBYL
    if os.path.exists(filename):
        fobj = open(filename)
        ...

    # EAFP (typical Python)
    try:
        fobj = open(filename)
        ...
    except IOError:
        ...
..
* Duck Typing ("If it walks like a duck, swims like a duck and quacks
  like a duck, it must be a duck.")

  This means that two objects with similar interfaces don't need to
  be objects of classes derived from the same base class. Here's an
  example:

  First a "Java-like" approach:
.
    class A(object):
        def walk(self, speed):
            raise NotImplementedError("`walk` is undefined")

    class B(A):
        def walk(self, speed):
            ...

    class C(A):
        def walk(self, speed):
            ...
..
  Python-typical (Duck Typing):
.
    class B(object):
        def walk(self, speed):
            ...

    class C(object):
        def walk(self, speed):
            ...
..
  Well-known examples in the Python standard library are the classes
  `file` and `StringIO`. These have much in common, but don't derive
  from the same base class (if you don't count `object`).
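
Both ideas, EAFP and duck typing, can be combined in a small sketch.
The function names `count_lines` and `count_lines_in_file` are made up
for this illustration, and `io.StringIO` is used instead of the
`StringIO` module so that the code should run under both Python 2.x
and Python 3.

```python
import io
import os
import tempfile


def count_lines(fobj):
    """Count the lines of any object that yields lines on iteration.

    Duck typing: there's no type check, so real files and
    `io.StringIO` objects are both accepted.
    """
    return sum(1 for _line in fobj)


def count_lines_in_file(filename):
    """Return the number of lines in the file `filename` or `None`
    if the file can't be opened.

    EAFP: just try to open the file and handle the failure.
    """
    try:
        fobj = open(filename)
    except IOError:
        return None
    try:
        return count_lines(fobj)
    finally:
        fobj.close()


# A file-like object in memory ...
in_memory = io.StringIO(u"first\nsecond\nthird\n")
memory_count = count_lines(in_memory)

# ... and a real (temporary) file with the same content.
handle, path = tempfile.mkstemp()
os.close(handle)
try:
    fobj = open(path, "w")
    try:
        fobj.write("first\nsecond\nthird\n")
    finally:
        fobj.close()
    file_count = count_lines_in_file(path)
finally:
    os.remove(path)

# The temporary file is gone now, so opening it fails.
missing_count = count_lines_in_file(path)

print(memory_count)
print(file_count)
print(missing_count)
```

A LBYL check with `os.path.exists` would leave a small time window in
which the file could vanish between the check and the `open` call; the
`try ... except` version doesn't have this problem.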


Outlook
-------

Some topics not discussed in this tutorial:

* Multiple inheritance and mixins
.
    class A(object):
        pass

    class B(object):
        pass

    class C(A, B):
        pass
..
  - http://www.python.org/download/releases/2.2.3/descrintro/

* Generators and generator expressions
.
    def even_numbers(max_number):
        for i in xrange(0, max_number+1, 2):
            yield i

    print type(even_numbers)
    print type(even_numbers(10))

    for i in even_numbers(10):
        print i
..
.
    even_numbers = (i for i in xrange(0, 10, 2))
    print even_numbers
    print type(even_numbers)
    print list(even_numbers)
..
  - http://docs.python.org/glossary.html#term-generator
  - http://docs.python.org/library/itertools.html#module-itertools

* Decorators
.
    def print_name(func):
        def new_func(*args, **kwargs):
            print func.__name__
            return func(*args, **kwargs)
        # Better use `functools.wraps`.
        new_func.__name__ = func.__name__
        new_func.__doc__ = func.__doc__
        return new_func

    @print_name
    def example_func(max_n):
        return range(max_n)

    print example_func(5)
..
  - http://docs.python.org/glossary.html#term-decorator
  - http://docs.python.org/reference/compound_stmts.html#function
  - http://docs.python.org/library/functools.html#module-functools

* Properties
.
    class A(object):

        def __init__(self):
            self.__x = None

        def _get_x(self):
            print "Get x"
            return self.__x

        def _set_x(self, value):
            print "Set x"
            self.__x = value

        x = property(_get_x, _set_x)


    a = A()
    print a.x
    a.x = 7
    print a.x
..
  - http://www.python.org/download/releases/2.2.3/descrintro/
  - http://docs.python.org/reference/datamodel.html#implementing-descriptors

* Metaclasses (rarely used in production code)
.
    class A(type):

        def __new__(new_class, name, bases, dict_):
            special = ['__init__', '__module__', '__metaclass__']
            for key in dict_:
                if key in special:
                    continue
                if not key.startswith("example_"):
                    raise TypeError(
                      "attribute name '%s' doesn't start with \"example_\"" %
                      key)
            return type.__new__(new_class, name, bases, dict_)


    class B(object):

        __metaclass__ = A

        def __init__(self):
            print "We're fine."

        def example_test(self):
            print "Example"

    print "Everything's fine."


    # Defining `C` raises a `TypeError` because the name `forbidden`
    # doesn't start with "example_".
    class C(B):
        def forbidden(self):
            pass
..
  - http://docs.python.org/glossary.html#term-metaclass
  - http://docs.python.org/reference/datamodel.html#metaclasses

* Moreover

  - Descriptors
    http://www.python.org/download/releases/2.2.3/descrintro/

  - Coroutines
    http://docs.python.org/reference/expressions.html#yield-expressions
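
As a follow-up to the decorator example above, here's a sketch of the
same `print_name` decorator using `functools.wraps` instead of copying
`__name__` and `__doc__` by hand. The docstring of `example_func` is
added for this illustration, and `list(range(...))` is used so that
the code should behave the same under Python 2.x and Python 3.

```python
import functools


def print_name(func):
    # `functools.wraps` copies `__name__`, `__doc__` and some other
    # attributes from `func` to `new_func`.
    @functools.wraps(func)
    def new_func(*args, **kwargs):
        print(func.__name__)
        return func(*args, **kwargs)
    return new_func


@print_name
def example_func(max_n):
    """Return a list of the integers from 0 to `max_n` - 1."""
    return list(range(max_n))


result = example_func(5)
print(result)
print(example_func.__name__)
```

Without `functools.wraps` (or the manual copying),
`example_func.__name__` would be `"new_func"`, which makes tracebacks
and interactive help harder to read.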


Appendix: Execute Code Snippets from GVim
-----------------------------------------

* The following macro reaches from the `.` to the `..` (so it
  contains two empty lines at the end).
.
?^\.$
jV/^\.\.
k:w !./example.py


..
* Select this area and press `"ey` to save the text as macro "e".

* If

  - `example.py` is in the current directory,
  - `gnome-terminal` is installed and
  - it has a profile "Talk",

  you can execute a snippet by pressing `@e` when the cursor is
  between the corresponding `.` and `..`.

  If you don't have `gnome-terminal` and/or a `Talk` profile, you can
  edit `example.py` to match your environment.