PYTHONHOLICS: March 2017

Sunday, March 19, 2017

Multiple Assignment with Dictionaries

Combining built in function items to transform dictionary into a list of tuples, tuple assignment and for loop you can create nice code pattern for traversing the keys and values of a dictionary in a single loop. Using the code from lecture Dictionaries and Tuples.

d = {'a':10, 'b':20, 'c': 30}
l = list()
for key,val in d.items():
    print key, val
    l.append((key,val))
print "l = " + str(l)
l.sort(reverse = True)
print "List of tuples after applying sort()function"
print "l = " + str(l)

Output of previous code is given below.

a 10
c 30
b 20
l = [('a', 10), ('c', 30), ('b', 20)]
List of tuples after applying sort()function
l = [('c', 30), ('b', 20), ('a', 10)]

The for loop has two iteration variable because items returns a list of tuples and key, val is a tuple assignment that successively iterates through each of the key/value pairs in the dictionary. For each iteration through the loop, both key and value are advanced to the next key/value pair in the dictionary.

Dictionaries and Tuples

Dictionaries have a method called items that returns a list of tuples, where each tuple is a key-value pair.

d = {'a':10, 'b':20, 'c': 30}
t = d.items()
print t
[('a', 10), ('c', 30), ('b', 20)]

As you can see from previous example, the items are in no particular order. So in this example we’ve started from defining a dictionary and then using built in function itmes() transformed the dictionary in into a tuple. Since the list of tuples is a list we can apply sort function on a list of tuples. Converting a dictionary to a list of tuples is a way to output the contents of a dictionary sorted by key.

d = {'a':10, 'b':20, 'c': 30}
t = d.items()
print "t = " + str(t)
t.sort()
print "list of tuples using sort() function"
print " t = " + str(t)

The output of previous example is given below.

t = [('a', 10), ('c', 30), ('b', 20)]
list of tuples using sort() function
t = [('a', 10), ('b', 20), ('c', 30)]

The new list is sorted in ascending order by the key value.

Comparing Tuples

Comparison operator work with tuples and other sequences. For example let’s define two tuples.

>>> t1 = (0, 1, 2)
>>>t2 = (0, 4, 5)
>>> t1 < t2 
True 
>>>t1 > t2
False

When we type t1 < t2 the Python starts comparing the first element from each sequence. If the first element is equal in both sequences the Python is comparing next sequence elements until it finds those elements that differ. So in first statement t1 < t2 Python starts by comparing first element of t1 and first element of t2. Since they are both equal the Python is comparing the second element of t1 (1) and t2 (4). The 1 is less than 4 so that’s True. The final comparison is the third element of t1 (2) and t2 (5) and the answer is True which means that in deed the t1 is less than t2. The second statement t1 > t2 is False because elements of tuple t1 are smaller than elements in t2. The sort function can be also applied on tuples. It sorts primarily by first element, but in the case of a tie, it sorts by second element. This feature lends itself to a pattern called DSU and DSU stands for:

DECORATE – sequence by building a list of tuples with one or more sort key preceding the elements from the sequence,

SORT – the list of tuples using Python built-in sort, and

UNDECORATE – by extracting the sorted elements of the sequence.

Try typing the following code.

txt = "Python is a programming language that lets you work quickly  and integrate systems more effectively." 
words = txt.split()
t = list()
for word in words: 
    t.append((len(word),word))
print 't = ' + str(t)
t.sort(reverse=True)
res = list()
for length, word in t: 
    res.append(word)

print res

Output is given below.

t = [(6, 'Python'), (2, 'is'), (1, 'a'), (11, 'programming'), (8, 'language'), (4, 'that'),

(4, 'lets'), (3, 'you'), (4, 'work'), (7, 'quickly'), (3, 'and'), (9, 'integrate'),

(7, 'systems'), (4, 'more'), (12, 'effectively.')]
['effectively.', 'programming', 'integrate', 'language', 'systems', 'quickly',

'Python', 'work', 'that', 'more', 'lets', 'you', 'and', 'is', 'a']

The first loop builds a list of tuples, where each tuple is preceded by its length. Sort compare the first element, length, first and only considers the second element to break ties. The keyword argument reverse = True tells sort to go in decreasing order.

Tuple assignment

Tuples have unique syntactic features in Python programming language and that is the ability to have a tuple on the left hand side of an assignment statement. This allows you to assign more than one variable at a time when the left hand side is a sequence. In this example we have a two element list (this is a sequence) and assign the first and second elements of the list to variables x and y in a single statement.

a = ['Python', 'language']
x,y = a 
print x 
print y

Output of previous script is given below.

Python 
language

Python translates the tuple assignment to be the following:

a = ['Python', 'language']
x = a[0]
y = a[1]
print x 
print y

Output is the same as in previous example.When we use a tuple on the left hand side of the assignment statement, we omit the parentheses, but the following is an equally valid syntax:

a = ['Python', 'language']
(x,y) = a
print x 
print y

A particularly clever application of tuple assignment allows us to swap the values of two variables in a single statement.

x,y = y,x

In previous example both statements are tuples, but the left side is a tuple of variables while the right side is a tuple of expressions. Each value on the right side is assigned to its respective variable on the left side. All the expressions on the right side are evaluated before any of the assignments.

The number of variables on the left and the number of values on the right have to be the same otherwise you’ll get ValueError.

a, b = 1,2,3
Traceback (most recent call last):
  File "", line 1, in 
ValueError: too many values to unpack

Introduction to Tuples

Tuple is a sequence of values much like a list. The values stored in a tuple can be any type, and they are indexed by integers. The important difference is that tuples are immutable. Tuples are also comparable and hashable so we can sort lists of them and use tuples as key values in Python dictionaries.

Syntactically, a tuple is a comma separated list of values:

>>> t ='a','b', 'c', 'd','e'
>>> t
('a', 'b', 'c', 'd', 'e')
>>> type(t)
< type 'tuple'>

To create a tuple with a single element, you have to include comma after final element.

>>> t1 = ('a',)
>>> print t1
('a',)
>>> type(t1)
< type 'tuple'>

Without that final comma in parenthesis the Python would treat this declaration as a string.

>>>t1 = ('a')
>>>type(t1)
< type 'str'>

Another way to create a tuple is with using the built-in function tuple. With no argument, this function will create an empty tuple.

>>>t1 = tuple()
>>>type(t1)
< type 'tuple'>

If the argument is a sequence (string, list or tuple), the result of the call to tuple is tuple with the elements of a sequence:

>>>word = tuple("Python is awesome")
>>>print word 
('P', 'y', 't', 'h', 'o', 'n', ' ', 'i', 's', ' ', 'a', 'w', 'e', 's', 'o', 'm','e')

Because tuple is a name of constructor, you should avoid using it as a variable name. Most list operators also work on tuples. The bracket operator indexes an element:

>>>z = ('a', 'b', 'c', 'd')
>>>print z[0]
a
>>>print z[1]
b
>>>i = 0
>>>while i < len(z):
...   print z[i]
...   i = i + 1
...
a
b
c
d

And the slice operator selects a range of elements:

>>>z[1:3]
('b', 'c')

If you try to modify one of the tuples elements the TypeError exception will occur.

>>>z[0] = 'a'
Traceback (most recent call last):
 File "<stdin>",line 1, in <module>
TypeError: 'tuple' object does not support item assignment

You can modify the elements of a tuple, but you can replace one tuple with another.

>>>z = ('Z',) + z[1:]
>>>print z
('Z', 'b', 'c', 'd')

Sunday, March 12, 2017

BIG O Notation

In algorithm analysis it’s important to quantify the number of operations or steps that algorithm will require in order to find or compute solution.

For each of these steps are considered to be a basic unit of computation then the execution time for an algorithm can be express as a number of steps required to solve the problem. Good basic unit of computation for sum algorithm is to count number of assignment statements perform to compute the sum. In this function the number of assignment statements is 1 which is total = 0 plus the value of n, the number of times we perform total = total + i.

T(n) = 1 + n

Where n is the size of the problem and the T(n) is the time it takes to solve the problem of size n namely 1 + n steps. The dominant term is what in the end is used for comparison. Order of magnitude functions describes the part of T(n) that increases the fastest as value of n increases.

Order of magnitude is often called Big-O notation and is written as O(f(n)). In case of T(n) = 1 +n if n is small then the number 1 plays significant role to the final result. As n gets larger the 1 becomes insignificant to the final result. In that case we can drop 1 an say that running time is O(n).

For example if T(n) = 10 n^3 + 2105 when n is small then 2105 plays significant role but when the n is a large number than 2105 can be negligible.

Sometimes the performance of algorithm depends on the exact values of data rather than simply the size of the problem. For these kinds of algorithms we need to characterize their performance in terms of best case, worst case and average case performance. The worst case is when algorithm performs poorly. In most cases the algorithm performs somewhere in between these two extremes which is an average case.

How to perform Algorithm Analisys

Let's imagine a situation where we have two programs that solve a same problem and we have to find out which one of those two is better? Before we start algorithm analysis we’ll explain the difference between program and underlying algorithm that the program represents.

Program is an algorithm that has been encoded into some programming language.

Algorithm is a generic step-by-step list of instructions for solving a problem.

So there may be many programs for the same algorithm depending on the programmer and programming language being used.

Let’s start by creating a program that calculates sum of number n.

Function sumOfn is clearly better than function James so the conclusion of analyzing those two examples of a program pay attention to readability. Write programs that are easy to read and understand.

Algorithm analysis is concerned with comparing algorithms based upon the amount of computing resources that each algorithm uses.

Compare algorithms to see which one is better because it is more efficient in its use of resources or it simply uses fewer resources. We can conclude that those two functions above seem very similar because they both use some algorithm to solve summation problem essentially.

Computing resources

Consider amount of space or memory an algorithm requires to solve a problem. The amount of space required by problem solution is typically dictated by a problem instance itself.

Analyze and compare algorithms based on the amount of time they require to execute this is called Execution time or Running time of algorithm

To measure execution time of function sumOfn is to perform Benchmark analysis.

Python benchmark of algorithm using time module and calculation difference between starting time and ending time.

In the time module there is a function called time, that will return current system clock time in seconds since the same starting arbitrary point.

import time 
def sum_of_n_2(n):
    start = time.time()
    total = 0
    for i in range(1, n+1):
        total = total + i 
    end = time.time()
    timedifference = end - start
    return total, timedifference

print "Sum is %d required %10.15f seconds" % sum_of_n_2(10)
print "Sum is %d required %10.15f seconds" % sum_of_n_2(100)
print "Sum is %d required %10.15f seconds" % sum_of_n_2(1000)

The output is given below
Sum is 55 required 0.000000000000000 seconds

Sum is 5050 required 0.000000000000000 seconds
Sum is 500500 required 0.000000000000000 seconds
Sum is 50005000 required 0.000000000000000 seconds
Sum is 5000050000 required 0.006999969482422 seconds
Sum is 500000500000 required 0.093000173568726 seconds
Sum is 50000005000000 required 0.992999792098999 seconds
Sum is 5000000050000000 required 10.267999887466431 seconds

The function sum_of_n_2 is the same as function used in previous example, just with timing calls embedded before and after the summation. Function returns a tuple consisting of the result and amount of time in seconds required for the calculation.

Now we will construct the function with same purpose to sum n integers. Function sum_of_n_3 takes advantage of closed equation Sum(i) = n(n+1)/2

With this function we’ll avoid using loops to calculate the sum of n integers.

def sum_of_n_3(n):
    start = time.time()
    total = (n*(n+1))/2
    end = time.time()
    timedifference = end-start
    return total, timedifference 

print "Sum is %d required %10.15f seconds" % sum_of_n_3(10)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(100)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(1000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(10000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(100000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(1000000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(10000000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(100000000)

After executing the code we'll get following result.

Sum is 55 required 0.000000000000000 seconds
Sum is 5050 required 0.000000000000000 seconds
Sum is 500500 required 0.000000000000000 seconds
Sum is 50005000 required 0.000000000000000 seconds
Sum is 5000050000 required 0.000000000000000 seconds
Sum is 500000500000 required 0.000000000000000 seconds
Sum is 50000005000000 required 0.000000000000000 seconds
Sum is 5000000050000000 required 0.000000000000000 seconds

As in previous example we did the same benchmark using different values of n. As we can see time required is 0 for each case. Second they are very consistent no matter the value of n is. Function is hardly impacted by the number of integers being added.

Iterative solution seen to be doing more work since the program is being repeated. This is a reason it is taking longer. Time required for iterative solution seems to increase the value of n. We need a better way of measuring the algorithm quality because the benchmark technique is dependent on system/computer we’re using.

What is an Algorithm ?

Algorithm is any well-defined computational procedure that takes some value, or set of values as input and produces some value, or set of values, as an output. An algorithm is thus a sequence of computational steps that transforms the input into the output. Algorithm is also a tool for solving a well-specified computational problem. The statement of the problem specifies in general terms the desired input/output relationship. The algorithm describes a specific computational procedure for achieving that input/output relationship.

Example 1 – a sorting problem. Let’s say that we have a list or sequence of unsorted numbers and our task is to sort them out into non-decreasing order. Here is how the sorting problem is defined:

Input: a sequence of n numbers a1,a2,a3,…,an
Output: a permutation or ordering of the input sequence such that a1’ < a2’ < a3’ < an.

Let’s define a list of unsorted numbers in Python:

>>> l1 = [3324,212,3214,45, 500,213,12,34]

>>> l1.sort()

>>> l1

[12, 34, 45, 212, 213, 500, 3214, 3324]

As you can see first we’ve defined list l1 and then we’ve used a built-in Python function sort() without any arguments. The result of sorting is what we’ve been looking for and that is an ordered list from the lowest number 12 to the highest number 3324.

Sorting is a fundamental operation in computer science, and as a result a large number of good sorting algorithms have been developed. Which algorithm is best for given application depends on:

Number of items to be sorted
The extent to which the items are already sorted
Possible restrictions on the item values.
Kind of storage device to be used such as memory, disk or other.

Algorithm is correct if, for every input it halts with the correct output. We say that the correct algorithm solves a given computation problem. Incorrect algorithm might not halt at all on some input instances, or it might halt with an answer other than the desired one. Sometimes the incorrect algorithms might be useful if their error rate can be controlled.

In English language an algorithm might be specified as computer program or hardware design but it must provide precise description of the computational procedure to be followed.