PYTHONHOLICS: How to perform Algorithm Analisys

Let's imagine a situation where we have two programs that solve a same problem and we have to find out which one of those two is better? Before we start algorithm analysis we’ll explain the difference between program and underlying algorithm that the program represents.

Program is an algorithm that has been encoded into some programming language.

Algorithm is a generic step-by-step list of instructions for solving a problem.

So there may be many programs for the same algorithm depending on the programmer and programming language being used.

Let’s start by creating a program that calculates sum of number n.

Function sumOfn is clearly better than function James so the conclusion of analyzing those two examples of a program pay attention to readability. Write programs that are easy to read and understand.

Algorithm analysis is concerned with comparing algorithms based upon the amount of computing resources that each algorithm uses.

Compare algorithms to see which one is better because it is more efficient in its use of resources or it simply uses fewer resources. We can conclude that those two functions above seem very similar because they both use some algorithm to solve summation problem essentially.

Computing resources

Consider amount of space or memory an algorithm requires to solve a problem. The amount of space required by problem solution is typically dictated by a problem instance itself.

Analyze and compare algorithms based on the amount of time they require to execute this is called Execution time or Running time of algorithm

To measure execution time of function sumOfn is to perform Benchmark analysis.

Python benchmark of algorithm using time module and calculation difference between starting time and ending time.

In the time module there is a function called time, that will return current system clock time in seconds since the same starting arbitrary point.

import time 
def sum_of_n_2(n):
    start = time.time()
    total = 0
    for i in range(1, n+1):
        total = total + i 
    end = time.time()
    timedifference = end - start
    return total, timedifference

print "Sum is %d required %10.15f seconds" % sum_of_n_2(10)
print "Sum is %d required %10.15f seconds" % sum_of_n_2(100)
print "Sum is %d required %10.15f seconds" % sum_of_n_2(1000)

The output is given below
Sum is 55 required 0.000000000000000 seconds

Sum is 5050 required 0.000000000000000 seconds
Sum is 500500 required 0.000000000000000 seconds
Sum is 50005000 required 0.000000000000000 seconds
Sum is 5000050000 required 0.006999969482422 seconds
Sum is 500000500000 required 0.093000173568726 seconds
Sum is 50000005000000 required 0.992999792098999 seconds
Sum is 5000000050000000 required 10.267999887466431 seconds

The function sum_of_n_2 is the same as function used in previous example, just with timing calls embedded before and after the summation. Function returns a tuple consisting of the result and amount of time in seconds required for the calculation.

Now we will construct the function with same purpose to sum n integers. Function sum_of_n_3 takes advantage of closed equation Sum(i) = n(n+1)/2

With this function we’ll avoid using loops to calculate the sum of n integers.

def sum_of_n_3(n):
    start = time.time()
    total = (n*(n+1))/2
    end = time.time()
    timedifference = end-start
    return total, timedifference 

print "Sum is %d required %10.15f seconds" % sum_of_n_3(10)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(100)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(1000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(10000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(100000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(1000000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(10000000)
print "Sum is %d required %10.15f seconds" % sum_of_n_3(100000000)

After executing the code we'll get following result.

Sum is 55 required 0.000000000000000 seconds
Sum is 5050 required 0.000000000000000 seconds
Sum is 500500 required 0.000000000000000 seconds
Sum is 50005000 required 0.000000000000000 seconds
Sum is 5000050000 required 0.000000000000000 seconds
Sum is 500000500000 required 0.000000000000000 seconds
Sum is 50000005000000 required 0.000000000000000 seconds
Sum is 5000000050000000 required 0.000000000000000 seconds

As in previous example we did the same benchmark using different values of n. As we can see time required is 0 for each case. Second they are very consistent no matter the value of n is. Function is hardly impacted by the number of integers being added.

Iterative solution seen to be doing more work since the program is being repeated. This is a reason it is taking longer. Time required for iterative solution seems to increase the value of n. We need a better way of measuring the algorithm quality because the benchmark technique is dependent on system/computer we’re using.

PYTHONHOLICS

Sunday, March 12, 2017

How to perform Algorithm Analisys

1 comment: