Tuesday, April 1, 2014

Sorting And Efficiency

Sorting data structures is one of the most frequently used operations in computer science. Sorting a sequence of words into alphabetical order, or a sequence of numbers into increasing order, are just basic examples that illustrate how many purposes sorting can serve.
We have seen several algorithms that implement this process successfully, so let's discuss the differences between them.

When we are sorting, the first question we should ask ourselves is: What is the most efficient algorithm? In other words, what is the algorithm that will return the sorted data structure by executing the least possible number of steps?

Every function can be bounded above by another function. This asymptotic upper bound is written in Big-Oh notation: saying an algorithm is O(f(n)) gives a bound on its worst-case running time as the input size n grows.

First of all, we will talk about the sorting algorithms that typically run least efficiently: bubble sort, selection sort and insertion sort. All three are bounded above by O(n^2). I will assume that if you are reading this entry, you probably have a sense of how these algorithms are implemented, so let's talk about why they are usually the slowest implementations of a sorting operation.
- Bubble sort makes n-1 comparisons during its first pass over a list of n elements, and each subsequent pass makes one fewer comparison than the last. This is why it's such an inefficient algorithm (roughly n passes of up to n-1 comparisons each): elements are not placed in their final location when they are swapped, except for the largest one in the unsorted part of the list. The passes keep taking place until the list is finally sorted. However, a bubble sort with an early-exit check detects an already-sorted list efficiently: if no items are swapped on the first pass, we know the structure is sorted, which takes only O(n)!
- Selection sort is also a very inefficient algorithm. We need to traverse the unsorted part of the list n-1 times, each time finding the largest (or smallest) item and putting it in its final location. Again, we can see intuitively that this is O(n^2).
- Insertion sort also takes n-1 passes, but it only needs to compare the element being evaluated against the sorted part of the list (which has fewer than n elements). This is still O(n^2). Note, though, that in the best case (an already sorted list) each item needs only one comparison, so we get O(n)!
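To make the bullet points above concrete, here are minimal sketches of two of these algorithms (function and variable names are my own, not from class):

```python
def bubble_sort(lst):
    """Sort lst in place; O(n^2) worst case, O(n) if already sorted."""
    n = len(lst)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):   # the last i items are already in place
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        if not swapped:              # no swaps on a full pass: list is sorted
            return


def insertion_sort(lst):
    """Sort lst in place; O(n^2) worst case, O(n) on sorted input."""
    for i in range(1, len(lst)):
        item = lst[i]
        j = i
        while j > 0 and lst[j - 1] > item:  # shift bigger items right
            lst[j] = lst[j - 1]
            j -= 1
        lst[j] = item
```

Notice how the early-exit flag in bubble_sort gives the O(n) best case, and how insertion_sort's inner loop stops immediately on sorted input.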

Now we will talk about the algorithms that use a divide and conquer strategy. We say "divide and conquer" because we divide the list into smaller sublists in order to improve the performance of the program, and at the end we "conquer" by bringing these pieces back together. This picture shows how the process works:
- The merge sort algorithm recursively splits the given list in half, then repeatedly places the smallest remaining item of each sublist into the right location of a bigger list until we get the sorted list. The "dividing" step produces O(log n) levels of sublists, since each recursive step halves the list. The "conquering" or "merging" step looks at each item once per level, so it is O(n) per level. Putting these pieces together, merge sort is O(n log n), which is considerably faster than the algorithms we talked about previously.
- Quick sort is usually faster than merge sort in practice. However, it is remarkably slower in its worst case: if the chosen pivot value on each pass is always the biggest or smallest element in the list, then each whole pass places only a single element in its final location. That means we compare roughly n-1 items n times, which is O(n^2).
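Here is a rough sketch of both divide-and-conquer algorithms (these are my own simplified versions, returning new lists rather than sorting in place):

```python
def merge_sort(lst):
    """Return a new sorted list; O(n log n) in all cases."""
    if len(lst) <= 1:
        return lst
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])    # divide: recursively sort each half
    right = merge_sort(lst[mid:])
    merged = []                     # conquer: merge the two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def quick_sort(lst):
    """Return a new sorted list; O(n log n) on average, O(n^2) worst case."""
    if len(lst) <= 1:
        return lst
    pivot = lst[0]  # naive pivot choice: sorted input triggers the worst case
    smaller = [x for x in lst[1:] if x < pivot]
    larger = [x for x in lst[1:] if x >= pivot]
    return quick_sort(smaller) + [pivot] + quick_sort(larger)
```

Picking the first element as the pivot keeps the sketch short, but it is exactly the choice that makes an already-sorted list hit quick sort's O(n^2) worst case.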

We should also mention Timsort, Python's built-in sorting algorithm. It's typically O(n log n) and performed faster than any of the algorithms previously stated (as we've seen in lab 10). This video will clarify how the algorithm operates: https://www.youtube.com/watch?v=jVXsjswWo44
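In practice we never implement Timsort ourselves; it is what Python runs under the hood whenever we call the built-in sorted() function or the list.sort() method:

```python
data = [5, 1, 4, 2, 3]

# sorted() returns a new sorted list and leaves the original untouched
print(sorted(data))  # [1, 2, 3, 4, 5]

# list.sort() sorts the list in place and returns None
data.sort()
print(data)          # [1, 2, 3, 4, 5]
```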


To summarize, we've seen which sorting algorithms are typically the fastest and which ones are the slowest. But again, we should always consider what exactly we want our program to do and what input we are going to pass to it, since an algorithm that is usually slow can be really efficient in a specific context, and vice versa.

Bibliography:
picture 1:
http://www.wikihow.com/images/2/26/Sort-a-Deck-of-Cards-Step-1.jpg
picture 2: http://www.cise.ufl.edu/research/ParallelPatterns/PatternLanguage/AlgorithmStructure/Figures/d-and-c.gif
video: research task by Mark Lee.

Sunday, March 9, 2014

Linked Lists (and some wrappers and helpers insight...)

A linked list is a particular recursive data structure made up of nodes where each node contains some data and a reference to the next node in the linked list.
We can distinguish the head from the rest in a linked list. The head is the first element in this structure and the rest is simply the remaining elements of it.

This is what a linked list looks like:

There are two important things we should consider when dealing with linked lists:
- Separating the head from the rest is so useful because the head holds a reference to the first node of the rest, so the head is indirectly connected to every element in the list. We can define methods that traverse the full length of the linked list by calling them on the head.
- When we create a new node, if we don't define its next value, it is set to None by default. So the last element of every linked list always has a reference to None.
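Both points above can be seen in a minimal node class (the class and attribute names here are my own sketch, not necessarily the ones from lecture):

```python
class Node:
    """A single node in a linked list."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next  # defaults to None, so the last node refers to None


# build the list 1 -> 2 -> 3 and traverse it starting from the head
head = Node(1, Node(2, Node(3)))
items = []
node = head
while node is not None:   # follow the chain of references until we hit None
    items.append(node.data)
    node = node.next
print(items)  # [1, 2, 3]
```

The while loop shows why holding the head is enough: every other node is reachable by repeatedly following next.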


This week, we have also learned that, in order to define methods in a given class, we can write wrapper methods that delegate their work to helper methods defined in another class.
For example, we have seen how we can create two separate classes, one for nodes and one for linked lists. This allows us to implement the insert method for linked lists (inserting new nodes at the head of the linked list) by using the references that nodes keep to their next nodes (defined in the node class): the new node simply gets a reference to the current head of the linked list.
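A sketch of that two-class design might look like this (names are assumptions on my part):

```python
class Node:
    """Holds one piece of data plus a reference to the next node."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class LinkedList:
    """Wrapper class that keeps a reference to the head node."""

    def __init__(self):
        self.head = None

    def insert(self, data):
        """Insert a new node at the head of the list.

        The new node's next reference is set to the old head.
        """
        self.head = Node(data, self.head)


lst = LinkedList()
for x in [3, 2, 1]:
    lst.insert(x)
# lst now holds 1 -> 2 -> 3
```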

We can also use this approach in binary search trees. In this case, we define an insert method in the node class that decides where a new node should be created (left or right?) by comparing the new data with the data of the current node. When the binary search tree class delegates to this method, new nodes are automatically placed in their corresponding locations.
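Here is one way that wrapper-plus-helper pattern could look for a binary search tree (again, a sketch with my own class names):

```python
class BSTNode:
    """Helper class: a node that knows how to insert into its subtree."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

    def insert(self, data):
        """Place data in the left or right subtree, depending on comparison."""
        if data < self.data:
            if self.left is None:
                self.left = BSTNode(data)
            else:
                self.left.insert(data)
        else:
            if self.right is None:
                self.right = BSTNode(data)
            else:
                self.right.insert(data)


class BST:
    """Wrapper class: delegates insertion to the node's helper method."""

    def __init__(self):
        self.root = None

    def insert(self, data):
        if self.root is None:
            self.root = BSTNode(data)
        else:
            self.root.insert(data)
```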

Saturday, March 1, 2014

Recursion

Recursion is definitely the most important concept that we have seen so far in this course. It allows us to implement a function by calling that same function within its own implementation. But how is this possible? Well, it's as simple (and as complex) as including a base case in the implementation: code that tells the program what to do with the simplest inputs, so that the recursive step eventually reduces every input down to that base case. This might sound a bit confusing, so let's define an example and trace it to see how recursion works.

In class, we've seen the function rec_max(L) which returns the maximum number in a possibly nested list of numbers. This is what it looks like:
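(The original post shows the code as a screenshot; this is my reconstruction of the idea, consistent with the discussion below:)

```python
def rec_max(L):
    """Return the maximum number in a possibly nested list of numbers."""
    return max([rec_max(x) if isinstance(x, list) else x for x in L])
```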

As we can see, we are using rec_max() inside the function's own implementation; this is the recursive step, and it is applied whenever an element of the list is itself a nested list. The base case is the bit of code following the 'else': an element that is not a nested list is simply used as-is. So if an element of L is a nested list, we compute its maximum separately by means of the recursive step.
The list comprehension therefore produces a non-nested list containing the maximum of every nested list in L together with the non-nested elements of L. Finally, the function returns the maximum of that list.
And this is how recursion works!

As we can see, we could also implement rec_max() without recursion by handling each level of nesting separately with if statements and for loops, but that would make our code much longer and harder to follow. This is why I can say with confidence that recursion is usually the cleaner choice whenever the data itself is recursive.

Monday, February 24, 2014

Binary Trees

Trees are a special kind of recursive structure made up of a root node, branch nodes, and leaf nodes. Sibling nodes are nodes that share the same parent node.
Trees also have levels: a node's level is given by its distance from the root node, so siblings always sit on the same level.

Here's an example of what a binary tree looks like:


To define a new tree, we create a tree class by implementing the __init__ method in the following way: __init__(self, cargo, left = None, right = None).
Since trees are a recursive structure, we can create as many subtrees as desired inside left or right.
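Using that __init__ signature, a small sketch (attribute names assumed to mirror the parameters) shows how subtrees nest recursively:

```python
class BinaryTree:
    """A binary tree node holding a cargo value and two optional subtrees."""

    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left    # left subtree (another BinaryTree, or None)
        self.right = right  # right subtree (another BinaryTree, or None)


# a three-level tree: 1 at the root, 2 and 3 as children, 4 as a grandchild
tree = BinaryTree(1, BinaryTree(2, BinaryTree(4)), BinaryTree(3))
```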

Trees are a fine way to represent data. For example, we can represent a mathematical expression using trees. The operator will be the parent node of the two operands that we want to work with.
In order to build a tree this way, we can create a list where every element of the expression will be an element of the list. Then we will implement some recursive functions to evaluate the type of these elements to know where they should be placed in the tree.

When we translate a mathematical expression into a tree, we can traverse it in multiple ways. We can get a prefix expression by visiting the root before the left and right subtrees, a postfix expression by visiting the left and right subtrees before the root, or an infix expression by visiting the left subtree, then the root, then the right subtree.
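The three traversals can be sketched as short recursive functions (assuming a tree class with cargo, left and right attributes, as above):

```python
class BinaryTree:
    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left
        self.right = right


def prefix(t):
    """Root first, then left and right subtrees."""
    if t is None:
        return []
    return [t.cargo] + prefix(t.left) + prefix(t.right)


def infix(t):
    """Left subtree, then root, then right subtree."""
    if t is None:
        return []
    return infix(t.left) + [t.cargo] + infix(t.right)


def postfix(t):
    """Left and right subtrees first, then root."""
    if t is None:
        return []
    return postfix(t.left) + postfix(t.right) + [t.cargo]


# the expression 3 + 4 as a tree: operator at the root, operands as leaves
expr = BinaryTree('+', BinaryTree(3), BinaryTree(4))
print(prefix(expr))   # ['+', 3, 4]
print(infix(expr))    # [3, '+', 4]
print(postfix(expr))  # [3, 4, '+']
```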

Binary trees are definitely one of the most interesting things we have learned in this course so far and I can’t wait to learn how to apply them to solve problems in real life! 

While thinking about this, I couldn't help going back to this old app I used to play with some years ago… It guesses a character that you are thinking about by traversing a tree based on the input you give the program! Here's the link: http://en.akinator.com/

picture taken from: http://upload.wikimedia.org/wikibooks/en/4/41/HSE_ch5_binary_tree.png

Sunday, February 9, 2014

Inheritance, Multiple Inheritance, and Composition

Inheritance is one of the most important features offered by Object-Oriented Programming languages. We can create a subclass of a base class by naming the base class in the subclass's class definition (only if the base class is available in the current module, of course!). By doing so, the subclass "inherits" all the attributes and methods of the base class.

For example, let's say that we create a class called Car. In the __init__ method of Car we define some default attributes such as wheels and windows, and we also include some methods in the implementation of this class, such as accelerating or braking.
Later on, we can create a subclass of Car called Ferrari that inherits all the properties of Car. This allows us to add attributes and methods to Ferrari without modifying Car, and it saves us tons of lines of code compared to creating a Ferrari class from scratch!
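A minimal sketch of that Car/Ferrari example (the attribute and method names are just illustrative):

```python
class Car:
    def __init__(self):
        self.wheels = 4
        self.windows = 4

    def accelerate(self):
        return "speeding up"


class Ferrari(Car):        # Ferrari inherits everything from Car
    def open_roof(self):   # ...and can add new methods of its own
        return "roof open"


f = Ferrari()
print(f.wheels)        # 4 -- attribute inherited from Car
print(f.accelerate())  # inherited method
print(f.open_roof())   # method added by the subclass
```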

Inheritance is in fact a very powerful and useful tool, but we need to know when to avoid it…
Multiple inheritance is usually something that should be avoided. It takes place when we create a subclass of two or more base classes. Multiple inheritance makes our code more complex, and it can get confusing to work out which method actually runs (and what to name new methods) when the base classes define similar methods with similar names.

This is why we should always consider composition. We can call baseclass.__init__(self) inside the implementation of our own class's __init__ method to tell Python that we want our class to have the same attributes as the base class, while the methods of the base class are not inherited. Furthermore, we can call any method of the base class inside the implementation of our class by writing baseclass.method(self).
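A sketch of that pattern, with class names I made up for illustration: Motorcycle is not a subclass of Engine, yet it reuses Engine's initializer and methods by calling them explicitly.

```python
class Engine:
    def __init__(self):
        self.horsepower = 120

    def start(self):
        return "engine started"


class Motorcycle:                  # NOT a subclass of Engine
    def __init__(self):
        Engine.__init__(self)      # borrow Engine's initializer: we get horsepower
    def start(self):
        return Engine.start(self)  # explicitly reuse Engine's method


m = Motorcycle()
print(m.horsepower)  # 120
print(m.start())     # engine started
```

Because there is no inheritance, isinstance(m, Engine) is False and Engine's methods are only available where we call them explicitly.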

Sunday, January 26, 2014

Object Oriented Programming

Three weeks have already gone by this term, so it is about time to take stock of some of the things we've learned in this course so far:

Python is an object oriented programming (OOP) language, we got that, but what does this really mean? Let's build up an answer slowly but clearly...
Programming is mainly about writing the code of a function that will execute some steps using the parameters that we pass to the function as arguments. These arguments can be of different types: we have strings, ints, floats, lists, tuples, dictionaries… These are Python's built-in types, and each type has its own built-in functions called methods.
Methods allow us to work with the values assigned to the variables that we create and store in the computer's memory.

Well that sounded confusing but once we understand this we can move on to the next stage:

We can create our own types; these are called classes. A class is like a factory for instances of an abstract data type: when we instantiate it, we create an object, an instance of that class. We can define attributes and methods for the class. For example, we can create a class Point with attributes x_coord and y_coord to store its coordinates in a Cartesian plane. We can also define a function position_in_plane() that returns a string with the coordinates of a Point instance; this function acts as a method of the class Point.
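Here is roughly what that Point class could look like (a sketch based on the names mentioned above):

```python
class Point:
    """A point in the Cartesian plane."""

    def __init__(self, x_coord, y_coord):
        self.x_coord = x_coord
        self.y_coord = y_coord

    def position_in_plane(self):
        """Return a string with this point's coordinates."""
        return "({}, {})".format(self.x_coord, self.y_coord)


p = Point(2, 3)
print(p.position_in_plane())  # (2, 3)
```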

We have seen how classes work and why Python is called an OOP language: by creating classes, we can build conceptual abstractions that let us pretend we are manipulating the attributes and methods of real-world objects.

Welcome To My Slog

In this preliminary post, I am officially opening my SLOG to the public so we can all share our thoughts and evolution on CSC148. I am really looking forward to an interactive and enriching programming experience with my fellow students!