Data structure and algorithms

                                                                                               

Possible references:

 

Topic objective: Students should understand basic data structures, and several main algorithms. The concept of algorithmic complexity should also be understood, and approximating complexities for easy cases should be doable.


 

Data structure – forms of data organization, usually associated with data type

Algorithm – a procedure which is exactly specified, to solve a problem

 


Modularity

require a formal method to build a program efficiently and reliably.

- can be done by breaking up a program into small and appropriate modules, each of which can be written and tested before inclusion into bigger modules, which is in turn built and tested

spaghetti code – long and berlingkar-lingkar

 

Abstraction

high-level view of object or function where low-level details can be neglected.

- allows the main behaviour of object or function to be considered in solving problems without being complicated by details.

Function abstraction – specify function for a module

Structure abstraction (object-oriented - OO) – build software model for the behaviour of real-world objects

 

Data type – data structure and collection of functions or procedures which operate on the data structure

Abstract data type – data type which can be accessed only through an interface

Program which uses an abstract data type: client

Program which specifies the abstract data type: implementation

 

OO programming:

methods” – functions and procedures

class” – data structure and its methods (i.e., abstract data type) + additional characteristics: inheritance, polymorphism

object” – a class instance; represents real-world object; typed variable as given by class

 

Several basic data structures and several associated operations:

vector

ordered set of data elements

indexing

vector addition, etc

string

vector with variable length

concatenation

array

array of dimension n is vector with its elements arrays of dimension n-1 with corresponding index sets

(array of dimension 0 is a scalar)

linearization (write elements in a linear order of dimension 1)

tree

one data element as root and several (possibly 0) branches which consist of (sub)trees

 

point to root of first daughter branch

return element in root of first daughter branch

point to root of next branch

return element in root of next branch

forward-linked list

elements each with link to next element, terminated by a null link

 

initialize

point to first element

return first element

point to next element

return next element

return last element

insert element

delete element

test if list empty

bidirectional linked list

like forward linked list but with backwards links as well

 

point to previous element

return previous element

circular list

forward linked list with last element linked to first element

 

return list size

queue

list of data elements for which element insertion is done at one end and deletion at the other (FIFO list)

 

insert new element

extract (return & deletion) element

test if queue empty

stack

like queue, but insertion and deletion carried out at same end (LIFO list)

 

push (insert new element)

pop (extract element)

test for emptiness

graph

structure like tree but more general: an element may be a component of more than one substructure

-         directed: each element has a head (except for root) and a tail (possibly null)

-         rooted: a root element which has no head

-         tersambung: any two elements have at least one common ancestor

-         not circular: no element which is an ancestor of itself

 

 

ring

hierarchical structure where the component set of an element is linked as a circular list

 

 

heap

queue on a complete binary tree where the element at a node is later in order compared to elements on the branch nodes; input is at leftmost position on the lowest brach level

 

insert element  (order must be necessarily conserved by exchanging, if required, the contents of the daughter node and the parent node upwards until the root)

 

deletion of root element (conserve order by exchanging downwards until lowest branch; consider daughter node with element later in order)

 

 

 

Algorithm analysis

-         for comparing different algorithms for  same task

-         for predicting performance of algorithm in new environment

-         for determining values for parameters of algorithm

-         etc

 

complexity – approximation of (order of) number of calculations needed to be carried out with respect to a primary parameter N when N is large

 

 

several common algorithms:

 

sorting

 

bubble sort –  elements in a vector: compare two neighbouring elements; exchange positions if necessary, according to order; scan whole vector – if there are N elements, each scan requires N comparison operations, and in worst case need N rounds of scan complexity ~ N×N or O(N2)

 

heap sort – elements are put into a heap and are extracted from the root – order conservation in the heap requires ~ log2 N operations complexity O(log N)

 

quick sort –  elements in a vector: use recursion: divide original vector to 2 parts the elements of which are rearranged, and sort these parts by recursion; division is done as follows: choose an element, scan from one end of the vector until an element whose order is later than the chosen element is met and from the other end until an element whose order is earlier than the chosen element is met, exchange them, and continue the scans until they meet, whence the chosen element is placed – worse case when vector already ordered & division is carried out N times & number of comparisons = N+(N-1)+(N-2)+…+2+1 = (N+1)N/2; on average, division is into 2 parts of roughly equal size, so complexity CN = 2CN/2+N  ~ O(N log N)

 

merge-sort –  elements in a vector: use recursion: divide original vector to 2 parts each of which are subjected to merge-sort individually, then they are merged again, with the merge respecting order – division is done log N times & merging requires N operations all at αth level (α merges each with vector of length N/α), so complexity  ~ O(N log N)

 

etc

 

 

search

 

sequential – elements in a vector: find wanted element by examining elements one by one – complexity O(N)

 

binary – elements in a vector in order: divide vector into 2; determine in which part the wanted element should be; now divide this part; and so on, until found – vector of length N can be divided like this log2 N times complexity O(log N)

 

tree – use biary search tree (ordered binary tree: left subtree has root earlier in order to root node and right subtree has root later in order, and each subtree also ordered); search in appropriate subtree – complexity O(log N)

 

hash function / hash table – order of element represented by key; hash transforms key to address in table – complexity O(1)

 

etc

 

graphical problems:

minimum spanning tree (join up N islands in the cheapest way), travelling salesman (visit N places in the shortest way)

 

Ο greedy algorithm – move made based only on currently available information; usually suboptimum

 

Kruskal's algorithm – for minimum spanning trees: add cheapest edge if it does not produce a cycle

 

Djikstra's algorithm – for seeking chortest path from one vertex to other vertices: if S is set of vertices to which shortest paths have been obtained and V-S is the set of remaining vertices, sort vertices in V-S according to current best estimate for path length, add vertex with shortest path u, and update cost for all vertices v which are connected to u if the best estimate for the shortest path to v can be improved by including the edge (u,v) in the path to v. 

 

etc

 

 

 

back to synopsis of SMES3103