Algorithm and Complexity Assignment (Python)

profileFelmin Nicolas
Assignment2_Resources.zip

__MACOSX/._Assignment 2_Resources

Assignment 2_Resources/.DS_Store

__MACOSX/Assignment 2_Resources/._.DS_Store

Assignment 2_Resources/Assignment_2_HIT220 (1).pdf

HIT220 - Assignment 2

Cat Kutay

Aug 2020

1 Rubric

1. This document is available https://www.overleaf.com/read/cgwdgmvhppzr

2. Submit as a pdf by on Learnline

3. Hand drawings can be scanned and inserted

4. Marking rubric for each question: Marks will be awarded for accuracy, completeness, and well communicated reasoning which demon- strates your depth of understanding of the subject matter. Unless marks otherwise divided up:

a Advanced questions only given marks if correct. Otherwise:

b Full marks are possible if provide working code

c three quarter marks are possible if provide pseudo code

d For these marks:

i. Half marks will be given for right answer

ii. Half marks will be given for working shown with clear explanation of working

All questions total marks are shown. Not all parts are of equal value.

2 Questions

1. (4 points) Stack and Queue When working with data in a queue, write the following for the different data structures that are proposed:

a You are given an array of k single-linked-lists of length n, where each linked-list is sorted in ascending order. Write code or pseudo code to merge all the linked-lists into one sorted ascending linked-list and return it.

b Write the algorithm using (a) iteration and (b) recursion. What is the Big O complexity of each version of the algorithm

c Implement a queue using two stacks. What methods do you need to implement for a Queue? What is the worst-case run time complexity of the method to remove an element?

Advanced Write a method

1

def contains(queueName: Queue, search: str):

return bool

which will take a Queue and a String, and returns True if the Queue contains this string of characters in the same sequence. Return false if not found in the Queue. The elements in the Queue must remain in their original order once this method is complete.

2. (3 points) Optimize Merge

a Given an array arr[] of n integers, construct an output array prod[] such that prod[i] is equal to the product of all the elements of arr[] except arr[floor(i/2)]. Note: floor() is the largest integer < n/2.

b You are to provide code that does not use the division operator and completes in O(n) time.

Advanced Can you improve the space complexity to O(1)

3. (2 points) Matrix

a Given an n x m matrix write the code to output the transpose of the matrices as an m x n matrix First consider how to store a matrix in your code

b Given an n x m matrix write the code to output the diagonal elements of the matrix as a vector or sequence of values.

c Given an n x n matrix where each of the rows and columns are sorted in ascending order, return the kth smallest element in the matrix. Note that it is the kth smallest element in sorted order of the entire matrix, not the kth distinct element in the matrix array of arrays∣∣∣∣∣∣

1 5 9 10 11 13 12 13 15

∣∣∣∣∣∣ The 8th element is 13 as the elements of the matrix are [1,5,9,10,11,12,13,13,15]. 4. (4 points) Search Trees

The insertion of data into a tree can be done in various ways to ensure the height of the tree is minimum, and that searching the tree for an item is not more than O(log n).

a Draw separate trees with 8 nodes that are either: balanced; binary tree; neither of these.

b Write in pseudo code or code to traverse the tree and verify if it is balanced and/or binary. First consider how you will represent the edges and nodes as data in your program and used this in your code.

c Then consider which search method you will use, and name it in your code.

Advanced Assume your tree is binary. Now:

(a) Write pseudo code or code to carry out a breadth first search of a binary tree

(b) What is the complexity of the search?

(c) What data structure did you use for the storage of nodes as your run the search above?

5. (3 points) Optimizing storage [Advanced]

a We are looking to store water for fire management, and we have located a suitable area, but want to map out the storage, based on the height map of the surrounding area.

b Given n non-negative integers representing an elevation map where the width of each bar is 1, write code or pseudo code to calculate the maximum amount of water this elevation can trap between any two peaks. See example below.

Page 2

c Note you cannot tilt the elevation map.

6. (4 points) Map colouring Using the Bush Fire Map [2 points]:

Figure 1: input heights = [0,1,0,2,1,0,1,3,2,1,2,1]. Output = 4.

a Draw a graph of the regions on the map

b Store the graph of this map in code

c Write an algorithm in code or pseudo code to find if the map is 4-colourable

d Are you using an exhaustive or greedy algorithm method? Explain.

Advanced 2 points Use your code to advise each region where they can get extra fire trucks from, in an emergency. Assume the trucks are on average based in the centre of each region during fire season. Rough estimate of distance is sufficient.

Page 3

Figure 2: Bush Fire regions

Page 4

__MACOSX/Assignment 2_Resources/._Assignment_2_HIT220 (1).pdf

__MACOSX/Assignment 2_Resources/._Week 2

__MACOSX/Assignment 2_Resources/._Week 4

__MACOSX/Assignment 2_Resources/._Week 5

Assignment 2_Resources/Instruction for Assignment 2.docx

For this assessment you will need to show the various stages of the solution as you work through the problem, not just the final outcome.

You may create your answers as an electronic document which is machine readable. As explained in class, the use of tables to contain material or images of hand drawings is not allowed except for illustration, These will not be considered full answers.

Save to pdf and upload.

· No Plagiarism

· No Copying from internet

· Follow the marking rubric

__MACOSX/Assignment 2_Resources/._Instruction for Assignment 2.docx

__MACOSX/Assignment 2_Resources/._Week 3

__MACOSX/Assignment 2_Resources/._Week 6

Assignment 2_Resources/Week 2 /.DS_Store

__MACOSX/Assignment 2_Resources/Week 2 /._.DS_Store

Assignment 2_Resources/Week 2 /Week2.pptx

Week 2: Computational Complexity

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

1

Complex Issue

Feedback

Why do we give it

How do we give it

Finding it on Learn line

Consultation time

Will you use it

When do you want it

Click on Insert > new slide to choose your cover, then delete this page.

Reference

See Chapter 3 in text.

For another similar approach see Drozdek Ch2 library site

Computers and Intractability , A guide the Theory of NP-Completeness, Garey and Johnson, W.H.Freeman and Company, New York, 1979.

Algorithms and Complexity, Second Edition, Herbert S. Wilf, AK Peter Ltd Canada, 2002.

The Nature of Computation, Christopher Moore and Stephan Mertens, Oxford University Press, 2011

Click on Insert > new slide to choose your cover, then delete this page.

Meet some Ancestors

Abu Ja’far Mohammed ibn Mûsâ al-Khowârizmî was a Persian mathematician (ca. 780–850) who wrote a book that was the foundation of algebra and algorithms

Alan Turing, developed Computer Science -  Please watch the You tube by Cambridge University 8 min "Alan Turing - Celebrating the Life of a Genius"  The HIT220 resources page also contains links to you tubes about Turing, Turing Machines and the Halting Problem.  

Leonard Euler, The Father of Graph Theory was one of the most prolific mathematicians of all time.  Worth at look around about his contributions to mathematics and science but entirely optional.

Euclid developed Geometry.  We studied the Euclidean Algorithm last week however Euclid is more famous for his 5 postulates.  Euclid's Book/s "The Elements" is the most published book of all time.   Beyond Geometry, Euclid proved many results in number theory which we love to use. The Fundamental Theorem of Arithmetic says that every positive integer greater than 1 has a unique prime decomposition.  So useful for finding GCD, LCM, simplifying fractions. 

Ada Lovelace developed the first computer algorithm designed for Babbage’s proposed computer, to calculate Bernoulli numbers

Click on Insert > new slide to choose your cover, then delete this page.

Computational Complexity

“As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise – By what course of calculation can these result be arrived at by the machine in the shortest test time?”

Charles Babbage (~1864)

http://www.cbi.umn.edu/about/babbage.html

“A wide variety of commonly encountered problems from mathematics, computer science, and operations research are known to be NP-complete …..

it is important for anyone concerned with the computational aspects of these fields to be familiar with the meaning and implications of this concept.”

Garey & Johnson, 1978

Click on Insert > new slide to choose your cover, then delete this page.

5

In defining a measure of efficiency

The efficiency of execution of an algorithm depends on the hardware, programming language used, compiler, programming skill, etc. These variables are not useful for measuring efficiency of an algorithm.

A computing time for millions of bits of input data can be expected to take longer than one where a small amount of input is required.

To evaluate an algorithm’s efficiency, logical time units that express a relationship between the size of an input and the amount of time and space required to process the input should be used.

Desirable scaling property – When the input size doubles, the algorithm should only slow down by some constant factor.

Click on Insert > new slide to choose your cover, then delete this page.

Why not good to look at hardware?

6

Analysis of Algorithms

Analysis of an algorithm gives insight into how long the program runs and how much memory it uses

time complexity

space complexity

Click on Insert > new slide to choose your cover, then delete this page.

Complexity Analysis

The same problem can frequently be solved with different algorithms which differ in efficiency. Or choice of data structure – e.g. Linked list vs array

Time is generally the most important

Space (memory)

Tradeoffs between time and space complexity

Type of measurement

Worst-case

Best- case

Average-case

Generally concerned with worst case analysis. We also study best case but will usually skip average case analysis.

Worst case analysis provides a running time guarantee for all inputs.

Click on Insert > new slide to choose your cover, then delete this page.

Computational Complexity

The computational complexity of an algorithm is measure of how many steps the algorithm will require in the worst case for an instance or input of a given size.  Most commonly used notation for describing asymptotic complexity is “big oh" or "big O" notation (O for order of the complexity).  

Very loosely, this notation strips away the less significant calculations to give the magnitude of the number of operations for a large input size.  

Click on Insert > new slide to choose your cover, then delete this page.

Experimental Analysis and Challenges

One way to study the efficiency of an algorithm is to implement it and experiment by running the program on various test inputs while recording the time spent during each execution.

While experimental studies of running times are valuable, especially when finetuning production-quality code, there are three major limitations to their use for algorithm analysis:

Experimental running times of two algorithms are difficult to directly compare unless the experiments are performed in the same hardware and software environments.

Experiments can be done only on a limited set of test inputs; hence, they leave out the running times of inputs not included in the experiment (and these inputs may be important).

An algorithm must be fully implemented in order to execute it to study its running time experimentally (most serious drawback).

Click on Insert > new slide to choose your cover, then delete this page.

Moving Beyond Experimental Analysis

The goal is to develop an approach to analysing the efficiency of algorithms that:

Allows us to evaluate the relative efficiency of any two algorithms in a way that is independent of the hardware and software environment.

Is performed by studying a high-level description of the algorithm without need for implementation.

Takes into account all possible inputs.

Click on Insert > new slide to choose your cover, then delete this page.

Counting Primitive Operations

To analyse the running time of an algorithm without performing experiments, we perform an analysis directly on a high-level description of the algorithm.

We base the description on a set of primitive operations such as the following:

Assigning a value to a variable

Following an object reference

Performing an arithmetic operation (i.e. adding two numbers)

Comparing numbers

Accessing a single element of an array by index

Calling a method

Returning from a method

Click on Insert > new slide to choose your cover, then delete this page.

Primitive Operations

This operation count will correlate to an actual running time in a specific computer, for each primitive operation corresponds to a constant number of instructions, and there are only a fixed number of primitive operations.

The implicit assumption in this approach is that the running times of different primitive operations will be fairly similar. Thus, the number, t, of primitive operations an algorithm performs will be proportional to the actual running time of that algorithm.

Click on Insert > new slide to choose your cover, then delete this page.

Measuring Operations as a Function of Input Size

To capture the order of growth of an algorithm's running time, we will associate, with each algorithm, a function f(n) that characterizes the number of primitive operations that are performed as a function of the input size n.

Click on Insert > new slide to choose your cover, then delete this page.

Classification

An algorithm is deemed efficient if it has a polynomial running time (e.g. f(n)=n2).

Polynomial time classification typically exposes or indicates the existence of “essential structure”.

When there are high constants and/or exponents, over a period of time improved algorithms are often discovered.

Exceptions – some polynomial time algorithms are not (practically) efficient.

Click on Insert > new slide to choose your cover, then delete this page.

Worst Case Analysis is main focus

Running time guarantee for any input size.

Some algorithms of exponential time complexity are successful/practical because the worst case occurs rarely.

Best Case Analysis

Average case analysis (typically complex calculations)

Amortized Complexity – Sections 5.3 Textbook. Read about it so that you could discuss the idea.

Click on Insert > new slide to choose your cover, then delete this page.

Amortized – break down

16

The order of a nested loop

for to

for to

next

next

Exercise: Count the number of operations which are additions, subtractions, multiplication.

Click on Insert > new slide to choose your cover, then delete this page.

Counting elementary operations

What is classified as an “elementary operation” may vary. E.g. Drozdek focuses on counting the number of assignments (eg changing values in memory) rather than operations. In fact a general rule of thumb is:

Memory references are counted in a data intensive operation as this will correlate well with running time on any machine

Comparison numbers will be the most important in a search of a list.

Elementary operations used in an algorithm to evaluate a polynomial, are the additions and multiplications needed

Click on Insert > new slide to choose your cover, then delete this page.

for to

for to

next

next

Number of calculations is:

Order of complexity:

There are two additions, one multiplication and one subtraction for each iteration of the inner loop. i.e. 4 operations per iteration of inner loop.

Inner loop executes when:

Click on Insert > new slide to choose your cover, then delete this page.

19

To get the order

Which is the simplest rising function that is bigger than f(n)

From Drozdek

Click on Insert > new slide to choose your cover, then delete this page.

Function growth and complexity

Function growth

Big-O notation – informally

Tractability

Polynomial growth

Exponential growth

Execution times

Click on Insert > new slide to choose your cover, then delete this page.

Growth Functions

Different functions can effect numbers, and how the growth can happen rapidly.

Smallest to Greatest

O(1), log2(n), n, (n)log2(n), n2, 2n , n!

Consider N=10

0(1) = 1

log2 (10)= 3.32

10

10 log2 (10)= 33.32

102= 100

210= 1024

10!= 3,628,800

Click on Insert > new slide to choose your cover, then delete this page.

Function growth within polynomial time

lg n is base 2 logarithm

Orders of magnitude:

O(1) constant

O(log n) logarithmic

O(n) linear

O(n log n)

O(n2) quadratic

O(n3) cubic

Note axis scale is very limited.

Figure 2-5 (of Drozdek) Typical functions applied in big-O estimates

Click on Insert > new slide to choose your cover, then delete this page.

Evaluating

Click on Insert > new slide to choose your cover, then delete this page.

Asymptotic complexity

In algorithm analysis, we focus on the growth rate of the running time as a function of the input size n, taking a "big-picture" approach. For example, it is often enough just to know that the running time of an algorithm grows proportionally to n.

The asymptotic complexity is the limiting behaviour of the execution time of an algorithm when the size of the problem goes to infinity.

This is usually denoted in big-O notation.

Definition: A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of input items.

Informally, saying some equation f(n) = O(g(n)) means it is less than some constant multiple of g(n).

Click on Insert > new slide to choose your cover, then delete this page.

Rescaling of time axis

Image source Moore & Mertens

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation

Formal Definition:

f(n) = O(g(n)) means there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k.

The values of c and k must be fixed for the function f and must not depend on n. 

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation – the definition

f(n) is O(g(n)) if there exists numbers c, N > 0

such that for each n ≥ N

f(n) ≤ c∙g(n)

The meaning:

f(n) is larger than g(n) only for finite number of n’s

a constant c and a value N can be found so that for every value of n ≥ N

f(n) ≤ c∙g(n) (such c and the N are called witnesses)

f(n) does not grow more than a constant factor faster than g(n)

We may write f(n) = O(g(n)) but the “=“ does not represent genuine equality.

We say that f(n) is O(g(n)), (because O(g(n)) is a collection - an infinite set).

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation: illustration

N

c∙g(n)

f(n)

n

f(n) ≤ c∙g(n)

n ≥ N

The function f(n) is O(g(n)), since f(n) ≤ c·g(n) when n ≥ N.

Click on Insert > new slide to choose your cover, then delete this page.

General rules

Ignore constants

5n -> 0(n)

Certain terms ‘dominate’ others

O(1) < 0(logn) < o(n), o(nlogn) < o(n2)< o(2n) < o(n!)

e.g. ignore low-order terms when they dominated by higher one

X=5+(15*20) ;

Independent of input size, N

O(1) ‘big O of one’

x=5+(15*20) ;

y=15-2;

Print x + y

Total time = 0(1) + 0(1) + 0(1)

3* O(1) drop constants

O(1)

Click on Insert > new slide to choose your cover, then delete this page.

Cont.

Linear time:

for x in range (0,n):

print x; // O(1)

N * O(1) = O(N)

x=5+(15*20) ; //O(1)

for x in range (0,n):

print x;

Total number = O(1) + O(N) = O(N)

In another word, when N gets large, the time it takes to compute Y is meaningless, as the for-loop dominate the run time.

}//O(N)

Click on Insert > new slide to choose your cover, then delete this page.

Constants

i :=0;

while i<n Do

i=i+1

i :=0;

while i<n Do

i=i+3

F(n) = n

O(f(n)) = O(n)

F(n) = n/3

O(f(n)) = O(n)

Click on Insert > new slide to choose your cover, then delete this page.

Quadratic time

for x in range (0,n):

for x in range (0,n):

print x*y // O(1)

Total number = O(N2)

}//O(N2)

Click on Insert > new slide to choose your cover, then delete this page.

O(?)

for x in range (0,n):

print x;

for x in range (0,n):

for x in range (0,n):

print x*y

What is the total runtime ???

In workshop

Click on Insert > new slide to choose your cover, then delete this page.

Big O and efficiency

An algorithm is deemed efficient if it has a polynomial running time i.e. O(nc) where c is a constant.

Polynomial time classification typically exposes or indicates the existence of “essential structure”.

When there are high constants and/or exponents, over a period of time improved algorithms are often discovered.

Exceptions – some polynomial time algorithms are not (practically) efficient.

More exceptions - Some exponential time algorithms work well in practice because the worst case occurs rarely.

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation: properties

1. (transitivity) If f(n) is O(g(n)) and g(n) is O(h(n)), then

f(n) is O(h(n))

(i.e., f(n) = O(g(n)) = O(O(h(n))) = O(h(n)).

2. If f(n) is O(h(n)) and g(n) is O(h(n)), then

f(n) + g(n) is O(h(n)).

3. ank is O(nk).

4. nk is O(nk+j) for any j > 0.

5. If f(n) = cg(n), then f(n) = O(g(n)).

6. logan = O(logbn) for any numbers a, b > 1.

7. logan is O(log2n) for any positive a ≠ 1

Click on Insert > new slide to choose your cover, then delete this page.

Section 2.3.

Analysis of Algorithms

Relatives of Big-Oh

big-Omega

f(n) is (g(n)) if there is a constant c > 0

and an integer constant n0  1 such that

f(n)  c•g(n) for n  n0

big-Theta

f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0 and an integer constant n0  1 such that c’•g(n)  f(n)  c’’•g(n) for n  n0

Click on Insert > new slide to choose your cover, then delete this page.

Intuition for Asymptotic Notation

Analysis of Algorithms

Big-Oh

f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)

big-Omega

f(n) is (g(n)) if f(n) is asymptotically greater than or equal to g(n)

big-Theta

f(n) is (g(n)) if f(n) is asymptotically equal to g(n)

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Example Uses of the Relatives of Big-Oh

f(n) is (g(n)) if it is (n2) and O(n2). We have already seen the former, for the latter recall that f(n) is O(g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n) < c•g(n) for n  n0

Let c = 5 and n0 = 1

5n2 is (n2)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 1 and n0 = 1

5n2 is (n)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 5 and n0 = 1

5n2 is (n2)

Click on Insert > new slide to choose your cover, then delete this page.

39

Tractability - overview

The area of tractability explores problems and algorithms that can take an impossible amount of computation to solve except perhaps for very small examples of the problem.

Tractable – an efficient algorithm. i.e. Polynomial time.

Efficient - time complexity is at most a polynomial function of input size.

Inefficient:

A brute force algorithm or exhaustive search examines every possibility.

For non-trivial problems this approach leads to “combinatorial explosion” in the number of possibilities.

Unacceptable running time on large inputs. Typically 2n time or worse for inputs of size n. i.e. Exponential run time.

“Or Worse” - the input size need not be very large at all to generate a running time well beyond the lifetime of the universe.

Click on Insert > new slide to choose your cover, then delete this page.

Intractability examples

A problem is intractable in the strongest sense if the problem is undecidable. i.e. No algorithm is possible.

Read about Russel’s Paradox, and Turing’s Halting Problem.

Another strong sense of intractability occurs when the running time to output a solution to a problem is excessive.

Click on Insert > new slide to choose your cover, then delete this page.

Paradox – set of all sets that do not contain themselves

Halting - machine or algorithm that says if any algorithm will halt – put through an algorithm that is the machine plus a not gate

41

P vs. NP and the Computational Complexity Zoo

https://www.youtube.com/watch?v=YX40hbAHx3s

Many examples in this video are not decision problems, so do not use such problems as examples in your assignments

Click on Insert > new slide to choose your cover, then delete this page.

Decision problems

In HIT220 you only look at decision problems

These are yes-or-no question on an infinite set of inputs. The classification of computational complexity classes (e.g. P, NP,NPC) is based on decision problems.

Click on Insert > new slide to choose your cover, then delete this page.

Class P

A deterministic algorithm is a uniquely defined (determined) sequence of steps for a particular input. There is only one way to determine the next step that the algorithm can make.

A problem belongs to the class P of decision problems if it can be solved in polynomial time with a deterministic algorithm.

Two members of P:

Graph 2-colourability

Is G Eulerian?

The bridges of Königsberg problem solved by Euler in 1736.

Click on Insert > new slide to choose your cover, then delete this page.

The Seven Bridges of Königsberg

https://www.youtube.com/watch?v=f-Zy1q9XFQE

Click on Insert > new slide to choose your cover, then delete this page.

The graph of the Bridges of of Königsberg has 4 vertices (land) and 7 edges (representing the bridges).

One could solve the problem using “Brute Force”. An exhaustive search of all possibilities. Exponential time complexity.

Euler’s insight: In order to cross each edge once, any time we arrive at a vertex along one edge, we have to depart on a different edge.

Hence the degree of a vertex v, that is the number of edges incident to each v , must be even.

Click on Insert > new slide to choose your cover, then delete this page.

Class NP

A non-deterministic algorithm is an algorithm that can use a special operation that makes a guess when a decision is to be made.

A problem belongs to the class NP of decision problems if it can be solved by a non-deterministic polynomial time algorithm (on a theoretical machine allowing infinite parallelism).

The class of NP problems consists of problems with the property that “checking” whether or not a claimed solution is correct can be performed in polynomial time.

Two members of NP: Graph k-colourability,

Is G Hamiltonian?

A Hamiltonian path, also called a Hamilton path, is a graph path between two vertices of a graph that visits each vertex exactly once. If a Hamiltonian path exists whose endpoints are adjacent, then the resulting graph cycle is called a Hamiltonian cycle (or Hamiltonian cycle).

So to prove that a decision problem is in NP we would only need to show that checking can be performed in polynomial time. i.e. That there is an efficient checking algorithm.

(Probably the most useful thinking for HIT220)

Click on Insert > new slide to choose your cover, then delete this page.

A non-deterministic algorithm can solve a decision problem if there is a path in the decision tree of the algorithm that leads to a “yes” answer; otherwise it would answer “no”.

If the number of steps in the decision tree path to the affirmative answer is O(nk), where n is the size of the specific problem, the algorithm is considered polynomial.

There are other ways to describe NP, as per Drozdek which is below but I recommend the previous slide.

In HIT220 I suggest a simpler, but equivalent way of discussing class P: After showing that a problem is in NP, we can show membership in class P by describing and analysing an efficient algorithm for determining the Decision Problem. i.e. (1) Show that it is in NP (polynomial time checking algorithm) and then (2) give an algorithm to solve and show that the decision problem itself is O(nk), where n is input size. For example we could do these steps for 2-colourability or Eulerian graph.

Should be clear that, P is a subset of NP. i.e. It is known that every decision problem in P is also in NP. The big unknown is whether or not every problem in NP is also in P. Is it possible that efficient algorithms exist but that we just haven’t found them? Mostly conjectured that P is not equal to NP. See other readings for this week.

Click on Insert > new slide to choose your cover, then delete this page.

Class NPC

A problem is called NP-complete if it is NP and every NP problem can be polynomially reduced to this problem.

Cook 1970 established that SAT (Boolean satisfiability problem) is in NPC - Cook’s Theorem (Stephen Cook, 1970) Read pages 13,14 of Garey and Johnson.

If every instance of a problem P1 can be reduced, in polynomial time, to some instance of P2 then P2 is at least as hard as P1.

If there is a polynomial time algorithm for any NPC problem then every problem in NP can be solved efficiently.

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Showing that any problem is in the class NPC means that there is no polynomial time algorithm unless P = NP.

Most in the field think that is unlikely that P = NP. What do you think? Is it plausible that “Easy to check YES’s” should have “Easy to obtain solutions”?

Our main focus is on awareness of Class NPC and its implication and developing some ability to recognise types of problems which might be intractable.

Click on Insert > new slide to choose your cover, then delete this page.

About Input size and encoding

Complexity of a calculation is measured by expressing the running time as a function of some measure of the input data n needed to describe the problem.

For graph problems, input size n is typically the number of vertices or edges in the graph. Assumption of a reasonable encoding scheme.

For problems that involve integers such as computing the GCD, n is the number of bits it takes to express these integers. i.e. log n rather than n.

Click on Insert > new slide to choose your cover, then delete this page.

Input size and encoding

Complexity of a calculation is measured by expressing the running time as a function of some measure of the input data n needed to describe the problem.

For graph problems, input size n is typically the number of vertices or edges in the graph. Assumption of a reasonable encoding scheme.

For problems that involve integers such as computing the GCD, n is the number of bits it takes to express these integers or memory size. i.e. log n rather than n.

Click on Insert > new slide to choose your cover, then delete this page.

Theoretical perspective

If the running time is at most a polynomial function of the amount of input data then the calculation is called “easy”. Otherwise, it is “hard”.

To show/classify that a computation problem is easy it is enough to describe a fast algorithm.

To prove that a problem is hard means that you would need to prove that it is impossible to find a fast way of solving the problem.

The fact that a computational problem is classified as hard does not mean that every instance of it has to be hard.

A problem is hard because we cannot devise an algorithm for which we can give a guarantee for fast performance for all instances.

Click on Insert > new slide to choose your cover, then delete this page.

Decision Problem example

Graph k-coloring

Input: A graph G=(V,E), a positive integer k.

Question: Is G k-colourable?

A graph is k-colourable if its vertices can be assigned colours from a list of k colours such that no pair of adjacent vertices has the same colour.

Click on Insert > new slide to choose your cover, then delete this page.

Graph 2-coloring

Input: A graph G=(V,E).

Question: Is G 2-colourable?

A polynomial time algorithm is known.

Graph k-coloring

Input: A graph G=(V,E).

Question: Is G k-colourable?

No polynomial time algorithm is known to determine YES or NO for all instances.

However if given a colouring of the vertices of G, checking whether or not it is a k-colouring can be executed in polynomial time.

Click on Insert > new slide to choose your cover, then delete this page.

Sometimes rewording problem helps

56

What is a computer doing?

Consider the Turing model of a machine

https://en.wikipedia.org/wiki/Universal_Turing_machine

Click on Insert > new slide to choose your cover, then delete this page.

Singly Linked List

A singly linked list is a concrete data structure consisting of a sequence of nodes, starting from a head pointer

Each node stores

element

link to the next node

next

elem

node

A

B

C

D

head

Click on Insert > new slide to choose your cover, then delete this page.

The Node Class for List Nodes

public class Node {

// Instance variables:

private Object element;

private Node next;

/** Creates a node with null references to its element and next node. */

public Node() {

this(null, null);

}

/** Creates a node with the given element and next node. */

public Node(Object e, Node n) {

element = e;

next = n;

}

// Accessor methods:

public Object getElement() {

return element;

}

public Node getNext() {

return next;

}

// Modifier methods:

public void setElement(

Object newElem) {

element = newElem;

}

public void setNext(

Node newNext) {

next = newNext;

}

}

Click on Insert > new slide to choose your cover, then delete this page.

Inserting at the Head

Allocate a new node (memory)

Insert new element

Make new node point to old head

Update head to point to new node

Click on Insert > new slide to choose your cover, then delete this page.

Removing at the Head

Update head to point to next node in the list

Allow garbage collector to reclaim the former first node

Click on Insert > new slide to choose your cover, then delete this page.

Inserting at the Tail

Find the tail node

Allocate a new node

Insert new element

Have new node point to null

Have old last node point to new node

Update tail to point to new node

Click on Insert > new slide to choose your cover, then delete this page.

Removing at the Tail

Removing at the tail of a singly linked list is not efficient!

There is no constant-time way to update the tail to point to the previous node

Click on Insert > new slide to choose your cover, then delete this page.

Spatial efficiency

Storage space is the space for storing the input: data and algorithm

Auxiliary Space is the extra space or temporary space used by an algorithm.

Space Complexity of an algorithm is total space taken by the algorithm with respect to the input size. Space complexity includes both Auxiliary space and space used by input.

Click on Insert > new slide to choose your cover, then delete this page.

Improving lists

Double Linked List

We can backtrack

Circular List

In looping through list return easily to head

Click on Insert > new slide to choose your cover, then delete this page.

Recursion

One method of improving efficiency

We reduce the code length and run time

But we may increase the space load

Click on Insert > new slide to choose your cover, then delete this page.

What Is Recursion?

Recursion is a method of solving problems that involves breaking a problem down into smaller and smaller subproblems until you get to a small enough problem that it can be solved trivially.

Recursion involves a function calling itself.

While it may not seem like much on the surface, recursion allows us to write elegant solutions to problems that may otherwise be very difficult to program.

Click on Insert > new slide to choose your cover, then delete this page.

Calculating the Sum of a List of Numbers

We will begin our investigation with a simple problem that you already know how to solve without using recursion.

Suppose that you want to calculate the sum of a list of numbers such as: [1,3,5,7,9].

An iterative function that computes the sum is shown here.

The function uses an accumulator variable (total) to compute a running total of all the numbers in the list by starting with 0 and adding each number in the list.

Click on Insert > new slide to choose your cover, then delete this page.

Premise of Recursion

Pretend that you do not have while loops or for loops. How would you compute the sum of a list of numbers?

If you were a mathematician you might start by recalling that addition is a function that is defined for two parameters, a pair of numbers. To redefine the problem from adding a list to adding pairs of numbers, we could rewrite the list as a fully parenthesized expression. Such an expression looks like this:

((((1+3)+5)+7)+9)

Click on Insert > new slide to choose your cover, then delete this page.

Premise of Recursion

We can also parenthesize the expression the other way around: (1+(3+(5+(7+9))))

Notice that the innermost set of parentheses, (7+9), is a problem that we can solve without a loop or any special constructs. In fact, we can use the following sequence of simplifications to compute a final sum.

total= (1+(3+(5+(7+9))))

total= (1+(3+(5+16)))

total= (1+(3+21))

total= (1+24)

total= 25

Click on Insert > new slide to choose your cover, then delete this page.

A Different Approach

How can we take this idea and turn it into a program, like Python?

First, let’s restate the sum problem in terms of Python lists.

We might say the sum of the list numList is the sum of the first element of the list (numList[0]), and the sum of the numbers in the rest of the list (numList[1:]).

listSum(numList)=first(numList) + listSum(rest(numList))

first(numList) returns the first element of the list

rest(numList) returns a list of everything but the first element

Click on Insert > new slide to choose your cover, then delete this page.

Example Recursive Program

Tracing: http://interactivepython.org/courselib/static/pythonds/Recursion/pythondsCalculatingtheSumofaListofNumbers.html

Click on Insert > new slide to choose your cover, then delete this page.

Program Execution Logic

Click on Insert > new slide to choose your cover, then delete this page.

The Three Rules of Recursion

All recursive algorithms must obey three important rules

A recursive algorithm must have a base case.

A recursive algorithm must change its state and move toward the base case.

A recursive algorithm must call itself, recursively.

Click on Insert > new slide to choose your cover, then delete this page.

The Rules!

First, a base case is the condition that allows the algorithm to stop recursing. A base case is typically a problem that is small enough to solve directly.

In the list_sum algorithm the base case is a list of length 1.

Second, we must arrange for a change of state that moves the algorithm toward the base case. A change of state means that some data that the algorithm is using is modified. Usually the data that represents our problem gets smaller in some way.

In the list_sum algorithm our primary data structure is a list, so we must focus our state-changing efforts on the list. Since the base case is a list of length 1, a natural progression toward the base case is to shorten the list.

Third, the algorithm must call itself.

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 2 /._Week2.pptx

Assignment 2_Resources/Week 2 /Week 2 workshop.pptx

Complexity Theory

Cat Kutay

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Big-Oh Rules

If is f(n) a polynomial of degree d, then f(n) is O(nd), i.e.,

Drop lower-order terms

Drop constant factors

Use the smallest possible class of functions

Say “2n is O(n)” instead of “2n is O(n2)”

Use the simplest expression of the class

Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Asymptotic Algorithm Analysis

The asymptotic analysis of an algorithm determines the running time in big-Oh notation

To perform the asymptotic analysis

We find the worst-case number of primitive operations executed as a function of the input size

We express this function with big-Oh notation

Example:

We say that algorithm find_max “runs in O(n) time”

Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

3

Asymptotic complexity – an example

Most commonly used notation for (asymptotic) complexity is "big-O" notation

which estimates the rate of growth of a function, for “large” n.

f (n) = n2 + 100n + log10n + 1,000 = O(n2)

" f (n) is big-oh of n squared ”

The order of complexity of f (n) is a polynomial function (quadratic)

Figure 2-1 (Drozdek) The growth rate of all terms of function f (n) = n2 + 100n + log10n + 1,000

Click on Insert > new slide to choose your cover, then delete this page.

Which is the fastest growing component

4

Execution time

Figure 2-4 (Drozdek) Classes of algorithms and their execution times on a computer executing 1 million operations per second (1 sec = 106 μsec (microseconds) = 103 msec (milliseconds))

Click on Insert > new slide to choose your cover, then delete this page.

Computing Prefix Averages

We further illustrate asymptotic analysis with two algorithms for prefix averages

The i-th prefix average of an array X is average of the first (i + 1) elements of X:

A[i] = (X[0] + X[1] + … + X[i])/(i+1)

Computing the array A of prefix averages of another array X has applications to financial analysis

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Prefix Averages (Quadratic)

The following algorithm computes prefix averages in quadratic time by applying the definition

Click on Insert > new slide to choose your cover, then delete this page.

Arithmetic Progression

The running time of prefixAverage1 is O(1 + 2 + …+ n)

The sum of the first n integers is n(n + 1) / 2

There is a simple visual proof of this fact

Thus, algorithm prefixAverage1 runs in O(n2) time

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Prefix Averages 2 (Looks Better)

The following algorithm uses an internal Python function to simplify the code

Algorithm prefixAverage2 still runs in O(n2) time!

Click on Insert > new slide to choose your cover, then delete this page.

Reduce complexity

If x>0:

// O(I)

else if x<0:

//O(logn)

else:

//O(n2)

Click on Insert > new slide to choose your cover, then delete this page.

O(?) from seminar

for x in range (0,n):

print x;

for x in range (0,n):

for x in range (0,n):

print x*y

What is the total runtime ???

O(N2)

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation: examples

These functions are O(n2): 3n2 1000n2 + 10000 3n 3n2 + log n + 123 These functions are not O(n2): 3n2.01 3n3

Click on Insert > new slide to choose your cover, then delete this page.

Exercises

Use the definition of big-O to show that f is O(g) for each of the following functions. (i.e. Find the witnesses c and N.) In each case, assume that n is a positive integer.

Remember The function f(n) is O(g(n)), since f(n) ≤ c·g(n) when n ≥ N.

f(n) = 8n, g(n) = n

We want c and N such that 8n ≤ c × g(n) for all n ≥ N.

With c = 8, N = 1 we have

8n ≤ 8n which is true as required for all n ≥ 1.

Click on Insert > new slide to choose your cover, then delete this page.

(b) f(n) = 4n − 1, g(n) = n

We want c and N such that 4n − 1 ≤ c × n for all n ≥ N.

Similarly, try c = 4, N = 1.

4n − 1 ≤ 4n − 1 which is true as required for all n ≥ 1.

(c) f(n) = n2 + 2n + 1, g(n) = n2 (Details in Exercises – workshop slide)

0 ≤ n 2 + 2n + 1 ≤ n 2 + 2n 2 + n2 = 4n2

Hence, c = 4 and N = 1 are witnesses.

(d) f(n) = 2n2 + 5n − 3, g(n) = n2

2n2 + 5n − 3 ≤ 2n2 + 5n2 − 3n2

≤ 4n2 , for n ≥ 1

Hence, c = 4 and N = 1 are witnesses.

(e) f(n) = 457, g(n) = 1

c = 457 and N = 1.

The function f(n) is O(g(n)), since f(n) ≤ c·g(n) when n ≥ N.

OR 0 ≤ f(n) ≤ c.g(n) for all n ≥ N

Click on Insert > new slide to choose your cover, then delete this page.

(f) Show that 3n2 + 25 is O(n2)

3n2 + 25 ≤ 3n2 + 25n2 n ≥ 1

3n2 + 25 ≤ 28n2

Hence, c = 28 and N = 1 are witnesses.

(g) Show that 2n2 is O(n3)

2n2 ≤ 2n3

Hence, c = 1 and N = 2 are witnesses.

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Exercises

Click on Insert > new slide to choose your cover, then delete this page.

Figure 2-3 Comparison of functions for different values of c and N from Figure 2-2

Figure 2-2 (Drozdek) Different values of c and N for function f (n) = 2n2 + 3n + 1 = O(n2)

calculated according to the definition of big-O

Click on Insert > new slide to choose your cover, then delete this page.

TSP – Travelling salesman problem

Below link, solves the problem for however many cities you want to select by trying out all possible routes, and recording the best so far. You can get a feel for what an intractable problem looks like by seeing how long the interactive takes to solve the problem for different size maps

https://csfieldguide.org.nz/en/interactives/city-trip/

Stop, set up cities then start

From https://csfieldguide.org.nz/en/chapters/complexity-and-tractability/whats-the-big-picture/

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Prefix Averages 3 (Linear Time)

The following algorithm computes prefix averages in linear time by keeping a running sum

Algorithm prefixAverage3 runs in O(n) time

Click on Insert > new slide to choose your cover, then delete this page.

Lists

What are the features of a list

What are the operations required on a list

Click on Insert > new slide to choose your cover, then delete this page.

The Node Class for List Nodes

public class Node {

// Instance variables:

private Object element;

private Node next;

/** Creates a node with null references to its element and next node. */

public Node() {

this(null, null);

}

/** Creates a node with the given element and next node. */

public Node(Object e, Node n) {

element = e;

next = n;

}

// Accessor methods:

public Object getElement() {

return element;

}

public Node getNext() {

return next;

}

// Modifier methods:

public void setElement(

Object newElem) {

element = newElem;

}

public void setNext(

Node newNext) {

next = newNext;

}

}

Click on Insert > new slide to choose your cover, then delete this page.

Recursion example

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Program Execution Logic

Click on Insert > new slide to choose your cover, then delete this page.

Self Test

How many recursive calls are made when computing the sum of the list [2,4,6,8,10]?

(A) 6

(B) 5

(C) 4

(D) 3

The first recursive call passes the list [4,6,8,10], the second [6,8,10] and so on until [10].

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Maths to review

Analysis of Algorithms

Summations

Logarithms and Exponents

Proof techniques

Basic probability

properties of logarithms:

logb(xy) = logbx + logby

logb (x/y) = logbx - logby

logbxa = alogbx

logba = logxa/logxb

properties of exponentials:

a(b+c) = aba c

abc = (ab)c

ab /ac = a(b-c)

b = a logab

bc = a c*logab

Click on Insert > new slide to choose your cover, then delete this page.

0

5

10

15

20

25

30

35

1234567

X

A

Chart1

21 21
23 22
25 23
31 25
20 24
18 23
16 22
X
A

Sheet1

X 21 23 25 31 20 18 16
A 21 22 23 25 24 23 22

0

1

2

3

4

5

6

7

123456

__MACOSX/Assignment 2_Resources/Week 2 /._Week 2 workshop.pptx

Assignment 2_Resources/Week 4/Priority Queues.pptx

HIT220 Algorithms and Complexity

Priority Queues

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

1

Circular Linked Lists

Circular linked lists are a type of linked list in which the last node links back to the head of the list instead of pointing to None.

Circular linked lists have use cases such as:

Going around each player’s turn in a multiplayer game

Managing the application life cycle of a given operating system

Implementing a Fibonacci heap for a priority queue

Click on Insert > new slide to choose your cover, then delete this page.

Implementing

One of the advantages of circular linked lists is that you can traverse the whole list starting at any node. Since the last node points to the head of the list, you need to make sure that you stop traversing when you reach the starting point. Otherwise, you’ll end up in an infinite loop.

In terms of implementation, circular linked lists are very similar to singly linked list. The only difference is that you can define the starting point when you traverse the list:

Click on Insert > new slide to choose your cover, then delete this page.

Python

Click on Insert > new slide to choose your cover, then delete this page.

Priority Queue

What is a queue

What is a priority queue

struct item { int item; int priority; }

Click on Insert > new slide to choose your cover, then delete this page.

Heap

A Heap is a tree structure in which each parent node is less than or equal to its child node for a Min Heap and greater than or equal to its child node for a Max heap.

Order needs to be introduced to the heap to achieve the desired running time O(log n) . In particular, the degrees of the nodes are kept quite low: every node has degree at most O(log n) and the size of a subtree rooted in a node of degree k is at least Fk+2, where Fk is the kth Fibonacci number

Click on Insert > new slide to choose your cover, then delete this page.

Min Binary Heap

Click on Insert > new slide to choose your cover, then delete this page.

Implementing

import heapq H = [21,1,45,78,3,5]

# Use heapify to rearrange the elements heapq.heapify(H)

print(H)

[1, 3, 5, 78, 21, 45]

heapq.heappush(H,8)

print(H)

[1, 3, 5, 78, 21, 45, 8]

heapq.heappop(H)

[3, 8, 5, 78, 21 ,45]

heapq.heapreplace(H,6)

[5, 8, 6, 78, 21, 45]

Click on Insert > new slide to choose your cover, then delete this page.

Converting

A Min-Heap is a complete binary tree in which the value in each internal node is smaller than or equal to the values in the children of that node. Mapping the elements of a heap into an array is trivial: if a node is stored a index k, then its left child is stored at index 2k + 1 and its right child at index 2k + 2.

Click on Insert > new slide to choose your cover, then delete this page.

Some operations in binary heaps

getMin(): returns the root element. Time Complexity of this operation is O(1).

extractMin(): Removes the minimum element. Time Complexity of this Operation is O(Log n) as this operation needs to maintain the heap property after removing root.

insert(): Inserting a new key takes O(Log n) time. We add a new key at the end of the tree, if new key is larger than its parent. If not we need to traverse up to fix the violated heap property.

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci heap

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction/

Click on Insert > new slide to choose your cover, then delete this page.

Complexity

Find Min: Θ(1) [Same as both Binary and Binomial]

Delete Min: O(Log n) [Θ(Log n) in both Binary and Binomial]

Insert: Θ(1) [Θ(Log n) in Binary and Θ(1) in Binomial]

Decrease-Key: Θ(1) [Θ(Log n) in both Binary and Binomial]

Merge: Θ(1) [Θ(m Log n) or Θ(m+n) in Binary and Θ(Log n) in Binomial]

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction

Click on Insert > new slide to choose your cover, then delete this page.

Matrices and vectors

Why do we use them and how

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking Matrix

Eigenvectors

Eigenvalues

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking

From Tim Chartier and Anne Greenbaum

Adjacency matrix

Click on Insert > new slide to choose your cover, then delete this page.

Ranking matrix

Form a matrix M where element mij gives the probability of a surfer visit webpage j from webpage i, which implies

Click on Insert > new slide to choose your cover, then delete this page.

Stochastic matric or transition matrix

Start a walk at page 1, represented by vector v

Click on Insert > new slide to choose your cover, then delete this page.

Calculating probability of each page

v2 = v1M = (0 0.33 0.33 0 0.33 0)

v3 = v2M = (0.67 0 0.17 0 0 0.17)

v3 = vM3

The adjacency matrix is sparse

Note all page ranks add to 1

V becomes an eigenvector of M

Click on Insert > new slide to choose your cover, then delete this page.

Thank you [email protected]

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 4/._Priority Queues.pptx

Assignment 2_Resources/Week 4/.DS_Store

__MACOSX/Assignment 2_Resources/Week 4/._.DS_Store

Assignment 2_Resources/Week 4/Stacks and Queues.pptx

Week 4: Stacks and Queues

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

1

Space Efficiency

Example of using tail recursion:

def fact(n):

    if (n == 0):

        return 1

    return n * fact(n-1)

Is this the same as:

def fact(n, a = 1):

    if (n == 1):

        return a

    return fact(n - 1, n * a)

Click on Insert > new slide to choose your cover, then delete this page.

Program Execution Logic

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Relatives of Big-Oh

big-Omega

f(n) is (g(n)) if there is a constant c > 0

and an integer constant n0  1 such that

f(n)  c•g(n) for n  n0

big-Theta

f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0 and an integer constant n0  1 such that c’•g(n)  f(n)  c’’•g(n) for n  n0

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Example Uses of the Relatives of Big-Oh

f(n) is (g(n)) if it is (n2) and O(n2). We have already seen the former, for the latter recall that f(n) is O(g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n) < c•g(n) for n  n0

Let c = 5 and n0 = 1

5n2 is (n2)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 1 and n0 = 1

5n2 is (n)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 5 and n0 = 1

5n2 is (n2)

Click on Insert > new slide to choose your cover, then delete this page.

5

The Node Class for List Nodes Coding practice

public class Node {

// Instance variables:

private Object element;

private Node next;

/** Creates a node with null references to its element and next node. */

public Node() {

this(null, null);

}

/** Creates a node with the given element and next node. */

public Node(Object e, Node n) {

element = e;

next = n;

}

// Accessor methods:

public Object getElement() {

return element;

}

public Node getNext() {

return next;

}

// Modifier methods:

public void setElement(

Object newElem) {

element = newElem;

}

public void setNext(

Node newNext) {

next = newNext;

}

}

© 2013 Goodrich, Tamassia, Goldwasser

Click on Insert > new slide to choose your cover, then delete this page.

Insertion/Deletion in List

// insertion methods

public insert(Object e, Node n) {

//insert e after node n

newNode=node(e, n.next)

n.next=newNode;

}

//deletion

Public delete (Object e)

//delete node with data e

node=head;

while(node is not None):

    if node.element ==e:

prev.next=node.next

//garbage collection

del node;

else:

prev=node;

node=node.next

© 2013 Goodrich, Tamassia, Goldwasser

Click on Insert > new slide to choose your cover, then delete this page.

Retrieval

//deletion

Public delete (Object e)

//delete node with data e

node=head;

while(node is not None):

    if node.element ==e:

return True; prev=node;

node=node.next

return False;

© 2013 Goodrich, Tamassia, Goldwasser

Click on Insert > new slide to choose your cover, then delete this page.

Basic Data Structures

The Internet is designed to route information in discrete packets, which are at most 1500 bytes in length. Because of this design, any time a video stream is transmitted on the Internet, it must be subdivided into packets and these packets must each be individually routed to their destination.

In the case of a stored video file, it is desirable to send packets using the Transmission Control Protocol (TCP), since this protocol guarantees that all the packets will arrive at their destination in the correct order.

Nevertheless, because of vagaries and errors, the time it takes for these packets to arrive at their destination can be highly variable. Thus, we need a way of “smoothing out” these variations in order to avoid long pauses if someone wants to watch a video at the same time it is being transmitted. (like a buffer)

which is a portion of computer memory that is used to temporarily store items, as they are being produced by one computational process and consumed by another.

Click on Insert > new slide to choose your cover, then delete this page.

Example

Internet web browsers store the addresses of recently visited sites on a stack. Each time a user visits a new site, that site’s address is “pushed” onto the stack of addresses. The browser then allows the user to “pop” back to previously visited sites using the “back” button.

Click on Insert > new slide to choose your cover, then delete this page.

Stacks

A stack is a container of objects that are inserted and removed according to the last-in first-out (LIFO) principle.

Linear data structure

Flexible size (delete and add as you go!)

Objects can be inserted into a stack at any time, but only the most-recently inserted (that is, “last”) object can be removed at any time.

The name “stack” is derived from the metaphor of a stack of plates in a spring loaded cafeteria plate dispenser.

In this case, the fundamental operations involve the “pushing”

and “popping” of plates on the stack.

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Stacks

stack, S, is a container that supports the following two methods:

push(o): Insert object o at the top of the stack.

pop(): Remove from the stack and return the top object on the stack, that is, the most recently inserted element still in the stack; an error occurs if the stack is empty.

Stack data structure can be implemented in two ways. They are as follows...

Using Array (Python uses Lists, eg: my_var = [item1, item2] )

Using Linked List

Click on Insert > new slide to choose your cover, then delete this page.

Application

The simplest application of a stack is to reverse a word. You push a given word to stack - letter by letter - and then pop letters from the stack.

Another application is an "undo" mechanism in text editors; this operation is accomplished by keeping all text changes in a stack.

Click on Insert > new slide to choose your cover, then delete this page.

A Simple Array-Based Implementation

A stack is easily implemented with an N-element array S, with elements stored from S[0] to S[t], where t is an integer that gives the index of the top element in S.

We must specify some maximum size N for our stack, say, N = 1, 000

Click on Insert > new slide to choose your cover, then delete this page.

Array-based implementation

If array begin at index 0

(t −1) for an initially empty stack, also to test if a stack is empty.

(t+1) to determine the number of elements in a stack.

Error condition that arises:

if we try to insert a new element and the array S is full.

If we try to remove a element from an empty array S

Algorithm push(obj):

if t + 1 = N then

return that a stack-full error has occurred

t ← t + 1

S[t] ← obj

return

Algorithm pop():

if t < 0 then

return that a stack-empty error has occurred

obj ← S[t]

S[t] ← null (or None in Python)

t ← t − 1

return obj

each method runs in O(1) time.

Click on Insert > new slide to choose your cover, then delete this page.

Implementing a Stack

Stack() creates a new stack that is empty. It needs no parameters and returns an empty stack.

push(item) adds a new item to the top of the stack. It needs the item and returns nothing.

pop() removes the top item from the stack. It needs no parameters and returns the item. The stack is modified.

peek() returns the top item from the stack but does not remove it. It needs no parameters. The stack is not modified.

is_empty() tests to see whether the stack is empty. It needs no parameters and returns a boolean value.

size() returns the number of items on the stack. It needs no parameters and returns an integer.

Click on Insert > new slide to choose your cover, then delete this page.

Example

Click on Insert > new slide to choose your cover, then delete this page.

The biggest issue is that it can run into speed issue as it grows. The items in list are stored next to each other in memory, if the stack grows bigger than the block of memory that currently hold it, then Python needs to do some memory allocations. This can lead to some append() calls taking much longer than other ones.

Stack implemented using array is not suitable, when we don't know the size of data which we are going to use.

A stack data structure can be implemented by using linked list data structure. The stack implemented using linked list can work for unlimited number of values. That means, stack implemented using linked list works for variable size of data.

So, there is no need to fix the size at the beginning of the implementation. The Stack implemented using linked list can organize as many data values as we want. 

Linked List-based implementation provides the best (from the efficiency point of view) dynamic stack implementation.

Click on Insert > new slide to choose your cover, then delete this page.

Implementation of the

Stack class in Python

Demonstration of the Stack class being used in Python

Click on Insert > new slide to choose your cover, then delete this page.

Linked List-based implementation

In linked list implementation of a stack, every new element is inserted as 'top' element. That means every newly inserted element is pointed by 'top'. Whenever we want to remove an element from the stack, simply remove the node which is pointed by 'top' by moving 'top' to its next node in the list. The next field of the first element must be always NULL.

Click on Insert > new slide to choose your cover, then delete this page.

push(value) - Inserting an element into the Stack

We can use the following steps to insert a new node into the stack...

Step 1: Create a newNode with given value.

Step 2: Check whether stack is Empty (top == NULL)

Step 3: If it is Empty, then set newNode → next = NULL.

Step 4: If it is Not Empty, then set newNode → next = top.

Step 5: Finally, set top = newNode.

Click on Insert > new slide to choose your cover, then delete this page.

pop() - Deleting an Element from a Stack

We can use the following steps to delete a node from the stack...

Step 1: Check whether stack is Empty (top == NULL).

Step 2: If it is Empty, then display "Stack is Empty!!! Deletion is not possible!!!" and terminate the function

Step 3: If it is Not Empty, then define a Node pointer 'temp' and set it to 'top'.

Step 4: Then set 'top = top → next'.

Step 7: Finally, delete 'temp' (free(temp)).

Click on Insert > new slide to choose your cover, then delete this page.

display() - Displaying stack of elements

We can use the following steps to display the elements (nodes) of a stack...

Step 1: Check whether stack is Empty (top == NULL).

Step 2: If it is Empty, then display 'Stack is Empty!!!' and terminate the function.

Step 3: If it is Not Empty, then define a Node pointer 'temp' and initialize with top.

Step 4: Display 'temp → data --->' and move it to the next node. Repeat the same until temp reaches to the first node in the stack (temp → next != NULL).

Step 4: Finally! Display 'temp → data ---> NULL'.

Click on Insert > new slide to choose your cover, then delete this page.

An implementation of the

Stack class in Python utilizing a Linked List

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Example: Using a buffer to smooth video transmission over the Internet

The networking process is producing the packets and the playback process is consuming them. Thus, ignoring any rewinding, this producer-consumer model is enforcing a first-in, first-out (FIFO) protocol for the packets, in that the consumer process is always retrieving the packet that has been in the buffer the longest.

Click on Insert > new slide to choose your cover, then delete this page.

Queue buffers

Network packets are produced by the TCP process, which inserts packets in the correct order into the buffer, but does so at a variable rate.

Packets are consumed by the video player process, which removes packets from the buffer at a constant speed according to the video standard it is using.

The buffer enforces a first-in, first-out (FIFO) protocol and, if it is big enough, it should smooth out the packet production and consumption so that packets can be processed at the average transmission rate without annoying pauses.

This week, we will explore how to implement such a buffer using a queue data structure, which itself can be implemented with either an array or a linked list.

Click on Insert > new slide to choose your cover, then delete this page.

Queues

Queue is a linear data structure in which the insertion and deletion operations are performed at two different ends. In a queue data structure, adding and removing of elements are performed at two different positions. The insertion is performed at one end and deletion is performed at other end. In a queue data structure, the insertion operation is performed at a position which is known as 'rear' and the deletion operation is performed at a position which is known as 'front'. In queue data structure, the insertion and deletion operations are performed based on FIFO (First In First Out) principle.

Click on Insert > new slide to choose your cover, then delete this page.

A queue supports the following two fundamental methods:

enqueue(o): Insert object o at the rear of the queue.

dequeue(): Remove and return from the queue the object at the front; an error occurs if the queue is empty.

Queue data structure can be implemented in two ways. They are as follows...

Using Array

Using Linked List

Click on Insert > new slide to choose your cover, then delete this page.

Example

Queue after inserting 25, 30, 51, 60 and 85.

Click on Insert > new slide to choose your cover, then delete this page.

Example

Click on Insert > new slide to choose your cover, then delete this page.

Implementing a Queue

Queue() creates a new queue that is empty. It needs no parameters and returns an empty queue.

enqueue(item) adds a new item to the rear of the queue. It needs the item and returns nothing.

dequeue() removes the front item from the queue. It needs no parameters and returns the item. The queue is modified.

is_empty() tests to see whether the queue is empty. It needs no parameters and returns a boolean value.

size() returns the number of items in the queue. It needs no parameters and returns an integer.

display() - To display the elements of the queue.

Click on Insert > new slide to choose your cover, then delete this page.

Implementation of the

Queue class in Python

Demonstration of the Queue class being used in Python

Click on Insert > new slide to choose your cover, then delete this page.

Queue implementation using singly linked list

set_next  will take a new_node as a parameter and will set the pointer from previous Node towards this Node.

In the Node class, we will set the data and next_node(i.e pointer) equal to None as parameter in init method so that if we don’t send any data and pointer to the next node, it will return None.

Click on Insert > new slide to choose your cover, then delete this page.

An implementation of the

Queue class in Python utilizing a Linked List

Code Continues 

Click on Insert > new slide to choose your cover, then delete this page.

Simulation: Hot Potato

https://www.youtube.com/watch?v=uCsD3ZGzMgE

This game is a modern-day equivalent of the famous Josephus problem.

Josephus and 39 of his comrades held out against the Romans in a cave

They decided that they would rather die than be slaves to the Romans

They arranged themselves in a circle. One man was designated as number one, and proceeding clockwise they killed every seventh man.

Josephus instantly figured out where he ought to sit in order to be the last to go.

When the time came, instead of killing himself, he joined the Roman side.

You can find many different versions of this story. Some count every third man and some allow

the last man to escape on a horse. In any case, the idea is the same.

Click on Insert > new slide to choose your cover, then delete this page.

Simulation: Hot Potato

To simulate the circle, we will use a queue (see Figure 3.14).

Assume that the child holding the potato will be at the front of the queue.

Upon passing the potato, the simulation will simply dequeue and then immediately enqueue that child, putting them at the end of the line.

They will then wait until all the others have been at the front before it will be her turn again.

After num dequeue/enqueue operations, the child at the front will be removed permanently and another cycle will begin.

This process will continue until only one name remains (the size of the queue is 1).

Click on Insert > new slide to choose your cover, then delete this page.

Pseudo code

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 4/._Stacks and Queues.pptx

Assignment 2_Resources/Week 4/HIT220 Wk 4 Activity Singly Linked List.docx

HIT 220 Week 4 Activity Singly Linked List

Implement a singly linked list for primitive integers

a) Using the implementation details below as a guide, write and compile a program to accept and store (unordered) integers as a singly linked list. Do not use any library classes.

Modify to your own coding style if you like but your list class must include at least methods:

· Add or delete from either the head or tail. (4 methods)

· Find a given integer (e.g. method boolean isInList) (1 method)

· Delete a given integer from the list. (1 method)

b) Test your program for robustness.

If using pseudo code you will need to consider odd cases, etc

You will need a main method but hardcoding of input and your method calls is fine for the current purpose.

c) Extend your program to include iterative methods:

· To insert a node at the ith position in the list

· to delete the ith node

d) Write up a summary of the big O complexity of each algorithm in your program. In each case, write a sentence to explain the complexity. (You do not need to give calculations or formal proof here.)

https://www.geeksforgeeks.org/linked-list-set-1-introduction/

__MACOSX/Assignment 2_Resources/Week 4/._HIT220 Wk 4 Activity Singly Linked List.docx

Assignment 2_Resources/Week 4/Part 2.pptx

Week 4: Hash and Matrices

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

1

Simulation: Hot Potato

To simulate the circle, we will use a queue (see Figure 3.14).

Assume that the child holding the potato will be at the front of the queue.

Upon passing the potato, the simulation will simply dequeue and then immediately enqueue that child, putting them at the end of the line.

They will then wait until all the others have been at the front before it will be her turn again.

After num dequeue/enqueue operations, the child at the front will be removed permanently and another cycle will begin.

This process will continue until only one name remains (the size of the queue is 1).

Click on Insert > new slide to choose your cover, then delete this page.

Iteration

At each cycle, kth person removed, leaving n-1. So the (n-1)th starting state is now k-1 moved from the original starting point

Therefore

F(n,k)=(F(n-1,k)+k-1) mod n +1

F(1,k)=1

Click on Insert > new slide to choose your cover, then delete this page.

Pseudo code

def josephus(n, k):

if (n == 1):

return 1

else:

# The position returned by josephus(n - 1, k) is adjusted

# because the recursive call josephus(n - 1, k) considers

# the original position k%n + 1 as position 1

return (josephus(n - 1, k) + k-1) % n + 1

Click on Insert > new slide to choose your cover, then delete this page.

Heap types

Binary heap

Binomial heap

Fibonacci heap

Click on Insert > new slide to choose your cover, then delete this page.

Binary heap

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci heap

Fibonacci Heap is a collection of trees with min-heap or max-heap property.

Joined by a circular doubly linked list for accessing min value

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci heap

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction/

Click on Insert > new slide to choose your cover, then delete this page.

Complexity comparison

Find Min: Θ(1) [Same as both Binary and Binomial]

Delete Min: O(Log n) [Θ(Log n) in both Binary and Binomial]

Insert: Θ(1) [Θ(Log n) in Binary and Θ(1) in Binomial]

Decrease-Key: Θ(1) [Θ(Log n) in both Binary and Binomial]

Merge: Θ(1) [Θ(m Log n) or Θ(m+n) in Binary and Θ(Log n) in Binomial]

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction

Click on Insert > new slide to choose your cover, then delete this page.

Skipped list

Click on Insert > new slide to choose your cover, then delete this page.

Example

1, 5, 7, 9, 13, 15, 17, 19, 25

Condition 1: Every third odd position node goes to second level and points to the position two node ahead.

Condition 2: Every 4th position node goes to 3rd level and points to the position four node ahead.

Find the sequence of searching of the following numbers: 13,1 and 20.

Click on Insert > new slide to choose your cover, then delete this page.

Matrices and vectors

We have been using linear arrays and lists

Why do we use 2D arrays and how

Click on Insert > new slide to choose your cover, then delete this page.

Arrays as coordinates

Vectors are lists

In physics the numbers represent direction in space and length

eg [1 2] is a vector from zero to (1, 2) ie 1 on x axis and 2 on y axis

In computer science we can put any data in a vector and consider it a measure of the object that data is describing

Eg [1 2] is the size measure of a box 1m by 2m. We can compare it to another box [3 4]

https://www.youtube.com/watch?v=fNk_zzaMoSs

Click on Insert > new slide to choose your cover, then delete this page.

Matrices

Matrices can be a combination of such data points, or vectors

Eg an array of vectors is a matrix

[1 2]

[1 2]

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking Matrix

Eigenvectors

Eigenvalues

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking

From Tim Chartier and Anne Greenbaum

Adjacency matrix

Click on Insert > new slide to choose your cover, then delete this page.

Ranking matrix

Form a matrix M where element mij gives the probability of a surfer visit webpage j from webpage i, which implies

Click on Insert > new slide to choose your cover, then delete this page.

Stochastic matric or transition matrix

Start a walk at page 1, represented by vector v

Click on Insert > new slide to choose your cover, then delete this page.

Calculating probability of each page

v2 = v1M = (0 0.33 0.33 0 0.33 0)

v3 = v2M = (0.67 0 0.17 0 0 0.17)

v3 = vM3

The adjacency matrix is sparse

Note all page ranks add to 1

V becomes an eigenvector of M

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 4/._Part 2.pptx

Assignment 2_Resources/Week 4/HIT220 Wk 4 Activity Answers Linked List.docx

HIT 220 Week 4 Activity Singly Linked List

Implement a singly linked list for primitive integers

a) Using the implementation details below as a guide, write and compile a program to accept and store (unordered) integers as a list using a singly linked list. Do not use any library classes.

https://www.geeksforgeeks.org/linked-list-set-1-introduction/

Look up lists in Python, what are they, how are they used.

Create Node

CreateNode(data, node)

Allocate data

Allocate next

Return node

Create Linked Listhead (integers)

Node=Create first node

Head = node

Repeat for each data_value in integers:

Node2=Create new_node (data_value,null)

Node2.next=node

Node=node2

Head=node

Return node

Create linked listtail (integers)

Node=Create first node

Head = node

Repeat for each data_value in integers:

node2=Create_node(data,null)

node.next=node2

node=node2

Add_to_head (data)

Allocate head to node.next

Return node

Add_to_tail (data)

Create new_node(data)

Search_list_tail() #where tail.next=null

Tail.next =node

Search_list_tail()

While node.next not equal null

Node=node.next

Return node

Delete-from-head()

head =head.next;

Modify to your own coding style if you like but your list class must include at least methods:

· Add or delete in a list (append, extend, or insert).

· Find a given element in the list : index() (e.g. method boolean isInList)

· Delete a given element from the list.

b) Test your program for robustness.

If using pseudo code you will need to consider odd cases, etc

You will need a main method but hardcoding of input and your method calls is fine for the current purpose.

c) Extend your program to include iterative methods:

· To insert a node at the ith position in the list

· to delete the ith node

d) Write up a summary of the big O complexity of each algorithm in your program. In each case, write a sentence to explain the complexity. (You do not need to give calculations or formal proof here.)

__MACOSX/Assignment 2_Resources/Week 4/._HIT220 Wk 4 Activity Answers Linked List.docx

Assignment 2_Resources/Week 5/.DS_Store

__MACOSX/Assignment 2_Resources/Week 5/._.DS_Store

Assignment 2_Resources/Week 5/Binary trees.pptx

Week 5: Trees and Binary Trees

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

1

Visualising tools

https://pythontutor.com/

https://visualgo.net/en

Click on Insert > new slide to choose your cover, then delete this page.

Trees

Now that we have studied linear data structures like stacks and queues and have some experience with recursion, we will look at a common data structure called the tree.

Trees are used in many areas of computer science, including operating systems, graphics, database systems, and computer networking.

Trees represent data in the real world. Like a  family tree where you have a parents whose children are below them in the tree and then you've got the children of those children even further down the tree.

Trees are like trees, where the continual subdivision of trunk, branch, stems to more and more details functions in the photosynthesis process, communicating down the tree via sap

Click on Insert > new slide to choose your cover, then delete this page.

Decision Trees

A decision tree is a map of the possible outcomes of a series of related choices. It allows an individual or organization to weigh possible actions against one another based on their costs, probabilities, and benefits.

They can be used either to drive informal discussion or to map out an algorithm that predicts the best choice mathematically.

Click on Insert > new slide to choose your cover, then delete this page.

Expression Trees: Syntax Tree in Compiler

if you want evaluate the expression 45 / (3 +6), you actually represent it really cleanly as a tree and be able to evaluate it much more easily.

Click on Insert > new slide to choose your cover, then delete this page.

File Systems

File systems are perfect examples of trees.  Where if someone wants to know that a path is root, then users, admin, then steve,  we know it based on the tree structure.  

Click on Insert > new slide to choose your cover, then delete this page.

Auto corrector and spell checker :

Click on Insert > new slide to choose your cover, then delete this page.

Next Move in games:

Click on Insert > new slide to choose your cover, then delete this page.

There's lots of trees in computer science.  There's regular trees, binary trees, heaps, there’s binary search trees, there's Huffman trees, there's AVL trees, randomized search trees, red black trees, tons of these, all throughout computer science.

Why are there so many trees? They're a dynamic data structure (easy to add and remove). They have the benefits of both an ordered array and a linked list as search is as quick as in a sorted array and insertion or deletion operations (excluding the search) are as fast as in linked list.

Unlike Array and Linked List, which are linear data structures, tree is hierarchical (or non-linear) data structure.

The more powerful feature to trees is that their structure conveys information.  So the fact that users is a child of root tells me that that a path exists.

Click on Insert > new slide to choose your cover, then delete this page.

Advantages of trees

Trees reflect structural relationships in the data

Trees are used to represent hierarchies

Trees are efficient for insertion and searching

Trees are a very flexible data type, allowing to move subtrees around with minimum effort

Click on Insert > new slide to choose your cover, then delete this page.

Vocabulary and Definitions

A tree is a data type that consists of nodes and arcs

These trees are depicted upside down with the root at the top and the leaves (terminal nodes) at the bottom

The root is a node that has no parent; it can have only child nodes

Leaves have no children (their children are null)

Path − the sequence of nodes along the edges of a tree.

The number of arcs in a path is called the length of the path.

The level of a node is the length of the

path from the root to the node plus 1

(which is the number of nodes in the path)

The height of a nonempty tree is the

maximum level of a node in the tree

Click on Insert > new slide to choose your cover, then delete this page.

Examples

Figure 6-1 Examples of trees

Click on Insert > new slide to choose your cover, then delete this page.

Figure 6-2 Hierarchical structure of a university shown as a tree

Click on Insert > new slide to choose your cover, then delete this page.

An ordered tree is where all elements are stored according to some predetermined criterion of ordering

Figure 6-3 Transforming (a) a linked list into (b) a tree

Click on Insert > new slide to choose your cover, then delete this page.

Binary Trees

Binary Tree is a special data structure used for data storage purposes.

A binary tree has a special condition that each node can have a maximum of two children (possibly empty), and each child is designated as either a left child or a right child

Figure 6-4 Examples of binary trees

Click on Insert > new slide to choose your cover, then delete this page.

In a complete binary tree, all nonterminal nodes have both their children, and all leaves are at the same level

Max no. of nodes at level i = 2i

There are at most 2i nodes at level i

Max no. of nodes at a tree with height h

= 20 + 21+ …. + 2h

= 2h+1 -1

20 nodes

21 nodes

22 nodes

23 nodes

Click on Insert > new slide to choose your cover, then delete this page.

In a perfect binary tree, the maximum number of

nodes in a binary tree with height h

What is the max no. of nodes of 4 levels?

What will be height of a perfect binary tree with

N nodes?

= 2h+1 -1

= 23+1 -1 = 15

n = 2h+1 -1

2h+1 = n+1

h + 1 = log2 (n+1)

h = log2 (n+1) – 1

= log2 16 – 1

= 4 – 1 = 3

Height of complete binary tree

log2 (n) = 3.90689 = 3

Minimum height of a tree with n nodes

log2 (n)

O (h) = log2 (n)

Maximum height

n-1

= o(n)

Time taken for an operation is proportional to height of the tree or (n)

Worst case

Click on Insert > new slide to choose your cover, then delete this page.

The height of a tree determines the worst case complexity of all the usual operations.

Click on Insert > new slide to choose your cover, then delete this page.

Implementing Binary Trees

Binary trees can be implemented in at least two ways:

As arrays

As linked structures

To implement a tree as an array, a node is declared as an object with an information field and two “reference/pointers” fields

These reference fields contain the index of the array cells in which left and right children are stored, if there are any.

Figure 6-7 Array representation of the tree in Figure 6.6c

Figure 6-6 Examples of binary search trees

Click on Insert > new slide to choose your cover, then delete this page.

Implementation

• BinaryTree() creates a new instance of a binary tree.

• get_left_child() returns the binary tree corresponding to the left child of the current node.

• get_right_child() returns the binary tree corresponding to the right child of the current node.

• set_root_val(val) stores the object in parameter val in the current node.

• get_root_val() returns the object stored in the current node.

• insert_left(val) creates a new binary tree and installs it as the left child of the current node.

• insert_right(val) creates a new binary tree and installs it as the right child of the current node.

Click on Insert > new slide to choose your cover, then delete this page.

Complexity

Click on Insert > new slide to choose your cover, then delete this page.

List of Lists Representation

In a list of lists tree, we will store the value of the root node as the first element of the list.

The second element of the list will itself be a list that represents the left subtree

The third element of the list will be another list that represents the right subtree.

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Binary Tree using nodes

The tree itself only needs a root node, kind of just like a linked list only needs a head. A linked list can also have a tail, but for a binary tree, we only want the root node.

 Each node in the tree needs:

A value

A parent (optional)

A left child

A right child

Click on Insert > new slide to choose your cover, then delete this page.

Example tree structure only, not representative of code sample

Click on Insert > new slide to choose your cover, then delete this page.

Complexity

Click on Insert > new slide to choose your cover, then delete this page.

What is the minimum number of external nodes for a proper binary tree with height h ? Justify your answer.

What is the maximum number of external nodes for a proper binary tree with height h ? Justify your answer.

Let T be a proper binary tree with height h and n nodes. Show that log ( n + 1 ) − 1 ≤ h ≤ ( n − 1 ) / 2 . d. For which values of n and h can the above lower and upper bounds on h be attained with equality?

Click on Insert > new slide to choose your cover, then delete this page.

Binary Search Tree (BST)

Binary Search Tree is a binary tree in which every node contains only smaller values in its left subtree and only larger values in its right subtree.

In this type of tree, the left subtree of every node contains nodes with smaller values and right subtree of every node contains larger values.

Click on Insert > new slide to choose your cover, then delete this page.

Search Operation in BST

The search operation is performed with O(log n) time complexity.

Click on Insert > new slide to choose your cover, then delete this page.

Searching in the BST: example

20

28

10

15

25

23

27

30

26

20

25

10

15

25

23

27

30

26

success

failure

32

Click on Insert > new slide to choose your cover, then delete this page.

Insert node in BST

33

Click on Insert > new slide to choose your cover, then delete this page.

Section 6.5 and Figure 6.23.

20

Insert node in BST: example

null

20

30

20

15

10

30

25

20

10

30

25

20

25

30

34

the insertion operation is performed with O(log n) time complexity

Click on Insert > new slide to choose your cover, then delete this page.

20

26

10

15

25

23

27

20

23

10

30

15

25

27

20

27

10

30

25

30

15

HIT220 Week 6 - Trees Part 1

35

Click on Insert > new slide to choose your cover, then delete this page.

BST Node Deletion

Deleting a node from Binary search tree has following three cases...

Case 1: Deleting a Leaf node (A node with no children)

Case 2: Deleting a node with one child

Case 3: Deleting a node with two children

Case 1: Deleting a leaf node

Step 1: Find the node to be deleted using search operation

Step 2: Delete the node using del function (If it is a leaf) and terminate the function.

36

the deletion operation is performed with O(log n) time complexity.

Figure 6-26 Deleting a leaf

Click on Insert > new slide to choose your cover, then delete this page.

Case 2: Deleting a node with one child

Step 1: Find the node to be deleted using search operation

Step 2: If it has only one child, then create a link between its parent and child nodes.

Step 3: Delete the node using del function and terminate the function.

Figure 6-27 Deleting a node with one child

Click on Insert > new slide to choose your cover, then delete this page.

Click on Insert > new slide to choose your cover, then delete this page.

Making one tree out of the two subtrees of the node and then attaching it to the node’s parent is called deleting by merging

Deleting a node with two children:

Step 1: Find the node to be deleted using search operation

Step 2: If it has two children, then find the largest node in its left subtree (OR) the smallest node in its right subtree.

Step 3: Swap both deleting node and node which found in above step.

Step 4: Then, check whether deleting node came to case 1 or case 2 else goto steps 2

Step 5: If it comes to case 1, then delete using case 1 logic.

Step 6: If it comes to case 2, then delete using case 2 logic.

Step 7: Repeat the same process until node is deleted from the tree.

Click on Insert > new slide to choose your cover, then delete this page.

20

10

15

25

23

27

30

26

35

27

20

10

15

25

23

26

27

35

15

10

25

23

26

27

35

15

26

27

delete 30

delete 20

HIT220 Week 6 - Trees Part 1

40

Click on Insert > new slide to choose your cover, then delete this page.

Traversals

A traversal is a process that visits all the nodes in the tree. Since a tree is a nonlinear data structure, there is no unique traversal. We will consider several traversal algorithms which we group in the following two kinds

depth-first traversal

breadth-first traversal

Click on Insert > new slide to choose your cover, then delete this page.

depth-first traversals (DFS)

There are three different types of depth-first traversals, :

PreOrder traversal - visit the parent first and then left and right children;

InOrder traversal - visit the left child, then the parent and the right child;

PostOrder traversal - visit left child, then the right child and then the parent;

Click on Insert > new slide to choose your cover, then delete this page.

breadth-first traversal (BST)

This traversal visits nodes by levels from top to bottom and from left to right.

Click on Insert > new slide to choose your cover, then delete this page.

Dictionary

Another format for data is the dictionary

It is unordered, changeable and indexed

used often in adjacency maps – see code week I graph representation for searching

 Dictionary is written by placing sequence of elements within curly {} braces, separated by ‘comma’.

Dictionary holds a pair of values, one being the Key and the other corresponding pair element being its Key:value.

Values in a dictionary can be of any datatype and can be duplicated

Keys can’t be repeated and must be immutable.

Click on Insert > new slide to choose your cover, then delete this page.

Tries

efficient information reTrieval data structure,

search complexity with Trie is linear in terms of word (or key) length to be searched = O(kl) where k is maximum key length

binary search tree, a well balanced BST will need time proportional to M * log N, where M is maximum string length and N is number of keys in tree.

Click on Insert > new slide to choose your cover, then delete this page.

Inserting a Trie

Create node

Create 26 sub nodes (linked to null)

Add node to parent at relevant key

O(ALPHABET_SIZE * M * N)

Click on Insert > new slide to choose your cover, then delete this page.

Hashing also provides word search in O(n) time on average.

Trie have no collisions (like hashing) so worst case time complexity is O(n).

The most important thing is Prefix Search. With Trie

We can find all words beginning with a prefix but not with hashing

Tries require a lot of extra space. Tries are also known as radix tree or prefix tree

Click on Insert > new slide to choose your cover, then delete this page.

Hash Tables

48

0

1

2

3

4

451-229-0004

981-101-0002

025-612-0001

Click on Insert > new slide to choose your cover, then delete this page.

Dictionaries

9/6/21 2:32 PM

48

Maps

Map is a collection of elements where each element is stored as a Key, value pair. 

Map object can hold both objects and primitive values as either key or value.

When we iterate over the map object it returns the key,value pair in the same order as inserted cf array

Click on Insert > new slide to choose your cover, then delete this page.

50

Notion of a Map

Intuitively, a map M supports the abstraction of using keys as indices with a syntax such as M[k].

As a mental warm-up, consider a restricted setting in which a map with n items uses keys that are known to be integers in a range from 0 to N − 1, for some N ≥ n.

Click on Insert > new slide to choose your cover, then delete this page.

Sorted maps

You can insert map data in a naturally sorted order or sorted by a comparator

Implemented by TreeMaps

Click on Insert > new slide to choose your cover, then delete this page.

TreeMap, HashMap and LinkedHashMap

All offer a key->value map and a way to iterate through the keys

HashMap offers 0(1) lookup and insertion. The ordering of the keys is arbitrary. Implemented by an array of linked lists.

LinkedHashMap offers 0(1) lookup and insertion. Keys are ordered by their insertion order. Implemented by doubly-linked buckets.

TreeMap offers O(log N) lookup and insertion. Keys are ordered, so if you need to iterate through the keys in sorted order.

Click on Insert > new slide to choose your cover, then delete this page.

Hashing General Kinds of Keys

But what should we do if our keys are not integers in the range from 0 to N – 1?

Use a hash function to map general keys to corresponding indices in a table.

For instance, the last four digits of a Social Security number.

53

0

1

2

3

4

451-229-0004

981-101-0002

025-612-0001

Click on Insert > new slide to choose your cover, then delete this page.

54

Hash Functions and Hash Tables (hash-based map)

A hash function h maps keys of a given type to integers in a fixed interval [0, N - 1]

Example: h(x) = x mod N is a hash function for integer keys

The integer h(x) is called the hash value of key x

A hash table for a given key type consists of

Hash function h

Array (called table) of size N

When implementing a map with a hash table, the goal is to store item (k, o) at index i = h(k)

Click on Insert > new slide to choose your cover, then delete this page.

55

SSN Example

We design a hash table for a map storing entries as (SSN, Name), where SSN (social security number) is a nine-digit positive integer

Our hash table uses an array of size N = 10,000 and the hash function h(x) = last four digits of x

0

1

2

3

4

9997

9998

9999

451-229-0004

981-101-0002

200-751-9998

025-612-0001

Click on Insert > new slide to choose your cover, then delete this page.

56

Hash Functions

A hash function is usually specified as the composition of two functions:

Hash code: h1: keys  integers

Compression function: h2: integers  [0, N - 1]

The hash code is applied first, and the compression function is applied next on the result, i.e., h(x) = h2(h1(x))

The goal of the hash function is to “disperse” the keys in an apparently random way

Click on Insert > new slide to choose your cover, then delete this page.

57

Hash Codes

Memory address:

We reinterpret the memory address of the key object as an integer. Good in general, except for numeric and string keys

Integer cast:

We reinterpret the bits of the key as an integer

Suitable for keys of length less than or equal to the number of bits of the integer

Component sum:

We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows)

Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type

Click on Insert > new slide to choose your cover, then delete this page.

58

Hash Codes (cont.)

Polynomial accumulation:

We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits) a0 a1 … an-1

We evaluate the polynomial

p(z) = a0 + a1 z + a2 z2 + … … + an-1zn-1

at a fixed value z, ignoring overflows

Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)

Polynomial p(z) can be evaluated in O(n) time using Horner’s rule:

The following polynomials are successively computed, each from the previous one in O(1) time

p0(z) = an-1

pi (z) = an-i-1 + zpi-1(z) (i = 1, 2, …, n -1)

We have p(z) = pn-1(z)

Click on Insert > new slide to choose your cover, then delete this page.

59

Compression Functions

Division:

h2 (y) = y mod N

The size N of the hash table is usually chosen to be a prime

The reason has to do with number theory

Multiply, Add and Divide (MAD):

h2 (y) = (ay + b) mod N

a and b are nonnegative integers such that a mod N  0

Otherwise, every integer would map to the same value b

Click on Insert > new slide to choose your cover, then delete this page.

Abstract Hash Table Class

60

Click on Insert > new slide to choose your cover, then delete this page.

61

Collision Handling

Collisions occur when different elements are mapped to the same cell

Separate Chaining: let each cell in the table point to a linked list of entries that map there

Separate chaining is simple, but requires additional memory outside the table

0

1

2

3

4

451-229-0004

981-101-0004

025-612-0001

Click on Insert > new slide to choose your cover, then delete this page.

62

Map with Separate Chaining

Delegate operations to a list-based map at each cell:

Algorithm get(k):

return A[h(k)].get(k)

Algorithm put(k,v):

t = A[h(k)].put(k,v)

if t = null then {k is a new key}

n = n + 1

return t

Algorithm remove(k):

t = A[h(k)].remove(k)

if t ≠ null then {k was found}

n = n - 1

return t

Click on Insert > new slide to choose your cover, then delete this page.

Hash Table with Separate Chaining

Hash Tables

63

Click on Insert > new slide to choose your cover, then delete this page.

64

Linear Probing

Open addressing: the colliding item is placed in a different cell of the table

Linear probing: handles collisions by placing the colliding item in the next (circularly) available table cell

Each table cell inspected is referred to as a “probe”

Colliding items lump together, causing future collisions to cause a longer sequence of probes

Example:

h(x) = x mod 13

Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

0

1

2

3

4

5

6

7

8

9

10

11

12

41

18

44

59

32

22

31

73

0

1

2

3

4

5

6

7

8

9

10

11

12

Click on Insert > new slide to choose your cover, then delete this page.

65

Search with Linear Probing

Consider a hash table A that uses linear probing

get(k)

We start at cell h(k)

We probe consecutive locations until one of the following occurs

An item with key k is found, or

An empty cell is found, or

N cells have been unsuccessfully probed

Algorithm get(k)

i  h(k)

p  0

repeat

c  A[i]

if c = 

return null

else if c.getKey () = k

return c.getValue()

else

i  (i + 1) mod N

p  p + 1

until p = N

return null

Click on Insert > new slide to choose your cover, then delete this page.

66

Updates with Linear Probing

To handle insertions and deletions, we introduce a special object, called AVAILABLE, which replaces deleted elements

remove(k)

We search for an entry with key k

If such an entry (k, o) is found, we replace it with the special item AVAILABLE and we return element o

Else, we return null

put(k, o)

We throw an exception if the table is full

We start at cell h(k)

We probe consecutive cells until one of the following occurs

A cell i is found that is either empty or stores AVAILABLE, or

N cells have been unsuccessfully probed

We store (k, o) in cell i

Click on Insert > new slide to choose your cover, then delete this page.

Hash Table with Linear Probing

67

Click on Insert > new slide to choose your cover, then delete this page.

68

Double Hashing

Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series (i + jd(k)) mod N for j = 0, 1, … , N - 1

The secondary hash function d(k) cannot have zero values

The table size N must be a prime to allow probing of all the cells

Common choice of compression function for the secondary hash function:

d2(k) = q - k mod q

where

q < N

q is a prime

The possible values for d2(k) are 1, 2, … , q

Click on Insert > new slide to choose your cover, then delete this page.

69

Consider a hash table storing integer keys that handles collision with double hashing

N = 13

h(k) = k mod 13

d(k) = 7 - k mod 7

Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

Example of Double Hashing

0

1

2

3

4

5

6

7

8

9

10

11

12

31

41

18

32

59

73

22

44

0

1

2

3

4

5

6

7

8

9

10

11

12

Click on Insert > new slide to choose your cover, then delete this page.

70

Performance of Hashing

In the worst case, searches, insertions and removals on a hash table take O(n) time

The worst case occurs when all the keys inserted into the map collide

The load factor a = n/N affects the performance of a hash table

Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1 / (1 - a)

The expected running time of all the dictionary ADT operations in a hash table is O(1)

In practice, hashing is very fast provided the load factor is not close to 100%

Applications of hash tables:

small databases

compilers

browser caches

Click on Insert > new slide to choose your cover, then delete this page.

Distributed Paradigm

Click on Insert > new slide to choose your cover, then delete this page.

Distributed Database

Click on Insert > new slide to choose your cover, then delete this page.

kh

(

k

)

d

(

k

)

Probes

18535

41212

22969

4455510

59747

32636

3154590

73848

Sheet1

0 1 2 3 4
k h(k) d(k) Probes
18 5 3 5
41 2 1 2
22 9 6 9
44 5 5 5 10
59 7 4 7
32 6 3 6
31 5 4 5 9 0
73 8 4 8

__MACOSX/Assignment 2_Resources/Week 5/._Binary trees.pptx

Assignment 2_Resources/Week 3 /Week3-linkedlists.pptx

HIT220 Algorithms and Complexity

Week 1: Overview and Introduction to Algorithms

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

1

Linked Lists

Linked Lists

2

Click on Insert > new slide to choose your cover, then delete this page.

Sequences

8/31/21 10:24 AM

2

Linked Lists

3

Singly Linked List

A singly linked list is a concrete data structure consisting of a sequence of nodes, starting from a head pointer

Each node stores

element

link to the next node

next

elem

node

A

B

C

D

head

Click on Insert > new slide to choose your cover, then delete this page.

Linked Lists

4

Inserting at the Head

Allocate a new node

Insert new element

Have new node point to old head

Update head to point to new node

Click on Insert > new slide to choose your cover, then delete this page.

Removing at the Head

Linked Lists

5

Update head to point to next node in the list

Allow garbage collector to reclaim the former first node

Click on Insert > new slide to choose your cover, then delete this page.

Inserting at the Tail

Linked Lists

6

Allocate a new node

Insert new element

Have new node point to null

Have old last node point to new node

Update tail to point to new node

Click on Insert > new slide to choose your cover, then delete this page.

The Node Class for List Nodes

public class Node {

// Instance variables:

private Object element;

private Node next;

/** Creates a node with null references to its element and next node. */

public Node() {

this(null, null);

}

/** Creates a node with the given element and next node. */

public Node(Object e, Node n) {

element = e;

next = n;

}

// Accessor methods:

public Object getElement() {

return element;

}

public Node getNext() {

return next;

}

// Modifier methods:

public void setElement(

Object newElem) {

element = newElem;

}

public void setNext(

Node newNext) {

next = newNext;

}

}

© 2013 Goodrich, Tamassia, Goldwasser

Click on Insert > new slide to choose your cover, then delete this page.

Retrieval

//deletion

Public retrieve (Object e)

//delete node with data e

node=head;

while(node is not None):

    if node.element ==e:

return node; prev=node;

node=node.next

return Null;

© 2013 Goodrich, Tamassia, Goldwasser

Click on Insert > new slide to choose your cover, then delete this page.

Insertion/Deletion in List

// insertion methods

public insert(Object e, Node n) {

//insert e after node n

newNode=node(e, n.next)

n.next=newNode;

}

//deletion

Public delete (Object e)

//delete node with data e

if retrieve (e):

node=retrieve(e)

If node!=Null:

prev.next=node.next;

    //garbage collection

del node;

return

else:

© 2013 Goodrich, Tamassia, Goldwasser

Click on Insert > new slide to choose your cover, then delete this page.

Linked Lists

10

Removing at the Tail

Removing at the tail of a singly linked list is not efficient!

There is no constant-time way to update the tail to point to the previous node

© 2013 Goodrich, Tamassia, Goldwasser

Click on Insert > new slide to choose your cover, then delete this page.

Linked Lists

11

Stack as a Linked List

We can implement a stack with a singly linked list

The top element is stored at the first node of the list

The space used is O(n) and each operation of the Stack ADT takes O(1) time

t

nodes

elements

Click on Insert > new slide to choose your cover, then delete this page.

Linked-List Stack in Python

Linked Lists

12

Click on Insert > new slide to choose your cover, then delete this page.

Linked Lists

13

Queue as a Linked List

We can implement a queue with a singly linked list

The front element is stored at the first node

The rear element is stored at the last node

The space used is O(n) and each operation of the Queue ADT takes O(1) time

f

r

nodes

elements

Click on Insert > new slide to choose your cover, then delete this page.

Linked-List Queue in Python

Linked Lists

14

Click on Insert > new slide to choose your cover, then delete this page.

Double linked lists

Linked Lists

15

Click on Insert > new slide to choose your cover, then delete this page.

Linked list and arrays

https://www.geeksforgeeks.org/linked-list-vs-array/

Linked Lists

16

Click on Insert > new slide to choose your cover, then delete this page.

Inserting and deleting

O(1)

Click on Insert > new slide to choose your cover, then delete this page.

Inserting and deleting

In a list A = [a0,a1,a2, 03,…,an]

O(n)

Arrays

18

Click on Insert > new slide to choose your cover, then delete this page.

Search

For a linked list

Step through all nodes O(n)

Click on Insert > new slide to choose your cover, then delete this page.

Search an element

In a list A = [a0,a1,a2, 03,…,an]

#search for value

for i =1 to len(array):

if a[i]=value:

O(n)

Arrays

20

Click on Insert > new slide to choose your cover, then delete this page.

Swap elements

In a list A = [a0,a1,a2, 03,…,an]

temp=ak

ak=ak+m

ak+m=temp

In python

x, y = y, x

Linked Lists

21

Click on Insert > new slide to choose your cover, then delete this page.

Length

this.node=head

len =0

While next!=null:

next=this.node->next;

this.node=next;

len=len+1;

Linked Lists

22

Click on Insert > new slide to choose your cover, then delete this page.

Trees as lists

https://www.geeksforgeeks.org/how-coronavirus-outbreak-can-end-visualize-using-data-structures/?ref=leftbar-rightbar

Trees

23

Click on Insert > new slide to choose your cover, then delete this page.

Getting to List stage

Click on Insert > new slide to choose your cover, then delete this page.

Thank you [email protected]

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 3 /._Week3-linkedlists.pptx

Assignment 2_Resources/Week 3 /.DS_Store

__MACOSX/Assignment 2_Resources/Week 3 /._.DS_Store

Assignment 2_Resources/Week 3 /Week3 part 2.pptx

Week 3: Recursion

Dr Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

1

What Is Recursion?

Recursion is a method of solving problems that involves breaking a problem down into smaller and smaller subproblems until you get to a small enough problem that it can be solved trivially.

Recursion involves a function calling itself.

While it may not seem like much on the surface, recursion allows us to write elegant solutions to problems that may otherwise be very difficult to program.

Click on Insert > new slide to choose your cover, then delete this page.

Calculating the Sum of a List of Numbers

We will begin our investigation with a simple problem that you already know how to solve without using recursion.

Suppose that you want to calculate the sum of a list of numbers such as: [1,3,5,7,9].

An iterative function that computes the sum is shown here.

The function uses an accumulator variable (total) to compute a running total of all the numbers in the list by starting with 0 and adding each number in the list.

Click on Insert > new slide to choose your cover, then delete this page.

Premise of Recursion

Pretend that you do not have while loops or for loops. How would you compute the sum of a list of numbers?

If you were a mathematician you might start by recalling that addition is a function that is defined for two parameters, a pair of numbers. To redefine the problem from adding a list to adding pairs of numbers, we could rewrite the list as a fully parenthesized expression. Such an expression looks like this:

((((1+3)+5)+7)+9)

Click on Insert > new slide to choose your cover, then delete this page.

Premise of Recursion

We can also parenthesize the expression the other way around: (1+(3+(5+(7+9))))

Notice that the innermost set of parentheses, (7+9), is a problem that we can solve without a loop or any special constructs. In fact, we can use the following sequence of simplifications to compute a final sum.

total= (1+(3+(5+(7+9))))

total= (1+(3+(5+16)))

total= (1+(3+21))

total= (1+24)

total= 25

Click on Insert > new slide to choose your cover, then delete this page.

A Different Approach

How can we take this idea and turn it into a program, like Python?

First, let’s restate the sum problem in terms of Python lists.

We might say the sum of the list numList is the sum of the first element of the list (numList[0]), and the sum of the numbers in the rest of the list (numList[1:]).

listSum(numList)=first(numList) + listSum(rest(numList))

first(numList) returns the first element of the list

rest(numList) returns a list of everything but the first element

Click on Insert > new slide to choose your cover, then delete this page.

Example Recursive Program

Tracing: http://interactivepython.org/courselib/static/pythonds/Recursion/pythondsCalculatingtheSumofaListofNumbers.html

Click on Insert > new slide to choose your cover, then delete this page.

Program Execution Logic

Click on Insert > new slide to choose your cover, then delete this page.

The Three Rules of Recursion

All recursive algorithms must obey three important rules

A recursive algorithm must have a base case.

A recursive algorithm must change its state and move toward the base case.

A recursive algorithm must call itself, recursively.

Click on Insert > new slide to choose your cover, then delete this page.

The Rules!

First, a base case is the condition that allows the algorithm to stop recursing. A base case is typically a problem that is small enough to solve directly.

In the list_sum algorithm the base case is a list of length 1.

Second, we must arrange for a change of state that moves the algorithm toward the base case. A change of state means that some data that the algorithm is using is modified. Usually the data that represents our problem gets smaller in some way.

In the list_sum algorithm our primary data structure is a list, so we must focus our state-changing efforts on the list. Since the base case is a list of length 1, a natural progression toward the base case is to shorten the list.

Third, the algorithm must call itself.

Click on Insert > new slide to choose your cover, then delete this page.

Self Test

How many recursive calls are made when computing the sum of the list [2,4,6,8,10]?

(A) 6

(B) 5

(C) 4

(D) 3

The first recursive call passes the list [4,6,8,10], the second [6,8,10] and so on until [10].

Click on Insert > new slide to choose your cover, then delete this page.

Factorial

Suppose you are going to write a recursive function to calculate the factorial of a number.

fact(n) returns n * n-1 * n-2 * ... Where the factorial of zero is defined to be 1. What would be the most appropriate base case?

(A) n == 0

(B) n == 1

(C) n >= 0

(D) n <= 1

  n <= 1 this is the most efficient, and even keeps your program from crashing if you try to compute the factorial of a negative number.

Click on Insert > new slide to choose your cover, then delete this page.

12

Recursive Factorial Example

Here is a recursive Python function to compute the factorial of n.

The first line of the function constitutes the base case.

If the value <= 1, then answer is always 1.

If the value > 1, then the function calls itself to find the answer for (n − 1)

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci Sequence

Each new term in the Fibonacci sequence is generated by adding the previous two terms. So, for example, starting with 1 and 2, the first 9 numbers in the sequence would be:

0, 1, 1, 2, 3, 5, 8, 13, 21

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci Sequence

We can write a recursive program to get the nth number from the Fibonacci Sequence.

Click on Insert > new slide to choose your cover, then delete this page.

Examine Recursive Calls to Function

We can see that the subtree f(2) appears 3 times and the subtree for the calculation of f(3) two times.

If you imagine extending this tree for f(6), you will understand that f(4) will be called two times, f(3) three times and so on. This means, our recursion doesn't remember previously calculated values. 

Click on Insert > new slide to choose your cover, then delete this page.

Big O

The Fibonacci sequence is O(2n).

Remember the definition of Big-O: f(n) is O(g(n)) iff there exist constants C and k (positive integers) such that f(x) <= C*|g(x)| for all x >= k.

So we know that Fibonacci is f(n) = f(n-1) + f(n-2), f(0) = 0, f(1) = 1.

Let g(n) = 2^n.

f(n) <= g(n) starting at some x. Then solve for C.

Click on Insert > new slide to choose your cover, then delete this page.

Memoization of Calculated Values

We can implement a "memory" for our recursive version by using a dictionary to save the previously calculated values.

What is the Big-O complexity now?

O(n)

Click on Insert > new slide to choose your cover, then delete this page.

18

Dynamic programming

Divide a problem into sub problems

Cache the results of these sub problems if used multiple times or called in sequence

eg fib(10)=fib(9)+fib(8)

Click on Insert > new slide to choose your cover, then delete this page.

Linked Lists: Recursion

How do we count the number of nodes in a linked list recursively?

How do we print out the nodes in a linked list recursively?

How do we find out a node in a linked list recursively?

How do we insert a node in a linked list recursively?

How do we delete a node in a linked list recursively?

Click on Insert > new slide to choose your cover, then delete this page.

Tail Recursion

Tail recursion is characterized by the use of only one recursive call at the very end of a method implementation

def tail(value):

if value > 0:

print(value)

tail(value - 1)

This form saves Python interpreter keeping track of the state of each nested call (see Hanoi Tower)

Click on Insert > new slide to choose your cover, then delete this page.

Space Efficiency

Example of using tail recursion:

def fact(n):

    if (n == 0):

        return 1

    return n * fact(n-1)

Is this the same as:

def fact(n, a = 1):

    if (n == 1):

        return a

    return fact(n - 1, n * a)

Click on Insert > new slide to choose your cover, then delete this page.

Backtracking

Backtracking is a technique for returning to a given position (e.g., entry point) after trying other avenues that are unsuccessful in solving a particular problem

Backtracking is a form of recursion. But it involves choosing only one option out of any possibilities. We begin by choosing an option and backtrack from it, if we reach a state where we conclude that this specific option does not give the required solution.

We repeat these steps by going across each available option until we get the desired solution.

Click on Insert > new slide to choose your cover, then delete this page.

Example:

Finding all possible order of arrangements of a given set of letters. When we choose a pair we apply backtracking to verify if that exact pair has already been created or not. If not already created, the pair is added to the answer list else it is ignored.

Click on Insert > new slide to choose your cover, then delete this page.

24

Checking duplicates

Click on Insert > new slide to choose your cover, then delete this page.

Towers of Hanoi

The Tower of Hanoi puzzle was invented by the French mathematician Edouard Lucas in 1883

Tower of Hanoi, is a mathematical puzzle which consists of three towers (pegs) and more than one rings

These rings are of different sizes and stacked upon in an ascending order, i.e. the smaller one sits over the larger one. There are other variations of the puzzle where the number of disks increase, but the tower count remains the same.

Rules

The mission is to move all the disks to some another tower without violating the sequence of arrangement.

Only one disk can be moved among the towers at any given time.

Only the "top" disk can be removed.

No large disk can sit over a small disk.

Click on Insert > new slide to choose your cover, then delete this page.

Tower of Hanoi puzzle with n disks can be solved in minimum 2n−1 steps. This presentation shows that a puzzle with 3 disks has taken 23 - 1 = 7 steps.

https://www.mathsisfun.com/games/towerofhanoi.html

Click on Insert > new slide to choose your cover, then delete this page.

The Math Behind The Towers of Hanoi

To move all discs from one pin to another takes 2D-1 moves where "D" is the number of discs. Therefore the number of discs moves is approximately doubled every time you put another one on it.

Click on Insert > new slide to choose your cover, then delete this page.

Algorithm

First we need to learn how to solve this problem with lesser amount of disks, say → 1 or 2. We mark three towers with name, source, destination and aux (only to help moving the disks). If we have only one disk, then it can easily be moved from source to destination peg.

If we have 2 disks −

First, we move the smaller (top) disk to aux peg.

Then, we move the larger (bottom) disk to destination peg.

And finally, we move the smaller disk from aux to destination peg.

Click on Insert > new slide to choose your cover, then delete this page.

Algorithm cont.

More than two disks. We divide the stack of disks in two parts. The largest disk (nth disk) is in one part and all other (n-1) disks are in the second part.

Our ultimate aim is to move disk n from source to destination and then put all other (n1) disks onto it. We can imagine to apply the same in a recursive way for all given set of disks.

The steps to follow are −

Step 1 − Move n-1 disks from source to aux

Step 2 − Move remaining disk nth from source to dest

Step 3 − Move n-1 disks from aux to dest

Click on Insert > new slide to choose your cover, then delete this page.

30

Pseudo Code

START

Procedure Hanoi(disk, source, dest, helper)

IF disk == 1, THEN

move disk from source to dest

ELSE

Hanoi(disk - 1, source, helper, dest) // Step 1

move disk from source to dest // Step 2

Hanoi(disk - 1, helper, dest, source) // Step 3

END IF

END Procedure

STOP

Click on Insert > new slide to choose your cover, then delete this page.

31

Tower of Hanoi Example

Click on Insert > new slide to choose your cover, then delete this page.

32

See stack trace

https://www.cs.cmu.edu/~cburch/survey/recurse/hanoiimpl.html

What is the space/time complexity?

Click on Insert > new slide to choose your cover, then delete this page.

Time

Let the time required for n disks is T(n) .

There are 2 recursive call to. move n-1 disks and one constant time operation to move nth disk from ‘from’ source peg to ‘to’ target peg .

Let this constant be k1.

Therefore,

T(n) = 2 T(n-1) + k1

T(0) = k2 , a constant.

T(1) = 2 k2 + k1

T(2) = 4 k2 + 2k1 + k1

T(2) = 8 k2 + 4k1 + 2k1 + k1

Click on Insert > new slide to choose your cover, then delete this page.

Two variables

Coefficient of k1 =2n

Coefficient of k2 =2n -1

Time complexity is O(2n ) or O(an ) where a is a constant greater than 1.

So it has exponential time complexity.

For single increase in problem size the time required is double the previous one. This is computationally very expensive.

Most of the recursive programs takes exponential time

Click on Insert > new slide to choose your cover, then delete this page.

Space Complexity

Space for parameter for each call is independent of n, so a constant.

Let it be k.

When we do the 2nd recursive call 1st recursive call is over.

So, we can reuse the space of 1st call for 2nd call.

T(n) = T(n-1) + k

T(0) = k

T(1) = 2k

T(2) = 3k

T(3) = 4k O(n)

Click on Insert > new slide to choose your cover, then delete this page.

Recursion types

Liner, Binary and Multiple Recursion – defined by the number of recursion in one call to the function

Binary is a two-branching tree

Developing a recursive function requires removing “odd” cases and finding a similar structure through each enumeration of the function

Iterative sequences are ideal for this where xn+1 = F(xn) eg n!

Click on Insert > new slide to choose your cover, then delete this page.

Desiging a recursive funciton

Divide into smaller sub problem

Specify the base condition

Click on Insert > new slide to choose your cover, then delete this page.

Eliminating Tail Recursion space

def print(n):

    if (n < 0):

       return

    print(" ", n)

    # The last executed statement is recursive call

    print(n - 1)

Click on Insert > new slide to choose your cover, then delete this page.

Removing stack call

start:

    if (n < 0)

       return;

    print(no)

 

    // Update parameters of recursive call

    // and replace recursive call with goto

    n = n-1

    goto start;

Click on Insert > new slide to choose your cover, then delete this page.

Thank you [email protected]

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 3 /._Week3 part 2.pptx

Assignment 2_Resources/Week 6/.DS_Store

__MACOSX/Assignment 2_Resources/Week 6/._.DS_Store

Assignment 2_Resources/Week 6/trees.pptx

Week 7: Improving Trees

[email protected]

Click on Insert > new slide to choose your cover, then delete this page.

1

Trees

Random Trees

Binary trees

Generating Unordered Binary Trees is necessary in both the graph theory and in different applications. For example, a list of all trees with a given number of internal nodes can be used in computer science to test or analyse an algorithm for its correctness or computational complexity

Binary Search Trees

Balanced Binary Search tree

AVL Trees

Priority Trees

Heap Structure

Click on Insert > new slide to choose your cover, then delete this page.

Nomenclature

Proper tree

Complete tree

Balanced tree

Binary/Ternary tree

Internal nodes

External nodes

Leaf/node/root/arcs

Click on Insert > new slide to choose your cover, then delete this page.

Binary Search Tree (BST)

Binary Search Tree is a binary tree in which every node contains only smaller values in its left subtree and only larger values in its right subtree.

In this tree, left subtree of every node contains nodes with smaller values and right subtree of every node contains larger values.

Click on Insert > new slide to choose your cover, then delete this page.

Algorithm Average
Search O(log n)
Insert O(log n)
Delete O(log n)

All BST operations are O(d), where d is tree depth

minimum d is O(log n) for a binary tree with N nodes

What is the best case tree?

What is the worst case tree?

Click on Insert > new slide to choose your cover, then delete this page.

Worst case
Search O(n)
Insert O(n)
Delete O(n)

Problem: Lack of “balance”:

compare depths of left and right subtree

Unbalanced degenerate tree

A skewed binary search tree would give poor performance

Click on Insert > new slide to choose your cover, then delete this page.

Balanced and unbalanced BST

4

2

5

1

3

1

5

2

4

3

7

6

4

2

6

5

7

1

3

Is this “balanced”?

As long as the tree maintains the balanced property, if the tree contains n nodes, then it has a depth of at most log2n. (even in the worst case!!)

Click on Insert > new slide to choose your cover, then delete this page.

7

Balanced and unbalanced BST

4

2

5

1

3

1

5

2

4

3

7

6

4

2

6

5

7

1

3

Is this “balanced”?

Has a depth of at most log2n.

7 Nodes  log27 = 2.80735…

Depth = 2  Balanced

Click on Insert > new slide to choose your cover, then delete this page.

8

Balancing Binary Search Trees

Many algorithms exist for keeping binary search trees balanced

Adelson-Velskii and Landis (AVL) trees (height-balanced trees)

Splay trees and other self-adjusting trees

B-trees and other multiway search trees

Click on Insert > new slide to choose your cover, then delete this page.

AVL - Good but not Perfect Balance

AVL trees are height-balanced binary search trees

Balance factor of a node

height(right subtree) - height(left subtree)

An AVL tree has balance factor calculated at every node

For every node, heights of left and right subtree can differ by no more than 1

Store current heights in each node

Click on Insert > new slide to choose your cover, then delete this page.

AVL Trees

An AVL tree is one in which the height of the left and right subtrees of every node differ by at most; one

Difference of the subtrees height is named Balanced Factor. A node with balance factor 1, 0, or -1 is considered balanced.

Click on Insert > new slide to choose your cover, then delete this page.

Which one is AVL tree?

Click on Insert > new slide to choose your cover, then delete this page.

Single Rotation in an AVL Tree

6

4

9

8

1

5

7

6

4

9

8

1

5

7

Click on Insert > new slide to choose your cover, then delete this page.

13

Let the node that needs rebalancing be Z.

There are 4 cases:

Left-Left Case : x is the left child of y and y is the left child of z.

Left-Right Case : x is the right child of y and y is the left child of z.

Right-Left Case : x is the left child of y and y is the right child of z.

Right-Right Case : x is the right child of y and y is the right child of z.

The rebalancing is performed through four separate rotation algorithms.

Insertions in AVL Trees

Click on Insert > new slide to choose your cover, then delete this page.

14

a) Left Left Case

Left-Left Case : x is the left child of y and y is the left child of z.

Click on Insert > new slide to choose your cover, then delete this page.

b) Left Right Case

Left-Right Case : x is the right child of y and y is the left child of z.

Click on Insert > new slide to choose your cover, then delete this page.

c) Right Right Case

Right-Right Case : x is the right child of y and y is the right child of z.

Click on Insert > new slide to choose your cover, then delete this page.

d) Right Left Case

Right-Left Case : x is the left child of y and y is the right child of z.

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Insert 14, 17, 11, 7, 53, 4 into an empty AVL tree

14

17

11

7

53

4

Click on Insert > new slide to choose your cover, then delete this page.

19

AVL Tree Example:

Insert 14, 17, 11, 7, 53, 4, 13 into an empty AVL tree

14

17

7

4

53

11

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now insert 12

14

17

7

4

53

11

13

12

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now insert 12

14

17

7

4

53

11

12

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now the AVL tree is balanced.

14

17

7

4

53

12

13

11

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now insert 8

14

17

7

4

53

12

13

11

8

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now insert 8

14

17

7

4

53

11

12

8

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now the AVL tree is balanced.

14

17

7

4

53

11

12

8

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now remove 53

14

17

7

4

53

11

12

8

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Now remove 53, unbalanced

14

17

7

4

11

12

8

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Balanced! Remove 11

14

17

7

4

11

12

8

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Remove 11, replace it with the largest in its left branch

14

17

7

4

8

12

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Remove 8, unbalanced

14

17

4

7

12

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Remove 8, unbalanced

14

17

4

7

12

13

Click on Insert > new slide to choose your cover, then delete this page.

AVL Tree Example:

Balanced!!

14

17

4

7

12

13

Click on Insert > new slide to choose your cover, then delete this page.

In Class Exercises

Build an AVL tree with the following values:

15, 20, 24, 10, 13, 7, 30, 36, 25

Click on Insert > new slide to choose your cover, then delete this page.

15

15, 20, 24, 10, 13, 7, 30, 36, 25

20

24

15

20

24

10

13

15

20

24

13

10

13

20

24

15

10

Click on Insert > new slide to choose your cover, then delete this page.

35

13

20

24

15

10

15, 20, 24, 10, 13, 7, 30, 36, 25

7

13

20

24

15

10

7

30

36

13

20

30

15

10

7

36

24

Click on Insert > new slide to choose your cover, then delete this page.

13

20

30

15

10

7

36

24

15, 20, 24, 10, 13, 7, 30, 36, 25

25

13

20

30

15

10

7

36

24

25

13

24

36

20

10

7

25

30

15

Click on Insert > new slide to choose your cover, then delete this page.

Remove 24 and 20 from the AVL tree.

13

24

36

20

10

7

25

30

15

13

20

36

15

10

7

25

30

13

15

36

10

7

25

30

13

30

36

10

7

25

15

Click on Insert > new slide to choose your cover, then delete this page.

Example (AVL Tree)

Draw the AVL tree that results from inserting the keys: 4, 10, 2, 12, 8, 9, in that order into an initially empty AVL tree. Drawing final load then adjusting.

4

10

12

8

2

9

Click on Insert > new slide to choose your cover, then delete this page.

4

10

12

8

2

9

4

8

10

12

2

9

RL – longest leg

RR Imbalance at 4

8

9

10

12

4

2

Imbalance at 4

Click on Insert > new slide to choose your cover, then delete this page.

Self-Adjusting Trees

The strategy in self-adjusting trees is to restructure trees by moving up the tree with only those elements that are used more often, that have priority, and creating a priority tree

Click on Insert > new slide to choose your cover, then delete this page.

Priority Trees

Retain key of tree nodes as you move them. This can be a record of their priority

The nodes can be stored not sorted by this key or sorted

This was developed to deal with situations where “Queue jumping” is allowed

Click on Insert > new slide to choose your cover, then delete this page.

Complexity

Click on Insert > new slide to choose your cover, then delete this page.

Exercise

An airport is developing a computer simulation of air-traffic control that handles events such as landings and takeoffs. Each event has a time stamp that denotes the time when the event will occur. The simulation program needs to efficiently perform the following two fundamental operations:

Insert an event with a given time stamp (that is, add a future event).

Extract the event with smallest time stamp (that is, determine the next event to process).

Which data structure should be used for the above operations? Why?

Click on Insert > new slide to choose your cover, then delete this page.

Heaps

General definition

Click on Insert > new slide to choose your cover, then delete this page.

Binary Heaps

A binary heap is a complete binary tree which satisfies the heap ordering property. The ordering can be one of two types:

the min-heap property: the value of each child node is greater than or equal to the value of its parent, with the minimum-value element at the root.

the max-heap property: the value of each child node is less than or equal to the value of its parent, with the maximum-value element at the root.

Click on Insert > new slide to choose your cover, then delete this page.

Binary Heaps

In a heap the highest (or lowest) priority element is always stored at the root, hence the name "heap". A heap is not a sorted structure and can be regarded as partially ordered.

There is no particular relationship among nodes on any given level, even among the siblings.

Since a heap is a complete binary tree, it has a smallest possible height - a heap with N nodes always has O(log2 N) or often written just as O(log N) height.

Click on Insert > new slide to choose your cover, then delete this page.

Heaps

A heap is a certain kind of complete binary tree.

A heap is a data structure with several applications, including a way to implement Priority Queues

Max-Heap Property

Min-Heap Property

Click on Insert > new slide to choose your cover, then delete this page.

A heap is a data structure with several applications, including a way to implement Priority Queues, as shown in Chapter 11. The definition of a heap is a special kind of complete binary tree.

You probably recall that a complete binary tree requires that its nodes are added in a particular order...

A heap is a certain kind of complete binary tree.

When a complete

binary tree is built,

its first node must be

the root.

Root

Click on Insert > new slide to choose your cover, then delete this page.

The first node of a complete binary tree is always the root...

Complete binary tree.

Left child

of the

root

The second node is

always the left child

of the root.

Click on Insert > new slide to choose your cover, then delete this page.

...the second node is always the left child of the root...

Complete binary tree.

Right child

of the

root

The third node is

always the right child

of the root.

Click on Insert > new slide to choose your cover, then delete this page.

...then the right child of the root...

Complete binary tree.

The next nodes

always fill the next

level from left-to-right.

Click on Insert > new slide to choose your cover, then delete this page.

...and so on. The nodes always fill each level from left-to-right...

Complete binary tree.

The next nodes

always fill the next

level from left-to-right.

Click on Insert > new slide to choose your cover, then delete this page.

...from left-to-right...

Complete binary tree.

The next nodes

always fill the next

level from left-to-right.

Click on Insert > new slide to choose your cover, then delete this page.

...from left-to-right...

Complete binary tree.

The next nodes

always fill the next

level from left-to-right.

Click on Insert > new slide to choose your cover, then delete this page.

...from left-to-right...

Complete binary tree.

Click on Insert > new slide to choose your cover, then delete this page.

...and when a level is filled you start the next level at the left.

A heap is a certain kind of complete binary tree.

Each node in a heap

contains a key that

can be compared to

other nodes' keys.

19

4

22

21

27

23

45

35

Click on Insert > new slide to choose your cover, then delete this page.

So, a heap is a complete binary tree. Each node in a heap contains a key, and these keys must be organized in a particular manner. Notice that this is not a binary search tree, but the keys do follow some semblance of order.

Can you see what rule is being enforced here?

A heap is a certain kind of complete binary tree.

The "heap property"

requires that each

node's key is >= the

keys of its children

19

4

22

21

27

23

45

35

This is a handy property because the biggest node is always at the top. Because of this, a heap can easily implement a priority queue (where we need quick access to the highest priority item).

Click on Insert > new slide to choose your cover, then delete this page.

The heap property requires that each node's key is >= to the keys of its children.

This is a handy property because the biggest node is always at the top. Because of this, a heap can easily implement a priority queue (where we need quick access to the highest priority item).

Adding a Node to a Heap

Put the new node in the next available spot.

Push the new node upward, swapping with its parent until the new node reaches an acceptable location.

19

4

22

21

27

23

45

35

42

Click on Insert > new slide to choose your cover, then delete this page.

We can add new elements to a heap whenever we like. Because the heap is a complete binary search tree, we must add the new element at the next available location, filling in the levels from left-to-right.

In this example, I have just added the new element with a key of 42.

Of course, we now have a problem: The heap property is no longer valid. The 42 is bigger than its parent 27.

To fix the problem, we will push the new node upwards until it reaches an acceptable location.

Adding a Node to a Heap

Put the new node in the next available spot.

Push the new node upward, swapping with its parent until the new node reaches an acceptable location.

19

4

22

21

42

23

45

35

27

Click on Insert > new slide to choose your cover, then delete this page.

Here we have pushed the 42 upward one level, swapping it with its smaller parent 27.

We can't stop here though, because the parent 35 is still smaller than the new node 42.

Adding a Node to a Heap

Put the new node in the next available spot.

Push the new node upward, swapping with its parent until the new node reaches an acceptable location.

19

4

22

21

35

23

45

42

27

Click on Insert > new slide to choose your cover, then delete this page.

Can we stop now? Yes, because the 42 is less than or equal to its parent.

Adding a Node to a Heap

The parent has a key that is >= new node, or

The node reaches the root.

The process of pushing the new node upward is called reheapification upward.

19

4

22

21

35

23

45

42

27

Click on Insert > new slide to choose your cover, then delete this page.

In general, there are two conditions that can stop the pushing upward:

1. We reach a spot where the parent is >= the new node, or

2. We reach the root.

This process is called reheapification upward (I didn't just make up that name, really).

Removing the Top of a Heap

Move the last node onto the root.

19

4

22

21

35

23

45

42

27

Click on Insert > new slide to choose your cover, then delete this page.

We can also remove the top node from a heap. The first step of the removal is to move the last node of the tree onto the root. In this example we move the 27 onto the root.

Removing the Top of a Heap

Move the last node onto the root.

19

4

22

21

35

23

27

42

Click on Insert > new slide to choose your cover, then delete this page.

Now the 27 is on top of the heap, and the original root (45) is no longer around. But the heap property is once again violated.

Removing the Top of a Heap

Move the last node onto the root.

Push the out-of-place node downward, swapping with its larger child until the new node reaches an acceptable location.

19

4

22

21

35

23

27

42

Click on Insert > new slide to choose your cover, then delete this page.

We'll fix the problem by pushing the out-of-place node downward. Perhaps you can guess what the downward pushing is called....reheapification downward.

Removing the Top of a Heap

Move the last node onto the root.

Push the out-of-place node downward, swapping with its larger child until the new node reaches an acceptable location.

19

4

22

21

35

23

42

27

Click on Insert > new slide to choose your cover, then delete this page.

When we push a node downward it is important to swap it with its largest child. (Otherwise we are creating extra problems by placing the smaller child on top of the larger child.) This is what the tree looks like after one swap.

Should I continue with the reheapification downward?

Removing the Top of a Heap

Move the last node onto the root.

Push the out-of-place node downward, swapping with its larger child until the new node reaches an acceptable location.

19

4

22

21

27

23

42

35

Click on Insert > new slide to choose your cover, then delete this page.

Yes, I swap again, and now the 27 is in an acceptable location.

Removing the Top of a Heap

The children all have keys <= the out-of-place node, or

The node reaches the leaf.

The process of pushing the new node downward is called reheapification downward.

19

4

22

21

27

23

42

35

Click on Insert > new slide to choose your cover, then delete this page.

Reheapification downward can stop under two circumstances:

1. The children all have keys that are <= the out-of-place node.

2. The out-of-place node reaches a leaf.

Heap structure

A heap can be set up as a sorted structure

Or you can re-heap after data installed

Then you need to re-heap after each change

Click on Insert > new slide to choose your cover, then delete this page.

I claim that a preorder traversal of a min heap will list its keys in nondecreasing order. Draw an example of a heap that proves me wrong.

Click on Insert > new slide to choose your cover, then delete this page.

Implementing a Heap

We will store the data from the nodes in a partially-filled array.

An array of data

21

27

23

42

35

Click on Insert > new slide to choose your cover, then delete this page.

This slide shows the typical way that a heap is implemented. For the most part, there is nothing new here, because you already know how to implement a complete binary tree using a partially-filled array. That is what we are doing with the heap.

Implementing a Heap

Data from the root goes in the first location of the array.

An array of data

21

27

23

42

35

42

Click on Insert > new slide to choose your cover, then delete this page.

Following the usual technique for implementing a complete binary tree, the data from the root is stored in the first entry of the array.

Implementing a Heap

Data from the next row goes in the next two array locations.

An array of data

21

27

23

42

35

42

35

23

Click on Insert > new slide to choose your cover, then delete this page.

The next two nodes go in the next two locations of the array.

Implementing a Heap

Data from the next row goes in the next two array locations.

An array of data

21

27

23

42

35

42

35

23

27

21

Click on Insert > new slide to choose your cover, then delete this page.

and so on.

Implementing a Heap

Data from the next row goes in the next two array locations.

An array of data

21

27

23

42

35

42

35

23

27

21

We don't care what's in

this part of the array.

Click on Insert > new slide to choose your cover, then delete this page.

As with any partially-filled array, we are only concerned with the front part of the array. If the tree has five nodes, then we are only concerned with the entries in the first five components of the array.

Important Points about the Implementation

The links between the tree's nodes are not actually stored as pointers, or in any other way.

The only way we "know" that "the array is a tree" is from the way we manipulate the data.

21

27

23

42

35

42

35

23

27

21

Click on Insert > new slide to choose your cover, then delete this page.

With this implementation of a heap, there are no pointers. The only way that we know that the array is a heap is the manner in which we manipulate it.

Important Points about the Implementation

If you know the index of a node, then it is easy to figure out the indexes of that node's parent and children.

[1] [2] [3] [4] [5]

21

27

23

42

35

42

35

23

27

21

Click on Insert > new slide to choose your cover, then delete this page.

The manipulations are the same manipulations that you've used for a complete binary tree, making it easy to compute the index where various nodes are stored.

Indexing Binary Heaps

Index from 0

i left = 2i + 1

i right = 2i+2

i parent = [(i-1)/2]

print(heap[3])

print(heap[2*3 + 1])

print(heap[2*3 + 2])

print(heap[(3-1) / 2])

Click on Insert > new slide to choose your cover, then delete this page.

Insert

The new element is initially appended to the end of the heap (as the last element of the array). The heap property is repaired by comparing the added element with its parent and moving the added element up a level (swapping positions with the parent). This process is called "percolation up".

The comparison is repeated until the parent is larger than or equal to the percolating element.

Index from 1

i left = 2i

i right = 2i+1

i parent = ⎣i/2⎦

Click on Insert > new slide to choose your cover, then delete this page.

Insert

heap[i] = value_to_insert

while heap[i] < heap[i/2]:

heap[i], heap⎣i/2⎦ = heap⎣i/2⎦, heap[i]

Index from 1

i left = 2i

i right = 2i+1

i parent = ⎣i/2⎦

Click on Insert > new slide to choose your cover, then delete this page.

Consider the following binary tree representation of a max-heap. Give the array representation of the heap

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 6/._trees.pptx