Algorithm and Complexity Assignment (Python)

Felmin Nicolas

Assignment2_Resources.zip

Home >Computer Science homework help >Algorithm and Complexity Assignment (Python)

__MACOSX/._Assignment 2_Resources

Assignment 2_Resources/.DS_Store

__MACOSX/Assignment 2_Resources/._.DS_Store

Assignment 2_Resources/Assignment_2_HIT220 (1).pdf

HIT220 - Assignment 2

Cat Kutay

Aug 2020

1 Rubric

1. This document is available https://www.overleaf.com/read/cgwdgmvhppzr

2. Submit as a pdf by on Learnline

3. Hand drawings can be scanned and inserted

4. Marking rubric for each question: Marks will be awarded for accuracy, completeness, and well communicated reasoning which demon- strates your depth of understanding of the subject matter. Unless marks otherwise divided up:

a Advanced questions only given marks if correct. Otherwise:

b Full marks are possible if provide working code

c three quarter marks are possible if provide pseudo code

d For these marks:

i. Half marks will be given for right answer

ii. Half marks will be given for working shown with clear explanation of working

All questions total marks are shown. Not all parts are of equal value.

2 Questions

1. (4 points) Stack and Queue When working with data in a queue, write the following for the different data structures that are proposed:

a You are given an array of k single-linked-lists of length n, where each linked-list is sorted in ascending order. Write code or pseudo code to merge all the linked-lists into one sorted ascending linked-list and return it.

b Write the algorithm using (a) iteration and (b) recursion. What is the Big O complexity of each version of the algorithm

c Implement a queue using two stacks. What methods do you need to implement for a Queue? What is the worst-case run time complexity of the method to remove an element?

Advanced Write a method

def contains(queueName: Queue, search: str):

return bool

which will take a Queue and a String, and returns True if the Queue contains this string of characters in the same sequence. Return false if not found in the Queue. The elements in the Queue must remain in their original order once this method is complete.

2. (3 points) Optimize Merge

a Given an array arr[] of n integers, construct an output array prod[] such that prod[i] is equal to the product of all the elements of arr[] except arr[floor(i/2)]. Note: floor() is the largest integer < n/2.

b You are to provide code that does not use the division operator and completes in O(n) time.

Advanced Can you improve the space complexity to O(1)

3. (2 points) Matrix

a Given an n x m matrix write the code to output the transpose of the matrices as an m x n matrix First consider how to store a matrix in your code

b Given an n x m matrix write the code to output the diagonal elements of the matrix as a vector or sequence of values.

c Given an n x n matrix where each of the rows and columns are sorted in ascending order, return the kth smallest element in the matrix. Note that it is the kth smallest element in sorted order of the entire matrix, not the kth distinct element in the matrix array of arrays∣∣∣∣∣∣

1 5 9 10 11 13 12 13 15

∣∣∣∣∣∣ The 8th element is 13 as the elements of the matrix are [1,5,9,10,11,12,13,13,15]. 4. (4 points) Search Trees

The insertion of data into a tree can be done in various ways to ensure the height of the tree is minimum, and that searching the tree for an item is not more than O(log n).

a Draw separate trees with 8 nodes that are either: balanced; binary tree; neither of these.

b Write in pseudo code or code to traverse the tree and verify if it is balanced and/or binary. First consider how you will represent the edges and nodes as data in your program and used this in your code.

c Then consider which search method you will use, and name it in your code.

Advanced Assume your tree is binary. Now:

(a) Write pseudo code or code to carry out a breadth first search of a binary tree

(b) What is the complexity of the search?

5. (3 points) Optimizing storage [Advanced]

a We are looking to store water for fire management, and we have located a suitable area, but want to map out the storage, based on the height map of the surrounding area.

b Given n non-negative integers representing an elevation map where the width of each bar is 1, write code or pseudo code to calculate the maximum amount of water this elevation can trap between any two peaks. See example below.

Page 2

c Note you cannot tilt the elevation map.

6. (4 points) Map colouring Using the Bush Fire Map [2 points]:

Figure 1: input heights = [0,1,0,2,1,0,1,3,2,1,2,1]. Output = 4.

a Draw a graph of the regions on the map

b Store the graph of this map in code

c Write an algorithm in code or pseudo code to find if the map is 4-colourable

d Are you using an exhaustive or greedy algorithm method? Explain.

Advanced 2 points Use your code to advise each region where they can get extra fire trucks from, in an emergency. Assume the trucks are on average based in the centre of each region during fire season. Rough estimate of distance is sufficient.

Page 3

Figure 2: Bush Fire regions

Page 4

__MACOSX/Assignment 2_Resources/._Assignment_2_HIT220 (1).pdf

__MACOSX/Assignment 2_Resources/._Week 2

__MACOSX/Assignment 2_Resources/._Week 4

__MACOSX/Assignment 2_Resources/._Week 5

Assignment 2_Resources/Instruction for Assignment 2.docx

For this assessment you will need to show the various stages of the solution as you work through the problem, not just the final outcome.

You may create your answers as an electronic document which is machine readable. As explained in class, the use of tables to contain material or images of hand drawings is not allowed except for illustration, These will not be considered full answers.

Save to pdf and upload.

· No Plagiarism

· No Copying from internet

· Follow the marking rubric

__MACOSX/Assignment 2_Resources/._Instruction for Assignment 2.docx

__MACOSX/Assignment 2_Resources/._Week 3

__MACOSX/Assignment 2_Resources/._Week 6

Assignment 2_Resources/Week 2 /.DS_Store

__MACOSX/Assignment 2_Resources/Week 2 /._.DS_Store

Assignment 2_Resources/Week 2 /Week2.pptx

Week 2: Computational Complexity

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

Complex Issue

Feedback

Why do we give it

How do we give it

Finding it on Learn line

Consultation time

Will you use it

When do you want it

Click on Insert > new slide to choose your cover, then delete this page.

Reference

See Chapter 3 in text.

For another similar approach see Drozdek Ch2 library site

Computers and Intractability , A guide the Theory of NP-Completeness, Garey and Johnson, W.H.Freeman and Company, New York, 1979.

Algorithms and Complexity, Second Edition, Herbert S. Wilf, AK Peter Ltd Canada, 2002.

The Nature of Computation, Christopher Moore and Stephan Mertens, Oxford University Press, 2011

Click on Insert > new slide to choose your cover, then delete this page.

Meet some Ancestors

Abu Ja’far Mohammed ibn Mûsâ al-Khowârizmî was a Persian mathematician (ca. 780–850) who wrote a book that was the foundation of algebra and algorithms

Alan Turing, developed Computer Science - Please watch the You tube by Cambridge University 8 min "Alan Turing - Celebrating the Life of a Genius" The HIT220 resources page also contains links to you tubes about Turing, Turing Machines and the Halting Problem.

Leonard Euler, The Father of Graph Theory was one of the most prolific mathematicians of all time. Worth at look around about his contributions to mathematics and science but entirely optional.

Euclid developed Geometry. We studied the Euclidean Algorithm last week however Euclid is more famous for his 5 postulates. Euclid's Book/s "The Elements" is the most published book of all time. Beyond Geometry, Euclid proved many results in number theory which we love to use. The Fundamental Theorem of Arithmetic says that every positive integer greater than 1 has a unique prime decomposition. So useful for finding GCD, LCM, simplifying fractions.

Ada Lovelace developed the first computer algorithm designed for Babbage’s proposed computer, to calculate Bernoulli numbers

Click on Insert > new slide to choose your cover, then delete this page.

Computational Complexity

“As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise – By what course of calculation can these result be arrived at by the machine in the shortest test time?”

Charles Babbage (~1864)

http://www.cbi.umn.edu/about/babbage.html

“A wide variety of commonly encountered problems from mathematics, computer science, and operations research are known to be NP-complete …..

it is important for anyone concerned with the computational aspects of these fields to be familiar with the meaning and implications of this concept.”

Garey & Johnson, 1978

Click on Insert > new slide to choose your cover, then delete this page.

In defining a measure of efficiency

The efficiency of execution of an algorithm depends on the hardware, programming language used, compiler, programming skill, etc. These variables are not useful for measuring efficiency of an algorithm.

A computing time for millions of bits of input data can be expected to take longer than one where a small amount of input is required.

To evaluate an algorithm’s efficiency, logical time units that express a relationship between the size of an input and the amount of time and space required to process the input should be used.

Desirable scaling property – When the input size doubles, the algorithm should only slow down by some constant factor.

Click on Insert > new slide to choose your cover, then delete this page.

Why not good to look at hardware?

Analysis of Algorithms

Analysis of an algorithm gives insight into how long the program runs and how much memory it uses

time complexity

space complexity

Click on Insert > new slide to choose your cover, then delete this page.

Complexity Analysis

The same problem can frequently be solved with different algorithms which differ in efficiency. Or choice of data structure – e.g. Linked list vs array

Time is generally the most important

Space (memory)

Tradeoffs between time and space complexity

Type of measurement

Worst-case

Best- case

Average-case

Generally concerned with worst case analysis. We also study best case but will usually skip average case analysis.

Worst case analysis provides a running time guarantee for all inputs.

Click on Insert > new slide to choose your cover, then delete this page.

Computational Complexity

The computational complexity of an algorithm is measure of how many steps the algorithm will require in the worst case for an instance or input of a given size. Most commonly used notation for describing asymptotic complexity is “big oh" or "big O" notation (O for order of the complexity).

Very loosely, this notation strips away the less significant calculations to give the magnitude of the number of operations for a large input size.

Click on Insert > new slide to choose your cover, then delete this page.

Experimental Analysis and Challenges

One way to study the efficiency of an algorithm is to implement it and experiment by running the program on various test inputs while recording the time spent during each execution.

While experimental studies of running times are valuable, especially when finetuning production-quality code, there are three major limitations to their use for algorithm analysis:

Experimental running times of two algorithms are difficult to directly compare unless the experiments are performed in the same hardware and software environments.

Experiments can be done only on a limited set of test inputs; hence, they leave out the running times of inputs not included in the experiment (and these inputs may be important).

An algorithm must be fully implemented in order to execute it to study its running time experimentally (most serious drawback).

Click on Insert > new slide to choose your cover, then delete this page.

Moving Beyond Experimental Analysis

The goal is to develop an approach to analysing the efficiency of algorithms that:

Allows us to evaluate the relative efficiency of any two algorithms in a way that is independent of the hardware and software environment.

Is performed by studying a high-level description of the algorithm without need for implementation.

Takes into account all possible inputs.

Click on Insert > new slide to choose your cover, then delete this page.

Counting Primitive Operations

To analyse the running time of an algorithm without performing experiments, we perform an analysis directly on a high-level description of the algorithm.

We base the description on a set of primitive operations such as the following:

Assigning a value to a variable

Following an object reference

Performing an arithmetic operation (i.e. adding two numbers)

Comparing numbers

Accessing a single element of an array by index

Calling a method

Returning from a method

Click on Insert > new slide to choose your cover, then delete this page.

Primitive Operations

This operation count will correlate to an actual running time in a specific computer, for each primitive operation corresponds to a constant number of instructions, and there are only a fixed number of primitive operations.

The implicit assumption in this approach is that the running times of different primitive operations will be fairly similar. Thus, the number, t, of primitive operations an algorithm performs will be proportional to the actual running time of that algorithm.

Click on Insert > new slide to choose your cover, then delete this page.

Measuring Operations as a Function of Input Size

To capture the order of growth of an algorithm's running time, we will associate, with each algorithm, a function f(n) that characterizes the number of primitive operations that are performed as a function of the input size n.

Click on Insert > new slide to choose your cover, then delete this page.

Classification

An algorithm is deemed efficient if it has a polynomial running time (e.g. f(n)=n2).

Polynomial time classification typically exposes or indicates the existence of “essential structure”.

When there are high constants and/or exponents, over a period of time improved algorithms are often discovered.

Exceptions – some polynomial time algorithms are not (practically) efficient.

Click on Insert > new slide to choose your cover, then delete this page.

Worst Case Analysis is main focus

Running time guarantee for any input size.

Some algorithms of exponential time complexity are successful/practical because the worst case occurs rarely.

Best Case Analysis

Average case analysis (typically complex calculations)

Amortized Complexity – Sections 5.3 Textbook. Read about it so that you could discuss the idea.

Click on Insert > new slide to choose your cover, then delete this page.

Amortized – break down

The order of a nested loop

for to

Exercise: Count the number of operations which are additions, subtractions, multiplication.

Click on Insert > new slide to choose your cover, then delete this page.

Counting elementary operations

What is classified as an “elementary operation” may vary. E.g. Drozdek focuses on counting the number of assignments (eg changing values in memory) rather than operations. In fact a general rule of thumb is:

Memory references are counted in a data intensive operation as this will correlate well with running time on any machine

Comparison numbers will be the most important in a search of a list.

Elementary operations used in an algorithm to evaluate a polynomial, are the additions and multiplications needed

Click on Insert > new slide to choose your cover, then delete this page.

for to

Number of calculations is:

Order of complexity:

There are two additions, one multiplication and one subtraction for each iteration of the inner loop. i.e. 4 operations per iteration of inner loop.

Inner loop executes when:

Click on Insert > new slide to choose your cover, then delete this page.

To get the order

Which is the simplest rising function that is bigger than f(n)

From Drozdek

Click on Insert > new slide to choose your cover, then delete this page.

Function growth and complexity

Function growth

Big-O notation – informally

Tractability

Polynomial growth

Exponential growth

Execution times

Click on Insert > new slide to choose your cover, then delete this page.

Growth Functions

Different functions can effect numbers, and how the growth can happen rapidly.

Smallest to Greatest

O(1), log2(n), n, (n)log2(n), n2, 2n , n!

Consider N=10

0(1) = 1

log2 (10)= 3.32

10 log2 (10)= 33.32

102= 100

210= 1024

10!= 3,628,800

Click on Insert > new slide to choose your cover, then delete this page.

Function growth within polynomial time

lg n is base 2 logarithm

Orders of magnitude:

O(1) constant

O(log n) logarithmic

O(n) linear

O(n log n)

O(n2) quadratic

O(n3) cubic

Note axis scale is very limited.

Figure 2-5 (of Drozdek) Typical functions applied in big-O estimates

Click on Insert > new slide to choose your cover, then delete this page.

Evaluating

Click on Insert > new slide to choose your cover, then delete this page.

Asymptotic complexity

In algorithm analysis, we focus on the growth rate of the running time as a function of the input size n, taking a "big-picture" approach. For example, it is often enough just to know that the running time of an algorithm grows proportionally to n.

The asymptotic complexity is the limiting behaviour of the execution time of an algorithm when the size of the problem goes to infinity.

This is usually denoted in big-O notation.

Definition: A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of input items.

Informally, saying some equation f(n) = O(g(n)) means it is less than some constant multiple of g(n).

Click on Insert > new slide to choose your cover, then delete this page.

Rescaling of time axis

Image source Moore & Mertens

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation

Formal Definition:

f(n) = O(g(n)) means there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k.

The values of c and k must be fixed for the function f and must not depend on n.

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation – the definition

f(n) is O(g(n)) if there exists numbers c, N > 0

such that for each n ≥ N

f(n) ≤ c∙g(n)

The meaning:

f(n) is larger than g(n) only for finite number of n’s

a constant c and a value N can be found so that for every value of n ≥ N

f(n) ≤ c∙g(n) (such c and the N are called witnesses)

f(n) does not grow more than a constant factor faster than g(n)

We may write f(n) = O(g(n)) but the “=“ does not represent genuine equality.

We say that f(n) is O(g(n)), (because O(g(n)) is a collection - an infinite set).

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation: illustration

c∙g(n)

f(n)

f(n) ≤ c∙g(n)

n ≥ N

The function f(n) is O(g(n)), since f(n) ≤ c·g(n) when n ≥ N.

Click on Insert > new slide to choose your cover, then delete this page.

General rules

Ignore constants

5n -> 0(n)

Certain terms ‘dominate’ others

O(1) < 0(logn) < o(n), o(nlogn) < o(n2)< o(2n) < o(n!)

e.g. ignore low-order terms when they dominated by higher one

X=5+(15*20) ;

Independent of input size, N

O(1) ‘big O of one’

x=5+(15*20) ;

y=15-2;

Print x + y

Total time = 0(1) + 0(1) + 0(1)

3* O(1) drop constants

O(1)

Click on Insert > new slide to choose your cover, then delete this page.

Cont.

Linear time:

for x in range (0,n):

print x; // O(1)

N * O(1) = O(N)

x=5+(15*20) ; //O(1)

for x in range (0,n):

print x;

Total number = O(1) + O(N) = O(N)

In another word, when N gets large, the time it takes to compute Y is meaningless, as the for-loop dominate the run time.

}//O(N)

Click on Insert > new slide to choose your cover, then delete this page.

Constants

i :=0;

while i<n Do

i=i+1

i :=0;

while i<n Do

i=i+3

F(n) = n

O(f(n)) = O(n)

F(n) = n/3

O(f(n)) = O(n)

Click on Insert > new slide to choose your cover, then delete this page.

Quadratic time

for x in range (0,n):

print x*y // O(1)

Total number = O(N2)

}//O(N2)

Click on Insert > new slide to choose your cover, then delete this page.

O(?)

for x in range (0,n):

print x;

for x in range (0,n):

print x*y

What is the total runtime ???

In workshop

Click on Insert > new slide to choose your cover, then delete this page.

Big O and efficiency

An algorithm is deemed efficient if it has a polynomial running time i.e. O(nc) where c is a constant.

Polynomial time classification typically exposes or indicates the existence of “essential structure”.

When there are high constants and/or exponents, over a period of time improved algorithms are often discovered.

Exceptions – some polynomial time algorithms are not (practically) efficient.

More exceptions - Some exponential time algorithms work well in practice because the worst case occurs rarely.

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation: properties

1. (transitivity) If f(n) is O(g(n)) and g(n) is O(h(n)), then

f(n) is O(h(n))

(i.e., f(n) = O(g(n)) = O(O(h(n))) = O(h(n)).

2. If f(n) is O(h(n)) and g(n) is O(h(n)), then

f(n) + g(n) is O(h(n)).

3. ank is O(nk).

4. nk is O(nk+j) for any j > 0.

5. If f(n) = cg(n), then f(n) = O(g(n)).

6. logan = O(logbn) for any numbers a, b > 1.

7. logan is O(log2n) for any positive a ≠ 1

Click on Insert > new slide to choose your cover, then delete this page.

Section 2.3.

Analysis of Algorithms

Relatives of Big-Oh

big-Omega

f(n) is (g(n)) if there is a constant c > 0

and an integer constant n0  1 such that

f(n)  c•g(n) for n  n0

big-Theta

f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0 and an integer constant n0  1 such that c’•g(n)  f(n)  c’’•g(n) for n  n0

Click on Insert > new slide to choose your cover, then delete this page.

Intuition for Asymptotic Notation

Analysis of Algorithms

Big-Oh

f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)

big-Omega

f(n) is (g(n)) if f(n) is asymptotically greater than or equal to g(n)

big-Theta

f(n) is (g(n)) if f(n) is asymptotically equal to g(n)

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Example Uses of the Relatives of Big-Oh

f(n) is (g(n)) if it is (n2) and O(n2). We have already seen the former, for the latter recall that f(n) is O(g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n) < c•g(n) for n  n0

Let c = 5 and n0 = 1

5n2 is (n2)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 1 and n0 = 1

5n2 is (n)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 5 and n0 = 1

5n2 is (n2)

Click on Insert > new slide to choose your cover, then delete this page.

Tractability - overview

The area of tractability explores problems and algorithms that can take an impossible amount of computation to solve except perhaps for very small examples of the problem.

Tractable – an efficient algorithm. i.e. Polynomial time.

Efficient - time complexity is at most a polynomial function of input size.

Inefficient:

A brute force algorithm or exhaustive search examines every possibility.

For non-trivial problems this approach leads to “combinatorial explosion” in the number of possibilities.

Unacceptable running time on large inputs. Typically 2n time or worse for inputs of size n. i.e. Exponential run time.

“Or Worse” - the input size need not be very large at all to generate a running time well beyond the lifetime of the universe.

Click on Insert > new slide to choose your cover, then delete this page.

Intractability examples

A problem is intractable in the strongest sense if the problem is undecidable. i.e. No algorithm is possible.

Read about Russel’s Paradox, and Turing’s Halting Problem.

Another strong sense of intractability occurs when the running time to output a solution to a problem is excessive.

Click on Insert > new slide to choose your cover, then delete this page.

Paradox – set of all sets that do not contain themselves

Halting - machine or algorithm that says if any algorithm will halt – put through an algorithm that is the machine plus a not gate

P vs. NP and the Computational Complexity Zoo

https://www.youtube.com/watch?v=YX40hbAHx3s

Many examples in this video are not decision problems, so do not use such problems as examples in your assignments

Click on Insert > new slide to choose your cover, then delete this page.

Decision problems

In HIT220 you only look at decision problems

These are yes-or-no question on an infinite set of inputs. The classification of computational complexity classes (e.g. P, NP,NPC) is based on decision problems.

Click on Insert > new slide to choose your cover, then delete this page.

Class P

A deterministic algorithm is a uniquely defined (determined) sequence of steps for a particular input. There is only one way to determine the next step that the algorithm can make.

A problem belongs to the class P of decision problems if it can be solved in polynomial time with a deterministic algorithm.

Two members of P:

Graph 2-colourability

Is G Eulerian?

The bridges of Königsberg problem solved by Euler in 1736.

Click on Insert > new slide to choose your cover, then delete this page.

The Seven Bridges of Königsberg

https://www.youtube.com/watch?v=f-Zy1q9XFQE

Click on Insert > new slide to choose your cover, then delete this page.

The graph of the Bridges of of Königsberg has 4 vertices (land) and 7 edges (representing the bridges).

One could solve the problem using “Brute Force”. An exhaustive search of all possibilities. Exponential time complexity.

Euler’s insight: In order to cross each edge once, any time we arrive at a vertex along one edge, we have to depart on a different edge.

Hence the degree of a vertex v, that is the number of edges incident to each v , must be even.

Click on Insert > new slide to choose your cover, then delete this page.

Class NP

A non-deterministic algorithm is an algorithm that can use a special operation that makes a guess when a decision is to be made.

A problem belongs to the class NP of decision problems if it can be solved by a non-deterministic polynomial time algorithm (on a theoretical machine allowing infinite parallelism).

The class of NP problems consists of problems with the property that “checking” whether or not a claimed solution is correct can be performed in polynomial time.

Two members of NP: Graph k-colourability,

Is G Hamiltonian?

A Hamiltonian path, also called a Hamilton path, is a graph path between two vertices of a graph that visits each vertex exactly once. If a Hamiltonian path exists whose endpoints are adjacent, then the resulting graph cycle is called a Hamiltonian cycle (or Hamiltonian cycle).

So to prove that a decision problem is in NP we would only need to show that checking can be performed in polynomial time. i.e. That there is an efficient checking algorithm.

(Probably the most useful thinking for HIT220)

Click on Insert > new slide to choose your cover, then delete this page.

A non-deterministic algorithm can solve a decision problem if there is a path in the decision tree of the algorithm that leads to a “yes” answer; otherwise it would answer “no”.

If the number of steps in the decision tree path to the affirmative answer is O(nk), where n is the size of the specific problem, the algorithm is considered polynomial.

There are other ways to describe NP, as per Drozdek which is below but I recommend the previous slide.

In HIT220 I suggest a simpler, but equivalent way of discussing class P: After showing that a problem is in NP, we can show membership in class P by describing and analysing an efficient algorithm for determining the Decision Problem. i.e. (1) Show that it is in NP (polynomial time checking algorithm) and then (2) give an algorithm to solve and show that the decision problem itself is O(nk), where n is input size. For example we could do these steps for 2-colourability or Eulerian graph.

Should be clear that, P is a subset of NP. i.e. It is known that every decision problem in P is also in NP. The big unknown is whether or not every problem in NP is also in P. Is it possible that efficient algorithms exist but that we just haven’t found them? Mostly conjectured that P is not equal to NP. See other readings for this week.

Click on Insert > new slide to choose your cover, then delete this page.

Class NPC

A problem is called NP-complete if it is NP and every NP problem can be polynomially reduced to this problem.

Cook 1970 established that SAT (Boolean satisfiability problem) is in NPC - Cook’s Theorem (Stephen Cook, 1970) Read pages 13,14 of Garey and Johnson.

If every instance of a problem P1 can be reduced, in polynomial time, to some instance of P2 then P2 is at least as hard as P1.

If there is a polynomial time algorithm for any NPC problem then every problem in NP can be solved efficiently.

Click on Insert > new slide to choose your cover, then delete this page.

https://www.donkcowan.com/blog/2013/5/13/p-versus-np-complexity-t

Click on Insert > new slide to choose your cover, then delete this page.

Showing that any problem is in the class NPC means that there is no polynomial time algorithm unless P = NP.

Most in the field think that is unlikely that P = NP. What do you think? Is it plausible that “Easy to check YES’s” should have “Easy to obtain solutions”?

Our main focus is on awareness of Class NPC and its implication and developing some ability to recognise types of problems which might be intractable.

Click on Insert > new slide to choose your cover, then delete this page.

About Input size and encoding

Complexity of a calculation is measured by expressing the running time as a function of some measure of the input data n needed to describe the problem.

For graph problems, input size n is typically the number of vertices or edges in the graph. Assumption of a reasonable encoding scheme.

For problems that involve integers such as computing the GCD, n is the number of bits it takes to express these integers. i.e. log n rather than n.

Click on Insert > new slide to choose your cover, then delete this page.

Input size and encoding

Complexity of a calculation is measured by expressing the running time as a function of some measure of the input data n needed to describe the problem.

For graph problems, input size n is typically the number of vertices or edges in the graph. Assumption of a reasonable encoding scheme.

For problems that involve integers such as computing the GCD, n is the number of bits it takes to express these integers or memory size. i.e. log n rather than n.

Click on Insert > new slide to choose your cover, then delete this page.

Theoretical perspective

If the running time is at most a polynomial function of the amount of input data then the calculation is called “easy”. Otherwise, it is “hard”.

To show/classify that a computation problem is easy it is enough to describe a fast algorithm.

To prove that a problem is hard means that you would need to prove that it is impossible to find a fast way of solving the problem.

The fact that a computational problem is classified as hard does not mean that every instance of it has to be hard.

A problem is hard because we cannot devise an algorithm for which we can give a guarantee for fast performance for all instances.

Click on Insert > new slide to choose your cover, then delete this page.

Decision Problem example

Graph k-coloring

Input: A graph G=(V,E), a positive integer k.

Question: Is G k-colourable?

A graph is k-colourable if its vertices can be assigned colours from a list of k colours such that no pair of adjacent vertices has the same colour.

Click on Insert > new slide to choose your cover, then delete this page.

Graph 2-coloring

Input: A graph G=(V,E).

Question: Is G 2-colourable?

A polynomial time algorithm is known.

Graph k-coloring

Input: A graph G=(V,E).

Question: Is G k-colourable?

No polynomial time algorithm is known to determine YES or NO for all instances.

However if given a colouring of the vertices of G, checking whether or not it is a k-colouring can be executed in polynomial time.

Click on Insert > new slide to choose your cover, then delete this page.

Sometimes rewording problem helps

What is a computer doing?

Consider the Turing model of a machine

https://en.wikipedia.org/wiki/Universal_Turing_machine

Click on Insert > new slide to choose your cover, then delete this page.

Singly Linked List

A singly linked list is a concrete data structure consisting of a sequence of nodes, starting from a head pointer

Each node stores

element

link to the next node

elem

node



head

Click on Insert > new slide to choose your cover, then delete this page.

The Node Class for List Nodes

public class Node {

// Instance variables:

private Object element;

private Node next;

/** Creates a node with null references to its element and next node. */

public Node() {

this(null, null);

}

/** Creates a node with the given element and next node. */

public Node(Object e, Node n) {

element = e;

next = n;

}

// Accessor methods:

public Object getElement() {

return element;

}

public Node getNext() {

return next;

}

// Modifier methods:

public void setElement(

Object newElem) {

element = newElem;

}

public void setNext(

Node newNext) {

next = newNext;

}

Click on Insert > new slide to choose your cover, then delete this page.

Inserting at the Head

Allocate a new node (memory)

Insert new element

Make new node point to old head

Update head to point to new node

Click on Insert > new slide to choose your cover, then delete this page.

Removing at the Head

Update head to point to next node in the list

Allow garbage collector to reclaim the former first node

Click on Insert > new slide to choose your cover, then delete this page.

Inserting at the Tail

Find the tail node

Allocate a new node

Insert new element

Have new node point to null

Have old last node point to new node

Update tail to point to new node

Click on Insert > new slide to choose your cover, then delete this page.

Removing at the Tail

Removing at the tail of a singly linked list is not efficient!

There is no constant-time way to update the tail to point to the previous node

Click on Insert > new slide to choose your cover, then delete this page.

Spatial efficiency

Storage space is the space for storing the input: data and algorithm

Auxiliary Space is the extra space or temporary space used by an algorithm.

Space Complexity of an algorithm is total space taken by the algorithm with respect to the input size. Space complexity includes both Auxiliary space and space used by input.

Click on Insert > new slide to choose your cover, then delete this page.

Improving lists

Double Linked List

We can backtrack

Circular List

In looping through list return easily to head

Click on Insert > new slide to choose your cover, then delete this page.

Recursion

One method of improving efficiency

We reduce the code length and run time

But we may increase the space load

Click on Insert > new slide to choose your cover, then delete this page.

What Is Recursion?

Recursion is a method of solving problems that involves breaking a problem down into smaller and smaller subproblems until you get to a small enough problem that it can be solved trivially.

Recursion involves a function calling itself.

While it may not seem like much on the surface, recursion allows us to write elegant solutions to problems that may otherwise be very difficult to program.

Click on Insert > new slide to choose your cover, then delete this page.

Calculating the Sum of a List of Numbers

We will begin our investigation with a simple problem that you already know how to solve without using recursion.

Suppose that you want to calculate the sum of a list of numbers such as: [1,3,5,7,9].

An iterative function that computes the sum is shown here.

The function uses an accumulator variable (total) to compute a running total of all the numbers in the list by starting with 0 and adding each number in the list.

Click on Insert > new slide to choose your cover, then delete this page.

Premise of Recursion

Pretend that you do not have while loops or for loops. How would you compute the sum of a list of numbers?

If you were a mathematician you might start by recalling that addition is a function that is defined for two parameters, a pair of numbers. To redefine the problem from adding a list to adding pairs of numbers, we could rewrite the list as a fully parenthesized expression. Such an expression looks like this:

((((1+3)+5)+7)+9)

Click on Insert > new slide to choose your cover, then delete this page.

Premise of Recursion

We can also parenthesize the expression the other way around: (1+(3+(5+(7+9))))

Notice that the innermost set of parentheses, (7+9), is a problem that we can solve without a loop or any special constructs. In fact, we can use the following sequence of simplifications to compute a final sum.

total= (1+(3+(5+(7+9))))

total= (1+(3+(5+16)))

total= (1+(3+21))

total= (1+24)

total= 25

Click on Insert > new slide to choose your cover, then delete this page.

A Different Approach

How can we take this idea and turn it into a program, like Python?

First, let’s restate the sum problem in terms of Python lists.

We might say the sum of the list numList is the sum of the first element of the list (numList[0]), and the sum of the numbers in the rest of the list (numList[1:]).

listSum(numList)=first(numList) + listSum(rest(numList))

first(numList) returns the first element of the list

rest(numList) returns a list of everything but the first element

Click on Insert > new slide to choose your cover, then delete this page.

Example Recursive Program

Tracing: http://interactivepython.org/courselib/static/pythonds/Recursion/pythondsCalculatingtheSumofaListofNumbers.html

Click on Insert > new slide to choose your cover, then delete this page.

Program Execution Logic

Click on Insert > new slide to choose your cover, then delete this page.

The Three Rules of Recursion

All recursive algorithms must obey three important rules

A recursive algorithm must have a base case.

A recursive algorithm must change its state and move toward the base case.

A recursive algorithm must call itself, recursively.

Click on Insert > new slide to choose your cover, then delete this page.

The Rules!

First, a base case is the condition that allows the algorithm to stop recursing. A base case is typically a problem that is small enough to solve directly.

In the list_sum algorithm the base case is a list of length 1.

Second, we must arrange for a change of state that moves the algorithm toward the base case. A change of state means that some data that the algorithm is using is modified. Usually the data that represents our problem gets smaller in some way.

In the list_sum algorithm our primary data structure is a list, so we must focus our state-changing efforts on the list. Since the base case is a list of length 1, a natural progression toward the base case is to shorten the list.

Third, the algorithm must call itself.

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 2 /._Week2.pptx

Assignment 2_Resources/Week 2 /Week 2 workshop.pptx

Complexity Theory

Cat Kutay

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Big-Oh Rules

If is f(n) a polynomial of degree d, then f(n) is O(nd), i.e.,

Drop lower-order terms

Drop constant factors

Use the smallest possible class of functions

Say “2n is O(n)” instead of “2n is O(n2)”

Use the simplest expression of the class

Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Asymptotic Algorithm Analysis

The asymptotic analysis of an algorithm determines the running time in big-Oh notation

To perform the asymptotic analysis

We find the worst-case number of primitive operations executed as a function of the input size

We express this function with big-Oh notation

Example:

We say that algorithm find_max “runs in O(n) time”

Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Asymptotic complexity – an example

Most commonly used notation for (asymptotic) complexity is "big-O" notation

which estimates the rate of growth of a function, for “large” n.

f (n) = n2 + 100n + log10n + 1,000 = O(n2)

" f (n) is big-oh of n squared ”

The order of complexity of f (n) is a polynomial function (quadratic)

Figure 2-1 (Drozdek) The growth rate of all terms of function f (n) = n2 + 100n + log10n + 1,000

Click on Insert > new slide to choose your cover, then delete this page.

Which is the fastest growing component

Execution time

Figure 2-4 (Drozdek) Classes of algorithms and their execution times on a computer executing 1 million operations per second (1 sec = 106 μsec (microseconds) = 103 msec (milliseconds))

Click on Insert > new slide to choose your cover, then delete this page.

Computing Prefix Averages

We further illustrate asymptotic analysis with two algorithms for prefix averages

The i-th prefix average of an array X is average of the first (i + 1) elements of X:

A[i] = (X[0] + X[1] + … + X[i])/(i+1)

Computing the array A of prefix averages of another array X has applications to financial analysis

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Prefix Averages (Quadratic)

The following algorithm computes prefix averages in quadratic time by applying the definition

Click on Insert > new slide to choose your cover, then delete this page.

Arithmetic Progression

The running time of prefixAverage1 is O(1 + 2 + …+ n)

The sum of the first n integers is n(n + 1) / 2

There is a simple visual proof of this fact

Thus, algorithm prefixAverage1 runs in O(n2) time

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Prefix Averages 2 (Looks Better)

The following algorithm uses an internal Python function to simplify the code

Algorithm prefixAverage2 still runs in O(n2) time!

Click on Insert > new slide to choose your cover, then delete this page.

Reduce complexity

If x>0:

// O(I)

else if x<0:

//O(logn)

else:

//O(n2)

Click on Insert > new slide to choose your cover, then delete this page.

O(?) from seminar

for x in range (0,n):

print x;

for x in range (0,n):

print x*y

What is the total runtime ???

O(N2)

Click on Insert > new slide to choose your cover, then delete this page.

Big-O notation: examples

These functions are O(n2): 3n2 1000n2 + 10000 3n 3n2 + log n + 123 These functions are not O(n2): 3n2.01 3n3

Click on Insert > new slide to choose your cover, then delete this page.

Exercises

Use the definition of big-O to show that f is O(g) for each of the following functions. (i.e. Find the witnesses c and N.) In each case, assume that n is a positive integer.

Remember The function f(n) is O(g(n)), since f(n) ≤ c·g(n) when n ≥ N.

f(n) = 8n, g(n) = n

We want c and N such that 8n ≤ c × g(n) for all n ≥ N.

With c = 8, N = 1 we have

8n ≤ 8n which is true as required for all n ≥ 1.

Click on Insert > new slide to choose your cover, then delete this page.

(b) f(n) = 4n − 1, g(n) = n

We want c and N such that 4n − 1 ≤ c × n for all n ≥ N.

Similarly, try c = 4, N = 1.

4n − 1 ≤ 4n − 1 which is true as required for all n ≥ 1.

0 ≤ n 2 + 2n + 1 ≤ n 2 + 2n 2 + n2 = 4n2

Hence, c = 4 and N = 1 are witnesses.

(d) f(n) = 2n2 + 5n − 3, g(n) = n2

2n2 + 5n − 3 ≤ 2n2 + 5n2 − 3n2

≤ 4n2 , for n ≥ 1

Hence, c = 4 and N = 1 are witnesses.

(e) f(n) = 457, g(n) = 1

c = 457 and N = 1.

The function f(n) is O(g(n)), since f(n) ≤ c·g(n) when n ≥ N.

OR 0 ≤ f(n) ≤ c.g(n) for all n ≥ N

Click on Insert > new slide to choose your cover, then delete this page.

(f) Show that 3n2 + 25 is O(n2)

3n2 + 25 ≤ 3n2 + 25n2 n ≥ 1

3n2 + 25 ≤ 28n2

Hence, c = 28 and N = 1 are witnesses.

(g) Show that 2n2 is O(n3)

2n2 ≤ 2n3

Hence, c = 1 and N = 2 are witnesses.

Click on Insert > new slide to choose your cover, then delete this page.

Exercises

Click on Insert > new slide to choose your cover, then delete this page.

Figure 2-3 Comparison of functions for different values of c and N from Figure 2-2

Figure 2-2 (Drozdek) Different values of c and N for function f (n) = 2n2 + 3n + 1 = O(n2)

calculated according to the definition of big-O

Click on Insert > new slide to choose your cover, then delete this page.

TSP – Travelling salesman problem

Below link, solves the problem for however many cities you want to select by trying out all possible routes, and recording the best so far. You can get a feel for what an intractable problem looks like by seeing how long the interactive takes to solve the problem for different size maps

https://csfieldguide.org.nz/en/interactives/city-trip/

Stop, set up cities then start

From https://csfieldguide.org.nz/en/chapters/complexity-and-tractability/whats-the-big-picture/

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Prefix Averages 3 (Linear Time)

The following algorithm computes prefix averages in linear time by keeping a running sum

Algorithm prefixAverage3 runs in O(n) time

Click on Insert > new slide to choose your cover, then delete this page.

Lists

What are the features of a list

What are the operations required on a list

Click on Insert > new slide to choose your cover, then delete this page.

The Node Class for List Nodes

public class Node {

// Instance variables:

private Object element;

private Node next;

/** Creates a node with null references to its element and next node. */

public Node() {

this(null, null);

}

/** Creates a node with the given element and next node. */

public Node(Object e, Node n) {

element = e;

next = n;

}

// Accessor methods:

public Object getElement() {

return element;

}

public Node getNext() {

return next;

}

// Modifier methods:

public void setElement(

Object newElem) {

element = newElem;

}

public void setNext(

Node newNext) {

next = newNext;

}

Click on Insert > new slide to choose your cover, then delete this page.

Recursion example

Analysis of Algorithms

Click on Insert > new slide to choose your cover, then delete this page.

Program Execution Logic

Click on Insert > new slide to choose your cover, then delete this page.

Self Test

How many recursive calls are made when computing the sum of the list [2,4,6,8,10]?

(A) 6

(B) 5

(D) 3

The first recursive call passes the list [4,6,8,10], the second [6,8,10] and so on until [10].

Click on Insert > new slide to choose your cover, then delete this page.

Other sources

P vs NP and the Computational Complexity Zoo

Turing and the Halting Problem

Turing and the Halting Problem Computerphile Decoded science The Turing Machine versus the Decision Problem of Hilbert

Turing machines explained - Computerphile

Click on Insert > new slide to choose your cover, then delete this page.

Maths to review

Analysis of Algorithms

Summations

Logarithms and Exponents

Proof techniques

Basic probability

properties of logarithms:

logb(xy) = logbx + logby

logb (x/y) = logbx - logby

logbxa = alogbx

logba = logxa/logxb

properties of exponentials:

a(b+c) = aba c

abc = (ab)c

ab /ac = a(b-c)

b = a logab

bc = a c*logab

Click on Insert > new slide to choose your cover, then delete this page.

1234567

Chart1

21	21
23	22
25	23
31	25
20	24
18	23
16	22

Sheet1

X	21	23	25	31	20	18	16
A	21	22	23	25	24	23	22

123456

__MACOSX/Assignment 2_Resources/Week 2 /._Week 2 workshop.pptx

Assignment 2_Resources/Week 4/Priority Queues.pptx

HIT220 Algorithms and Complexity

Priority Queues

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

Circular Linked Lists

Circular linked lists are a type of linked list in which the last node links back to the head of the list instead of pointing to None.

Circular linked lists have use cases such as:

Going around each player’s turn in a multiplayer game

Managing the application life cycle of a given operating system

Implementing a Fibonacci heap for a priority queue

Click on Insert > new slide to choose your cover, then delete this page.

Implementing

One of the advantages of circular linked lists is that you can traverse the whole list starting at any node. Since the last node points to the head of the list, you need to make sure that you stop traversing when you reach the starting point. Otherwise, you’ll end up in an infinite loop.

In terms of implementation, circular linked lists are very similar to singly linked list. The only difference is that you can define the starting point when you traverse the list:

Click on Insert > new slide to choose your cover, then delete this page.

Python

Click on Insert > new slide to choose your cover, then delete this page.

Priority Queue

What is a queue

What is a priority queue

struct item { int item; int priority; }

Click on Insert > new slide to choose your cover, then delete this page.

Heap

A Heap is a tree structure in which each parent node is less than or equal to its child node for a Min Heap and greater than or equal to its child node for a Max heap.

Order needs to be introduced to the heap to achieve the desired running time O(log n) . In particular, the degrees of the nodes are kept quite low: every node has degree at most O(log n) and the size of a subtree rooted in a node of degree k is at least Fk+2, where Fk is the kth Fibonacci number

Click on Insert > new slide to choose your cover, then delete this page.

Min Binary Heap

Click on Insert > new slide to choose your cover, then delete this page.

Implementing

import heapq H = [21,1,45,78,3,5]

# Use heapify to rearrange the elements heapq.heapify(H)

print(H)

[1, 3, 5, 78, 21, 45]

heapq.heappush(H,8)

print(H)

[1, 3, 5, 78, 21, 45, 8]

heapq.heappop(H)

[3, 8, 5, 78, 21 ,45]

heapq.heapreplace(H,6)

[5, 8, 6, 78, 21, 45]

Click on Insert > new slide to choose your cover, then delete this page.

Converting

A Min-Heap is a complete binary tree in which the value in each internal node is smaller than or equal to the values in the children of that node. Mapping the elements of a heap into an array is trivial: if a node is stored a index k, then its left child is stored at index 2k + 1 and its right child at index 2k + 2.

Click on Insert > new slide to choose your cover, then delete this page.

Some operations in binary heaps

getMin(): returns the root element. Time Complexity of this operation is O(1).

extractMin(): Removes the minimum element. Time Complexity of this Operation is O(Log n) as this operation needs to maintain the heap property after removing root.

insert(): Inserting a new key takes O(Log n) time. We add a new key at the end of the tree, if new key is larger than its parent. If not we need to traverse up to fix the violated heap property.

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci heap

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction/

Click on Insert > new slide to choose your cover, then delete this page.

Complexity

Find Min: Θ(1) [Same as both Binary and Binomial]

Delete Min: O(Log n) [Θ(Log n) in both Binary and Binomial]

Insert: Θ(1) [Θ(Log n) in Binary and Θ(1) in Binomial]

Decrease-Key: Θ(1) [Θ(Log n) in both Binary and Binomial]

Merge: Θ(1) [Θ(m Log n) or Θ(m+n) in Binary and Θ(Log n) in Binomial]

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction

Click on Insert > new slide to choose your cover, then delete this page.

Matrices and vectors

Why do we use them and how

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking Matrix

Eigenvectors

Eigenvalues

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking

From Tim Chartier and Anne Greenbaum

Adjacency matrix

Click on Insert > new slide to choose your cover, then delete this page.

Ranking matrix

Form a matrix M where element mij gives the probability of a surfer visit webpage j from webpage i, which implies

Click on Insert > new slide to choose your cover, then delete this page.

Stochastic matric or transition matrix

Start a walk at page 1, represented by vector v

Click on Insert > new slide to choose your cover, then delete this page.

Calculating probability of each page

v2 = v1M = (0 0.33 0.33 0 0.33 0)

v3 = v2M = (0.67 0 0.17 0 0 0.17)

v3 = vM3

The adjacency matrix is sparse

Note all page ranks add to 1

V becomes an eigenvector of M

Click on Insert > new slide to choose your cover, then delete this page.

Thank you [email protected]

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 4/._Priority Queues.pptx

Assignment 2_Resources/Week 4/.DS_Store

__MACOSX/Assignment 2_Resources/Week 4/._.DS_Store

Assignment 2_Resources/Week 4/Stacks and Queues.pptx

Week 4: Stacks and Queues

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

Space Efficiency

Example of using tail recursion:

def fact(n):

if (n == 0):

return 1

return n * fact(n-1)

Is this the same as:

def fact(n, a = 1):

if (n == 1):

return a

return fact(n - 1, n * a)

Click on Insert > new slide to choose your cover, then delete this page.

Program Execution Logic

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Relatives of Big-Oh

big-Omega

f(n) is (g(n)) if there is a constant c > 0

and an integer constant n0  1 such that

f(n)  c•g(n) for n  n0

big-Theta

f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0 and an integer constant n0  1 such that c’•g(n)  f(n)  c’’•g(n) for n  n0

Click on Insert > new slide to choose your cover, then delete this page.

Analysis of Algorithms

Example Uses of the Relatives of Big-Oh

Let c = 5 and n0 = 1

5n2 is (n2)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 1 and n0 = 1

5n2 is (n)

f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0

let c = 5 and n0 = 1

5n2 is (n2)

Click on Insert > new slide to choose your cover, then delete this page.

The Node Class for List Nodes Coding practice

public class Node {

// Instance variables:

private Object element;

private Node next;

/** Creates a node with null references to its element and next node. */

public Node() {

this(null, null);

}

/** Creates a node with the given element and next node. */

public Node(Object e, Node n) {

element = e;

next = n;

}

// Accessor methods:

public Object getElement() {

return element;

}

public Node getNext() {

return next;

}

// Modifier methods:

public void setElement(

Object newElem) {

element = newElem;

}

public void setNext(

Node newNext) {

next = newNext;

}

Click on Insert > new slide to choose your cover, then delete this page.

Insertion/Deletion in List

// insertion methods

public insert(Object e, Node n) {

//insert e after node n

newNode=node(e, n.next)

n.next=newNode;

}

//deletion

Public delete (Object e)

//delete node with data e

node=head;

while(node is not None):

if node.element ==e:

prev.next=node.next

//garbage collection

del node;

else:

prev=node;

node=node.next

Click on Insert > new slide to choose your cover, then delete this page.

Retrieval

//deletion

Public delete (Object e)

//delete node with data e

node=head;

while(node is not None):

if node.element ==e:

return True; prev=node;

node=node.next

return False;

Click on Insert > new slide to choose your cover, then delete this page.

Basic Data Structures

The Internet is designed to route information in discrete packets, which are at most 1500 bytes in length. Because of this design, any time a video stream is transmitted on the Internet, it must be subdivided into packets and these packets must each be individually routed to their destination.

In the case of a stored video file, it is desirable to send packets using the Transmission Control Protocol (TCP), since this protocol guarantees that all the packets will arrive at their destination in the correct order.

Nevertheless, because of vagaries and errors, the time it takes for these packets to arrive at their destination can be highly variable. Thus, we need a way of “smoothing out” these variations in order to avoid long pauses if someone wants to watch a video at the same time it is being transmitted. (like a buffer)

which is a portion of computer memory that is used to temporarily store items, as they are being produced by one computational process and consumed by another.

Click on Insert > new slide to choose your cover, then delete this page.

Example

Internet web browsers store the addresses of recently visited sites on a stack. Each time a user visits a new site, that site’s address is “pushed” onto the stack of addresses. The browser then allows the user to “pop” back to previously visited sites using the “back” button.

Click on Insert > new slide to choose your cover, then delete this page.

Stacks

A stack is a container of objects that are inserted and removed according to the last-in first-out (LIFO) principle.

Linear data structure

Flexible size (delete and add as you go!)

Objects can be inserted into a stack at any time, but only the most-recently inserted (that is, “last”) object can be removed at any time.

The name “stack” is derived from the metaphor of a stack of plates in a spring loaded cafeteria plate dispenser.

In this case, the fundamental operations involve the “pushing”

and “popping” of plates on the stack.

Click on Insert > new slide to choose your cover, then delete this page.

Stacks

stack, S, is a container that supports the following two methods:

push(o): Insert object o at the top of the stack.

pop(): Remove from the stack and return the top object on the stack, that is, the most recently inserted element still in the stack; an error occurs if the stack is empty.

Stack data structure can be implemented in two ways. They are as follows...

Using Array (Python uses Lists, eg: my_var = [item1, item2] )

Using Linked List

Click on Insert > new slide to choose your cover, then delete this page.

Application

The simplest application of a stack is to reverse a word. You push a given word to stack - letter by letter - and then pop letters from the stack.

Another application is an "undo" mechanism in text editors; this operation is accomplished by keeping all text changes in a stack.

Click on Insert > new slide to choose your cover, then delete this page.

A Simple Array-Based Implementation

A stack is easily implemented with an N-element array S, with elements stored from S[0] to S[t], where t is an integer that gives the index of the top element in S.

We must specify some maximum size N for our stack, say, N = 1, 000

Click on Insert > new slide to choose your cover, then delete this page.

Array-based implementation

If array begin at index 0

(t −1) for an initially empty stack, also to test if a stack is empty.

(t+1) to determine the number of elements in a stack.

Error condition that arises:

if we try to insert a new element and the array S is full.

If we try to remove a element from an empty array S

Algorithm push(obj):

if t + 1 = N then

return that a stack-full error has occurred

t ← t + 1

S[t] ← obj

return

Algorithm pop():

if t < 0 then

return that a stack-empty error has occurred

obj ← S[t]

S[t] ← null (or None in Python)

t ← t − 1

return obj

each method runs in O(1) time.

Click on Insert > new slide to choose your cover, then delete this page.

Implementing a Stack

Stack() creates a new stack that is empty. It needs no parameters and returns an empty stack.

push(item) adds a new item to the top of the stack. It needs the item and returns nothing.

pop() removes the top item from the stack. It needs no parameters and returns the item. The stack is modified.

peek() returns the top item from the stack but does not remove it. It needs no parameters. The stack is not modified.

is_empty() tests to see whether the stack is empty. It needs no parameters and returns a boolean value.

size() returns the number of items on the stack. It needs no parameters and returns an integer.

Click on Insert > new slide to choose your cover, then delete this page.

Example

Click on Insert > new slide to choose your cover, then delete this page.

The biggest issue is that it can run into speed issue as it grows. The items in list are stored next to each other in memory, if the stack grows bigger than the block of memory that currently hold it, then Python needs to do some memory allocations. This can lead to some append() calls taking much longer than other ones.

Stack implemented using array is not suitable, when we don't know the size of data which we are going to use.

A stack data structure can be implemented by using linked list data structure. The stack implemented using linked list can work for unlimited number of values. That means, stack implemented using linked list works for variable size of data.

So, there is no need to fix the size at the beginning of the implementation. The Stack implemented using linked list can organize as many data values as we want.

Linked List-based implementation provides the best (from the efficiency point of view) dynamic stack implementation.

Click on Insert > new slide to choose your cover, then delete this page.

Implementation of the

Stack class in Python

Demonstration of the Stack class being used in Python

Click on Insert > new slide to choose your cover, then delete this page.

Linked List-based implementation

In linked list implementation of a stack, every new element is inserted as 'top' element. That means every newly inserted element is pointed by 'top'. Whenever we want to remove an element from the stack, simply remove the node which is pointed by 'top' by moving 'top' to its next node in the list. The next field of the first element must be always NULL.

Click on Insert > new slide to choose your cover, then delete this page.

push(value) - Inserting an element into the Stack

We can use the following steps to insert a new node into the stack...

Step 1: Create a newNode with given value.

Step 2: Check whether stack is Empty (top == NULL)

Step 3: If it is Empty, then set newNode → next = NULL.

Step 4: If it is Not Empty, then set newNode → next = top.

Step 5: Finally, set top = newNode.

Click on Insert > new slide to choose your cover, then delete this page.

pop() - Deleting an Element from a Stack

We can use the following steps to delete a node from the stack...

Step 1: Check whether stack is Empty (top == NULL).

Step 2: If it is Empty, then display "Stack is Empty!!! Deletion is not possible!!!" and terminate the function

Step 3: If it is Not Empty, then define a Node pointer 'temp' and set it to 'top'.

Step 4: Then set 'top = top → next'.

Step 7: Finally, delete 'temp' (free(temp)).

Click on Insert > new slide to choose your cover, then delete this page.

display() - Displaying stack of elements

We can use the following steps to display the elements (nodes) of a stack...

Step 1: Check whether stack is Empty (top == NULL).

Step 2: If it is Empty, then display 'Stack is Empty!!!' and terminate the function.

Step 3: If it is Not Empty, then define a Node pointer 'temp' and initialize with top.

Step 4: Display 'temp → data --->' and move it to the next node. Repeat the same until temp reaches to the first node in the stack (temp → next != NULL).

Step 4: Finally! Display 'temp → data ---> NULL'.

Click on Insert > new slide to choose your cover, then delete this page.

An implementation of the

Stack class in Python utilizing a Linked List

Click on Insert > new slide to choose your cover, then delete this page.

Example: Using a buffer to smooth video transmission over the Internet

The networking process is producing the packets and the playback process is consuming them. Thus, ignoring any rewinding, this producer-consumer model is enforcing a first-in, first-out (FIFO) protocol for the packets, in that the consumer process is always retrieving the packet that has been in the buffer the longest.

Click on Insert > new slide to choose your cover, then delete this page.

Queue buffers

Network packets are produced by the TCP process, which inserts packets in the correct order into the buffer, but does so at a variable rate.

Packets are consumed by the video player process, which removes packets from the buffer at a constant speed according to the video standard it is using.

The buffer enforces a first-in, first-out (FIFO) protocol and, if it is big enough, it should smooth out the packet production and consumption so that packets can be processed at the average transmission rate without annoying pauses.

This week, we will explore how to implement such a buffer using a queue data structure, which itself can be implemented with either an array or a linked list.

Click on Insert > new slide to choose your cover, then delete this page.

Queues

Queue is a linear data structure in which the insertion and deletion operations are performed at two different ends. In a queue data structure, adding and removing of elements are performed at two different positions. The insertion is performed at one end and deletion is performed at other end. In a queue data structure, the insertion operation is performed at a position which is known as 'rear' and the deletion operation is performed at a position which is known as 'front'. In queue data structure, the insertion and deletion operations are performed based on FIFO (First In First Out) principle.

Click on Insert > new slide to choose your cover, then delete this page.

A queue supports the following two fundamental methods:

enqueue(o): Insert object o at the rear of the queue.

dequeue(): Remove and return from the queue the object at the front; an error occurs if the queue is empty.

Queue data structure can be implemented in two ways. They are as follows...

Using Array

Using Linked List

Click on Insert > new slide to choose your cover, then delete this page.

Example

Queue after inserting 25, 30, 51, 60 and 85.

Click on Insert > new slide to choose your cover, then delete this page.

Example

Click on Insert > new slide to choose your cover, then delete this page.

Implementing a Queue

Queue() creates a new queue that is empty. It needs no parameters and returns an empty queue.

enqueue(item) adds a new item to the rear of the queue. It needs the item and returns nothing.

dequeue() removes the front item from the queue. It needs no parameters and returns the item. The queue is modified.

is_empty() tests to see whether the queue is empty. It needs no parameters and returns a boolean value.

size() returns the number of items in the queue. It needs no parameters and returns an integer.

display() - To display the elements of the queue.

Click on Insert > new slide to choose your cover, then delete this page.

Implementation of the

Queue class in Python

Demonstration of the Queue class being used in Python

Click on Insert > new slide to choose your cover, then delete this page.

Queue implementation using singly linked list

set_next will take a new_node as a parameter and will set the pointer from previous Node towards this Node.

In the Node class, we will set the data and next_node(i.e pointer) equal to None as parameter in init method so that if we don’t send any data and pointer to the next node, it will return None.

Click on Insert > new slide to choose your cover, then delete this page.

An implementation of the

Queue class in Python utilizing a Linked List

Code Continues 

Click on Insert > new slide to choose your cover, then delete this page.

Simulation: Hot Potato

https://www.youtube.com/watch?v=uCsD3ZGzMgE

This game is a modern-day equivalent of the famous Josephus problem.

Josephus and 39 of his comrades held out against the Romans in a cave

They decided that they would rather die than be slaves to the Romans

They arranged themselves in a circle. One man was designated as number one, and proceeding clockwise they killed every seventh man.

Josephus instantly figured out where he ought to sit in order to be the last to go.

When the time came, instead of killing himself, he joined the Roman side.

You can find many different versions of this story. Some count every third man and some allow

the last man to escape on a horse. In any case, the idea is the same.

Click on Insert > new slide to choose your cover, then delete this page.

Simulation: Hot Potato

To simulate the circle, we will use a queue (see Figure 3.14).

Assume that the child holding the potato will be at the front of the queue.

Upon passing the potato, the simulation will simply dequeue and then immediately enqueue that child, putting them at the end of the line.

They will then wait until all the others have been at the front before it will be her turn again.

After num dequeue/enqueue operations, the child at the front will be removed permanently and another cycle will begin.

This process will continue until only one name remains (the size of the queue is 1).

Click on Insert > new slide to choose your cover, then delete this page.

Pseudo code

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 4/._Stacks and Queues.pptx

Assignment 2_Resources/Week 4/HIT220 Wk 4 Activity Singly Linked List.docx

HIT 220 Week 4 Activity Singly Linked List

Implement a singly linked list for primitive integers

a) Using the implementation details below as a guide, write and compile a program to accept and store (unordered) integers as a singly linked list. Do not use any library classes.

Modify to your own coding style if you like but your list class must include at least methods:

· Add or delete from either the head or tail. (4 methods)

· Find a given integer (e.g. method boolean isInList) (1 method)

· Delete a given integer from the list. (1 method)

b) Test your program for robustness.

If using pseudo code you will need to consider odd cases, etc

You will need a main method but hardcoding of input and your method calls is fine for the current purpose.

c) Extend your program to include iterative methods:

· To insert a node at the ith position in the list

· to delete the ith node

d) Write up a summary of the big O complexity of each algorithm in your program. In each case, write a sentence to explain the complexity. (You do not need to give calculations or formal proof here.)

https://www.geeksforgeeks.org/linked-list-set-1-introduction/

__MACOSX/Assignment 2_Resources/Week 4/._HIT220 Wk 4 Activity Singly Linked List.docx

Assignment 2_Resources/Week 4/Part 2.pptx

Week 4: Hash and Matrices

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

Simulation: Hot Potato

To simulate the circle, we will use a queue (see Figure 3.14).

Assume that the child holding the potato will be at the front of the queue.

Upon passing the potato, the simulation will simply dequeue and then immediately enqueue that child, putting them at the end of the line.

They will then wait until all the others have been at the front before it will be her turn again.

After num dequeue/enqueue operations, the child at the front will be removed permanently and another cycle will begin.

This process will continue until only one name remains (the size of the queue is 1).

Click on Insert > new slide to choose your cover, then delete this page.

Iteration

At each cycle, kth person removed, leaving n-1. So the (n-1)th starting state is now k-1 moved from the original starting point

Therefore

F(n,k)=(F(n-1,k)+k-1) mod n +1

F(1,k)=1

Click on Insert > new slide to choose your cover, then delete this page.

Pseudo code

def josephus(n, k):

if (n == 1):

return 1

else:

# The position returned by josephus(n - 1, k) is adjusted

# because the recursive call josephus(n - 1, k) considers

# the original position k%n + 1 as position 1

return (josephus(n - 1, k) + k-1) % n + 1

Click on Insert > new slide to choose your cover, then delete this page.

Heap types

Binary heap

Binomial heap

Fibonacci heap

Click on Insert > new slide to choose your cover, then delete this page.

Binary heap

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci heap

Fibonacci Heap is a collection of trees with min-heap or max-heap property.

Joined by a circular doubly linked list for accessing min value

Click on Insert > new slide to choose your cover, then delete this page.

Fibonacci heap

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction/

Click on Insert > new slide to choose your cover, then delete this page.

Complexity comparison

Find Min: Θ(1) [Same as both Binary and Binomial]

Delete Min: O(Log n) [Θ(Log n) in both Binary and Binomial]

Insert: Θ(1) [Θ(Log n) in Binary and Θ(1) in Binomial]

Decrease-Key: Θ(1) [Θ(Log n) in both Binary and Binomial]

Merge: Θ(1) [Θ(m Log n) or Θ(m+n) in Binary and Θ(Log n) in Binomial]

https://www.geeksforgeeks.org/fibonacci-heap-set-1-introduction

Click on Insert > new slide to choose your cover, then delete this page.

Skipped list

Click on Insert > new slide to choose your cover, then delete this page.

Example

1, 5, 7, 9, 13, 15, 17, 19, 25

Condition 1: Every third odd position node goes to second level and points to the position two node ahead.

Condition 2: Every 4th position node goes to 3rd level and points to the position four node ahead.

Find the sequence of searching of the following numbers: 13,1 and 20.

Click on Insert > new slide to choose your cover, then delete this page.

Matrices and vectors

We have been using linear arrays and lists

Why do we use 2D arrays and how

Click on Insert > new slide to choose your cover, then delete this page.

Arrays as coordinates

Vectors are lists

In physics the numbers represent direction in space and length

eg [1 2] is a vector from zero to (1, 2) ie 1 on x axis and 2 on y axis

In computer science we can put any data in a vector and consider it a measure of the object that data is describing

Eg [1 2] is the size measure of a box 1m by 2m. We can compare it to another box [3 4]

https://www.youtube.com/watch?v=fNk_zzaMoSs

Click on Insert > new slide to choose your cover, then delete this page.

Matrices

Matrices can be a combination of such data points, or vectors

Eg an array of vectors is a matrix

[1 2]

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking Matrix

Eigenvectors

Eigenvalues

Click on Insert > new slide to choose your cover, then delete this page.

Page Ranking

From Tim Chartier and Anne Greenbaum

Adjacency matrix

Click on Insert > new slide to choose your cover, then delete this page.

Ranking matrix

Form a matrix M where element mij gives the probability of a surfer visit webpage j from webpage i, which implies

Click on Insert > new slide to choose your cover, then delete this page.

Stochastic matric or transition matrix

Start a walk at page 1, represented by vector v

Click on Insert > new slide to choose your cover, then delete this page.

Calculating probability of each page

v2 = v1M = (0 0.33 0.33 0 0.33 0)

v3 = v2M = (0.67 0 0.17 0 0 0.17)

v3 = vM3

The adjacency matrix is sparse

Note all page ranks add to 1

V becomes an eigenvector of M

Click on Insert > new slide to choose your cover, then delete this page.

__MACOSX/Assignment 2_Resources/Week 4/._Part 2.pptx

Assignment 2_Resources/Week 4/HIT220 Wk 4 Activity Answers Linked List.docx

HIT 220 Week 4 Activity Singly Linked List

Implement a singly linked list for primitive integers

a) Using the implementation details below as a guide, write and compile a program to accept and store (unordered) integers as a list using a singly linked list. Do not use any library classes.

https://www.geeksforgeeks.org/linked-list-set-1-introduction/

Look up lists in Python, what are they, how are they used.

Create Node

CreateNode(data, node)

Allocate data

Allocate next

Return node

Create Linked Listhead (integers)

Node=Create first node

Head = node

Repeat for each data_value in integers:

Node2=Create new_node (data_value,null)

Node2.next=node

Node=node2

Head=node

Return node

Create linked listtail (integers)

Node=Create first node

Head = node

Repeat for each data_value in integers:

node2=Create_node(data,null)

node.next=node2

node=node2

Add_to_head (data)

Allocate head to node.next

Return node

Add_to_tail (data)

Create new_node(data)

Search_list_tail() #where tail.next=null

Tail.next =node

Search_list_tail()

While node.next not equal null

Node=node.next

Return node

Delete-from-head()

head =head.next;

Modify to your own coding style if you like but your list class must include at least methods:

· Add or delete in a list (append, extend, or insert).

· Find a given element in the list : index() (e.g. method boolean isInList)

· Delete a given element from the list.

b) Test your program for robustness.

If using pseudo code you will need to consider odd cases, etc

You will need a main method but hardcoding of input and your method calls is fine for the current purpose.

c) Extend your program to include iterative methods:

· To insert a node at the ith position in the list

· to delete the ith node

d) Write up a summary of the big O complexity of each algorithm in your program. In each case, write a sentence to explain the complexity. (You do not need to give calculations or formal proof here.)

__MACOSX/Assignment 2_Resources/Week 4/._HIT220 Wk 4 Activity Answers Linked List.docx

Assignment 2_Resources/Week 5/.DS_Store

__MACOSX/Assignment 2_Resources/Week 5/._.DS_Store

Assignment 2_Resources/Week 5/Binary trees.pptx

Week 5: Trees and Binary Trees

Cat Kutay

Click on Insert > new slide to choose your cover, then delete this page.

Visualising tools

https://pythontutor.com/

https://visualgo.net/en

Click on Insert > new slide to choose your cover, then delete this page.

Trees

Now that we have studied linear data structures like stacks and queues and have some experience with recursion, we will look at a common data structure called the tree.

Trees are used in many areas of computer science, including operating systems, graphics, database systems, and computer networking.

Trees represent data in the real world. Like a family tree where you have a parents whose children are below them in the tree and then you've got the children of those children even further down the tree.

Trees are like trees, where the continual subdivision of trunk, branch, stems to more and more details functions in the photosynthesis process, communicating down the tree via sap

Click on Insert > new slide to choose your cover, then delete this page.

Decision Trees

A decision tree is a map of the possible outcomes of a series of related choices. It allows an individual or organization to weigh possible actions against one another based on their costs, probabilities, and benefits.

They can be used either to drive informal discussion or to map out an algorithm that predicts the best choice mathematically.

Click on Insert > new slide to choose your cover, then delete this page.

Expression Trees: Syntax Tree in Compiler

if you want evaluate the expression 45 / (3 +6), you actually represent it really cleanly as a tree and be able to evaluate it much more easily.

Click on Insert > new slide to choose your cover, then delete this page.

File Systems

File systems are perfect examples of trees. Where if someone wants to know that a path is root, then users, admin, then steve, we know it based on the tree structure.

Click on Insert > new slide to choose your cover, then delete this page.

Auto corrector and spell checker :

Click on Insert > new slide to choose your cover, then delete this page.

Next Move in games:

Click on Insert > new slide to choose your cover, then delete this page.

There's lots of trees in computer science. There's regular trees, binary trees, heaps, there’s binary search trees, there's Huffman trees, there's AVL trees, randomized search trees, red black trees, tons of these, all throughout computer science.

Why are there so many trees? They're a dynamic data structure (easy to add and remove). They have the benefits of both an ordered array and a linked list as search is as quick as in a sorted array and insertion or deletion operations (excluding the search) are as fast as in linked list.

Unlike Array and Linked List, which are linear data structures, tree is hierarchical (or non-linear) data structure.

The more powerful feature to trees is that their structure conveys information. So the fact that users is a child of root tells me that that a path exists.

Click on Insert > new slide to choose your cover, then delete this page.

Advantages of trees

Trees reflect structural relationships in the data

Trees are used to represent hierarchies

Trees are efficient for insertion and searching

Trees are a very flexible data type, allowing to move subtrees around with minimum effort

Click on Insert > new slide to choose your cover, then delete this page.

Vocabulary and Definitions

A tree is a data type that consists of nodes and arcs

These trees are depicted upside down with the root at the top and the leaves (terminal nodes) at the bottom

The root is a node that has no parent; it can have only child nodes

Leaves have no children (their children are null)

Path − the sequence of nodes along the edges of a tree.

The number of arcs in a path is called the length of the path.

The level of a node is the length of the

path from the root to the node plus 1

(which is the number of nodes in the path)

The height of a nonempty tree is the

maximum level of a node in the tree

Click on Insert > new slide to choose your cover, then delete this page.

Examples

Figure 6-1 Examples of trees

Click on Insert > new slide to choose your cover, then delete this page.

Figure 6-2 Hierarchical structure of a university shown as a tree

Click on Insert > new slide to choose your cover, then delete this page.

An ordered tree is where all elements are stored according to some predetermined criterion of ordering

Figure 6-3 Transforming (a) a linked list into (b) a tree

Click on Insert > new slide to choose your cover, then delete this page.

Binary Trees

Binary Tree is a special data structure used for data storage purposes.

A binary tree has a special condition that each node can have a maximum of two children (possibly empty), and each child is designated as either a left child or a right child

Figure 6-4 Examples of binary trees

Click on Insert > new slide to choose your cover, then delete this page.

In a complete binary tree, all nonterminal nodes have both their children, and all leaves are at the same level

Max no. of nodes at level i = 2i

There are at most 2i nodes at level i

Max no. of nodes at a tree with height h

= 20 + 21+ …. + 2h

= 2h+1 -1

20 nodes

21 nodes

22 nodes

23 nodes

Click on Insert > new slide to choose your cover, then delete this page.

In a perfect binary tree, the maximum number of

nodes in a binary tree with height h

What is the max no. of nodes of 4 levels?

What will be height of a perfect binary tree with

N nodes?

= 2h+1 -1

= 23+1 -1 = 15

n = 2h+1 -1

2h+1 = n+1

h + 1 = log2 (n+1)

h = log2 (n+1) – 1

= log2 16 – 1

= 4 – 1 = 3

Height of complete binary tree

log2 (n) = 3.90689 = 3

Minimum height of a tree with n nodes

log2 (n)

O (h) = log2 (n)

Maximum height

n-1

= o(n)

Time taken for an operation is proportional to height of the tree or (n)

Worst case

Click on Insert > new slide to choose your cover, then delete this page.

The height of a tree determines the worst case complexity of all the usual operations.

Click on Insert > new slide to choose your cover, then delete this page.

Implementing Binary Trees

Binary trees can be implemented in at least two ways:

As arrays

As linked structures

To implement a tree as an array, a node is declared as an object with an information field and two “reference/pointers” fields

These reference fields contain the index of the array cells in which left and right children are stored, if there are any.

Figure 6-7 Array representation of the tree in Figure 6.6c

Figure 6-6 Examples of binary search trees

Click on Insert > new slide to choose your cover, then delete this page.

Implementation

• BinaryTree() creates a new instance of a binary tree.

• get_left_child() returns the binary tree corresponding to the left child of the current node.

• get_right_child() returns the binary tree corresponding to the right child of the current node.

• set_root_val(val) stores the object in parameter val in the current node.

• get_root_val() returns the object stored in the current node.

• insert_left(val) creates a new binary tree and installs it as the left child of the current node.

• insert_right(val) creates a new binary tree and installs it as the right child of the current node.

Click on Insert > new slide to choose your cover, then delete this page.

Complexity

Click on Insert > new slide to choose your cover, then delete this page.

List of Lists Representation

In a list of lists tree, we will store the value of the root node as the first element of the list.

The second element of the list will itself be a list that represents the left subtree

The third element of the list will be another list that represents the right subtree.

Click on Insert > new slide to choose your cover, then delete this page.

Binary Tree using nodes

The tree itself only needs a root node, kind of just like a linked list only needs a head. A linked list can also have a tail, but for a binary tree, we only want the root node.

Each node in the tree needs:

A value

A parent (optional)

A left child

A right child

Click on Insert > new slide to choose your cover, then delete this page.

Example tree structure only, not representative of code sample

Click on Insert > new slide to choose your cover, then delete this page.

Complexity

Click on Insert > new slide to choose your cover, then delete this page.

What is the minimum number of external nodes for a proper binary tree with height h ? Justify your answer.

What is the maximum number of external nodes for a proper binary tree with height h ? Justify your answer.

Let T be a proper binary tree with height h and n nodes. Show that log ( n + 1 ) − 1 ≤ h ≤ ( n − 1 ) / 2 . d. For which values of n and h can the above lower and upper bounds on h be attained with equality?

Click on Insert > new slide to choose your cover, then delete this page.

Binary Search Tree (BST)

Binary Search Tree is a binary tree in which every node contains only smaller values in its left subtree and only larger values in its right subtree.

In this type of tree, the left subtree of every node contains nodes with smaller values and right subtree of every node contains larger values.

Click on Insert > new slide to choose your cover, then delete this page.

Search Operation in BST

The search operation is performed with O(log n) time complexity.

Click on Insert > new slide to choose your cover, then delete this page.

Searching in the BST: example

success

failure

Click on Insert > new slide to choose your cover, then delete this page.

Insert node in BST

Click on Insert > new slide to choose your cover, then delete this page.

Section 6.5 and Figure 6.23.

Insert node in BST: example

null

the insertion operation is performed with O(log n) time complexity

Click on Insert > new slide to choose your cover, then delete this page.

HIT220 Week 6 - Trees Part 1

Click on Insert > new slide to choose your cover, then delete this page.

BST Node Deletion

Deleting a node from Binary search tree has following three cases...

Case 1: Deleting a Leaf node (A node with no children)

Case 2: Deleting a node with one child

Case 3: Deleting a node with two children

Case 1: Deleting a leaf node

Step 1: Find the node to be deleted using search operation

Step 2: Delete the node using del function (If it is a leaf) and terminate the function.

the deletion operation is performed with O(log n) time complexity.

Figure 6-26 Deleting a leaf

Click on Insert > new slide to choose your cover, then delete this page.

Case 2: Deleting a node with one child

Step 1: Find the node to be deleted using search operation

Step 2: If it has only one child, then create a link between its parent and child nodes.

Step 3: Delete the node using del function and terminate the function.

Figure 6-27 Deleting a node with one child

Click on Insert > new slide to choose your cover, then delete this page.

Making one tree out of the two subtrees of the node and then attaching it to the node’s parent is called deleting by merging

Deleting a node with two children:

Step 1: Find the node to be deleted using search operation

Step 2: If it has two children, then find the largest node in its left subtree (OR) the smallest node in its right subtree.

Step 3: Swap both deleting node and node which found in above step.

Step 4: Then, check whether deleting node came to case 1 or case 2 else goto steps 2

Step 5: If it comes to case 1, then delete using case 1 logic.

Step 6: If it comes to case 2, then delete using case 2 logic.

Step 7: Repeat the same process until node is deleted from the tree.

Click on Insert > new slide to choose your cover, then delete this page.

delete 30

delete 20

HIT220 Week 6 - Trees Part 1

Click on Insert > new slide to choose your cover, then delete this page.

Traversals

A traversal is a process that visits all the nodes in the tree. Since a tree is a nonlinear data structure, there is no unique traversal. We will consider several traversal algorithms which we group in the following two kinds

depth-first traversal

breadth-first traversal

Click on Insert > new slide to choose your cover, then delete this page.

depth-first traversals (DFS)

There are three different types of depth-first traversals, :

PreOrder traversal - visit the parent first and then left and right children;

InOrder traversal - visit the left child, then the parent and the right child;

PostOrder traversal - visit left child, then the right child and then the parent;

Click on Insert > new slide to choose your cover, then delete this page.

breadth-first traversal (BST)

This traversal visits nodes by levels from top to bottom and from left to right.

Click on Insert > new slide to choose your cover, then delete this page.

Dictionary

Another format for data is the dictionary

It is unordered, changeable and indexed

used often in adjacency maps – see code week I graph representation for searching

Dictionary is written by placing sequence of elements within curly {} braces, separated by ‘comma’.

Dictionary holds a pair of values, one being the Key and the other corresponding pair element being its Key:value.

Values in a dictionary can be of any datatype and can be duplicated

Keys can’t be repeated and must be immutable.

Click on Insert > new slide to choose your cover, then delete this page.

Tries

efficient information reTrieval data structure,

search complexity with Trie is linear in terms of word (or key) length to be searched = O(kl) where k is maximum key length

binary search tree, a well balanced BST will need time proportional to M * log N, where M is maximum string length and N is number of keys in tree.

Click on Insert > new slide to choose your cover, then delete this page.

Inserting a Trie

Create node

Create 26 sub nodes (linked to null)

Add node to parent at relevant key

O(ALPHABET_SIZE * M * N)

Click on Insert > new slide to choose your cover, then delete this page.

Hashing also provides word search in O(n) time on average.

Trie have no collisions (like hashing) so worst case time complexity is O(n).

The most important thing is Prefix Search. With Trie

We can find all words beginning with a prefix but not with hashing

Tries require a lot of extra space. Tries are also known as radix tree or prefix tree

Click on Insert > new slide to choose your cover, then delete this page.

Hash Tables



451-229-0004

981-101-0002

025-612-0001

Click on Insert > new slide to choose your cover, then delete this page.

Dictionaries

9/6/21 2:32 PM

Maps

Map is a collection of elements where each element is stored as a Key, value pair.

Map object can hold both objects and primitive values as either key or value.

When we iterate over the map object it returns the key,value pair in the same order as inserted cf array

Click on Insert > new slide to choose your cover, then delete this page.

Notion of a Map

Intuitively, a map M supports the abstraction of using keys as indices with a syntax such as M[k].

As a mental warm-up, consider a restricted setting in which a map with n items uses keys that are known to be integers in a range from 0 to N − 1, for some N ≥ n.

Click on Insert > new slide to choose your cover, then delete this page.

Sorted maps

You can insert map data in a naturally sorted order or sorted by a comparator

Implemented by TreeMaps

Click on Insert > new slide to choose your cover, then delete this page.

TreeMap, HashMap and LinkedHashMap

All offer a key->value map and a way to iterate through the keys

HashMap offers 0(1) lookup and insertion. The ordering of the keys is arbitrary. Implemented by an array of linked lists.

LinkedHashMap offers 0(1) lookup and insertion. Keys are ordered by their insertion order. Implemented by doubly-linked buckets.

TreeMap offers O(log N) lookup and insertion. Keys are ordered, so if you need to iterate through the keys in sorted order.

Click on Insert > new slide to choose your cover, then delete this page.

Hashing General Kinds of Keys

But what should we do if our keys are not integers in the range from 0 to N – 1?

Use a hash function to map general keys to corresponding indices in a table.

For instance, the last four digits of a Social Security number.



451-229-0004

981-101-0002

025-612-0001

…

Click on Insert > new slide to choose your cover, then delete this page.

Hash Functions and Hash Tables (hash-based map)

A hash function h maps keys of a given type to integers in a fixed interval [0, N - 1]

Example: h(x) = x mod N is a hash function for integer keys

The integer h(x) is called the hash value of key x

A hash table for a given key type consists of

Hash function h

Array (called table) of size N

When implementing a map with a hash table, the goal is to store item (k, o) at index i = h(k)

Click on Insert > new slide to choose your cover, then delete this page.

SSN Example

We design a hash table for a map storing entries as (SSN, Name), where SSN (social security number) is a nine-digit positive integer

Our hash table uses an array of size N = 10,000 and the hash function h(x) = last four digits of x



9997

9998

9999

…

451-229-0004

981-101-0002

200-751-9998

025-612-0001

Click on Insert > new slide to choose your cover, then delete this page.

Hash Functions

A hash function is usually specified as the composition of two functions:

Hash code: h1: keys  integers

Compression function: h2: integers  [0, N - 1]

The hash code is applied first, and the compression function is applied next on the result, i.e., h(x) = h2(h1(x))

The goal of the hash function is to “disperse” the keys in an apparently random way

Click on Insert > new slide to choose your cover, then delete this page.

Hash Codes

Memory address:

We reinterpret the memory address of the key object as an integer. Good in general, except for numeric and string keys

Integer cast:

We reinterpret the bits of the key as an integer

Suitable for keys of length less than or equal to the number of bits of the integer

Component sum:

We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows)

Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type

Click on Insert > new slide to choose your cover, then delete this page.

Hash Codes (cont.)

Polynomial accumulation:

We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits) a0 a1 … an-1

We evaluate the polynomial

p(z) = a0 + a1 z + a2 z2 + … … + an-1zn-1

at a fixed value z, ignoring overflows

Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)

Polynomial p(z) can be evaluated in O(n) time using Horner’s rule:

The following polynomials are successively computed, each from the previous one in O(1) time

p0(z) = an-1

pi (z) = an-i-1 + zpi-1(z) (i = 1, 2, …, n -1)

We have p(z) = pn-1(z)

Click on Insert > new slide to choose your cover, then delete this page.

Compression Functions

Division:

h2 (y) = y mod N

The size N of the hash table is usually chosen to be a prime

The reason has to do with number theory

Multiply, Add and Divide (MAD):

h2 (y) = (ay + b) mod N

a and b are nonnegative integers such that a mod N  0

Otherwise, every integer would map to the same value b

Click on Insert > new slide to choose your cover, then delete this page.

			0	1	2	3	4
k	h(k)	d(k)	Probes
18	5	3	5
41	2	1	2
22	9	6	9
44	5	5	5	10
59	7	4	7
32	6	3	6
31	5	4	5	9	0
73	8	4	8

Algorithm and Complexity Assignment (Python)

__MACOSX/._Assignment 2_Resources

Assignment 2_Resources/.DS_Store

__MACOSX/Assignment 2_Resources/._.DS_Store

Assignment 2_Resources/Assignment_2_HIT220 (1).pdf

__MACOSX/Assignment 2_Resources/._Assignment_2_HIT220 (1).pdf

__MACOSX/Assignment 2_Resources/._Week 2

__MACOSX/Assignment 2_Resources/._Week 4

__MACOSX/Assignment 2_Resources/._Week 5

Assignment 2_Resources/Instruction for Assignment 2.docx

__MACOSX/Assignment 2_Resources/._Instruction for Assignment 2.docx

__MACOSX/Assignment 2_Resources/._Week 3

__MACOSX/Assignment 2_Resources/._Week 6

Assignment 2_Resources/Week 2 /.DS_Store

__MACOSX/Assignment 2_Resources/Week 2 /._.DS_Store

Assignment 2_Resources/Week 2 /Week2.pptx

__MACOSX/Assignment 2_Resources/Week 2 /._Week2.pptx

Assignment 2_Resources/Week 2 /Week 2 workshop.pptx

Chart1

Sheet1

__MACOSX/Assignment 2_Resources/Week 2 /._Week 2 workshop.pptx

Assignment 2_Resources/Week 4/Priority Queues.pptx

__MACOSX/Assignment 2_Resources/Week 4/._Priority Queues.pptx

Assignment 2_Resources/Week 4/.DS_Store

__MACOSX/Assignment 2_Resources/Week 4/._.DS_Store

Assignment 2_Resources/Week 4/Stacks and Queues.pptx

__MACOSX/Assignment 2_Resources/Week 4/._Stacks and Queues.pptx

Assignment 2_Resources/Week 4/HIT220 Wk 4 Activity Singly Linked List.docx

__MACOSX/Assignment 2_Resources/Week 4/._HIT220 Wk 4 Activity Singly Linked List.docx

Assignment 2_Resources/Week 4/Part 2.pptx

__MACOSX/Assignment 2_Resources/Week 4/._Part 2.pptx

Assignment 2_Resources/Week 4/HIT220 Wk 4 Activity Answers Linked List.docx

__MACOSX/Assignment 2_Resources/Week 4/._HIT220 Wk 4 Activity Answers Linked List.docx

Assignment 2_Resources/Week 5/.DS_Store

__MACOSX/Assignment 2_Resources/Week 5/._.DS_Store

Assignment 2_Resources/Week 5/Binary trees.pptx

Sheet1

__MACOSX/Assignment 2_Resources/Week 5/._Binary trees.pptx

Assignment 2_Resources/Week 3 /Week3-linkedlists.pptx

__MACOSX/Assignment 2_Resources/Week 3 /._Week3-linkedlists.pptx

Assignment 2_Resources/Week 3 /.DS_Store

__MACOSX/Assignment 2_Resources/Week 3 /._.DS_Store

Assignment 2_Resources/Week 3 /Week3 part 2.pptx

__MACOSX/Assignment 2_Resources/Week 3 /._Week3 part 2.pptx

Assignment 2_Resources/Week 6/.DS_Store

__MACOSX/Assignment 2_Resources/Week 6/._.DS_Store

Assignment 2_Resources/Week 6/trees.pptx

__MACOSX/Assignment 2_Resources/Week 6/._trees.pptx