Algorithm design problem set


LE/EECS3101: Design and Analysis of Algorithms

Weekly Digest

Weeks 2-3: Runtime Analysis & Divide and Conquer

1 Case Analysis vs Asymptotic Bounds

There is often confusion between 'case analysis' (best/worst case) and 'asymptotic bounds' (Big-O, Omega, Theta). Big-O doesn't necessarily mean worst case. We often use Big-O as an upper bound on the worst case, but we could also use it as an upper bound on the best case. Big-O and Omega are defined in terms of a function (say, g(n)) that bounds the running time (say, f(n)). O(g(n)) bounds f(n) from above: for all sufficiently large n, f(n) is at most some constant multiple of g(n). Ω(g(n)) bounds f(n) from below: for all sufficiently large n, f(n) is at least some positive constant multiple of g(n). Now f(n), the running time, can be that of the worst case, best case, or even average case. Often, we are interested in placing an upper bound on the worst case, which is why Big-O is often associated with the worst case.

E.g., in Insertion Sort, the running time of the worst case is in O(n^2) (bounded from above by g(n) = n^2). In the worst case, the running time is also in Ω(n^2) (bounded from below by g(n) = n^2). Since the running time of the worst case is bounded from above and below by g(n) = n^2, we say that we have a tight bound, i.e., in the worst case, Insertion Sort is in Θ(n^2).

Similarly, the running time of Insertion Sort in the best case (the array is already sorted) is in O(n) (also in O(n^2), O(n^3), O(2^n), ...). The running time of the best case is also in Ω(n). Therefore, the running time of Insertion Sort in the best case is in Θ(n).

Again, we are often interested in the worst case. So when we are talking about the complexity or running time of an algorithm we are referring to the worst-case unless otherwise stated.
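The best-case and worst-case bounds for Insertion Sort can be checked empirically. The following is a minimal sketch, not part of the course material: the function name `insertion_sort_comparisons` is hypothetical, and it counts only key comparisons as a stand-in for running time.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of `values` with Insertion Sort; return the number of key comparisons."""
    a = list(values)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1      # one comparison of key against a[i]
            if a[i] <= key:
                break             # key is already in its place
            a[i + 1] = a[i]       # shift the larger element one slot right
            i -= 1
        a[i + 1] = key
    return comparisons


n = 100
best = insertion_sort_comparisons(range(n))          # already sorted input
worst = insertion_sort_comparisons(range(n, 0, -1))  # reverse-sorted input
print(best, worst)
```

On sorted input the count is n - 1, which grows linearly; on reverse-sorted input it is n(n - 1)/2, which grows quadratically. The same algorithm therefore has a different tight bound for each case.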


2 Runtime Analysis

We can compute the 'exact' running time of an algorithm by summing the total cost of each line of code. The cost of a line of code is the number of times that line is executed multiplied by some constant cost c. The result of the summation is a function of the input size n; for simple loop-based algorithms it is a polynomial in n.

Often, we are interested in the asymptotic running time, that is, the running time as the input grows larger and larger. As the input size approaches infinity, the contribution of the lower-order terms and constants becomes minuscule, and the majority of the running time is taken by the highest-order term. Therefore, we often ignore the constants and lower-order terms and take the highest-order term as an upper bound.
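To see why the lower-order terms stop mattering, the sketch below takes a hypothetical exact cost f(n) = 3n^2 + 10n + 5 (the coefficients are made up purely for illustration) and reports how much of it comes from the highest-order term.

```python
def exact_cost(n):
    """Hypothetical exact running time 3n^2 + 10n + 5 (illustrative constants only)."""
    return 3 * n**2 + 10 * n + 5


def leading_term_share(n):
    """Fraction of the exact cost contributed by the highest-order term 3n^2."""
    return 3 * n**2 / exact_cost(n)


for n in (10, 100, 1_000, 10_000):
    print(n, round(leading_term_share(n), 4))
```

The share climbs toward 1 as n grows, which is the sense in which f(n) is in Θ(n^2) even though its exact form has three terms.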

We can compute the running time and derive an asymptotic upper, lower, or tight bound for the best, worst, and average cases. Although the average case is the most informative, it is pretty hard to define, since it depends on the distribution of inputs. An upper bound on the worst case guarantees that, for sufficiently large inputs, the running time on any input never exceeds the bound by more than a constant factor.

3 Recurrences

Runtime analysis becomes more complicated when we add recursion into the mix. The cost of a recursive call is unknown. It is, in fact, the thing we are trying to compute! We can, however, write the running time in terms of itself. We call such expressions recurrence relations. Once we determine a recurrence relation, there are three main ways we can try to solve it:

• Guess and check: If we have a good guess, we can prove our guess is correct by induction over the size of the input n. Let T(n) = f(n) be our guess. The inductive step assumes T(m) = f(m) holds for all m < n and proves that T(n) = f(n). We can arrive at a guess by examining the recursion tree or by forward substitution (trying out the first few terms, i.e., n = 1, n = 2, etc.).

• Backward substitution: Keep substituting the recurrence into itself until a pattern emerges. Write the pattern in terms of the number of substitutions i. Compute the value of i, call it k, at which the recursive term becomes the base case T(1). Finally, replace the recursive term with T(1) and i with k.

• The Master Method: The Master method is useful in solving recurrences of the form T(n) = aT(n/b) + f(n). This is often the case for divide and conquer algorithms, where a is the number of subproblems, b is the factor by which the size of each subproblem is reduced, and f(n) is the time to divide and combine.


The recursion tree for a recurrence of the form T(n) = aT(n/b) + f(n) has log_b n levels and a branching factor of a, i.e., it has a^i nodes at level i. The total number of base cases in such a tree is n^(log_b a). The Master theorem defines three cases based on how much of the total computation time is taken up by solving the base cases:

– Case 1: Time to solve the base cases is polynomially larger than the time to divide and combine; in other words, the time to divide and combine f(n) is bounded from above by a function polynomially smaller than the time to solve the base cases: f(n) = O(n^(log_b a - ε)) for some constant ε > 0. In this case T(n) = Θ(n^(log_b a)).

– Case 2: Time to solve the base cases is roughly the same as the time to divide and combine; in other words, the time to divide and combine f(n) is bounded from above and below by the time to solve the base cases: f(n) = Θ(n^(log_b a)). In this case T(n) = Θ(n^(log_b a) log n).

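– Case 3: Time to solve the base cases is polynomially smaller than the time to divide and combine; in other words, the time to divide and combine f(n) is bounded from below by a function polynomially larger than the time to solve the base cases: f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0. The theorem also requires the regularity condition a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n. In this case T(n) = Θ(f(n)).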
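When the divide-and-combine cost is a plain polynomial, f(n) = Θ(n^d), choosing a case reduces to comparing d with log_b a. The sketch below covers only that special case; the function name `master_bound` and its string output format are assumptions made for illustration. For a polynomial f(n) with d > log_b a, the regularity condition of Case 3 holds automatically, so it is not checked.

```python
import math


def master_bound(a, b, d):
    """Asymptotic solution of T(n) = a*T(n/b) + Theta(n^d) for a >= 1, b > 1, d >= 0."""
    if a < 1 or b <= 1 or d < 0:
        raise ValueError("need a >= 1, b > 1, d >= 0")
    critical = math.log(a, b)               # log_b a: exponent of the base-case work
    if math.isclose(d, critical):
        return f"Theta(n^{d:g} log n)"      # Case 2: the two costs are balanced
    if d < critical:
        return f"Theta(n^{critical:.3g})"   # Case 1: base cases dominate
    return f"Theta(n^{d:g})"                # Case 3: divide and combine dominates


print(master_bound(2, 2, 1))   # two half-size subproblems, linear combine
print(master_bound(3, 2, 1))   # three half-size subproblems, linear combine
print(master_bound(2, 2, 2))   # two half-size subproblems, quadratic combine
```

The three calls land in Case 2, Case 1, and Case 3 respectively. Recurrences whose f(n) is not a simple polynomial (for example n log n) fall outside this sketch and need the general statement of the theorem.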

4 Divide and Conquer

A divide and conquer algorithm solves a large problem by recursively dividing it into smaller, more manageable subproblems, and then combining the solutions of the smaller subproblems to construct a solution to the larger problem. There are four questions to ask when designing a divide and conquer algorithm:

1. What is the input size? E.g., in a sorting algorithm, the input size is the number of elements in the array. For an algorithm that computes the product of two integers, the input size is the number of bits in the larger of the two integers.

2. What is the smallest problem instance that we know how to trivially solve? For sorting, the smallest problem is when n = 1, i.e., the array has one element. In that case, the array is trivially sorted.

3. How to recursively divide the problem to get to the smallest problem?

4. How to combine the results of the smaller subproblems and compute the result of the larger problem? Often questions 3 and 4 are intertwined.
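As an illustration of the four questions, here is a minimal sketch of Merge Sort, chosen here as a familiar example rather than taken from the digest. The comments mark where each question is answered.

```python
def merge_sort(values):
    """Return a new sorted list (Merge Sort, used as an illustrative example)."""
    # Questions 1 and 2: the input size is len(values); size <= 1 is trivially sorted.
    if len(values) <= 1:
        return list(values)
    # Question 3: divide into two halves and solve each one recursively.
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    # Question 4: combine by merging the two sorted halves in linear time.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))
```

With two subproblems of half the size and a linear-time merge, the running time satisfies T(n) = 2T(n/2) + Θ(n), which solves to Θ(n log n).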

