Algorithm design problem set
LE/EECS3101
Design and Analysis of Algorithms
Divide and Conquer II
Karim Jahed
Recall - MergeSort
fun mergeSort(A[1..n], l, h):
    if l >= h:
        return
    end

    mid = (l + h) / 2
    mergeSort(A, l, mid)
    mergeSort(A, mid + 1, h)
    merge(A, l, mid, h)

T(n) = 2T(n/2) + n
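The pseudocode above can be sketched in runnable Python. This is only an illustration under some assumptions: indices are 0-based rather than 1-based, and the merge helper, which the slides do not show, is written here as one plausible linear-time version.

```python
def merge(a, lo, mid, hi):
    """Merge the sorted runs a[lo..mid] and a[mid+1..hi] in place (assumed helper)."""
    left = a[lo:mid + 1]
    right = a[mid + 1:hi + 1]
    i = j = 0
    k = lo
    # Repeatedly take the smaller front element of the two runs.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    # At most one of the two runs still has elements; copy them over.
    while i < len(left):
        a[k] = left[i]
        i += 1
        k += 1
    while j < len(right):
        a[k] = right[j]
        j += 1
        k += 1


def merge_sort(a, lo, hi):
    """Sort a[lo..hi] in place, mirroring the mergeSort pseudocode."""
    if lo >= hi:
        return
    mid = (lo + hi) // 2
    merge_sort(a, lo, mid)
    merge_sort(a, mid + 1, hi)
    merge(a, lo, mid, hi)
```

A call such as merge_sort(data, 0, len(data) - 1) sorts the whole list. The two recursive calls and the linear merge correspond to the terms 2T(n/2) and n in the recurrence.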
Recursion Tree - MergeSort
Forward Substitution
If we have a good guess we can try to prove it by induction. We will prove that, for T(n) = 2T(n/2) + n with T(1) = 1 and n a power of 2,
for n ≥ 1, T(n) = n log n + n
(all logarithms are base 2).
• Base case: for n = 1, T(1) = 1 log 1 + 1 = 1.
• Induction hypothesis: assume T(m) = m log m + m for all m < n.
• Inductive step:
T(n) = 2T(n/2) + n
     = 2((n/2) log(n/2) + n/2) + n
     = n log(n/2) + 2n
     = n(log n − log 2) + 2n
     = n log n − n + 2n
     = n log n + n
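As a sanity check on the induction, the recurrence can be evaluated directly and compared with the closed form. The sketch below assumes T(1) = 1 and restricts n to powers of two so that n/2 is always an integer; the function name T is chosen only to match the notation above.

```python
def T(n):
    """Evaluate T(n) = 2T(n/2) + n with T(1) = 1, for n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n


# For n = 2^k, log2(n) = k, so the closed form n log n + n equals n*k + n.
for k in range(11):
    n = 2 ** k
    assert T(n) == n * k + n
```

The assertions pass for every tested n, which agrees with the proof but of course does not replace it.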
Divide and Conquer - Maximum Subarray Sum
• Given an array A[1..n], we would like to find the largest sum of any contiguous subarray of A.
• E.g., for A = [5, −10, 3, 3, −4, 7] the contiguous subarray with the largest sum is [3, 3, −4, 7], whose sum is 9.
• The naive solution builds every contiguous subarray starting from each element of A, keeping a running sum, and runs in O(n^2).
• Can we do better?
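The naive approach mentioned above could look like the following sketch. The function name naive_mss is made up for illustration, indices are 0-based, and the array is assumed to be non-empty.

```python
def naive_mss(a):
    """Try every start index and extend the subarray one element at a time."""
    best = a[0]
    for start in range(len(a)):
        running = 0
        for end in range(start, len(a)):
            running += a[end]          # sum of a[start..end]
            best = max(best, running)
    return best


print(naive_mss([5, -10, 3, 3, -4, 7]))  # prints 9
```

The two nested loops visit each (start, end) pair once, which gives the quadratic running time.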
Divide and Conquer - Maximum Subarray Sum (2)
• What is the smallest instance that we know how to trivially solve?
• If n = 1 then the array has one element. The contiguous subarray which has the largest sum is the entire array.
• How can we divide the problem into smaller subproblems?
Divide and Conquer - Maximum Subarray Sum (2)
• Case 1: the maximum subarray is in the left part
• Case 2: the maximum subarray is in the right part
• Case 3: the maximum subarray crosses the midpoint
Divide and Conquer - Maximum Subarray Sum (3)
fun mss(A[1..n], l, h):
    if l == h:
        return A[l]
    end

    mid = (l + h) / 2
    leftSum = mss(A, l, mid)
    rightSum = mss(A, mid + 1, h)
    crossSum = crossingSum(A, l, mid, h)
    return max(leftSum, rightSum, crossSum)
What is the running time of mss?
Runtime Analysis - CrossingSum
fun crossingSum(A[1..n], l, m, h):
    leftSum = −∞
    sum = 0
    for i = m down to l:
        sum = sum + A[i]
        if sum > leftSum:
            leftSum = sum

    rightSum = −∞
    sum = 0
    for i = m + 1 to h:
        sum = sum + A[i]
        if sum > rightSum:
            rightSum = sum

    return leftSum + rightSum

crossingSum visits each element of A[l..h] exactly once, so it is a linear time procedure, i.e., Θ(n) for a range of n elements.
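Putting mss and crossingSum together, a runnable version might look like the sketch below. It assumes 0-based indices and a non-empty range, and it uses Python's float("-inf") in place of −∞; otherwise the names follow the pseudocode.

```python
def crossing_sum(a, lo, mid, hi):
    """Best sum of a subarray that contains both a[mid] and a[mid + 1]."""
    left_best = float("-inf")
    running = 0
    for i in range(mid, lo - 1, -1):   # walk left from mid down to lo
        running += a[i]
        left_best = max(left_best, running)

    right_best = float("-inf")
    running = 0
    for i in range(mid + 1, hi + 1):   # walk right from mid + 1 up to hi
        running += a[i]
        right_best = max(right_best, running)

    return left_best + right_best


def mss(a, lo, hi):
    """Maximum subarray sum of a[lo..hi] by divide and conquer."""
    if lo == hi:
        return a[lo]
    mid = (lo + hi) // 2
    left_sum = mss(a, lo, mid)
    right_sum = mss(a, mid + 1, hi)
    cross_sum = crossing_sum(a, lo, mid, hi)
    return max(left_sum, right_sum, cross_sum)


print(mss([5, -10, 3, 3, -4, 7], 0, 5))  # prints 9
```

On the example array the answer comes from the crossing case: the best left part ending at the midpoint is [3] and the best right part starting after it is [3, −4, 7].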
Runtime Analysis - Maximum Subarray Sum
fun mss(A[1..n], l, h):
    if l == h:
        return A[l]
    end

    mid = (l + h) / 2
    leftSum = mss(A, l, mid)
    rightSum = mss(A, mid + 1, h)
    crossSum = crossingSum(A, l, mid, h)
    return max(leftSum, rightSum, crossSum)

The two recursive calls each work on half of the range, and crossingSum takes linear time:

T(n) = Θ(1)               if n = 1
T(n) = 2T(n/2) + Θ(n)     if n > 1

This is the same recurrence as MergeSort. Therefore, T(n) ∈ Θ(n log n).
Divide and Conquer - Matrix Multiplication
Given two n×n matrices X and Y, we would like to compute their product matrix Z.
• Divide: We split X and Y into four (n/2)×(n/2) blocks each, which gives 8 multiplication subproblems of size n/2:

Z = [ A  B ] · [ E  F ] = [ (A·E + B·G)  (A·F + B·H) ]
    [ C  D ]   [ G  H ]   [ (C·E + D·G)  (C·F + D·H) ]

• Conquer: Solve each of the 8 block multiplications recursively.
• Combine: Add the appropriate pairs of products to form the four blocks of Z.
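The divide, conquer, and combine steps above can be sketched as follows. This is a minimal illustration, assuming square matrices stored as lists of rows whose size n is a power of two; the helper names split and add are made up for the example.

```python
def add(X, Y):
    """Entry-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split M into its four quadrants: top-left, top-right, bottom-left, bottom-right."""
    h = len(M) // 2
    top_left = [row[:h] for row in M[:h]]
    top_right = [row[h:] for row in M[:h]]
    bottom_left = [row[:h] for row in M[h:]]
    bottom_right = [row[h:] for row in M[h:]]
    return top_left, top_right, bottom_left, bottom_right


def multiply(X, Y):
    """Multiply two n x n matrices with 8 recursive products of size n/2."""
    n = len(X)
    if n == 1:
        return [[X[0][0] * Y[0][0]]]
    A, B, C, D = split(X)
    E, F, G, H = split(Y)
    z11 = add(multiply(A, E), multiply(B, G))
    z12 = add(multiply(A, F), multiply(B, H))
    z21 = add(multiply(C, E), multiply(D, G))
    z22 = add(multiply(C, F), multiply(D, H))
    # Stitch the four quadrants back into one matrix.
    top = [left + right for left, right in zip(z11, z12)]
    bottom = [left + right for left, right in zip(z21, z22)]
    return top + bottom


print(multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # prints [[19, 22], [43, 50]]
```

Each call makes 8 recursive calls on half-size blocks, and the splitting, adding, and stitching touch every entry a constant number of times, which is the quadratic divide-and-combine cost.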
Divide and Conquer - Matrix Multiplication
The running time is given by the recurrence relation:
T(n) = Θ(1)                 if n = 1
T(n) = 8T(n/2) + Θ(n^2)     if n > 1

Notice the similarities with the running time of the recurrences we have seen so far:
• findMax: T(n) = 2T(n/2) + 1
• mergeSort: T(n) = 2T(n/2) + n
• mss: T(n) = 2T(n/2) + n
Divide and Conquer - General recurrence form
Almost all divide and conquer algorithms have a recurrence relation of
the form
T(n) = aT(n/b) + f(n)
where a and b are constants.
What do a, b, and f(n) represent?
Divide and Conquer - General recurrence form (2)
[Figure: recursion tree for T(n) = aT(n/b) + f(n)]
Divide and Conquer - General recurrence form (3)
• a: number of subproblems.
• b: factor by which the size of each subproblem is decreased.
• f(n): cost to divide and combine a subproblem of size n.
• T(n): the sum of all node labels in the recursion tree (i.e., all the f(n/b^i), where level i has a^i nodes).
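The "sum of all labels" view can be made concrete with a short sketch that adds up the cost level by level. It assumes n is a power of b and that each base case of size 1 costs f(1); the function name tree_cost is hypothetical.

```python
def tree_cost(n, a, b, f):
    """Sum the labels of the recursion tree for T(n) = a*T(n/b) + f(n).

    Level i holds a**i nodes, each labeled f(n / b**i).
    Assumes n is a power of b and that a leaf of size 1 costs f(1).
    """
    total = 0
    nodes = 1      # number of nodes on the current level
    size = n       # subproblem size on the current level
    while True:
        total += nodes * f(size)
        if size <= 1:
            return total
        nodes *= a
        size //= b


# MergeSort-style recurrence: a = 2, b = 2, f(n) = n.
print(tree_cost(8, 2, 2, lambda m: m))  # prints 32, i.e. 8*log2(8) + 8
```

For f(n) = n every level contributes the same amount n, which is why the total is n times the number of levels.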
The Master’s Theorem - Intuition
• We can define the total cost of a divide and conquer algorithm as:
  Total cost = cost to solve the base cases + cost to divide and combine.
• There are a^(log_b n) = n^(log_b a) base cases in a tree of height log_b n.
• The three cases of the Master’s theorem relate to how much the cost of solving the base cases dominates the total cost.
• Case 1: The total cost is dominated by the cost of solving the base cases, i.e., T(n) = Θ(n^(log_b a)).
• Case 2: The total cost is evenly distributed across all levels of the tree, i.e., T(n) = Θ(n^(log_b a) log n).
• Case 3: The total cost is dominated by the work done at the root, i.e., T(n) = Θ(f(n)).
The Master’s Theorem
The Master’s theorem is useful in solving recurrences of the form
T(n) = aT(n/b) + f(n)
There are three cases:
1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) log n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a·f(n/b) ≤ c·f(n) for some constant c < 1, then T(n) = Θ(f(n)).
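For the common special case f(n) = Θ(n^d), the three cases reduce to comparing d with log_b a. The sketch below covers only that polynomial case (an assumption, not the general theorem); the function name master_theorem and its return format are made up for illustration. For polynomial f with d > log_b a, the extra condition of case 3 holds automatically with c = a / b^d < 1.

```python
import math


def master_theorem(a, b, d):
    """Classify T(n) = a*T(n/b) + Θ(n^d); returns (case number, bound as text)."""
    critical = math.log(a, b)              # the exponent log_b a
    if math.isclose(d, critical):          # tolerate floating-point rounding
        return 2, f"Θ(n^{d:g} log n)"
    if d < critical:
        return 1, f"Θ(n^{critical:g})"
    return 3, f"Θ(n^{d:g})"


for name, (a, b, d) in {
    "mergeSort": (2, 2, 1),
    "findMax": (2, 2, 0),
    "matrix multiplication": (8, 2, 2),
}.items():
    case, bound = master_theorem(a, b, d)
    print(f"{name}: case {case}, T(n) = {bound}")
```

The comparison uses math.isclose because log_b a is computed in floating point and may be off by a rounding error even when the exact value is an integer.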
The Master’s Theorem - Merge Sort
Recurrence relation for Merge Sort:
T(n) = 2T(n/2) + n
• a = 2, b = 2, f(n) = n.
• n^(log_b a) = n^(log_2 2) = n.
• Since f(n) = Θ(n^(log_b a)), case 2 applies and T(n) = Θ(n log n).
The Master’s Theorem - Find Max
Recurrence relation for Find Max:
T(n) = 2T(n/2) + 1
• a = 2, b = 2, f(n) = 1.
• n^(log_b a) = n^(log_2 2) = n.
• Since f(n) = 1 = O(n^(log_b a − ε)) for ε = 1, case 1 applies and T(n) = Θ(n).
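The slides give only the recurrence for Find Max, so the following is an assumed sketch of a divide and conquer procedure that would have this recurrence: two recursive calls on halves plus a constant-time comparison. Indices are 0-based and the range is assumed non-empty.

```python
def find_max(a, lo, hi):
    """Largest element of a[lo..hi] by divide and conquer (assumed procedure)."""
    if lo == hi:
        return a[lo]
    mid = (lo + hi) // 2
    left_max = find_max(a, lo, mid)        # T(n/2)
    right_max = find_max(a, mid + 1, hi)   # T(n/2)
    return max(left_max, right_max)        # constant-time combine


print(find_max([5, -10, 3, 3, -4, 7], 0, 5))  # prints 7
```

The Θ(n) bound matches the intuition that every element must be examined once; here the work is concentrated in the n base cases.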
The Master’s Theorem - Matrix Multiplication
Recurrence relation for Matrix Multiplication using divide and conquer:
T(n) = 8T(n/2) + n^2
• a = 8, b = 2, f(n) = n^2.
• n^(log_b a) = n^(log_2 8) = n^3.
• Since f(n) = n^2 = O(n^(log_b a − ε)) for ε = 1, case 1 applies and T(n) = Θ(n^3).