COSC 2 Dis

Gareth Beckham

AB-JAVA2-Chapter23.pdf


Chapter 23: Sorting

Dr. Adriana Badulescu

Objectives

▪ To study and analyze time complexity of various sorting algorithms (§§23.2–23.7).

▪ To design, implement, and analyze insertion sort (§23.2).

▪ To design, implement, and analyze bubble sort (§23.3).

▪ To design, implement, and analyze merge sort (§23.4).

▪ To design, implement, and analyze quick sort (§23.5).

▪ To design and implement a binary heap (§23.6).

▪ To design, implement, and analyze heap sort (§23.7).

▪ To design, implement, and analyze bucket sort and radix sort (§23.8).

▪ To design, implement, and analyze external sort for files that have a large amount of data (§23.9).

Why Study Sorting?

▪ Sorting is a classic subject in computer science. There are three reasons for studying sorting algorithms.

▪ Sorting algorithms illustrate many creative approaches to problem solving, and these approaches can be applied to solve other problems.

▪ Sorting algorithms are good for practicing fundamental programming techniques using selection statements, loops, methods, and arrays.

▪ Sorting algorithms are excellent examples to demonstrate algorithm performance.

What Data To Sort?

▪ The data to be sorted might be integers, doubles, characters, or objects.

▪ The Java API contains several overloaded sort methods for sorting primitive type values and objects in the java.util.Arrays and java.util.Collections classes.

▪ For simplicity, this section assumes:

  ▪ data to be sorted are integers,
  ▪ data are sorted in ascending order, and
  ▪ data are stored in an array.

▪ The programs can be easily modified to sort other types of data, to sort in descending order, or to sort data in an ArrayList or a LinkedList.
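As a small illustration of the Java API sort methods mentioned above, the following sketch sorts an int array with java.util.Arrays.sort and a list of strings in descending order with java.util.Collections.sort. The class and method names (ApiSortDemo, sortedCopy, sortedDescending) are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; only the java.util calls come from the Java API.
class ApiSortDemo {
    /** Returns a copy of values sorted in ascending order. */
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy); // sorts the copy in place, ascending
        return copy;
    }

    /** Returns a new list holding the given strings in descending order. */
    static List<String> sortedDescending(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy, Collections.reverseOrder());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new int[] {2, 9, 5, 4, 8, 1})));
        System.out.println(sortedDescending(Arrays.asList("pear", "apple", "fig")));
    }
}
```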

Selection Sort

▪ Selection sort finds the smallest number in the list and places it first. It then finds the smallest number remaining and places it second, and so on until the remaining list contains only a single number.
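The description above can be sketched in Java as follows. This is one possible implementation under the section's assumptions (an int array, ascending order); the class name SelectionSortSketch is hypothetical.

```java
// A minimal selection sort sketch; names are illustrative.
class SelectionSortSketch {
    static void selectionSort(int[] list) {
        for (int i = 0; i < list.length - 1; i++) {
            // Find the index of the smallest element in list[i..length-1]
            int minIndex = i;
            for (int j = i + 1; j < list.length; j++) {
                if (list[j] < list[minIndex]) {
                    minIndex = j;
                }
            }
            // Place it at position i
            if (minIndex != i) {
                int temp = list[i];
                list[i] = list[minIndex];
                list[minIndex] = temp;
            }
        }
    }
}
```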


Insertion Sort

▪ The insertion sort algorithm sorts a list of values by repeatedly inserting an unsorted element into a sorted sublist until the whole list is sorted.

Insertion Sort: inserting 4 into the sorted sublist 2 5 9

Step 1: Save 4 to a temporary variable currentElement.
        list: 2 5 9 4

Step 2: Move list[2] to list[3].
        list: 2 5 _ 9

Step 3: Move list[1] to list[2].
        list: 2 _ 5 9

Step 4: Assign currentElement to list[1].
        list: 2 4 5 9
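The four steps above (save the element, shift larger elements right, then assign) can be sketched as a loop over the unsorted part of the array. This is a minimal sketch that reuses the name currentElement from the figure; the class name InsertionSortSketch is an assumption.

```java
// A minimal insertion sort sketch; class name is illustrative.
class InsertionSortSketch {
    static void insertionSort(int[] list) {
        for (int i = 1; i < list.length; i++) {
            // Step 1: save the element to insert into the sorted sublist list[0..i-1]
            int currentElement = list[i];
            int k = i - 1;
            // Steps 2-3: move each larger element one position to the right
            while (k >= 0 && list[k] > currentElement) {
                list[k + 1] = list[k];
                k--;
            }
            // Step 4: assign the saved element to the freed position
            list[k + 1] = currentElement;
        }
    }
}
```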


Bubble Sort

Starting list: 2 9 5 4 8 1

(a) 1st pass: 2 5 9 4 8 1 -> 2 5 4 9 8 1 -> 2 5 4 8 9 1 -> 2 5 4 8 1 9

(b) 2nd pass: 2 4 5 8 1 9 -> 2 4 5 8 1 9 -> 2 4 5 1 8 9

(c) 3rd pass: 2 4 5 1 8 9 -> 2 4 1 5 8 9

(d) 4th pass: 2 1 4 5 8 9

(e) 5th pass: 1 2 4 5 8 9

https://liveexample.pearsoncmg.com/dsanimation/BubbleSortNeweBook.html

Bubble Sort

Bubble sort time: O(n^2) in the worst case, since the passes make (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2 comparisons.
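The passes traced above can be sketched in Java as follows. This version assumes a common refinement: it stops early when a pass makes no swap, because the list is then already sorted. The names BubbleSortSketch and needNextPass are illustrative.

```java
// A minimal bubble sort sketch with early termination; names are illustrative.
class BubbleSortSketch {
    static void bubbleSort(int[] list) {
        boolean needNextPass = true;
        // Pass k compares neighbors in list[0..length-k]
        for (int k = 1; k < list.length && needNextPass; k++) {
            needNextPass = false; // stays false if this pass makes no swap
            for (int i = 0; i < list.length - k; i++) {
                if (list[i] > list[i + 1]) {
                    int temp = list[i];
                    list[i] = list[i + 1];
                    list[i + 1] = temp;
                    needNextPass = true;
                }
            }
        }
    }
}
```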

Merge Sort

2 9 5 4 8 1 6 7

split (divide)

2 9 5 4 | 8 1 6 7

split

2 9 | 5 4 | 8 1 | 6 7

split

2 | 9 | 5 | 4 | 8 | 1 | 6 | 7

merge (conquer)

2 9 | 4 5 | 1 8 | 6 7

merge

2 4 5 9 | 1 6 7 8

merge

1 2 4 5 6 7 8 9

Merge Sort

mergeSort(list):
  if list has more than one element:
    firstHalf = mergeSort(first half of list);
    secondHalf = mergeSort(second half of list);
    list = merge(firstHalf, secondHalf);


Merge Two Sorted Lists

list1: 2 4 5 9 (scanned with index current1)

list2: 1 6 7 8 (scanned with index current2)

temp: receives the merged elements (filled with index current3)

(a) After moving 1 to temp: temp = 1

(b) After moving all the elements in list2 to temp: temp = 1 2 4 5 6 7 8

(c) After moving 9 to temp: temp = 1 2 4 5 6 7 8 9

https://liveexample.pearsoncmg.com/dsanimation/MergeSortNew.html
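The recursive outline and the merge step can be put together in a short Java sketch. It reuses the names from the slides (firstHalf, secondHalf, current1, current2, current3, temp). The class name MergeSortSketch is an assumption, as is the choice to copy the halves into new arrays and merge back into the original list.

```java
import java.util.Arrays;

// A minimal merge sort sketch; class name and array-copy strategy are illustrative.
class MergeSortSketch {
    static void mergeSort(int[] list) {
        if (list.length > 1) {
            // Divide: copy and sort each half
            int[] firstHalf = Arrays.copyOfRange(list, 0, list.length / 2);
            mergeSort(firstHalf);
            int[] secondHalf = Arrays.copyOfRange(list, list.length / 2, list.length);
            mergeSort(secondHalf);
            // Conquer: merge the sorted halves back into list
            merge(firstHalf, secondHalf, list);
        }
    }

    /** Merges sorted list1 and list2 into temp (temp.length == list1.length + list2.length). */
    static void merge(int[] list1, int[] list2, int[] temp) {
        int current1 = 0; // index in list1
        int current2 = 0; // index in list2
        int current3 = 0; // index in temp
        while (current1 < list1.length && current2 < list2.length) {
            if (list1[current1] <= list2[current2]) {
                temp[current3++] = list1[current1++];
            } else {
                temp[current3++] = list2[current2++];
            }
        }
        // Copy whatever remains in either list
        while (current1 < list1.length) {
            temp[current3++] = list1[current1++];
        }
        while (current2 < list2.length) {
            temp[current3++] = list2[current2++];
        }
    }
}
```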

Merge Sort Time

▪ Let T(n) denote the time required for sorting an array of n elements using merge sort.

▪ Without loss of generality, assume n is a power of 2.

▪ The merge sort algorithm splits the array into two subarrays, sorts the subarrays using the same algorithm recursively, and then merges the subarrays.

$$T(n) = T\left(\frac{n}{2}\right) + T\left(\frac{n}{2}\right) + \text{mergetime} = T\left(\frac{n}{2}\right) + T\left(\frac{n}{2}\right) + O(n)$$

Merge Sort Time

▪ The first T(n/2) is the time for sorting the first half of the array and the second T(n/2) is the time for sorting the second half.

▪ To merge two subarrays, it takes at most n-1 comparisons to compare the elements from the two subarrays and n moves to move elements to the temporary array.

▪ So, the total time for merging is 2n-1. Therefore,

$$
\begin{aligned}
T(n) &= 2T\left(\frac{n}{2}\right) + 2n - 1 = 2\left(2T\left(\frac{n}{4}\right) + 2\cdot\frac{n}{2} - 1\right) + 2n - 1 \\
&= 2^2 T\left(\frac{n}{2^2}\right) + 2n - 2 + 2n - 1 \\
&= 2^k T\left(\frac{n}{2^k}\right) + 2n - 2^{k-1} + \cdots + 2n - 2 + 2n - 1 \\
&= 2^{\log n} T\left(\frac{n}{2^{\log n}}\right) + 2n - 2^{\log n - 1} + \cdots + 2n - 2 + 2n - 1 \\
&= n + 2n\log n - 2^{\log n} + 1 = 2n\log n + 1 = O(n \log n)
\end{aligned}
$$
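As a numerical sanity check on the derivation, the sketch below evaluates the recurrence T(n) = 2T(n/2) + 2n − 1 directly and compares it with the closed form 2n log n + 1 for powers of 2. It assumes T(1) = 1, which is the base case implied by the term n in the last line of the derivation, and a base-2 logarithm. The class and method names are hypothetical.

```java
// Hypothetical helper for checking the merge sort recurrence; assumes T(1) = 1.
class MergeSortRecurrence {
    /** Evaluates T(n) = 2T(n/2) + 2n - 1 for n a power of 2. */
    static long t(long n) {
        return n == 1 ? 1 : 2 * t(n / 2) + 2 * n - 1;
    }

    /** Evaluates the closed form 2n log n + 1, with log taken base 2. */
    static long closedForm(long n) {
        int log2 = 63 - Long.numberOfLeadingZeros(n); // exact for powers of 2
        return 2 * n * log2 + 1;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println(n + ": " + t(n) + " vs " + closedForm(n));
        }
    }
}
```

For example, T(8) = 2 · T(4) + 15 = 2 · 17 + 15 = 49, which equals 2 · 8 · 3 + 1.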

Quick Sort

▪ Quick sort, developed by C. A. R. Hoare (1962), works as follows: The algorithm selects an element, called the pivot, in the array.

▪ Divide the array into two parts such that all the elements in the first part are less than or equal to the pivot and all the elements in the second part are greater than the pivot.

▪ Recursively apply the quick sort algorithm to the first part and then the second part.

Quick Sort

(a) The original array (pivot 5): 5 2 9 3 8 4 0 1 6 7

(b) The original array is partitioned: 4 2 1 3 0 5 8 9 6 7

(c) The partial array (4 2 1 3 0) is partitioned: 0 2 1 3 4

(d) The partial array (0 2 1 3) is partitioned: 0 2 1 3

(e) The partial array (2 1 3) is partitioned: 1 2 3

Partition

(a) Initialize pivot, low, and high: 5 2 9 3 8 4 0 1 6 7

(b) Search forward and backward: 5 2 9 3 8 4 0 1 6 7

(c) 9 is swapped with 1: 5 2 1 3 8 4 0 9 6 7

(d) Continue search: 5 2 1 3 8 4 0 9 6 7

(e) 8 is swapped with 0: 5 2 1 3 0 4 8 9 6 7

(f) When high < low, search is over: 5 2 1 3 0 4 8 9 6 7

(g) Pivot is in the right place: 4 2 1 3 0 5 8 9 6 7

The index of the pivot is returned.

https://liveexample.pearsoncmg.com/dsanimation/QuickSortNeweBook.html
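The partition steps traced above can be sketched in Java as follows. This sketch assumes the first element of the range is chosen as the pivot, as in the trace, and uses the low and high indices from the figure. The class name QuickSortSketch and the exact loop structure are one possible way to write it.

```java
// A minimal quick sort sketch using the first element as the pivot; names are illustrative.
class QuickSortSketch {
    static void quickSort(int[] list) {
        quickSort(list, 0, list.length - 1);
    }

    private static void quickSort(int[] list, int first, int last) {
        if (last > first) {
            int pivotIndex = partition(list, first, last);
            quickSort(list, first, pivotIndex - 1);
            quickSort(list, pivotIndex + 1, last);
        }
    }

    /** Partitions list[first..last] around list[first] and returns the pivot's final index. */
    static int partition(int[] list, int first, int last) {
        int pivot = list[first];
        int low = first + 1; // index for forward search
        int high = last;     // index for backward search
        while (high > low) {
            // Search forward for an element greater than the pivot
            while (low <= high && list[low] <= pivot) low++;
            // Search backward for an element less than or equal to the pivot
            while (low <= high && list[high] > pivot) high--;
            if (high > low) {
                int temp = list[high];
                list[high] = list[low];
                list[low] = temp;
            }
        }
        while (high > first && list[high] >= pivot) high--;
        // Swap the pivot into its final position, if needed
        if (pivot > list[high]) {
            list[first] = list[high];
            list[high] = pivot;
            return high;
        }
        return first;
    }
}
```

On the array 5 2 9 3 8 4 0 1 6 7, partition rearranges the elements to 4 2 1 3 0 5 8 9 6 7 and returns index 5, matching step (g).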


Quick Sort Time

▪ To partition an array of n elements, it takes n-1 comparisons and n moves in the worst case. So, the time required for partition is O(n).

Worst-Case Time

▪ In the worst case, the pivot divides the array each time into one big subarray with the other empty.

▪ The size of the big subarray is one less than the size of the array before the division. The algorithm requires O(n^2) time:

$$(n - 1) + (n - 2) + \cdots + 2 + 1 = O(n^2)$$

Best-Case Time

▪ In the best case, the pivot divides the array each time into two parts of about the same size.

▪ Let T(n) denote the time required for sorting an array of n elements using quick sort.

▪ So,

$$T(n) = T\left(\frac{n}{2}\right) + T\left(\frac{n}{2}\right) + n = O(n \log n)$$

Average-Case Time

▪ On average, the pivot divides the array into neither two parts of exactly the same size nor one empty part.

▪ Statistically, the sizes of the two parts are very close.

▪ So the average time is O(n log n).

▪ The exact average-case analysis is beyond the scope of this book.

Heap

▪ Heap is a useful data structure for designing efficient sorting algorithms and priority queues.

▪ A heap is a binary tree with the following properties:

▪ It is a complete binary tree.

▪ Each node is greater than or equal to any of its children.

Complete Binary Tree

▪ A binary tree is complete if every level of the tree is full, except that the last level may not be full and all the leaves on the last level are placed left-most.

▪ For example, in the following figure, the binary trees in (a) and (b) are complete, but the binary trees in (c) and (d) are not complete.

▪ Further, the binary tree in (a) is a heap, but the binary tree in (b) is not a heap, because the root (39) is less than its right child (42).

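[Figure: four binary trees, (a) through (d). Trees (a) and (b) are complete; trees (c) and (d) are not. Tree (a) is a heap; tree (b), with root 39 and right child 42, is not.]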

See How a Heap Works

https://liveexample.pearsoncmg.com/dsanimation/HeapeBook.html

Representing a Heap

▪ For a node at position i, its left child is at position 2i+1, its right child is at position 2i+2, and its parent is at position (i-1)/2, using integer division.

▪ For example, the node for element 39 is at position 4, so its left child (element 14) is at 9 (2*4+1), its right child (element 33) is at 10 (2*4+2), and its parent (element 42) is at 1 ((4-1)/2).

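The heap as a tree, level by level from the root:

62
42 59
32 39 44 13
22 29 14 33 30 17 9

The same heap stored in an array:

index: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
value:  62  42  59  32  39  44  13  22  29  14   33   30   17    9

For the node 39 at [4]: parent at [1], left child at [9], right child at [10].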
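The index formulas can be checked with a few lines of Java. This is a minimal sketch over a plain int array; the class HeapIndexSketch and its method names are hypothetical. The isHeap helper tests only the ordering property, since an array with no gaps is already a complete binary tree.

```java
// Hypothetical helpers for the array representation of a heap.
class HeapIndexSketch {
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    static int parent(int i)     { return (i - 1) / 2; } // integer division; the root (i = 0) has no parent

    /** Returns true if every node is greater than or equal to its children. */
    static boolean isHeap(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[parent(i)] < a[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] heap = {62, 42, 59, 32, 39, 44, 13, 22, 29, 14, 33, 30, 17, 9};
        int i = 4; // position of element 39
        System.out.println("left child:  " + heap[leftChild(i)]);
        System.out.println("right child: " + heap[rightChild(i)]);
        System.out.println("parent:      " + heap[parent(i)]);
        System.out.println("is heap:     " + isHeap(heap));
    }
}
```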

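Adding Elements to the Heap

Adding 3, 5, 1, 19, 11, and 22 to a heap, initially empty. Each heap is shown level by level, root first, with levels separated by "/".

(a) After adding 3: 3

(b) After adding 5: 5 / 3

(c) After adding 1: 5 / 3 1

(d) After adding 19: 19 / 5 1 / 3

(e) After adding 11: 19 / 11 1 / 3 5

(f) After adding 22: 22 / 11 19 / 3 5 1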

Rebuild the heap after adding a new node

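Adding 88 to the heap. Each heap is shown level by level, root first, with levels separated by "/".

(a) Add 88 to a heap: 22 / 11 19 / 3 5 1 88

(b) After swapping 88 with 19: 22 / 11 88 / 3 5 1 19

(c) After swapping 88 with 22: 88 / 11 22 / 3 5 1 19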
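The rebuilding step can be sketched as follows: append the new element at the end, then repeatedly swap it with its parent while it is larger. This is a minimal sketch assuming an ArrayList of integers as the backing store; the class MaxHeapSketch and its methods add and toList are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal max-heap sketch supporting only add; names are illustrative.
class MaxHeapSketch {
    private final List<Integer> list = new ArrayList<>();

    /** Adds a value and restores the heap property by moving it up. */
    void add(int value) {
        list.add(value); // append as the last leaf
        int current = list.size() - 1;
        while (current > 0) {
            int parent = (current - 1) / 2;
            if (list.get(current) > list.get(parent)) {
                // Swap the new element with its smaller parent
                int temp = list.get(current);
                list.set(current, list.get(parent));
                list.set(parent, temp);
                current = parent;
            } else {
                break; // the heap property holds again
            }
        }
    }

    /** Returns a copy of the heap contents in array (level) order. */
    List<Integer> toList() {
        return new ArrayList<>(list);
    }
}
```

Adding 3, 5, 1, 19, 11, and 22 in that order yields the level order 22 11 19 3 5 1, and adding 88 afterwards yields 88 11 22 3 5 1 19, in agreement with the two traces.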