Heap insertion time Data Structures and Algorithams project

profilerahul reddy
finaldsatermproject1.docx

IP LOOKUP ON ROUTING TABLES USING SCAPEGOAT TREES [Document subtitle]

ACKNOWLEDGEMENT

We wish to express our sincere thanks to Dr. Ahmad Al Shami Professor of the Department of Computer Science and Engineering, for his encouragement and support that he extends towards our project work.

INTRODUCTION

On the Internet, the communication between machines is performed by packets of information Once the machines transmit their packets in the network, there are the routers that forward the packets on network links to their final destinations. The rapid growth in the number of Internet users results the increase in the size of routing tables and complicate the data packets routing operation. To forward packets, the routers must determine where to send each incoming packet. More exactly, The router uses the destination IP address of each incoming packet and searches the routing information in its forwarding table.

 

To search the output port to an IP address of an incoming packet, the algorithm uses a self-balancing scapegoat tree with alpha = 2/3 and traverses the tree that represents all IPs in the routing table using the bits of the IP address.Since scapegoat tree does not require any extra storage per node of the tree this can be used on routers where more effective use of available memory is necessary. The balancing idea is to make sure that nodes are α (alpha) size balanced. What this means is that the sizes of left and right subtrees are at most α times (size of node) i.e. α * (size of node). This concept is based on the fact that if a node is α weight balanced, it is also height balanced with height less than log_(1/α) s ize + 1.

Time Complexity in Big O notation (Average Case) : The time complexity for a Scapegoat tree for average case is as shown below

Space - O(n)

Search - O(log n)

Insert - O(log n)

Delete - O(log n)

Time Complexity in Big O notation (Worst Case) : The time complexity for a Scapegoat tree for worst case is as shown below

Space - O(n)

Search - O(log n)

Insert - amortized O(log n)

Delete - amortized O(log n)

 

In real world, the tree node stores the prefix, the output port (OP), the priority value (PV) of the prefix and the router searches in their forwarding table to find the longest prefix matching the destination IP address of the incoming packet. To do this, the prefixes are compared bit by bit to the IP address of incoming packets and that the routing information associated with the longest of the matching prefixes should be used to forward the packet. For simplicity purpose in the below case study we have included only IPs and the lookup happens to match the exact destination IP of the incoming packet.

PROJECT CODE

import math

import socket

import struct

class ScapeGoatTree:

"""

A ScapeGoat tree implementation.

This implementation handles duplicates similar to a set, hence multiple insertions with the same value will result

in only a single node inserted in the tree.

"""

class Node:

"""

Represents a node in Scapegoat tree.

"""

def __init__(self, val):

self.left = None

self.right = None

self.parent = None

self.val = val

def __init__(self):

self.__root = None

self.__number_of_nodes = 0

self.__max_number_of_nodes = 0

# This is the reciprocal of the 'α' value that determines the weight balance. An α-weight-balanced node

# is defined as meeting a relaxed weight balance criterion such as:

#

# size(left) ≤ α*size(node)

# size(right) ≤ α*size(node)

self.__alpha = 3 / 2

def insert(self, val):

"""

Inserts a node with the provided value into the tree.

:param val: The value to be inserted.

:return: True if successful insertion else False

"""

node = self.Node(val)

depth_inserted = self.__bst_insert(node)

if self.__is_node_unbalanced(depth_inserted, self.__number_of_nodes):

# Finding the scapegoat. Initially we blame it on the parent of inserted node and

# traverse upwards towards the root and evaluate the balance condition until we find the scapegoat.

scape_goat = node.parent

while (3 * self.__get_size_of_node(scape_goat)) <= (2 * self.__get_size_of_node(scape_goat.parent)):

scape_goat = scape_goat.parent

self.__rebuild_tree(scape_goat.parent)

# Updating max number of nodes

self.__max_number_of_nodes = max(self.__max_number_of_nodes, self.__number_of_nodes)

return depth_inserted >= 0

def find(self, val):

"""

Determines whether a node with the provided value is present in the tree or not.

:param val: The value to find

:return: True if found else False

"""

return self.__find_inorder(self.__root, val)

def __find_inorder(self, node, val):

if not node:

return False

if node.val == val:

return True

return self.__find_inorder(node.left, val) or self.__find_inorder(node.right, val)

def delete(self, val):

"""

Deletes the node with the provided value if found.

:param val: The value to delete

:return: True if the value is successfully deleted else False

"""

number_of_nodes_before_deletion = self.__number_of_nodes

self.__root = self.__delete_node_from_bst(self.__root, val)

if (3 * self.__number_of_nodes) <= (2 * self.__max_number_of_nodes):

self.__rebuild_tree(self.__root)

self.__max_number_of_nodes = self.__number_of_nodes

print(number_of_nodes_before_deletion)

return number_of_nodes_before_deletion != self.__number_of_nodes

def __delete_node_from_bst(self, root, val):

if root is None:

return root

if val < root.val:

root.left = self.__delete_node_from_bst(root.left, val)

elif val > root.val:

root.right = self.__delete_node_from_bst(root.right, val)

else:

if not root.left:

temp = root.right

return temp

elif not root.right:

temp = root.left

return temp

temp = self.__minimum_node(root.right)

root.val = temp.val

root.right = self.__delete_node_from_bst(root.right, temp.val)

self.__number_of_nodes -= 1

return root

@staticmethod

def __minimum_node(node):

current = node

while current.left:

current = current.left

return current

def size(self):

"""

Returns the size of the current tree.

:return: The size

"""

return self.__number_of_nodes

def get_preorder_list(self):

"""

Returns the preorder traversal of the scapegoat tree.

:return: List contains the nodes of the tree in preorder traversal

"""

return self.__flatten_tree_into_list(self.__root)

def __is_node_unbalanced(self, height, node_size):

return height > math.log(node_size, self.__alpha)

def __flatten_tree_into_list(self, root):

if not root:

return []

return self.__flatten_tree_into_list(root.left) + [root.val] + self.__flatten_tree_into_list(root.right)

def __rebuild_tree(self, node):

parent_of_node = node.parent

flattened_tree = self.__flatten_tree_into_list(node)

if not parent_of_node:

# This means the node is the root of the tree.

self.__root = self.__build_bst_from_list(flattened_tree)

self.__root.parent = None

elif parent_of_node.left == node:

parent_of_node.left = self.__build_bst_from_list(flattened_tree)

parent_of_node.left.parent = parent_of_node

else:

parent_of_node.right = self.__build_bst_from_list(flattened_tree)

parent_of_node.right.parent = parent_of_node

def __bst_insert(self, node_to_insert):

"""

Inserts the provided node into the underlying tree by performing the standard binary search tree insert

operation.

:param node_to_insert: The node to insert.

:return: The depth at which the node was inserted into the tree.

"""

current_node = self.__root

if not current_node:

# Tree is empty

self.__root = node_to_insert

self.__number_of_nodes += 1

return 0

inserted = False

depth = 0

while not inserted:

if node_to_insert.val > current_node.val:

if not current_node.right:

current_node.right = node_to_insert

node_to_insert.parent = current_node

inserted = True

else:

current_node = current_node.right

elif node_to_insert.val < current_node.val:

if not current_node.left:

current_node.left = node_to_insert

node_to_insert.parent = current_node

inserted = True

else:

current_node = current_node.left

else:

# Duplicate value.

return -1

depth += 1

self.__number_of_nodes += 1

return depth

def __build_bst_from_list(self, node_list):

if len(node_list) == 0:

return None

half_size = len(node_list) // 2

root = self.Node(node_list[half_size])

root.left = self.__build_bst_from_list(node_list[:half_size])

root.right = self.__build_bst_from_list(node_list[half_size + 1:])

if root.left:

root.left.parent = root

if root.right:

root.right.parent = root

return root

def __get_size_of_node(self, node):

if not node:

return 0

return self.__get_size_of_node(node.left) + 1 + self.__get_size_of_node(node.right)

class IPSet:

"""

Class used to maintain a set of Internet Provider addresses.

This implementation uses ScapeGoat tree for its set implementation and so can provide faster insertion and retrieval

times.

"""

def __init__(self):

self.__ip_set = ScapeGoatTree()

def insert_ip(self, ip):

"""

Inserts an IP into IPSet.

:param ip: The IP to insert

:return: True if successful insertion else False

"""

ip_decimal = self.__ip2decimal(ip)

return self.__ip_set.insert(ip_decimal)

def contains(self, ip):

"""

Checks whether the provided IP is present in the IPSet or not

:param ip: The IP to check for.

:return: True if present else False

"""

ip_decimal = self.__ip2decimal(ip)

return self.__ip_set.find(ip_decimal)

def __contains__(self, item):

return self.contains(item)

def delete(self, ip):

"""

Checks whether the provided IP is present in the IPSet or not

:param ip: The IP to delete for.

:return: True if deleted else False

"""

ip_decimal = self.__ip2decimal(ip)

return self.__ip_set.delete(ip_decimal)

@staticmethod

def __ip2decimal(ip):

"""

Converts an IP string to decimal

"""

packed_ip = socket.inet_aton(ip)

return struct.unpack("!L", packed_ip)[0]

def get_storedIPaddress(self) :

list =self.__ip_set.get_preorder_list()

for i, value in enumerate(list):

list[i] = self.__decimal2ip(value)

return list

@staticmethod

def __decimal2ip(number):

"""

Converts an decimal to IP address

"""

return socket.inet_ntoa(struct.pack('!L', number))

if __name__ == "__main__":

ip_set=IPSet()

ip_set.insert_ip("192.168.10.2")

ip_set.insert_ip("192.168.10.5")

ip_set.insert_ip("192.168.10.0")

ip_set.insert_ip("192.168.9.1")

ip_set.insert_ip("192.168.10.10")

ip_set.insert_ip("192.168.10.13")

ip_set.insert_ip("192.168.10.20")

ip_set.insert_ip("192.168.10.15")

ip_set.insert_ip("192.168.10.30")

ip_set.insert_ip("192.168.10.50")

ip_set.insert_ip("192.168.10.70")

print(ip_set.contains("192.168.10.20"))

print(ip_set.delete("192.168.10.20"))

print(ip_set.delete("192.168.10.30"))

print(ip_set.delete("192.168.10.50"))

print(ip_set.delete("192.168.10.70"))

print(ip_set.contains("192.168.10.20"))

print(ip_set.get_storedIPaddress())

Scapegoat Tree Visualization:

192.168.10.2

3232238082

d=0

192.168.10.5

3232238085

192.168.10.0

3232238080

d=1

192.168.10.10

3232238090

192.168.9.1

3232237825

d=2

192.168.10.20

3232238100

d=3

192.168.10.30

3232238110

192.168.10.15

3232238095

d=4

192.168.10.13

3232238093

192.168.10.50

3232238130

d=5

192.168.10.70

3232238150

d=6

If d > log3/2n ,we have to find the scapegoat in order to solve the problem of exceeding height.

To insert value x in a Scapegoat Tree:

· Create a new node u and insert x using the  BST insert  algorithm.

· If the depth of is greater than log3/2n where n is number of nodes in tree then we need to make tree balanced. To make balanced, we use below step to find a scapegoat.

· Walk up from until we reach a node with size(w) > (2/3)*size(w.parent). This node is scapegoat

· Rebuild the subtree rooted at w.parent

Rebuilding:

In rebuilding, we simply convert the subtree to the most possible balanced BST. We first store inorder traversal of BST in an array, then we build a new BST from array by recursively dividing it into two halves.

COMPARISION OF DIFFERENT BINARY SEARCH TREES

METRIC

SCAPEGOAT TREE

REDBLACK TREES

AVL TREES

SPLAY TREES

Insertion in worst case

Amortized O(logn)

O(1)

O(logn)

Amortized O(logn)

Search in worst case

O(log n)

O(logn), Moderate

O(logn), Faster

Amortized O(logn), Slower

Deletion in worst case

Amortized O(logn)

O(logn)

O(logn)

Amortized O(logn)

Maximum height of tree

height <= log1/&aplpha;(size) + 1

2*log(n)

1.44*log(n)

O(n)

Efficient Implementation requires

Aplha=2/3

Three pointers with color bit per node

Two pointers with balance factor per node

Only two pointers with no extra information

Mostly used

For insertion and deletion of large volume of data

As universal data structure

When frequent lookups are required

When same element is retrieved again and again

Real world Application

Routers,

Multiset, Multimap, Map, Set, etc.

Database Transactions

Cache implementation, Garbage collection Algorithms