Heap insertion time Data Structures and Algorithams project
IP LOOKUP ON ROUTING TABLES USING SCAPEGOAT TREES [Document subtitle]
ACKNOWLEDGEMENT
We wish to express our sincere thanks to Dr. Ahmad Al Shami Professor of the Department of Computer Science and Engineering, for his encouragement and support that he extends towards our project work.
INTRODUCTION
On the Internet, the communication between machines is performed by packets of information Once the machines transmit their packets in the network, there are the routers that forward the packets on network links to their final destinations. The rapid growth in the number of Internet users results the increase in the size of routing tables and complicate the data packets routing operation. To forward packets, the routers must determine where to send each incoming packet. More exactly, The router uses the destination IP address of each incoming packet and searches the routing information in its forwarding table.
To search the output port to an IP address of an incoming packet, the algorithm uses a self-balancing scapegoat tree with alpha = 2/3 and traverses the tree that represents all IPs in the routing table using the bits of the IP address.Since scapegoat tree does not require any extra storage per node of the tree this can be used on routers where more effective use of available memory is necessary. The balancing idea is to make sure that nodes are α (alpha) size balanced. What this means is that the sizes of left and right subtrees are at most α times (size of node) i.e. α * (size of node). This concept is based on the fact that if a node is α weight balanced, it is also height balanced with height less than log_(1/α) s ize + 1.
Time Complexity in Big O notation (Average Case) : The time complexity for a Scapegoat tree for average case is as shown below
Space - O(n)
Search - O(log n)
Insert - O(log n)
Delete - O(log n)
Time Complexity in Big O notation (Worst Case) : The time complexity for a Scapegoat tree for worst case is as shown below
Space - O(n)
Search - O(log n)
Insert - amortized O(log n)
Delete - amortized O(log n)
In real world, the tree node stores the prefix, the output port (OP), the priority value (PV) of the prefix and the router searches in their forwarding table to find the longest prefix matching the destination IP address of the incoming packet. To do this, the prefixes are compared bit by bit to the IP address of incoming packets and that the routing information associated with the longest of the matching prefixes should be used to forward the packet. For simplicity purpose in the below case study we have included only IPs and the lookup happens to match the exact destination IP of the incoming packet.
PROJECT CODE
import math
import socket
import struct
class ScapeGoatTree:
"""
A ScapeGoat tree implementation.
This implementation handles duplicates similar to a set, hence multiple insertions with the same value will result
in only a single node inserted in the tree.
"""
class Node:
"""
Represents a node in Scapegoat tree.
"""
def __init__(self, val):
self.left = None
self.right = None
self.parent = None
self.val = val
def __init__(self):
self.__root = None
self.__number_of_nodes = 0
self.__max_number_of_nodes = 0
# This is the reciprocal of the 'α' value that determines the weight balance. An α-weight-balanced node
# is defined as meeting a relaxed weight balance criterion such as:
#
# size(left) ≤ α*size(node)
# size(right) ≤ α*size(node)
self.__alpha = 3 / 2
def insert(self, val):
"""
Inserts a node with the provided value into the tree.
:param val: The value to be inserted.
:return: True if successful insertion else False
"""
node = self.Node(val)
depth_inserted = self.__bst_insert(node)
if self.__is_node_unbalanced(depth_inserted, self.__number_of_nodes):
# Finding the scapegoat. Initially we blame it on the parent of inserted node and
# traverse upwards towards the root and evaluate the balance condition until we find the scapegoat.
scape_goat = node.parent
while (3 * self.__get_size_of_node(scape_goat)) <= (2 * self.__get_size_of_node(scape_goat.parent)):
scape_goat = scape_goat.parent
self.__rebuild_tree(scape_goat.parent)
# Updating max number of nodes
self.__max_number_of_nodes = max(self.__max_number_of_nodes, self.__number_of_nodes)
return depth_inserted >= 0
def find(self, val):
"""
Determines whether a node with the provided value is present in the tree or not.
:param val: The value to find
:return: True if found else False
"""
return self.__find_inorder(self.__root, val)
def __find_inorder(self, node, val):
if not node:
return False
if node.val == val:
return True
return self.__find_inorder(node.left, val) or self.__find_inorder(node.right, val)
def delete(self, val):
"""
Deletes the node with the provided value if found.
:param val: The value to delete
:return: True if the value is successfully deleted else False
"""
number_of_nodes_before_deletion = self.__number_of_nodes
self.__root = self.__delete_node_from_bst(self.__root, val)
if (3 * self.__number_of_nodes) <= (2 * self.__max_number_of_nodes):
self.__rebuild_tree(self.__root)
self.__max_number_of_nodes = self.__number_of_nodes
print(number_of_nodes_before_deletion)
return number_of_nodes_before_deletion != self.__number_of_nodes
def __delete_node_from_bst(self, root, val):
if root is None:
return root
if val < root.val:
root.left = self.__delete_node_from_bst(root.left, val)
elif val > root.val:
root.right = self.__delete_node_from_bst(root.right, val)
else:
if not root.left:
temp = root.right
return temp
elif not root.right:
temp = root.left
return temp
temp = self.__minimum_node(root.right)
root.val = temp.val
root.right = self.__delete_node_from_bst(root.right, temp.val)
self.__number_of_nodes -= 1
return root
@staticmethod
def __minimum_node(node):
current = node
while current.left:
current = current.left
return current
def size(self):
"""
Returns the size of the current tree.
:return: The size
"""
return self.__number_of_nodes
def get_preorder_list(self):
"""
Returns the preorder traversal of the scapegoat tree.
:return: List contains the nodes of the tree in preorder traversal
"""
return self.__flatten_tree_into_list(self.__root)
def __is_node_unbalanced(self, height, node_size):
return height > math.log(node_size, self.__alpha)
def __flatten_tree_into_list(self, root):
if not root:
return []
return self.__flatten_tree_into_list(root.left) + [root.val] + self.__flatten_tree_into_list(root.right)
def __rebuild_tree(self, node):
parent_of_node = node.parent
flattened_tree = self.__flatten_tree_into_list(node)
if not parent_of_node:
# This means the node is the root of the tree.
self.__root = self.__build_bst_from_list(flattened_tree)
self.__root.parent = None
elif parent_of_node.left == node:
parent_of_node.left = self.__build_bst_from_list(flattened_tree)
parent_of_node.left.parent = parent_of_node
else:
parent_of_node.right = self.__build_bst_from_list(flattened_tree)
parent_of_node.right.parent = parent_of_node
def __bst_insert(self, node_to_insert):
"""
Inserts the provided node into the underlying tree by performing the standard binary search tree insert
operation.
:param node_to_insert: The node to insert.
:return: The depth at which the node was inserted into the tree.
"""
current_node = self.__root
if not current_node:
# Tree is empty
self.__root = node_to_insert
self.__number_of_nodes += 1
return 0
inserted = False
depth = 0
while not inserted:
if node_to_insert.val > current_node.val:
if not current_node.right:
current_node.right = node_to_insert
node_to_insert.parent = current_node
inserted = True
else:
current_node = current_node.right
elif node_to_insert.val < current_node.val:
if not current_node.left:
current_node.left = node_to_insert
node_to_insert.parent = current_node
inserted = True
else:
current_node = current_node.left
else:
# Duplicate value.
return -1
depth += 1
self.__number_of_nodes += 1
return depth
def __build_bst_from_list(self, node_list):
if len(node_list) == 0:
return None
half_size = len(node_list) // 2
root = self.Node(node_list[half_size])
root.left = self.__build_bst_from_list(node_list[:half_size])
root.right = self.__build_bst_from_list(node_list[half_size + 1:])
if root.left:
root.left.parent = root
if root.right:
root.right.parent = root
return root
def __get_size_of_node(self, node):
if not node:
return 0
return self.__get_size_of_node(node.left) + 1 + self.__get_size_of_node(node.right)
class IPSet:
"""
Class used to maintain a set of Internet Provider addresses.
This implementation uses ScapeGoat tree for its set implementation and so can provide faster insertion and retrieval
times.
"""
def __init__(self):
self.__ip_set = ScapeGoatTree()
def insert_ip(self, ip):
"""
Inserts an IP into IPSet.
:param ip: The IP to insert
:return: True if successful insertion else False
"""
ip_decimal = self.__ip2decimal(ip)
return self.__ip_set.insert(ip_decimal)
def contains(self, ip):
"""
Checks whether the provided IP is present in the IPSet or not
:param ip: The IP to check for.
:return: True if present else False
"""
ip_decimal = self.__ip2decimal(ip)
return self.__ip_set.find(ip_decimal)
def __contains__(self, item):
return self.contains(item)
def delete(self, ip):
"""
Checks whether the provided IP is present in the IPSet or not
:param ip: The IP to delete for.
:return: True if deleted else False
"""
ip_decimal = self.__ip2decimal(ip)
return self.__ip_set.delete(ip_decimal)
@staticmethod
def __ip2decimal(ip):
"""
Converts an IP string to decimal
"""
packed_ip = socket.inet_aton(ip)
return struct.unpack("!L", packed_ip)[0]
def get_storedIPaddress(self) :
list =self.__ip_set.get_preorder_list()
for i, value in enumerate(list):
list[i] = self.__decimal2ip(value)
return list
@staticmethod
def __decimal2ip(number):
"""
Converts an decimal to IP address
"""
return socket.inet_ntoa(struct.pack('!L', number))
if __name__ == "__main__":
ip_set=IPSet()
ip_set.insert_ip("192.168.10.2")
ip_set.insert_ip("192.168.10.5")
ip_set.insert_ip("192.168.10.0")
ip_set.insert_ip("192.168.9.1")
ip_set.insert_ip("192.168.10.10")
ip_set.insert_ip("192.168.10.13")
ip_set.insert_ip("192.168.10.20")
ip_set.insert_ip("192.168.10.15")
ip_set.insert_ip("192.168.10.30")
ip_set.insert_ip("192.168.10.50")
ip_set.insert_ip("192.168.10.70")
print(ip_set.contains("192.168.10.20"))
print(ip_set.delete("192.168.10.20"))
print(ip_set.delete("192.168.10.30"))
print(ip_set.delete("192.168.10.50"))
print(ip_set.delete("192.168.10.70"))
print(ip_set.contains("192.168.10.20"))
print(ip_set.get_storedIPaddress())
Scapegoat Tree Visualization:
192.168.10.2
3232238082
d=0
192.168.10.5
3232238085
192.168.10.0
3232238080
d=1
192.168.10.10
3232238090
192.168.9.1
3232237825
d=2
192.168.10.20
3232238100
d=3
192.168.10.30
3232238110
192.168.10.15
3232238095
d=4
192.168.10.13
3232238093
192.168.10.50
3232238130
d=5
192.168.10.70
3232238150
d=6
If d > log3/2n ,we have to find the scapegoat in order to solve the problem of exceeding height.
To insert value x in a Scapegoat Tree:
· Create a new node u and insert x using the BST insert algorithm.
· If the depth of u is greater than log3/2n where n is number of nodes in tree then we need to make tree balanced. To make balanced, we use below step to find a scapegoat.
· Walk up from u until we reach a node w with size(w) > (2/3)*size(w.parent). This node is scapegoat
· Rebuild the subtree rooted at w.parent
Rebuilding:
In rebuilding, we simply convert the subtree to the most possible balanced BST. We first store inorder traversal of BST in an array, then we build a new BST from array by recursively dividing it into two halves.
COMPARISION OF DIFFERENT BINARY SEARCH TREES
|
METRIC |
SCAPEGOAT TREE |
REDBLACK TREES |
AVL TREES |
SPLAY TREES |
|
Insertion in worst case |
Amortized O(logn) |
O(1) |
O(logn) |
Amortized O(logn) |
|
Search in worst case |
O(log n) |
O(logn), Moderate |
O(logn), Faster |
Amortized O(logn), Slower |
|
Deletion in worst case |
Amortized O(logn) |
O(logn) |
O(logn) |
Amortized O(logn) |
|
Maximum height of tree |
height <= log1/&aplpha;(size) + 1 |
2*log(n) |
1.44*log(n) |
O(n) |
|
Efficient Implementation requires |
Aplha=2/3 |
Three pointers with color bit per node |
Two pointers with balance factor per node |
Only two pointers with no extra information |
|
Mostly used |
For insertion and deletion of large volume of data |
As universal data structure |
When frequent lookups are required |
When same element is retrieved again and again |
|
Real world Application |
Routers, |
Multiset, Multimap, Map, Set, etc. |
Database Transactions |
Cache implementation, Garbage collection Algorithms |