Data structures - graphs and strings

profileroops
fwdwork.zip

__MACOSX/._fwdwork

fwdwork/lorem_ipsum (1).txt

Lorem ipsum dolor sit amet, consectetur adipiscing elit. In dignissim sollicitudin interdum. Praesent est eros, lobortis ac fermentum in, facilisis non tellus. Morbi euismod quis elit in rhoncus. Etiam lorem arcu, imperdiet a libero non, lobortis pretium elit. Suspendisse egestas diam in nunc suscipit, ac convallis lectus commodo. Nullam tincidunt est sed arcu semper gravida id et sapien. Praesent commodo dignissim sagittis. Fusce id tristique sem, vel aliquam justo. Cras at malesuada tellus. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Integer id viverra tortor, vel rhoncus diam. Pellentesque eget accumsan tellus. Integer ac est non turpis tincidunt efficitur. Nam ex augue, feugiat vitae ultrices sed, pharetra quis leo. Ut vitae erat elit. Suspendisse faucibus molestie neque. Proin luctus, lorem eget tempor ornare, nisl est posuere dolor, sit amet varius mauris odio in dolor. Interdum et malesuada fames ac ante ipsum primis in faucibus. Fusce vel suscipit nisi. Cras suscipit ultrices arcu sit amet euismod. Nunc bibendum finibus quam. Fusce non ipsum ut mi tincidunt iaculis. Nunc eget porttitor tortor, et porttitor mi. Morbi varius, dui sit amet dignissim efficitur, eros ipsum volutpat eros, at efficitur metus dolor vitae justo. Mauris rutrum eros dolor, a iaculis diam ullamcorper at. Aenean sit amet felis tempus, dignissim orci ut, fermentum mi. Nulla facilisi. Sed eget turpis nunc. Vestibulum et lacus semper, auctor augue quis, auctor ante. Maecenas euismod, diam non egestas semper, purus orci blandit enim, ultricies vulputate neque tellus ac urna. Curabitur euismod magna finibus, laoreet massa sed, semper lacus. Vestibulum maximus rhoncus sem, at vehicula ante elementum ut. Aenean venenatis in velit at ullamcorper. In sit amet laoreet turpis. Donec leo sem, tempor et pellentesque eu, aliquet nec mi. Sed ut dolor sit amet ex dictum pharetra et et mauris. Nam tincidunt, magna id pulvinar dapibus, dui tellus vulputate felis, et fringilla nisl mi et nibh. Aliquam finibus tempor augue, ut rutrum velit efficitur ac. Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Sed et elementum augue, nec malesuada nulla. Morbi quis libero quis tellus convallis elementum. Nunc pulvinar ex ut velit convallis, et semper nisl egestas. Praesent lorem libero, scelerisque vitae cursus sit amet, maximus nec nisi. Donec rutrum risus eget massa commodo dapibus. Vestibulum pharetra turpis non posuere viverra. Mauris eu dui semper, pulvinar turpis eget, aliquet lorem. In feugiat cursus pretium. Etiam ornare aliquam luctus. Vestibulum nibh magna, maximus sed augue suscipit, egestas cursus orci. Maecenas rutrum congue odio vitae venenatis. Donec mattis pretium odio, sodales semper est ultricies vel. Suspendisse potenti. Vivamus ante neque, egestas et elit a, luctus auctor risus. Vestibulum et arcu eget odio auctor bibendum nec commodo tortor. Nulla aliquet magna tortor, ac ornare nulla eleifend in. In mollis semper nulla a efficitur. Suspendisse potenti. Proin facilisis blandit sapien. Sed fringilla, ipsum ac fermentum scelerisque, felis ligula cursus libero, ac efficitur risus libero et diam. Vivamus lacinia pretium sapien, sed aliquam massa semper a. Curabitur semper neque mi, sit amet sollicitudin tellus pharetra at. Aliquam venenatis sagittis eros, ac ultrices ligula vehicula id. Integer feugiat ultricies sodales. Duis consectetur orci eu risus lobortis, et tempor purus volutpat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Etiam est diam, ornare vel nisi ut, bibendum dapibus eros. Nullam eget placerat turpis. Interdum et malesuada fames ac ante ipsum primis in faucibus. Morbi sit amet accumsan enim, ac facilisis tortor. Maecenas faucibus, metus sed placerat tempor, lacus massa interdum arcu, ut aliquam massa leo a ante. Nulla facilisis magna ut justo lobortis placerat. Cras feugiat venenatis nisi congue pharetra. Maecenas ut lorem id nisi viverra porta sit amet at ipsum. Duis fringilla, justo scelerisque luctus vulputate, nunc erat posuere eros, in semper velit libero quis velit. Ut semper, massa non elementum egestas, tortor massa pharetra mi, a malesuada nisl arcu laoreet eros. Etiam tincidunt, nisi ut suscipit gravida, est ante ullamcorper purus, eget finibus nisl nisl quis justo. Integer semper massa eget nunc pellentesque, in mattis tellus vestibulum. In ut diam placerat, venenatis enim quis, faucibus odio. Sed et dignissim ipsum. Sed lacinia lacinia velit, ac tristique velit pharetra ut. Cras cursus enim mauris, at auctor dolor pretium vel. Curabitur ac nisl et diam fringilla placerat. Pellentesque neque lacus, rhoncus vitae nibh vitae, sagittis sodales eros. In eu vehicula magna. Donec rhoncus a lacus vitae tincidunt. Donec venenatis, risus at aliquet semper, tortor purus auctor nibh, nec porta ex enim id nulla. Ut ac scelerisque lectus. Mauris vitae turpis massa. Vestibulum ultricies, nibh quis placerat porttitor, arcu est convallis enim, in euismod massa nisi quis dui. Fusce congue elit ut lacus consequat pharetra pellentesque eu purus. Etiam semper, odio a viverra pellentesque, magna felis consectetur velit, porttitor finibus libero eros eget ex. Proin aliquet fringilla risus id dictum. Fusce vel nibh in metus cursus lobortis eu blandit nunc. Mauris et lacus gravida, posuere nibh sed, finibus diam. Proin vel turpis fermentum, dapibus diam nec, porta ipsum. Aenean tempor leo non aliquam semper. Cras sit amet ex eu ipsum suscipit rutrum et vel lacus. Vivamus vitae justo id justo pellentesque venenatis. Aliquam erat volutpat. Phasellus mi augue, accumsan nec accumsan eget, hendrerit aliquam elit. Pellentesque dictum suscipit sem, a vehicula sapien rhoncus ut. Nunc dignissim, est quis ullamcorper hendrerit, tortor orci egestas magna, sed ultricies lacus sapien non orci. Sed maximus convallis nunc, ac finibus lorem luctus nec. Curabitur eu massa mattis, viverra nibh at, tristique sapien. Phasellus molestie eros eget nibh tincidunt, ultricies pulvinar augue sollicitudin. Nullam dui enim, lobortis quis ligula in, iaculis ullamcorper felis. Fusce ac interdum elit, in dignissim mauris.

__MACOSX/fwdwork/._lorem_ipsum (1).txt

fwdwork/dfs_graph.py

graph = dict() graph['A'] = ['B', 'S'] graph['B'] = ['A'] graph['S'] = ['A','G','C'] graph['D'] = ['C'] graph['G'] = ['S','F','H'] graph['H'] = ['G','E'] graph['E'] = ['C','H'] graph['F'] = ['C','G'] graph['C'] = ['D','S','E','F'] def depth_first_search(graph, root): visited_vertices = list() graph_stack = list() graph_stack.append(root) node = root while len(graph_stack) > 0: if node not in visited_vertices: visited_vertices.append(node) adj_nodes = graph[node] if set(adj_nodes).issubset(set(visited_vertices)): graph_stack.pop() if len(graph_stack) > 0: node = graph_stack[-1] continue else: remaining_elements = set(adj_nodes).difference(set(visited_vertices)) first_adj_node = sorted(remaining_elements)[0] graph_stack.append(first_adj_node) node = first_adj_node return visited_vertices print(depth_first_search(graph, 'A'))

__MACOSX/fwdwork/._dfs_graph.py

fwdwork/work 1 (1).docx

Assignment – Graphs

The Extended Families of Wuther Heights (Modified):

Family 1

Patrick Earnshaw (M) {id: 001}

Hannah Earnshaw (F) {id: 002}

Relationship: Married

Children:

Catherine Earnshaw (F) {id: 003}

Hindley Earnshaw (M) {id: 004}

Family 2

Andrew Linton (M) {id: 005}

Dolores Linton (F) {id: 006}

Relationship: Divorced

Children:

Isabella Linton (F) {id: 007}

Edgar Linton (M) {id: 008}

Heathcliff Linton (M) [Adopted] {id: 009}

Family 3

Hindley Earnshaw (M) {id: 004}

Frances Byler (M) {id: 010}

Relationship: Married

Children:

Hareton Earnshaw (M) [Adopted] {id: 011}

Family 4

Catherine Earnshaw (F) {id: 003}

Edgar Linton (M) {id: 008}

Relationship: Married

Children:

Cathy Linton (F) {id: 012}

Family 5

Isabella Linton (F) {id: 007}

Children:

Linton Heathcliff (M) {id: 013}

Family 6

Heathcliff Linton (M) {id: 009}

Children:

Linton Heathcliff (M) {id: 013}

Family 7

Hareton Earnshaw (M) {id: 011}

Cathy Linton (F) {id: 012}

Relationship: Married

Family 8

Cathy Linton (F) {id: 012}

Linton Heathcliff (M) {id: 013}

Relationship: Divorced

Assignment : Graphs

Family tree's and genealogy software has become more and more prevalent in recent years. From the name you might expect that a family tree would be easily represented by a tree structure, but that is not the case! A more appropriate data structure to represent a family tree would be a type of graph. Using the description of the family that accompanies this assignment, you must represent this family using a graph structure. The graph needs to be a weighted graph. The weights will constitute the types of relationships, I recommend using some kind mapping between numbers and strings to represent the relationships. When adding family members to the graph, this can be done programmatically for the provided family members within the description file. Additionally, I also want there to be an interface in which a user can create a new family member and add them to the tree. This can be a simple CLI where the user provides a name, gender, and age to create a person. Then another simple CLI where they select which member of the family they want the original relationship to be with and what kind of relationship it should be. Finally, they can edit the family member using another CLI and selecting the family member they wish to edit, the operation they wish to perform (edit name, edit age, edit relationship), and then add new relationship between family members which can call a function that you create in order to add the original relationship. Remember the DRY philosophy, where code can be modularized or made into a function, it should be if you plan on using the logic again.

Finally, I want you to make data assertions within the FamilyTree class that enforce certain "rules" that exist in a typical human family. An example would be a person should not have any kind of relationship to itself (a person can not marry themselves, a person can not be their own brother, sister, father, mother, etc.). There should be at least 3 data assertions. These should exists as part of the family tree, not as part of the graph.

As a hint, for a successful design: I would recommend using layers of abstraction. Your graph class is the backing structure to the family tree class. Your family tree should implement methods that interface with the graph class, i.e. add_family_member() should call the constructor to create a node and then call a function within the graph class to add a node to the graph. Then using the relationships function parameter, you can add edges to the graph between the new nodes and the existing nodes. The family tree should be what enforces what relationships can exist through the data assertions, the graph does not care about what relationships are made between family members. Your functions that the user would interface with would be greatly reduced compared to the total number of methods within the classes themselves. The user should be able to add, remove, and modify family members and that's about it. Therefore those should be your function calls.

Submission Goals

· (120 pts.) Create a FamilyTree class that will represent a family tree for a given family.

The class should contain several types of relationships that commonly happen within a family (siblings, marriage, offspring, etc.)

· (40 pts.) Programmatically add the family members to the graph as described by the accompanying family description file.

· (40 pts.) Give data assertions to the FamilyTree class to enforce restrictions for basic family structure (at least 3); i.e A person can not marry themselves.

· (40 pts.) Provide a simple CLI the enables users to add, remove, and edit family members.

* Note that there is a total of 240 points on the table here. The assignment is only worth 200 points. You have 40 points of optional goals.

Question 2-

You have been provided with the following string matching algorithms: Boyer_Moore.py, Brute_force.py, KMP_matcher.py, and Rabin_Karp.py. Using any of these algorithms, I want you to find the word(s) that have:

1. Longest Continual Substring of:

· Vowels

· Consonants

2. Longest Continual Prefix of:

· Vowels

· Consonants

3. Longest Continual Suffix of:

· Vowels

· Consonants

Example: ["Hello", "World", "Face","Asthma", "Spring", "Because", "Thorough", "Sequoia" ]. In this list of strings, "Sequoia" has the longest substring of consecutive vowels. "Asthma" has the longest substring of consecutive consonants. "Asthma" Also has the longest prefix of consecutive vowels. "Spring" has the longest prefix of consecutive consonants. "World" has the longest suffix of consecutive consonants and "Sequoia" has the longest suffix of consecutive vowels. Your word file is going to be a long excerpt from "Lorem Ipsum" text found in lore_ipsum.txt. If there are multiple words that tie in any criteria, return ALL words that tie. You can assume that the only characters in the file will be standard English letters, no numbers, special characters, or characters with diacritic marks

__MACOSX/fwdwork/._work 1 (1).docx

fwdwork/graph.py

graph = dict() graph['A'] = ['B', 'C'] graph['B'] = ['E','C', 'A'] graph['C'] = ['A', 'B', 'E','F'] graph['E'] = ['B', 'C'] graph['F'] = ['C'] matrix_elements = sorted(graph.keys()) cols = rows = len(matrix_elements) adjacency_matrix = [[0 for x in range(rows)] for y in range(cols)] edges_list = [] for key in matrix_elements: for neighbor in graph[key]: edges_list.append((key,neighbor)) print(edges_list) for edge in edges_list: index_of_first_vertex = matrix_elements.index(edge[0]) index_of_second_vertex = matrix_elements.index(edge[1]) adjacency_matrix[index_of_first_vertex][index_of_second_vertex] = 1 println(adjacency_matrix)

__MACOSX/fwdwork/._graph.py

fwdwork/Rabin_Karp.py

def generate_hash(text, pattern): ord_text = [ord(i) for i in text] # stores unicode value of each character in text ord_pattern = [ord(j) for j in pattern] # stores unicode value of each character in pattern len_text = len(text) # stores length of the text len_pattern = len(pattern) # stores length of the pattern len_hash_array = len_text - len_pattern + 1 # stores the length of new array that will contain the hash values of text hash_text = [0]*(len_hash_array) # Initialize all the values in the array to 0. hash_pattern = sum(ord_pattern) for i in range(0,len_hash_array): # step size of the loop will be the size of the pattern if i == 0: # Base condition hash_text[i] = sum(ord_text[:len_pattern]) # initial value of hash function else: hash_text[i] = ((hash_text[i-1] - ord_text[i-1]) + ord_text[i+len_pattern-1]) # calculating next hash value using previous value return [hash_text, hash_pattern] # return the hash values def Rabin_Karp_Matcher(text, pattern): text = str(text) # convert text into string format pattern = str(pattern) # convert pattern into string format hash_text, hash_pattern = generate_hash(text, pattern) # generate hash values using generate_hash function len_text = len(text) # length of text len_pattern = len(pattern) # length of pattern flag = False # checks if pattern is present atleast once or not at all for i in range(len(hash_text)): if hash_text[i] == hash_pattern: # if the hash value matches count = 0 # count stores the total characters upto which both are similar for j in range(len_pattern): if pattern[j] == text[i+j]: # checking equality for each character count += 1 # if value is equal, then update the count value else: break if count == len_pattern: # if count is equal to length of pattern, it means match has been found flag = True # update flag accordingly print('Pattern occours at index',i) if not flag: # if pattern doesn't match even once, then this if statement is executed print('Pattern is not at all present in the text') Rabin_Karp_Matcher("101110000011010010101101","10112") # Works for numeric Rabin_Karp_Matcher("101110000011010010101101","1011") # Works for alphabets Rabin_Karp_Matcher("ABBACCADABBACCEDF","ACCE") # Works for alpha numeric Rabin_Karp_Matcher("abc1-3klm890zsdoifjwej8cjv09wn vn09aej09jv 09wje09cj 09 j093j0 9j 092j3 09c09", "09w")

__MACOSX/fwdwork/._Rabin_Karp.py

fwdwork/Screen Shot 2020-12-08 at 12.28.31 AM.png

__MACOSX/fwdwork/._Screen Shot 2020-12-08 at 12.28.31 AM.png

fwdwork/Boyer_Moore.py

text = "acbaacacababacacac" pattern = "acacac" matched_indexes = [] i=0 flag = True while i<=len(text)-len(pattern): for j in range(len(pattern)-1, -1, -1): #reverse searching if pattern[j] != text[i+j]: flag = False #indicates there is a mismatch if j == len(pattern)-1: #if good-suffix is not present, we test bad character if text[i+j] in pattern[0:j]: i=i+j-pattern[0:j].rfind(text[i+j]) #i+j is index of bad character, this line is used for jumping pattern to match bad character of text with same character in pattern else: i=i+j+1 #if bad character is not present, jump pattern next to it else: k=1 while text[i+j+k:i+len(pattern)] not in pattern[0:len(pattern)-1]: #used for finding sub part of a good-suffix k=k+1 if len(text[i+j+k:i+len(pattern)]) != 1: #good-suffix should not be of one character gsshift=i+j+k-pattern[0:len(pattern)-1].rfind(text[i+j+k:i+len(pattern)]) #jumps pattern to a position where good-suffix of pattern matches with good-suffix of text else: #gsshift=i+len(pattern) gsshift=0 #when good-suffix heuristic is not applicable, we prefer bad character heuristic if text[i+j] in pattern[0:j]: bcshift=i+j-pattern[0:j].rfind(text[i+j]) #i+j is index of bad character, this line is used for jumping pattern to match bad character of text with same character in pattern else: bcshift=i+j+1 i=max((bcshift, gsshift)) break if flag: #if pattern is found then normal iteration matched_indexes.append(i) i = i+1 else: #again set flag to True so new string in text can be examined flag = True print ("Pattern found at", matched_indexes)

__MACOSX/fwdwork/._Boyer_Moore.py

fwdwork/Brute_force.py

def brute_force(text, pattern): l1 = len(text) # The text which is to be checked for the existence of the pattern l2 = len(pattern) # The pattern to be determined in the text i= 0 j=0 # looping variables are set to 0 flag = False # If the pattern doesn't appear at all, then set this to false and execute the last if statement while i < l1: # iterating from the 0th index of text j = 0 count = 0 # Count stores the length upto which the pattern and the text have matched while j < l2: if i+j<l1 and text[i+j] == pattern[j]: # statement to check if a match has occoured or not count += 1 # if the statement evaluates to true, then update count j += 1 if count == l2: # if total number of successful matches is equal to count of the array print("\nPattern occours at index",i) # print the starting index of the successful match flag = True # Even if the matching occours once, set this flag to True i += 1 if not flag: # If the pattern doesn't occours even once, this statement gets executed print('\nPattern is not at all present in the array') brute_force('acbcabccababcaacbcaabacbbc','acbcaa') # function call

__MACOSX/fwdwork/._Brute_force.py

fwdwork/Screen Shot 2020-12-08 at 12.27.17 AM.png

__MACOSX/fwdwork/._Screen Shot 2020-12-08 at 12.27.17 AM.png

fwdwork/KMP_matcher.py

def pfun(pattern): # function to generate prefix function for the given pattern n = len(pattern) # length of the pattern prefix_fun = [0]*(n) # initialize all elements of the list to 0 k = 0 for q in range(2,n): while k>0 and pattern[k+1] != pattern[q]: k = prefix_fun[k] if pattern[k+1] == pattern[q]: # If the kth element of the pattern is equal to the qth element k += 1 # update k accordingly prefix_fun[q] = k return prefix_fun # return the prefix function def KMP_Matcher(text,pattern): # KMP matcher function m = len(text) n = len(pattern) flag = False text = '-' + text # append dummy character to make it 1-based indexing pattern = '-' + pattern # append dummy character to the pattern also prefix_fun = pfun(pattern) # generate prefix function for the pattern q = 0 for i in range(1,m+1): while q>0 and pattern[q+1] != text[i]: # while pattern and text are not equal, decrement the value of q if it is > 0 q = prefix_fun[q] if pattern[q+1] == text[i]: # if pattern and text are equal, update value of q q += 1 if q == n: # if q is equal to the length of the pattern, it means that the pattern has been found. print("Pattern occours with shift",i-n) # print the index, where first match occours. flag = True q = prefix_fun[q] if not flag: print('\nNo match found') KMP_Matcher('aabaacaadaabaaba','aabac') # function call, with two parameters,text and pattern

__MACOSX/fwdwork/._KMP_matcher.py

fwdwork/bfs_graph.py

from collections import deque graph = dict() graph['A'] = ['B', 'G', 'D'] graph['B'] = ['A', 'F', 'E'] graph['C'] = ['F', 'H'] graph['D'] = ['F', 'A'] graph['E'] = ['B', 'G'] graph['F'] = ['B', 'D', 'C'] graph['G'] = ['A', 'E'] graph['H'] = ['C'] def breadth_first_search(graph, root): visited_vertices = list() graph_queue = deque([root]) visited_vertices.append(root) node = root while len(graph_queue) > 0: node = graph_queue.popleft() adj_nodes = graph[node] remaining_elements = set(adj_nodes).difference(set(visited_vertices)) if len(remaining_elements) > 0: for elem in sorted(remaining_elements): visited_vertices.append(elem) graph_queue.append(elem) return visited_vertices print(breadth_first_search(graph, 'A'))

__MACOSX/fwdwork/._bfs_graph.py

fwdwork/Screen Shot 2020-12-08 at 12.27.31 AM.png

__MACOSX/fwdwork/._Screen Shot 2020-12-08 at 12.27.31 AM.png