Python MRJob
Assignment 1 (100 points)
Due Date: 5/28/2021 (11:59pm)

In this assignment, you are going to use MRJob to find the words that share the same letters, e.g. (act, cat). The data is stored in data.txt (MRJob will read the data from the file and send it to the mapper). Note that not all the words have a match. The following is what you need to do:
• Convert all words to lower case
• Sort the letters of each word and use the result as the key; the word itself is the value
• Gather the values in the reducer
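The steps above can be sketched in plain Python, without MRJob itself. This is only a minimal sketch under stated assumptions, not the reference solution: the names mapper, reducer, and run_local are illustrative; run_local is a hypothetical stand-in for the grouping by key that MRJob performs between the map and reduce phases; and skipping keys that have only one word is an assumption based on the note that not all the words have a match. In an actual MRJob job, the same logic would live in the mapper and reducer methods of an MRJob subclass.

```python
from collections import defaultdict


def mapper(line):
    """Yield (sorted lowercase letters, original word) for each word in a line."""
    for word in line.split():
        # Lowercase only for building the key; the value keeps its original case.
        key = "".join(sorted(word.lower()))
        yield key, word


def reducer(key, words):
    """Yield ("Output", [words]) when more than one word shares the key."""
    group = list(words)
    # Assumption: words without a match are dropped from the output.
    if len(group) > 1:
        yield "Output", group


def run_local(lines):
    """Hypothetical local driver: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, word in mapper(line):
            groups[key].append(word)
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


if __name__ == "__main__":
    # A few words taken from the assignment's examples.
    sample = ["act", "cat", "Tames", "meats", "dog", "God", "expel"]
    for label, group in run_local(sample):
        print(label, group)
```

Lowercasing is applied only when building the key, so a word such as "Tames" keeps its original spelling in the value, which matches the mixed-case words shown in the expected output.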
Input Text
act takes big cause Tames expel dog dig listen vase flow race stressed cheater meats tofu
…

Expected Output
"Output" ["baker","break"]
"Output" ["cheater","teacher"]
"Output" ["race","care"]
"Output" ["cause","sauce"]
"Output" ["act","cat"]
"Output" ["read","dare"]
"Output" ["heart","Earth"]
"Output" ["takes","stake"]
"Output" ["Tames","meats","teams","mates"]
"Output" ["vase","save"]
"Output" ["part","trap"]
"Output" ["builder","rebuild"]
"Output" ["kitchen","thicken"]
"Output" ["stressed","desserts"]
"Output" ["dog","God"]
"Output" ["study","dusty"]
"Output" ["knee","keen"]
"Output" ["listen","silent"]
"Output" ["flow","wolf"]
"Output" ["tofu","tofu"]
"Output" ["night","thing"]

To run the code, use the following command:
>> python assignment1.py data.txt > output.txt

Submission
Submit your Python program to Blackboard.