Map/Reduce and TF.IDF

abcity84

1. Assume you have 3 documents with the following terms:

  • D1 = "computer", "web", "storage", "options"
  • D2 = "computer", "game", "development"
  • D3 = "web", "development", "frameworks"

If the query Q is composed of terms "computer" and "development", what is the relevance of each document to the query using the TF.IDF measure?
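One way to work question 1 is sketched below. TF.IDF has several variants and the question does not say which one to use, so the definitions here are assumptions: TF is the term count divided by the count of the most frequent term in the same document, IDF is log2(N / n) where N is the number of documents and n is the number of documents containing the term, and the relevance of a document is the sum of TF.IDF over the query terms. The function names are illustrative only.

```python
import math

# Documents and query taken from the question.
docs = {
    "D1": ["computer", "web", "storage", "options"],
    "D2": ["computer", "game", "development"],
    "D3": ["web", "development", "frameworks"],
}
query = ["computer", "development"]


def tf(term, terms):
    """Assumed TF: count of the term divided by the count of the most frequent term."""
    counts = {t: terms.count(t) for t in terms}
    return counts.get(term, 0) / max(counts.values())


def idf(term, docs):
    """Assumed IDF: log2(N / n), or 0 if the term appears in no document."""
    n = sum(1 for terms in docs.values() if term in terms)
    return math.log2(len(docs) / n) if n else 0.0


def relevance(query, docs):
    """Score each document as the sum of TF * IDF over the query terms."""
    return {
        name: sum(tf(t, terms) * idf(t, docs) for t in query)
        for name, terms in docs.items()
    }


scores = relevance(query, docs)
for name, score in scores.items():
    print(name, round(score, 3))
```

Under these assumptions every term has TF = 1 in the documents that contain it, and both query terms appear in 2 of the 3 documents, so each has IDF = log2(3/2) ≈ 0.585. D2 contains both query terms and scores about 1.170; D1 and D3 each contain one and score about 0.585. A different TF or IDF definition (for example a natural logarithm) changes the numbers but not the ranking.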

2. Explain in detail how the Hadoop system deals with DataNode failures.

3. Explain and write the pseudocode for a Mapper/Reducer that takes as input a large file (possibly split into chunks) of integers and outputs:

  1. The sum of the squares of the integers
  2. The maximum integer
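A possible shape for the Mapper/Reducer in question 3 is sketched below as a small local simulation in plain Python. It does not use the Hadoop API; the key names, the `run_job` driver, and the in-memory shuffle are assumptions made only to show the data flow. The mapper emits two pairs per integer, one carrying the square and one carrying the value itself, and the reducer sums the first group and takes the maximum of the second.

```python
from collections import defaultdict


def mapper(chunk):
    """Emit (key, value) pairs for every integer in one chunk of the input."""
    for x in chunk:
        yield "sum_of_squares", x * x  # contributes to output 1
        yield "max", x                 # contributes to output 2


def reducer(key, values):
    """Combine all values that share a key into a single result."""
    if key == "sum_of_squares":
        return key, sum(values)
    return key, max(values)


def run_job(chunks):
    """Hypothetical local driver: map every chunk, group by key, then reduce."""
    grouped = defaultdict(list)
    for chunk in chunks:
        for key, value in mapper(chunk):
            grouped[key].append(value)  # stands in for the shuffle phase
    return dict(reducer(key, values) for key, values in grouped.items())


# Three chunks standing in for splits of a large file of integers.
result = run_job([[1, 2, 3], [4, -5], [6]])
print(result)  # {'sum_of_squares': 91, 'max': 6}
```

Because addition and maximum are both associative and commutative, the same reducer logic could also run as a combiner on each chunk, which would cut the data sent across the network to one pair per key per chunk.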

4. Explain in detail why MapReduce may be a better solution than OLAP for some problems. Provide concrete examples.
