Parallel algorithm

tiancaigu
The genome of an organism can be expressed as some number G of
"base pairs" (see http://en.wikipedia.org/wiki/Base_pair). Typical
sizes of various genomes are given in http://en.wikipedia.org/wiki/Genome.

String matching can be used to find particular sequences in a
genome. Several string matching algorithms are described
in http://en.wikipedia.org/wiki/String_matching.

Consider a program to find whether a particular sequence of
base pairs occurs in a genome and, if so, where and how
many times.
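
As a point of reference, the three questions above (is the sequence present,
where, and how many times) can be answered by a simple serial search. The
sketch below is only a baseline under stated assumptions: the genome and the
sequence are held in memory as plain strings, and the function name find_all
is hypothetical. Overlapping matches are counted.

```python
def find_all(genome: str, pattern: str) -> list[int]:
    """Return the start index of every match, including overlapping ones."""
    if not pattern:
        return []  # assumption: an empty pattern is treated as "no match"
    positions = []
    i = genome.find(pattern)
    while i != -1:
        positions.append(i)
        # Restart one character later so overlapping matches are not skipped.
        i = genome.find(pattern, i + 1)
    return positions


if __name__ == "__main__":
    hits = find_all("GATTACAGATTACA", "ATTA")
    print("found:", bool(hits), "count:", len(hits), "positions:", hits)
```

The length of the returned list gives the count, and an empty list means the
sequence was not found.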

Your program will run on a cluster with the following properties:

Number of nodes                 20

  Processors per node           16, 2.6 GHz Xeon
  Memory per node               16 GB
  GPUs per node                 2 (NVIDIA CUDA), 1024 stream processors
                                and 4 GB RAM, running at 1.5 GHz
  Local drives                  1 TB SATA, 6 GB/sec
  NFS drive                     10 TB RAID, bandwidth limited by network

Switched Ethernet network

  Latency                       L = 20 microseconds
  Bandwidth                     B = 1 Gb/sec, i.e. about 100 Mbytes/sec,
                                for messages larger than 32 Kbytes
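
The latency and bandwidth figures above are commonly combined into a linear
cost model, time = L + n / B, for a message of n bytes. The sketch below
assumes that model; the assignment does not prescribe it, and the stated
bandwidth only applies to messages larger than 32 Kbytes, so results for
smaller messages are optimistic. The function name message_time is
hypothetical.

```python
LATENCY_S = 20e-6        # L = 20 microseconds
BANDWIDTH_BPS = 100e6    # B = 100 Mbytes/sec, expressed in bytes per second


def message_time(nbytes: float) -> float:
    """Estimated seconds to send one message under the assumed L + n/B model."""
    return LATENCY_S + nbytes / BANDWIDTH_BPS


if __name__ == "__main__":
    for size in (64 * 1024, 1_000_000, 1_000_000_000):
        print(f"{size:>13} bytes: {message_time(size):.6f} s")
```

For large messages the n / B term dominates and the latency is negligible;
for many small messages the latency term dominates instead.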

You may not need all of the above information. If you feel you need some
other system property, feel free to assume a reasonable value (try Wikipedia).

Assume the genome you are exploring and the sequence you are
trying to find are both initially files on the NFS disk.

Deliverables:

1. Parallel String Match algorithm - in MPI, OpenMP, CUDA or some
combination of these. A description in English and/or pseudocode is
sufficient. Is your algorithm data parallel, task parallel or both?

Describe data transfer during computation (disk to program, process to
process, CPU - GPU and node - node). Describe how data is partitioned
between processes, shared between processes, or replicated at each process.
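
One possible way to partition the data, shown only as an illustration and
not as the expected answer, is a block decomposition: the genome is split
into p contiguous blocks, each extended by m - 1 characters (m being the
sequence length) so that matches straddling a block boundary are not lost,
while the short sequence is replicated at every process. In the sketch below
the p processes are simulated by an ordinary loop rather than by MPI ranks,
and the names partition, chunked_search and find_all are hypothetical.

```python
def find_all(text: str, pattern: str) -> list[int]:
    """Serial search: start index of every (possibly overlapping) match."""
    if not pattern:
        return []
    positions = []
    i = text.find(pattern)
    while i != -1:
        positions.append(i)
        i = text.find(pattern, i + 1)
    return positions


def partition(n: int, m: int, p: int) -> list[tuple[int, int]]:
    """Split [0, n) into p blocks, each read m - 1 characters past its end."""
    base, extra = divmod(n, p)
    chunks = []
    start = 0
    for rank in range(p):
        end = start + base + (1 if rank < extra else 0)
        # The overlap lets this block see matches that begin inside it
        # but finish in the next block.
        chunks.append((start, min(n, end + m - 1)))
        start = end
    return chunks


def chunked_search(genome: str, pattern: str, p: int) -> list[int]:
    """Simulate p workers: search each block, then merge global positions."""
    hits = []
    for start, stop in partition(len(genome), len(pattern), p):
        local = find_all(genome[start:stop], pattern)
        hits.extend(start + i for i in local)  # local index -> global index
    return hits


if __name__ == "__main__":
    genome = "ACGT" * 5 + "GATTACA" + "ACGT" * 5
    print(chunked_search(genome, "GATTACA", 4))
```

With an overlap of exactly m - 1 characters, every match start falls in
exactly one block, so the merged list contains no duplicates and needs no
extra communication beyond gathering the results.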

2. You may not need all the hardware available for your algorithm. You may
use the entire cluster or any part of it. Describe what resources your
algorithm will use to execute. Explain your choice.

3. Estimate how your algorithm would perform on the computer
system described above. Consider:

    a. Complexity; communication costs.

    b. Is there some file size (in bytes, number of elements, or both)
that is too small for your algorithm to work efficiently? Given the wide
range of genome sizes (see http://en.wikipedia.org/wiki/Genome), is there
some range of sizes that you expect would be best for your algorithm?

    c. How much speedup would you expect on the given hardware as compared
to running on a single CPU? Justify your answer.
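
A speedup estimate of the kind asked for in 3c can be roughed out with an
Amdahl-style model. The sketch below rests on assumptions that are not part
of the assignment: the genome is stored at one byte per base pair, reading
it from the NFS disk is limited to 100 Mbytes/sec and does not get faster
with more workers, and a single core scans at a hypothetical rate supplied
by the caller. The function name estimated_speedup is also hypothetical.

```python
def estimated_speedup(genome_bytes: float, workers: int,
                      scan_rate: float, nfs_rate: float = 100e6) -> float:
    """Serial time divided by parallel time under the assumed model.

    scan_rate is an assumed per-core matching rate in bytes per second.
    nfs_rate is the assumed shared NFS read rate in bytes per second.
    """
    read_time = genome_bytes / nfs_rate  # assumed not to parallelize
    t_serial = read_time + genome_bytes / scan_rate
    t_parallel = read_time + genome_bytes / (scan_rate * workers)
    return t_serial / t_parallel


if __name__ == "__main__":
    # Hypothetical inputs: a 3.2e9-byte genome and a 1e9 bytes/sec core.
    for workers in (1, 16, 320):
        s = estimated_speedup(3.2e9, workers, scan_rate=1e9)
        print(f"{workers:>4} workers: estimated speedup {s:.3f}")
```

Under these assumptions the shared read dominates, so the estimate is only a
starting point; a different I/O strategy or a slower matching algorithm
changes the balance and should be justified separately.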
8 years ago
06.12.2018