Sort This Dataset Blindfolded
2/8/2020 Sort This Dataset Blindfolded
https://geosciencestamu.instructure.com/courses/275/assignments/4682?module_item_id=21325
Due: Monday by 11:59pm
Points: 10
Submitting: a file upload
File Types: txt and r
Let’s be like the magicians throwing daggers at audience members while blindfolded, and see how well we do when we have to process data without seeing the whole dataset.
Here’s a snippet of a dataset:
sample_id test_group sp_x1 sp_y1 sp_x2 sp_y2 sp_x3 sp_y3 sp_x4 sp_y4 sp_x5 sp_y5 sp_x6 sp_y6 sp_x7 sp_y7 sp_x8 sp_y8
TDT_30 4 3.024 3.277 3.035 3.952 2.919 6.586 NA NA 3.825 4.71 2.245 6.544 3.625 6.597 2.139 6.902
TDT_31 1 1.883 3.834 1.875 4.142 1.944 5.563 1.445 4.557 2.321 4.518 1.598 5.525 2.251 5.51 1.56 5.763
TDT_32 1 1.475 2.667 1.481 2.914 1.481 3.956 1.144 3.203 1.794 3.167 1.198 3.974 1.77 3.968 1.168 4.143
TDT_33 1 1.165 0.282 1.159 0.576 1.183 1.606 NA NA 1.459 0.797 0.883 1.619 1.435 1.612 0.846 1.784
TDT_34 2 1.144 0.095 1.15 0.251 1.15 1.022 0.904 0.424 1.385 0.43 0.966 1.033 1.357 1.016 0.91 1.15
TDT_35 2 1.348 3.156 1.348 3.302 1.359 4.049 NA NA 1.598 3.494 1.161 4.054 1.528 4.049 1.114 4.165
TDT_36 1 3.647 0.447 3.611 1.295 3.519 4.759 2.425 2.234 4.659 2.334 2.589 4.714 4.358 4.732 2.38 5.142
TDT_37 3 3.092 3.693 3.082 4.493 3.202 7.746 2.091 5.374 NA NA 2.282 7.695 NA NA 2.172 8.166
TDT_38 1 2.382 3.021 2.414 3.605 2.6 6.098 1.654 4.351 3.174 4.249 1.937 6.126 3.207 6.048 1.84 6.428
TDT_39 2 2.778 4.122 2.759 4.68 2.736 6.968 2.006 5.276 3.393 5.257 2.135 6.944 3.246 6.896 2.064 7.277
This is coordinate data, and even worse, it’s got missing pairs of coordinates.
Write code that reads in this dataset, converts the coordinate data to a numeric array with three dimensions, cleans the dataset of any coordinate pairs that are missing from one or more samples, and then sorts the data by the first x coordinate. Don’t worry about the group variable, but you might want to label the array with the sample IDs.
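One possible shape for these steps is sketched below. It is a rough illustration, not a model answer, and it rests on several assumptions: that "cleaning" means dropping any coordinate pair (landmark) that is NA in at least one sample, that "first x coordinate" means the x value of the first landmark that survives cleaning, and that coordinate columns always follow the sp_xN / sp_yN naming seen in the snippet. The landmark labels (lm1, lm2, ...) and the object names are made up for the example, and only the first four rows and first four coordinate pairs of the snippet are inlined so the code runs on its own.

```r
# A trimmed copy of the snippet (4 samples, 4 landmarks), inlined for illustration.
# For the real task, replace `text = snippet` with the path to the data file.
snippet <- "sample_id test_group sp_x1 sp_y1 sp_x2 sp_y2 sp_x3 sp_y3 sp_x4 sp_y4
TDT_30 4 3.024 3.277 3.035 3.952 2.919 6.586 NA NA
TDT_31 1 1.883 3.834 1.875 4.142 1.944 5.563 1.445 4.557
TDT_32 1 1.475 2.667 1.481 2.914 1.481 3.956 1.144 3.203
TDT_33 1 1.165 0.282 1.159 0.576 1.183 1.606 NA NA"

dat <- read.table(text = snippet, header = TRUE, stringsAsFactors = FALSE)

# Find coordinate columns by name rather than by position, because the
# full dataset cannot be inspected ahead of time.
x_cols <- grep("^sp_x[0-9]+$", names(dat), value = TRUE)
y_cols <- sub("^sp_x", "sp_y", x_cols)
stopifnot(length(x_cols) > 0, all(y_cols %in% names(dat)))

# Build a landmarks x 2 x samples array, labelled with the sample IDs.
# The "lm" landmark labels are a hypothetical naming choice.
coords <- array(NA_real_,
                dim = c(length(x_cols), 2, nrow(dat)),
                dimnames = list(sub("^sp_x", "lm", x_cols),
                                c("x", "y"),
                                dat$sample_id))
coords[, "x", ] <- t(as.matrix(dat[, x_cols, drop = FALSE]))
coords[, "y", ] <- t(as.matrix(dat[, y_cols, drop = FALSE]))

# Assumed meaning of "clean": drop every landmark that is NA in any sample.
has_na <- apply(coords, 1, function(lm) any(is.na(lm)))
clean <- coords[!has_na, , , drop = FALSE]
stopifnot(dim(clean)[1] >= 1)

# Sort the samples (third dimension) by the x value of the first remaining landmark.
ord <- order(clean[1, "x", ])
sorted <- clean[, , ord, drop = FALSE]

print(dimnames(sorted)[[3]])
```

If the intent were instead to drop whole samples that contain a missing pair, the same `apply()` call could run over the third dimension rather than the first.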
Please upload this code as a plain-text file, such as an R script. I’ll run it on the full dataset and we’ll see what happens. If it runs without error, you get full credit (10/10). If it mostly runs but hits a bug or two, you’ll get 8/10. If the code doesn’t get off the ground at all, or doesn’t adequately attempt the required steps, you’ll get 0-7, depending on how well I can understand what you were trying to get the code to do. In other words, please use copious amounts of commenting!
(Hint: Copy and paste the table into a spreadsheet and then save as a plain-text table if you want to use the snippet as an example data file.)
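As a rough illustration of that hint, the sketch below stands in for a table saved from a spreadsheet as tab-delimited text and reads it back in. The temporary file and the object name `from_file` are placeholders, not part of the assignment, and only two rows and four coordinate pairs from the snippet are written out. If the spreadsheet saved a comma-separated file instead, `read.csv()` would be the matching reader.

```r
# Stand-in for a file saved from a spreadsheet as tab-delimited text.
# A real script would point at the saved file instead of a temp file.
example_file <- tempfile(fileext = ".txt")
writeLines(c(
  "sample_id\ttest_group\tsp_x1\tsp_y1\tsp_x2\tsp_y2\tsp_x3\tsp_y3\tsp_x4\tsp_y4",
  "TDT_30\t4\t3.024\t3.277\t3.035\t3.952\t2.919\t6.586\tNA\tNA",
  "TDT_31\t1\t1.883\t3.834\t1.875\t4.142\t1.944\t5.563\t1.445\t4.557"
), example_file)

# read.delim() assumes a header row and tab separators, and treats "NA" as missing.
from_file <- read.delim(example_file, stringsAsFactors = FALSE)

# Quick sanity check: every coordinate column should have been read as numeric.
print(sapply(from_file[-(1:2)], is.numeric))
```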
(Addendum: Canvas isn't handling the HTML table very well from my end, with the right-hand margin cutting off the far end of the table, so copy-pasting the table into another file to see all the columns is probably necessary.)