Machine learning and artificial intelligence for bioinformatics



Machine Learning and Artificial Intelligence for Bioinformatics

Homework 5 – Due October 6th at 10am

Each of the 3 questions below is worth 33.333 points (you get 0.001 free to reach 100).

This homework needs to be completed on the Google Colaboratory, and the results submitted as screenshots in a .doc or .pdf. Please include the completed run of the corresponding code the question refers to, along with your written answer (you can include additional code if you want). You will need to

Also, you are welcome to run TensorFlow code outside of the Colaboratory if you have such a setup. Please note, though, that the submission needs to follow the same format, meaning code cells –> output as shown on the Colaboratory (for example, do not submit Python interactive command terminal code).

In preparation for the homework, you can review again the Google Colaboratory posted in the last lecture. Please watch the following videos in order to become familiar with the Colaboratory (feel free to watch any additional ones on YouTube):



For the code in the questions below, each shaded box showing code can run in a separate Google Colaboratory cell. Note: You need to run cells from top to bottom (since the top code cells generate dependencies for the lower cells), so you have to copy-paste and run the code cells in your own Google Colaboratory, in the same order shown in the code each question points you to. Then, as the questions request (for example, adjusting the number of epochs), you have to edit the code in the corresponding cells and re-run each cell. If you are still confused about how this works, re-watch the above tutorial videos on the Google Colaboratory, as well as additional videos.

Question 1.

NOTE: Instead of “from keras.layers.normalization import BatchNormalization”, use “from keras.layers import BatchNormalization”.

Run the following code on the Colaboratory. Tip: If you are logged in to your Google account, click the “Copy to Drive” button at the top. This will make a full copy of this Google Colaboratory sheet under your own account, and save you a lot of typing and copy-pasting compared to starting a new sheet and transferring everything over manually.

https:// in Keras.ipynb

a. How many different types of neural networks (and what kind of networks) are being used to classify the digits? Show the corresponding part of the code where these networks are implemented.

b. Run the code with both types of neural networks that are in it. Based on the metrics, which one classifies the digits better? Please explain your answer by also defining the metrics (so you understand what each metric means).
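As a reminder of what two commonly reported Keras metrics measure, here is a minimal pure-Python sketch. It assumes the notebook reports accuracy and a categorical cross-entropy loss (check the notebook's compile call to confirm); the labels and probabilities below are made-up values for illustration only.

```python
import math

def accuracy(true_labels, predicted_probs):
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = 0
    for label, probs in zip(true_labels, predicted_probs):
        predicted = max(range(len(probs)), key=lambda i: probs[i])
        if predicted == label:
            correct += 1
    return correct / len(true_labels)

def cross_entropy(true_labels, predicted_probs, eps=1e-12):
    """Mean negative log-probability that the model assigns to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        total += -math.log(max(probs[label], eps))
    return total / len(true_labels)

# Hypothetical 3-class example: one probability row per sample.
true_labels = [0, 2, 1]
predicted_probs = [
    [0.7, 0.2, 0.1],  # predicted class 0, true class 0
    [0.1, 0.3, 0.6],  # predicted class 2, true class 2
    [0.5, 0.4, 0.1],  # predicted class 0, true class 1 (a mistake)
]

acc = accuracy(true_labels, predicted_probs)
loss = cross_entropy(true_labels, predicted_probs)
print("accuracy:", acc)
print("loss:", loss)
```

Accuracy only counts whether the top class is right, while the loss also penalizes low confidence in the correct class, so the two can move differently between models.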

c. Could you try a different activation function instead of softmax in the final layer and see what happens with the model predictions and its metrics? Choose one from the list https:// keras/activations
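To see why the choice of final activation matters, the sketch below compares softmax with sigmoid on the same raw outputs in plain Python. The logits are hypothetical values, and sigmoid is just one possible alternative from the activations list, not a recommended answer.

```python
import math

def softmax(logits):
    """Exponentiate and normalize so the outputs form a probability distribution."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """Squash each output independently into (0, 1); outputs need not sum to 1."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]

# Hypothetical raw outputs of a final layer with 3 units.
logits = [2.0, 1.0, -1.0]

softmax_out = softmax(logits)
sigmoid_out = sigmoid(logits)

print("softmax:", softmax_out, "sum =", sum(softmax_out))
print("sigmoid:", sigmoid_out, "sum =", sum(sigmoid_out))
```

Both functions preserve the ranking of the classes here, but only softmax couples the outputs so that they sum to 1, which is what a categorical cross-entropy loss normally expects.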

       Due March 13th 2024

Question 2.

Run the following code on the Colaboratory (you can skip the part showing the images if you wish). You will need to copy this code into a new, clean sheet of the Google Colaboratory.


a. Modify the number of Convolutional and Max Pooling layers, for example add a pair or two, and remove a layer or two:

model = tf.keras.Sequential([
  tf.keras.layers.experimental.preprocessing.Rescaling(1./255),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  ….

Then rerun the training with the modifications

model.compile( …

and also train_ds, ..

What do you observe changing in the metrics? (Just run it for 3 epochs as it is.)

b. Modify the number of epochs, increasing them gradually (you might reach a point where it gets too slow in the Google Colaboratory). What do you observe in the metrics as you increase the epochs? Is there a point where the metrics plateau?
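One way to make "plateau" concrete is to look for the epoch after which the validation metric stops improving by more than a small margin. The sketch below is a plain-Python illustration of that idea (similar in spirit to the `min_delta` and `patience` settings of Keras's `EarlyStopping` callback); the function name, thresholds, and accuracy values are all hypothetical.

```python
def plateau_epoch(history, min_delta=0.005, patience=2):
    """Return the 1-based epoch of the best value once the metric has failed to
    improve by more than min_delta for `patience` consecutive epochs.
    Returns None if no plateau is detected."""
    best = history[0]
    best_epoch = 1
    stale = 0  # consecutive epochs without a meaningful improvement
    for epoch, value in enumerate(history[1:], start=2):
        if value > best + min_delta:
            best, best_epoch, stale = value, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return best_epoch
    return None

# Hypothetical validation accuracy per epoch (not real results).
val_accuracy = [0.55, 0.63, 0.68, 0.70, 0.702, 0.701, 0.703]

print("plateau reached at epoch:", plateau_epoch(val_accuracy))
```

In practice you would read the per-epoch values from the training output in your own notebook rather than typing them in by hand.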

c. In which part of the code do we split the dataset into training / validation sets, and in what portions? What is the purpose of doing this?
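The general idea of a seeded training / validation split can be sketched in plain Python as follows. This is an illustration of the concept only, not the notebook's actual loading code; the sample count, the fraction of 0.25, and the seed are hypothetical.

```python
import random

def split_indices(num_samples, validation_fraction, seed):
    """Shuffle sample indices reproducibly and hold out a fraction for validation."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # same seed gives the same split
    num_val = int(num_samples * validation_fraction)
    return indices[num_val:], indices[:num_val]

# Hypothetical dataset of 1000 images with a quarter held out for validation.
train_idx, val_idx = split_indices(1000, 0.25, seed=123)

print("training samples:", len(train_idx))
print("validation samples:", len(val_idx))
```

Because the validation samples are never used to update the weights, the metrics computed on them indicate how well the model generalizes to unseen data.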

d. Look at the structure of the Convolutional Neural Network as specified in the code for this image classification example https:// What are the differences? Make those adjustments to modify the code you just made in a – c above, and re-run the model (use 5 epochs or so). What do you observe in the model metrics?
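When adding or removing Conv2D / MaxPooling2D pairs, the feature map shrinks at every step, which limits how many pairs fit. The sketch below tracks that size, assuming a 3x3 convolution with Keras's default 'valid' padding and stride 1, the default 2x2 max pooling, and a hypothetical 180x180 input image (use your notebook's actual image size). The helper names are illustrative and not part of Keras.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_output_size(size, pool=2):
    """Spatial size after 'valid' max pooling with stride equal to the pool size."""
    return size // pool

def sizes_after_pairs(input_size, num_pairs):
    """Feature-map side length after each Conv2D + MaxPooling2D pair."""
    sizes = []
    size = input_size
    for pair in range(1, num_pairs + 1):
        size = conv_output_size(size)
        if size < 1:
            raise ValueError(f"pair {pair}: input too small for the convolution")
        size = pool_output_size(size)
        if size < 1:
            raise ValueError(f"pair {pair}: input too small for the pooling")
        sizes.append(size)
    return sizes

# Hypothetical 180x180 input passed through three Conv2D/MaxPooling2D pairs.
sizes = sizes_after_pairs(180, 3)
print(sizes)
```

Calling `sizes_after_pairs` with a larger number of pairs raises an error once the feature map can no longer be pooled, which mirrors the shape errors Keras reports when a network is made too deep for its input.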

Question 3.

Run the following code on Deep Learning for genomics on the Google Colaboratory:

https:// A_Primer_on_Deep_Learning_in_Genomics_Public.ipynb

a. Describe in a couple of sentences the overall function of this neural network for bioinformatics predictions: what predictions are taking place, what data are used, and what type of neural network are we using? From which parts of the code can you find the answers to each of these points?

b. How many prediction classes does this neural network have, and what are these classes? In addition to finding this from the text cells in the code, please also point to the parts of the actual code that would demonstrate the number of prediction classes (it should be one of the final layers in the network).

c. What portion of the data do we use for training, validation and testing? Where do you see that in the code?

d. Run the code in your Google Colaboratory up to the point where we have the model loss / accuracy graphs (including printing these graphs). What do you observe in these graphs if you modify the testing and validation portions of the datasets? You would need to re-run the cells from the top (where we define the training / validation portions) up to and including the cells generating the graphs. Similarly, if you significantly reduce the number of epochs, what do you observe in those graphs?
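When changing the testing and validation portions, it helps to work out what share of the full dataset each subset actually receives. The sketch below assumes the common two-step pattern in which a test set is held out first and a validation fraction (such as the `validation_split` argument of Keras's `model.fit`) is then taken from the remaining data. Whether the notebook follows this pattern is for you to verify, and the fractions used here are hypothetical.

```python
def effective_portions(test_fraction, validation_fraction):
    """Return (training, validation, test) shares of the full dataset when the
    validation fraction is taken from what remains after the test hold-out."""
    remaining = 1.0 - test_fraction
    validation = remaining * validation_fraction
    training = remaining - validation
    return training, validation, test_fraction

# Hypothetical settings: hold out 20% for testing, then 10% of the rest for validation.
train_share, val_share, test_share = effective_portions(0.2, 0.1)

print(f"training:   {train_share:.2%}")
print(f"validation: {val_share:.2%}")
print(f"testing:    {test_share:.2%}")
```

Note that the validation share of the whole dataset is smaller than the validation fraction itself, because it is applied only to the data left after the test split.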