Python Problems

Problem 1:

• Load the Boston dataset using pandas read_csv().

• Remove column zero (the tag for this column is ‘Unnamed: 0’).

• Remove the column tagged ‘dis’ and join the two parts of the dataframe (to the left and right of the column ‘dis’) back together in a new dataframe called df2.

• Calculate the mean of the column called ‘age’ and add it as a new column, with the mean value repeated for all rows.
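
The steps of Problem 1 could be sketched as follows. This is only one possible approach, not a reference solution. Because the location of the Boston CSV is not given here, the sketch reads a tiny made-up stand-in table through io.StringIO; the rows, the reduced set of columns, and the new column name age_mean are all assumptions for illustration.

```python
import io

import pandas as pd

# Stand-in for the Boston CSV (made-up rows, only a few columns).
# With the real file, pass its path to pd.read_csv() instead.
csv_text = """,crim,age,dis,tax
0,0.1,65.2,4.09,296
1,0.2,78.9,4.97,242
2,0.3,61.1,4.97,242
"""
# The empty first header cell is what pandas names 'Unnamed: 0'.
df = pd.read_csv(io.StringIO(csv_text))

# Remove column zero.
df = df.drop(columns="Unnamed: 0")

# Split the frame around 'dis' and join the two parts back together.
pos = df.columns.get_loc("dis")
left = df.iloc[:, :pos]        # columns to the left of 'dis'
right = df.iloc[:, pos + 1:]   # columns to the right of 'dis'
df2 = pd.concat([left, right], axis=1)

# Assigning a scalar broadcasts it, so the mean is repeated on every row.
# 'age_mean' is a hypothetical column name.
df2["age_mean"] = df2["age"].mean()

print(df2)
```

Splitting with iloc and rejoining with pd.concat follows the wording of the task; df.drop(columns="dis") would give the same df2 in a single step.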

Problem 2:

• Generate a vector of 1000 random numbers between 0 and 100.

• Plot a histogram of these numbers with number of bins equal to 10.

• Calculate the average of these numbers using NumPy's mean().

• Draw a vertical red line on the histogram plot at the mean to show the location of the mean in the histogram.

• Make two matrices as follows and perform matrix multiplication:

[ 3  6 ]
[ 4  9 ]   *   [  4  12  21 ]
[ 1  5 ]       [ 23  15  −4 ]

• Take the transpose of the first matrix and multiply it by itself. What is the relationship between the resulting matrix and the original matrix?
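
A minimal sketch of Problem 2 might look like the code below. It makes several assumptions: the random numbers are drawn as uniform floats (integers would be an equally valid reading), the seed is arbitrary, the Agg backend is selected only so the code runs without a display, and "multiply it by itself" is read as the transpose of the first matrix times the first matrix.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed, for reproducibility only
values = rng.uniform(0, 100, size=1000)   # assumption: uniform floats in [0, 100)

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(values, bins=10)

mean_value = np.mean(values)
ax.axvline(mean_value, color="red")       # vertical line marking the mean

# The two matrices from the problem statement.
A = np.array([[3, 6],
              [4, 9],
              [1, 5]])
B = np.array([[4, 12, 21],
              [23, 15, -4]])

product = A @ B    # (3x2) times (2x3) gives a 3x3 matrix
gram = A.T @ A     # (2x3) times (3x2) gives a 2x2 matrix

print(product)
print(gram)
print(np.array_equal(gram, gram.T))  # checks whether the 2x2 result is symmetric
```

Each entry of A.T @ A is the dot product of two columns of A, which is one way to approach the relationship question. To view the figure, call plt.show() in an interactive session or save it with fig.savefig().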

Problem 3:

Generate an array of 10000 normally distributed samples. The mean of the distribution is 10 and the standard deviation is 3.

1. Plot the samples vs. their index.

2. Draw a horizontal green line at the mean value with a thickness of 2.

3. Draw dashed red lines at mean ± 2 * standard deviation with a thickness of 1.

4. ... 2 * standard deviation with magenta.

5. Calculate the percentage of the generated samples that fall between the two standard deviation lines and print it as an output.
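
Steps 1 to 3 and step 5 could be sketched as below. This is a hedged illustration rather than a model answer: the seed is arbitrary, the Agg backend is chosen only so the code runs without a display, and the lines are placed using the nominal mean (10) and standard deviation (3) instead of values estimated from the samples. Step 4 is left out because its wording is incomplete in the text above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np

mu, sigma = 10, 3                   # mean and standard deviation from the problem
rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only
samples = rng.normal(loc=mu, scale=sigma, size=10000)

fig, ax = plt.subplots()
ax.plot(samples)                    # step 1: x axis is the sample index by default

# Step 2: green line at the mean, thickness 2.
ax.axhline(mu, color="green", linewidth=2)

# Step 3: dashed red lines at mean +/- 2 standard deviations, thickness 1.
lower, upper = mu - 2 * sigma, mu + 2 * sigma
ax.axhline(lower, color="red", linestyle="--", linewidth=1)
ax.axhline(upper, color="red", linestyle="--", linewidth=1)

# Step 5: share of samples strictly between the two dashed lines.
inside = (samples > lower) & (samples < upper)
percentage = 100 * inside.mean()
print(f"{percentage:.2f}% of the samples fall within mean ± 2 standard deviations")
```

For a normal distribution roughly 95% of the samples are expected inside this band, so the printed value should land close to that; the exact figure depends on the seed.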