Preprocessing

Preprocessing_06_Iterative_Imputer_Solution.ipynb

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## IterativeImputer\n", "### This notebook outlines the usage of Iterative Imputer (Multivariate Imputation).\n", "### Iterative Imputer substitutes missing values as a function of other features\n", "#### Dataset: [https://github.com/subashgandyer/datasets/blob/main/heart_disease.csv]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Demographic**\n", "- Sex: male or female(Nominal)\n", "- Age: Age of the patient;(Continuous - Although the recorded ages have been truncated to whole numbers, the concept of age is continuous)\n", "\n", "**Behavioral**\n", "- Current Smoker: whether or not the patient is a current smoker (Nominal)\n", "- Cigs Per Day: the number of cigarettes that the person smoked on average in one day.(can be considered continuous as one can have any number of cigarettes, even half a cigarette.)\n", "\n", "**Medical(history)**\n", "- BP Meds: whether or not the patient was on blood pressure medication (Nominal)\n", "- Prevalent Stroke: whether or not the patient had previously had a stroke (Nominal)\n", "- Prevalent Hyp: whether or not the patient was hypertensive (Nominal)\n", "- Diabetes: whether or not the patient had diabetes (Nominal)\n", "\n", "**Medical(current)**\n", "- Tot Chol: total cholesterol level (Continuous)\n", "- Sys BP: systolic blood pressure (Continuous)\n", "- Dia BP: diastolic blood pressure (Continuous)\n", "- BMI: Body Mass Index (Continuous)\n", "- Heart Rate: heart rate (Continuous - In medical research, variables such as heart rate though in fact discrete, yet are considered continuous because of large number of possible values.)\n", "- Glucose: glucose level (Continuous)\n", "\n", "**Predict variable (desired target)**\n", "- 10 year risk of coronary heart disease CHD (binary: “1”, means “Yes”, “0” means “No”)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import 
numpy as np\n", "from matplotlib import pyplot as plt\n", "import seaborn as sns" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>male</th>\n", " <th>age</th>\n", " <th>education</th>\n", " <th>currentSmoker</th>\n", " <th>cigsPerDay</th>\n", " <th>BPMeds</th>\n", " <th>prevalentStroke</th>\n", " <th>prevalentHyp</th>\n", " <th>diabetes</th>\n", " <th>totChol</th>\n", " <th>sysBP</th>\n", " <th>diaBP</th>\n", " <th>BMI</th>\n", " <th>heartRate</th>\n", " <th>glucose</th>\n", " <th>TenYearCHD</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>39</td>\n", " <td>4.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>195.0</td>\n", " <td>106.0</td>\n", " <td>70.0</td>\n", " <td>26.97</td>\n", " <td>80.0</td>\n", " <td>77.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>2.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>250.0</td>\n", " <td>121.0</td>\n", " <td>81.0</td>\n", " <td>28.73</td>\n", " <td>95.0</td>\n", " <td>76.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1</td>\n", " <td>48</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>20.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>245.0</td>\n", " <td>127.5</td>\n", " <td>80.0</td>\n", " <td>25.34</td>\n", " <td>75.0</td>\n", " <td>70.0</td>\n", " 
<td>0</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>0</td>\n", " <td>61</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>30.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>225.0</td>\n", " <td>150.0</td>\n", " <td>95.0</td>\n", " <td>28.58</td>\n", " <td>65.0</td>\n", " <td>103.0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>23.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>285.0</td>\n", " <td>130.0</td>\n", " <td>84.0</td>\n", " <td>23.10</td>\n", " <td>85.0</td>\n", " <td>85.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>...</th>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " </tr>\n", " <tr>\n", " <th>4233</th>\n", " <td>1</td>\n", " <td>50</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>313.0</td>\n", " <td>179.0</td>\n", " <td>92.0</td>\n", " <td>25.97</td>\n", " <td>66.0</td>\n", " <td>86.0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>4234</th>\n", " <td>1</td>\n", " <td>51</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>43.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>207.0</td>\n", " <td>126.5</td>\n", " <td>80.0</td>\n", " <td>19.71</td>\n", " <td>65.0</td>\n", " <td>68.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4235</th>\n", " <td>0</td>\n", " <td>48</td>\n", " <td>2.0</td>\n", " <td>1</td>\n", " <td>20.0</td>\n", " <td>NaN</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>248.0</td>\n", " <td>131.0</td>\n", " <td>72.0</td>\n", " 
<td>22.00</td>\n", " <td>84.0</td>\n", " <td>86.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4236</th>\n", " <td>0</td>\n", " <td>44</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>15.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>210.0</td>\n", " <td>126.5</td>\n", " <td>87.0</td>\n", " <td>19.16</td>\n", " <td>86.0</td>\n", " <td>NaN</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4237</th>\n", " <td>0</td>\n", " <td>52</td>\n", " <td>2.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>269.0</td>\n", " <td>133.5</td>\n", " <td>83.0</td>\n", " <td>21.47</td>\n", " <td>80.0</td>\n", " <td>107.0</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>4238 rows × 16 columns</p>\n", "</div>" ], "text/plain": [ " male age education currentSmoker cigsPerDay BPMeds \\\n", "0 1 39 4.0 0 0.0 0.0 \n", "1 0 46 2.0 0 0.0 0.0 \n", "2 1 48 1.0 1 20.0 0.0 \n", "3 0 61 3.0 1 30.0 0.0 \n", "4 0 46 3.0 1 23.0 0.0 \n", "... ... ... ... ... ... ... \n", "4233 1 50 1.0 1 1.0 0.0 \n", "4234 1 51 3.0 1 43.0 0.0 \n", "4235 0 48 2.0 1 20.0 NaN \n", "4236 0 44 1.0 1 15.0 0.0 \n", "4237 0 52 2.0 0 0.0 0.0 \n", "\n", " prevalentStroke prevalentHyp diabetes totChol sysBP diaBP BMI \\\n", "0 0 0 0 195.0 106.0 70.0 26.97 \n", "1 0 0 0 250.0 121.0 81.0 28.73 \n", "2 0 0 0 245.0 127.5 80.0 25.34 \n", "3 0 1 0 225.0 150.0 95.0 28.58 \n", "4 0 0 0 285.0 130.0 84.0 23.10 \n", "... ... ... ... ... ... ... ... \n", "4233 0 1 0 313.0 179.0 92.0 25.97 \n", "4234 0 0 0 207.0 126.5 80.0 19.71 \n", "4235 0 0 0 248.0 131.0 72.0 22.00 \n", "4236 0 0 0 210.0 126.5 87.0 19.16 \n", "4237 0 0 0 269.0 133.5 83.0 21.47 \n", "\n", " heartRate glucose TenYearCHD \n", "0 80.0 77.0 0 \n", "1 95.0 76.0 0 \n", "2 75.0 70.0 0 \n", "3 65.0 103.0 1 \n", "4 85.0 85.0 0 \n", "... ... ... ... 
\n", "4233 66.0 86.0 1 \n", "4234 65.0 68.0 0 \n", "4235 84.0 86.0 0 \n", "4236 86.0 NaN 0 \n", "4237 80.0 107.0 0 \n", "\n", "[4238 rows x 16 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df=pd.read_csv(\"data/heart_disease.csv\")\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How many Categorical variables in the dataset?" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 4238 entries, 0 to 4237\n", "Data columns (total 16 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 male 4238 non-null int64 \n", " 1 age 4238 non-null int64 \n", " 2 education 4133 non-null float64\n", " 3 currentSmoker 4238 non-null int64 \n", " 4 cigsPerDay 4209 non-null float64\n", " 5 BPMeds 4185 non-null float64\n", " 6 prevalentStroke 4238 non-null int64 \n", " 7 prevalentHyp 4238 non-null int64 \n", " 8 diabetes 4238 non-null int64 \n", " 9 totChol 4188 non-null float64\n", " 10 sysBP 4238 non-null float64\n", " 11 diaBP 4238 non-null float64\n", " 12 BMI 4219 non-null float64\n", " 13 heartRate 4237 non-null float64\n", " 14 glucose 3850 non-null float64\n", " 15 TenYearCHD 4238 non-null int64 \n", "dtypes: float64(9), int64(7)\n", "memory usage: 529.9 KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How many Missing values in the dataset?\n", "Hint: df.Series.isna( ).sum( )" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Feature 1 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 2 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 3 >> Missing entries: 105 | Percentage: 2.48\n", "Feature 4 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 5 >> Missing entries: 29 | 
Percentage: 0.68\n", "Feature 6 >> Missing entries: 53 | Percentage: 1.25\n", "Feature 7 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 8 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 9 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 10 >> Missing entries: 50 | Percentage: 1.18\n", "Feature 11 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 12 >> Missing entries: 0 | Percentage: 0.0\n", "Feature 13 >> Missing entries: 19 | Percentage: 0.45\n", "Feature 14 >> Missing entries: 1 | Percentage: 0.02\n", "Feature 15 >> Missing entries: 388 | Percentage: 9.16\n", "Feature 16 >> Missing entries: 0 | Percentage: 0.0\n" ] } ], "source": [ "# Count missing entries per column and report them as a percentage of all rows\n", "for i, col in enumerate(df.columns, start=1):\n", "    missing_data = df[col].isna().sum()\n", "    perc = missing_data / len(df) * 100\n", "    print(f'Feature {i} >> Missing entries: {missing_data} | Percentage: {round(perc, 2)}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bonus: Visual representation of missing values" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<AxesSubplot:>" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAjwAAAGqCAYAAAAP2J5ZAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAtGklEQVR4nO3de7zu5Zz/8fd7d05n5VAUckzaiQ6qSUUTw/YjhUYTOcuofohJpeR8yPxoCCEkphLC6KQQjTSddmdjhkLxM50TqXaf+eO67r3ude+11t5r676u7/dar+fjsR6t+157+37sfe/7fn+vw+dyRAgAAKBl82oXAAAAMG4EHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzVtxph/uNm8v9qwDAIBeOPv+UzzdzxjhAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANG/F2gUAAIDZOfPGhbVLWGz3DefXLmGZEHgAAOiZvoSMLiHwAADQM4zwzB6BBwCAnulLyOgSFi0DAIDmEXgAAEDzmNICAKBnWMMzewQeAAB6pi8ho0sIPAAA9AwjPLNH4AEAoGf6EjK6hEXLAACgeYzwLIeuDCWS8AEAWDYEnuVA0AAAoF+Y0gIAAM1jhAcAgJ7pytIKqT+zHgQeAAB6pi8ho0uY0gIAAM0j8AAAgOYxpQUAQM+whmf2GOEBAADNY4QHAICe6cuoSpcwwgMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmken5eXQlTNM6LQJAMCyIfAsB4IGAAD9wpQWAABoHoEHAAA0j8ADAACaR+ABAADNY9EymtOVXXQSC9wBoCsIPGgOIQMAMIopLQAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j7O0lgOHUwIA0C8EnuVAyAAAoF+Y0gIAAM0j8AAAgOYReAAAQPMIPAAAoHksWkZz2EUHABhF4EFzCBkAgFEEnuXACAIAAP1C4FkOhAwAAPqFRcsAAKB5BB4AANA8prTQHNZYYTZ4vaCPeN3OniNi2h/uNm+v6X8IAADQIWfff4qn+xlTWgAAoHlMaQEAOo3pmyXxZzJ7BB4AQKf15QO1JP5MZo8pLQAA0DwCDwAAaB6BBwAANI/AAwAAmsei5eXQldXxLFoDAGDZEHiWA0EDAIB+YUoLAAA0j8ADAACaR+ABAADNI/AAAIDmsWgZwJzXhZ2XbIbAbHXhdSv157VL4AEw5/XlDRsYxut2dpjSAgAAzSPwAACA5jGlBQBAz3Rl/Y7Un6k1Ag8AAD3Tl5DRJQSeWSJVAwDQPwSeWSJkAADQPyxaBgAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+NBAAB6hq7/s8cIDwAAaB4jPAAA9ExfRlW6hBEeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmsS19OXSl4RPbEgEAWDaM8AAAgOYxwrMcGFkBAKBfGOEBAADNI/AAAIDmMaUFAEDPdGXzjNSfZR4EHgAAeqYvIaNLmNICAADNI/AAAIDmMaUFAEDPsIZn9gg8AAD0TF9CRpcwpQUAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8zg8FMCcxqnT6CNet7NH4AEwp/XlzRoYxut29pjSAgAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8A
DAACaR+ABAADNI/AAAIDmcbQEAAA9w1las8cIDwAAaB4jPAAA9ExfRlW6hBEeAADQPAIPAABoHlNaAAD0DIuWZ48RHgAA0DxGeAAA6Jm+jKp0CSM8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGjeirULAAAAs3PmjQtrl7DY7hvOr13CMiHwAADQM30JGV3ClBYAAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABo3oq1CwCAms68cWHtEhbbfcP5tUtAT/C6nT0CD4A5rS9v1sAwXrezR+BZDiRrAAD6hcCzHAgZAICauPGePQIPAAA905eQ0SUEHgAAeoYRntkj8AAA0DN9CRldQh8eAADQPAIPAABoHoEHAAA0j8ADAACax6JlAAB6hl1as0fgAQCgZ/oSMrqEKS0AANA8Ag8AAGgegQcAADSPNTwA5jQWf6KPeN3OHoEHwJzWlzdrAH8dAg8AAD1DUJ89Ag+AOY2pAfQRr9vZI/AAmNP68mYNDON1O3vs0gIAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAADQM106WqIvCDwAAPQMR0vMHoEHAAA0j8ADAEDPMKU1ewQeAAB6himt2SPwAACA5hF4AABA8wg8AACgeQQeAADQvBVrFwAAAGanS7u0+rKAmsADAEDP9CVkdAlTWgAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACax2npy+HMGxfWLkESp+UCALCsCDzLgaABAEC/MKUFAACaR+ABAADNY0prObCGBwCAfiHwLAeCBgAA/ULgAQCgZ7oy0yD1ZxCAwAMAQM/0JWR0CYuWAQBA8wg8AACgeQQeAADQPNbwAADQMyxanj0CDwAAPdOXkNElTGkBAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADRvxdoFAA+0M29cWLuExXbfcH7tEgAAIvCgQYQMAMAoprQAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAALQvIsb+Jel1Ja7Tp1q6Uge1UAu1tFVLV+qgFmrpWi2lRnheV+g6y6IrtXSlDolapkMtU6OWqXWllq7UIVHLdKhlamOthSktAADQPAIPAABoXqnA89lC11kWXamlK3VI1DIdapkatUytK7V0pQ6JWqZDLVMbay3OC4UAAACaxZQWAABoHoEHAAA0j9PSAQBAcbbXkfS4/PA/I+L2cV5vrCM8tlez/YRxXmNZ2X5Q7Rq6wvYKtj9Su44B25vXrqGrbO9oe7/8/Qa2H12xlj1sf8z20bZfVKmGTW2vkr/f2fYB+U0TQE/YXtn2FyVdp7RQ+ThJ19n+gu2Vx3XdsQUe2wskXSbpjPx4S9vfHtf1Zqhje9tXS7omP55v+1Ol68jXfrz
t42yfZfvcwVfpOiJikaSn2Xbpa0/j07YvtL1/7Q8v23vZXjN/f5jtb9jeqlItR0h6h6RD8lMrSfpKpVo+JekNkq6QdKWk19v+ZIVSTpW0yPZjJX1e0qMlfbVkAbbvtH3H0Nedw/8tXMvjbJ9m+0rbX7O9Ucnrd7iWt8z0Vammh9r+vO3T8+PNbL+6Qh2Pt32O7Svz4y1sH1a4jMOU3s8eGRFPjYgtJW2sNOt0+LguOrZdWrYvlrSrpB9GxFPzc5dHxBZjueD0dfxM0p6Svj1Ux5URUXxUwfZCSZ+WdLGkRYPnI+LiCrUcrTSUeIqku4Zq+UbpWnI9j5P0Kkl7SbpQ0vERcXaFOi6PiC1s7yjpA5I+KumdEbFthVouk/RUSZfU/DeUr3uVpM0jv2HYnifpioh4cuE6LomIrWwfLOnuiDjG9qWDP5+5xvaPJX1Z0nmSXiDpGRGxB7X4iJl+HhHvLlXLQA46x0s6NCLm215R0qUR8ZTCdfxI0sGSPlPrMzGHrW0i4k8jz68h6YJx1TLONTz3RcTtXRhEiIjfjNSxaLpfO2b3RcSxla49aj1JNyuF0oGQVCXwRMQv8l3GRZI+IempeQTqnYVD2OC18TxJx0bEabaPLHj9YfdERNgehIya07I/V7oDuz4/fqSkyyvUca/tvSW9QtKC/NxKFeqQlEaMJf1NfnheRJT+M1kzIo7L33/E9iWFr9/JWmoEmmWwfkScbPsQSYqI+2zX+CxaPSIuHPlMvK9wDfePhh1Jiog/Dt7vxmGcgedK238vaYV8936ApH8f4/Wm8xvb20uKPDd4gPL0VgXfsb2/pG9K+svgyYi4pXQhEbFf6WtOx/YWkvZTChlnS1oQEZfY3lDST1U2hN1g+zOSni3pQ3m9SK3djCfnWtax/VqlEbDPVarlwZKusX1hfry1pJ8Opqkj4gWF6thPaWrtfRHxK6c1TbWm+Q6U9FpNvD5PtP3ZiDimYBmr2n6qpMGn12rDjyOiZOjoTC22PzHTzyPigFK1DLnL9oOVbixleztJY12kO42bbG86VMeekn5XuIawva4mXivD7h/XRcc5pbW6pEMl/a3S/6kzJb0nIu4eywWnr2N9SR9X+gCzpLMkHRgRN5esI9fyqymejoh4TIVaHi/pWEkPjYjNc+h4QUS8t0It5yktWvt6RPx55Gf/EBEnFKxldUnPUZqu+YXth0t6SkScVaqGkXp20+R/Q+dFxF9m/l1jqeOZM/08In5UsJbVJG0cET8vdc1p6rhcadrmrvz4QZJ+WnLK0fYPZvhxRMSuM/y85VruUVprdrKkGzXywRoRXypVy1BNW0k6RtLmubYNJO0VEQsL1/EYpYXC20u6VdKvJO0TEdcVrOE6pWAzVeAZ22cinZbnqC7M43aJ7fVm+nmNUTjbX4iIVw09XkPSaRHxrAq1/KOkEyPi1tLXHqljgdK6qpUj4tG2t5R0VMERpuFarpC09eAmzvaqkv6j9JoMLCmPpOwl6aVK0zUnSTq15us3jxYvkvQEpQ/6n0uaV+MGJtfzoHz9O2tcv4YHfErL9neUh8qmUvqNaZqhzdslXRQRpxWuZSVJb5S0U37qh0qB496SdWRdmMeVtHjB8gckbSZp1cHzhUe+LlZ63Vpprcqt+ft1JP1aaTdQaTfYPjYi3piHf/9NaSSshodJ+o+8LuMLks6MOndLR0raRunfjiLiMtfbqn+8pJ/Z/mZ+/EKlnWNF2d5E0l0RcVOeJtlR0n9FxLfmai15BP/TSjtAN5K0t6SrbL+j5IjxiJ9GxFaSrho8kf89Fd0Fmqdij5d0p6Tj8sjTP5UcxfZSdr6Oa/pzHGt4PjqG/82/xqqSnqi0G0mSXqz0gnu17V0i4qCCtRyrtMBysC3+H/JzrylYw0AX5nEHjpd0hKR/lrSL0jqNoqvdI+LRkmT700o7+r6XHz9XaTq0uIg43PaHck1Pk/TBiDi1Ui2H2T5caXptP0n/YvtkSZ+PiP8uWMp
UmyGqDFNHxMfySOkOSq/X/SLi0pI12H6X0gLusP2vSq/VH0p6nu2dS76/damWoZq2Ugo7u0k6XenGpnQND5O0kUbWNElaS9LqpeuR9KqI+Ljt3SU9ROnf8/FKyz1KOXro+6dp8t9LaPJmmgfMAx54Ss7lL6PHSto1Iu6TJNvHKv3F7qbUU6SkrSNi/tDjc522qtfwJqV53CfavkFpHvfllWpZLSLOse2IuF7SkU5bXGfcWjomW0fEGwYPIuJ02+8pWYDt4a28Fyr1pbhQ6YNkj1qtA/KOsd9L+r3SaOC6kr5u++yIeHuhMrqyGWLgMqUbhRUlyfbGEfHrgtd/maQnKX1w/lrSwyLiT05bni8rWEenarH9bknPV9qg8q+SDhl8BlSwu6RXSnqEpI8NPX+npHdWqGcQuP5Oqf3HQrvsduqI2GVxMamtxC4z/foHyth2aXVkmkJKyfpBmlgN/yBJG0bEItul504X2d50cEecF4/V2iK/bkQ8e3geN6+PuH5pv3EM7nbq6/KLvFbkBqU7jxpuctoe/xWlO419lLbvl7Rg5PGlSiODC1SpdYDtA5Tu3m9S2il2cETcO/h7k1Qq8LxZaTPEX5QaDp4pqWggHbD9ZqVQ/v+V/h1b6e+nZJ+kuyPiHkn32P7vwVbfvOX5noJ1dK2WwyX9UtL8/PX+/JnuVFK5heV5gfSXbL+41gjtiIttn6U0TX+IU6PVse2MWgbFRmjHuS29+jRF9mFJl9n+Yb7+Tkov/gdJ+n7hWg6W9APbv8y1bKL051LDcbZfERFXSJLtl0n6v5K+U6GWg5TuCg9Q+vDaVenDtYa9lV6331T6h3hefq6YLrUMGLK+pD3yCNxiEXG/7ecXrON5EXGoUuiRlLpja2LKuqQDJT2hxo7PIevkEUFLWmtodNCS1p7DtVQ7gmU6EXGq7edJerImDwIcVbiUV0vaUtIv8wjcg1Xvc6iosXZajoin2b5isGvB9o8j4m+W9nvHUMuGSutlrlUa4fltRJxXuo5cyyqaWKV/bcUV+o+R9HWlaawdJe0r6fkx5sPb+sL2GhHxx8o1PEJpG+sOSuHrJ0otFX5bsIZO7V5z7rS8tOcK1fIDSbtVnCqR7eNn+nnJ8NylWqbi1KLk5kqL7QfrA1dXGgD4nNIJABdGRI3jJV6gic0zP4qIoje6to/RxMjOy5SmHReLMfVJGmfgOV+pA+nXJZ2rNE3xwYgoepio7dco3Yk9QmkeeTul1fIle0LsGhHnjqzNWKzWmgynXjzfkvQbSS+MkR44hWp4hfKdcn7qGkmfiIgvl64l17O90pvRGhGxsVMn3ddHxP4VajlbadpmsKtkH0kvj4jdCtbwK03sXnu4Jvc0GVu/jCnqeK7SmoOXKG0xHlhL0mYRsU2JOnItg7OYnqz0uv03TW4k+rGpfh/KyTvEPijpFqVR4xOURinnSdo3Is6oUNPg2JrBf9eQ9I2I+NvCdXxQqXHoifmpvZV2LR8y/e96wGuYcQQ/xtQnaZxTWgdp8jTFLkqjCKUdqPSXe0FE7GL7iZJKtx1/plLoG12bIRVek+HUO2Q45a4naQWl7bUqObdte1+l18lbJF2i9EG6lVJbelUKPf+stMhw0EF4oe2dZv4tY7NBRAzfNX/R9kElCxjsXpMWLy6sdWbVjUrHjrxAk3d03Kk0FVvSmvm/v85fK+evaka3Giv9Oyq61XikntpTN/+itCB4baX33udGxAX5/f9ryodaFza4ofxTnnW4WXWm3v5O0pYRcb8k2f6S0jrBYoFH6aZlzYj4n+EnbT9E0tgO4B1n4AmlVL2JJs66OU5lF/RJaSHd3bZle5WIuNZ20VGmiBjsNjoqIiZ1W3b5HiIl11sszf6SXhSTO3yea/vFSkOcVUZ5ojtnr91kex+lN2gp3YnVXC9SrUtppG60C21/Vel9q2an5Q8rvVn/YfhJ2w9VnaMCpG5sNZY0/dRN4TJWHIQ920dFxAWSlN//C5ey2Hd
tryPpI0o3eKF6R8WsozT6JZVfXyWl8xLP0JI3+7spLbF44zguOs4zgk5U+gf3YqUP2edr6hGOcfttfpF9S9LZtk9TulusYaoV+l8vWUBEXD/4UnrRL8hf64wuSC1grZiinXl+bq3CtQxMOnvN9ttU7+y1VylN4fxeaevznvm5uew5SlPTZ0iS7S2dz/Mq6ONKb8qjnq00QljDEluNh54rbfuI2FfSrZEO8XyG0mGzJQ3vOhqdqq/Vt+k9EXFb3qm1iVJ/uA9WKOUDki61/cU8unOxpPcXrmHHqZZyRMSJmlhb9IAb5wjP/0RE6TeiJUTEi/K3R+ZFhmur8HBmHkZ9sqS1R9bxrKWhId/CNY0efPgVlz/4cKY1Q8XXE2VvUPpA20jSb5XukN9UugjbK0h6f1Q4MmGkjsF6FUt6yNBjSVXWqxypJTstP6pwDTtGxOtGn4yIE23X6KsidWurcRembubbvkPpdbta/l75cfH3XKduzw+XdHneur+20nT+KyVtWLKWiPha3rW8tdKfxzsi4vcla9DMYXxsAzHjDDxH2P6cpHM0eUFflQW6+dq1miI+QWmEax1NHuW6Uyl01PBqSdvGxMGHH1I6mbxk4HmS0wGMoyyp+IGqkhQRN6leA8bhOhbZ3sD2yvkNspY1h74/buRxjTvlqTotl1blzXopurTVuPrUTUSsUPJ6M8nr7g6V9F+SVrH9caUGhF9W6jJcup4XSTp3MCBhex3bL4yyx3/8wfY2ETFpqtP21pL+Z5rf81cbZ+DZT2nIbiVN3GlUaZpWW6Qzu06z/YyI+GntejJr8tqUQeO0kp5U+HrT8tRnri02rm2SS3GdpPPzlM1dQ7UUG1XJUxKyvUNEnD/8M9s7lKpjSBc6LVd5s16Kk5WWEFwmLT5Lqsp6r4gYNII81fZ3Ja06x9tdvE6pX9MttjdWCj47DdYVVXBERAzOf1NE3Gb7CKVlH6UcLOlk21/UxCaEpyttbHrZuC46zsAzPzg1eNSltt+kJXcv1FiXUf3gw4i4Pk/dnBkRVc6rGvIGSVcqfXAMb72u6cb8NU8TIyu1Fg4foyUPOZzquXEb7rT8NdXptFzlzXopPq10k/kJ26dI+mJEXFuygJnab+Rdl3PuZje7e9CvKiJ+bfs/K4YdaepRyHFmgSVEOrh6G6XlAq/MT1+lNOvwh2l/419pnP8nL7C9WURcPcZr9M0JSs0Pd5d0lNLUSZUFsZEOPvyh0uLLKgcf5joW2f6T7bUr3wU+XNJekl6qdE7USZJOjYhbK9Z0dURM6iDs1FW4GNvPkLS9pA1G1u+spdTOoKhIxxUcmqdgIyLurFBDlTfrpdT0fUnft7220m6+s23/Rmka8isRcW+BMnbSRPuN4WA+OHJjrgaeR4yMID9k+HGF0eOLbH9M0ieV/l7erAqHqiqNQG4aES8udcFxNh68RtKmSodS/kUVzjDpmkEfk6HGUyspjW4Ua4I4Us+6SrsnFgffiLikQh0nKzWEPFuTp25qTCMNFhjurdQf6B0RccJSfsu46qjeVdj2MyXtrDQC9umhH90p6TsR8YtSteR6tpb0BU2MeN2utCW7xinYB0bEx5f2XMF6HqzUnPIflEYGT1S6oXlKROxc4Ppv1USTysF/lb+fsw0ZXanJ3nScjlU6XGlXoZU2Zrx3sJ6zcC1nSlpQap3iOEd4njPG/+2+Gtxl3WZ7c6Xtxo+qUYjTCeCvlPTfmrgbC6VzrEr7t/xVne2tlMLObpJOV4U7H090Fd5o5M5wLaXRp2LyQv8f2f5ihbYFU/m8pP0j4seSZHtHpenZGjdSr1Da0TfslVM8N3a2v6G0ZvIEpQ+Q3+UfnWT7okJlrJH/+wSlHUCnKX2gLlA6k25OGgQa23vVHrHN9dwl6Z9KX3ca16ngOsWxBZ6OvDl2zWfzqMphSp1815D0rkq1vERpOLHmDiBJ6Q3B9mqq2EzO9ruVdtJdo9T08JCod0ZSl7oKD6xi+7NKAX1
4RLB0QL5zEHby9X9iu+i0lu29Jf29pEeP9ABaU/UaQ/6rpDMi4g7bh+Xg/t6IuCQinl6igKEF7mdJ2mow3Wj7SNU53LVrDtGSfw5TPTdWuT3LElM7lWYaplqnODZjm9JCt9k+VdIba605GKllgaSPSlo5Ih5te0ulrtTFetDYvl/SLzXRQ2TwD6PaVKztlSLi3jz1ubmkG2r9fdleqDSldbGGdveVmkrKH+BSmq5ZXWnBciitubo10gnqRdjeRKmvzAc0+U75TqU+K8WD8tA0+Y65ro9KemdEbFuhlmuVNq38JT9eRdLCiHhi6Vq6wB06By7XM7wVflWl5sD3RcTbS9ZRQ9GV2XOd7fdL+nBE3JYfryvprRFxWIVyBt02r9TkPkk1Gt0dqSWbyZVuVFbjTJspObXmPyYirsqLUH+qFDLWs/22iPjazP8LY3FfRBxb4boDR488PmLo+6J3bXn0+npJz3A6TmLr/KNrKo4KDkLo8yQdGxGn5ZGVGk6QdGHeARqSXiSp6DqVjunUiO0UNynn267So872BpLeriV3Lo9ltIkRnoI8xeGLpRehDl33KkmfkXSFhjqyRoXmjLZ/FhHbDv/5DO5YC9fxQkmPlXRFRJxZ8tojdVwVEU/O3x8kaeeIeKHth0k6ffQ1VKimIyX9QdI3NTkg3zLd72ldXn/xUaWgbkl/I+ngiCh6XEyu5buSblBaiPo0pZHKCyNifulacj1bKf15SNJ5NXaAdkluv/HliKje1NT2ekMP5ym9Xj4REUXPmMy1nKU06vU2pY0Rr1A6peEd47geIzxlreB0gOlgqHc1SatUquWmiJix2V5B1ZvJ2f6U0l3Gv0t6T24sV7q/y8DwuqrdlOf4I+L3rtdheLDT5OCh50IVOmK7/kncA4dJ2nowzZjvVr+vwufjZS9R2ijy0UiN5B6uyX9XReXdnsV3fHZVbr/xYNfvnC6lUabBLrr7lHZSv7pSLQ+OiM/n3Y2DDRJju+km8JT1FUnn2D5e6QX3KtUb6r3Y9geUFk8P37HXeJMabib3VaVmcu8tXMNOSusOFtleXdKPVb6h3cBttp+vNBS+g/Kbke0VJa1Wo6CI6MSUn7txEvfAvJE1VTer0tESuT/RN4Ye/07pwFl0x/Wq3Dk9X68T/5azwc7l3+UbmRslPWJcFyPwFBQRH7Z9haRnKaXr91ScOhlMi2w39FzRbem2V1Uaxnys0tTaMyqugbgnIhZJ6cPDFYdSJL1e0ickPUzSQTFxsN+zVGn7fg6Bb1HaSfe6PBL3hIj4buFSts+Lcy+PiHfbPlr1GtqdnvuIDNZUvVTS9yrVgu4ruiNplKfogD0s6nTCfm9ep/hWpc7ta2mM65pYw4NqbJ+klPB/LOm5kq6LiIMq1fInpTNupBRGN82Pa+7S2jEifjLy3BJnWhWq5SSlofB9I2LzPB3704jYsnAdg/VeF0jaQ2lU5cqIeFzJOnItH5L0M010Kz9P0nbjWn8A/DXyzMKoxY0io84RR0UReArK/UIGf+ArKx2seldErFWwhgVKW2evz4/fpbQt8XpJB0TEdQVruSLyeWt5uubCGgu48/U3mennNfpKTbWgveIi94si4ukjC8sXll4Ua/twpTvBZ2miNf7nIuLwknXkWqb6+ym+2B79UHpH0gx1DHfEVv7+dkkXR8RlhWt5vKRjJT0030htIekFETGWJQ1MaRUUEZOGMfOuoKI9GCS9T3kaK68T2Ueps/BTlXZt7V6wlsXn+0TEfTVnkaYKNLbXl3RzFL4rcMfOr8ruyaM6IUm2N9XQ2q9SogMncdt+o6T9JT3G9uVDP1pTUvHRN/TGiUo7kp6voR1JFep4mtJht99WCj3Pk/Qfkt5g+5SI+HDBWo5TWlz/GUmKiMttf1VjWsNJ4KkoIr5lu3SL78gLHKU0JfD53JfhYtv7F65lvu078veWtFp+PBhiLTnytZ2kD0q6RWmx8gmS1pc0z/a+EXFGqVqURv/WUPr
3ORyS71BapFvDEZLOkPRI2ycqLaZ+ZamLu1sncX9V6diRJRoPzuVt+liqojuSZqpDqRP2HyXJ9hFKOwt3Upq2Lhl4Vo90GO/wc2Nbx0ngKWjkzXqeUsouPado22tI+pPStMCnhn626tS/ZTwiotZoxVT+RdI7Ja2tdOLzcyPiAttPVFqUWizwRPfOr1JEnG37EqXRQUs6MCJuKlhCZ07iziNKtyuNjALLquiOpBlsrMmtL+6VtElE/Nl26VHbm/Jo8WDkeE+NcXchgaesBUPf36d0cNr/KVzD/5N0mdJowTURcZEk2X6qKm1jzS/430bEX2zvrHQQ5Jcjd6QuZMWIOCvXc1REXCBJEXFtxam26udXeeJIh4HBa2Rj2xsXbGNwZ57eu1JTnMQN9EDRHUkz+KqkC2yflh8vkPQ1p1PUry5RgO23K3VPf5Okz0p6ou0blHoC7TO267Joee6xvZGkhyidb3N/fu7hSh/6v6lQz2VKo12PUurB822lLc9/V7CGxQtQRxejVlwoXPX8qlzDD2b4cZQKX3nYXZrmJO6IeE2JOoAWOJ2nNdhd+JPBjW/B639SaVr8TRFxfg5b8yIfODu26xJ4xs/2MZrhTjQiDihYjiTJ9jkR8aylPVeolksiYivbB0u6OyKO8RTHcIy5hkVKzcCs1NxvsM7JSgtjVypVy1BNF0fE05b+K+cOp1b0L46Jk7jXlHRKRDynbmXAzErvSOq6PHp8jKRrlf5cho84GsvIMVNaZQzS8w6SNtPEibl7afJhcmOXm/2tLml9p8NLB1MDa0nasGQtQ+61vbfSroXBtF/RgNGx9UQD38kLyTtxfpXtzZVev8Nbar9cuIzR9Qf3KI0MAl1XdEdS10XEJbYPlXSqUt+zwaDA2BrgEngKiIgvSZLtV0raJSLuzY8/LemswuW8XtJBSuHmYk0EnjuU+prUsJ/SNs33RcSvnE5K/0qlWrqkS+dXHSFpZ6XA8z2lRpE/kVQ68HASN/qq6I6kLrP9EKU1PI+RtGtELCxyXaa0yrH9c6XjE27Jj9eVdEHUOaX2zRFxTOnrop+cjkSZL+nSiJhv+6FKDf8WLOW3jqMWTuJG79g+XdI/Kk3BbpV3JL06Ip5bubTibP9SqQ3IcSX7nDHCU9YHJV1i+4f58TMlHVmjkLxOZnstuQOo9B374MN09EV/u9JU4Hsj4ubSNXWBu3N+lST9OSLut32f7bUk/UEVRpokTuJGb021I+nldUuqZtuIKN50kcBT1heVdtscpBR03qV0QGRxtk9Qmje9TBM7gELlpyik1MRtkdJ2SUl6mdJU2+1Kf2bFRxE64nilacft8+PfSjpFUo3Ac5HtdZTWIVws6Y+qd0o50Ec3KP2b/oGk9ZSWEbxC0lE1i6phEHZs76D0WbiJUh4ZNJ0dy80UU1oF2R6sRN81Ip6Up7TOioitK9RyjaTNSh+bME0t50fEDlM956HztuaarpxfNUVdj5K0VkRcvrRfCyCxfYak25RGJ4fbTBxdq6babF+r1ItotPXGWEb1GeEpa9s8d3upJEXErbZXrlTLlUqjS1WaDY5Yw/a2EfEzSbK9jdLRCtIcXdSXdeL8qnzt05R2F54WBQ+YBRryCNonLOH2iDi91MUIPGXda3sFTXyAbaCh3gOFrS/patsXavKW5xdUqOU1kr6Qj7yw0lDva3Izqg9UqKcrqp5fNeJjkl4q6QP5NXOSpO9GxN2V6gH65t9tPyUirqhdSIf8wPZHlI6GGf4cGssaPaa0CrL9cqUPja2UttLuKemwiDilQi3PnOr5fI5TFbntugsfKdFpth+sifOrLih8ftVU9ayg1CPjtZKeU/KAV6CPhjZlrCjpcZJ+qfThPlivskXF8qqappP72Dq4E3gKy4dRPkvpxX5ORFxTsZZNJD0uIr6fdwStMO7W3iPX3ycivmL7rZqiE3VEfKxULV0yxflVkxQ8v2qSPL22QBOh/bsR8eYatQB9kd9np9WVA4L
nAqa0CouIa5VaaVdl+7WSXqe0W2BTSRspndtU8miJB+X/rjHFz+ZyEp9pEePYupDOxPZJkrZVmmL7pKQfDs5hAzA9As30cj+v90vaMCKea3szpV51nx/L9RjhmZvygZ3bSPrZ0A6gKjuibH9J0oGDqay8e+3oiHhV6VowNdvPkXR2RCxa6i8GgGWQmzEeL+nQ3NB0RaXmpmP5HGKEZ+76S0TcM2hznl9otdLvFsPrdvLutWIHh3ZZR86vkqTzJB1iuwtNEAH0mO0VI+I+SetHxMm2D5GkiLgvH+Q8FvPG9T+MzvuR7XdKWs32bkoN7b5TqZZ5eVRHkmR7PRHGB+dXHZO/dpH0YUk1dtFJ6S7sHk1ugjgnDz0E8FcbNC29K2/MGOxc3k6p4exYzPkPlTnsHUrbwa9QOlD0e5I+V6mWo5W2bH5d6YX/Eknvq1RLl+ypifOr9hucX1Wplk0j4qX5VHtFxJ89cgoiACyjwXvHWyR9W9Kmts+XtIHS+95YEHjmINvzJF0eEZsrHRVQVUR82fZFSotxLWmPiLi6clld0Jnzq9ShJogAem8D22/J339T6YbbSu8pz5Y0li7uBJ45KH+ILszrMX5dux5JygGHkDNZl86v6lITRAD9toLS7tzRUeLVx3lRdmnNUbbPlbS10gfoXYPnK3VaxlLUPL8qjwjuKekcdagJIoB+sn1JRMzYb2ws1yXwzE1d7LSMyUbOr7prab9+zLWcFxE71awBQBuGD0Quel0Cz9wzsoYHHZVD6UslPU9pJK7a+VW2D5f051zD8IjgLaVrAdBvtter8d5B4Jmj8jqMQ7qyhgfT68L5VbZ/pamP/6i1iBoAZoVFy3PXwyVdlU++Zg1PR01xftWXKpWymaT9Je2oFHx+rHQUCQD0AiM8cxRreLpv5Pyqk1Xx/CrbJ0u6Q9KJ+am9Ja0TES+pUQ8AzBaBB+ioLp1fZXthRMxf2nMA0FUcLTFH2b7T9h35627bi2zfUbsuTDI4v+qzkmT7cbafX6mWS3Pbd+VatpV0fqVaAGDWWMMzR0XEmsOPbb9Q6fR0dMfxSg0Hh8+vOkVSjQM7t5W0r+3BIveNJV1j+wpJERFbVKgJAJYZU1pYzPYFEbHd0n8lSrB9UUQ8fbhnRa1pJNubzPTziLi+VC0AsDwY4ZmjbO8x9HCepKdrim3HqKoz51cRaAD0HYFn7low9P19kq6TxJb0buH8KgB4gBB45q55kg6MiNskyfa6ko6W9KqaRSHJ3bDXlbSHJs6vOpDzqwBg+bCGZ46a6iyTWuebYGqcXwUADxy2pc9d8/KojqR0tokY8euas22/zfYjba83+KpdFAD0ESM8c5TtfSUdIunrSotiXyLpfRFxQtXCsBjnVwHAA4fAM4fZ3kzpUEpLOicirq5cEobkHVpLnF8VEX+uWhgA9BCBB+gozq8CgAcOgQfoKM6vAoAHDouWge7i/CoAeIAwwgN0lO1rJD1B0qTzqyTdL86vAoBZIfAAHcX5VQDwwCHwAACA5rGGBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8/4XE6AmpY7xenIAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 720x432 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,6))\n", "sns.heatmap(df.isna(), cbar=False, cmap='viridis', yticklabels=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import IterativeImputer" ] }, { "cell_type": "code", "execution_count": 10, 
"metadata": {}, "outputs": [], "source": [ "from sklearn.experimental import enable_iterative_imputer\n", "from sklearn.impute import IterativeImputer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create IterativeImputer object with max_iterations and random_state=0" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "imputer = IterativeImputer(max_iter=10, random_state=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional - converting df into numpy array" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "data = df.values" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "X = data[:, :-1]\n", "y = data[:, -1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fit the imputer model on dataset to perform iterative multivariate imputation" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "IterativeImputer(random_state=0)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "imputer.fit(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Trained imputer model is applied to dataset to create a copy of dataset with all filled missing values using transform( ) " ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "X_transform = imputer.transform(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sanity Check: Whether missing values are filled or not" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing cells: 645\n" ] } ], "source": [ "print(f\"Missing cells: {sum(np.isnan(X).flatten())}\")" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", 
"text": [ "Missing cells: 0\n" ] } ], "source": [ "print(f\"Missing cells: {sum(np.isnan(X_transform).flatten())}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's try to visualize the missing values." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<AxesSubplot:>" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjwAAAGqCAYAAAAP2J5ZAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAtGklEQVR4nO3de7zu5Zz/8fd7d05n5VAUckzaiQ6qSUUTw/YjhUYTOcuofohJpeR8yPxoCCEkphLC6KQQjTSddmdjhkLxM50TqXaf+eO67r3ude+11t5r676u7/dar+fjsR6t+157+37sfe/7fn+vw+dyRAgAAKBl82oXAAAAMG4EHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzVtxph/uNm8v9qwDAIBeOPv+UzzdzxjhAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANG/F2gUAAIDZOfPGhbVLWGz3DefXLmGZEHgAAOiZvoSMLiHwAADQM4zwzB6BBwCAnulLyOgSFi0DAIDmEXgAAEDzmNICAKBnWMMzewQeAAB6pi8ho0sIPAAA9AwjPLNH4AEAoGf6EjK6hEXLAACgeYzwLIeuDCWS8AEAWDYEnuVA0AAAoF+Y0gIAAM1jhAcAgJ7pytIKqT+zHgQeAAB6pi8ho0uY0gIAAM0j8AAAgOYxpQUAQM+whmf2GOEBAADNY4QHAICe6cuoSpcwwgMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmken5eXQlTNM6LQJAMCyIfAsB4IGAAD9wpQWAABoHoEHAAA0j8ADAACaR+ABAADNY9EymtOVXXQSC9wBoCsIPGgOIQMAMIopLQAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j7O0lgOHUwIA0C8EnuVAyAAAoF+Y0gIAAM0j8AAAgOYReAAAQPMIPAAAoHksWkZz2EUHABhF4EFzCBkAgFEEnuXACAIAAP1C4FkOhAwAAPqFRcsAAKB5BB4AANA8prTQHNZYYTZ4vaCPeN3OniNi2h/uNm+v6X8IAADQIWfff4qn+xlTWgAAoHlMaQEAOo3pmyXxZzJ7BB4AQKf15QO1JP5MZo8pLQAA0DwCDwAAaB6BBwAANI/AAwAAmsei5eXQldXxLFoDAGDZEHiWA0EDAIB+YUoLAAA0j8ADAACaR+ABAADNI/AAAIDmsWgZwJzXhZ2XbIbAbHXhdSv157VL4AEw5/XlDRsYxut2dpjSAgAAzSPwAACA5jGlBQBAz3Rl/Y7Un6k1Ag8AAD3Tl5DRJQSeWSJVAwDQPwSeWSJkAADQPyxaBgAAzSP
AzSPwAACA5hF4AABA8/4XE6AmpY7xenIAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 720x432 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,6))\n", "sns.heatmap(df.isna(), cbar=False, cmap='viridis', yticklabels=False)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "ename": "AttributeError", "evalue": "'numpy.ndarray' object has no attribute 'isna'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m<ipython-input-19-d6d134095446>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfigure\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfigsize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0msns\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mheatmap\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX_transform\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0misna\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcbar\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcmap\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'viridis'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0myticklabels\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mAttributeError\u001b[0m: 'numpy.ndarray' object has no attribute 'isna'" ] }, { "data": { "text/plain": [ "<Figure size 720x432 with 0 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ 
"plt.figure(figsize=(10,6))\n", "sns.heatmap(X_transform.isna(), cbar=False, cmap='viridis', yticklabels=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What's the issue here?\n", "#### Hint: `.isna()` is a pandas method, so `X_transform` (a NumPy array) must be converted to a DataFrame first" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>0</th>\n", " <th>1</th>\n", " <th>2</th>\n", " <th>3</th>\n", " <th>4</th>\n", " <th>5</th>\n", " <th>6</th>\n", " <th>7</th>\n", " <th>8</th>\n", " <th>9</th>\n", " <th>10</th>\n", " <th>11</th>\n", " <th>12</th>\n", " <th>13</th>\n", " <th>14</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1.0</td>\n", " <td>39.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>195.0</td>\n", " <td>106.0</td>\n", " <td>70.0</td>\n", " <td>26.97</td>\n", " <td>80.0</td>\n", " <td>77.00000</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>0.0</td>\n", " <td>46.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>250.0</td>\n", " <td>121.0</td>\n", " <td>81.0</td>\n", " <td>28.73</td>\n", " <td>95.0</td>\n", " <td>76.00000</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1.0</td>\n", " <td>48.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>20.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>245.0</td>\n", " <td>127.5</td>\n", " 
<td>80.0</td>\n", " <td>25.34</td>\n", " <td>75.0</td>\n", " <td>70.00000</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>0.0</td>\n", " <td>61.0</td>\n", " <td>3.0</td>\n", " <td>1.0</td>\n", " <td>30.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>225.0</td>\n", " <td>150.0</td>\n", " <td>95.0</td>\n", " <td>28.58</td>\n", " <td>65.0</td>\n", " <td>103.00000</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>0.0</td>\n", " <td>46.0</td>\n", " <td>3.0</td>\n", " <td>1.0</td>\n", " <td>23.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>285.0</td>\n", " <td>130.0</td>\n", " <td>84.0</td>\n", " <td>23.10</td>\n", " <td>85.0</td>\n", " <td>85.00000</td>\n", " </tr>\n", " <tr>\n", " <th>...</th>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " </tr>\n", " <tr>\n", " <th>4233</th>\n", " <td>1.0</td>\n", " <td>50.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>313.0</td>\n", " <td>179.0</td>\n", " <td>92.0</td>\n", " <td>25.97</td>\n", " <td>66.0</td>\n", " <td>86.00000</td>\n", " </tr>\n", " <tr>\n", " <th>4234</th>\n", " <td>1.0</td>\n", " <td>51.0</td>\n", " <td>3.0</td>\n", " <td>1.0</td>\n", " <td>43.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>207.0</td>\n", " <td>126.5</td>\n", " <td>80.0</td>\n", " <td>19.71</td>\n", " <td>65.0</td>\n", " <td>68.00000</td>\n", " </tr>\n", " <tr>\n", " <th>4235</th>\n", " <td>0.0</td>\n", " <td>48.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>20.0</td>\n", " <td>0.01547</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " 
<td>0.0</td>\n", " <td>248.0</td>\n", " <td>131.0</td>\n", " <td>72.0</td>\n", " <td>22.00</td>\n", " <td>84.0</td>\n", " <td>86.00000</td>\n", " </tr>\n", " <tr>\n", " <th>4236</th>\n", " <td>0.0</td>\n", " <td>44.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>15.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>210.0</td>\n", " <td>126.5</td>\n", " <td>87.0</td>\n", " <td>19.16</td>\n", " <td>86.0</td>\n", " <td>77.74894</td>\n", " </tr>\n", " <tr>\n", " <th>4237</th>\n", " <td>0.0</td>\n", " <td>52.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.00000</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>269.0</td>\n", " <td>133.5</td>\n", " <td>83.0</td>\n", " <td>21.47</td>\n", " <td>80.0</td>\n", " <td>107.00000</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>4238 rows × 15 columns</p>\n", "</div>" ], "text/plain": [ " 0 1 2 3 4 5 6 7 8 9 10 11 \\\n", "0 1.0 39.0 4.0 0.0 0.0 0.00000 0.0 0.0 0.0 195.0 106.0 70.0 \n", "1 0.0 46.0 2.0 0.0 0.0 0.00000 0.0 0.0 0.0 250.0 121.0 81.0 \n", "2 1.0 48.0 1.0 1.0 20.0 0.00000 0.0 0.0 0.0 245.0 127.5 80.0 \n", "3 0.0 61.0 3.0 1.0 30.0 0.00000 0.0 1.0 0.0 225.0 150.0 95.0 \n", "4 0.0 46.0 3.0 1.0 23.0 0.00000 0.0 0.0 0.0 285.0 130.0 84.0 \n", "... ... ... ... ... ... ... ... ... ... ... ... ... \n", "4233 1.0 50.0 1.0 1.0 1.0 0.00000 0.0 1.0 0.0 313.0 179.0 92.0 \n", "4234 1.0 51.0 3.0 1.0 43.0 0.00000 0.0 0.0 0.0 207.0 126.5 80.0 \n", "4235 0.0 48.0 2.0 1.0 20.0 0.01547 0.0 0.0 0.0 248.0 131.0 72.0 \n", "4236 0.0 44.0 1.0 1.0 15.0 0.00000 0.0 0.0 0.0 210.0 126.5 87.0 \n", "4237 0.0 52.0 2.0 0.0 0.0 0.00000 0.0 0.0 0.0 269.0 133.5 83.0 \n", "\n", " 12 13 14 \n", "0 26.97 80.0 77.00000 \n", "1 28.73 95.0 76.00000 \n", "2 25.34 75.0 70.00000 \n", "3 28.58 65.0 103.00000 \n", "4 23.10 85.0 85.00000 \n", "... ... ... ... 
\n", "4233 25.97 66.0 86.00000 \n", "4234 19.71 65.0 68.00000 \n", "4235 22.00 84.0 86.00000 \n", "4236 19.16 86.0 77.74894 \n", "4237 21.47 80.0 107.00000 \n", "\n", "[4238 rows x 15 columns]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_transform = pd.DataFrame(data=X_transform)\n", "df_transform" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<AxesSubplot:>" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjwAAAFlCAYAAADvZjI4AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAANFklEQVR4nO3cW4ytB1nG8eelRaFVoYBV6a4pmAYhTSmVNFVMo1RMwWYjGpISTJpo5AZjMRqFNCEhxguD8XBhNAa0RLEmnDw0EdrgAW9QoXSXXTelRSst1G6iUdQmHOzrxfq2GXY3dV2w5lv7ze+XTOZ08T1ZM/PNf61vzVR3BwBgsietPQAAYNcEDwAwnuABAMYTPADAeIIHABhP8AAA4537RJ982ZNe7W/WAYCzwh2Pvau+2uc8wgMAjCd4AIDxBA8AMJ7gAQDGEzwAwHiCBwAYT/AAAOMJHgBgPMEDAIwneACA8QQPADCe4AEAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAwnuABAMYTPADAeIIHABhP8AAA4wkeAGA8wQMAjCd4AIDxBA8AMJ7gAQDGEzwAwHiCBwAYT/AAAOMJHgBgPMEDAIwneACA8QQPADCe4AEAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAwnuABAMYTPADAeIIHABhP8AAA4wkeAGA8wQMAjCd4AIDxBA8AMJ7gAQDGEzwAwHiCBwAYT/AAAOMJHgBgPMEDAIwneACA8QQPADCe4AEAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAwnuABAMYTPADAeIIHABhP8AAA4wkeAGA8wQMAjCd4AIDxBA8AMJ7gAQDGEzwAwHiCBwAYT/AAAOMJHgBgPMEDAIwneACA8QQPADCe4AEAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAwnuABAMYTPADAeIIHABhP8AAA4wkeAGA8wQMAjCd4AIDxBA8AMJ7gAQDGEzwAwHiCBwAYT/AAAOMJHgBgPMEDAIwneACA8QQPADCe4AEAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAwnuABAMYTPADAeIIHABhP8AAA4wkeAGA8wQMAjCd4AIDxBA8AMJ7gAQDGEzwAwHiCBwAYT/AAAOMJHgBgPMEDAIwneACA8QQPADCe4AEAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAwnuABAMYTPADAeIIHABhP8AAA4wkeAGA8wQMAjCd4AIDxBA8AMJ7gAQDGEzwAwHiCBwAYT/AAAOMJHgBgPMEDAIwneACA8QQPADCe4AEAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAwn
v4lfjaoqkuSvCjJ36485dSlo7uSnExyR3evvinJr2fzg/HYyjsO6iS3V9VHq2pf/rHXc5N8LsnvLZf/3lZV56896oAbkty69oju/kySX8nmnuXDSf6ju29fd1WOJ7mmqp5ZVecleUWSi1fedNC3dPfDyebOWpILV95zNvjxJH++9ogkqapfqqoHk7w2h/8Iz5n2HE3yme4+tsvj7Dp46gwfW/0Rgn1WVd+Q5D1J3nBaia+iu/9neejzSJKrlofaV1NV1yc52d0fXXPHGbyku6/M5vLt66vqmrUHZfOoxZVJfqu7X5Tkv3P4lx7OqKq+LsnRJO/agy0XZPOIxXOSPDvJ+VX1Y2tu6u4TSX45m0si709yLJvL3pyFqurmbL5+71x7S5J0983dfXE2e35qzS1L0N+cQwivXQfPQ/nKeyVHsv5DxXurqp6cTey8s7vfu/aeg5ZLIX+V9Z/79JIkR6vqgWwukb60qv5g3UlJd392eX0ym+ekXLXuoiSbn7+HDjwq9+5sAmgfvDzJnd39yNpDkvxAkn/q7s9195eSvDfJ96y8Kd399u6+sruvyeYSwH1rbzrgkar6tiRZXp9cec/eqqobk1yf5LW9f/8H5g+T/OjKG74jmzsbx5bz+pEkd1bVt36tD7Tr4Pn7JJdW1XOWe3Q3JPnTHR/zrFRVlc1zLU5096+uvSdJquqbT/1VQVU9NZtfDJ9Yc1N3v6m7j3T3Jdl8P/1Fd696b7yqzq+qbzz1dpIfzOaSxKq6+1+SPFhVz1s+dG2Sf1hx0kGvyR5czlp8OsnVVXXe8nN4bfbgCfFVdeHy+tuzeTLuvtxeyeY8fuPy9o1J/mTFLXurqq5L8gtJjnb3o2vvSZLTnvx+NOuf0z/e3Rd29yXLef2hJFcu56+v+cF2+pLNtedPJvlUkpt3fbwtN92azbX6Ly037k/swabvzeZy391J7lpeXrHypsuTfGzZdDzJm9e+nU7b931JbtuDHc/N5pLDsST37Mv3+bLtiiQfWb6Gf5zkgj3YdF6Sf03ytLW3HNj0lmxO/MeT/H6Sr9+DTX+TTaAeS3Ltijsed75M8sxs/jrrvuX1M/Zg06uWt7+Q5JEkH9iDTfdn8zzWU+f0396DTe9Zvs/vTvJnSS5ae9Npn38gybN2cWz/aRkAGM9/WgYAxhM8AMB4ggcAGE/wAADjCR4AYDzBAwCMJ3gAgPEEDwAw3v8CT9kCKMg1u+YAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 720x432 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,6))\n", "sns.heatmap(df_transform.isna(), cbar=False, cmap='viridis', yticklabels=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Check if these datasets contain missing data\n", "### Load the datasets" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "X_train = pd.read_csv(\"X_train.csv\")\n", "Y_train = pd.read_csv(\"Y_train.csv\")\n", "Y_test = pd.read_csv(\"Y_test.csv\")\n", "X_test = pd.read_csv(\"X_test.csv\")" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "((384, 12), (384, 1), (96, 12), (96, 1))" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_train.shape, Y_train.shape, X_test.shape, Y_test.shape" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<AxesSubplot:>" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjwAAAG8CAYAAADaV3/fAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAuiElEQVR4nO3de7ztU73/8fd7u1+yI5coIoUcoVDEqSg6/U6ccpd0c1JH6cg5/fqhU0qlju6U0kWSKCdEJbfYIpEt11AuHdXRhRLHnT6/P8aYe8+1rDXXXMvec4zv2K/n47Eee83v3OuxPns/5uU9x+UzHBECAABo2azSBQAAACxsBB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM1bfNCd28/ajT3rAACgE8792yme7D5GeAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAoHkEHgAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzSPwAACA5hF4AABA8wg8AACgeQQeAADQPAIPAABoHoEHAAA0j8ADAACaR+ABAADNI/AAAIDmEXgAAEDzCDwAAKB5BB4AANA8Ag8AAGgegQcAADSPwAMAAJpH4AEAAM0j8AAAgOYReAAAQPMIPAAAo
CK+NO76W5UWpVa7Yygn6e2Vhg03VnoTPmmyBWM1ym8C40UXegj1y4uA3xoRby1dy1TGDZ0/Tq2fLN3R419y47jHyTsqq2d7ZaU1X3spNWQ9LSL+vWxVw7H9NEnP0Nj1gReVq2g4Xe1/ZPvNkk6OCY57sT17FOt4i+3Sct/5K3mI7tNFChnA9qqaf6r7lfnyZkrD/K+OiD8UKm1a8k6QvZS2kH4wIo4qXNIip6svUlg43OHz4mw/SdJrJL1W0npKI7B7RERXjk+R7Y8qjUb9Qn2L3GsdERzXJ2tZzV8s3pU+WbJ9/vh1rxNdW6g11LDGz/btEbFW6TomY3s7zZ+OuD4iflSynmHloPOPSmFnbaU1JV+NiN+VrGtYTkcDHCRprYjYL3eNXj8ivle4tIHGNQebpdSi/ikR8YpCJQ3N9quUdvP1PvlW/YLa90ZgjR3qr73u2zS/7vGqHsW0/YCkyyW9V2ldY9i+teaax8tTKBtHRNd2xg3kCs8Aczo/c1mlxe0v1fzH/AqSzhrlhogSa3gmUvX6khxwOhFyenK3340knSXpAxFxXeGSZuI4pR1avcMVf6t0sGLVgUdjm4M9qjSd+J1CtUzXpyXtrLRjpfynoSl0tWFi5PPiOuoQpdGRYyR90/a3CtczE7dKWkLdawUwlRrPAHurUnuUNZRez3vv9/dI+twoC2GEp1G2/ybpvnyzM598+zmfm9XfBMz21RGxSenaWmX7Akkvi4iuHAYpaai+KlWqYZh/pmw/U2n0eE9Jz1buih4Rvyxa2BBye5RNlAJCp7bUD1Jrw8S8pvSQiDi8ZB0jG+Hx4LN6lhlVHYuKiJhVuoYF4OHcByYkyfa6qvgTme0zNWAHRa3rA8b5v5J+YHuOxr4R1H5Wz5gdcLm52WaT/N3i8jD/cpJWdjpqp3+Yf41ihU1DRNyq1FDzw7afq7Sm5yylBpC1O0N9bSMaUn4EYwIR8Zjt/6M0XV7MyAJPV4eeUdT7Jf1Q0pp5x9zWkt5YtKLBPp7/3FmpJ8w38u29lPrbdMGHlXoILS1pycK1TKmrfVVU0TD/gpD7CB2cv6pX627Dxp1jexdJp5aaLq9iSguYjO2nSNpS6Q3hp9GBc3tsXxQRL57qWo1604il65iurvZVsX1AV3dN2t5Z0sckrar0/OzSdHlv0fgYXVp4PZFap7SkebM8yyntintABR4vtSxaBibzEs0/yHIJzT85vWar2H5mHvKX7XVUf6PKnvNs7xAR55QuZDoi4uAu9lWJiKPcwUNPs/+UtGNE3FC6kBnoD/VLK/USGtgqoAZdOwOsXw2zPIzwoFq2Py/pWUrdoiVpD0m3RMTby1U1Ndv/oDSdcmu+tLZS48GzixU1pL5PYQ9JekQd+dTetb4qPZ7k0NMuLJ61fUlEbF26jgXF9sURsU3pOgbp2hlg/Wxb0t6S1omIw3Nn7tX7Oi4v/BoIPKiV7eslbdSb77U9S+nJXf0RDbkH0gb55o2t9fuoTVf7qrijh55Kku3PKK1VO11jF7ifWqqmYeXu5z2zlEZ8/qXWHaD9a9U0tungw5KO7cJ0ru1jJP1N0nYR8Zy8WP+ciNhiVDW0sJMH7bpJUn+7gjUlVXcESY/t/9t3c6eIuDp/PWT7I8UKG4Lt1/V9v/W4+97x+J+oTq+vStf0Dj3tohWU3nx3UDojaUdJrypa0fA+0fd1hNKOvt2LVjRARByRp4SOjIgV8teTIuIpXQg72Qvz6PyDkpQbJI50YwQjPKhW3hq9hVJXV+XvL1X+hFPbdEX/cPMEQ89VHy3R5dql7vZVyX2PNlV6jHfp0FOMkO0NIuLGcSNT80TElRNdr4nty5SayP4sIp5vexWlEZ6RLbJm0TJq9r7SBUyTJ/l+otu16XLtUnf7qhxWuoCZyr2E9lXqgbR073pEvLlYUUOyPVup7UVv5+QcpXMGF/oBljP0b5LeojQiNV5I2m605czIZ5U2naxm+8OSdlU6nmRkCDyoVkTMy
adJPzsizstNCBePiHtL1zaJmOT7iW7Xpsu1d7avSlR+mvsUTpB0o6RXSPqg0oLUruzY+qrSdGJvGmsfpaNsdi5W0QAR8Zb857ala5mpiDjR9lzN30n26lHv8GNKC9Wy/RZJ+0laKSLWzYeHfqHWtvu2H1M6zqPXPbx/ceHSEVHtGhPb90u6WanWdfP3yrefGRHLlaptGPmxcYSkDTV2tKHqviq2t5R0lKTnKK1nWEzSfbXvipPm93yxfU1EbGx7CUlnR0T1ow22r4qITae6Vovc82hSXVgoLs1bLN5rM3LJqKfiGOFBzd4u6QWSLpOkiPiV7VXLljS5iFisdA1PwMhOLF5IjlOaoviUpG0lvUndmIo7Wmk7/SlKO4Ver3QuVRc8kv+82/ZGkn6v1IKhCx6wvU1EXCzNW6j/QOGaBtkx/7mq0jqY3mHW20q6UFL1gcf2+5T6HX1H6bl5nO1TIuJDo6qBwIOaPRQRD6f2DfPOR2JIciGIiP8e5u/ZvjQitlrY9czAMhFxvm3nf8thtn+sFIKqFhE3214sIh5TehP4SemahnRs3lr8H0rrp5bP33fBv0g6Pq/lsaQ/q+JjayLiTZJk+3tKbQzuyLdXV3eOItlL0vMi4kFpXu+sKyUReABJc2z3zknaXtL+ks4sXNOibump/0oRD+Y+Tb/K2+h/p/RpuHb3215S0lW2/1PSHUqNH6sXEV/O386RVPXU4XgRcZWkTWyvkG/fM/gnqrF2L+xkf5C0XqlipunXSq8fD+bbS0m6ZZQFsIYH1cpvYPsq9fmwpLMlfbmLTdpaUesWddtbKC2YfbLSicwrKPUs+WnJuqaSF+X/QWn9zrskzZb0+Yi4eeAPViCPjhwm6e/zpQslHV7xTqd5bD9ZafpwbY090qP2NgZHK015nqQ02r2npJsj4oCihQ3B9ulKrUXOVap9e0kXS/qjNJr/ewIPqpZ7NSgi/lS6FtQbeLrK9nKSHoiIv+Xbi0laKiLuH/yT5eXeR9dJ6u2Q20fSJhFR5U6nfnna8KeSrlXq/iupG7v9bL9G87fTXxQRXThfULbfMOj+UfzfE3hQnXzmyvslvUPzT2F+TNJREfHBkrW1yvZSwxzLUOtpzLbPlbRbRNydb68o6eSIeEXRwqZg+6eSXh4R/5tvL6/UjO1FZSubWtd2OvXrcnAf16pjWUmLVdyqY4w8fdubgrspIh4Z9PcXNI6WQI0OlLS1pC1y6/SVJL1Q0ta231W0snZdKs07zHKQfaa4v5SVe2FHmte2vgtreJbuhR1Jyt8vW7Ce6XjA9rzDNjuw06nfCbbfYnt12yv1vkoXNZXcquO/JH0xX3qa0llm1bP9Ukm/Ulpk/XlJv7T94kE/s6CxaBk1er2k7SPizt6FiLg1n/d0jtLWYyxYS+Yh5xdN1POj1+cjIq4beWXD+ZvttSLidmnep+AuDF/fZ/v5vX4ktjdTd0LD2yR9Pa/lkaS/SBo4bVGRhyUdKelQzX+chOpffN2pVh3jfELSDhFxkyTZXk9pLdJmoyqAwIMaLdEfdnoi4k+5uRkWvLcpdcp9sub3/OgJ1d/n41BJF+fz16S0xmG/gvUM60BJp9j+n3x7dUl7lCtneBFxtcbtdLJ9oCo+4LfPQZKeNdHrTOW63KpjiV7YkaSI+OWoX89Zw4PqDJpf7/LcexfY3jcivlK6jpmwvbKkLZXWfF3alTez/KK/vlLdN456XcOCZPv2iFirdB1TsX2GpD27sDi8X25dcLfSKPgBSq06fhERh5asaxi2j1NaIN6bNt9b6aigN42sBgIPatN3RMPj7lLlRzR0VVdb13f1FGnb20XEjyb7f6/1/3sqtn8TEWuWrmMqtk9TOvT0Ao09pb72bemW9M/qYKsO20spTclto1T7RUotGKbcLLGgMKWF6nT8iIauGj+N1a/mKa2uniL9EqXjASb6f6/5/3sq1b/xZqfr8Yt9q6499yW7JiI2kvSl0vVMR659bq79k8Xq6
EAwBABUwva9mjgcWOmIj859kLa9ptIU15GlaxnE9omSDu4tzu+SGmrv3AMTwMJjezVJH5G0RkS80vaGkraqdV1Ph6fiDhp0f0QU+xQ8lYh4UukaFoS85ms3pTOeniapCw38Vpd0ve3L1TftHxE7lStpaMVrJ/AA6Pc1pZPHe4sgfynpW5KqDDzq7lRcE6Gha2w/SdJrJL1WqQHeaZKeGRFPL1rYFGw/S9Jqkj4w7q6XKJ0b1wXjax85prQAzGP7ZxGxRX9H5a50zwWmYvsBSZdLeq+kiyMibN8aEVX338mnpB8SEdeMu765pPdHxKDgX5TtpZXaXjxL6SiPr0TEoyVqodMygH732X6K8hoN21tK6sJhkE+x/VnbV9qea/sz+d9RNdvPtH2m7T/Z/qPt79qu+s234w5ROrH7GEkH2163cD3DWnt82JGkiLhC6QDUmh0vaXOlsPNKTbzBYCQIPAD6HSTpDEnr2r5E0teV+n3U7mRJf5K0i6Rd8/ffKlrRcL4p6dtK6xvWkHSKUvdZLAQR8amIeKGknZQWWZ8uaQ3b78mdf2u19ID7lhlZFTOzYUS8LiK+qPTc/PtShRB4AMj2FrafmvvWvETpk/BDSkd5/LZoccNZKSIOj4jb8teHlLpG184RcUJEPJq/vqHKt0e3ICJujYgPR8RzJW0habakswqXNcjP8jlaY9jeV9LcAvVMx7xGmqWmsnpYwwNAtq9UOrX7z/lAv5OVRnY2lfSciNi1ZH1Tsf1xSVcojZZI6ZPk30XE+8tVNTXbH1XqnHuyUtDZQ9JSSgcsKiL+XKy4RZjtSyNiq9J19OTdk6cpnQHWCzibS1pS0msi4velapvKuEayVhqRuj9/HxGxwshqIfAAsH11RGySv/+cpD9FxGH5dvWLlnNvmOWUWtdLafS69yI70hfV6bB924C7o/bFtK3qX7RfE9vbStoo37w+In5Usp6uYVs6AElazPbiecj5ZRp78Gb1rxNd7Q0TEeuUrgETqnIkICIuUDoOAzNQ/QsZgJE4SdIc23dKekDSj6V5/T+q36UlzWtCuI3Sm9WPI+L0shVNLW/Z3V99dUv6QkQ8WLQwoEFMaQGQNG8L+uqSzomI+/K19SQtX+shnD22P6/U56O3w2kPSbdExNvLVTU129+WdK+kb+RLe0laMSJ2K1cVap3SwhND4AHQebavl7RR79TofFjhtRHxd2UrG6x/7dSgaxgt2xtFxHWl68CCxbZ0AC24SdJafbfXlPS4Rm0V+nkeWZMk2X6hpEsK1rNIsL2z7V/Z/qvte2zfa/ue3v2EnTYxwgOg82zPUeqncnm+tIWkS5W2v1Z7uKLtGyStL6l3gvRakm5Q2m0WEbFxqdpaZvtmSTtGxA2la8HosGgZQAve1/e9lRYB76W0ILhm/1C6gEXUHwg7ix5GeAA0wfamSqdg7y7pNkmnRsRRRYsaku1V1Xd8QETcPuCv4wmy/RlJT1U6WuKh3vWIOLVUTVj4GOEB0Fl5F9meSqM5dymdn+WI2LZoYUOyvZPSYYprSPqjpGcoTWlVvdi6ASsoTXfu0HctJBF4GsYID4DOsv03pd41+0bEzfnarV3pUGz7aknbSTovIp6XO+nuFRH7TfGjAKaJER4AXbaL0gjPBbZ/qHQmlcuWNC2PRMRdtmfZnhURF9j+WOmiWpcbPu6rNJLWP5X45mJFYaFjWzqAzoqI0yJiD0kbSLpQ0rskrWb7GNs7DPzhOtxte3mlUaoT89qSoidKLyJOUFrD8wpJcyQ9XakBJBrGlBaAptheSdJukvaIiO1K1zOI7eUkPag0KrW3pNmSToyIu4oW1rheJ2Xb10TExraXkHR27Y8XPDGM8ABoSkT8OSK+2IU3r3yEx8pK29PvknQyYWckHsl/3m17I6WguXa5cjAKBB4AKMT27krNEndT2k5/me1dy1a1SDjW9oqS/kPSGZJ+IYm1U41jSgsACsm7tLaPiD/m26so7djiLC1gAWOEB
wDKmdULO9ld4nV5obM92/anbF+Rvz5ue3bpurBw8cQCgHJ+aPts22+0/UZJ35f0g8I1LQq+KukepWnE3ZV2aB1XtCIsdExpAcCI2X6WpNUi4hLbOyud/WVJf1HapXVL0QIbZ/uqiNh0qmtoCyM8ADB6n1bu+xIRp0bEQRHxLqXRnU8XrGtR8YDtbXo3bG8t6YGC9WAE6LQMAKO3dkRcM/5iRFxhe+0C9Sxq3ibp633rdv4i6Q0F68EIEHgAYPSWHnDfMiOrYhEVEVdL2sT2Cvn2PbYPlPS4EIp2MKUFAKP3M9tvGX/R9r6S5haoZ5EUEfdExD355kFFi8FCx6JlABgx26tJOk3Sw5ofcDaXtKSk10TE70vVtqiy/ZuIWLN0HVh4CDwAUIjtbSVtlG9eHxE/KlnPosz27RGxVuk6sPAQeAAAiwTb90qa6E3PkpaJCNa1NozAAwAAmseiZQAA0DwCDwAAaB6BBwAANI/AAwAAmkfgAQAAzfv/4ITu65lrXQsAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 720x432 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,6))\n", "sns.heatmap(X_train.isna(), cbar=False, cmap='viridis', yticklabels=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Is there missing data in this dataset???" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Build a Logistic Regression model Without imputation" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "df=pd.read_csv(\"data/heart_disease.csv\")\n", "X = df[df.columns[:-1]]\n", "y = df[df.columns[-1]]" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import LogisticRegression\n", "from sklearn.metrics import accuracy_score" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "model = LogisticRegression()" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "ename": "ValueError", "evalue": "Input contains NaN, infinity or a value too large for dtype('float64').", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m 
Traceback (most recent call last)", "\u001b[0;32m<ipython-input-28-4c1a2828403e>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmodel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m~/opt/anaconda3/envs/testing/lib/python3.8/site-packages/sklearn/linear_model/_logistic.py\u001b[0m in \u001b[0;36mfit\u001b[0;34m(self, X, y, sample_weight)\u001b[0m\n\u001b[1;32m 1340\u001b[0m \u001b[0m_dtype\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat64\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat32\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1341\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1342\u001b[0;31m X, y = self._validate_data(X, y, accept_sparse='csr', dtype=_dtype,\n\u001b[0m\u001b[1;32m 1343\u001b[0m \u001b[0morder\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m\"C\"\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1344\u001b[0m accept_large_sparse=solver != 'liblinear')\n", "\u001b[0;32m~/opt/anaconda3/envs/testing/lib/python3.8/site-packages/sklearn/base.py\u001b[0m in \u001b[0;36m_validate_data\u001b[0;34m(self, X, y, reset, validate_separately, **check_params)\u001b[0m\n\u001b[1;32m 430\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcheck_array\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mcheck_y_params\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 431\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 432\u001b[0;31m 
\u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcheck_X_y\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mcheck_params\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 433\u001b[0m \u001b[0mout\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 434\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/opt/anaconda3/envs/testing/lib/python3.8/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36minner_f\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 70\u001b[0m FutureWarning)\n\u001b[1;32m 71\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m{\u001b[0m\u001b[0mk\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0marg\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mk\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0marg\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mparameters\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 72\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 73\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0minner_f\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 74\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/opt/anaconda3/envs/testing/lib/python3.8/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36mcheck_X_y\u001b[0;34m(X, y, accept_sparse, 
accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)\u001b[0m\n\u001b[1;32m 793\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"y cannot be None\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 794\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 795\u001b[0;31m X = check_array(X, accept_sparse=accept_sparse,\n\u001b[0m\u001b[1;32m 796\u001b[0m \u001b[0maccept_large_sparse\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maccept_large_sparse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 797\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0morder\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0morder\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcopy\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcopy\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/opt/anaconda3/envs/testing/lib/python3.8/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36minner_f\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 70\u001b[0m FutureWarning)\n\u001b[1;32m 71\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m{\u001b[0m\u001b[0mk\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0marg\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mk\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0marg\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mparameters\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 72\u001b[0;31m \u001b[0;32mreturn\u001b[0m 
\u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 73\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0minner_f\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 74\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/opt/anaconda3/envs/testing/lib/python3.8/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36mcheck_array\u001b[0;34m(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)\u001b[0m\n\u001b[1;32m 642\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 643\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mforce_all_finite\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 644\u001b[0;31m _assert_all_finite(array,\n\u001b[0m\u001b[1;32m 645\u001b[0m allow_nan=force_all_finite == 'allow-nan')\n\u001b[1;32m 646\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/opt/anaconda3/envs/testing/lib/python3.8/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36m_assert_all_finite\u001b[0;34m(X, allow_nan, msg_dtype)\u001b[0m\n\u001b[1;32m 94\u001b[0m not allow_nan and not np.isfinite(X).all()):\n\u001b[1;32m 95\u001b[0m \u001b[0mtype_err\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'infinity'\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mallow_nan\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0;34m'NaN, infinity'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 96\u001b[0;31m raise ValueError(\n\u001b[0m\u001b[1;32m 97\u001b[0m \u001b[0mmsg_err\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 98\u001b[0m (type_err,\n", "\u001b[0;31mValueError\u001b[0m: Input contains NaN, infinity or a value too large for dtype('float64')." 
] } ], "source": [ "model.fit(X,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Drop all rows with missing entries - Build a Logistic Regression model and benchmark the accuracy" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import LogisticRegression\n", "from sklearn.pipeline import Pipeline\n", "from sklearn.metrics import accuracy_score\n", "from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>male</th>\n", " <th>age</th>\n", " <th>education</th>\n", " <th>currentSmoker</th>\n", " <th>cigsPerDay</th>\n", " <th>BPMeds</th>\n", " <th>prevalentStroke</th>\n", " <th>prevalentHyp</th>\n", " <th>diabetes</th>\n", " <th>totChol</th>\n", " <th>sysBP</th>\n", " <th>diaBP</th>\n", " <th>BMI</th>\n", " <th>heartRate</th>\n", " <th>glucose</th>\n", " <th>TenYearCHD</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>39</td>\n", " <td>4.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>195.0</td>\n", " <td>106.0</td>\n", " <td>70.0</td>\n", " <td>26.97</td>\n", " <td>80.0</td>\n", " <td>77.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>2.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>250.0</td>\n", " 
<td>121.0</td>\n", " <td>81.0</td>\n", " <td>28.73</td>\n", " <td>95.0</td>\n", " <td>76.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1</td>\n", " <td>48</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>20.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>245.0</td>\n", " <td>127.5</td>\n", " <td>80.0</td>\n", " <td>25.34</td>\n", " <td>75.0</td>\n", " <td>70.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>0</td>\n", " <td>61</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>30.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>225.0</td>\n", " <td>150.0</td>\n", " <td>95.0</td>\n", " <td>28.58</td>\n", " <td>65.0</td>\n", " <td>103.0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>23.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>285.0</td>\n", " <td>130.0</td>\n", " <td>84.0</td>\n", " <td>23.10</td>\n", " <td>85.0</td>\n", " <td>85.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>...</th>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " </tr>\n", " <tr>\n", " <th>4233</th>\n", " <td>1</td>\n", " <td>50</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>313.0</td>\n", " <td>179.0</td>\n", " <td>92.0</td>\n", " <td>25.97</td>\n", " <td>66.0</td>\n", " <td>86.0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>4234</th>\n", " <td>1</td>\n", " <td>51</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>43.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", 
" <td>0</td>\n", " <td>0</td>\n", " <td>207.0</td>\n", " <td>126.5</td>\n", " <td>80.0</td>\n", " <td>19.71</td>\n", " <td>65.0</td>\n", " <td>68.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4235</th>\n", " <td>0</td>\n", " <td>48</td>\n", " <td>2.0</td>\n", " <td>1</td>\n", " <td>20.0</td>\n", " <td>NaN</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>248.0</td>\n", " <td>131.0</td>\n", " <td>72.0</td>\n", " <td>22.00</td>\n", " <td>84.0</td>\n", " <td>86.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4236</th>\n", " <td>0</td>\n", " <td>44</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>15.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>210.0</td>\n", " <td>126.5</td>\n", " <td>87.0</td>\n", " <td>19.16</td>\n", " <td>86.0</td>\n", " <td>NaN</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4237</th>\n", " <td>0</td>\n", " <td>52</td>\n", " <td>2.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>269.0</td>\n", " <td>133.5</td>\n", " <td>83.0</td>\n", " <td>21.47</td>\n", " <td>80.0</td>\n", " <td>107.0</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>4238 rows × 16 columns</p>\n", "</div>" ], "text/plain": [ " male age education currentSmoker cigsPerDay BPMeds \\\n", "0 1 39 4.0 0 0.0 0.0 \n", "1 0 46 2.0 0 0.0 0.0 \n", "2 1 48 1.0 1 20.0 0.0 \n", "3 0 61 3.0 1 30.0 0.0 \n", "4 0 46 3.0 1 23.0 0.0 \n", "... ... ... ... ... ... ... \n", "4233 1 50 1.0 1 1.0 0.0 \n", "4234 1 51 3.0 1 43.0 0.0 \n", "4235 0 48 2.0 1 20.0 NaN \n", "4236 0 44 1.0 1 15.0 0.0 \n", "4237 0 52 2.0 0 0.0 0.0 \n", "\n", " prevalentStroke prevalentHyp diabetes totChol sysBP diaBP BMI \\\n", "0 0 0 0 195.0 106.0 70.0 26.97 \n", "1 0 0 0 250.0 121.0 81.0 28.73 \n", "2 0 0 0 245.0 127.5 80.0 25.34 \n", "3 0 1 0 225.0 150.0 95.0 28.58 \n", "4 0 0 0 285.0 130.0 84.0 23.10 \n", "... ... ... ... ... ... ... ... 
\n", "4233 0 1 0 313.0 179.0 92.0 25.97 \n", "4234 0 0 0 207.0 126.5 80.0 19.71 \n", "4235 0 0 0 248.0 131.0 72.0 22.00 \n", "4236 0 0 0 210.0 126.5 87.0 19.16 \n", "4237 0 0 0 269.0 133.5 83.0 21.47 \n", "\n", " heartRate glucose TenYearCHD \n", "0 80.0 77.0 0 \n", "1 95.0 76.0 0 \n", "2 75.0 70.0 0 \n", "3 65.0 103.0 1 \n", "4 85.0 85.0 0 \n", "... ... ... ... \n", "4233 66.0 86.0 1 \n", "4234 65.0 68.0 0 \n", "4235 84.0 86.0 0 \n", "4236 86.0 NaN 0 \n", "4237 80.0 107.0 0 \n", "\n", "[4238 rows x 16 columns]" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df=pd.read_csv(\"data/heart_disease.csv\")\n", "df" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(4238, 16)" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Drop rows with missing values" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(3656, 16)" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = df.dropna()\n", "df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Split dataset into X and y" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(3656, 15)" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = df[df.columns[:-1]]\n", "X.shape" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(3656,)" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y = df[df.columns[-1]]\n", "y.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a pipeline with model parameter" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, 
"outputs": [], "source": [ "pipeline = Pipeline([('model', model)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a RepeatedStratifiedKFold with 10 splits and 3 repeats and random_state=1" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Call cross_val_score with pipeline, X, y, accuracy metric and cv" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "scores = cross_val_score(pipeline, X, y, scoring='accuracy', cv=cv, n_jobs=-1)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.85245902, 0.8579235 , 0.85245902, 0.84699454, 0.84699454,\n", " 0.84699454, 0.84109589, 0.85753425, 0.84109589, 0.84109589,\n", " 0.85245902, 0.84972678, 0.85519126, 0.8442623 , 0.85519126,\n", " 0.84153005, 0.84657534, 0.84383562, 0.84931507, 0.84657534,\n", " 0.86065574, 0.84972678, 0.84972678, 0.8442623 , 0.85245902,\n", " 0.8442623 , 0.84931507, 0.85205479, 0.84383562, 0.83835616])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scores" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Print the Mean Accuracy and Standard Deviation from scores" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean Accuracy: 0.848 | Std: 0.006\n" ] } ], "source": [ "print(f\"Mean Accuracy: {round(np.mean(scores), 3)} | Std: {round(np.std(scores), 3)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Build a Logistic Regression model with IterativeImputer" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import LogisticRegression\n", "from 
sklearn.pipeline import Pipeline\n", "from sklearn.metrics import accuracy_score\n", "from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>male</th>\n", " <th>age</th>\n", " <th>education</th>\n", " <th>currentSmoker</th>\n", " <th>cigsPerDay</th>\n", " <th>BPMeds</th>\n", " <th>prevalentStroke</th>\n", " <th>prevalentHyp</th>\n", " <th>diabetes</th>\n", " <th>totChol</th>\n", " <th>sysBP</th>\n", " <th>diaBP</th>\n", " <th>BMI</th>\n", " <th>heartRate</th>\n", " <th>glucose</th>\n", " <th>TenYearCHD</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>39</td>\n", " <td>4.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>195.0</td>\n", " <td>106.0</td>\n", " <td>70.0</td>\n", " <td>26.97</td>\n", " <td>80.0</td>\n", " <td>77.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>2.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>250.0</td>\n", " <td>121.0</td>\n", " <td>81.0</td>\n", " <td>28.73</td>\n", " <td>95.0</td>\n", " <td>76.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1</td>\n", " <td>48</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>20.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>245.0</td>\n", " <td>127.5</td>\n", " 
<td>80.0</td>\n", " <td>25.34</td>\n", " <td>75.0</td>\n", " <td>70.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>0</td>\n", " <td>61</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>30.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>225.0</td>\n", " <td>150.0</td>\n", " <td>95.0</td>\n", " <td>28.58</td>\n", " <td>65.0</td>\n", " <td>103.0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>23.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>285.0</td>\n", " <td>130.0</td>\n", " <td>84.0</td>\n", " <td>23.10</td>\n", " <td>85.0</td>\n", " <td>85.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>...</th>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " </tr>\n", " <tr>\n", " <th>4233</th>\n", " <td>1</td>\n", " <td>50</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>313.0</td>\n", " <td>179.0</td>\n", " <td>92.0</td>\n", " <td>25.97</td>\n", " <td>66.0</td>\n", " <td>86.0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>4234</th>\n", " <td>1</td>\n", " <td>51</td>\n", " <td>3.0</td>\n", " <td>1</td>\n", " <td>43.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>207.0</td>\n", " <td>126.5</td>\n", " <td>80.0</td>\n", " <td>19.71</td>\n", " <td>65.0</td>\n", " <td>68.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4235</th>\n", " <td>0</td>\n", " <td>48</td>\n", " <td>2.0</td>\n", " <td>1</td>\n", " <td>20.0</td>\n", " <td>NaN</td>\n", " <td>0</td>\n", " <td>0</td>\n", 
" <td>0</td>\n", " <td>248.0</td>\n", " <td>131.0</td>\n", " <td>72.0</td>\n", " <td>22.00</td>\n", " <td>84.0</td>\n", " <td>86.0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4236</th>\n", " <td>0</td>\n", " <td>44</td>\n", " <td>1.0</td>\n", " <td>1</td>\n", " <td>15.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>210.0</td>\n", " <td>126.5</td>\n", " <td>87.0</td>\n", " <td>19.16</td>\n", " <td>86.0</td>\n", " <td>NaN</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>4237</th>\n", " <td>0</td>\n", " <td>52</td>\n", " <td>2.0</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>269.0</td>\n", " <td>133.5</td>\n", " <td>83.0</td>\n", " <td>21.47</td>\n", " <td>80.0</td>\n", " <td>107.0</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>4238 rows × 16 columns</p>\n", "</div>" ], "text/plain": [ " male age education currentSmoker cigsPerDay BPMeds \\\n", "0 1 39 4.0 0 0.0 0.0 \n", "1 0 46 2.0 0 0.0 0.0 \n", "2 1 48 1.0 1 20.0 0.0 \n", "3 0 61 3.0 1 30.0 0.0 \n", "4 0 46 3.0 1 23.0 0.0 \n", "... ... ... ... ... ... ... \n", "4233 1 50 1.0 1 1.0 0.0 \n", "4234 1 51 3.0 1 43.0 0.0 \n", "4235 0 48 2.0 1 20.0 NaN \n", "4236 0 44 1.0 1 15.0 0.0 \n", "4237 0 52 2.0 0 0.0 0.0 \n", "\n", " prevalentStroke prevalentHyp diabetes totChol sysBP diaBP BMI \\\n", "0 0 0 0 195.0 106.0 70.0 26.97 \n", "1 0 0 0 250.0 121.0 81.0 28.73 \n", "2 0 0 0 245.0 127.5 80.0 25.34 \n", "3 0 1 0 225.0 150.0 95.0 28.58 \n", "4 0 0 0 285.0 130.0 84.0 23.10 \n", "... ... ... ... ... ... ... ... \n", "4233 0 1 0 313.0 179.0 92.0 25.97 \n", "4234 0 0 0 207.0 126.5 80.0 19.71 \n", "4235 0 0 0 248.0 131.0 72.0 22.00 \n", "4236 0 0 0 210.0 126.5 87.0 19.16 \n", "4237 0 0 0 269.0 133.5 83.0 21.47 \n", "\n", " heartRate glucose TenYearCHD \n", "0 80.0 77.0 0 \n", "1 95.0 76.0 0 \n", "2 75.0 70.0 0 \n", "3 65.0 103.0 1 \n", "4 85.0 85.0 0 \n", "... ... ... ... 
\n", "4233 66.0 86.0 1 \n", "4234 65.0 68.0 0 \n", "4235 84.0 86.0 0 \n", "4236 86.0 NaN 0 \n", "4237 80.0 107.0 0 \n", "\n", "[4238 rows x 16 columns]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"data/heart_disease.csv\")\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Split dataset into X and y" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(4238, 16)" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(4238, 15)" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = df[df.columns[:-1]]\n", "X.shape" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 0\n", "1 0\n", "2 0\n", "3 1\n", "4 0\n", " ..\n", "4233 1\n", "4234 0\n", "4235 0\n", "4236 0\n", "4237 0\n", "Name: TenYearCHD, Length: 4238, dtype: int64" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y = df[df.columns[-1]]\n", "y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an IterativeImputer with max_iter=10 and random_state=0" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "imputer = IterativeImputer(max_iter=10, random_state=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a Logistic Regression model" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "model = LogisticRegression()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a pipeline with impute and model steps" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "pipeline = Pipeline([('impute', 
imputer), ('model', model)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a RepeatedStratifiedKFold with 10 splits and 3 repeats and random_state=1" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Call cross_val_score with pipeline, X, y, accuracy metric and cv" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "scores = cross_val_score(pipeline, X, y, scoring='accuracy', cv=cv, n_jobs=-1)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.85377358, 0.85377358, 0.8490566 , 0.8490566 , 0.84433962,\n", " 0.84669811, 0.8490566 , 0.8490566 , 0.84869976, 0.8534279 ,\n", " 0.8490566 , 0.85141509, 0.8490566 , 0.85377358, 0.84433962,\n", " 0.84669811, 0.84433962, 0.8490566 , 0.84397163, 0.85106383,\n", " 0.8490566 , 0.85141509, 0.84433962, 0.84669811, 0.85377358,\n", " 0.85141509, 0.8490566 , 0.85613208, 0.8534279 , 0.84397163])" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scores" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Print the Mean Accuracy and Standard Deviation" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean Accuracy: 0.849 | Std: 0.003\n" ] } ], "source": [ "print(f\"Mean Accuracy: {round(np.mean(scores), 3)} | Std: {round(np.std(scores), 3)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Which accuracy is better? 
\n", "- Dropping missing values\n", "- SimpleImputer with Mean Strategy\n", "- IterativeImputer" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# IterativeImputer with RandomForest" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [], "source": [ "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.pipeline import Pipeline\n", "from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [], "source": [ "imputer = IterativeImputer(max_iter=10, random_state=0)" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "model = RandomForestClassifier()" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [], "source": [ "pipeline = Pipeline([('impute', imputer), ('model', model)])" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [], "source": [ "scores = cross_val_score(pipeline, X, y, scoring='accuracy', cv=cv, n_jobs=-1)" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean Accuracy: 0.849 | Std: 0.006\n" ] } ], "source": [ "print(f\"Mean Accuracy: {round(np.mean(scores), 3)} | Std: {round(np.std(scores), 3)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Run experiments with different imputation methods and algorithms\n", "\n", "## Imputation Methods\n", "- Mean\n", "- Median\n", "- Most_frequent\n", "- Constant\n", "- IterativeImputer\n", "\n", "## Algorithms\n", "- Logistic Regression\n", "- KNN\n", "- Random Forest\n", "- SVM\n", "- Any other algorithm of your choice" 
] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Strategy: mean >> Accuracy: 0.85 | Max accuracy: 0.868\n", "Strategy: median >> Accuracy: 0.85 | Max accuracy: 0.861\n", "Strategy: most_frequent >> Accuracy: 0.85 | Max accuracy: 0.865\n", "Strategy: constant >> Accuracy: 0.849 | Max accuracy: 0.865\n" ] } ], "source": [ "# Compare the SimpleImputer strategies; `model` is the RandomForestClassifier defined above.\n", "results = []\n", "\n", "# 'constant' uses the default fill_value, which is 0 for numeric columns.\n", "strategies = ['mean', 'median', 'most_frequent', 'constant']\n", "\n", "for s in strategies:\n", "    # The imputer sits inside the pipeline, so it is refit on each training fold.\n", "    pipeline = Pipeline([('impute', SimpleImputer(strategy=s)), ('model', model)])\n", "    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)\n", "    scores = cross_val_score(pipeline, X, y, scoring='accuracy', cv=cv, n_jobs=-1)\n", "    results.append(scores)\n", "\n", "# Report the mean and the best single-fold accuracy for each strategy.\n", "for method, accuracy in zip(strategies, results):\n", "    print(f\"Strategy: {method} >> Accuracy: {round(np.mean(accuracy), 3)} | Max accuracy: {round(np.max(accuracy), 3)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Q1: Which is the best strategy for this dataset using the Random Forest algorithm?\n", "- SimpleImputer(Mean)\n", "- SimpleImputer(Median)\n", "- SimpleImputer(Most_frequent)\n", "- SimpleImputer(Constant)\n", "- IterativeImputer" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Q2: Which is the best algorithm for this dataset using IterativeImputer?\n", "- Logistic Regression\n", "- Random Forest\n", "- KNN\n", "- Any other algorithm of your choice (BONUS)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Q3: Which combination of algorithm and imputation strategy is best overall?\n", "- Mean, Median, Most_frequent, Constant, IterativeImputer\n", "- Logistic Regression, Random Forest, KNN" ] }, 
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }
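
One possible way to approach Q1 to Q3 is to loop over every imputer and every algorithm with the same pipeline and cross-validation pattern the notebook uses. The sketch below is only an illustration: to stay self-contained it runs on synthetic data with artificially injected missing values rather than on `heart_disease.csv`, and the names `compare`, `X_demo`, `y_demo`, `imputers` and `models` are hypothetical. The fold counts, `max_iter=1000` and `n_estimators=50` are assumptions chosen to keep the run short. No accuracy figures are claimed here; pass the notebook's `X` and `y` to get numbers for the real dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
# IterativeImputer is experimental; this import must come before it.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline


def compare(X, y, imputers, models, cv):
    """Cross-validate every (imputer, model) pair; return {(imputer, model): (mean, std)}."""
    results = {}
    for imp_name, imputer in imputers.items():
        for model_name, model in models.items():
            # cross_val_score clones the pipeline, so the imputer is refit per fold.
            pipeline = Pipeline([('impute', imputer), ('model', model)])
            scores = cross_val_score(pipeline, X, y, scoring='accuracy', cv=cv)
            results[(imp_name, model_name)] = (scores.mean(), scores.std())
    return results


# Synthetic stand-in for the notebook's X and y (assumption: ~10% of values missing).
rng = np.random.default_rng(0)
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
X_demo[rng.random(X_demo.shape) < 0.1] = np.nan

imputers = {
    'mean': SimpleImputer(strategy='mean'),
    'median': SimpleImputer(strategy='median'),
    'most_frequent': SimpleImputer(strategy='most_frequent'),
    'constant': SimpleImputer(strategy='constant'),  # fills numeric columns with 0
    'iterative': IterativeImputer(max_iter=10, random_state=0),
}
models = {
    'logistic_regression': LogisticRegression(max_iter=1000),
    'knn': KNeighborsClassifier(),
    'random_forest': RandomForestClassifier(n_estimators=50, random_state=0),
}

# Fewer splits and repeats than the notebook, purely to keep the demo fast.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)
results = compare(X_demo, y_demo, imputers, models, cv)

# Print combinations from best to worst mean accuracy.
for (imp_name, model_name), (mean, std) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{imp_name:>14} + {model_name:<20} Mean Accuracy: {mean:.3f} | Std: {std:.3f}")

best = max(results, key=lambda k: results[k][0])
```

Reading the sorted output row by row covers all three questions: fix the model to answer Q1, fix the imputer to answer Q2, and take `best` for Q3. Differences smaller than the reported standard deviation are probably not meaningful.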