| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "## Multilayer Perceptron: Fit and evaluate a model\n", |
| 8 | + "\n", |
| 9 | + "Using the Titanic dataset from [this](https://www.kaggle.com/c/titanic/overview) Kaggle competition.\n", |
| 10 | + "\n", |
| 11 | + "In this section, we will fit and evaluate a simple Multilayer Perceptron model." |
| 12 | + ] |
| 13 | + }, |
| 14 | + { |
| 15 | + "cell_type": "markdown", |
| 16 | + "metadata": {}, |
| 17 | + "source": [ |
| 18 | + "### Read in Data" |
| 19 | + ] |
| 20 | + }, |
| 21 | + { |
| 22 | + "cell_type": "code", |
| 23 | + "execution_count": 4, |
| 24 | + "metadata": {}, |
| 25 | + "outputs": [], |
| 26 | + "source": [ |
| 27 | + "import joblib\n", |
| 28 | + "import pandas as pd\n", |
| 29 | + "from sklearn.neural_network import MLPClassifier\n", |
| 30 | + "from sklearn.model_selection import GridSearchCV\n", |
| 31 | + "\n", |
| 32 | + "import warnings\n", |
| 33 | + "warnings.filterwarnings('ignore', category=FutureWarning)\n", |
| 34 | + "warnings.filterwarnings('ignore', category=DeprecationWarning)\n", |
| 35 | + "\n", |
| 36 | + "train_features = pd.read_csv('../Data/train_features.csv')\n", |
| 37 | + "train_labels = pd.read_csv('../Data/train_labels.csv', header=None)" |
| 38 | + ] |
| 39 | + }, |
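|  | + {
|  | + "cell_type": "markdown",
|  | + "metadata": {},
|  | + "source": [
|  | + "A quick sanity check (an added sketch, not part of the original run): confirm the features and labels line up row for row before fitting anything."
|  | + ]
|  | + },
|  | + {
|  | + "cell_type": "code",
|  | + "execution_count": null,
|  | + "metadata": {},
|  | + "outputs": [],
|  | + "source": [
|  | + "# the features and labels should have the same number of rows\n",
|  | + "print(train_features.shape, train_labels.shape)\n",
|  | + "train_features.head()"
|  | + ]
|  | + },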
| 40 | + { |
| 41 | + "cell_type": "markdown", |
| 42 | + "metadata": {}, |
| 43 | + "source": [ |
| 44 | + "### Hyperparameter tuning\n", |
| 45 | + "\n", |
| 46 | + "" |
| 47 | + ] |
| 48 | + }, |
| 49 | + { |
| 50 | + "cell_type": "code", |
| 51 | + "execution_count": 5, |
| 52 | + "metadata": {}, |
| 53 | + "outputs": [], |
| 54 | + "source": [ |
| 55 | + "def print_results(results):\n", |
| 56 | + " print('BEST PARAMS: {}'.format(results.best_params_))\n", |
| 57 | + " \n", |
| 58 | + " means = results.cv_results_['mean_test_score']\n", |
| 59 | + " stds = results.cv_results_['std_test_score']\n", |
| 60 | + " for mean, std, params in zip(means, stds, results.cv_results_['params']):\n", |
| 61 | + " print('{} (+- {}) for {}'.format(round(mean,3), round(std *2, 3), params))" |
| 62 | + ] |
| 63 | + }, |
| 64 | + { |
| 65 | + "cell_type": "markdown", |
| 66 | + "metadata": {}, |
| 67 | + "source": [ |
| 68 | + "#### Hyper parameters tuning Notes\n", |
| 69 | + "- #### hidden_layer_sizes\n", |
| 70 | + " - as the problem is relatively simple, we will use one layer only => passing value in the tuple with one value represents 1 layer\n", |
| 71 | + " - here 1 hidden layer with 10 nodes, 50 nodes and 100 nodes.\n", |
| 72 | + "- #### activation\n", |
| 73 | + " - `relu`, `tanh`, `logistic`\n", |
| 74 | + "- #### learning_rate\n", |
| 75 | + " - `constant`: it will just take the initial learning rate and keep it the same throughout the entire optimization process.\n", |
| 76 | + " - `invscaling`: (inverse scaling) it gradually decreases the learning rate at each step. So this will allow it to take large jump at first. and then it slowly decreases as it gets closer and closer to optimal model.\n", |
| 77 | + " - `adaptive`: this keeps the learning constant as long as training loss keeps decreasing. If the learning rate stops going down, then it will decrease the learning rate, so that it takes smaller steps. " |
| 78 | + ] |
| 79 | + }, |
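|  | + {
|  | + "cell_type": "markdown",
|  | + "metadata": {},
|  | + "source": [
|  | + "A minimal sketch (added for illustration, not executed as part of the original run): the learning-rate schedules above only take effect with the SGD solver, because `MLPClassifier` defaults to `solver='adam'`, which ignores the `learning_rate` setting."
|  | + ]
|  | + },
|  | + {
|  | + "cell_type": "code",
|  | + "execution_count": null,
|  | + "metadata": {},
|  | + "outputs": [],
|  | + "source": [
|  | + "# hypothetical example: an MLP where the 'adaptive' schedule actually applies,\n",
|  | + "# because the solver is explicitly set to SGD\n",
|  | + "sgd_mlp = MLPClassifier(solver='sgd', learning_rate='adaptive',\n",
|  | + "                        hidden_layer_sizes=(10,), max_iter=1000)\n",
|  | + "sgd_mlp"
|  | + ]
|  | + },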
| 80 | + { |
| 81 | + "cell_type": "code", |
| 82 | + "execution_count": 9, |
| 83 | + "metadata": {}, |
| 84 | + "outputs": [ |
| 85 | + { |
| 86 | + "name": "stderr", |
| 87 | + "output_type": "stream", |
| 88 | + "text": [ |
| 89 | + "C:\\Users\\Phone Thiri Yadana\\.conda\\envs\\venv-datascience\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:582: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (1000) reached and the optimization hasn't converged yet.\n", |
| 90 | + " warnings.warn(\n" |
| 91 | + ] |
| 92 | + }, |
| 93 | + { |
| 94 | + "name": "stdout", |
| 95 | + "output_type": "stream", |
| 96 | + "text": [ |
| 97 | + "BEST PARAMS: {'activation': 'tanh', 'hidden_layer_sizes': (10,), 'learning_rate': 'constant'}\n", |
| 98 | + "0.787 (+- 0.114) for {'activation': 'relu', 'hidden_layer_sizes': (10,), 'learning_rate': 'constant'}\n", |
| 99 | + "0.792 (+- 0.099) for {'activation': 'relu', 'hidden_layer_sizes': (10,), 'learning_rate': 'invscaling'}\n", |
| 100 | + "0.79 (+- 0.1) for {'activation': 'relu', 'hidden_layer_sizes': (10,), 'learning_rate': 'adaptive'}\n", |
| 101 | + "0.774 (+- 0.136) for {'activation': 'relu', 'hidden_layer_sizes': (50,), 'learning_rate': 'constant'}\n", |
| 102 | + "0.787 (+- 0.082) for {'activation': 'relu', 'hidden_layer_sizes': (50,), 'learning_rate': 'invscaling'}\n", |
| 103 | + "0.8 (+- 0.102) for {'activation': 'relu', 'hidden_layer_sizes': (50,), 'learning_rate': 'adaptive'}\n", |
| 104 | + "0.789 (+- 0.123) for {'activation': 'relu', 'hidden_layer_sizes': (100,), 'learning_rate': 'constant'}\n", |
| 105 | + "0.783 (+- 0.105) for {'activation': 'relu', 'hidden_layer_sizes': (100,), 'learning_rate': 'invscaling'}\n", |
| 106 | + "0.805 (+- 0.109) for {'activation': 'relu', 'hidden_layer_sizes': (100,), 'learning_rate': 'adaptive'}\n", |
| 107 | + "0.824 (+- 0.092) for {'activation': 'tanh', 'hidden_layer_sizes': (10,), 'learning_rate': 'constant'}\n", |
| 108 | + "0.792 (+- 0.115) for {'activation': 'tanh', 'hidden_layer_sizes': (10,), 'learning_rate': 'invscaling'}\n", |
| 109 | + "0.779 (+- 0.139) for {'activation': 'tanh', 'hidden_layer_sizes': (10,), 'learning_rate': 'adaptive'}\n", |
| 110 | + "0.805 (+- 0.082) for {'activation': 'tanh', 'hidden_layer_sizes': (50,), 'learning_rate': 'constant'}\n", |
| 111 | + "0.807 (+- 0.083) for {'activation': 'tanh', 'hidden_layer_sizes': (50,), 'learning_rate': 'invscaling'}\n", |
| 112 | + "0.809 (+- 0.108) for {'activation': 'tanh', 'hidden_layer_sizes': (50,), 'learning_rate': 'adaptive'}\n", |
| 113 | + "0.803 (+- 0.086) for {'activation': 'tanh', 'hidden_layer_sizes': (100,), 'learning_rate': 'constant'}\n", |
| 114 | + "0.792 (+- 0.09) for {'activation': 'tanh', 'hidden_layer_sizes': (100,), 'learning_rate': 'invscaling'}\n", |
| 115 | + "0.788 (+- 0.091) for {'activation': 'tanh', 'hidden_layer_sizes': (100,), 'learning_rate': 'adaptive'}\n", |
| 116 | + "0.798 (+- 0.106) for {'activation': 'logistic', 'hidden_layer_sizes': (10,), 'learning_rate': 'constant'}\n", |
| 117 | + "0.79 (+- 0.127) for {'activation': 'logistic', 'hidden_layer_sizes': (10,), 'learning_rate': 'invscaling'}\n", |
| 118 | + "0.787 (+- 0.142) for {'activation': 'logistic', 'hidden_layer_sizes': (10,), 'learning_rate': 'adaptive'}\n", |
| 119 | + "0.805 (+- 0.12) for {'activation': 'logistic', 'hidden_layer_sizes': (50,), 'learning_rate': 'constant'}\n", |
| 120 | + "0.789 (+- 0.124) for {'activation': 'logistic', 'hidden_layer_sizes': (50,), 'learning_rate': 'invscaling'}\n", |
| 121 | + "0.8 (+- 0.111) for {'activation': 'logistic', 'hidden_layer_sizes': (50,), 'learning_rate': 'adaptive'}\n", |
| 122 | + "0.794 (+- 0.108) for {'activation': 'logistic', 'hidden_layer_sizes': (100,), 'learning_rate': 'constant'}\n", |
| 123 | + "0.794 (+- 0.121) for {'activation': 'logistic', 'hidden_layer_sizes': (100,), 'learning_rate': 'invscaling'}\n", |
| 124 | + "0.789 (+- 0.1) for {'activation': 'logistic', 'hidden_layer_sizes': (100,), 'learning_rate': 'adaptive'}\n" |
| 125 | + ] |
| 126 | + } |
| 127 | + ], |
| 128 | + "source": [ |
| 129 | + "mlp = MLPClassifier(max_iter = 1000)\n", |
| 130 | + "\n", |
| 131 | + "parameters = {\n", |
| 132 | + " 'hidden_layer_sizes': [(10,), (50,), (100,)], \n", |
| 133 | + " 'activation': ['relu', 'tanh', 'logistic'],\n", |
| 134 | + " 'learning_rate': ['constant', 'invscaling', 'adaptive'],\n", |
| 135 | + "}\n", |
| 136 | + "\n", |
| 137 | + "cv = GridSearchCV(mlp, parameters, cv=5)\n", |
| 138 | + "cv.fit(train_features, train_labels.values.ravel())\n", |
| 139 | + "\n", |
| 140 | + "print_results(cv)" |
| 141 | + ] |
| 142 | + }, |
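|  | + {
|  | + "cell_type": "markdown",
|  | + "metadata": {},
|  | + "source": [
|  | + "Because `GridSearchCV` refits the best parameter combination on the full training set by default (`refit=True`), the fitted search object exposes `best_score_` and `best_estimator_` directly (a quick added check):"
|  | + ]
|  | + },
|  | + {
|  | + "cell_type": "code",
|  | + "execution_count": null,
|  | + "metadata": {},
|  | + "outputs": [],
|  | + "source": [
|  | + "# mean cross-validated accuracy of the best parameter combination\n",
|  | + "cv.best_score_"
|  | + ]
|  | + },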
| 143 | + { |
| 144 | + "cell_type": "markdown", |
| 145 | + "metadata": {}, |
| 146 | + "source": [ |
| 147 | + "### Write out pickled model" |
| 148 | + ] |
| 149 | + }, |
| 150 | + { |
| 151 | + "cell_type": "code", |
| 152 | + "execution_count": 10, |
| 153 | + "metadata": {}, |
| 154 | + "outputs": [ |
| 155 | + { |
| 156 | + "data": { |
| 157 | + "text/plain": [ |
| 158 | + "MLPClassifier(activation='tanh', hidden_layer_sizes=(10,), max_iter=1000)" |
| 159 | + ] |
| 160 | + }, |
| 161 | + "execution_count": 10, |
| 162 | + "metadata": {}, |
| 163 | + "output_type": "execute_result" |
| 164 | + } |
| 165 | + ], |
| 166 | + "source": [ |
| 167 | + "cv.best_estimator_" |
| 168 | + ] |
| 169 | + }, |
| 170 | + { |
| 171 | + "cell_type": "code", |
| 172 | + "execution_count": 12, |
| 173 | + "metadata": {}, |
| 174 | + "outputs": [ |
| 175 | + { |
| 176 | + "data": { |
| 177 | + "text/plain": [ |
| 178 | + "['../Pickled_Models/MLP_model.pkl']" |
| 179 | + ] |
| 180 | + }, |
| 181 | + "execution_count": 12, |
| 182 | + "metadata": {}, |
| 183 | + "output_type": "execute_result" |
| 184 | + } |
| 185 | + ], |
| 186 | + "source": [ |
| 187 | + "joblib.dump(cv.best_estimator_, '../Pickled_Models/MLP_model.pkl')" |
| 188 | + ] |
| 189 | + }, |
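|  | + {
|  | + "cell_type": "markdown",
|  | + "metadata": {},
|  | + "source": [
|  | + "A round-trip smoke test (added sketch): reload the pickled model with `joblib.load` and predict on a few training rows just to confirm it deserializes; proper evaluation belongs on a held-out validation set."
|  | + ]
|  | + },
|  | + {
|  | + "cell_type": "code",
|  | + "execution_count": null,
|  | + "metadata": {},
|  | + "outputs": [],
|  | + "source": [
|  | + "# reload the saved model and run a quick prediction as a smoke test\n",
|  | + "loaded_model = joblib.load('../Pickled_Models/MLP_model.pkl')\n",
|  | + "loaded_model.predict(train_features.head())"
|  | + ]
|  | + }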
| 197 | + ], |
| 198 | + "metadata": { |
| 199 | + "kernelspec": { |
| 200 | + "display_name": "Python 3", |
| 201 | + "language": "python", |
| 202 | + "name": "python3" |
| 203 | + }, |
| 204 | + "language_info": { |
| 205 | + "codemirror_mode": { |
| 206 | + "name": "ipython", |
| 207 | + "version": 3 |
| 208 | + }, |
| 209 | + "file_extension": ".py", |
| 210 | + "mimetype": "text/x-python", |
| 211 | + "name": "python", |
| 212 | + "nbconvert_exporter": "python", |
| 213 | + "pygments_lexer": "ipython3", |
| 214 | + "version": "3.8.3" |
| 215 | + } |
| 216 | + }, |
| 217 | + "nbformat": 4, |
| 218 | + "nbformat_minor": 2 |
| 219 | +} |