|
17 | 17 | "editable": true |
18 | 18 | }, |
19 | 19 | "source": [ |
20 | | - "## Getting a feel for the data\n", |
21 | | - "\n", |
22 | 20 | "Let's start by importing some packages we need." |
23 | 21 | ] |
24 | 22 | }, |
|
52 | 50 | "editable": true |
53 | 51 | }, |
54 | 52 | "source": [ |
55 | | - "MNIST is a dataset that contains 70,000 labelled images of handwritten digits. We're going to train a linear classifier on a part of this data set, and test it against another portion of the data set to see how well we did.\n", |
| 53 | + "## Getting a feel for the data\n", |
| 54 | + "\n", |
| 55 | + "MNIST is a dataset that contains 70,000 labelled images of handwritten digits that look like the following.\n", |
| 56 | + "\n", |
| 57 | + "\n", |
| 58 | + "\n", |
| 59 | + "We're going to train a linear classifier on a part of this data set, and test it against another portion of the data set to see how well we did.\n", |
56 | 60 | "\n", |
57 | 61 | "The TensorFlow tutorial comes with a handy loader for this dataset." |
58 | 62 | ] |
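
As a rough sketch of what that loader looks like with the TensorFlow 1.x tutorial helpers (the `one_hot=True` flag and the `MNIST_data/` download directory are assumptions; the notebook's own cell may differ):

```python
# Sketch: load MNIST via the TensorFlow 1.x tutorial helper (downloads the data on first use).
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

print(mnist.train.images.shape)  # (55000, 784): flattened 28x28 training images
print(mnist.test.labels.shape)   # (10000, 10): one-hot test labels
```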
|
236 | 240 | "editable": true |
237 | 241 | }, |
238 | 242 | "source": [ |
239 | | - "We define a nonlinear model for the score function (a vanilla neural network) after introducing two sets of parameters, **W1**, **b1** and **W2**, **b2**." |
| 243 | + "We define a nonlinear model for the score function (a vanilla neural network) after introducing two sets of parameters, **W1**, **b1** and **W2**, **b2**.\n", |
| 244 | + "\n", |
| 245 | + "" |
240 | 246 | ] |
241 | 247 | }, |
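
A minimal sketch of such a score function in TensorFlow 1.x is below; the hidden-layer width of 100, the ReLU nonlinearity, and the initialisation values are assumptions, and the notebook's own cell may make different choices:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 784])  # flattened 28x28 input images

# First set of parameters: input -> hidden layer.
W1 = tf.Variable(tf.truncated_normal(shape=[784, 100], stddev=0.1))
b1 = tf.Variable(tf.constant(0.1, shape=[100]))

# Second set of parameters: hidden layer -> class scores.
W2 = tf.Variable(tf.truncated_normal(shape=[100, 10], stddev=0.1))
b2 = tf.Variable(tf.constant(0.1, shape=[10]))

hidden = tf.nn.relu(tf.matmul(x, W1) + b1)  # the nonlinearity is what makes this model non-linear
y = tf.matmul(hidden, W2) + b2              # unnormalised class scores (logits)
```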
242 | 248 | { |
|
275 | 281 | "\n", |
276 | 282 | "````\n", |
277 | 283 | "\n", |
278 | | - "We define our loss function to measure how poorly this model performs on images with known labels. We think of the scores we have as unnormalized log probabilities of the classes, and take the cross entropy loss of the softmax of the class scores determined by our score function." |
| 284 | + "We define our loss function to measure how poorly this model performs on images with known labels. We use a specific form called the [cross entropy loss](https://jamesmccaffrey.wordpress.com/2013/11/05/why-you-should-use-cross-entropy-error-instead-of-classification-error-or-mean-squared-error-for-neural-network-classifier-training/), which compares the softmax of our class scores against the true labels." |
279 | 285 | ] |
280 | 286 | }, |
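
Concretely, continuing the sketch above and treating the scores `y` as unnormalised log probabilities, the loss might be written as follows (the placeholder name `y_` for the one-hot labels is an assumption):

```python
y_ = tf.placeholder(tf.float32, shape=[None, 10])  # one-hot ground-truth labels

# Softmax plus cross entropy in one numerically stable op, averaged over the batch.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
```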
281 | 287 | { |
|
298 | 304 | "editable": true |
299 | 305 | }, |
300 | 306 | "source": [ |
301 | | - "Using the magic of blackbox optimisation algorithms provided by TensorFlow, we can define a single step of the stochastic gradient descent optimiser to improve our parameters for our score function and reduce the loss." |
| 307 | + "Using the magic of blackbox optimisation algorithms provided by TensorFlow, we can define a single step of the stochastic gradient descent optimiser, which improves the parameters of our score function and reduces the loss, in one line of code." |
302 | 308 | ] |
303 | 309 | }, |
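
That one line, sketched with TensorFlow 1.x's built-in optimiser and continuing from the loss defined above (the learning rate of 0.5 is an assumed value):

```python
# One optimisation step: compute gradients of the loss and update W1, b1, W2, b2.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
```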
304 | 310 | { |
|
425 | 431 | "1. Play around with the length of the hidden layer and see how the accuracy changes.\n", |
426 | 432 | "\n", |
427 | 433 | "2. Try extending the model to two hidden layers and see how much the accuracy increases:\n", |
| 434 | + "\n", |
| 435 | + " \n", |
428 | 436 | " \n", |
429 | 437 | " ````\n", |
430 | 438 | " W1 = tf.Variable(tf.truncated_normal(shape=[784, 400], stddev=0.1))\n", |
|