
Commit 227e3be

hokie45astorfi authored and committed

Fixed broken links for website and fixed image sizes (#23)

* Fixed broken links for website and fixed image sizes
* Minor fix on links
* Added test reference list for linear regression
* Added references section to all files
1 parent 16ea20f commit 227e3be

File tree

10 files changed: 150 additions & 26 deletions


docs/source/content/overview/crossvalidation.rst

Lines changed: 20 additions & 9 deletions
@@ -23,7 +23,7 @@ and is useful if you have a large amount of data or need to implement
 validation quickly and easily.

 .. figure:: _img/holdout.png
-   :scale: 50 %
+   :scale: 75 %
    :alt: holdout method


@@ -35,8 +35,7 @@ data may give your model an unwanted bias towards the training data.
 This lack of training or bias can lead to
 `Underfitting/Overfitting`_ of our model.

-.. _Underfitting/Overfitting: overfitting.rst
-
+.. _Underfitting/Overfitting: https://machine-learning-course.readthedocs.io/en/latest/content/overview/overfitting.html

 K-Fold Cross Validation
 -----------------------
@@ -49,7 +48,7 @@ combination of data, and the results are averaged to find a total error
 estimation.

 .. figure:: _img/kfold.png
-   :scale: 50 %
+   :scale: 75 %
    :alt: kfold method

 A "fold" here is a unique section of test data. For instance, if you
@@ -82,7 +81,7 @@ Where "T" is a test point, and "-" is a training point. Below is another
 visualization of LPOCV:

 .. figure:: _img/LPOCV.png
-   :scale: 50 %
+   :scale: 75 %
    :alt: kfold method

 Ref: http://www.ebc.cat/2017/01/31/cross-validation-strategies/
@@ -102,7 +101,7 @@ Validation, where the number of folds is equal to the number of data
 points.

 .. figure:: _img/LOOCV.png
-   :scale: 50 %
+   :scale: 75 %
    :alt: kfold method

 Ref: http://www.ebc.cat/2017/01/31/cross-validation-strategies/
@@ -222,6 +221,18 @@ train-test data split is created with the `split()` method:
 Note that you can change the P value at the top of the script to see
 how different values operate.

-.. _holdout.py: /code/overview/cross-validation/holdout.py
-.. _k-fold.py: /code/overview/cross-validation/k-fold.py
-.. _leave-p-out.py: /code/overview/cross-validation/leave-p-out.py
+.. _holdout.py: /https://github.com/machinelearningmindset/machine-learning-course/tree/mastercode/overview/cross-validation/holdout.py
+.. _k-fold.py: /https://github.com/machinelearningmindset/machine-learning-course/tree/mastercode/overview/cross-validation/k-fold.py
+.. _leave-p-out.py: /https://github.com/machinelearningmindset/machine-learning-course/tree/mastercode/overview/cross-validation/leave-p-out.py
+
+
+************
+References
+************
+
+1. https://towardsdatascience.com/cross-validation-in-machine-learning-72924a69872f
+2. https://machinelearningmastery.com/k-fold-cross-validation/
+3. https://www.quora.com/What-is-cross-validation-in-machine-learning
+#. http://www.ebc.cat/2017/01/31/cross-validation-strategies/
+
+
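The last hunk's header refers to the point in crossvalidation.rst where a train-test data split is created with the `split()` method. The scripts it links to (holdout.py, k-fold.py, leave-p-out.py) are not part of this diff, so the following is only a minimal sketch of how such splits are typically produced with scikit-learn, using made-up data and an illustrative P value:

.. code-block:: python

    import numpy as np
    from sklearn.model_selection import KFold, LeavePOut

    X = np.arange(10).reshape(5, 2)  # five toy samples, two features each

    # K-Fold: each sample appears in the test fold exactly once
    for train_idx, test_idx in KFold(n_splits=5).split(X):
        print("k-fold      train:", train_idx, "test:", test_idx)

    # Leave-P-Out: every combination of P samples serves as a test set
    P = 2  # illustrative value; the linked script lets you change this at the top
    for train_idx, test_idx in LeavePOut(p=P).split(X):
        print("leave-p-out train:", train_idx, "test:", test_idx)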

docs/source/content/overview/linear-regression.rst

Lines changed: 15 additions & 0 deletions
@@ -228,3 +228,18 @@ but still show up in a lot of data sets so this is a good technique to know.
 Learning about linear regression is a good first step towards learning more
 complicated analysis techniques. We will build on a lot of the concepts
 covered here in later modules.
+
+
+************
+References
+************
+
+1. https://towardsdatascience.com/introduction-to-machine-learning-algorithms-linear-regression-14c4e325882a
+2. https://machinelearningmastery.com/linear-regression-for-machine-learning/
+3. https://ml-cheatsheet.readthedocs.io/en/latest/linear_regression.html
+#. https://machinelearningmastery.com/implement-simple-linear-regression-scratch-python/
+#. https://medium.com/analytics-vidhya/linear-regression-in-python-from-scratch-24db98184276
+#. https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html
+#. https://scikit-learn.org/stable/modules/generated/sklearn.compose.TransformedTargetRegressor.html
+
+
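The new reference list for this file points to scikit-learn's ordinary least squares material. As a quick reminder of the technique the module covers, here is a minimal fit/predict sketch on invented data (it is not taken from the module's own examples):

.. code-block:: python

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Toy data: y is roughly 2*x + 1 with a little noise
    X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
    y = np.array([3.1, 4.9, 7.2, 9.0, 11.1])

    model = LinearRegression().fit(X, y)
    print("slope:", model.coef_[0], "intercept:", model.intercept_)
    print("prediction for x = 6:", model.predict([[6.0]])[0])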

docs/source/content/overview/overfitting.rst

Lines changed: 13 additions & 3 deletions
@@ -29,7 +29,7 @@ In practice, this error isn't always at edge cases and can pop up anywhere.
 The noise in training can cause error as seen in the graph below.

 .. figure:: _img/Overfit_small.png
-   :scale: 10 %
+   :scale: 100 %
    :alt: Overfit
 (Created using https://www.desmos.com/calculator/dffnj2jbow)

@@ -54,7 +54,7 @@ In machine learning, this could be a result of underfitting, the model has not
 had enough exposure to training data to adapt to it, and is currently in a simple state.

 .. figure:: _img/Underfit.PNG
-   :scale: 50 %
+   :scale: 100 %
    :alt: Underfit
 (Created using Wolfram Alpha)

@@ -97,7 +97,7 @@ how to avoid overfitting in machine learning models.
 Ideally, a good fit looks something like this:

 .. figure:: _img/GoodFit.PNG
-   :scale: 50 %
+   :scale: 100 %
    :alt: Underfit
 (Created using Wolfram Alpha)

@@ -106,3 +106,13 @@ When using machine learning in any capacity, issues such as overfitting
 frequently come up, and having a grasp of the concept is very important.
 The modules in this section are among the most important in the whole repository,
 since regardless of the implementation, machine learning always includes these fundamentals.
+
+
+************
+References
+************
+
+1. https://machinelearningmastery.com/overfitting-and-underfitting-with-machine-learning-algorithms/
+2. https://medium.com/greyatom/what-is-underfitting-and-overfitting-in-machine-learning-and-how-to-deal-with-it-6803a989c76
+3. https://towardsdatascience.com/overfitting-vs-underfitting-a-conceptual-explanation-d94ee20ca7f9
+
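The context lines above contrast a model that fits noise in the training data with one that is too simple to capture the trend. A small illustrative sketch of that trade-off, using invented noisy data and polynomial fits of different degrees (this is not code from the module, just one common way to see under- and overfitting numerically):

.. code-block:: python

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.RandomState(0)
    X = np.sort(rng.uniform(-3, 3, 60)).reshape(-1, 1)
    y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)  # noisy target
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for degree in (1, 4, 15):  # too simple, reasonable, likely overfit
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(X_train, y_train)
        print(degree,
              mean_squared_error(y_train, model.predict(X_train)),
              mean_squared_error(y_test, model.predict(X_test)))

A very high-degree fit usually drives the training error down while the test error grows, which is the overfitting pattern described above.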

docs/source/content/overview/regularization.rst

Lines changed: 13 additions & 0 deletions
@@ -184,3 +184,16 @@ problem in modeling so it's good to know how to mediate it. We have also
 explored some methods of regularization that we can use in different
 situations. With this, we have learned enough about the core concepts of
 machine learning to move onto our next major topic, supervised learning.
+
+
+************
+References
+************
+
+1. https://towardsdatascience.com/regularization-in-machine-learning-76441ddcf99a
+2. https://www.analyticsvidhya.com/blog/2018/04/fundamentals-deep-learning-regularization-techniques
+3. https://www.quora.com/What-is-regularization-in-machine-learning
+#. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html
+#. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
+
+
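Two of the added references are the scikit-learn pages for Ridge and Lasso. A minimal sketch, with made-up data, of how those regularized regressors compare against plain least squares (assuming only the standard scikit-learn API, not the module's own script):

.. code-block:: python

    import numpy as np
    from sklearn.linear_model import Lasso, LinearRegression, Ridge

    rng = np.random.RandomState(0)
    X = rng.normal(size=(50, 10))  # ten features, only the first two matter
    y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=50)

    for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
        model.fit(X, y)
        print(type(model).__name__, np.round(model.coef_, 2))

Lasso tends to push the irrelevant coefficients to exactly zero, while Ridge only shrinks them.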

docs/source/content/supervised/bayes.rst

Lines changed: 11 additions & 0 deletions
@@ -208,3 +208,14 @@ come in handy for real-time predictions. We make a lot of assumptions to use
 Naive Bayes so results should be taken with a grain of salt. But if you don’t
 have much data and need fast results, Naive Bayes is a good choice for
 classification problems.
+
+
+************
+References
+************
+
+1. https://machinelearningmastery.com/naive-bayes-classifier-scratch-python/
+2. https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/
+3. https://towardsdatascience.com/naive-bayes-in-machine-learning-f49cc8f831b4
+#. https://medium.com/machine-learning-101/chapter-1-supervised-learning-and-naive-bayes-classification-part-1-theory-8b9e361897d5
+
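The context lines note that Naive Bayes is quick to train and works with little data. A minimal sketch of that in scikit-learn, using a bundled toy dataset rather than anything from the module itself:

.. code-block:: python

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = GaussianNB().fit(X_train, y_train)  # fast to fit, no tuning required
    print("test accuracy:", clf.score(X_test, y_test))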

docs/source/content/supervised/decisiontrees.rst

Lines changed: 14 additions & 0 deletions
@@ -316,3 +316,17 @@ whether Mike will go shopping:

 # Use our tree to predict the outcome of the random values
 prediction_results = tree.predict(encoder.transform(prediction_data))
+
+
+************
+References
+************
+
+1. https://towardsdatascience.com/decision-trees-in-machine-learning-641b9c4e8052
+2. https://heartbeat.fritz.ai/introduction-to-decision-tree-learning-cd604f85e23
+3. https://machinelearningmastery.com/implement-decision-tree-algorithm-scratch-python/
+#. https://sebastianraschka.com/faq/docs/decision-tree-binary.html
+#. https://www.cs.cmu.edu/~bhiksha/courses/10-601/decisiontrees/
+
+
+
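The context lines reference the module's `tree` and `encoder` objects, which are built earlier in decisiontrees.rst and are not part of this diff. A self-contained sketch of the same predict call, assuming a scikit-learn DecisionTreeClassifier with an OrdinalEncoder for categorical inputs and hypothetical "go shopping" data:

.. code-block:: python

    from sklearn.preprocessing import OrdinalEncoder
    from sklearn.tree import DecisionTreeClassifier

    # Hypothetical categorical training data: [weather, has_coupon] -> goes shopping?
    training_data = [["sunny", "yes"], ["rainy", "no"], ["sunny", "no"], ["rainy", "yes"]]
    labels = [1, 0, 0, 1]

    encoder = OrdinalEncoder().fit(training_data)  # map category strings to numbers
    tree = DecisionTreeClassifier().fit(encoder.transform(training_data), labels)

    # Use our tree to predict the outcome of new values
    prediction_data = [["sunny", "yes"]]
    prediction_results = tree.predict(encoder.transform(prediction_data))
    print(prediction_results)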

docs/source/content/supervised/knn.rst

Lines changed: 14 additions & 5 deletions
@@ -20,7 +20,7 @@ k = 1 then the class the object would be in is the class of the closest
 neighbor. Let's look at an example.

 .. figure:: _img/knn.png
-   :scale: 50 %
+   :scale: 100 %
    :alt: KNN

 Ref: https://coxdocs.org
@@ -60,7 +60,7 @@ to one of them and then we know that distance is roughly close to the other poin
 Here is an example of how the K-D tree looks like.

 .. figure:: _img/KNN_KDTree.jpg
-   :scale: 50 %
+   :scale: 100 %
    :alt: KNN K-d tree

 Ref: https://slideplayer.com/slide/3273367/
@@ -120,7 +120,7 @@ The program will take the data and plot them on a graph, then use the KNN algori
 The output should look like this:

 .. figure:: _img/knn_output_k9.png
-   :scale: 50%
+   :scale: 100%
    :alt: KNN k = 9 output

 The green points are classified as benign.
@@ -154,7 +154,7 @@ Try changing the value of n_neighbors to 1 in the code below.
 If you changed the value of n_neighbors to 1 this will classify by the point that is closest to the point. The output should look like this:

 .. figure:: _img/knn_output_k1.png
-   :scale: 50%
+   :scale: 100%
    :alt: KNN k = 1 output

 Comparing this output to k = 9 you can see a large difference on how it classifies the data. So if you want to ignore outliers you
@@ -165,6 +165,15 @@ Eventually the algorithm will classify all the data into 1 class, and there will

 .. _knn.py: https://github.com/machinelearningmindset/machine-learning-course/blob/master/code/supervised/KNN/knn.py

-.. _Support Vector Machines: linear_SVM.html
+.. _Support Vector Machines: https://machine-learning-course.readthedocs.io/en/latest/content/supervised/linear_SVM.html


+************
+References
+************
+
+1. https://medium.com/machine-learning-101/k-nearest-neighbors-classifier-1c1ff404d265
+2. https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithm-clustering/
+3. https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html
+#. https://turi.com/learn/userguide/supervised-learning/knn_classifier.html
+
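The hunks above mention rerunning the example with n_neighbors set to 9 and then to 1 and comparing how the points are classified. A minimal sketch of that comparison, assuming scikit-learn's KNeighborsClassifier and its bundled breast cancer dataset as a stand-in for the data used by knn.py:

.. code-block:: python

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for k in (1, 9):  # k = 1 follows single nearest points; k = 9 smooths over outliers
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
        print("n_neighbors =", k, "test accuracy:", clf.score(X_test, y_test))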

docs/source/content/supervised/linear_SVM.rst

Lines changed: 18 additions & 6 deletions
@@ -30,7 +30,7 @@ amount of lines that can divide two classes. As you can see in the graph below,
 the circles, so which one do we choose?

 .. figure:: _img/Possible_hyperplane.png
-   :scale: 50%
+   :scale: 100%
    :alt: Possible_Hyperplane

 Ref: https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47
@@ -40,7 +40,7 @@ the line/hyperplane with the **maximum margin**. Maximizing the margin will give
 This is shown in the figure below.

 .. figure:: _img/optimal_hyperplane.png
-   :scale: 50%
+   :scale: 100%
    :alt: Optimal_Hyperplane

 Ref: https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47
@@ -62,7 +62,7 @@ Support Vector Machines will ignore these outliers. This is shown in the figure


 .. figure:: _img/SVM_Outliers.png
-   :scale: 50%
+   :scale: 100%
    :alt: Outliers

 Ref: https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
@@ -78,7 +78,7 @@ There will be data classes that can't be separated with a simple line or hyperpl
 separable data**. Here is an example of that kind of data.

 .. figure:: _img/SVM_Kernal.png
-   :scale: 50%
+   :scale: 100%
    :alt: Kernel

 Ref: https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
@@ -92,7 +92,7 @@ classified with a circle that separates the data.
 Here is an example of the kernel trick.

 .. figure:: _img/SVM_Kernel2.png
-   :scale: 50%
+   :scale: 100%
    :alt: Kernel X Y graph

 Ref: https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
@@ -145,7 +145,7 @@ The program will take the data and plot them on a graph, then use the SVM to cre
 It also circles the support vectors that determine the hyperplane. The output should look like this:

 .. figure:: _img/linear_svm_output.png
-   :scale: 50%
+   :scale: 100%
    :alt: Linear SVM output

 The green points are classified as benign.
@@ -173,3 +173,15 @@ the data. You can change it here in the code:

 .. _linear_svm.py: https://github.com/machinelearningmindset/machine-learning-course/blob/master/code/supervised/Linear_SVM/linear_svm.py

+
+************
+References
+************
+
+1. https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
+2. https://stackabuse.com/implementing-svm-and-kernel-svm-with-pythons-scikit-learn/
+3. https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47
+#. https://towardsdatascience.com/https-medium-com-pupalerushikesh-svm-f4b42800e989
+#. https://towardsdatascience.com/support-vector-machines-svm-c9ef22815589
+
+
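The hunk headers above describe choosing the maximum-margin hyperplane and circling the support vectors that determine it. A minimal sketch of a linear SVM on made-up, linearly separable points, assuming scikit-learn's SVC (the module's linear_svm.py is not included in this diff):

.. code-block:: python

    import numpy as np
    from sklearn.svm import SVC

    # Two small, linearly separable clusters
    X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]])
    y = np.array([0, 0, 0, 1, 1, 1])

    clf = SVC(kernel="linear", C=1.0).fit(X, y)  # maximum-margin separating line
    print("support vectors:\n", clf.support_vectors_)  # points that fix the hyperplane
    print("prediction for (4, 4):", clf.predict([[4, 4]]))

Swapping kernel="linear" for kernel="rbf" is one common way to handle the non-linearly separable case that the kernel trick hunks refer to.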

docs/source/content/supervised/logistic_regression.rst

Lines changed: 18 additions & 1 deletion
@@ -25,7 +25,7 @@ Here is the standard logistic function, note that the output is always between
 0 and 1, but never reaches either of those values.

 .. figure:: _img/WikiLogistic.svg.png
-   :scale: 20%
+   :scale: 100%
    :alt: Logistic
 Ref: https://en.wikipedia.org/wiki/Logistic_regression

@@ -142,3 +142,20 @@ The basic idea is to supply the training data as pairs of input and
 classification, and the model will be built automatically.
 As always, keep in mind the basics mentioned in the overview section of this
 repository, as there is no fool-proof method for machine learning.
+
+
+************
+References
+************
+
+1. https://towardsdatascience.com/logistic-regression-b0af09cdb8ad
+2. https://medium.com/datadriveninvestor/machine-learning-model-logistic-regression-5fa4ffde5773
+3. https://github.com/bfortuner/ml-cheatsheet/blob/master/docs/logistic_regression.rst
+#. https://machinelearningmastery.com/logistic-regression-tutorial-for-machine-learning/
+#. https://towardsdatascience.com/logistic-regression-a-simplified-approach-using-python-c4bc81a87c31
+#. https://hackernoon.com/introduction-to-machine-learning-algorithms-logistic-regression-cbdd82d81a36
+#. https://en.wikipedia.org/wiki/Logistic_regression
+#. https://en.wikipedia.org/wiki/Multinomial_logistic_regression
+#. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
+#. https://towardsdatascience.com/5-reasons-logistic-regression-should-be-the-first-thing-you-learn-when-become-a-data-scientist-fcaae46605c4
+
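The first hunk's header restates that the standard logistic function always outputs a value strictly between 0 and 1. A short sketch of that function, plus scikit-learn's LogisticRegression on invented one-feature data (not the module's own example):

.. code-block:: python

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def logistic(z):
        """Standard logistic (sigmoid) function: output in (0, 1), never reaching either end."""
        return 1.0 / (1.0 + np.exp(-z))

    print(logistic(np.array([-10.0, 0.0, 10.0])))  # approaches 0 and 1 but never reaches them

    # Binary classification on one feature: the class flips around x = 0
    X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
    y = np.array([0, 0, 0, 1, 1, 1])
    clf = LogisticRegression().fit(X, y)
    print(clf.predict_proba([[0.25]]))  # class probabilities, summing to 1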

docs/source/content/unsupervised/clustering.rst

Lines changed: 14 additions & 2 deletions
@@ -128,7 +128,7 @@ K-Means is a good choice.

 The relevant code is available in the clustering_kmeans.py_ file.

-.. _clustering_kmeans.py: /code/unsupervised/Clustering/clustering_kmeans.py
+.. _clustering_kmeans.py: https://github.com/machinelearningmindset/machine-learning-course/code/unsupervised/Clustering/clustering_kmeans.py

 In the code, we create the simple data set to use for analysis. Setting up the
 clustering is very simple and requires one line of code:
@@ -187,7 +187,7 @@ the number of expected clusters.

 The relevant code is available in the clustering_hierarchical.py_ file.

-.. _clustering_hierarchical.py: /code/unsupervised/Clustering/clustering_hierarchical.py
+.. _clustering_hierarchical.py: https://github.com/machinelearningmindset/machine-learning-course/code/unsupervised/Clustering/clustering_hierarchical.py

 In the code, we create the simple data set to use for analysis. Setting up the
 clustering is very simple and requires one line of code:
@@ -221,3 +221,15 @@ the toy manufacturer example that could be used for targeted advertising. This
 is a very useful result for businesses and it only took us a few lines of
 code. By developing a good understanding of clustering, you are setting
 yourself up for success in the machine learning world.
+
+
+************
+References
+************
+
+1. https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/
+2. https://medium.com/datadriveninvestor/an-introduction-to-clustering-61f6930e3e0b
+3. https://medium.com/predict/three-popular-clustering-methods-and-when-to-use-each-4227c80ba2b6
+#. https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
+#. https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
+
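The context lines say that setting up the K-Means clustering "requires one line of code". A minimal sketch of that setup with scikit-learn's KMeans on a made-up two-cluster dataset (the data in clustering_kmeans.py itself is not shown in this diff):

.. code-block:: python

    import numpy as np
    from sklearn.cluster import KMeans

    # Two obvious groups of 2-D points
    data = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])

    kmeans = KMeans(n_clusters=2, random_state=0).fit(data)  # the one-line setup and fit
    print("labels:", kmeans.labels_)
    print("cluster centers:\n", kmeans.cluster_centers_)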
