Commit 696b80e
committed: Pushing the docs to dev/ for branch: main, commit 8388a1d55ea2ac93b4ecf7c58f695aacb00a2171
1 parent bb5a37b
File tree

1,595 files changed (+7362 additions, -8640 deletions)

dev/.buildinfo

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: bf0cbbb0e1707c9dfb2e280aa1a85df2
+config: 2478929b8f68b6599631942f204e49da
 tags: 645f666f9bcd5a90fca523b33c5a78b7
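
The comment inside .buildinfo states its purpose: Sphinx stores a fingerprint of the build configuration and triggers a full rebuild when the stored value no longer matches. The snippet below is only an illustrative sketch of that idea, not Sphinx's actual implementation; the config_fingerprint helper and the example configuration values are hypothetical.

# Illustrative sketch only; not Sphinx's actual implementation. The helper name
# config_fingerprint and the example configuration values are hypothetical.
import hashlib

def config_fingerprint(config: dict) -> str:
    # Stable digest of the configuration: any changed value yields a new hash.
    serialized = repr(sorted(config.items())).encode("utf-8")
    return hashlib.md5(serialized).hexdigest()

stored_digest = "bf0cbbb0e1707c9dfb2e280aa1a85df2"  # value stored before this commit
current_digest = config_fingerprint({"project": "scikit-learn", "html_theme": "pydata_sphinx_theme"})

# A mismatch means the configuration changed since the last build, so the whole
# documentation tree is regenerated, which is why such commits touch many files.
print(current_digest != stored_digest)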
Binary files not shown.

dev/_downloads/4c1663175b07cf9608b07331aa180eb7/plot_logistic_multinomial.ipynb

Lines changed: 92 additions & 2 deletions
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Plot multinomial and One-vs-Rest Logistic Regression\n\nPlot decision surface of multinomial and One-vs-Rest Logistic Regression.\nThe hyperplanes corresponding to the three One-vs-Rest (OVR) classifiers\nare represented by the dashed lines.\n"
+"\n# Decision Boundaries of Multinomial and One-vs-Rest Logistic Regression\n\nThis example compares decision boundaries of multinomial and one-vs-rest\nlogistic regression on a 2D dataset with three classes.\n\nWe make a comparison of the decision boundaries of both methods that is equivalent\nto call the method `predict`. In addition, we plot the hyperplanes that correspond to\nthe line when the probability estimate for a class is of 0.5.\n"
 ]
 },
 {
@@ -15,7 +15,97 @@
 },
 "outputs": [],
 "source": [
-"# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause\n\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nfrom sklearn.datasets import make_blobs\nfrom sklearn.inspection import DecisionBoundaryDisplay\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.multiclass import OneVsRestClassifier\n\n# make 3-class dataset for classification\ncenters = [[-5, 0], [0, 1.5], [5, -1]]\nX, y = make_blobs(n_samples=1000, centers=centers, random_state=40)\ntransformation = [[0.4, 0.2], [-0.4, 1.2]]\nX = np.dot(X, transformation)\n\nfor multi_class in (\"multinomial\", \"ovr\"):\n clf = LogisticRegression(solver=\"sag\", max_iter=100, random_state=42)\n if multi_class == \"ovr\":\n clf = OneVsRestClassifier(clf)\n clf.fit(X, y)\n\n # print the training scores\n print(\"training score : %.3f (%s)\" % (clf.score(X, y), multi_class))\n\n _, ax = plt.subplots()\n DecisionBoundaryDisplay.from_estimator(\n clf, X, response_method=\"predict\", cmap=plt.cm.Paired, ax=ax\n )\n plt.title(\"Decision surface of LogisticRegression (%s)\" % multi_class)\n plt.axis(\"tight\")\n\n # Plot also the training points\n colors = \"bry\"\n for i, color in zip(clf.classes_, colors):\n idx = np.where(y == i)\n plt.scatter(X[idx, 0], X[idx, 1], c=color, edgecolor=\"black\", s=20)\n\n # Plot the three one-against-all classifiers\n xmin, xmax = plt.xlim()\n ymin, ymax = plt.ylim()\n if multi_class == \"ovr\":\n coef = np.concatenate([est.coef_ for est in clf.estimators_])\n intercept = np.concatenate([est.intercept_ for est in clf.estimators_])\n else:\n coef = clf.coef_\n intercept = clf.intercept_\n\n def plot_hyperplane(c, color):\n def line(x0):\n return (-(x0 * coef[c, 0]) - intercept[c]) / coef[c, 1]\n\n plt.plot([xmin, xmax], [line(xmin), line(xmax)], ls=\"--\", color=color)\n\n for i, color in zip(clf.classes_, colors):\n plot_hyperplane(i, color)\n\nplt.show()"
+"# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Dataset Generation\n\nWe generate a synthetic dataset using :func:`~sklearn.datasets.make_blobs` function.\nThe dataset consists of 1,000 samples from three different classes,\ncentered around [-5, 0], [0, 1.5], and [5, -1]. After generation, we apply a linear\ntransformation to introduce some correlation between features and make the problem\nmore challenging. This results in a 2D dataset with three overlapping classes,\nsuitable for demonstrating the differences between multinomial and one-vs-rest\nlogistic regression.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"import matplotlib.pyplot as plt\nimport numpy as np\n\nfrom sklearn.datasets import make_blobs\n\ncenters = [[-5, 0], [0, 1.5], [5, -1]]\nX, y = make_blobs(n_samples=1_000, centers=centers, random_state=40)\ntransformation = [[0.4, 0.2], [-0.4, 1.2]]\nX = np.dot(X, transformation)\n\nfig, ax = plt.subplots(figsize=(6, 4))\n\nscatter = ax.scatter(X[:, 0], X[:, 1], c=y, edgecolor=\"black\")\nax.set(title=\"Synthetic Dataset\", xlabel=\"Feature 1\", ylabel=\"Feature 2\")\n_ = ax.legend(*scatter.legend_elements(), title=\"Classes\")"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Classifier Training\n\nWe train two different logistic regression classifiers: multinomial and one-vs-rest.\nThe multinomial classifier handles all classes simultaneously, while the one-vs-rest\napproach trains a binary classifier for each class against all others.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"from sklearn.linear_model import LogisticRegression\nfrom sklearn.multiclass import OneVsRestClassifier\n\nlogistic_regression_multinomial = LogisticRegression().fit(X, y)\nlogistic_regression_ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)\n\naccuracy_multinomial = logistic_regression_multinomial.score(X, y)\naccuracy_ovr = logistic_regression_ovr.score(X, y)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Decision Boundaries Visualization\n\nLet's visualize the decision boundaries of both models that is provided by the\nmethod `predict` of the classifiers.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"from sklearn.inspection import DecisionBoundaryDisplay\n\nfig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5), sharex=True, sharey=True)\n\nfor model, title, ax in [\n (\n logistic_regression_multinomial,\n f\"Multinomial Logistic Regression\\n(Accuracy: {accuracy_multinomial:.3f})\",\n ax1,\n ),\n (\n logistic_regression_ovr,\n f\"One-vs-Rest Logistic Regression\\n(Accuracy: {accuracy_ovr:.3f})\",\n ax2,\n ),\n]:\n DecisionBoundaryDisplay.from_estimator(\n model,\n X,\n ax=ax,\n response_method=\"predict\",\n alpha=0.8,\n )\n scatter = ax.scatter(X[:, 0], X[:, 1], c=y, edgecolor=\"k\")\n legend = ax.legend(*scatter.legend_elements(), title=\"Classes\")\n ax.add_artist(legend)\n ax.set_title(title)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"We see that the decision boundaries are different. This difference stems from their\napproaches:\n\n- Multinomial logistic regression considers all classes simultaneously during\n optimization.\n- One-vs-rest logistic regression fits each class independently against all others.\n\nThese distinct strategies can lead to varying decision boundaries, especially in\ncomplex multi-class problems.\n\n## Hyperplanes Visualization\n\nWe also visualize the hyperplanes that correspond to the line when the probability\nestimate for a class is of 0.5.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"def plot_hyperplanes(classifier, X, ax):\n xmin, xmax = X[:, 0].min(), X[:, 0].max()\n ymin, ymax = X[:, 1].min(), X[:, 1].max()\n ax.set(xlim=(xmin, xmax), ylim=(ymin, ymax))\n\n if isinstance(classifier, OneVsRestClassifier):\n coef = np.concatenate([est.coef_ for est in classifier.estimators_])\n intercept = np.concatenate([est.intercept_ for est in classifier.estimators_])\n else:\n coef = classifier.coef_\n intercept = classifier.intercept_\n\n for i in range(coef.shape[0]):\n w = coef[i]\n a = -w[0] / w[1]\n xx = np.linspace(xmin, xmax)\n yy = a * xx - (intercept[i]) / w[1]\n ax.plot(xx, yy, \"--\", linewidth=3, label=f\"Class {i}\")\n\n return ax.get_legend_handles_labels()"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5), sharex=True, sharey=True)\n\nfor model, title, ax in [\n (\n logistic_regression_multinomial,\n \"Multinomial Logistic Regression Hyperplanes\",\n ax1,\n ),\n (logistic_regression_ovr, \"One-vs-Rest Logistic Regression Hyperplanes\", ax2),\n]:\n hyperplane_handles, hyperplane_labels = plot_hyperplanes(model, X, ax)\n scatter = ax.scatter(X[:, 0], X[:, 1], c=y, edgecolor=\"k\")\n scatter_handles, scatter_labels = scatter.legend_elements()\n\n all_handles = hyperplane_handles + scatter_handles\n all_labels = hyperplane_labels + scatter_labels\n\n ax.legend(all_handles, all_labels, title=\"Classes\")\n ax.set_title(title)\n\nplt.show()"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"While the hyperplanes for classes 0 and 2 are quite similar between the two methods,\nwe observe that the hyperplane for class 1 is notably different. This difference stems\nfrom the fundamental approaches of one-vs-rest and multinomial logistic regression:\n\nFor one-vs-rest logistic regression:\n\n- Each hyperplane is determined independently by considering one class against all\n others.\n- For class 1, the hyperplane represents the decision boundary that best separates\n class 1 from the combined classes 0 and 2.\n- This binary approach can lead to simpler decision boundaries but may not capture\n complex relationships between all classes simultaneously.\n- There is no possible interpretation of the conditional class probabilities.\n\nFor multinomial logistic regression:\n\n- All hyperplanes are determined simultaneously, considering the relationships between\n all classes at once.\n- The loss minimized by the model is a proper scoring rule, which means that the model\n is optimized to estimate the conditional class probabilities that are, therefore,\n meaningful.\n- Each hyperplane represents the decision boundary where the probability of one class\n becomes higher than the others, based on the overall probability distribution.\n- This approach can capture more nuanced relationships between classes, potentially\n leading to more accurate classification in multi-class problems.\n\nThe difference in hyperplanes, especially for class 1, highlights how these methods\ncan produce different decision boundaries despite similar overall accuracy.\n\nIn practice, using multinomial logistic regression is recommended since it minimizes a\nwell-formulated loss function, leading to better-calibrated class probabilities and\nthus more interpretable results. When it comes to decision boundaries, one should\nformulate a utility function to transform the class probabilities into a meaningful\nquantity for the problem at hand. One-vs-rest allows for different decision boundaries\nbut does not allow for fine-grained control over the trade-off between the classes as\na utility function would.\n\n"
 ]
 }
 ],
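
The new closing cell argues that the multinomial strategy, because it minimizes a proper scoring rule, yields class probabilities that are directly interpretable, while the one-vs-rest probabilities are rescaled binary estimates. The snippet below is not part of the committed notebook; it is a minimal sketch that reuses the same dataset parameters and default LogisticRegression settings as the cells above to print predict_proba for a few samples so the two sets of estimates can be compared.

# Not part of the commit: a minimal sketch reproducing the notebook's setup to
# compare the probability estimates of the two strategies discussed above.
import numpy as np

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Same synthetic dataset as in the "Dataset Generation" cell.
centers = [[-5, 0], [0, 1.5], [5, -1]]
X, y = make_blobs(n_samples=1_000, centers=centers, random_state=40)
X = np.dot(X, [[0.4, 0.2], [-0.4, 1.2]])

multinomial = LogisticRegression().fit(X, y)
ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)

# Multinomial probabilities come from a softmax over jointly fitted coefficients,
# so they can be read as estimates of the conditional class probabilities.
print(multinomial.predict_proba(X[:3]).round(3))

# OneVsRestClassifier rescales independently fitted binary estimates so that each
# row sums to 1; these values lack the proper-scoring-rule interpretation above.
print(ovr.predict_proba(X[:3]).round(3))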
Binary files not shown.

dev/_downloads/557dc086a33038b608833f8490e00a43/plot_iris_logistic.ipynb

Lines changed: 0 additions & 43 deletions
This file was deleted.
Binary files not shown.

dev/_downloads/8597aee4ffb052082e2e71a7496b7ee0/plot_iris_logistic.py

Lines changed: 0 additions & 52 deletions
This file was deleted.

0 commit comments