Update all notebooks assuming we are all in the future now: sklearn 0.20+, python 3.5+, TF 2.0 preview

Aurélien Geron
2019-01-18 23:08:37 +08:00
parent 10c432a997
commit 6b8dff91d0
12 changed files with 1186 additions and 2625 deletions


@@ -1,5 +1,12 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
@@ -60,6 +67,23 @@
" plt.savefig(path, format='png', dpi=300)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook assumes you have installed Scikit-Learn ≥0.20."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import sklearn\n",
"assert sklearn.__version__ >= \"0.20\""
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -76,7 +100,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -98,7 +122,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@@ -160,7 +184,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
@@ -204,7 +228,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
@@ -269,7 +293,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
@@ -293,7 +317,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
@@ -309,7 +333,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
@@ -332,7 +356,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
@@ -356,7 +380,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
@@ -393,7 +417,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
@@ -432,7 +456,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
@@ -453,7 +477,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
@@ -472,7 +496,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
@@ -495,7 +519,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
@@ -510,7 +534,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
@@ -523,7 +547,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
@@ -545,7 +569,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 19,
"metadata": {
"scrolled": true
},
@@ -613,7 +637,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
@@ -625,7 +649,7 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -638,7 +662,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 22,
"metadata": {
"scrolled": true
},
@@ -681,7 +705,7 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
@@ -693,7 +717,7 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
@@ -705,7 +729,7 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
@@ -728,7 +752,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
@@ -765,7 +789,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
@@ -779,19 +803,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"**Warning**: the default value of `gamma` will change from `'auto'` to `'scale'` in version 0.22 to better account for unscaled features. To preserve the same results as in the book, we explicitly set it to `'auto'`, but you should probably just use the default in your own code."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.svm import SVR\n",
"\n",
"svm_poly_reg = SVR(kernel=\"poly\", degree=2, C=100, epsilon=0.1, gamma=\"auto\")\n",
"svm_poly_reg.fit(X, y)"
"**Note**: to be future-proof, we set `gamma=\"scale\"`, as this will be the default value in Scikit-Learn 0.22."
]
},
{
@@ -802,15 +814,27 @@
"source": [
"from sklearn.svm import SVR\n",
"\n",
"svm_poly_reg1 = SVR(kernel=\"poly\", degree=2, C=100, epsilon=0.1, gamma=\"auto\")\n",
"svm_poly_reg2 = SVR(kernel=\"poly\", degree=2, C=0.01, epsilon=0.1, gamma=\"auto\")\n",
"svm_poly_reg = SVR(kernel=\"poly\", degree=2, C=100, epsilon=0.1, gamma=\"scale\")\n",
"svm_poly_reg.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.svm import SVR\n",
"\n",
"svm_poly_reg1 = SVR(kernel=\"poly\", degree=2, C=100, epsilon=0.1, gamma=\"scale\")\n",
"svm_poly_reg2 = SVR(kernel=\"poly\", degree=2, C=0.01, epsilon=0.1, gamma=\"scale\")\n",
"svm_poly_reg1.fit(X, y)\n",
"svm_poly_reg2.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
@@ -835,7 +859,7 @@
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
@@ -846,7 +870,7 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
@@ -874,17 +898,17 @@
" ax.plot_wireframe(x1, x2, df, alpha=0.3, color=\"k\")\n",
" ax.plot(X_crop[:, 0][y_crop==0], X_crop[:, 1][y_crop==0], 0, \"bs\")\n",
" ax.axis(x1_lim + x2_lim)\n",
" ax.text(4.5, 2.5, 3.8, \"Decision function $h$\", fontsize=15)\n",
" ax.set_xlabel(r\"Petal length\", fontsize=15)\n",
" ax.set_ylabel(r\"Petal width\", fontsize=15)\n",
" ax.set_zlabel(r\"$h = \\mathbf{w}^T \\mathbf{x} + b$\", fontsize=18)\n",
" ax.text(4.5, 2.5, 3.8, \"Decision function $h$\", fontsize=16)\n",
" ax.set_xlabel(r\"Petal length\", fontsize=16, labelpad=10)\n",
" ax.set_ylabel(r\"Petal width\", fontsize=16, labelpad=10)\n",
" ax.set_zlabel(r\"$h = \\mathbf{w}^T \\mathbf{x} + b$\", fontsize=18, labelpad=5)\n",
" ax.legend(loc=\"upper left\", fontsize=16)\n",
"\n",
"fig = plt.figure(figsize=(11, 6))\n",
"ax1 = fig.add_subplot(111, projection='3d')\n",
"plot_3D_decision_function(ax1, w=svm_clf2.coef_[0], b=svm_clf2.intercept_[0])\n",
"\n",
"#save_fig(\"iris_3D_plot\")\n",
"save_fig(\"iris_3D_plot\")\n",
"plt.show()"
]
},
@@ -897,7 +921,7 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
@@ -931,7 +955,7 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
@@ -956,7 +980,7 @@
},
{
"cell_type": "code",
"execution_count": 34,
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
@@ -992,7 +1016,7 @@
},
{
"cell_type": "code",
"execution_count": 35,
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
@@ -1003,7 +1027,7 @@
},
{
"cell_type": "code",
"execution_count": 36,
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
@@ -1021,7 +1045,11 @@
" tols.append(tol)\n",
" print(i, tol, t2-t1)\n",
" tol /= 10\n",
"plt.semilogx(tols, times)"
"plt.semilogx(tols, times, \"bo-\")\n",
"plt.xlabel(\"Tolerance\", fontsize=16)\n",
"plt.ylabel(\"Time (seconds)\", fontsize=16)\n",
"plt.grid(True)\n",
"plt.show()"
]
},
{
@@ -1033,7 +1061,7 @@
},
{
"cell_type": "code",
"execution_count": 37,
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
@@ -1044,7 +1072,7 @@
},
{
"cell_type": "code",
"execution_count": 38,
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
@@ -1109,7 +1137,7 @@
},
{
"cell_type": "code",
"execution_count": 39,
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
@@ -1119,7 +1147,7 @@
},
{
"cell_type": "code",
"execution_count": 40,
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
@@ -1128,7 +1156,7 @@
},
{
"cell_type": "code",
"execution_count": 41,
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
@@ -1139,7 +1167,7 @@
},
{
"cell_type": "code",
"execution_count": 42,
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
@@ -1165,7 +1193,7 @@
},
{
"cell_type": "code",
"execution_count": 43,
"execution_count": 44,
"metadata": {
"scrolled": true
},
@@ -1173,7 +1201,7 @@
"source": [
"from sklearn.linear_model import SGDClassifier\n",
"\n",
"sgd_clf = SGDClassifier(loss=\"hinge\", alpha = 0.017, max_iter = 50, tol=-np.infty, random_state=42)\n",
"sgd_clf = SGDClassifier(loss=\"hinge\", alpha=0.017, max_iter=1000, tol=1e-3, random_state=42)\n",
"sgd_clf.fit(X, y.ravel())\n",
"\n",
"m = len(X)\n",
@@ -1242,7 +1270,7 @@
},
{
"cell_type": "code",
"execution_count": 44,
"execution_count": 45,
"metadata": {},
"outputs": [],
"source": [
@@ -1259,7 +1287,7 @@
},
{
"cell_type": "code",
"execution_count": 45,
"execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
@@ -1273,7 +1301,7 @@
"lin_clf = LinearSVC(loss=\"hinge\", C=C, random_state=42)\n",
"svm_clf = SVC(kernel=\"linear\", C=C)\n",
"sgd_clf = SGDClassifier(loss=\"hinge\", learning_rate=\"constant\", eta0=0.001, alpha=alpha,\n",
" max_iter=100000, tol=-np.infty, random_state=42)\n",
" max_iter=1000, tol=1e-3, random_state=42)\n",
"\n",
"scaler = StandardScaler()\n",
"X_scaled = scaler.fit_transform(X)\n",
@@ -1296,7 +1324,7 @@
},
{
"cell_type": "code",
"execution_count": 46,
"execution_count": 47,
"metadata": {},
"outputs": [],
"source": [
@@ -1358,19 +1386,15 @@
},
{
"cell_type": "code",
"execution_count": 47,
"execution_count": 59,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" from sklearn.datasets import fetch_openml\n",
" mnist = fetch_openml('mnist_784', version=1, cache=True)\n",
"except ImportError:\n",
" from sklearn.datasets import fetch_mldata\n",
" mnist = fetch_mldata('MNIST original')\n",
"from sklearn.datasets import fetch_openml\n",
"mnist = fetch_openml('mnist_784', version=1, cache=True)\n",
"\n",
"X = mnist[\"data\"]\n",
"y = mnist[\"target\"]\n",
"y = mnist[\"target\"].astype(np.uint8)\n",
"\n",
"X_train = X[:60000]\n",
"y_train = y[:60000]\n",
@@ -1382,31 +1406,21 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Many training algorithms are sensitive to the order of the training instances, so it's generally good practice to shuffle them first:"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [],
"source": [
"np.random.seed(42)\n",
"rnd_idx = np.random.permutation(60000)\n",
"X_train = X_train[rnd_idx]\n",
"y_train = y_train[rnd_idx]"
"Many training algorithms are sensitive to the order of the training instances, so it's generally good practice to shuffle them first. However, the dataset is already shuffled, so we do not need to do it."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's start simple, with a linear SVM classifier. It will automatically use the One-vs-All (also called One-vs-the-Rest, OvR) strategy, so there's nothing special we need to do. Easy!"
"Let's start simple, with a linear SVM classifier. It will automatically use the One-vs-All (also called One-vs-the-Rest, OvR) strategy, so there's nothing special we need to do. Easy!\n",
"\n",
"**Warning**: this may take a few minutes depending on your hardware."
]
},
{
"cell_type": "code",
"execution_count": 49,
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
@@ -1423,7 +1437,7 @@
},
{
"cell_type": "code",
"execution_count": 50,
"execution_count": 61,
"metadata": {},
"outputs": [],
"source": [
@@ -1437,12 +1451,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Wow, 86% accuracy on MNIST is a really bad performance. This linear model is certainly too simple for MNIST, but perhaps we just needed to scale the data first:"
"Okay, 89.5% accuracy on MNIST is pretty bad. This linear model is certainly too simple for MNIST, but perhaps we just needed to scale the data first:"
]
},
{
"cell_type": "code",
"execution_count": 51,
"execution_count": 62,
"metadata": {},
"outputs": [],
"source": [
@@ -1451,9 +1465,16 @@
"X_test_scaled = scaler.transform(X_test.astype(np.float32))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Warning**: this may take a few minutes depending on your hardware."
]
},
{
"cell_type": "code",
"execution_count": 52,
"execution_count": 63,
"metadata": {},
"outputs": [],
"source": [
@@ -1463,7 +1484,7 @@
},
{
"cell_type": "code",
"execution_count": 53,
"execution_count": 64,
"metadata": {},
"outputs": [],
"source": [
@@ -1475,24 +1496,29 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"That's much better (we cut the error rate in two), but still not great at all for MNIST. If we want to use an SVM, we will have to use a kernel. Let's try an `SVC` with an RBF kernel (the default).\n",
"\n",
"**Warning**: if you are using Scikit-Learn ≤ 0.19, the `SVC` class will use the One-vs-One (OvO) strategy by default, so you must explicitly set `decision_function_shape=\"ovr\"` if you want to use the OvR strategy instead (OvR is the default since 0.19)."
"That's much better (we cut the error rate by about 25%), but still not great at all for MNIST. If we want to use an SVM, we will have to use a kernel. Let's try an `SVC` with an RBF kernel (the default)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: to be future-proof we set `gamma=\"scale\"` since it will be the default value in Scikit-Learn 0.22."
]
},
{
"cell_type": "code",
"execution_count": 54,
"execution_count": 77,
"metadata": {},
"outputs": [],
"source": [
"svm_clf = SVC(decision_function_shape=\"ovr\", gamma=\"auto\")\n",
"svm_clf = SVC(gamma=\"scale\")\n",
"svm_clf.fit(X_train_scaled[:10000], y_train[:10000])"
]
},
{
"cell_type": "code",
"execution_count": 55,
"execution_count": 78,
"metadata": {},
"outputs": [],
"source": [
@@ -1793,7 +1819,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 - tf2",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},