Improve alignment between notebook and book section headers

2026-01-14 12:14:36 +01:00 · 2021-10-03 23:05:49 +13:00
parent 6b821335c0
commit 3f89676892
6 changed files with 560 additions and 151 deletions
--- a/08_dimensionality_reduction.ipynb
+++ b/08_dimensionality_reduction.ipynb
@@ -84,8 +84,8 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Projection methods\n",
-    "Build 3D dataset:"
+    "# PCA\n",
+    "Let's build a simple 3D dataset:"
   ]
  },
  {
@@ -110,7 +110,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## PCA using SVD decomposition"
+    "## Principal Components"
   ]
  },
  {
@@ -146,6 +146,13 @@
    "np.allclose(X_centered, U.dot(S).dot(Vt))"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Projecting Down to d Dimensions"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 6,
@@ -169,7 +176,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## PCA using Scikit-Learn"
+    "## Using Scikit-Learn"
   ]
  },
  {
@@ -344,6 +351,13 @@
    "Notice how the axes are flipped."
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Explained Variance Ratio"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -406,6 +420,13 @@
    "Next, let's generate some nice figures! :)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–2. A 3D dataset lying close to a 2D subspace:**"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -515,6 +536,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–3. The new 2D dataset after projection:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 25,
@@ -540,8 +568,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Manifold learning\n",
-    "Swiss roll:"
+    "**Code to generate Figure 8–4. Swiss roll dataset:**"
   ]
  },
  {
@@ -551,6 +578,7 @@
   "outputs": [],
   "source": [
    "from sklearn.datasets import make_swiss_roll\n",
+    "\n",
    "X, t = make_swiss_roll(n_samples=1000, noise=0.2, random_state=42)"
   ]
  },
@@ -578,6 +606,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–5. Squashing by projecting onto a plane (left) versus unrolling the Swiss roll (right):**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 28,
@@ -603,6 +638,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–6. The decision boundary may not always be simpler with lower dimensions:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 29,
@@ -688,7 +730,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# PCA"
+    "**Code to generate Figure 8–7. Selecting the subspace to project on:**"
   ]
  },
  {
@@ -761,7 +803,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# MNIST compression"
+    "## Choosing the Right Number of Dimensions"
   ]
  },
  {
@@ -818,6 +860,13 @@
    "d"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–8. Explained variance as a function of the number of dimensions:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 35,
@@ -867,6 +916,13 @@
    "np.sum(pca.explained_variance_ratio_)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## PCA for Compression"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 39,
@@ -878,6 +934,13 @@
    "X_recovered = pca.inverse_transform(X_reduced)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–9. MNIST compression that preserves 95% of the variance:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 40,
@@ -930,7 +993,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Incremental PCA"
+    "## Randomized PCA"
   ]
  },
  {
@@ -938,6 +1001,23 @@
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
+   "source": [
+    "rnd_pca = PCA(n_components=154, svd_solver=\"randomized\", random_state=42)\n",
+    "X_reduced = rnd_pca.fit_transform(X_train)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Incremental PCA"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 44,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "from sklearn.decomposition import IncrementalPCA\n",
    "\n",
@@ -952,16 +1032,23 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 44,
+   "execution_count": 45,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_recovered_inc_pca = inc_pca.inverse_transform(X_reduced)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's check that compression still works well:"
+   ]
+  },
  {
   "cell_type": "code",
-   "execution_count": 45,
+   "execution_count": 46,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -975,7 +1062,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 46,
+   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -991,7 +1078,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 47,
+   "execution_count": 48,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -1007,7 +1094,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 48,
+   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -1018,7 +1105,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### Using `memmap()`"
+    "**Using `memmap()`:**"
   ]
  },
  {
@@ -1030,7 +1117,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 49,
+   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -1050,7 +1137,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 50,
+   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -1066,7 +1153,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 51,
+   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -1077,21 +1164,11 @@
    "inc_pca.fit(X_mm)"
   ]
  },
-  {
-   "cell_type": "code",
-   "execution_count": 52,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "rnd_pca = PCA(n_components=154, svd_solver=\"randomized\", random_state=42)\n",
-    "X_reduced = rnd_pca.fit_transform(X_train)"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Time complexity"
+    "**Time complexity:**"
   ]
  },
  {
@@ -1226,6 +1303,13 @@
    "X_reduced = rbf_pca.fit_transform(X)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–10. Swiss roll reduced to 2D using kPCA with various kernels:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 58,
@@ -1260,6 +1344,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–11. Kernel PCA and the reconstruction pre-image error:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 59,
@@ -1300,6 +1391,13 @@
    "plt.grid(True)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Selecting a Kernel and Tuning Hyperparameters"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 61,
@@ -1384,6 +1482,13 @@
    "X_reduced = lle.fit_transform(X)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–12. Unrolled Swiss roll using LLE:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 67,
@@ -1405,7 +1510,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# MDS, Isomap and t-SNE"
+    "## Other Dimensionality Reduction Techniques"
   ]
  },
  {
@@ -1459,6 +1564,13 @@
    "X_reduced_lda = lda.transform(X_mnist)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Code to generate Figure 8–13. Using various techniques to reduce the Swiss roll to 2D:**"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 72,