Add some section headers

2026-01-14 12:14:36 +01:00 · 2021-10-03 00:14:44 +13:00
parent 2bd68d6348
commit 6b821335c0
3 changed files with 239 additions and 26 deletions
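
Note on the "Data Cleaning" cell added below: it lists three options for handling missing `total_bedrooms` values and proposes demonstrating them on the rows that contain at least one null. A minimal sketch of that idea follows, assuming a tiny made-up DataFrame in place of the real housing data. The name `sample_incomplete_rows` and all values are illustrative assumptions, not taken from this diff. Option 3 is written with `assign` instead of `inplace=True` so that the sample copy is never modified in place.

```python
import numpy as np
import pandas as pd

# Hypothetical toy stand-in for the housing DataFrame (values are made up).
housing = pd.DataFrame({
    "total_rooms": [880.0, 7099.0, 1467.0, 1274.0],
    "total_bedrooms": [129.0, np.nan, 190.0, np.nan],
})

# Keep only the rows that contain at least one null value.
sample_incomplete_rows = housing[housing.isnull().any(axis=1)]

# Option 1: drop the rows where total_bedrooms is missing.
option1 = sample_incomplete_rows.dropna(subset=["total_bedrooms"])

# Option 2: drop the whole total_bedrooms column.
option2 = sample_incomplete_rows.drop("total_bedrooms", axis=1)

# Option 3: fill missing values with the median of the full column.
median = housing["total_bedrooms"].median()
option3 = sample_incomplete_rows.assign(
    total_bedrooms=sample_incomplete_rows["total_bedrooms"].fillna(median)
)

print(option1)
print(option2)
print(option3)
```

On this sample, option 1 leaves no rows because every kept row is missing `total_bedrooms`, option 2 leaves only `total_rooms`, and option 3 keeps both rows with the gap filled.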
--- a/02_end_to_end_machine_learning_project.ipynb
+++ b/02_end_to_end_machine_learning_project.ipynb
@@ -83,7 +83,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Get the data"
+    "# Get the Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Download the Data"
   ]
  },
  {
@@ -132,6 +139,13 @@
    "    return pd.read_csv(csv_path)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Take a Quick Look at the Data Structure"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 5,
@@ -182,6 +196,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Create a Test Set"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 10,
@@ -443,7 +464,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Discover and visualize the data to gain insights"
+    "# Discover and Visualize the Data to Gain Insights"
   ]
  },
  {
@@ -455,6 +476,13 @@
    "housing = strat_train_set.copy()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Visualizing Geographical Data"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 33,
@@ -540,6 +568,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Looking for Correlations"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 38,
@@ -585,6 +620,13 @@
    "save_fig(\"income_vs_house_value_scatterplot\")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Experimenting with Attribute Combinations"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 42,
@@ -631,7 +673,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Prepare the data for Machine Learning algorithms"
+    "# Prepare the Data for Machine Learning Algorithms"
   ]
  },
  {
@@ -644,6 +686,29 @@
    "housing_labels = strat_train_set[\"median_house_value\"].copy()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Data Cleaning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the book, three options are listed:\n",
+    "\n",
+    "```python\n",
+    "housing.dropna(subset=[\"total_bedrooms\"])    # option 1\n",
+    "housing.drop(\"total_bedrooms\", axis=1)       # option 2\n",
+    "median = housing[\"total_bedrooms\"].median()  # option 3\n",
+    "housing[\"total_bedrooms\"].fillna(median, inplace=True)\n",
+    "```\n",
+    "\n",
+    "To demonstrate each of them, let's create a copy of the housing dataset that keeps only the rows containing at least one null. That makes it easier to see exactly what each option does:"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 47,
@@ -815,6 +880,13 @@
    "housing_tr.head()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Handling Text and Categorical Attributes"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -910,6 +982,13 @@
    "cat_encoder.categories_"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Custom Transformers"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -985,6 +1064,13 @@
    "housing_extra_attribs.head()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Transformation Pipelines"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -1154,7 +1240,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Select and train a model "
+    "# Select and Train a Model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Training and Evaluating on the Training Set"
   ]
  },
  {
@@ -1269,7 +1362,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Fine-tune your model"
+    "## Better Evaluation Using Cross-Validation"
   ]
  },
  {
@@ -1382,6 +1475,20 @@
    "svm_rmse"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Fine-Tune Your Model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Grid Search"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 99,
@@ -1457,6 +1564,13 @@
    "pd.DataFrame(grid_search.cv_results_)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Randomized Search"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 104,
@@ -1488,6 +1602,13 @@
    "    print(np.sqrt(-mean_score), params)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Analyze the Best Models and Their Errors"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 106,
@@ -1512,6 +1633,13 @@
    "sorted(zip(feature_importances, attributes), reverse=True)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Evaluate Your System on the Test Set"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 108,