Adding missing figure in chapter 02

Aurélien Geron
2017-06-08 14:23:33 +02:00
parent 74794da1de
commit 8935c61570


@@ -14,6 +14,13 @@
"*This notebook contains all the sample code and solutions to the exercises in chapter 2.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: You may find small differences between the code outputs in the book and in these Jupyter notebooks. These differences are mostly due to the random nature of many training algorithms: although I have tried to make these notebooks' outputs as constant as possible, it is impossible to guarantee that they will produce the exact same output on every platform. Also, some data structures (such as dictionaries) do not preserve the item order. Finally, I fixed a few minor bugs (I added notes next to the affected cells), which leads to slightly different results without changing the ideas presented in the book."
]
},
{
"cell_type": "markdown",
"metadata": {
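Review note on the reproducibility caveat in the hunk above: the usual way to keep outputs as constant as possible is to seed the random number generator before any shuffling or sampling. The sketch below is only an illustration under that assumption; `sample_indices` and its parameters are hypothetical names, not code from the notebook.

```python
import numpy as np

def sample_indices(n_rows, test_ratio, seed=None):
    """Return shuffled test-set indices; a fixed `seed` makes the shuffle repeatable."""
    rng = np.random.RandomState(seed)      # private generator, does not touch global state
    shuffled = rng.permutation(n_rows)     # random ordering of 0..n_rows-1
    test_size = int(n_rows * test_ratio)
    return shuffled[:test_size]

# Same seed, same split: this is what keeps notebook outputs stable between runs.
first = sample_indices(100, 0.2, seed=42)
second = sample_indices(100, 0.2, seed=42)
print(np.array_equal(first, second))
```

Seeding only makes a run repeatable on a given setup; differences between platforms or library versions can still change the results, as the note says.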
@@ -408,6 +415,17 @@
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"housing[\"income_cat\"].hist()"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -425,7 +443,7 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 25,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -438,7 +456,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 26,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -462,7 +480,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 27,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -475,7 +493,7 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 28,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -499,7 +517,7 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 29,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -512,7 +530,7 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 30,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -526,7 +544,7 @@
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 31,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -538,9 +556,16 @@
"save_fig(\"better_visualization_plot\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The argument `sharex=False` fixes a display bug (the x-axis values and legend were not displayed). This is a temporary fix (see: https://github.com/pandas-dev/pandas/issues/10611). Thanks to Wilmer Arellano for pointing it out."
]
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 32,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -551,16 +576,14 @@
"housing.plot(kind=\"scatter\", x=\"longitude\", y=\"latitude\", alpha=0.4,\n",
" s=housing[\"population\"]/100, label=\"population\", figsize=(10,7),\n",
" c=\"median_house_value\", cmap=plt.get_cmap(\"jet\"), colorbar=True,\n",
" sharex=False) # sharex=False fixes a bug (temporary solution)\n",
" # See: https://github.com/pandas-dev/pandas/issues/10611\n",
" # Thanks to Wilmer Arellano for pointing it out.\n",
" sharex=False)\n",
"plt.legend()\n",
"save_fig(\"housing_prices_scatterplot\")"
]
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 33,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -592,7 +615,7 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 34,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -605,7 +628,7 @@
},
{
"cell_type": "code",
"execution_count": 34,
"execution_count": 35,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -618,7 +641,7 @@
},
{
"cell_type": "code",
"execution_count": 35,
"execution_count": 36,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -634,7 +657,7 @@
},
{
"cell_type": "code",
"execution_count": 36,
"execution_count": 37,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -652,7 +675,7 @@
},
{
"cell_type": "code",
"execution_count": 37,
"execution_count": 38,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -665,9 +688,16 @@
"housing[\"population_per_household\"]=housing[\"population\"]/housing[\"households\"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: there was a bug in the previous cell, in the definition of the `rooms_per_household` attribute. This explains why the correlation value below differs slightly from the value in the book (unless you are reading the latest version)."
]
},
{
"cell_type": "code",
"execution_count": 38,
"execution_count": 39,
"metadata": {
"collapsed": false,
"deletable": true,
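Review note on the `rooms_per_household` remark in the hunk above: the corrected definition is not visible in this diff, so the following is only a sketch on made-up data. It assumes the attribute is meant as `total_rooms / households` (the `total_rooms` column name is an assumption here) and shows how such a ratio feeds into the correlation with `median_house_value`.

```python
import pandas as pd

# Made-up rows for illustration only; these are not values from the real dataset.
housing = pd.DataFrame({
    "total_rooms": [800.0, 1200.0, 1500.0, 2000.0],
    "households": [200.0, 300.0, 250.0, 400.0],
    "population": [600.0, 900.0, 500.0, 1000.0],
    "median_house_value": [150000.0, 180000.0, 300000.0, 210000.0],
})

# Assumed definition: rooms divided by households (not by population).
housing["rooms_per_household"] = housing["total_rooms"] / housing["households"]
# Same form as the line shown in the diff.
housing["population_per_household"] = housing["population"] / housing["households"]

# Correlation of every attribute with the target, strongest first.
corr_matrix = housing.corr()
print(corr_matrix["median_house_value"].sort_values(ascending=False))
```

Changing the denominator of a ratio attribute changes its correlation with the target, which is consistent with the note that the value differs slightly from the book.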
@@ -681,7 +711,7 @@
},
{
"cell_type": "code",
"execution_count": 39,
"execution_count": 40,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -697,7 +727,7 @@
},
{
"cell_type": "code",
"execution_count": 40,
"execution_count": 41,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -720,7 +750,7 @@
},
{
"cell_type": "code",
"execution_count": 41,
"execution_count": 42,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -732,19 +762,6 @@
"housing_labels = strat_train_set[\"median_house_value\"].copy()"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"housing.iloc[21:24]"
]
},
{
"cell_type": "code",
"execution_count": 43,
@@ -755,8 +772,7 @@
},
"outputs": [],
"source": [
"housing_copy = housing.copy().iloc[21:24]\n",
"housing_copy.dropna(subset=[\"total_bedrooms\"]) # option 1"
"housing.iloc[21:24]"
]
},
{
@@ -770,7 +786,7 @@
"outputs": [],
"source": [
"housing_copy = housing.copy().iloc[21:24]\n",
"housing_copy.drop(\"total_bedrooms\", axis=1) # option 2"
"housing_copy.dropna(subset=[\"total_bedrooms\"]) # option 1"
]
},
{
@@ -782,6 +798,20 @@
"editable": true
},
"outputs": [],
"source": [
"housing_copy = housing.copy().iloc[21:24]\n",
"housing_copy.drop(\"total_bedrooms\", axis=1) # option 2"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"housing_copy = housing.copy().iloc[21:24]\n",
"median = housing_copy[\"total_bedrooms\"].median()\n",
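Review note on the reordered cells above: they compare ways of handling missing `total_bedrooms` values. The sketch below runs the two labeled options on a small made-up frame. The third cell is cut off in this diff after the median is computed, so the `fillna` step shown here is an assumption about how it continues, and `sample` is a hypothetical stand-in for `housing_copy`.

```python
import numpy as np
import pandas as pd

# Made-up rows with one missing value; not taken from the real dataset.
sample = pd.DataFrame({
    "total_rooms": [1000.0, 1500.0, 1200.0],
    "total_bedrooms": [200.0, np.nan, 300.0],
})

# Option 1: drop the rows where total_bedrooms is missing.
option1 = sample.dropna(subset=["total_bedrooms"])

# Option 2: drop the whole total_bedrooms column.
option2 = sample.drop("total_bedrooms", axis=1)

# Assumed third option: fill the missing values with the column median.
median = sample["total_bedrooms"].median()
option3 = sample.copy()
option3["total_bedrooms"] = option3["total_bedrooms"].fillna(median)

print(option1.shape, option2.shape, option3["total_bedrooms"].tolist())
```

None of these calls modifies `sample` in place, which is why the notebook can work on a fresh `housing.copy().iloc[21:24]` in each cell and show the three results side by side.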
@@ -791,7 +821,7 @@
},
{
"cell_type": "code",
"execution_count": 46,
"execution_count": 47,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -804,7 +834,7 @@
},
{
"cell_type": "code",
"execution_count": 47,
"execution_count": 48,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -819,7 +849,7 @@
},
{
"cell_type": "code",
"execution_count": 48,
"execution_count": 49,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -841,7 +871,7 @@
},
{
"cell_type": "code",
"execution_count": 49,
"execution_count": 50,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -856,7 +886,7 @@
},
{
"cell_type": "code",
"execution_count": 50,
"execution_count": 51,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -869,7 +899,7 @@
},
{
"cell_type": "code",
"execution_count": 51,
"execution_count": 52,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -882,7 +912,7 @@
},
{
"cell_type": "code",
"execution_count": 52,
"execution_count": 53,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -895,7 +925,7 @@
},
{
"cell_type": "code",
"execution_count": 53,
"execution_count": 54,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -908,7 +938,7 @@
},
{
"cell_type": "code",
"execution_count": 54,
"execution_count": 55,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -921,7 +951,7 @@
},
{
"cell_type": "code",
"execution_count": 55,
"execution_count": 56,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -934,7 +964,7 @@
},
{
"cell_type": "code",
"execution_count": 56,
"execution_count": 57,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -947,7 +977,7 @@
},
{
"cell_type": "code",
"execution_count": 57,
"execution_count": 58,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -960,7 +990,7 @@
},
{
"cell_type": "code",
"execution_count": 58,
"execution_count": 59,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -974,7 +1004,7 @@
},
{
"cell_type": "code",
"execution_count": 59,
"execution_count": 60,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -992,7 +1022,7 @@
},
{
"cell_type": "code",
"execution_count": 60,
"execution_count": 61,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1005,7 +1035,7 @@
},
{
"cell_type": "code",
"execution_count": 61,
"execution_count": 62,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1022,7 +1052,7 @@
},
{
"cell_type": "code",
"execution_count": 62,
"execution_count": 63,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1035,7 +1065,7 @@
},
{
"cell_type": "code",
"execution_count": 63,
"execution_count": 64,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1052,7 +1082,7 @@
},
{
"cell_type": "code",
"execution_count": 64,
"execution_count": 65,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1085,7 +1115,7 @@
},
{
"cell_type": "code",
"execution_count": 65,
"execution_count": 66,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1099,7 +1129,7 @@
},
{
"cell_type": "code",
"execution_count": 66,
"execution_count": 67,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1121,7 +1151,7 @@
},
{
"cell_type": "code",
"execution_count": 67,
"execution_count": 68,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1134,7 +1164,7 @@
},
{
"cell_type": "code",
"execution_count": 68,
"execution_count": 69,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -1155,7 +1185,7 @@
},
{
"cell_type": "code",
"execution_count": 69,
"execution_count": 70,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -1181,7 +1211,7 @@
},
{
"cell_type": "code",
"execution_count": 70,
"execution_count": 71,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1199,7 +1229,7 @@
},
{
"cell_type": "code",
"execution_count": 71,
"execution_count": 72,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1213,7 +1243,7 @@
},
{
"cell_type": "code",
"execution_count": 72,
"execution_count": 73,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1236,7 +1266,7 @@
},
{
"cell_type": "code",
"execution_count": 73,
"execution_count": 74,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1252,7 +1282,7 @@
},
{
"cell_type": "code",
"execution_count": 74,
"execution_count": 75,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1270,7 +1300,7 @@
},
{
"cell_type": "code",
"execution_count": 75,
"execution_count": 76,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1283,7 +1313,7 @@
},
{
"cell_type": "code",
"execution_count": 76,
"execution_count": 77,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1296,7 +1326,7 @@
},
{
"cell_type": "code",
"execution_count": 77,
"execution_count": 78,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1314,7 +1344,7 @@
},
{
"cell_type": "code",
"execution_count": 78,
"execution_count": 79,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1330,7 +1360,7 @@
},
{
"cell_type": "code",
"execution_count": 79,
"execution_count": 80,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1346,7 +1376,7 @@
},
{
"cell_type": "code",
"execution_count": 80,
"execution_count": 81,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1372,7 +1402,7 @@
},
{
"cell_type": "code",
"execution_count": 81,
"execution_count": 82,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1389,7 +1419,7 @@
},
{
"cell_type": "code",
"execution_count": 82,
"execution_count": 83,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1407,7 +1437,7 @@
},
{
"cell_type": "code",
"execution_count": 83,
"execution_count": 84,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1423,7 +1453,7 @@
},
{
"cell_type": "code",
"execution_count": 84,
"execution_count": 85,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1439,7 +1469,7 @@
},
{
"cell_type": "code",
"execution_count": 85,
"execution_count": 86,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1455,7 +1485,7 @@
},
{
"cell_type": "code",
"execution_count": 86,
"execution_count": 87,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1473,7 +1503,7 @@
},
{
"cell_type": "code",
"execution_count": 87,
"execution_count": 88,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1487,7 +1517,7 @@
},
{
"cell_type": "code",
"execution_count": 88,
"execution_count": 89,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1507,7 +1537,7 @@
},
{
"cell_type": "code",
"execution_count": 89,
"execution_count": 90,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1530,7 +1560,7 @@
},
{
"cell_type": "code",
"execution_count": 90,
"execution_count": 91,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1543,7 +1573,7 @@
},
{
"cell_type": "code",
"execution_count": 91,
"execution_count": 92,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1556,7 +1586,7 @@
},
{
"cell_type": "code",
"execution_count": 92,
"execution_count": 93,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1571,7 +1601,7 @@
},
{
"cell_type": "code",
"execution_count": 93,
"execution_count": 94,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1584,7 +1614,7 @@
},
{
"cell_type": "code",
"execution_count": 94,
"execution_count": 95,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1608,7 +1638,7 @@
},
{
"cell_type": "code",
"execution_count": 95,
"execution_count": 96,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1623,7 +1653,7 @@
},
{
"cell_type": "code",
"execution_count": 96,
"execution_count": 97,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1637,7 +1667,7 @@
},
{
"cell_type": "code",
"execution_count": 97,
"execution_count": 98,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1653,7 +1683,7 @@
},
{
"cell_type": "code",
"execution_count": 98,
"execution_count": 99,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -1675,7 +1705,7 @@
},
{
"cell_type": "code",
"execution_count": 99,
"execution_count": 100,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1711,7 +1741,7 @@
},
{
"cell_type": "code",
"execution_count": 100,
"execution_count": 101,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1748,7 +1778,7 @@
},
{
"cell_type": "code",
"execution_count": 101,
"execution_count": 102,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -1761,7 +1791,7 @@
},
{
"cell_type": "code",
"execution_count": 102,
"execution_count": 103,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -1787,7 +1817,7 @@
},
{
"cell_type": "code",
"execution_count": 103,
"execution_count": 104,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1837,7 +1867,7 @@
},
{
"cell_type": "code",
"execution_count": 104,
"execution_count": 105,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1870,7 +1900,7 @@
},
{
"cell_type": "code",
"execution_count": 105,
"execution_count": 106,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1895,7 +1925,7 @@
},
{
"cell_type": "code",
"execution_count": 106,
"execution_count": 107,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1938,7 +1968,7 @@
},
{
"cell_type": "code",
"execution_count": 107,
"execution_count": 108,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -1978,7 +2008,7 @@
},
{
"cell_type": "code",
"execution_count": 108,
"execution_count": 109,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2003,7 +2033,7 @@
},
{
"cell_type": "code",
"execution_count": 109,
"execution_count": 110,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2036,7 +2066,7 @@
},
{
"cell_type": "code",
"execution_count": 110,
"execution_count": 111,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2068,7 +2098,7 @@
},
{
"cell_type": "code",
"execution_count": 111,
"execution_count": 112,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2120,7 +2150,7 @@
},
{
"cell_type": "code",
"execution_count": 112,
"execution_count": 113,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -2166,7 +2196,7 @@
},
{
"cell_type": "code",
"execution_count": 113,
"execution_count": 114,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -2189,7 +2219,7 @@
},
{
"cell_type": "code",
"execution_count": 114,
"execution_count": 115,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2203,7 +2233,7 @@
},
{
"cell_type": "code",
"execution_count": 115,
"execution_count": 116,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2226,7 +2256,7 @@
},
{
"cell_type": "code",
"execution_count": 116,
"execution_count": 117,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2249,7 +2279,7 @@
},
{
"cell_type": "code",
"execution_count": 117,
"execution_count": 118,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2265,7 +2295,7 @@
},
{
"cell_type": "code",
"execution_count": 118,
"execution_count": 119,
"metadata": {
"collapsed": true,
"deletable": true,
@@ -2288,7 +2318,7 @@
},
{
"cell_type": "code",
"execution_count": 119,
"execution_count": 120,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2311,7 +2341,7 @@
},
{
"cell_type": "code",
"execution_count": 120,
"execution_count": 121,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2354,7 +2384,7 @@
},
{
"cell_type": "code",
"execution_count": 121,
"execution_count": 122,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2371,7 +2401,7 @@
},
{
"cell_type": "code",
"execution_count": 122,
"execution_count": 123,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2394,7 +2424,7 @@
},
{
"cell_type": "code",
"execution_count": 123,
"execution_count": 124,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2441,7 +2471,7 @@
},
{
"cell_type": "code",
"execution_count": 124,
"execution_count": 125,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2461,7 +2491,7 @@
},
{
"cell_type": "code",
"execution_count": 125,
"execution_count": 126,
"metadata": {
"collapsed": false,
"deletable": true,
@@ -2484,7 +2514,7 @@
},
{
"cell_type": "code",
"execution_count": 126,
"execution_count": 127,
"metadata": {
"collapsed": false,
"deletable": true,