diff --git a/README.md b/README.md
index 6e0de04..a535cef 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Machine Learning Exercises
 
-This repository contains the exercises accompanying the theory from my [machine learning book](https://franziskahorn.de/mlbook/).
+This repository contains the Python programming exercises accompanying the theory from my [machine learning book](https://franziskahorn.de/mlbook/). They are part of the curriculum of the [ML for Data Scientists Workshop](https://franziskahorn.de/mlws_scientist.html).
 
 If you have any questions, please send me an [email](mailto:hey@franziskahorn.de).
 
@@ -8,7 +8,7 @@ Have fun!
 
 ### Using Python
 
-The programming exercises are written in Python. If you're unfamiliar with Python, you should work through [this tutorial](https://github.com/cod3licious/python_tutorial) at the beginning of the course.
+The programming exercises are written in Python. If you're unfamiliar with Python, you should work through [this tutorial](https://github.com/cod3licious/python_tutorial).
 
 ##### Using Python on your own computer
 The [Python tutorial](https://github.com/cod3licious/python_tutorial) includes some notes on how to install Python and Jupyter Notebook on your own computer.
@@ -18,96 +18,3 @@ Please make sure you're using Python 3 and all libraries listed in the [`require
 If you have a Google account, you can also run the code in the cloud using Google Colab:
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cod3licious/ml_exercises)
 While Google Colab already includes most packages that we need, should you require an additional library (e.g., `skorch` for training PyTorch neural networks in notebook 5), you can install a package by executing `!pip install package` in a notebook cell. With Colab, it is also possible to run code on a GPU, but this has to be manually selected.
-
-
-## Course Overview
-
-For an optimal learning experience, the chapters from the [machine learning book](https://franziskahorn.de/mlbook/) should be interleaved with quizzes and programming exercises as shown below. Additionally, you should take notes in the [workbook](/other/ml_course_workbook.pdf) while working through the materials.
-
-**Important:** Please make a note of all questions that arise while working through the materials. At the beginning of each group session, we'll collect everyone's questions and discuss them.
-
-You can also find the course syllabus on the last page of the [course description](/ml_course_description.pdf), which explicitly lists the sections of the book for each block.
-
----
-
-### Part 1: Getting started: What is ML?
-
-##### Block 1.1:
-- [ ] Read the whole chapter: ["Introduction"](https://franziskahorn.de/mlbook/_introduction.html)
-- [ ] Don't forget to take notes in the [workbook](/other/ml_course_workbook.pdf) (throughout the whole course)
-- [ ] Answer [Quiz 1](https://forms.gle/uzdzytpsYf9sFG946) (quizzes are also available in PDF form in the folder "other" in case you can't access Google Forms)
-
-##### Block 1.2:
-- [ ] Read the whole chapter: ["ML with Python"](https://franziskahorn.de/mlbook/_ml_with_python.html)
-- [ ] Install Python on your computer and complete the [Python tutorial](https://github.com/cod3licious/python_tutorial)
-
-##### Block 1.3:
-- [ ] Read the whole chapter: ["Data & Preprocessing"](https://franziskahorn.de/mlbook/_data_preprocessing.html)
-- [ ] Answer [Quiz 2](https://forms.gle/Pqr6EKHNxzrWb7MF9)
-
-##### Block 1.4:
-- [ ] Read the whole chapter ["ML Solutions: Overview"](https://franziskahorn.de/mlbook/_ml_solutions_overview.html)
-- [ ] Answer [Quiz 3](https://forms.gle/fr7PYmP9Exx4Vvrc8)
-- [ ] Prepare a [90-second Spotlight presentation](/other/exercise_ml_use_cases_spotlight.pdf) for one of the given ML use cases
-
----
-
-### Part 2: Your first algorithms
-
-##### Block 2.1:
-- [ ] Read the whole chapter: ["Unsupervised Learning"](https://franziskahorn.de/mlbook/_unsupervised_learning.html)
-- [ ] Work through [Notebook 1: visualize text](/notebooks/1_visualize_text.ipynb) (after the section on dimensionality reduction)
-- [ ] Work through [Notebook 2: image quantization](/notebooks/2_image_quantization.ipynb) (after the section on clustering)
-
-##### Block 2.2:
-- [ ] Read the chapter ["Supervised Learning Basics"](https://franziskahorn.de/mlbook/_supervised_learning_basics.html)
-- [ ] Answer [Quiz 4](https://forms.gle/M2dDevwzicjcHLtc9)
-
----
-
-### Part 3: Advanced models
-
-##### Block 3.1:
-- [ ] Read the chapter ["Supervised Learning Models"](https://franziskahorn.de/mlbook/_supervised_learning_models.html)
-- [ ] **In parallel**, work through the respective sections of [Notebook 3: supervised comparison](/notebooks/3_supervised_comparison.ipynb)
-
-##### Block 3.2:
-- [ ] Start reading the chapter ["Deep Learning & more"](https://franziskahorn.de/mlbook/_deep_learning_more.html) up to and including the section: ["Information Retrieval (Similarity Search)"](https://franziskahorn.de/mlbook/_information_retrieval_similarity_search.html) and refresh your memory about [TF-IDF feature vectors](https://franziskahorn.de/mlbook/_feature_extraction.html) and the [cosine similarity](https://franziskahorn.de/mlbook/_computing_similarities.html)
-- [ ] Work through [Notebook 4: information retrieval](/notebooks/4_information_retrieval.ipynb)
-
-##### Block 3.3:
-- [ ] Read the section: ["Deep Learning (Neural Networks)"](https://franziskahorn.de/mlbook/_deep_learning_neural_networks.html)
-- [ ] Work through [Notebook 5: MNIST with torch](/notebooks/5_mnist_torch.ipynb) (recommended) **_or_** [MNIST with keras](/notebooks/5_mnist_keras.ipynb) (in case others in your organization are already working with TensorFlow)
-
-##### Block 3.4:
-- [ ] Read the last sections of the chapter "Deep Learning & more": ["Time Series Forecasting"](https://franziskahorn.de/mlbook/_time_series_forecasting.html) and ["Recommender Systems (Pairwise Data)"](https://franziskahorn.de/mlbook/_recommender_systems_pairwise_data.html)
-
----
-
-### Part 4: Avoiding common pitfalls
-
-##### Block 4.1:
-- [ ] Read the whole chapter: ["Avoiding Common Pitfalls"](https://franziskahorn.de/mlbook/_avoiding_common_pitfalls.html)
-
-##### Block 4.2:
-- [ ] Work through [Notebook 6: analyze toy dataset](/notebooks/6_analyze_toydata.ipynb)
-- [ ] Have a look at the [cheat sheet](/other/cheatsheet.pdf), which includes a summary of the most important steps when developing a machine learning solution, incl. code snippets
-
-##### Block 4.3:
-- [ ] _Case Study!_ Work through [Notebook 7: predicting hard drive failures](/notebooks/7_hard_drive_failures.ipynb) (plan at least 5 hours for this!)
-
----
-
-### Part 5: RL & Conclusion
-
-##### Block 5.1:
-- [ ] Read the whole chapter: ["Reinforcement Learning"](https://franziskahorn.de/mlbook/_reinforcement_learning.html)
-- [ ] Work through [Notebook 8: RL gridmove](/notebooks/8_rl_gridmove.ipynb)
-
-##### Block 5.2:
-- [ ] Answer [Quiz 5](https://forms.gle/uZGj54YQHKwckmL46)
-- [ ] Read the whole chapter: ["Conclusion"](https://franziskahorn.de/mlbook/_conclusion.html)
-- [ ] Complete the exercise: ["Your next ML Project"](/other/exercise_your_ml_project.pdf) (in case you need some inspiration for a project idea, have a look at [how ML could be used to fight climate change](https://www.climatechange.ai/summaries)). Feel free to prepare a few slides or use the [Word template](/other/exercise_your_ml_project_template.docx) and aim for a 5 minute presentation.
-- [ ] Please fill out the [Feedback Survey](https://forms.gle/Ccv5h5zQxwPjWtCS7) to help me further improve this course! :-)
-
----
diff --git a/ml_course_description.pdf b/ml_course_description.pdf
deleted file mode 100644
index 4d53536..0000000
Binary files a/ml_course_description.pdf and /dev/null differ
diff --git a/notebooks/2_image_quantization.ipynb b/notebooks/2_image_quantization.ipynb
index 5a46b5d..bc17d00 100644
--- a/notebooks/2_image_quantization.ipynb
+++ b/notebooks/2_image_quantization.ipynb
@@ -5,7 +5,7 @@
  "metadata": {},
  "source": [
  "# Color Quantization using K-Means\n",
- "In this notebook, we want to transform a regular RGB image (where each pixel is represented as a Red-Green-Blue triplet) into a [compressed representation](https://en.wikipedia.org/wiki/Color_quantization), where each pixel is represented as a single number (color index) together with a limited color palette (RGB triplets corresponding to the color indices). \n",
+ "In this notebook, we want to transform a regular [RGB image](https://en.wikipedia.org/wiki/RGB_color_model#Numeric_representations) (where each pixel is represented as a Red-Green-Blue triplet) into a [compressed representation](https://en.wikipedia.org/wiki/Color_quantization), where each pixel is represented as a single number (color index) together with a limited color palette (RGB triplets corresponding to the color indices). \n",
  "\n",
  "Example from Wikipedia (original image and after quantization):\n",
  "\"\" \"\""
diff --git a/other/cheatsheet.pdf b/other/cheatsheet.pdf
deleted file mode 100644
index 37270af..0000000
Binary files a/other/cheatsheet.pdf and /dev/null differ
diff --git a/other/exercise_ml_use_cases_spotlight.docx b/other/exercise_ml_use_cases_spotlight.docx
deleted file mode 100644
index 6482259..0000000
Binary files a/other/exercise_ml_use_cases_spotlight.docx and /dev/null differ
diff --git a/other/exercise_ml_use_cases_spotlight.pdf b/other/exercise_ml_use_cases_spotlight.pdf
deleted file mode 100644
index 520cd58..0000000
Binary files a/other/exercise_ml_use_cases_spotlight.pdf and /dev/null differ
diff --git a/other/exercise_your_ml_project.pdf b/other/exercise_your_ml_project.pdf
deleted file mode 100644
index 8e39d9f..0000000
Binary files a/other/exercise_your_ml_project.pdf and /dev/null differ
diff --git a/other/exercise_your_ml_project_template.docx b/other/exercise_your_ml_project_template.docx
deleted file mode 100644
index ffaae41..0000000
Binary files a/other/exercise_your_ml_project_template.docx and /dev/null differ
diff --git a/other/ml_course_workbook.docx b/other/ml_course_workbook.docx
deleted file mode 100644
index 92c7cd2..0000000
Binary files a/other/ml_course_workbook.docx and /dev/null differ
diff --git a/other/ml_course_workbook.pdf b/other/ml_course_workbook.pdf
deleted file mode 100644
index 73e6e67..0000000
Binary files a/other/ml_course_workbook.pdf and /dev/null differ
diff --git a/other/quizzes/quiz1_introduction.pdf b/other/quizzes/quiz1_introduction.pdf
deleted file mode 100644
index 8d36c76..0000000
Binary files a/other/quizzes/quiz1_introduction.pdf and /dev/null differ
diff --git a/other/quizzes/quiz2_data.pdf b/other/quizzes/quiz2_data.pdf
deleted file mode 100644
index 600c44e..0000000
Binary files a/other/quizzes/quiz2_data.pdf and /dev/null differ
diff --git a/other/quizzes/quiz3_ml_solutions.pdf b/other/quizzes/quiz3_ml_solutions.pdf
deleted file mode 100644
index aa2e52c..0000000
Binary files a/other/quizzes/quiz3_ml_solutions.pdf and /dev/null differ
diff --git a/other/quizzes/quiz4_model_selection.pdf b/other/quizzes/quiz4_model_selection.pdf
deleted file mode 100644
index 7db9356..0000000
Binary files a/other/quizzes/quiz4_model_selection.pdf and /dev/null differ
diff --git a/other/quizzes/quiz5_big_recap.pdf b/other/quizzes/quiz5_big_recap.pdf
deleted file mode 100644
index 9789324..0000000
Binary files a/other/quizzes/quiz5_big_recap.pdf and /dev/null differ