{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Color Quantization using K-Means\n", "In this notebook, we want to transform a regular [RGB image](https://en.wikipedia.org/wiki/RGB_color_model#Numeric_representations) (where each pixel is represented as a Red-Green-Blue triplet) into a [compressed representation](https://en.wikipedia.org/wiki/Color_quantization), where each pixel is represented as a single number (color index) together with a limited color palette (RGB triplets corresponding to the color indices). \n", "\n", "Example from Wikipedia (original image and after quantization):\n", "\"\" \"\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from PIL import Image # library for loading image files\n", "from sklearn.cluster import KMeans\n", "from sklearn.utils import shuffle" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# load the original image -> change the path to an image of your choice\n", "img_org = Image.open(\"../data/cat.jpg\")\n", "img_org" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# transform the image into a numpy array\n", "img_array = np.asarray(img_org)\n", "print(img_array.shape) # height x width x 3 (RGB channels)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# reshape image into a matrix with RGB values for each pixel\n", "h, w, d = img_array.shape\n", "X = new_X.reshape(h*w, d)\n", "# 1 pixel = 1 data point; RGB values = features\n", "print(X.shape)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# to speed things up a little, we only take a random subsample of the original pixels\n", "X_sample = shuffle(X, random_state=0)[:1000]\n", "# initialize k-means and set n_clusters to the number of colors you want in your image (e.g. 10)\n", "kmeans = ...\n", "# fit the model on the data (i.e., find the cluster indices)\n", "kmeans.fit(X_sample)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the cluster centers now contain the RGB triplets for each cluster, i.e., our new color palette\n", "kmeans.cluster_centers_" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# use the predict function of kmeans to compute the cluster index for each data point (i.e., pixel) in X\n", "# (cluster indices together with the color palette would be the compressed representation of the image)\n", "cluster_idx = ...\n", "print(cluster_idx.shape) # same first dimension as X" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# to visualize what the compressed image looks like, map each pixel to the corresponding new color\n", "new_X = kmeans.cluster_centers_[cluster_idx]\n", "print(new_X.shape)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# cast as integers to get proper RGB values\n", "new_X = np.array(new_X, dtype=np.uint8)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# reshape back into image format\n", "img_new = ... # TODO: reshape new_X such that img_new is a matrix of shape height x width x 3 RGB channels\n", "print(img_new.shape)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# transform into PIL image (and possibly save)\n", "img_new = Image.fromarray(img_new)\n", "# img_new.save(\"cat_new.png\") # -> save & share your image with the other participants\n", "img_new" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heuristic to determine the number of clusters *k*\n", "\n", "The objective that k-means internally optimizes is the average distance of the samples to their assigned cluster centers, i.e., it tries to find clusters such that all the points in the cluster are very close to the respective cluster center.\n", "\n", "After fitting k-means, the final value of this objective function can be computed with the `score` function on the dataset (this actually gives you the negative value, since this is more convenient for some optimization algorithms).\n", "\n", "We can now simply fit k-means with different settings for *k* and observe how the value of the score function changes as we increase the number of clusters.\n", "\n", "#### Questions: \n", "* What would happen (i.e., what would the score be) if you set *k* to a very large value, e.g., the number of data points? \n", "* Based on the plot that we compute below, what do you think might be a good value for *k*? (Of course, this will be different for every dataset, i.e., in this example, a different image might need more or less colors to look ok.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# how many clusters (i.e. distinct colors) are needed?\n", "scores = []\n", "for n in range(1, 16):\n", " # compute the value of the k-means objective function for the current k\n", " kmeans = KMeans(n_clusters=n, random_state=0).fit(X_sample)\n", " scores.append(kmeans.score(X_sample))\n", "# check out how much the score improves as we use more clusters\n", "plt.figure()\n", "plt.plot(range(1, 16), scores)\n", "plt.xlabel(\"number of clusters\")\n", "plt.ylabel(\"score\");" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.2" } }, "nbformat": 4, "nbformat_minor": 2 }