Refactor and enhance code in Reinforcement Learning notebook; add new R script for EM algorithm in Unsupervised Learning; update README to include new section for Unsupervised Learning.

2025-11-26 13:20:18 +01:00
parent 5d968fa5e5
commit 08cf8fbeda
8 changed files with 1480 additions and 212 deletions


@@ -0,0 +1,313 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Séance 4 - Bonus : Réseau récurrent avec Embedding\n",
"\n",
"Dans cette séance nous avons entraîné un modèle à copier le style de poésie de Beaudelaire, spécifiquement l'oeuvre *Les fleurs du mal*. On souhaite voir ici comment utiliser la couche [`Embedding`](https://keras.io/api/layers/core_layers/embedding/) et ce que l'on peut faire avec.\n",
"\n",
"Commençons par importer les données."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"import keras\n",
"import numpy as np\n",
"import seaborn as sns\n",
"\n",
"sns.set(style=\"whitegrid\")\n",
"\n",
"\n",
"start = False\n",
"book = open(\"Beaudelaire.txt\", encoding=\"utf8\") # noqa: SIM115\n",
"lines = book.readlines()\n",
"verses = []\n",
"\n",
"for line in lines:\n",
" line_striped = line.strip().lower()\n",
" if \"AU LECTEUR\".lower() in line_striped and not start:\n",
" start = True\n",
" if (\n",
" \"End of the Project Gutenberg EBook of Les Fleurs du Mal, by Charles Baudelaire\".lower()\n",
" in line_striped\n",
" ):\n",
" break\n",
" if not start or len(line_striped) == 0:\n",
" continue\n",
" verses.append(line_striped)\n",
"\n",
"book.close()\n",
"text = \" \".join(verses)\n",
"characters = sorted(set(text))\n",
"n_characters = len(characters)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dans le TP principal nous avons one-hot encodé le texte. La couche [`Embedding`](https://keras.io/api/layers/core_layers/embedding/) prend en entrée une séquence d'entier. Ainsi, nous devons changer la manière de construire $X$ et $y$.\n",
"\n",
"**Consigne** : En s'inspirant du travail précédent, construire la matrice d'informations $X$ et le vecteur réponse $y$. Puis on scindera le dataset en un dataset d'entraînement et de validation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"X_train shape: (108720, 40)\n",
"y_train shape: (108720,)\n",
"X_val shape: (27181, 40)\n",
"y_val shape: (27181,)\n"
]
}
],
"source": [
"# Create character to index and index to character mappings\n",
"char_to_idx = {char: idx for idx, char in enumerate(characters)}\n",
"idx_to_char = dict(enumerate(characters))\n",
"\n",
"# Parameters\n",
"sequence_length = 40\n",
"\n",
"# Create sequences\n",
"X = []\n",
"y = []\n",
"\n",
"for i in range(len(text) - sequence_length):\n",
" # Input sequence: convert characters to indices\n",
" sequence = text[i : i + sequence_length]\n",
" X.append([char_to_idx[char] for char in sequence])\n",
"\n",
" # Target: next character as index\n",
" target = text[i + sequence_length]\n",
" y.append(char_to_idx[target])\n",
"\n",
"X = np.array(X)\n",
"y = np.array(y)\n",
"\n",
"# Split into training and validation sets\n",
"split_ratio = 0.8\n",
"split_index = int(len(X) * split_ratio)\n",
"\n",
"X_train, X_val = X[:split_index], X[split_index:]\n",
"y_train, y_val = y[:split_index], y[split_index:]\n",
"\n",
"print(f\"X_train shape: {X_train.shape}\")\n",
"print(f\"y_train shape: {y_train.shape}\")\n",
"print(f\"X_val shape: {X_val.shape}\")\n",
"print(f\"y_val shape: {y_val.shape}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"La couche [`Embedding`](https://keras.io/api/layers/core_layers/embedding/) a comme paramètre :\n",
"* *input_dim* : la taille du vocabulaire que l'on considère, ici *n_characters*\n",
"* *output_dim* : la dimension de l'embedding, autrement dit chaque caractère sera représenté comme un vecteur de *output_dim* dimension\n",
"\n",
"On souhaite mesurer l'impact du paramètre *output_dim*. \n",
"\n",
"**Consigne** : Définir une fonction `get_model` qui prend en paramètre:\n",
"* *dimension* : un entier qui correspond à la dimension de sortie de l'embedding\n",
"* *vocabulary_size* : la taille du vocabulaire\n",
"\n",
"La fonction renvoie un réseau de neurones récurrents avec une couche d'embedding paramétré en accord avec les paramètres de la fonction. On essayera de faire un modèle de taille raisonnable.\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def get_model(dimension: int, vocabulary_size: int) -> keras.Model:\n",
" \"\"\"Create and return a SimpleRNN Keras model.\n",
"\n",
" Args:\n",
" dimension (int): The embedding dimension.\n",
" vocabulary_size (int): The size of the vocabulary.\n",
"\n",
" Returns:\n",
" keras.Model: The constructed Keras model.\n",
"\n",
" \"\"\"\n",
" model = keras.Sequential()\n",
" model.add(\n",
" keras.layers.Embedding(\n",
" input_dim=vocabulary_size,\n",
" output_dim=dimension,\n",
" )\n",
" )\n",
" model.add(keras.layers.SimpleRNN(128, return_sequences=False))\n",
" model.add(keras.layers.Dense(vocabulary_size, activation=\"softmax\"))\n",
" return model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Consigne** : Écrire une boucle d'entraînement qui va stocker dans une liste le maximum atteint lors de l'entraînement jusqu'à 10 époques. Chaque élément de la liste correspondra à un dictionnaire avec pour clé:\n",
"* *dimension*: la dimension de l'embedding\n",
"* *val_loss*: la valeur de loss minimale atteinte sur le dataset de validation au cours de l'entraînement"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Model: \"sequential_1\"</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1mModel: \"sequential_1\"\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
"┃<span style=\"font-weight: bold\"> Layer (type) </span>┃<span style=\"font-weight: bold\"> Output Shape </span>┃<span style=\"font-weight: bold\"> Param # </span>┃\n",
"┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
"│ embedding_1 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Embedding</span>) │ ? │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (unbuilt) │\n",
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
"│ simple_rnn_1 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">SimpleRNN</span>) │ ? │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (unbuilt) │\n",
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
"│ dense_1 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dense</span>) │ ? │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (unbuilt) │\n",
"└─────────────────────────────────┴────────────────────────┴───────────────┘\n",
"</pre>\n"
],
"text/plain": [
"┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
"┃\u001b[1m \u001b[0m\u001b[1mLayer (type) \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mOutput Shape \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m Param #\u001b[0m\u001b[1m \u001b[0m┃\n",
"┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
"│ embedding_1 (\u001b[38;5;33mEmbedding\u001b[0m) │ ? │ \u001b[38;5;34m0\u001b[0m (unbuilt) │\n",
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
"│ simple_rnn_1 (\u001b[38;5;33mSimpleRNN\u001b[0m) │ ? │ \u001b[38;5;34m0\u001b[0m (unbuilt) │\n",
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
"│ dense_1 (\u001b[38;5;33mDense\u001b[0m) │ ? │ \u001b[38;5;34m0\u001b[0m (unbuilt) │\n",
"└─────────────────────────────────┴────────────────────────┴───────────────┘\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Total params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m Total params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m Trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Non-trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m Non-trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dimension = 64\n",
"vocabulary_size = n_characters\n",
"\n",
"model = get_model(dimension, vocabulary_size)\n",
"model.summary()"
]
},
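{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a minimal sketch of the requested loop (not a reference solution): the list of dimensions, the optimizer and the batch size are assumptions. For each dimension it builds a fresh model with `get_model`, trains for up to 10 epochs, and records the smallest validation loss observed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: dimensions, optimizer and batch size are illustrative choices.\n",
"dimensions = [8, 16, 32, 64, 128]\n",
"results = []\n",
"\n",
"for dim in dimensions:\n",
"    model = get_model(dim, vocabulary_size)\n",
"    model.compile(optimizer=\"adam\", loss=\"sparse_categorical_crossentropy\")\n",
"    history = model.fit(\n",
"        X_train,\n",
"        y_train,\n",
"        validation_data=(X_val, y_val),\n",
"        epochs=10,\n",
"        batch_size=128,\n",
"        verbose=0,\n",
"    )\n",
"    # Keep the lowest validation loss reached during training.\n",
"    results.append({\"dimension\": dim, \"val_loss\": min(history.history[\"val_loss\"])})\n",
"\n",
"results"
]
},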
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Consigne** : Modifier la structure de results pour correspondre à une liste de tuple où on a la moyenne et l'écart-type pour chaque entraînement pour une dimension précise."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
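{
"cell_type": "markdown",
"metadata": {},
"source": [
"A possible sketch (assuming 3 repetitions per dimension and the same hyperparameters as above): each training is repeated several times and `results` becomes a list with one `(mean, std)` tuple of the best validation loss per dimension."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"n_repetitions = 3  # illustrative choice\n",
"results = []\n",
"\n",
"for dim in dimensions:\n",
"    best_val_losses = []\n",
"    for _ in range(n_repetitions):\n",
"        model = get_model(dim, vocabulary_size)\n",
"        model.compile(optimizer=\"adam\", loss=\"sparse_categorical_crossentropy\")\n",
"        history = model.fit(\n",
"            X_train,\n",
"            y_train,\n",
"            validation_data=(X_val, y_val),\n",
"            epochs=10,\n",
"            batch_size=128,\n",
"            verbose=0,\n",
"        )\n",
"        best_val_losses.append(min(history.history[\"val_loss\"]))\n",
"    # One tuple per dimension: (mean, standard deviation) over the repeated runs.\n",
"    results.append((np.mean(best_val_losses), np.std(best_val_losses)))"
]
},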
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Consigne** : Visualiser puis commenter les résultats."
]
},
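{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible visualization (a sketch assuming `results` holds one `(mean, std)` tuple per entry of `dimensions`): plot the mean best validation loss against the embedding dimension, with a band of one standard deviation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"means = np.array([mean for mean, _ in results])\n",
"stds = np.array([std for _, std in results])\n",
"\n",
"plt.figure(figsize=(8, 5))\n",
"plt.plot(dimensions, means, marker=\"o\", label=\"mean best val_loss\")\n",
"plt.fill_between(dimensions, means - stds, means + stds, alpha=0.2, label=\"±1 std\")\n",
"plt.xlabel(\"Embedding dimension\")\n",
"plt.ylabel(\"Best validation loss\")\n",
"plt.title(\"Impact of the embedding dimension\")\n",
"plt.legend()\n",
"plt.show()"
]
},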
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "studies",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

File diff suppressed because one or more lines are too long


@@ -97,7 +97,8 @@
"metadata": {},
"outputs": [],
"source": [
"MIN_ARMS = 2 # Minimum number of arms\n",
"MIN_ARMS = 2 # Minimum number of arms\n",
"\n",
"\n",
"class BernoulliBanditK:\n",
" \"\"\"K-armed Bernoulli bandit environment.\n",
@@ -297,10 +298,16 @@
" return np.random.randint(len(Q))\n",
"\n",
" # With probability 1ε: exploit (choose an arm with the highest estimated value).\n",
" max_val = np.max(Q) # Compute the maximum value of the array Q and store it in the variable max_val\n",
" candidates = np.isclose(Q, max_val) # (see Hint) Find all positions in Q where the value equals max_val.\n",
" max_val = np.max(\n",
" Q\n",
" ) # Compute the maximum value of the array Q and store it in the variable max_val\n",
" candidates = np.isclose(\n",
" Q, max_val\n",
" ) # (see Hint) Find all positions in Q where the value equals max_val.\n",
"\n",
" return np.random.choice(candidates) # pick one of those best arms uniformly at random."
" return np.random.choice(\n",
" candidates\n",
" ) # pick one of those best arms uniformly at random."
]
},
{
@@ -593,7 +600,9 @@
" env = BernoulliBanditK(k=k) # Create a new bandit environment with k arms.\n",
"\n",
" # For evaluation only (not used by the agent),\n",
" opt_arm = env.optimal_arm() # the index of the truly best arm which has the largest p_i\n",
" opt_arm = (\n",
" env.optimal_arm()\n",
" ) # the index of the truly best arm which has the largest p_i\n",
" opt_mean = env.optimal_mean() # the best true success probability p_i\n",
"\n",
" if hasattr(opt_mean, \"item\"):\n",
@@ -604,9 +613,15 @@
" Q = np.zeros(k, dtype=float)\n",
"\n",
" # record what happens at each time step\n",
" rewards = np.zeros(T, dtype=float) # rewards[t] = observed reward at step t (0 or 1),\n",
" chose_opt = np.zeros(T, dtype=float) # chose_opt[t] = 1 if the chosen arm equals opt_arm, else 0,\n",
" regret = np.zeros(T, dtype=float) # regret[t]=p^*-R_t, which means how much we “missed” compared to the best arm\n",
" rewards = np.zeros(\n",
" T, dtype=float\n",
" ) # rewards[t] = observed reward at step t (0 or 1),\n",
" chose_opt = np.zeros(\n",
" T, dtype=float\n",
" ) # chose_opt[t] = 1 if the chosen arm equals opt_arm, else 0,\n",
" regret = np.zeros(\n",
" T, dtype=float\n",
" ) # regret[t]=p^*-R_t, which means how much we “missed” compared to the best arm\n",
"\n",
" # -------------------------------------------------\n",
" # For the naive method,\n",
@@ -720,7 +735,7 @@
" avg_optimal = np.mean([res[\"optimal_selected\"] for res in results], axis=0)\n",
" avg_instant_regret = np.mean([res[\"regret\"] for res in results], axis=0)\n",
"\n",
" return avg_rewards, avg_optimal, np.cumsum(avg_instant_regret)\n"
" return avg_rewards, avg_optimal, np.cumsum(avg_instant_regret)"
]
},
{
@@ -828,7 +843,9 @@
"\n",
"plt.figure(figsize=(10, 6))\n",
"for eps in eps_list:\n",
" plt.plot(results[eps][\"avg_reward\"], label=f\"ε={eps}\", color=colors[eps_list.index(eps)])\n",
" plt.plot(\n",
" results[eps][\"avg_reward\"], label=f\"ε={eps}\", color=colors[eps_list.index(eps)]\n",
" )\n",
"plt.xlabel(\"Step\")\n",
"plt.ylabel(\"Average reward\")\n",
"plt.title(\"Average reward vs step (Bernoulli bandit, ε-greedy)\")\n",
@@ -838,7 +855,9 @@
"# Plot: Probability of optimal action\n",
"plt.figure(figsize=(10, 6))\n",
"for eps in eps_list:\n",
" plt.plot(results[eps][\"avg_opt\"], label=f\"ε={eps}\", color=colors[eps_list.index(eps)])\n",
" plt.plot(\n",
" results[eps][\"avg_opt\"], label=f\"ε={eps}\", color=colors[eps_list.index(eps)]\n",
" )\n",
"plt.xlabel(\"Step\")\n",
"plt.ylabel(\"P(select optimal arm)\")\n",
"plt.title(\"Optimal-action probability vs step (ε-greedy)\")\n",
@@ -861,12 +880,14 @@
"# Plot: Cumulative regret\n",
"plt.figure(figsize=(10, 6))\n",
"for eps in eps_list:\n",
" plt.plot(results[eps][\"avg_cumreg\"], label=f\"ε={eps}\", color=colors[eps_list.index(eps)])\n",
" plt.plot(\n",
" results[eps][\"avg_cumreg\"], label=f\"ε={eps}\", color=colors[eps_list.index(eps)]\n",
" )\n",
"plt.xlabel(\"Step\")\n",
"plt.ylabel(\"Average cumulative regret\")\n",
"plt.title(\"Cumulative regret vs step (ε-greedy)\")\n",
"plt.legend()\n",
"plt.show()\n"
"plt.show()"
]
},
{
@@ -907,7 +928,9 @@
"source": [
"# Calculate final performance metrics for each epsilon\n",
"print(\"### Performance Summary for Different ε Values\\n\")\n",
"print(f\"{'ε':<6} {'Final Avg Reward':<18} {'Final Opt %':<15} {'Final Cum Reward':<18} {'Final Cum Regret':<18}\")\n",
"print(\n",
" f\"{'ε':<6} {'Final Avg Reward':<18} {'Final Opt %':<15} {'Final Cum Reward':<18} {'Final Cum Regret':<18}\"\n",
")\n",
"print(\"-\" * 80)\n",
"\n",
"for eps in eps_list:\n",
@@ -916,7 +939,9 @@
" final_cum_reward = np.cumsum(results[eps][\"avg_reward\"])[-1]\n",
" final_cum_regret = results[eps][\"avg_cumreg\"][-1]\n",
"\n",
" print(f\"{eps:<6.2f} {final_avg_reward:<18.4f} {final_opt_prob:<15.2f} {final_cum_reward:<18.2f} {final_cum_regret:<18.2f}\")\n",
" print(\n",
" f\"{eps:<6.2f} {final_avg_reward:<18.4f} {final_opt_prob:<15.2f} {final_cum_reward:<18.2f} {final_cum_regret:<18.2f}\"\n",
" )\n",
"\n",
"# Find the best epsilon based on multiple criteria\n",
"best_eps_reward = max(eps_list, key=lambda e: results[e][\"avg_reward\"][-1])\n",


@@ -0,0 +1,130 @@
# Load the mclust library
library(mclust)
#
# n is the number of rows (eruptions) in the Old Faithful data
n <- length(faithful[, 1])
#
# random initial partition
set.seed(0)
z.init <- rep(1, n)
n2 <- rbinom(n = 1, size = n, prob = 0.5)
s2 <- sample(1:n, size = n2)
z.init[s2] <- 2
#
# Running the EM algorithm step by step
# EM initialization (50 iterations and 2 classes)
itmax <- 50
G <- 2
zmat <- matrix(rep(0, n * G), ncol = G)
for (g in (1:G)) {
zmat[, g] <- z.init == g
}
#
# What information does zmat contain?
#
# A vector is simply a list of elements of the same type
mstep.out <- vector("list", itmax)
estep.out <- vector("list", itmax)
# Iterations of the EM algorithm
for (iter in 1:itmax) {
# M-step (maximization)
mstep.tmp <- mstep(modelName = "VVV", data = faithful, z = zmat)
mstep.out[[iter]] <- mstep.tmp
#
# What information does mstep.tmp provide?
# (List and describe its components)
#
# E-step (estimation)
estep.tmp <- estep(
modelName = "VVV",
data = faithful,
parameters = mstep.tmp$parameters
)
estep.out[[iter]] <- estep.tmp
zmat <- estep.tmp$z
#
# What information does estep.tmp provide?
# (List and describe its components)
#
}
# Extract the partition
#
EMclass <- rep(NA, n)
for (i in 1:n) {
zmati <- zmat[i, ]
#
# What does zmati contain?
#
zmat.max <- zmati == max(zmati)
#
# What information does zmat.max provide?
#
zmat.argmax <- (1:G)[zmat.max]
EMclass[i] <- min(zmat.argmax)
#
# What information does zmat.argmax provide?
# What is the purpose of the statement EMclass[i] <- min(zmat.argmax)?
#
}
# Plot showing the positions of the class centers mu1 and mu2
# Start by extracting the estimates of mu_{g,d} at each iteration
d <- length(faithful[1, ]) # dimension of the data
mu1 <- matrix(rep(NA, itmax * d), ncol = 2)
mu2 <- matrix(rep(NA, itmax * d), ncol = 2)
for (iter in (1:itmax)) {
mu1[iter, ] <- estep.out[[iter]]$parameters$mean[, 1]
mu2[iter, ] <- estep.out[[iter]]$parameters$mean[, 2]
}
# Print the coordinates of the 2 centers at iteration 1,
# then their coordinates at iteration 50. Comment on the result (see the sketch below).
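# A minimal sketch (not part of the original script): print the estimated
# centers after the first and the last iteration to see how far they moved.
print(estep.out[[1]]$parameters$mean) # columns = the two centers after iteration 1
print(estep.out[[itmax]]$parameters$mean) # the two centers after iteration 50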
#
# Visualization of how the centers mu1 and mu2 move
#
plot(
faithful,
type = "n",
xlab = "Eruptions",
ylab = "Waiting",
main = "Déplacement des moyennes"
)
y1 <- faithful[EMclass == 2, ]
y2 <- faithful[EMclass == 1, ]
# What do y1 and y2 contain?
# y1 contains the observations assigned to class 2
# y2 contains the observations assigned to class 1
#
lines(mu1[, 1], mu1[, 2], type = "l", lwd = 3, col = "red")
lines(mu2[, 1], mu2[, 2], type = "l", lwd = 3, col = "blue")
points(y1, col = "red", pch = 1)
points(y2, col = "blue", pch = 1)
points(mu1[itmax, 1], mu1[itmax, 2], col = "red", pch = 19)
points(mu2[itmax, 1], mu2[itmax, 2], col = "blue", pch = 19)
legend(
x = min(faithful[, 1]),
y = max(faithful[, 2]),
legend = c("Moyenne 1", "Moyenne 2"),
col = c("red", "blue"),
lty = c(1, 1),
lwd = c(3, 3)
)
# Loading the mclust library
library(mclust)
data(faithful)
# Clustering with Mclust
faithful.Mclust2 <- Mclust(faithful, modelNames = "VVV", G = 2)
#
# Plot the uncertainty of the assignments
#
plot(faithful.Mclust2, what = "uncertainty")
surfacePlot(
faithful,
faithful.Mclust2$parameters,
type = "contour",
what = "uncertainty"
)
points(faithful)