{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Z7cDc2dVX2D1" }, "source": [ "# Session 4 - Recurrent network\n", "\n", "In this session we train a model to imitate the poetic style of Baudelaire, specifically the collection *Les Fleurs du mal*. This lab is largely inspired by the [CNAM](https://cedric.cnam.fr/~thomen/cours/US330X/tpRNNs.html) course, adapted here.\n", "\n", "To do so, we use [Project Gutenberg](https://www.gutenberg.org), which gives free access to a large collection of classic literary works. This corpus is, among others, one of the datasets that LLMs are trained on.\n", "\n", "Let us start by importing the packages we need." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Pe1mrRDtYaBM" }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "\n", "sns.set(style=\"whitegrid\")\n", "\n", "from tensorflow import keras" ] }, { "cell_type": "markdown", "metadata": { "id": "InTQRzhkY6My" }, "source": [ "After loading the poetry .txt file into the environment, we need to process it a little before using it. A closer look at the file shows that some of the text is not poetry. We decide to keep only the poems."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "hkSTixun__Xw" }, "outputs": [], "source": [ "start = False\n", "verses = []\n", "\n", "# Read the whole book; the context manager closes the file for us.\n", "with open(\"Beaudelaire.txt\", encoding=\"utf8\") as book:\n", "    lines = book.readlines()\n", "\n", "for line in lines:\n", "    line_stripped = line.strip().lower()\n", "    # The poems start at the first occurrence of \"AU LECTEUR\".\n", "    if \"AU LECTEUR\".lower() in line_stripped and not start:\n", "        start = True\n", "    # Stop at the Project Gutenberg footer.\n", "    if (\n", "        \"End of the Project Gutenberg EBook of Les Fleurs du Mal, by Charles Baudelaire\".lower()\n", "        in line_stripped\n", "    ):\n", "        break\n", "    # Skip the front matter and empty lines.\n", "    if not start or len(line_stripped) == 0:\n", "        continue\n", "    verses.append(line_stripped)\n", "\n", "text = \" \".join(verses)\n", "characters = sorted(set(text))\n", "n_characters = len(characters)" ] }, { "cell_type": "markdown", "metadata": { "id": "7626TUDbZHVf" }, "source": [ "We split the text into sequences of 32 characters, shifting by one character each time. We will therefore predict the next character from the 32 preceding ones.\n", "Let us build two lists that, once transformed, will become $X$ and $y$.\n", "\n", "**Instruction**: Complete the following cell using the information above." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "uO403MyQAzim" }, "outputs": [], "source": [ "sequence_length = 32\n", "stride = 1\n", "sequences = []\n", "y_character = []\n", "for index in range(0, len(text) - sequence_length, stride):\n", "    sequences.append(text[index : index + sequence_length])\n", "    y_character.append(text[index + sequence_length])" ] }, { "cell_type": "markdown", "metadata": { "id": "12vayeAWZaAo" }, "source": [ "A neural network does not understand text, so we have to go back and forth between numbers and characters. To do so, we create two dictionaries that translate between the two representations."
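, "

As a minimal sketch of this round trip between characters and numbers, the example below builds both dictionaries on a toy string and one-hot encodes it. The `toy_` names and the toy string are assumptions made for illustration; they are not part of the dataset.

```python
import numpy as np

# Toy text standing in for the poems (illustrative only).
toy_text = 'abca'
toy_characters = sorted(set(toy_text))

# Same construction as the two dictionaries, on the toy alphabet.
toy_character_to_index = {c: i for i, c in enumerate(toy_characters)}
toy_index_to_character = dict(enumerate(toy_characters))

# One-hot encode the text: one row per character, one column per symbol.
one_hot = np.zeros((len(toy_text), len(toy_characters)), dtype=bool)
for position, character in enumerate(toy_text):
    one_hot[position, toy_character_to_index[character]] = True

# Decoding with argmax recovers the original text.
decoded = ''.join(toy_index_to_character[int(i)] for i in one_hot.argmax(axis=1))
```

Decoding with `argmax` gives back the toy string, which is the same trick used later in this notebook to read a start sequence out of `X_train`."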
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rGkZx3JQB3gn" }, "outputs": [], "source": [ "character_to_index = {character: index for index, character in enumerate(characters)}\n", "index_to_character = dict(enumerate(characters))" ] }, { "cell_type": "markdown", "metadata": { "id": "4er_XSvHZljl" }, "source": [ "We are now ready to fill in $X$ and $y$. The array $X$ has shape $n \\times N \\times C$ where:\n", "* $n$: the number of example sequences\n", "* $N$: the length of each sequence, here 32\n", "* $C$: the number of distinct characters, here stored in the variable *n_characters*\n", "\n", "The matrix $y$ has shape $n \\times C$. Both arrays are boolean (one-hot encoded), with the value *True* at the index of the character they represent.\n", "\n", "**Instruction**: Fill in the following cell using the information above, relying on the dictionary *character_to_index*." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3b5vdYJ_DO8y" }, "outputs": [], "source": [ "X = np.zeros((len(sequences), sequence_length, n_characters), dtype=bool)\n", "y = np.zeros((len(sequences), n_characters), dtype=bool)\n", "\n", "for row, sequence in enumerate(sequences):\n", "    for position, character in enumerate(sequence):\n", "        X[row, position, character_to_index[character]] = True\n", "    y[row, character_to_index[y_character[row]]] = True" ] }, { "cell_type": "markdown", "metadata": { "id": "GhrZsGulZs6U" }, "source": [ "Let us now split $X$ and $y$ into a training set and a test set. We also save these arrays so that we do not have to rerun this preprocessing later."
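, "

To skip the preprocessing in a later session, a saved list can be read back with `pickle.load`. The sketch below shows the round trip on toy stand-ins written to a temporary folder; the `toy_` names and the file name are hypothetical, and the unpacking order is assumed to mirror the order used when dumping.

```python
import os
import pickle
import tempfile

import numpy as np

# Toy stand-ins for the real mapping and arrays (illustrative only).
toy_index_to_character = {0: 'a', 1: 'b'}
toy_X = np.zeros((3, 2, 2), dtype=bool)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'toy_dataset.p')  # hypothetical file name
    with open(path, 'wb') as pickle_f:
        pickle.dump([toy_index_to_character, toy_X], pickle_f)
    # Objects come back in the order in which they were dumped.
    with open(path, 'rb') as pickle_f:
        loaded_mapping, loaded_X = pickle.load(pickle_f)
```

Only unpickle files you created or trust, since loading a pickle can execute arbitrary code."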
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "b37hlbPmFUd6" }, "outputs": [], "source": [ "import pickle\n", "\n", "train_size = 0.8\n", "train_index = round(len(sequences) * train_size)\n", "X_train = X[:train_index, :, :]\n", "y_train = y[:train_index, :]\n", "\n", "X_test = X[train_index:, :, :]\n", "y_test = y[train_index:, :]\n", "\n", "\n", "outfile = f\"Baudelaire_len_{sequence_length}.p\"\n", "\n", "with open(outfile, \"wb\") as pickle_f:\n", "    pickle.dump([index_to_character, X_train, y_train, X_test, y_test], pickle_f)" ] }, { "cell_type": "markdown", "metadata": { "id": "K-zzNAhSZ7pA" }, "source": [ "## Modeling\n", "\n", "In this example we define a recurrent network with basic recurrent units: no LSTM or GRU.\n", "\n", "A [`SimpleRNN`](https://keras.io/api/layers/recurrent_layers/simple_rnn/) layer takes the same arguments as a standard dense layer, plus two major parameters:\n", "* **return_sequences**: whether to return the full output sequence or only the last output\n", "* **unroll**: unrolls the recurrence, which can speed up training at the cost of more memory\n", "\n", "**Instruction**: Complete the following cell to define the neural network using the information above." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "muGo3GElW-6l" }, "outputs": [ { "data": { "text/html": [ "
Model: \"sequential_1\"\n",
       "
\n" ], "text/plain": [ "\u001b[1mModel: \"sequential_1\"\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
       "┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃\n",
       "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
       "│ simple_rnn_2 (SimpleRNN)        │ (None, 32, 128)        │        23,424 │\n",
       "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
       "│ layer_normalization_1           │ (None, 32, 128)        │           256 │\n",
       "│ (LayerNormalization)            │                        │               │\n",
       "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
       "│ simple_rnn_3 (SimpleRNN)        │ (None, 128)            │        32,896 │\n",
       "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
       "│ dense_1 (Dense)                 │ (None, 54)             │         6,966 │\n",
       "└─────────────────────────────────┴────────────────────────┴───────────────┘\n",
       "
\n" ], "text/plain": [ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n", "┃\u001b[1m \u001b[0m\u001b[1mLayer (type) \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mOutput Shape \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m Param #\u001b[0m\u001b[1m \u001b[0m┃\n", "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n", "│ simple_rnn_2 (\u001b[38;5;33mSimpleRNN\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m32\u001b[0m, \u001b[38;5;34m128\u001b[0m) │ \u001b[38;5;34m23,424\u001b[0m │\n", "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", "│ layer_normalization_1 │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m32\u001b[0m, \u001b[38;5;34m128\u001b[0m) │ \u001b[38;5;34m256\u001b[0m │\n", "│ (\u001b[38;5;33mLayerNormalization\u001b[0m) │ │ │\n", "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", "│ simple_rnn_3 (\u001b[38;5;33mSimpleRNN\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m128\u001b[0m) │ \u001b[38;5;34m32,896\u001b[0m │\n", "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", "│ dense_1 (\u001b[38;5;33mDense\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m54\u001b[0m) │ \u001b[38;5;34m6,966\u001b[0m │\n", "└─────────────────────────────────┴────────────────────────┴───────────────┘\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
 Total params: 63,542 (248.21 KB)\n",
       "
\n" ], "text/plain": [ "\u001b[1m Total params: \u001b[0m\u001b[38;5;34m63,542\u001b[0m (248.21 KB)\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
 Trainable params: 63,542 (248.21 KB)\n",
       "
\n" ], "text/plain": [ "\u001b[1m Trainable params: \u001b[0m\u001b[38;5;34m63,542\u001b[0m (248.21 KB)\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
 Non-trainable params: 0 (0.00 B)\n",
       "
\n" ], "text/plain": [ "\u001b[1m Non-trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "model = keras.models.Sequential(\n", "    [\n", "        keras.layers.InputLayer(shape=(sequence_length, n_characters)),\n", "        # First SimpleRNN layer: returns the full sequence for the next recurrent layer\n", "        keras.layers.SimpleRNN(128, return_sequences=True),\n", "        # Layer normalization between the two recurrent layers\n", "        keras.layers.LayerNormalization(),\n", "        # Second SimpleRNN layer: returns only the last output\n", "        keras.layers.SimpleRNN(128, return_sequences=False),\n", "        # Dense softmax layer: one probability per character\n", "        keras.layers.Dense(n_characters, activation=\"softmax\"),\n", "    ],\n", ")\n", "\n", "model.summary()" ] }, { "cell_type": "markdown", "metadata": { "id": "bLfT59Q3adf9" }, "source": [ "To avoid overfitting, we use the [EarlyStopping](https://keras.io/api/callbacks/early_stopping/) mechanism.\n", "\n", "**Instruction**: Complete the following cell to compile the neural network and train it with an EarlyStopping callback that you need to configure."
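, "

To make the role of the patience parameter concrete, here is a minimal pure-Python sketch of a simplified early-stopping rule: stop once the validation loss has not improved for `patience` consecutive epochs. The function name `early_stopping_epoch` is hypothetical and the rule ignores options such as a minimum improvement threshold; the loss values are the `val_loss` figures printed in the training log of this notebook.

```python
def early_stopping_epoch(val_losses, patience):
    '''Return (stop_epoch, best_epoch), both 1-indexed.

    Simplified rule: stop once the monitored loss has not improved
    for patience consecutive epochs.
    '''
    best_loss = float('inf')
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training goes through every epoch.
    return len(val_losses), best_epoch


# Validation losses copied from the training log of this notebook.
logged_val_losses = [1.6537, 1.6555, 1.6608, 1.6449, 1.6524, 1.6657, 1.6644, 1.6641, 1.6630]
stop_epoch, best_epoch = early_stopping_epoch(logged_val_losses, patience=5)
```

Under this rule the best validation loss is reached at epoch 4 and training stops after epoch 9, which matches the nine epochs recorded in the training history. With weight restoration enabled, the weights kept would be those of epoch 4."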
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Jk1g8higXtED" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m28s\u001b[0m 17ms/step - accuracy: 0.5332 - loss: 1.4744 - val_accuracy: 0.4982 - val_loss: 1.6537\n", "Epoch 2/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m26s\u001b[0m 17ms/step - accuracy: 0.5345 - loss: 1.4680 - val_accuracy: 0.4965 - val_loss: 1.6555\n", "Epoch 3/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m26s\u001b[0m 17ms/step - accuracy: 0.5357 - loss: 1.4609 - val_accuracy: 0.4932 - val_loss: 1.6608\n", "Epoch 4/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m26s\u001b[0m 17ms/step - accuracy: 0.5365 - loss: 1.4585 - val_accuracy: 0.4951 - val_loss: 1.6449\n", "Epoch 5/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m26s\u001b[0m 17ms/step - accuracy: 0.5386 - loss: 1.4508 - val_accuracy: 0.4953 - val_loss: 1.6524\n", "Epoch 6/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m27s\u001b[0m 17ms/step - accuracy: 0.5399 - loss: 1.4484 - val_accuracy: 0.4934 - val_loss: 1.6657\n", "Epoch 7/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m31s\u001b[0m 20ms/step - accuracy: 0.5397 - loss: 1.4452 - val_accuracy: 0.4943 - val_loss: 1.6644\n", "Epoch 8/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m30s\u001b[0m 20ms/step - accuracy: 0.5412 - loss: 1.4389 - val_accuracy: 0.4921 - val_loss: 1.6641\n", "Epoch 9/50\n", "\u001b[1m1529/1529\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m 
\u001b[1m26s\u001b[0m 17ms/step - accuracy: 0.5420 - loss: 1.4336 - val_accuracy: 0.4946 - val_loss: 1.6630\n" ] } ], "source": [ "n_epochs = 50\n", "batch_size = 64\n", "validation_split = 0.1\n", "\n", "# Stop when the validation loss has not improved for 5 epochs and keep the best weights.\n", "callback = keras.callbacks.EarlyStopping(monitor=\"val_loss\", patience=5, restore_best_weights=True)\n", "model.compile(\n", "    loss=\"categorical_crossentropy\",\n", "    optimizer=keras.optimizers.Adam(),\n", "    metrics=[\"accuracy\"],\n", ")\n", "history = model.fit(\n", "    X_train,\n", "    y_train,\n", "    epochs=n_epochs,\n", "    batch_size=batch_size,\n", "    validation_split=validation_split,\n", "    callbacks=[callback],\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "wnYgg07XaxtI" }, "source": [ "Training is done; let us plot the learning curves." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qsPvn4t6XuuK" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "50 9\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "historic = pd.DataFrame(history.history)\n", "figure, (axis_1, axis_2) = plt.subplots(1, 2, figsize=(15, 6))\n", "epochs = range(1, n_epochs + 1)\n", "\n", "# Early stopping may end training before n_epochs, hence the slicing below.\n", "print(len(epochs), len(historic[\"loss\"]))\n", "\n", "for index, (metric_name, axis) in enumerate(\n", "    zip([\"loss\", \"accuracy\"], [axis_1, axis_2], strict=False),\n", "):\n", "    color = sns.color_palette()[index]\n", "    # Solid line: training metric.\n", "    axis.plot(\n", "        epochs[: len(historic[metric_name])],\n", "        historic[metric_name],\n", "        lw=2,\n", "        color=color,\n", "    )\n", "    # Dashed line: validation metric.\n", "    axis.plot(\n", "        epochs[: len(historic[\"val_\" + metric_name])],\n", "        historic[\"val_\" + metric_name],\n", "        ls=\"--\",\n", "        color=color,\n", "    )\n", "\n", "    if metric_name == \"accuracy\":\n", "        axis.set_ylim(0, 1)\n", "    axis.set_ylabel(metric_name.capitalize())\n", "    axis.set_xlabel(\"Epochs\")\n", "    axis.set_title(f\"{metric_name.capitalize()} through training\")\n", "\n", "\n", "plt.suptitle(\"RNN Training\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "56NMPwxOa8I9" }, "source": [ "Let us save the model so that we can reuse it later, for example in another notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sS2kjywjGx8_" }, "outputs": [], "source": [ "def save_model(model, name) -> None:\n", "    \"\"\"Save a Keras model to JSON and H5 files.\"\"\"\n", "    model_json = model.to_json()\n", "    with open(name + \".json\", \"w\") as json_file:\n", "        json_file.write(model_json)\n", "    # Keras 3 expects weight files to end in \".weights.h5\", hence the name used below.\n", "    model.save_weights(name + \".h5\")\n", "\n", "\n", "save_model(model, \"SimpleRNN.weights\")" ] }, { "cell_type": "markdown", "metadata": { "id": "UqMSdzSbX06B" }, "source": [ "Let us load the model we just saved under another name."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "FRVCXsVOHQx-" }, "outputs": [], "source": [ "from keras.models import model_from_json\n", "\n", "\n", "def load_model(name):\n", "    \"\"\"Load a Keras model from JSON and H5 files.\"\"\"\n", "    with open(name + \".json\") as json_file:\n", "        model = model_from_json(json_file.read())\n", "    model.load_weights(name + \".h5\")\n", "    return model\n", "\n", "\n", "model_SimpleRNN = load_model(\"SimpleRNN.weights\")" ] }, { "cell_type": "markdown", "metadata": { "id": "1UKNXx4lbD4g" }, "source": [ "**Instruction**: Complete the following cell to check that the reloaded model reaches the performance we observed during training." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "AhFyJkrbX-bn" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m850/850\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m3s\u001b[0m 3ms/step - accuracy: 0.4930 - loss: 1.6966\n", "Test accuracy: 49.30%\n" ] } ], "source": [ "# A model rebuilt from JSON is not compiled, so compile it before evaluating.\n", "model_SimpleRNN.compile(\n", "    loss=\"categorical_crossentropy\",\n", "    optimizer=keras.optimizers.Adam(),\n", "    metrics=[\"accuracy\"],\n", ")\n", "score = model_SimpleRNN.evaluate(X_test, y_test)\n", "print(\"Test accuracy: %.02f%%\" % (score[1] * 100))" ] }, { "cell_type": "markdown", "metadata": { "id": "DzMrZFmKbVfL" }, "source": [ "## Text generation\n", "\n", "We want to use the model to generate poetry in the style of Baudelaire.\n", "We start from an arbitrarily chosen excerpt of a poem."
] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Start sequence: mère épouvantée et pleine de bla\n" ] } ], "source": [ "seed = 2025\n", "sequence = \"\"\n", "# Decode the one-hot rows of the chosen training example back into characters.\n", "for index in range(sequence_length):\n", "    character = index_to_character[np.argmax(X_train[seed, index, :])]\n", "    sequence += character\n", "\n", "print(\"Start sequence: \" + sequence)" ] }, { "cell_type": "markdown", "metadata": { "id": "NeGrigxYa9lj" }, "source": [ "To choose the next character, we could simply pick the most probable character predicted by the model. This greedy approach can make the generation degenerate, typically by repeating the same patterns.\n", "\n", "Instead, we sample the index of the next character at random, following the probability vector produced by the neural network defined earlier.\n", "\n", "**Instruction**: Using [np.random.multinomial](https://numpy.org/doc/stable/reference/random/generated/numpy.random.multinomial.html), draw an index at random according to a probability vector of your choice (for instance one picked at random)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "OkE9Dd9Ua9W3" }, "outputs": [], "source": [ "# randint excludes the upper bound, so every row of y_test can be drawn.\n", "random_index = np.random.multinomial(\n", "    1,\n", "    y_test[np.random.randint(0, len(y_test))].ravel(),\n", ").argmax()" ] }, { "cell_type": "markdown", "metadata": { "id": "eu6NDVX6bpLn" }, "source": [ "In practice we use *temperature sampling* to select the next character.\n", "\n", "### Temperature\n", "\n", "Consider a vector $u = (u_1, u_2, \\ldots, u_d)$ and a parameter $\\tau > 0$ called the temperature. We can build the vector $v = (v_1, v_2, \\ldots, v_d)$ from $u$ and $\\tau$ as:\n", "\n", "$$\\forall i \\leqslant d, \\quad v_i = \\frac{\\displaystyle \\exp\\left(\\frac{u_i}{\\tau}\\right)}{\\displaystyle \\sum_{j=1}^d \\exp\\left(\\frac{u_j}{\\tau}\\right)}$$\n", "\n", "This looks like the softmax function, but parameterized by the temperature $\\tau$. Applied to log-probabilities $u_i = \\log p_i$, it gives $v_i \\propto p_i^{1/\\tau}$: a small $\\tau$ sharpens the distribution and a large $\\tau$ flattens it.\n", "\n", "**Instruction**: Write a function named `sampling` that takes a probability vector and the temperature as parameters. It must return an index sampled from the probability vector reweighted by the temperature. Build on the work of the previous cell." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "id": "Jeglw8biH6E8" }, "outputs": [], "source": [ "def sampling(probabilities: np.ndarray, temperature: float = 1.0) -> int:\n", "    \"\"\"Sample an index from a probability array reweighted by temperature.\n", "\n", "    Args:\n", "        probabilities (np.ndarray): Array of probabilities.\n", "        temperature (float, optional): Temperature parameter. Defaults to 1.0.\n", "\n", "    Returns:\n", "        int: Sampled index.\n", "\n", "    \"\"\"\n", "    probabilities = np.asarray(probabilities).astype(\"float64\")\n", "    # Work on log-probabilities; the small constant avoids log(0).\n", "    log_probabilities = np.log(probabilities + 1e-10) / temperature\n", "    exp_probabilities = np.exp(log_probabilities)\n", "    probabilities = exp_probabilities / np.sum(exp_probabilities)\n", "    return int(np.random.multinomial(1, probabilities, 1).argmax())" ] }, { "cell_type": "markdown", "metadata": { "id": "9i1JrHL-cCAq" }, "source": [ "Now that we can select the next character in a more controlled way, all that remains is to generate the rest of the sentence!"
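, "

Before generating text, it can help to check numerically what the temperature does to a probability vector. The sketch below applies the temperature softmax to log-probabilities for three temperatures; the helper name `reweight` and the toy probabilities are assumptions made for illustration, and the max subtraction is an added numerical-stability step that does not change the result.

```python
import numpy as np


def reweight(probabilities, temperature):
    '''Apply the temperature softmax to log-probabilities.'''
    logits = np.log(np.asarray(probabilities, dtype='float64') + 1e-10) / temperature
    weights = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return weights / weights.sum()


toy_probabilities = [0.6, 0.3, 0.1]  # illustrative values
sharp = reweight(toy_probabilities, temperature=0.5)
unchanged = reweight(toy_probabilities, temperature=1.0)
flat = reweight(toy_probabilities, temperature=5.0)
```

A temperature below 1 concentrates the mass on the most likely index, a temperature of 1 leaves the vector essentially unchanged, and a temperature above 1 moves it toward the uniform distribution."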
] }, { "cell_type": "code", "execution_count": 30, "metadata": { "id": "faT74GT7PkJt" }, "outputs": [], "source": [ "def generate_sequence(start: str, length: int, model, temperature: float = 1) -> str:\n", "    \"\"\"Generate a sequence of characters given a starting string.\n", "\n", "    Args:\n", "        start (str): The starting string to seed the generation.\n", "        length (int): The length of the sequence to generate.\n", "        model: The trained Keras model for character prediction.\n", "        temperature (float): The temperature parameter for sampling.\n", "\n", "    Returns:\n", "        str: The generated sequence of characters.\n", "\n", "    \"\"\"\n", "    # One-hot encode the start string (expected to hold sequence_length characters).\n", "    sequence = np.zeros((1, sequence_length, n_characters), dtype=bool)\n", "    for position, character in enumerate(start):\n", "        sequence[0][position][character_to_index[character]] = True\n", "\n", "    generated_sequence = start\n", "\n", "    for _ in range(length):\n", "        probabilities = model.predict(sequence, verbose=0)[0]\n", "        next_index = sampling(probabilities, temperature=temperature)\n", "        character = index_to_character[next_index]\n", "        generated_sequence += character\n", "\n", "        # Slide the window one step to the left.\n", "        for index in range(sequence_length - 1):\n", "            sequence[0, index, :] = sequence[0, index + 1, :]\n", "\n", "        # Put the sampled character in the last position.\n", "        sequence[0, sequence_length - 1, :] = 0\n", "        sequence[0, sequence_length - 1, next_index] = 1\n", "\n", "    return generated_sequence" ] }, { "cell_type": "markdown", "metadata": { "id": "v4Rm2tKib-md" }, "source": [ "Putting everything together, we get:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "id": "qvHtwn9ZcC-t" }, "outputs": [ { "data": { "text/plain": [ "'mère épouvantée et pleine de blasphères, où les profonds des flamois ses monstres '" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generate_sequence(start=sequence, length=50, model=model_SimpleRNN, temperature=0.5)" ] }, { "cell_type": "markdown", "metadata": { "id": "BYX-wJDHcazG" }, "source": [ "**Instructions**: Define and compare other neural network architectures for this problem. We recommend assessing performance with the learning curves but also with several text generations.\n", "\n", "## Going further\n", "\n", "Choose one or more of the following research directions. You may choose another direction, but it must be approved beforehand.\n", "\n", "1. We defined a single architecture. Try others and compare them using both the learning curves and the generated text.\n", "2. Keras provides an [`Embedding`](https://keras.io/api/layers/core_layers/embedding/) layer. Use it and measure its performance.\n" ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "studies", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.9" } }, "nbformat": 4, "nbformat_minor": 1 }