diff --git a/M2/Generative AI/TP1/TP1 Benchmark.ipynb b/M2/Generative AI/TP1/TP1 Benchmark.ipynb new file mode 100644 index 0000000..da9501e --- /dev/null +++ b/M2/Generative AI/TP1/TP1 Benchmark.ipynb @@ -0,0 +1,851 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "172a7a9f", + "metadata": {}, + "source": [ + "# TP1 - Benchmark automatique\n", + "\n", + "Dans ce TP nous allons définir une fonction pour mesurer les performances d'un modèle de langage via l'exécution de plusieurs benchmarks. Nous avons vu en cours trois manières de mesurer la performance d'un modèle de langage, que l'on peut résumer ainsi :\n", + "1. **Évaluation automatique**: via un ensemble de questions dont on connaît la réponse\n", + "2. **Évaluation humaine**: qualification humaine de la réponse d'un modèle à une question\n", + "3. **Évaluation par modèle de langage**: notation ou comparaison des réponses d'un ou plusieurs modèles par un autre modèle\n", + "\n", + "Nous nous intéressons ici au premier point, en particulier avec les benchmarks [GSM8K](https://huggingface.co/datasets/openai/gsm8k) et [HellaSwag](https://huggingface.co/datasets/Rowan/hellaswag).\n", + "Dans l'ensemble du notebook nous utiliserons la librairie LangChain.\n", + "\n", + "Il est à garder en tête que ce notebook n'a qu'une portée pédagogique : il n'est pas forcément à jour puisque le domaine évolue rapidement, et les pratiques présentées ne sont pas nécessairement celles validées par l'industrie.\n", + "\n", + "## Uniformisation des benchmarks\n", + "\n", + "Pour chaque benchmark que l'on considère, nous avons besoin de plusieurs informations :\n", + "* **Dataset** : une fonction pour charger les questions du benchmark\n", + "* **Référence** : une fonction capable d'identifier la réponse attendue\n", + "* **Prompt** : un prompt qui permet de demander correctement au modèle de répondre à la question\n", + "* **Chaîne** : une fonction qui renvoie la chaîne de traitement de LangChain\n", + "* **Score** : une fonction qui score la performance d'un modèle 
sur une question\n", + "\n", + "Nous allons commencer par créer une classe qui regroupe ces desiderata :\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cd75374d", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import PromptTemplate\n", + "from langchain_core.runnables import Runnable\n", + "\n", + "\n", + "class Benchmark:\n", + " \"\"\"Base class for benchmarks.\"\"\"\n", + "\n", + " name: str\n", + "\n", + " def __init__(self, prompt: PromptTemplate) -> None:\n", + " \"\"\"Initialize the benchmark with a prompt template.\"\"\"\n", + " self.prompt = prompt\n", + "\n", + " def load_data(self) -> list:\n", + " \"\"\"Load and return the benchmark data samples.\"\"\"\n", + " raise NotImplementedError\n", + "\n", + " def build_chain(self, model) -> Runnable:\n", + " \"\"\"Build and return the evaluation chain using the provided model.\"\"\"\n", + " raise NotImplementedError\n", + "\n", + " def get_reference(self, sample) -> str:\n", + " \"\"\"Extract and return the reference answer from a data sample.\"\"\"\n", + " raise NotImplementedError\n", + "\n", + " def score(self, prediction, reference) -> float:\n", + " \"\"\"Score the prediction against the reference answer.\"\"\"\n", + " raise NotImplementedError" + ] + }, + { + "cell_type": "markdown", + "id": "e2ab41df", + "metadata": {}, + "source": [ + "Pour rendre cette classe plus concrète, commençons par travailler avec le benchmark [GSM8K](https://huggingface.co/datasets/openai/gsm8k).\n", + "\n", + "### Benchmark GSM8K\n", + "\n", + "On commence par charger le dataset et observer une question.\n", + "\n", + "**Consigne** : Résoudre la question *à la main* et vérifier votre réponse. On recommande d'explorer plusieurs questions." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "93979ba0", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Number of questions: 1319\n", + "Example of question:\n", + " Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?\n", + "And its answer:\n", + " Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day.\n", + "She makes 9 * 2 = $<<9*2=18>>18 every day at the farmer’s market.\n", + "#### 18\n" + ] + } + ], + "source": [ + "import numpy as np\n", + "\n", + "from datasets import load_dataset\n", + "\n", + "np.random.seed(42)\n", + "\n", + "dataset = load_dataset(\"gsm8k\", \"main\")\n", + "dataset = dataset[\"test\"]\n", + "\n", + "print(f\"Number of questions: {len(dataset)}\")\n", + "index = 0\n", + "print(\"Example of question:\\n\", dataset[index][\"question\"])\n", + "print(\"And its answer:\\n\", dataset[index][\"answer\"])" + ] + }, + { + "cell_type": "markdown", + "id": "82d797f0", + "metadata": {}, + "source": [ + "Après avoir inspecté plusieurs éléments du dataset, on remarque que la réponse finale est placée après la chaîne de caractères \"####\".\n", + "\n", + "**Consigne**: Construire une fonction `get_reference` qui prend en argument un élément de GSM8K (dictionnaire avec question et réponse) et renvoie la réponse attendue (string). On pourra utiliser la fonction [`search`](https://docs.python.org/3/library/re.html#re.search) de la librairie [`re`](https://docs.python.org/3/library/re.html#).\n", + "Puis tester cette fonction sur l'exemple précédent." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b336056a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Reference: 18\n" + ] + } + ], + "source": [ + "from re import search\n", + "\n", + "\n", + "def get_reference(sample: dict) -> str:\n", + " \"\"\"Extract the reference answer from a data sample.\"\"\"\n", + " match = search(r\"#### (\\d+)\", sample[\"answer\"])\n", + " return match.group(1) if match else None\n", + "\n", + "\n", + "index = 0\n", + "reference = get_reference(sample=dataset[index])\n", + "print(f\"Reference: {reference}\")" + ] + }, + { + "cell_type": "markdown", + "id": "4c137e6a", + "metadata": {}, + "source": [ + "Il nous reste maintenant à définir un prompt tel que l'on puisse appeler un modèle et tester notre mécanique." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0b899872", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import PromptTemplate\n", + "\n", + "prompt = PromptTemplate(\n", + " input_variables=[\"question\"],\n", + " template=(\n", + " \"\"\"You are a careful mathematician. Solve the problem step by step, then display your answer in the end.\n", + " Question: {question}\n", + " Answer:\"\"\"\n", + " ),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "36433b53", + "metadata": {}, + "source": [ + "En intégrant l'appel à un modèle via Ollama sur notre ordinateur, on peut définir avec LangChain :" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2f0676b6", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Model answer : Here's how we can solve this problem step by step:\n", + "\n", + "1. **Calculate the total number of eggs laid:** Janet's ducks lay 16 eggs per day.\n", + "\n", + "2. **Calculate the number of eggs eaten:** She eats 3 eggs per day.\n", + "\n", + "3. 
**Calculate the number of eggs remaining after breakfast:** 16 eggs (laid) - 3 eggs (eaten) = 13 eggs\n", + "\n", + "4. **Calculate the number of eggs used for baking:** She uses 4 eggs for baking.\n", + "\n", + "5. **Calculate the number of eggs remaining after baking:** 13 eggs - 4 eggs (baking) = 9 eggs\n", + "\n", + "6. **Calculate the earnings from selling the remaining eggs:** She sells 9 eggs at $2 per egg. So she makes 9 * $2 = $18.\n", + "\n", + "**Answer:** $18\n", + "The answer was : 18\n" + ] + } + ], + "source": [ + "from langchain_core.output_parsers import StrOutputParser\n", + "from langchain_core.runnables import RunnablePassthrough\n", + "from langchain_ollama import OllamaLLM\n", + "\n", + "model = OllamaLLM(model=\"gemma3:4b\")\n", + "\n", + "chain = {\"question\": RunnablePassthrough()} | prompt | model | StrOutputParser()\n", + "\n", + "index = 0\n", + "\n", + "question = dataset[index][\"question\"]\n", + "answer = get_reference(dataset[index])\n", + "response = chain.invoke(question)\n", + "print(f\"Model answer : {response}\")\n", + "print(f\"The answer was : {answer}\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "97dd7db7", + "metadata": {}, + "source": [ + "Il nous faut extraire la dernière valeur numérique pour obtenir automatiquement la réponse du modèle.\n", + "\n", + "**Consigne** : Définir une fonction `score` qui prend en paramètre la réponse du modèle et la réponse attendue puis renvoie si les deux réponses sont identiques (1 / 0). On pourra utiliser la fonction [`findall`](https://docs.python.org/3/library/re.html#re.findall) de la librairie `re`.\n", + "Puis l'appliquer sur l'exemple précédent." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ad43cf84", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The model scored 1.0\n" + ] + } + ], + "source": [ + "from re import findall\n", + "\n", + "\n", + "def score(prediction, reference):\n", + " if reference is None:\n", + " return 0.0\n", + "\n", + " numbers = findall(r\"\\d+\", prediction)\n", + " return 1.0 if numbers and numbers[-1] == reference else 0.0\n", + "\n", + "\n", + "value = score(response, answer)\n", + "print(f\"The model scored {value}\")" + ] + }, + { + "cell_type": "markdown", + "id": "a2ec5088", + "metadata": {}, + "source": [ + "Nous avons l'ensemble des éléments nécessaires pour définir la classe `GSM8KBenchmark` depuis la classe `Benchmark` que nous avons définie précédemment.\n", + "\n", + "**Consigne** : Définir cette classe comme sous-classe de `Benchmark`." + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "id": "d83f4394", + "metadata": {}, + "outputs": [], + "source": [ + "class GSM8KBenchmark(Benchmark):\n", + " name = \"GSM8K\"\n", + "\n", + " def load_data(self):\n", + " return load_dataset(\"gsm8k\", \"main\", split=\"test\")\n", + "\n", + " def build_chain(self, model):\n", + " return (\n", + " # On extrait la question seule : passer l'échantillon entier au prompt\n", + " # y glisserait aussi la réponse attendue (\"#### ...\").\n", + " {\"question\": lambda x: x[\"question\"]}\n", + " | self.prompt\n", + " | model\n", + " | StrOutputParser()\n", + " )\n", + "\n", + " def get_reference(self, sample):\n", + " match = search(r\"#### (\\d+)\", sample[\"answer\"])\n", + " return match.group(1) if match else None\n", + "\n", + " def score(self, prediction, reference):\n", + " if reference is None:\n", + " return 0.0\n", + " numbers = findall(r\"\\d+\", prediction)\n", + " return 1.0 if numbers and numbers[-1] == reference else 0.0" + ] + }, + { + "cell_type": "markdown", + "id": "dfc3cb78", + "metadata": {}, + "source": [ + "Il est maintenant temps de définir une fonction qui *fait* le benchmark.\n", + "\n", + "**Consigne** : Définir une 
fonction `run_benchmark` qui prend en paramètre :\n", + "* `model_name` : le nom du modèle Ollama que l'on veut tester\n", + "* `benchmark` : la classe benchmark que l'on souhaite tester\n", + "* `max_samples` : le nombre maximum de questions que l'on souhaite utiliser\n", + "\n", + "Puisque l'objet avec lequel nous travaillons est un dataset HuggingFace, pour sélectionner $n$ lignes, on utilisera \n", + "```python\n", + "dataset = dataset.select(range(max_samples))\n", + "```\n", + "De cette manière on préserve la structure." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2d7125af", + "metadata": {}, + "outputs": [], + "source": [ + "from tqdm import tqdm\n", + "import numpy as np\n", + "\n", + "\n", + "def run_benchmark(\n", + " model_name: str, benchmark: Benchmark, max_samples: int | None = None\n", + ") -> dict:\n", + " model = OllamaLLM(model=model_name)\n", + "\n", + " data = benchmark.load_data()\n", + " if max_samples:\n", + " data = data.select(range(max_samples))\n", + " chain = benchmark.build_chain(model)\n", + "\n", + " scores = []\n", + "\n", + " for sample in tqdm(data, desc=f\"Running {benchmark.name}\"):\n", + " prediction = chain.invoke(sample)\n", + " reference = benchmark.get_reference(sample)\n", + " scores.append(benchmark.score(prediction, reference))\n", + "\n", + " results = {\n", + " \"benchmark\": benchmark.name,\n", + " \"model\": model_name,\n", + " \"num_samples\": len(scores),\n", + " \"accuracy\": np.mean(scores),\n", + " }\n", + " return results\n" + ] + }, + { + "cell_type": "markdown", + "id": "81de8940", + "metadata": {}, + "source": [ + "**Consigne** : Utiliser la fonction `run_benchmark` en définissant un prompt pour GSM8K." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f6bbeb53", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Running GSM8K: 100%|██████████| 5/5 [00:50<00:00, 10.18s/it]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'benchmark': 'GSM8K', 'model': 'gemma3:4b', 'num_samples': 5, 'accuracy': 0.8}\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "prompt_GSM8K = PromptTemplate(\n", + " input_variables=[\"question\"],\n", + " template=(\n", + " \"\"\"You are a careful mathematician. Solve the problem step by step, then display your answer in the end.\n", + " Question: {question}\n", + " Answer:\"\"\"\n", + " ),\n", + ")\n", + "\n", + "benchmark_GSM8K = GSM8KBenchmark(prompt=prompt_GSM8K)\n", + "results = run_benchmark(\n", + " model_name=\"gemma3:4b\", benchmark=benchmark_GSM8K, max_samples=5\n", + ")\n", + "print(results)" + ] + }, + { + "cell_type": "markdown", + "id": "0c943124", + "metadata": {}, + "source": [ + "### HellaSwag\n", + "\n", + "Maintenant que nous avons réussi à le faire pour le dataset GSM8K, attaquons-nous à [HellaSwag](https://huggingface.co/datasets/Rowan/hellaswag).\n", + "\n", + "**Consigne** : En suivant la même approche que précédemment, implémenter une sous-classe `HellaSwagBenchmark` à partir de la classe `Benchmark`. Puis utiliser la fonction `run_benchmark` pour valider votre travail." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 44, + "id": "32886901", + "metadata": {}, + "outputs": [], + "source": [ + "class HellaSwagBenchmark(Benchmark):\n", + " name = \"HellaSwag\"\n", + "\n", + " def load_data(self):\n", + " return load_dataset(\"hellaswag\", split=\"validation\")\n", + "\n", + " def build_chain(self, model):\n", + " return (\n", + " {\n", + " \"context\": lambda x: x[\"ctx\"],\n", + " \"choices\": lambda x: \"\\n\".join(\n", + " f\"{index}: {choice}\" for index, choice in enumerate(x[\"endings\"])\n", + " ),\n", + " }\n", + " | self.prompt\n", + " | model\n", + " | StrOutputParser()\n", + " )\n", + "\n", + " def get_reference(self, sample):\n", + " return str(sample[\"label\"])\n", + "\n", + " def score(self, prediction, reference):\n", + " match = search(r\"\\d\", prediction)\n", + " return 1.0 if match and match.group(0) == reference else 0.0\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "96a3031a", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Running HellaSwag: 100%|██████████| 5/5 [00:02<00:00, 2.08it/s]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'benchmark': 'HellaSwag', 'model': 'gemma3:4b', 'num_samples': 5, 'accuracy': 1.0}\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "prompt_HellaSwag = PromptTemplate(\n", + " input_variables=[\"context\", \"choices\"],\n", + " template=(\n", + " \"\"\"You will be given a context and then different choices. You need to find the most likely continuation to the context. 
Answer with the number of the most likely choice only.\n", + " Context: {context}\n", + " Choices: {choices}\n", + " Answer:\"\"\"\n", + " ),\n", + ")\n", + "\n", + "benchmark_HellaSwag = HellaSwagBenchmark(prompt=prompt_HellaSwag)\n", + "\n", + "results = run_benchmark(\n", + " model_name=\"gemma3:4b\", benchmark=benchmark_HellaSwag, max_samples=5\n", + ")\n", + "print(results)" + ] + }, + { + "cell_type": "markdown", + "id": "c542783c", + "metadata": {}, + "source": [ + "## Réponses structurées\n", + "\n", + "Sur quelques exemples tout semble fonctionner ! Mais il y a au moins une fragilité dans notre travail : la récupération de la réponse est peu fiable et largement dépendante des prompts.\n", + "\n", + "\n", + "Par exemple pour GSM8K, on aimerait avoir une réponse sous la forme d'un JSON :\n", + "```json\n", + "{\n", + " \"reasoning\": \"étapes de raisonnement\",\n", + " \"final_answer\": 18\n", + "}\n", + "```\n", + "\n", + "De cette manière ce serait particulièrement simple d'extraire la réponse, sans pour autant perdre la *réflexion* du modèle. En revanche pour HellaSwag, un JSON extrêmement simple suffit :\n", + "```json\n", + "{\n", + " \"choice\": 2\n", + "}\n", + "```\n", + "\n", + "Pour forcer le modèle à suivre ces formats, nous allons utiliser l'option [Pydantic](https://docs.langchain.com/oss/python/langchain/structured-output). 
Elle s'utilise comme suit, pour GSM8K :" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "988dbca3", + "metadata": {}, + "outputs": [], + "source": [ + "from pydantic import BaseModel, Field\n", + "\n", + "\n", + "class GSM8KOutput(BaseModel):\n", + " reasoning: str = Field(description=\"Step-by-step reasoning\")\n", + " final_answer: float = Field(description=\"Final numeric answer\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "d855adfe", + "metadata": {}, + "source": [ + "Concernant l'intégration dans le prompt :" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f25afddc", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The output should be formatted as a JSON instance that conforms to the JSON schema below.\n", + "\n", + "As an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"description\": \"a list of strings\", \"type\": \"array\", \"items\": {\"type\": \"string\"}}}, \"required\": [\"foo\"]}\n", + "the object {\"foo\": [\"bar\", \"baz\"]} is a well-formatted instance of the schema. The object {\"properties\": {\"foo\": [\"bar\", \"baz\"]}} is not well-formatted.\n", + "\n", + "Here is the output schema:\n", + "```\n", + "{\"properties\": {\"reasoning\": {\"description\": \"Step-by-step reasoning\", \"title\": \"Reasoning\", \"type\": \"string\"}, \"final_answer\": {\"description\": \"Final numeric answer\", \"title\": \"Final Answer\", \"type\": \"number\"}}, \"required\": [\"reasoning\", \"final_answer\"]}\n", + "```\n" + ] + } + ], + "source": [ + "from langchain.output_parsers import PydanticOutputParser\n", + "\n", + "parser_gsm8k = PydanticOutputParser(pydantic_object=GSM8KOutput)\n", + "\n", + "prompt_gsm8k = PromptTemplate(\n", + " input_variables=[\"question\"],\n", + " partial_variables={\"format_instructions\": parser_gsm8k.get_format_instructions()},\n", + " template=(\n", + " \"\"\"You are a careful mathematician. 
Solve the problem step by step.\n", + " Question: {question}\n", + " {format_instructions}\"\"\"\n", + " ),\n", + ")\n", + "\n", + "print(parser_gsm8k.get_format_instructions())" + ] + }, + { + "cell_type": "markdown", + "id": "d1dcc480", + "metadata": {}, + "source": [ + "**Consigne** : Modifier la classe `Benchmark` et la sous-classe `GSM8KBenchmark` pour intégrer ces évolutions." + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "id": "542a31d6", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.runnables import Runnable\n", + "from langchain_core.prompts import PromptTemplate\n", + "\n", + "\n", + "class Benchmark:\n", + " name: str\n", + "\n", + " def __init__(self, prompt: PromptTemplate, parser: PydanticOutputParser):\n", + " self.prompt = prompt\n", + " self.parser = parser\n", + "\n", + " def load_data(self):\n", + " raise NotImplementedError\n", + "\n", + " def build_chain(self, model) -> Runnable:\n", + " raise NotImplementedError\n", + "\n", + " def get_reference(self, sample):\n", + " raise NotImplementedError\n", + "\n", + " def score(self, prediction, reference):\n", + " raise NotImplementedError" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c94f1dd1", + "metadata": {}, + "outputs": [], + "source": [ + "class GSM8KBenchmark(Benchmark):\n", + " name = \"GSM8K\"\n", + "\n", + " def load_data(self):\n", + " return load_dataset(\"gsm8k\", \"main\", split=\"test\")\n", + "\n", + " def build_chain(self, model):\n", + " # On extrait la question seule pour ne pas exposer la réponse au modèle.\n", + " return {\"question\": lambda x: x[\"question\"]} | self.prompt | model | self.parser\n", + "\n", + " def get_reference(self, sample):\n", + " match = search(r\"#### (\\d+)\", sample[\"answer\"])\n", + " return float(match.group(1)) if match else None\n", + "\n", + " def score(self, prediction: GSM8KOutput, reference: float | None):\n", + " if reference is None:\n", + " return 0.0\n", + " return 1.0 if prediction.final_answer == reference else 0.0" + ] + }, + { + "cell_type": 
"markdown", + "id": "b2076f24", + "metadata": {}, + "source": [ + "**Consigne** : Utiliser la fonction `run_benchmark` et vérifier que tout fonctionne." + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "id": "31e433b0", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Running GSM8K: 100%|██████████| 5/5 [01:01<00:00, 12.25s/it]\n" + ] + }, + { + "data": { + "text/plain": [ + "{'benchmark': 'GSM8K', 'model': 'gemma3:4b', 'num_samples': 5, 'accuracy': 0.8}" + ] + }, + "execution_count": 69, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "gsm8k = GSM8KBenchmark(\n", + " prompt=prompt_gsm8k,\n", + " parser=parser_gsm8k,\n", + ")\n", + "\n", + "run_benchmark(\"gemma3:4b\", gsm8k, max_samples=5)" + ] + }, + { + "cell_type": "markdown", + "id": "b7ed90cd", + "metadata": {}, + "source": [ + "**Consigne** : Réaliser la même modification pour HellaSwag, et vérifier que cela fonctionne." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e678bed2", + "metadata": {}, + "outputs": [], + "source": [ + "class HellaSwagOutput(BaseModel):\n", + " choice: int = Field(description=\"Index of the chosen continuation\")\n", + "\n", + "\n", + "class HellaSwagBenchmark(Benchmark):\n", + " name = \"HellaSwag\"\n", + "\n", + " def load_data(self):\n", + " return load_dataset(\"hellaswag\", split=\"validation\")\n", + "\n", + " def build_chain(self, model):\n", + " return (\n", + " {\n", + " \"context\": lambda x: x[\"ctx\"],\n", + " \"choices\": lambda x: \"\\n\".join(\n", + " f\"{index}: {choice}\" for index, choice in enumerate(x[\"endings\"])\n", + " ),\n", + " }\n", + " | self.prompt\n", + " | model\n", + " | self.parser\n", + " )\n", + "\n", + " def get_reference(self, sample):\n", + " return str(sample[\"label\"])\n", + "\n", + " def score(self, prediction: HellaSwagOutput, reference: str) -> float:\n", + " return 1.0 if str(prediction.choice) == reference else 0.0\n" + ] 
+ }, + { + "cell_type": "code", + "execution_count": null, + "id": "2455f816", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Running HellaSwag: 100%|██████████| 5/5 [00:15<00:00, 3.12s/it]\n" + ] + }, + { + "data": { + "text/plain": [ + "{'benchmark': 'HellaSwag',\n", + " 'model': 'gemma3:4b',\n", + " 'num_samples': 5,\n", + " 'accuracy': 1.0}" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "parser_hellaswag = PydanticOutputParser(pydantic_object=HellaSwagOutput)\n", + "\n", + "prompt_HellaSwag = PromptTemplate(\n", + " input_variables=[\"context\", \"choices\"],\n", + " partial_variables={\n", + " \"format_instructions\": parser_hellaswag.get_format_instructions()\n", + " },\n", + " template=(\n", + " \"\"\"You will be given a context and then different choices. You need to find the most likely continuation to the context.\n", + " Context: {context}\n", + " Choices: {choices}\n", + " {format_instructions}\"\"\"\n", + " ),\n", + ")\n", + "\n", + "hella_swag = HellaSwagBenchmark(\n", + " prompt=prompt_HellaSwag,\n", + " parser=parser_hellaswag,\n", + ")\n", + "\n", + "run_benchmark(\"gemma3:4b\", hella_swag, max_samples=5)" + ] + }, + { + "cell_type": "markdown", + "id": "ba9acd54", + "metadata": {}, + "source": [ + "## Pour aller plus loin\n", + "\n", + "On pourrait implémenter d'autres benchmark, comparer vraiment des modèles entre eux, comparer des prompts entre eux..." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/M2/Generative AI/TP1/TP1 RAG.ipynb b/M2/Generative AI/TP1/TP1 RAG.ipynb new file mode 100644 index 0000000..a3cf44d --- /dev/null +++ b/M2/Generative AI/TP1/TP1 RAG.ipynb @@ -0,0 +1,1395 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "8514812a", + "metadata": {}, + "source": [ + "# TP1 - Retrieval Augmented Generation\n", + "\n", + "Dans ce TP nous allons construire un système RAG complet : base de connaissance, vectorisation et appel avec un modèle de langage.\n", + "\n", + "Certaines fonctions seront réutilisées dans les prochaines séances, nous encourageons donc la définition de fonctions générales, optimisées et robustes. Il est à garder en tête que ce notebook n'a qu'une portée pédagogique et n'est pas forcément à jour puisque le domaine évolue rapidement.\n", + "\n", + "Dans ce TP nous cherchons à apporter des connaissances en Machine Learning, bien que le modèle en possède déjà largement, en utilisant des cours au format PDF à notre disposition. \n", + "\n", + "\n", + "## Constitution de la base de connaissance\n", + "\n", + "Pour construire un RAG, il faut commencer par une base de connaissance. Elle sera composée dans notre cas de documents PDF. Nous allons commencer par extraire les informations texte contenues dans les documents.\n", + "\n", + "**Consigne** : À partir des fichiers disponibles, construire une fonction `pdf_parser` qui prend en paramètre le nom du fichier et qui renvoie le texte associé. 
On utilisera la classe [`PyPDFLoader`](https://python.langchain.com/docs/how_to/document_loader_pdf/#simple-and-fast-text-extraction) et sa méthode `load` pour charger le document.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "6a4a00a2", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_community.document_loaders import PyPDFLoader\n", + "\n", + "def pdf_parser(file_path: str):\n", + " loader = PyPDFLoader(file_path=file_path)\n", + " return loader.load()" + ] + }, + { + "cell_type": "markdown", + "id": "77905595", + "metadata": {}, + "source": [ + "**Consigne** : Utiliser la fonction `pdf_parser` pour charger le fichier 'ML.pdf' puis inspecter son contenu." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "8ec332e6", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "page_content='Chapitre 1\n", + "Introduction au Machine Learning\n", + "Les termes d’intelligence artificielle (IA) et Machine Learning (ML) sont fréquemment confondu et\n", + "leur hiérarchie n’est pas toujours clair. Unalgorithme est une séquence d’instructions logique ordonnée\n", + "pour répondre explicitement à un problème. Par exemple, une recette de cuisine est un algorithme, mais\n", + "tous les algorithmes ne sont pas des recettes de cuisine. Un algorithme d’intelligence d’artificielle est un\n", + "algorithme, mais il n’est pas explicitement construit pour répondre à un problème : il va s’adapter. S’il\n", + "s’appuie sur des données, alors on parle d’algorithme de Machine Learning1.\n", + "Le terme d’intelligence artificielle vient de la conférence de Dartmouth en 1957 où l’objectif était de\n", + "copier le fonctionnement des neurones. Mais les concepts d’intelligence artificielle était déjà proposé par\n", + "Alan Turing, et la méthode des moindres carrés de Legendre (la fameuse tendance linéaire dans Excel)\n", + "date de bien avant 1957. 
Depuis, le domaine s’est structuré autour d’une philosophie d’ouverture. Ainsi,\n", + "nous avons des datasets commun, des algorithmes identiques et des compétitions commune pour pouvoir\n", + "progresser ensemble.\n", + "Nous proposons dans ce chapitre d’introduire les différentes approches du Machine Learning et les\n", + "grands principes. Pour le rendre aussi général que possible, nous ne discuterons pas d’algorithmes en\n", + "particulier, mais supposerons que nous en avons un. La description de ces objets sera le coeur des prochains\n", + "chapitre.\n", + "1.1 Les différentes approches du Machine Learning\n", + "Quand on parle de Machine Learning, on parle d’un grand ensemble contenant plusieurs approches\n", + "différentes. Leur point commun est que la donnée est la source de l’apprentissage de paramètres optimaux\n", + "selon une procédure donnée. Pour saisir les différences entre ces approches, regardons ce dont chacune a\n", + "besoin pour être suivie.\n", + "• Apprentissage supervisé: je dispose d’une base de données qui contient une colonne que je\n", + "souhaite prédire\n", + "• Apprentissage non-supervisé: je dispose seulement d’une base de données composée d’indicateurs\n", + "Ces deux approches représentent l’écrasante majorité des utilisations en entreprise. Se développe\n", + "également une troisième approche : l’apprentissage par renforcement, qui nécessiterai un cours dédié2.\n", + "Au sein de ces deux grandes approches se trouvent des sous catégories :\n", + "• Apprentissage supervisé: je dispose d’une base de données qui contient une colonne que je\n", + "souhaite prédire qui est ...\n", + "– Régression: ... une valeur continue\n", + "1. Et si la classe d’algorithme est un réseau de neurone, alors on parle de Deep Learning. Ce n’est pas au programme du\n", + "cours.\n", + "2. 
Elle est au coeur de l’alignement des modèles de langage avec la préférence humaine par exemple.\n", + "6' metadata={'producer': 'pdfTeX-1.40.26', 'creator': 'TeX', 'creationdate': '2025-07-20T15:41:06+02:00', 'moddate': '2025-07-20T15:41:06+02:00', 'trapped': '/False', 'ptex.fullbanner': 'This is pdfTeX, Version 3.141592653-2.6-1.40.26 (TeX Live 2024) kpathsea version 6.4.0', 'source': 'ML.pdf', 'total_pages': 140, 'page': 5, 'page_label': '6'}\n" + ] + } + ], + "source": [ + "ml_doc = pdf_parser(\"ML.pdf\")\n", + "print(ml_doc[5])" + ] + }, + { + "cell_type": "markdown", + "id": "0473470e", + "metadata": {}, + "source": [ + "Nous avons du texte et des métadonnées. Nous commencerons par nous concentrer sur le texte. Pour qu'il puisse être digéré par le RAG, nous devons le découper en plusieurs *chunks*. La classe [`CharacterTextSplitter`](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.CharacterTextSplitter.html) permet de réaliser cette opération." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "bea1f928", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Il y a 1471 chunks.\n" + ] + } + ], + "source": [ + "from langchain.text_splitter import CharacterTextSplitter\n", + "\n", + "text_splitter = CharacterTextSplitter(\n", + " separator=\"\\n\",\n", + " chunk_size=256,\n", + " chunk_overlap=0,\n", + " length_function=len,\n", + " is_separator_regex=False,\n", + ")\n", + "\n", + "texts = text_splitter.split_documents(documents=ml_doc)\n", + "print(f\"Il y a {len(texts)} chunks.\")" + ] + }, + { + "cell_type": "markdown", + "id": "96d05d6a", + "metadata": {}, + "source": [ + "**Consigne** : Après avoir inspecté le contenu de la variable *texts*, afficher la distribution de la longueur des chunks." 
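Une esquisse possible pour cette consigne (le nom `chunk_lengths` est choisi ici pour l'illustration ; la démonstration utilise une petite liste de chaînes à la place de `texts`, et le tracé suppose `matplotlib` disponible) :

```python
def chunk_lengths(chunks) -> list[int]:
    """Longueur en caractères de chaque chunk (Document LangChain ou simple chaîne)."""
    return [
        len(c.page_content) if hasattr(c, "page_content") else len(c)
        for c in chunks
    ]


# Démonstration sur des chaînes factices ; en pratique : lengths = chunk_lengths(texts)
lengths = chunk_lengths(["abc", "de", "fghij"])
print(min(lengths), max(lengths), sum(lengths) / len(lengths))
# Pour visualiser la distribution : plt.hist(chunk_lengths(texts), bins=50)
```

On s'attend à une distribution bornée par `chunk_size=256`, avec éventuellement quelques chunks plus longs lorsque aucun séparateur n'est trouvé.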
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "b30cc5de", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAqYAAAImCAYAAACBy0hHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8ekN5oAAAACXBIWXMAAA9hAAAPYQGoP6dpAABIuUlEQVR4nO3dB5wU9f3/8c/BUQ4Bg4QWS0AQkEgvASOIqIAKUTRNhQhKUVR+goCKoFSDikDAICIQREBExUI0ihCjooAUOx0B0VCkSO+3/8f762P2v3vcwd1x5Tt3r+fjsY+9m5mdnZnvze17vmU2IRKJRAwAAADIZQVyewMAAAAAIZgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFMhDfPi+DB+2AQAQTgRTIId06NDBqlWrFn1Ur17d6tatazfeeKNNnTrVjh8/Hrd8ixYt7MEHH0z3+ufPn28PPPDAaZfTOrXuzL5PWvbu3Wt9+/a1pUuXxu2zHrll9uzZ7lh///33Z7SerNiPxYsXu23RM8JzrLQdY8eOzXf7DeSWxFx7ZyAfqlGjhj366KPu5xMnTtiePXvsww8/tL/97W8u0I0ePdoKFPj5evHpp5+24sWLp3vdU6ZMSddy3bt3t7/+9a+W1VauXGlvvPGG3XTTTdFpwb4CAJAeBFMgBylo1qlTJ26aaiwvvPBCGzZsmP3rX/+y3//+99EQmx0uuOACyylVqlTJsfcCAIQfTfmAB9q3b2/lypWzmTNnptnEHoTWWrVqWePGja137962bds2N0/NzJ9++ql7BE2BQbOg1nnFFVdYvXr17OOPPz6pKV+OHTtmQ4cOtYYNG1qDBg1cl4Bdu3adsik7ttlRj6AWVs/Bsilfd+TIEfvHP/5hrVu3tpo1a1rLli1twoQJlpycHPdeDz/8sJvevHlzt9xf/vIX+/LLL095DLWOcePGudfUrl3b1QyrRjqlNWvWWLdu3dzx0OPuu++2zZs3W0bo2AwaNMgd10suucQaNWrk1pPRLgNfffWV3XHHHfbb3/7Wbcudd95pa9euPekYL1y40G6//Xa3X7/73e/sySefdDXugf3799sjjzxiTZo0cd1Devbs6WrQ9dpTddlIravD6Y5PWt0jUq5fy6jWX11V9Dern9Oiv9FWrVq55XQu/O9//ztpGU3r1auXO9Y6DrfddputWLEibplTnSNp2b59u/t7D46d3v+zzz6LW0bHV3+Tem8t06NHD9uxY0eGjq26A1x99dX23//+19q2bev+brTPr7/+eprbdvToUVfu+vtQi4R8/fXXbt/r16/vtqVjx472+eefn3IfgTAhmAIeUPO9PhgVvlL2NZVly5a5/psKcs8995w99NBDtmjRIrv//vujTeaqYdXjpZdest/85jfR1yoQ6INXwUUfZKn597//bd98840NHz7cLasPzy5dusSFn1PR+2n9oufUmvA1KErBa+LEifbHP/7Rxo8f7wKqui+kXP7dd991fWb79+9vI0eOdCHg3nvvPeX2KKwp9P7hD39w+/yLX/zCnnrqqbhlNmzY4ELuzp077fHHH3e11ApdN998s5uWHtoPBTeFfAWfSZMm2T333OPCY0a6Lqj89L7y2GOPuQuDLVu2uO1bv3593LJ6HwURHbM2bdq4Y/jyyy9H5yuEqwx1jEaNGmUHDhw4ad/TIyuOTyxtr0LYmDFjXAhLzbRp09xxu/zyy92FhULngAEDTroQ0Hbpb1TztG+6ELn11lujx+p050hqdJy0b7oA6NOnj/u7KVKkiAuDGzdujC6nPuC6ePv73//u1vef//zHBg8enOHj8eO
PP7rX6eJNF17nnXeeO99Slrfo/4AuMBREJ0+ebBdffLELyJ07d7ZSpUq5oKuyPnTokLu42bdvX4a3B/ARTfmAJ375y1+6D7+ffvrJ/RxLH7pFixa1rl27WuHChd00BS/VuCkoqck86I+asqvALbfc4gLgqeiDTgGrWLFi0d9VU6b+r6oVPB29d9Bsr+fUmvC1rk8++cQFzeuuu85NU+2f9ksf+Pqwvuiii6IfytqeYJ8UIPQBrloj1TSlNvDqhRdesE6dOrmQKE2bNnW1YR999FF0OQWPpKQkV5sYrFsXBFdddZULe+kZPKZ1ah1aVrXLohqt7777zl0UpJfC1a9//WsXUAoWLOimXXbZZa5WTUFOxySgIK/yCLZ33rx57uJBYU2BWMFKQUWhTJo1a+YCbGqB51Sy4vjE0vFRmaRFf7sKo9dee63169cvegwUwGJbD55//nl3Xrz44ot27rnnRvdRr9Nx0vE63TmSkJBw0vu/9tpr9sMPP7hnBT9RLfENN9xgS5YssYoVK7ppqrV/4oknosfjiy++sA8++MAySiFSYV/rEK1f55fWVbly5ehyCt2qgVW5/vOf/4xeaK5bt852797tzhVtp6gbkP7udI6UKFEiw9sE+IYaU8ATwW2WUvsAVRO7PtQUNhRoNFBKH+AKYaktHyv4wD0V1VYFoTRomkxMTHQfzllF3Qy0zpQhOehTq/mB2KAt6uYgOgapUVOmQn3KEH3NNdfE/a4aNDXHKsAo/Oqh91GAUmhOD22LatBUg6lmWtWcKhQvX77cNb2mx8GDB11g0vYFoVRKlizp9iH2WEjKmu7y5cu7dQT7VKhQIRceY2vgFdoyKiuOT0b+9r799ltXE3u6clP41rp07IPt0j4qnAbblZlzRGFWtZax26lgrhp7XQwEVNax9BpdDGVG7IWjylGCsgyMGDHC5syZ4wKoQnFAF27nnHOOa3lQy8R7773nLmJV2xusCwg7akwBT6gvnAKBanlSUjBRzZpqslSDop/1gaQPqNPdxig2cKalTJkycb/rQ1+1ppn98E2N+ntqnbFBLPa9Y5siFQ5Sbo/E9kVNuW7R+lNbd0C1bm+//bZ7pKQP/PR68803Xc2vmt5VXgo2Krv00r7qQiRlzbhoWspm2ZTr1vEILmRUg6ZtCI5RoHTp0pZRWXV80vu3l5Fy27RpU1wXlVgKpJk5R7Te9BynlPsRe/wzKvZvOyizlOtSlwoFbdUU//nPf45emJ111lk2ffp0e+aZZ1zXDdWU6m/j+uuvd91egppiIMwIpoAHVAOkZjs1z6UMbgE1TeuhD2HVbKnWTv0S1SdPgz3OhD6gY6kvpwJP7Id2yv6dKWt5Tufss89269R6YvdRTeOphZOMCF6r2jc1baa1X2rqvPTSS1NtXlZtbnqoJk5N2go76tsXhAY19aoGLj20HarFix1AE9sPMbWLk7To/XVcFdpjw2lqfUJPV4bpOT5B7WPKiwQ1JZ9JucVKrdxUk6s+pKkJAllGzxGtN7UBa6r91t9rbPP66Zzp+RFryJAhbn9Vc6xBduruENDfdzD4TX3SdYs2dXHQ3TbU/xQIO5ryAQ+o5kOBJBgMk5IGouj+oKpZUY2Lmj6D/n7BCOaUNWYZoebo2EFXasrU7+o7KWrO3bp1a9xrUoawtAJ1QB+0Wuc777xzUu1jas2lGaHaMtUcpVz3+++/f9I2qJ+eajjVRKqH+qyqlk3NoumhEdsKZRpoFIRShYSgSTmtWt2UNXB6X9V6xQYa1ZSq72hGjkVwXDUgJ6C/E/VDjZWeMkzP8Qm6WMSuS31ZU4bJ9FAfywoVKqSr3FSLWKlSpeh26aFQ9sorr7i/vfScIympi4IGd8XeCUF3jlDZar3plZ5jmxGq6VWtse5CoEGA+jsRHSfdbUD/K7TP+rsfOHCg6wKS1j4CYUONKZCDNKgjuLWLAoxquhYsWOCCqfpaBoNXUtKHkZonNSBCy6k/pQajqGZN80QfTgp
N6o+X0Xug6oNOH8aqBdRoZDVTa2BSMEhDH/IKPvoiAPU/Va1hytvcBAMvFKxU26Rvtoql/oAKumpyVLcFzVdfSo2gbteu3Rnd81RNnBqZrhH+CiU6JhpQkjLgaBkNGNKoel0EaAS2jr1CnAbQpEdQ86bR1QpCao5W8+qqVauiNWXp+WIEje5WjasG62iAmspUzc/qpxoMdEoPNfmqrHQ7I9XA/upXv3KhavXq1XF9K1WGzz77rHuoBlHlqVrFjB4flaEuAnQHh//7v/9zNaWal5Fa3oC2T3cc0LHQ34X6H+v8UA1gLN0SSSFUzxoxr5pWdTeYNWuWG32f3nMkJd3KSv2D77rrLncLKK03GIGvMkmv9BzbzFBZ6DzTgCnVZKtFRf839Pehvxv93Su06oImrf8dQNhQYwrkIN13UX3G9NAHn5omFWhU6xGM+k1rcJIGRKhmR4M5VJOiAKYP0SAQ6NY5GgSj2zxpBHxGaFvUbK8PPI1y1i1+NEI7CDYKYFqv7hOpD0QF4JRBTgMzNPBEIU1hIyWtSx/c+rBVDZzWoxog7Ytul3SmFKY0slvrVNBQMEs5ilxhWNunbdGxVxhRKNdtptL7wa5gpoEnOgY6JgpoCoPBfTrTW1Om0K8gdfjwYXcMdBsk1cAqbFWtWjVD+67bBumCQYN+FBbVtK1gGds3UsdHA3p0twMdH+23Ak9Gj48ugHQHANX0Bn8vek7tbgnpob8Zbb8CqbZLFxMpb8UU3ONXI/J1rqjfqJqxtf0Kq+k9R1LSBYRuV6Uwqebz++67zwU/veb8889P9z6k59hmhlpBdCx0Aasa4bJly7qwrYtAXYjofXULLZVHWuEbCJuESGZ7cAMAcp1ud6RQd+WVV8YNklKoVDO1boUEAGFBUz4AhJhq1dR8rWCqLxdQ30Pdu3Xu3Lmu6wUAhAk1pgAQcurPqOZ2fQGBBkJpNLlG1quZHADChGAKAAAALzD4CQAAAF4gmAIAAMALBFMAAAB4IfSj8nUvQXWT1f0bAQAA4B99cYXukaxvLMvTNaYKpdk1fkvr1bewMD4sfCi78KLswouyCy/KLrwiISm79Oa10NeYBjWl+t7krKavFtTtV/RVibHfoAL/UXbhRdmFF2UXXpRdeB0MSdl99dVX6Vou9DWmAAAAyBsIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAkGHJyRHLy/L6/vkqMbc3AAAAhE+BAgk2Yvoy+37bPstrzitXwnrfWj+3NyNfIpgCAIBMUShd/8Oe3N4M5CE05QMAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAMjnEhISLCkpyT0DuSkxV98dAIA8KDk5YgUKhCfkKZTWqFEjtzcDIJgCAJDVFEpHTF9m32/bZ3lRvepl7a/XEmSR9QimAABkA4XS9T/ssbzovLLFc3sTkEfRxxQAAADhDqYbNmywunXr2uzZs6PTVq5cae3bt7c6depYixYtbOrUqXGvSU5OtjFjxljTpk3dMl26dLHNmzef2R4AAAAg/wbTY8eOWe/eve3gwYPRabt377ZOnTrZBRdcYK+++qrdfffdNmLECPdzYNy4cTZjxgwbMmSIzZw50wXVzp0729GjR7NmbwAAAJC/gunYsWOtePH4/iWzZs2yQoUK2eDBg61y5cp20003WceOHW3ChAluvsLn5MmTrUePHta8eXOrXr26jRo1yrZu3Wpz587Nmr0BAABA/gmmS5YssZdeesmGDx8eN33p0qXWqFEjS0z8/+OpGjdubBs3brQdO3bYqlWr7MCBA9a
kSZPo/JIlS7rbU2idAAAAyN8yNCp/79691rdvX+vfv79VqFAhbp5qPqtWrRo3rWzZsu55y5Ytbr6kfJ2WCeZlViQSietWkFUOHToU94zwoOzCi7ILL8ou/mb1CD/9LStj+OxQSM47Hcf0fIFDhoLpwIED3YCntm3bnjTv8OHDVrhw4bhpRYoUcc9HjhyJHrDUltmz58xup6E+rxp4lV1U64twouzCi7ILr/xedtysPu/QQG/fA1+YzruUGfCMgunrr7/umuvnzJmT6vyiRYueNIhJgVSKFSvm5ouWCX4OljnTK0v1ba1SpYplNf0xqqArVqzI1W/IUHbhRdmFF2X3M77WM++oVKlSKGpMN4bgvFu3bl26lkt3MNXo+p07d7qBS7EeffRRe/vtt618+fK2ffv2uHnB7+XKlbPjx49Hp2nkfuwy1apVszP9J6Dwm11U0Nm5fmQfyi68KLvwouyQV/gc9MJ23qX3gi3dwVS3flJzfayWLVu6Ufa///3v7Y033nC3gDpx4oQVLFjQzV+0aJG72ihdurSVKFHCjeRfvHhxNJiqz+qKFSvcvU8BAACQv6U7mKrWMzUKnZqn20NNnDjRHn74YXdv0i+//NKmTJligwYNivYrUABVwD3nnHPs3HPPtSeffNLVtCrgAgAAIHOD7fJKF5IMDX46FQVUBdNhw4ZZu3btrEyZMm4Ev34OqHZVTfoa1a/a14YNG9qkSZNcH1EAAAAf/KJEEUtOjliBAgl5erBdsof7eEbBdPXq1XG/16pVy93jNC1q4u/Tp497AAAA+Kh4UiEX2EZMX2bfb9tnedF55UpY71vrW56tMQUAAMhLFErX/3Bmt7REDnwlKQAAAJDVCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAABDOYLpz507r06ePNW7c2OrWrWtdu3a19evXR+f379/fqlWrFvdo0aJFdH5ycrKNGTPGmjZtanXq1LEuXbrY5s2bs26PAAAAkD+C6d13322bNm2yCRMm2CuvvGJFixa1jh072qFDh9z81atX25133mkLFiyIPrRcYNy4cTZjxgwbMmSIzZw50wXVzp0729GjR7N2zwAAAJB3g+mePXvs3HPPtaFDh1qtWrWscuXK1r17d9u+fbutXbvWIpGIrVu3zi655BIrU6ZM9HHOOee41yt8Tp482Xr06GHNmze36tWr26hRo2zr1q02d+7c7NpHAAAA5LVgevbZZ9tTTz1lVatWdb/v2rXLpkyZYuXLl7cqVarYd999ZwcPHrQLL7ww1devWrXKDhw4YE2aNIlOK1mypNWoUcOWLFlypvsCAACAEEvM7AsHDBhgs2bNssKFC9szzzxjxYoVszVr1rh5L7zwgn344YdWoEABa9asmfXs2dNKlCjhakalQoUKcesqW7ZsdF5mqKZWgTirBd0TgmeEB2UXXpRdeFF2P0tISLCkpKTc3gwgXXS+KkdlN72Hzo1sC6a33Xab/fnPf7bp06e7fqfqN6pgqjCqoDl+/HhXg/rEE0+4Zv7nn38++s9KYTZWkSJFXDeBzDp27JitXLnSssvGjRuzbd3IXpRdeFF24ZXfy06hVC2BQBhs2LAhxy4mU+a/LA2marqXYcOG2RdffGHTpk1zP99yyy1WqlQpN09N/upj+qc//cm++uorN1Aq6Gsa/CxHjhw5o6vLQoUKRbcnK6mg9A+
2YsWKXP2GDGUXXpRdeFF2P0tPrRDgi0qVKuVIjanGIKVHhoKp+pQuXLjQWrVqZYmJP79UNaQKhRoApZ+DUBq46KKL3LOa6oMmfC17wQUXRJfR77qt1Jn8E1BXguyif7DZuX5kH8ouvCi78KLsgPBIyqGLyPResGVo8NOOHTusV69eLpzGNqOvWLHCjdDv27evu3VULNWUisKrRuEXL17cFi9eHJ2/d+9e9/qGDRtmZFMAAACQx2QomKppXoOZdLsojaJXn9IHH3zQhUsFUtWkKrQ+/fTTrn/pBx98YP369bM2bdq44Kq+Be3bt7cRI0bY/Pnz3Sh9DYzSqP6WLVtm314CAADAexnuYzpy5Eh3yygFyn379lmDBg3cAKhf/epX7jF69Gh38/3nnnvOjcRv27at3XfffdHX6x6mx48fd98QdfjwYVdTOmnSJNdPFAAAAPlXhoOpwubAgQPdIzXXXHONe6SlYMGC7itN9QAAAAAy/ZWkAAAAQHYgmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAAhnMN25c6f16dPHGjdubHXr1rWuXbva+vXro/NXrlxp7du3tzp16liLFi1s6tSpca9PTk62MWPGWNOmTd0yXbp0sc2bN2fN3gAAACD/BNO7777bNm3aZBMmTLBXXnnFihYtah07drRDhw7Z7t27rVOnTnbBBRfYq6++6pYdMWKE+zkwbtw4mzFjhg0ZMsRmzpzpgmrnzp3t6NGjWb1vAAAACJHEjCy8Z88eO/fcc61bt25WtWpVN6179+52/fXX29q1a23hwoVWqFAhGzx4sCUmJlrlypWjIfamm25y4XPy5MnWu3dva968uXv9qFGjXO3p3LlzrU2bNtmzlwAAAMhbNaZnn322PfXUU9FQumvXLpsyZYqVL1/eqlSpYkuXLrVGjRq5UBpQk//GjRttx44dtmrVKjtw4IA1adIkOr9kyZJWo0YNW7JkSVbuFwAAAPJyjWmsAQMG2KxZs6xw4cL2zDPPWLFixWzr1q3R0BooW7ase96yZYubLxUqVDhpmWBeZkQiETt48KBlNXVPiH1GeFB24UXZhRdl97OEhARLSkrK7c0A0kXnq3JUdtN76NzItmB622232Z///GebPn2660uqfqOHDx92QTVWkSJF3PORI0ei/6xSW0bdBDLr2LFjbtBVdlGNL8KJsgsvyi688nvZKZSqJRAIgw0bNuTYxWTK/JelwVRN9zJs2DD74osvbNq0aW4gVMpBTAqkohpVzRctE/wcLHMmV5fq1xpsT1ZSQekfbMWKFbn6DRnKLrwou/Ci7H6WnlohwBeVKlXKkRrTdevWpWu5DAVT9SnVAKdWrVpF+5EWKFDAhcLt27e7vqZ6jhX8Xq5cOTt+/Hh0mkbuxy5TrVo1O5N/Agq+2UX/YLNz/cg+lF14UXbhRdkB4ZGUQxeR6b1gy9DgJw1g6tWrlwunsc3oK1ascCPwGzZsaMuWLbMTJ05E5y9atMil8dKlS1v16tWtePHitnjx4uj8vXv3utfrtQAAAMi/MhRMNbCpWbNmNnToUDeKfs2aNfbggw+6cKl7meqWUPv377eHH37YVdnOnj3bjdrX7aWCvgW6+b7ubTp//nw3Sr9nz56uprVly5bZtY8AAAAIgQz3MR05cqS7ZZQC5b59+6xBgwZuANSvfvUrN3/ixIm
u32m7du2sTJky1rdvX/dzoEePHq5Jv3///m6wlGpKJ02a5PqJAgAAIP/KcDAtUaKEDRw40D1SU6tWLXvppZfSfH3BggXdV5rqAQAAAGT6K0kBAACA7EAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAAEM5g+tNPP9kjjzxizZo1s3r16tnNN99sS5cujc7v1KmTVatWLe7RoUOH6PwjR47YoEGDrEmTJla3bl27//77bdeuXVm3RwAAAAilxIy+oFevXvbjjz/ayJEjrXTp0vbCCy/YHXfcYa+99ppdeOGFtnr1ahs4cKBdddVV0dcUKlQo+rPmKciOHTvWChcubI8++qj16NHDpk2blnV7BQAAgLwdTDdt2mQff/yxzZgxw+rXr++mDRgwwD766CObM2eOtW/f3nbu3Gm1a9e2MmXKnPT6bdu22euvv27jx4+3Bg0auGkKuK1bt7bPPvvM1aACAAAgf8pQU36pUqVswoQJVrNmzei0hIQE99i7d6+rLdXPlSpVSvX1y5Ytc8+NGzeOTtOy5cqVsyVLlmR+LwAAAJC/akxLlixpl19+edy0d99919Wk9uvXz9asWWMlSpSwwYMHu5rVYsWKudrQ7t27u2Z71Zgq3BYpUiRuHWXLlrWtW7dmeicikYgdPHjQstqhQ4finhEelF14UXbhRdn9TBU0SUlJub0ZQLrofFWOym56D50bWd7HNNby5cvtoYcespYtW1rz5s1dONXgplq1arlBUCtXrrQnnnjC/ve//7ln7bwCakoKqnpdZh07dsy9V3bZuHFjtq0b2YuyCy/KLrzye9kplNaoUSO3NwNIlw0bNuTYxWRqGTDLgum8efOsd+/ebmT+iBEj3DTVlD7wwAN29tlnu9+rVq3qBj717NnT+vbta0WLFrWjR4+etC6F0jO5utR7VKlSxbKaCkr/YCtWrMjVb8hQduFF2YUXZfez9NQKAb6oVKlSjtSYrlu3Ll3LZSqYagT9sGHDXDP9448/Hk3AiYmJ0VAauOiii9yzmurLly/vbjelcBqbmrdv3+76mZ7JPwF1G8gu+gebnetH9qHswouyCy/KDgiPpBy6iEzvBVuG72OqEflDhgyxW2+91Y2ojw2Yul+pmvZjffXVV65GU1fQGsmfnJwcHQQVVCGr72nDhg0zuikAAADIQzJUY6oQ+dhjj9nVV19t3bp1sx07dkTnqZm+VatWbr76mF522WUulKpvqe5zWrx4cfe47rrrrH///m45pXTdx7RRo0ZWp06d7Ng/AAAA5MVgqhH4Gmj03nvvuUesdu3a2fDhw11VrW66r+Cpe5l27NjRunbtGl1Ota2ad88997jf9Q1SCqoAAADI3zIUTO+88073OBU18euRFvU7Gjp0qHsAAAAAme5jCgAAAGQHgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQA
AALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAIBwBtOffvrJHnnkEWvWrJnVq1fPbr75Zlu6dGl0/sKFC+3GG2+02rVrW+vWre2tt96Ke/2RI0ds0KBB1qRJE6tbt67df//9tmvXrqzZGwAAAOSfYNqrVy/77LPPbOTIkfbqq6/axRdfbHfccYd9++23tn79euvWrZs1bdrUZs+ebX/84x+tb9++LqwGBg4caAsWLLCxY8fa888/717Xo0ePrN4vAAAAhExiRhbetGmTffzxxzZjxgyrX7++mzZgwAD76KOPbM6cObZz506rVq2a9ezZ082rXLmyrVixwiZOnOhqSLdt22avv/66jR8/3ho0aOCWUcBVzarCrmpQAQAAkD9lqMa0VKlSNmHCBKtZs2Z0WkJCgnvs3bvXNekrgMZq3LixLVu2zCKRiHsOpgUqVapk5cqVsyVLlpz53gAAACB/1JiWLFnSLr/88rhp7777rqtJ7devn7322mtWvnz5uPlly5a1Q4cO2e7du12NqcJtkSJFTlpm69atmd4Jhd6DBw9aVtN2xz4jPCi78KLswouy+5kqa5KSknJ7M4B00fmqHJXd9B46N7I0mKa0fPlye+ihh6xly5bWvHlzO3z4sBUuXDhumeD3o0ePup1POV8UVDUoKrOOHTtmK1eutOyycePGbFs3shdlF16UXXjl97JTKK1Ro0ZubwaQLhs2bMixi8nUMmCWBdN58+ZZ79693cj8ESNGRAOmAmis4HedqEWLFj1pviiUnsnVZaFChaxKlSqW1VRQ+gdbsWJFrn5DhrILL8ouvCi7n6WnVgjwRaVKlXKkxnTdunXpWi5TwXTatGk2bNgwN2jp8ccfjybgChUq2Pbt2+OW1e/FihWzEiVKuGZ+3W5K4TQ2NWsZ9TM9k38Ceo/son+w2bl+ZB/KLrwou/Ci7IDwSMqhi8j0XrBl+HZRGpE/ZMgQu/XWW92I+tiAqZH2n376adzyixYtcrWqBQoUcCP5k5OTo4Oggipk9T1t2LBhRjcFAAAAeUiGgqlC5GOPPWZXX321u1/pjh077Mcff3SPffv2WYcOHezLL790Tfu6p+nkyZPtnXfesc6dO7vXq1b0uuuus/79+9vixYvdsrovaqNGjaxOnTrZtY8AAAAIgQw15WsEvgYavffee+4Rq127djZ8+HAbN26cPfnkk+7m+eedd577OfYWUqptVbi955573O/6BikFVQAAAORvGQqmd955p3ucioKmHmlRv6OhQ4e6BwAAAJDpPqYAAABAdiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBADkuOTmS25sAwEOJub0BAID8p0CBBBsxfZl9v22f5TX1qpe1v15bI7c3AwglgikAIFcolK7/YY/lNeeVLZ7bmwCEFk35AAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAAAIfzB99tlnrUOHDnHT+vfvb9WqVYt7tGjRIjo/OTnZxowZY02bNrU6depYly5dbPPmzWeyGQAAAMjPwXT69Ok2evTok6avXr3a7rzzTluwYEH08corr0Tnjxs3zmbMmGFDhgyxmTNnuqDauXNnO3r0aOb3AgAAAPkvmG7bts0FzxEjRljFihXj5kUiEVu3bp1dcsklVqZMmejjnHPOcfMVPidPnmw9evSw5s2bW/Xq1W3UqFG2detWmzt3btbtFQAAAPJ+MP3mm2+sUKFC9uabb1rt2rX
j5n333Xd28OBBu/DCC1N97apVq+zAgQPWpEmT6LSSJUtajRo1bMmSJZnZfgAAAOQRiRl9gfqLxvYZjbVmzRr3/MILL9iHH35oBQoUsGbNmlnPnj2tRIkSrmZUKlSoEPe6smXLRudlhmpqFYiz2qFDh+KeER6UXXhRdnm/7BISEiwpKSmHtgrAqeh8VY7KbnoPnftZHkxPRcFUYVRBc/z48a4G9YknnrC1a9fa888/H/1nVbhw4bjXFSlSxPbs2ZPp9z127JitXLnSssvGjRuzbd3IXpRdeFF2ebfsFErVUgYg923YsCHHKgJS5r9sD6Z33XWX3XLLLVaqVCn3e9WqVV0f0z/96U/21VdfWdGiRaN9TYOf5ciRI2d09ayuBVWqVLGspoLSP1j1peXqPlwou/Ci7PJ+2aWn1gRAzqhUqVKO1JhqDFJ6ZGkwVW1pEEoDF110kXtWU33QhL99+3a74IILosvod91WKrP0T65YsWKZfv3p6B9sdq4f2YeyCy/KLrwoOyA8knKoAiC9F6RZeoP9vn37WseOHeOmqaZUVKOpUfjFixe3xYsXR+fv3bvXVqxYYQ0bNszKTQEAAEDIZGkwbdWqlS1cuNCefvpp17/0gw8+sH79+lmbNm2scuXKrm9B+/bt3a2m5s+f70bpa2BU+fLlrWXLllm5KQAAAAiZLG3Kv/LKK91N9ydMmGDPPfecG4nftm1bu++++6LL6B6mx48fd98QdfjwYVdTOmnSJNdPFAAAAPnXGQXT4cOHnzTtmmuucY+0FCxY0Pr06eMeAAAAQLY05QMAAACZRTAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAAAQ/mD67LPPWocOHeKmrVy50tq3b2916tSxFi1a2NSpU+PmJycn25gxY6xp06ZumS5dutjmzZvPZDMAAACQn4Pp9OnTbfTo0XHTdu/ebZ06dbILLrjAXn31Vbv77rttxIgR7ufAuHHjbMaMGTZkyBCbOXOmC6qdO3e2o0ePntmeAAAAINQSM/qCbdu22aOPPmqLFy+2ihUrxs2bNWuWFSpUyAYPHmyJiYlWuXJl27Rpk02YMMFuuukmFz4nT55svXv3tubNm7vXjBo1ytWezp0719q0aZN1ewYAAIC8XWP6zTffuPD55ptvWu3atePmLV261Bo1auRCaaBx48a2ceNG27Fjh61atcoOHDhgTZo0ic4vWbKk1ahRw5YsWXKm+wIAAID8VGOqfqN6pGbr1q1WtWrVuGlly5Z1z1u2bHHzpUKFCictE8zLjEgkYgcPHrSsdujQobhnhAdlF16UXd4vu4SEBEtKSsqhrQJwKjpflaOym95D536WB9NTOXz4sBUuXDhuWpEiRdzzkSNHov+sUltmz549mX7fY8eOuUFX2UU1vggnyi68KLu8W3YKpWopA5D7NmzYkGMVASnzX7YH06JFi540iEmBVIoVK+bmi5YJfg6WOZOrZ3UtqFKlimU1FZT+waovLVf34ULZhRdll/fLLj21JgByRqVKlXKkxnTdunXpWi5Lg2n58uVt+/btcdOC38uVK2fHjx+PTtPI/dhlqlWrlun31T85Bd/son+w2bl+ZB/KLrwou/Ci7IDwSMqhCoD0XpBm6Q32GzZsaMuWLbMTJ05Epy1atMil8dKlS1v16tWtePHibkR/YO/evbZixQr3WgAAAOR
fWRpMdUuo/fv328MPP+yqbGfPnm1Tpkyxbt26RfsW6Ob7urfp/Pnz3Sj9nj17uprWli1bZuWmAAAAIGSytClftaITJ060YcOGWbt27axMmTLWt29f93OgR48erkm/f//+brCUakonTZrk+okCAAAg/zqjYDp8+PCTptWqVcteeumlNF9TsGBB69Onj3sAAAAA2dKUDwAAAGQWwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAEDeDKbbtm2zatWqnfSYPXu2m79y5Upr37691alTx1q0aGFTp07N6k0AAABACCVm9QpXrVplRYoUsXnz5llCQkJ0eokSJWz37t3WqVMnF0gHDRpkn3/+uXs+66yz7KabbsrqTQEAAEB+DqZr1qyxihUrWtmyZU+a9/zzz1uhQoVs8ODBlpiYaJUrV7ZNmzbZhAkTCKYAAAD5XJY35a9evdoFztQsXbrUGjVq5EJpoHHjxrZx40bbsWNHVm8KAAAA8nuNaalSpezWW2+1DRs22K9//Wu76667rFmzZrZ161arWrVq3PJBzeqWLVvsl7/8ZabeMxKJ2MGDBy2rHTp0KO4Z4UHZhRdll/fLTt28kpKScmirAJyKzlflqOym94jt4pkjwfT48eP27bffWpUqVezBBx+04sWL21tvvWVdu3a1f/7zn3b48GErXLhw3GvUH1WOHDmS6fc9duyYG1SVXVSji3Ci7MKLssu7ZadQWqNGjRzbHgBpUyViTlUEpMyA2R5M1US/ePFiK1iwoBUtWtRNu+SSS2zt2rU2adIkN+3o0aNxrwkCabFixTL9vuq3qjCc1VRQ+gerPrNc3YcLZRdelF3eL7v01JoAyBmVKlXKkRrTdevW5U5TvkbYp3TRRRfZggULrHz58rZ9+/a4ecHv5cqVy/R76p/cmQTb09E/2OxcP7IPZRdelF14UXZAeCTlUAVAei9Is3Twk2pG69Wr52pNY3399deuRrNhw4a2bNkyO3HiRHTeokWLXFovXbp0Vm4KAAAAQiZLg6lG41944YXudlAagb9+/Xr729/+5u5XqgFQuiXU/v377eGHH3ZVurrp/pQpU6xbt25ZuRkAAAAIoSxtyi9QoICNHz/ennrqKbvvvvts7969roO7Bj4Fo/EnTpxow4YNs3bt2lmZMmWsb9++7mcAAADkb1nex1S3fFItaVpq1aplL730Ula/LQAAAEIuy2+wDwAAAGQGwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAAMALBFMAAAB4gWAKAAAALxBMAQAA4AWCKQAAALxAMAUAAIAXCKYAAADwAsEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgBCJzk5YnldfthHAEgp8aQpAOC5AgUSbMT0Zfb9tn2WF51XroT1vrV+bm8GAOQ4gimAUFIoXf/DntzeDABAFqIpH4BXEhISLCkpyT0DAPIXakyBPNo/Uc3dYaRQWqNGjdzeDABALiCYAnlQXu6DWa96WfvrtQRXH1HbDeBMEUyBPCqv9sE8r2xxy+t+UaJIKGu9qe0GcKYIpgDgmeJJhaj1BpAvEUwBwFPUegPIbxiVj3yJm5cDAOAfakyRL9FMCgCAfwimyLd
oJgUAwC805QMAAMALBFMAAAB4gWCaCXl94Exe2D9u9A0AQPjQxzQT8vLAmYsrnWNdrq9pYceNvgEACJ9cCabJycn29NNP28svv2z79u2zhg0b2iOPPGLnn3++hUVeHjiTl4O3MGodAAA/5UowHTdunM2YMcOGDx9u5cuXtyeffNI6d+5sc+bMscKFC+fGJiGfBG9h1DoAAH7K8T6mR48etcmTJ1uPHj2sefPmVr16dRs1apRt3brV5s6dm9ObAwAAgPwaTFetWmUHDhywJk2aRKeVLFnS9QdcsmRJTm8OAAAAPJEQiURydAi2akXvvfde++KLL6xo0aLR6f/3f/9nhw8ftmeffTZD61u+fLlpFwoVKpTl26r1Hj9+3BITE+NGd+vnPfuP2vETyZbXFClU0IoXK5Rn9y8/7CP7F355fR/Zv/DL6/uY1/dPEgsWsLOLF3ZZJyccO3bM5ad69er51cf00KFD7jllX9IiRYrYnj0Z79MYBMbsuC2Q1plWn1cVZl6W1/cvP+wj+xd+eX0f2b/wy+v7mNf3T3Lqtop6n/S8V44H06CWVH1NY2tMjxw54m7xk1F169bN0u0DAABAPuljWqFCBfe8ffv2uOn6vVy5cjm9OQAAAMivwVSj8IsXL26LFy+OTtu7d6+tWLHC3c8UAAAA+VOON+Wrz2b79u1txIgRds4559i5557r7mOq+5m2bNkypzcHAAAA+fkG+7qHqUa79+/f343EV03ppEmTsmVkPQAAAMIhx28XBQAAAHjRxxQAAABIDcEUAAAAXiCYAgAAwAsEUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwTUNycrKNGTPGmjZtanXq1LEuXbrY5s2bc3uzkIpt27ZZtWrVTnrMnj3bzV+5cqX7tjGVY4sWLWzq1Km5vcn53rPPPmsdOnSIm3a6cuKc9Lfs9GUpKc8/lWGAsss9P/30kz3yyCPWrFkzq1evnt188822dOnS6PyFCxfajTfeaLVr17bWrVvbW2+9Fff6I0eO2KBBg6xJkyZWt25du//++23Xrl25sCf5z0+nKbtOnTqddN7FnpuhLTvdYB8nGzt2bOS3v/1t5P3334+sXLkycvvtt0datmwZOXLkSG5vGlL473//G6lZs2Zk27Ztke3bt0cfhw4diuzatcuV40MPPRRZt25d5JVXXnHL6hm5Y9q0aZHq1atH2rdvH52WnnLinPSz7OQPf/hDZOTIkXHn386dO6PzKbvc06lTp0ibNm0iS5YsiXz77beRQYMGRWrVqhVZv369O9d0nqns9PPEiRMjNWrUiHzyySfR1z/44IORq666yr3+iy++iNxwww2RW2+9NVf3Kb/odIqykyZNmkRmzJgRd97t3r079GVHME2F/lnWrVs3Mn369Oi0PXv2uD+IOXPm5Oq24WQTJkyItG3bNtV548ePj1x22WWRY8eORac99dRT7kMROWvr1q2Rbt26RerUqRNp3bp1XLg5XTlxTvpbdsnJyW763LlzU30tZZd7Nm7cGKlatWpk6dKlceWlsDJ69OjIgAED3EVFrF69erkLh6DcdSGii/+AApLWuXz58hzck/xn42nKbseOHW7+N998k+rrw1x2NOWnYtWqVXbgwAFX/R0oWbKk1ahRw5YsWZKr24aTrV692ipXrpzqPDV7NGrUyBITE6PTGjdubBs3brQdO3bk4Fbim2++sUKFCtmbb77pmg0zUk6ck/6W3XfffWcHDx60Cy+8MNXXUna5p1SpUjZhwgSrWbNmdFpCQoJ77N271513seUSnHfLli1TpZV7DqYFKlWqZOXKlaPscrnsVq9e7X5WeaQmzGVHME3F1q1b3XOFChXippctWzY6D/5Ys2aN6zdz66232qWXXur64Xz44YdunsqrfPnyJ5WjbNmyJVe2N79Sn8OxY8fa+eeff9K805UT56S/ZafzT1544QW33FVXXWWDBw+2ffv2uemUXe7RBcDll19uhQsXjk5
79913bdOmTa6/b1rn3aFDh2z37t2u/74CUpEiRU5ahrLL3bJbs2aNlShRwp1r6oOq/sGjR4+2o0ePumXDXHYE01TopJTYPwhRAaszMfxx/Phx+/bbb23Pnj127733uitMDa7o2rWr69R/+PDhVMtRKEt/nK6cOCf9pQ/IAgUKuA+88ePH24MPPmgLFiyw7t27u0FPlJ0/li9fbg899JC1bNnSmjdvnup5F/yugKOySzlfKLvcL7s1a9a4MqhVq5ZNnDjR7rrrLnv55ZfdQEQJc9n9/3YzRBUtWjR6YgY/iwozKSkpF7cMKanpd/HixVawYMFoWV1yySW2du1amzRpkpsWXEEGgpOyWLFiubLNONnpyolz0l/6QLzllltc7YxUrVrVypQpY3/605/sq6++ouw8MW/ePOvdu7cb3T1ixIhoSEl53gW/q2xSOy+Fssv9shs8eLA98MADdvbZZ0fPO3W36dmzp/Xt2zfUZUeNaSqCJqft27fHTdfv6p8Bv5x11llxH3hy0UUXuaYMNVOlVo5CWfrjdOXEOekv1ZYGoTT2/BM1GVJ2uW/atGmuRemKK65wtdpBa4TKJrVy0cWgmol1XuqWRSkDDmWX+2WXmJgYDaWpnXdhLjuCaSqqV69uxYsXdzVxAXU2XrFihTVs2DBXtw3xVDOqq8jYspKvv/7aqlSp4spLncBPnDgRnbdo0SLXCbx06dK5sMVIzenKiXPSX6qd6dixY9w01ZSKzkHKLnfNmDHDhgwZ4vrgjxw5Mq55t0GDBvbpp5/GLa/zTv9TdcFRv3591x0jGEgjGzZscBf9lF3ull2HDh1c037K8061phUrVgx12RFMU6HC142+VWU+f/58N6pU1eO6AlH/DvhDo/E1GljNGhphun79evvb3/5mn3/+uWtivOmmm2z//v328MMP27p169xN96dMmWLdunXL7U1HjNOVE+ekv1q1auX6cz/99NNuhP4HH3xg/fr1szZt2rjzk7LLPQoijz32mF199dXuXNIdLn788Uf30OA0hZsvv/zSlY3+d06ePNneeecd69y5s3u9atauu+46129RFxZatlevXu4OGurLj9wru1atWtkbb7xhL774ovuyirffftueeOIJu+OOO9yFYJjLLkH3jMrtjfCRam50haIPSHUQ1xWGvoHhvPPOy+1NQwo6YZ966in76KOPXE2MbkOj/jiqDRCdkMOGDXM1NOr7dvvtt7sPSuQeDZD54Ycf3EjuwOnKiXPS37L797//7QYeaiCimoDbtm1r9913X7TZkbLLHWr6HTVqVKrz2rVrZ8OHD3d3MHnyySfdrdlUHmo2vvbaa6PL6VZgCkgaES4aAa6wk7L7BnK+7KZPn+4eCqZBv24N/FVtd5jLjmAKAAAAL9CUDwAAAC8QTAEAAOAFgikAAAC8QDAFAACAFwimAAAA8ALBFAAAAF4gmAIAshR3IQSQWQRTAFlK3yajLzkIvpYypRYtWribtOcEvY/eL7fpxvLVqlWz77//3vK6l19+2R5//PGTpuvb2XQM9EUKAJAWgimALKdv+tH3OB89ejS3NwU57JlnnrGffvrppOn6xhldJPz973/Ple0CEA4EUwBZTl9LuXbtWvvHP/6R25sCT+hrEhVKH3744dzeFAAeI5gCyHIXX3yx3XDDDTZx4kT7+uuvT1u7qu971ver16pVy5o3b24jRoywI0eOxDXJ33HHHfbSSy/ZVVdd5Zb7y1/+Yhs2bLD333/fvbZ27dr2xz/+0VauXHnSe+h1Wq9ed9ttt9mKFSvimtnV9UBN0L/73e+sUaNGtm7dOjdv3rx5duONN1rNmjXdvKFDh7rvnz6V5ORkGzdunHs/bVP37t1tz549Jy23Zs0a69atm9WrV8897r77bved16fzwQcfuH2vU6eOXXbZZe475/fu3Rudv2TJEnes9H30l1xyiaulHDt2rNsuUXcCNan/85//tNatW7ttfPXVV6P7e8stt1jdunX
dazVfZRNr+/bt9sADD1iTJk3ccu3bt7fPPvvMzdN7/fDDD/baa6/FdV343//+Z7169XLbq+/5TlkGp9qm9Byn559/3r1O5dS0aVMbOHCg7d+//7THEoCHIgCQhdq3b+8eP/30U+R3v/tdpE2bNpEjR45E519xxRWRBx54IPp7v379Ir/5zW8io0ePjixYsCAyYcKESO3atSO33357JDk52S2j5evWrevW9d5770X+9a9/RRo0aBC56qqrIldffXVkzpw5kXnz5rn3u/baa6Pr1usuvvjiyGWXXRZ57bXX3Guvv/76SL169SI//PCDW+bVV1+NVK1aNdK6devI+++/H5k9e7Z73zfffNNNv//++yMffPBBZMaMGZGGDRtGbrvttuh2pWb48OGRGjVqRMaOHRv58MMPIw899JDbP61r8+bNbplvv/3W7c9NN90UmTt3buTtt9+OtG3b1m3/jh070lz3f/7zn0i1atUi3bt3d9uqfWrSpIk7VrJy5Ur33r169Yp89NFH7v379Onj3lvHTLQN+l3v/8orr0TeeeedyJYtW9z6NH3o0KGRTz75xL1X586d3bTPP//cvXb//v2RFi1aRC6//HJ33FReeu86depENmzYEPnmm2/cPnTp0iXy2WefuXLfuXNnpGnTppGWLVu6Y6oy0N+HXrNu3bpTblN6jpPKXsd36tSpkcWLF0defPFFt+6+fftm6u8XQO4imALIlmAq8+fPd4Fj5MiRqQbTtWvXuvnPPvts3Dpef/11N/2///2v+13L6/cgyMgjjzzipilEBSZNmuSm7dmzJ+51X3zxRXSZ7du3R2rVquUCZGww1XsGFDybNWsWueOOO+K2S++lZRXiUqP3VUh68skn46ZrPbHBVMHx0ksvjezbty+6zO7duyP169ePbldq2rVrF7nhhhvigvFbb73lQt+PP/7ogqrC5IkTJ6Lz9bPWO2DAgLgQqAuCWM8991zcBUOwTbHl88ILL7hgvGLFiugyBw8edO8/a9asVC88VPY1a9aMfP/999FpCqxXXnll5N577z3lNqXnOGm/WrVqFbfPb7zxhguqAMInMbdrbAHkXWra/f3vf++a9Fu2bGm/+c1v4uZ/+umn7vm6666Lm67fNXhq8eLFdvnll7tpZ599tlWuXDm6zC9/+Uv3rGbfwC9+8Qv3rKbtkiVLup/PP/9814QfKFOmjGsGV5N3yu4HgW+//da2bt3qmpCPHz8ena7m8eLFi9vHH3/smupT+vzzz+3YsWN2xRVXxE2/5ppr7KOPPor+vmjRItdloGjRotH1a70NGjSwTz75JNVjefjwYdf8fe+991pCQkJ0+rXXXuseou4TeqgbhLo5bNq0yXVtUHcJbVda+yudO3d2zwcOHHCv/e6776J3VggGsS1btszOO++8uNcmJSXZu+++a2lZuHChW75cuXLRfVV/02bNmtmbb755ym1Kz3Fq3Lix66qhLhfq5qG/F3XtiD1GAMKDYAogW2k0tsKJgmbQbzAQ9L1UWIyVmJhopUqVsn379kWnKZCkplixYqd8/yDAxipdurRt2bIlzfUEo8oHDRrkHimpn2Vqgv3RtsdKuX9a/9tvv+0eKZ1zzjlprlutXNr2tCi8DhkyxN544w0X5BQi1Q9UxzPlvUVTHrddu3bZo48+6vqZKtT9+te/dgFQgtdqu0/1/qnRaxSQU16UBA4dOpTmNqXnOCmUq//sjBkzXN9e9ac999xzrXfv3tHADiA8CKYAspVqOjUYRYNWFBxSzpMff/zRhYmAavd27959UsDLjNQGHun90gqAEtS29u3b19XYpRRsd0rB9u7cudMuvPDC6PSUt0/SXQsuvfRS69Sp00nrUIhMjYK5AqMCZCzVjqpmUTXHTz31lKu9HD16tFt/EPQ0UOl0FORUUzxlyhQXZgsXLuxC46xZs+K2O7V7sS5fvvykGu3Y1+gY6limRu+TlvQepzZt2riHLmQWLFhgzz33nPXp08f
q16/vamoBhAej8gFkOzWxKjhMmDAhLlgFoe+tt96KW16/q/lZweJMBc3SAdWUahT5b3/72zRfo1CpmkGFMI30Dh4KOQp/sSPKYynQqdn5nXfeiZuuOwfECkb+q+k6WLdGwSsUvvfee6mu+6yzznLLp1zXhx9+aF27dnW1uGpq137peAehVHdF0DEPRuWnRa9Vdwu9PgiLWrcEr1UNqkbE61ZgscFY3QteeeWVaDN9yn1VGVSqVCnuWKpWV68pWLBgmtuUnuN03333uYueIMiq24TuhKAa47RqtgH4ixpTADliwIABrmZvx44d0WlVqlRxtw8aM2aMq51TH071iXz66addQNKtf85UkSJF7K677rKePXu6sKt7aaovqm5ZlBaFJS2vWzHpZ/UZVb9V1fhu27YtzWZphUeFItVYqu+l+j/q9k4pw6SW0S2f1If15ptvdtuofpJqRtexSEuPHj3cvujWS+pLqmM5cuRIF0SrVq3q+tL++9//thdffNHVXq5atcrd8F41rbFN5qnRa+fMmeP2rXz58q4WVBcSsa9VP84XXnjBbYO2RTXEU6dOdTXcus1UUNus4K7+w1pnx44dXQjV8+233+5eo6Z51cSqe8eppOc46RirC4K+bUr9VlVO+vupWLGiVa9e/ZTrB+AfgimAHKEwqCb9e+65J276sGHDXH9G9T9VE2zZsmXtr3/9qwslKWvfMkP3KG3VqpV7bzX1qlm7X79+p2zKF90TVUFTA7cUhlQDqfto6h6rGlCVFoUoLat7a+qhWlTd91PvH1Bg0v1BR40a5Zq41YdTwVJfSHDllVemuW4F5PHjx7vgpVpC7YMG+qjGMrjfq0KigrEGLKmPqUKkah3/85//uGCeluHDh7v+qXqIgp3612qA0tKlS6PdCaZNm2ZPPPGEW041qRpIpnAaHBOFz8cee8zdS1X3JVUt68yZM11Ns46Bali1bpX7H/7wh1OWQXqOk4Kr9lnvoX6mqrFWGaspv1ChQqdcPwD/JGhofm5vBAAAAEAfUwAAAHiBYAoAAAAvEEwBAADgBYIpAAAAvEAwBQAAgBcIpgAAAPACwRQAAABeIJgCAADACwRTAAAAeIFgCgAAAC8QTAEAAOAFgikAAADMB/8PCTteVC5D8lUAAAAASUVORK5CYII=", + "text/plain": [ + "
<Figure size 800x600 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns; sns.set_theme(style=\"whitegrid\")\n", + "\n", + "\n", + "length = np.array([len(doc.page_content) for doc in texts])\n", + "\n", + "plt.figure(figsize=(8, 6))\n", + "plt.hist(length)\n", + "plt.title(\"Distribution de la longueur des chunks\")\n", + "plt.xlabel(\"Nombre de caractères\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "43bf41cd", + "metadata": {}, + "source": [ + "Nous observons des chunks avec très peu de caractères.\n", + "\n", + "**Consigne** : Inspecter le contenu des documents de moins de 100 caractères et noter les améliorations possibles." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "8d300959", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "INTRODUCTION AU MACHINE LEARNING\n", + "2022-2026\n", + "Théo Lopès-Quintas\n", + "------------------------------\n", + "vue un peu plus complète du domaine, ainsi qu’un aperçu plus récent des développements en cours.\n", + "2\n", + "------------------------------\n", + "3. À condition que l’algorithme soit performant.\n", + "7\n", + "------------------------------\n", + "Pour essayer de comprendre ce passage, faisons un exercice :\n", + "4. Voir l’équation (2.3).\n", + "8\n", + "------------------------------\n", + "11\n", + "------------------------------\n", + "le résultat, on peut vérifier la cohérence de la formule avec un exercice.\n", + "15\n", + "------------------------------\n", + "valeur moyenne. 
La vision est donc bien complémentaire à celle de laRMSE.\n", + "17\n", + "------------------------------\n", + "• FP : Faux positif - une baisse identifiée comme une hausse\n", + "28\n", + "------------------------------\n", + "L’idée est de partitionner l’espace engendré parD, dont voici la procédure à chaque étape :\n", + "33\n", + "------------------------------\n", + "définir ce que l’on appelle intuitivementla meilleure coupure.\n", + "34\n", + "------------------------------\n", + "Devant cet exemple jouet, on peut imaginer une situation plus proche de la réalité :\n", + "37\n", + "------------------------------\n", + "Pour saisir l’intérêt de la proposition, résolvons l’exercice suivant.\n", + "38\n", + "------------------------------\n", + "40\n", + "------------------------------\n", + "des champions.\n", + "41\n", + "------------------------------\n", + "42\n", + "------------------------------\n", + "fm(x) = fm−1(x) − γ\n", + "nX\n", + "i=1\n", + "∂C\n", + "∂fm−1\n", + "\u0000\n", + "x(i)\u0001\n", + "\u0010\n", + "yi, fm−1\n", + "\u0010\n", + "x(i)\n", + "\u0011\u0011\n", + "= fm−1(x) + γ′hm(x)\n", + "45\n", + "------------------------------\n", + "peut visualiser ce résultat avec la figure (5.3).\n", + "47\n", + "------------------------------\n", + "i (xi − µk)\n", + "2. Conclure sur la convergence deJ.\n", + "53\n", + "------------------------------\n", + "pour amener le clustering vers sa meilleure version.\n", + "62\n", + "------------------------------\n", + "3. Que nous ne démontrerons pas\n", + "68\n", + "------------------------------\n", + "6. Puisqu’on peut normaliser la distance par rapport au voisin le plus éloigné.\n", + "71\n", + "------------------------------\n", + "2. Largement inspiré du schéma de Park ChangUk.\n", + "77\n", + "------------------------------\n", + "8. Avec des valeurs non nulle dans la majorité des coordonnées.\n", + "84\n", + "------------------------------\n", + "10. 
Pour plus de détails, voir la section (G.1)\n", + "88\n", + "------------------------------\n", + "11. Dépendant donc de la méthode de tokenization et de la taille du vocabulaire.\n", + "89\n", + "------------------------------\n", + "Appendices\n", + "93\n", + "------------------------------\n", + "donner. Il nous faudrait une caractérisation plus simple d’utilisation :\n", + "95\n", + "------------------------------\n", + "existe deux minimaux globaux et on aboutit à une absurdité en exploitant la stricte convexité.\n", + "98\n", + "------------------------------\n", + "∥xi∥. Alors lak-ième erreur de classification du perceptron aura lieu avant :\n", + "k ⩽\n", + "\u0012R\n", + "γ\n", + "\u00132\n", + "∥w∗∥2\n", + "103\n", + "------------------------------\n", + "P({y = k}) × P\n", + "\n", + "\n", + "d\\\n", + "j=1\n", + "xj | {y = k}\n", + "\n", + "\n", + "P\n", + "\n", + "\n", + "d\\\n", + "j=1\n", + "xj\n", + "\n", + "\n", + "(C.1)\n", + "109\n", + "------------------------------\n", + "exploratoire et d’augmentation des données pour répondre à un problème de Machine Learning.\n", + "113\n", + "------------------------------\n", + "aléatoirement entre−1 et 1. Puis on normalise le vecteurx.\n", + "114\n", + "------------------------------\n", + "époque il y avait également Yann Le Cun, à la tête de la recherche chez Meta.\n", + "116\n", + "------------------------------\n", + "118\n", + "------------------------------\n", + "2. Kernel en allemand.\n", + "125\n", + "------------------------------\n", + "s’améliore! Deux phénomènes contre-intuitifs se réalisent :\n", + "132\n", + "------------------------------\n", + "computing. 
In Proceedings of the AAAI Conference on Artificial Intelligence, 2015.\n", + "139\n", + "------------------------------\n" + ] + } + ], + "source": [ + "for doc in texts:\n", + " if len(doc.page_content) < 100:\n", + " print(doc.page_content)\n", + " print(\"-\" * 30)" + ] + }, + { + "cell_type": "markdown", + "id": "f69b2033", + "metadata": {}, + "source": [ + "Nous avons à présent un ensemble de chunks, il nous reste à construire l'embedding pour stocker toutes ces informations. Nous faisons les choix suivants :\n", + "* Nous utiliserons l'embedding [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) pour sa taille et son entraînement spécifique à notre tâche\n", + "* Nous utiliserons le *vector store* [FAISS](https://python.langchain.com/docs/integrations/vectorstores/faiss/) puisque nous l'avons couvert en cours.\n", + "* Nous récupérerons les trois chunks les plus proches, pour commencer" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "40021b12", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_huggingface import HuggingFaceEmbeddings\n", + "from langchain_community.vectorstores import FAISS\n", + "import os\n", + "\n", + "os.environ['USE_TF'] = 'false'\n", + "os.environ['USE_TORCH'] = 'true'\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'\n", + "\n", + "\n", + "embedding_model = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n", + "vectordb = FAISS.from_documents(texts, embedding_model)\n", + "n_doc_to_retrieve = 3\n", + "retriever = vectordb.as_retriever(search_kwargs={\"k\": n_doc_to_retrieve})" + ] + }, + { + "cell_type": "markdown", + "id": "ed148169", + "metadata": {}, + "source": [ + "Notre base de connaissances est réalisée ! 
Passons maintenant à l'augmentation du modèle de langage.\n", + "\n", + "## Génération\n", + "\n", + "Pour cette étape, il nous reste à définir le modèle de langage et comment nous allons nous adresser à lui.\n", + "\n", + "**Consigne** : Définir la variable *model* à partir de la classe [OllamaLLM](https://python.langchain.com/api_reference/ollama/llms/langchain_ollama.llms.OllamaLLM.html#ollamallm) et du modèle de votre choix." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "4abfbda6", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_ollama import OllamaLLM\n", + "\n", + "model = OllamaLLM(model=\"gemma3:4b\")" + ] + }, + { + "cell_type": "markdown", + "id": "d42c7f56", + "metadata": {}, + "source": [ + "**Consigne** : À l'aide de la classe [PromptTemplate](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.prompt.PromptTemplate.html#langchain_core.prompts.prompt.PromptTemplate) et en s'inspirant éventuellement de [cet exemple](https://smith.langchain.com/hub/rlm/rag-prompt), définir un template de prompt qui aura deux *input_variable* : 'context' et 'question'." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "2c3c7729", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import PromptTemplate\n", + "\n", + "prompt_template = PromptTemplate(\n", + " template=\"\"\"\n", + " You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. \n", + " If you don't know the answer, just say that you don't know. 
Answer in the language of the question asked.\n", + "\n", + " Question: {question}\n", + " Context:\\n{context}\n", + " Answer:\n", + " \"\"\",\n", + " input_variables=[\"context\", \"question\"]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "0da52ea4", + "metadata": {}, + "source": [ + "Pour construire la chaîne de RAG, LangChain utilise le [LangChain Expression Language (LCEL)](https://python.langchain.com/v0.2/docs/concepts/#langchain-expression-language-lcel), voici dans notre cas comment cela se traduit :" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "c51afe07", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.runnables import RunnablePassthrough\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "\n", + "def format_docs(docs):\n", + " return \"\\n\\n\".join(doc.page_content for doc in docs)\n", + "\n", + "\n", + "rag_chain = (\n", + " {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", + " | prompt_template\n", + " | model\n", + " | StrOutputParser()\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "7db86940", + "metadata": {}, + "source": [ + "Une fois la chaîne définie, nous pouvons lui poser des questions :" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "02444b65", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Answer: Nous ne pouvons qu’avoir un aperçu du futur, mais cela suffit pour comprendre qu’il y a beaucoup à faire.\n", + "— Alan Turing (1950)\n" + ] + } + ], + "source": [ + "query = \"Quelle est la citation d'Alan Turing ?\"\n", + "result = rag_chain.invoke(query)\n", + "print(\"Answer:\", result)" + ] + }, + { + "cell_type": "markdown", + "id": "3ffe0531", + "metadata": {}, + "source": [ + "LangChain ne permet pas nativement d'afficher quels chunks ont été utilisé pour produire la réponse, ni le score de similarité. 
Pour le faire, nous allons utiliser directement FAISS.\n",
+    "\n",
+    "**Consigne** : À l'aide de la méthode [`similarity_search_with_score`](https://python.langchain.com/v0.2/docs/integrations/vectorstores/llm_rails/#similarity-search-with-score) de `FAISS`, afficher les trois documents utilisés dans le RAG."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "95d81fe2",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Similarity Score: 0.5376\n",
+      "Document Content: s’entraîneront, propageant ainsi les biais des premiers. Évidemment les usages malveillants malgré un\n",
+      "travail sur la sécurité et la toxicité toujours plus important.\n",
+      "Finalement, la fameuse citation d’Alan Turing est plus que jamais d’actualité.\n",
+      "--------------------------------------------------\n",
+      "Similarity Score: 0.6169\n",
+      "Document Content: Cadre et approche du cours\n",
+      "Alan Turing publieComputing Machinery and Intelligenceen 1950 [Tur50], qui deviendra un article\n",
+      "fondamental pour l’intelligence artificielle. 
Une citation devenue célèbre a motivé l’écriture de ce cours :\n", + "--------------------------------------------------\n", + "Similarity Score: 0.6388\n", + "Document Content: Nous ne pouvons qu’avoir un aperçu du futur, mais cela suffit pour comprendre qu’il y a\n", + "beaucoup à faire.\n", + "— Alan Turing (1950)\n", + "C’est par cette vision des années 1950 que nous nous proposons de remonter le temps et de découvrir\n", + "--------------------------------------------------\n" + ] + } + ], + "source": [ + "results_with_scores = vectordb.similarity_search_with_score(query, k=n_doc_to_retrieve)\n", + "\n", + "for doc, score in results_with_scores:\n", + " print(f\"Similarity Score: {score:.4f}\")\n", + " print(f\"Document Content: {doc.page_content}\")\n", + " print(\"-\" * 50)" + ] + }, + { + "cell_type": "markdown", + "id": "6aeeadf8", + "metadata": {}, + "source": [ + "Nous avons finalement bien défini notre premier RAG !\n", + "\n", + "## Amélioration de notre RAG\n", + "\n", + "Mais nous pouvons faire mieux, notamment afficher la source dans la génération pour que l'utilisateur puisse vérifier et mesurer les performances de notre RAG. Une fois que nous aurons réalisé ces deux améliorations, alors nous pourrons modifier plusieurs points techniques spécifique et mesurer l'apport en performance.\n", + "\n", + "### Exploiter les méta-données\n", + "\n", + "Nous avons utilisé la classe `PyPDFLoader` qui charge chaque page dans un document. Nous avons largement utilisé le contenu *page_content* mais l'attribut *metadata* contient deux informations qui nous intéressent : *source* et *page*. \n", + "\n", + "**Consigne** : Modifier la fonction `format_doc` pour qu'elle prenne en paramètre une liste de document LangChain puis qu'elle affiche la source et la page en plus de seulement le contenu texte." 
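Avant de poursuivre, une remarque sur les scores affichés plus haut : par défaut, `similarity_search_with_score` de FAISS renvoie une *distance* (L2), donc un score plus faible indique un document plus proche — c'est bien le passage contenant la citation qui obtient 0.5376. Esquisse (la transformation 1/(1+d) est un choix arbitraire parmi d'autres fonctions monotones) pour obtenir un score « plus c'est haut, mieux c'est » :

```python
# Esquisse : convertir une distance L2 (sortie par défaut de FAISS) en un
# score de similarité qui décroît avec la distance.
def distance_to_similarity(d: float) -> float:
    return 1.0 / (1.0 + d)

# Distances observées dans la sortie ci-dessus
distances = [0.5376, 0.6169, 0.6388]
similarities = [round(distance_to_similarity(d), 4) for d in distances]
print(similarities)  # valeurs décroissantes : l'ordre de pertinence est préservé
```

L'ordre des documents ne change pas, seule la lecture des scores devient plus intuitive.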
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "cae9a90c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def format_docs(docs):\n",
+    "    formatted = []\n",
+    "    for doc in docs:\n",
+    "        source = doc.metadata.get(\"source\", \"unknown\")\n",
+    "        page = doc.metadata.get(\"page\")\n",
+    "        # PyPDFLoader numérote les pages à partir de 0, d'où le +1 pour l'affichage\n",
+    "        page_label = page + 1 if isinstance(page, int) else \"unknown\"\n",
+    "        content = doc.page_content.strip()\n",
+    "        formatted.append(f\"[Source: {source}, Page: {page_label}]\\n{content}\")\n",
+    "    return \"\\n\\n\".join(formatted)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0363d832",
+   "metadata": {},
+   "source": [
+    "Maintenant que nous passons des informations sur les métadonnées, il faut s'assurer que le modèle de langage les utilise.\n",
+    "\n",
+    "**Consigne** : Modifier le prompt template défini plus tôt pour intégrer cette règle."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "a57e10a6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "prompt_template = PromptTemplate(\n",
+    "    template=\"\"\"\n",
+    "    You are an assistant for question-answering tasks. \n",
+    "    Use the following retrieved pieces of context (with source and page information) to answer the question. \n",
+    "    If you don't know the answer, just say that you don't know. Answer in the same language as the question.\n",
+    "    When possible, cite the source and page in your answer. \n",
+    "\n",
+    "    Question: {question}\n",
+    "    Context:\\n{context}\n",
+    "    Answer:\n",
+    "    \"\"\",\n",
+    "    input_variables=[\"context\", \"question\"]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "260f39f4",
+   "metadata": {},
+   "source": [
+    "Testons à présent avec la même question sur une nouvelle chaîne RAG prenant en compte nos améliorations.\n",
+    "\n",
+    "**Consigne** : Définir un nouveau RAG prenant en compte les informations des méta-données, puis poser la même question."
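Un point qui justifie la consigne : une chaîne LCEL construite avec `|` capture l'objet fonction au moment de sa construction ; redéfinir `format_docs` ensuite ne met donc pas à jour la chaîne existante, d'où la nécessité d'en reconstruire une. Illustration en Python pur (exemple hypothétique, sans LangChain) :

```python
# Illustration : une référence capturée pointe vers l'ancienne définition,
# exactement comme `retriever | format_docs` dans une chaîne déjà construite.
def f():
    return "v1"

captured = f  # analogue de la fonction capturée par la chaîne

def f():  # redéfinition, comme lorsqu'on réécrit format_docs
    return "v2"

print(captured())  # v1 : l'ancienne version est toujours utilisée
print(f())         # v2
```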
+ ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "b3824802", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Answer: Selon ML.pdf, page 92, la citation d'Alan Turing est : « Nous ne pouvons qu’avoir un aperçu du futur, mais cela suffit pour comprendre qu’il y a beaucoup à faire. »\n" + ] + } + ], + "source": [ + "rag_chain = (\n", + " {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", + " | prompt_template\n", + " | model\n", + " | StrOutputParser()\n", + ")\n", + "\n", + "query = \"Quelle est la citation d'Alan Turing ?\"\n", + "result = rag_chain.invoke(query)\n", + "print(\"Answer:\", result)" + ] + }, + { + "cell_type": "markdown", + "id": "973dfa8d", + "metadata": {}, + "source": [ + "C'est ce que nous souhaitions obtenir ! Mais nous pourrions avoir un format un peu plus structuré et moins libre. Pour cela, nous allons modifier notre système pour qu'il renvoie des JSON !\n", + "Commençons par modifier le template de prompt pour lui donner les instructions :" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "d4892e8d", + "metadata": {}, + "outputs": [], + "source": [ + "prompt_template = PromptTemplate(\n", + " template=\"\"\"\n", + " You are an assistant for question-answering tasks, use the retrieved context to answer the question. 
Each piece of context includes metadata (source + page).\n",
+    "    If you don’t know the answer, respond with: {{\"answer\": \"I don't know\", \"sources\": []}}\n",
+    "    Otherwise, return your answer in JSON with this exact structure:\n",
+    "    {{\n",
+    "        \"answer\": \"your answer here\",\n",
+    "        \"sources\": [\"source:page\", \"source:page\"]\n",
+    "    }}\n",
+    "    Rules:\n",
+    "    - Answer in the same language as the question.\n",
+    "    - Always include the sources (source:page).\n",
+    "    - Never add extra fields.\n",
+    "\n",
+    "    Question: {question}\n",
+    "    Context:\\n{context}\n",
+    "    Answer:\n",
+    "    \"\"\",\n",
+    "    input_variables=[\"context\", \"question\"]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "01e34935",
+   "metadata": {},
+   "source": [
+    "Puisque nous demandons ici de répondre avec par exemple `[\"ML.pdf:91\"]`, nous allons lui faciliter la tâche en modifiant la fonction `format_docs`.\n",
+    "\n",
+    "**Consigne** : Modifier la fonction `format_docs` pour prendre en compte le formatage `source:page`."
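Au passage, notez les doubles accolades `{{ }}` dans le template ci-dessus : un `PromptTemplate` au format f-string réserve les accolades simples aux variables, les accolades littérales du JSON d'exemple doivent donc être doublées. On peut le vérifier avec le mécanisme de formatage standard de Python (esquisse, sans LangChain) :

```python
from string import Formatter

# Esquisse : les accolades doublées sont des accolades littérales pour le
# formatage de type f-string ; seule {question} est une vraie variable.
template = 'Réponds au format {{"answer": "..."}} à la question : {question}'

variables = [name for _, name, _, _ in Formatter().parse(template) if name]
print(variables)  # ['question']
print(template.format(question="2 + 2 ?"))
```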
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "547f6ea2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def format_docs(docs):\n",
+    "    formatted = []\n",
+    "    for doc in docs:\n",
+    "        source = doc.metadata.get(\"source\", \"unknown\")\n",
+    "        page = doc.metadata.get(\"page\")\n",
+    "        # pages numérotées à partir de 0 par PyPDFLoader, d'où le +1\n",
+    "        page_label = page + 1 if isinstance(page, int) else \"unknown\"\n",
+    "        content = doc.page_content.strip()\n",
+    "        formatted.append(f\"[{source}:{page_label}]\\n{content}\")\n",
+    "    return \"\\n\\n\".join(formatted)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0238f9f6",
+   "metadata": {},
+   "source": [
+    "Si nous souhaitons obtenir un JSON, ou un dictionnaire, en sortie du modèle, nous devons modifier la chaîne RAG définie précédemment.\n",
+    "\n",
+    "**Consigne** : Utiliser la classe [`JsonOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html) à la place de [`StrOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.string.StrOutputParser.html#langchain_core.output_parsers.string.StrOutputParser) puis tester la nouvelle chaîne RAG avec la même question."
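Remarque : certains modèles renvoient le JSON entouré d'une clôture de code Markdown. `JsonOutputParser` gère normalement ce cas, mais voici une esquisse en Python pur de l'idée du garde-fou, avec un repli si le texte n'est pas du JSON valide (fonction purement hypothétique) :

```python
import json

# Esquisse (hypothétique) : retirer une éventuelle clôture de code Markdown
# autour du JSON, puis replier sur une réponse brute si le parsing échoue.
def parse_model_json(text: str) -> dict:
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # retire la ligne d'ouverture (```json) et la clôture finale
        cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0]
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return {"answer": text, "sources": []}

print(parse_model_json('```json\n{"answer": "42", "sources": ["ML.pdf:6"]}\n```'))
```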
+ ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "c0f90db7", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Answer: {'answer': 'Nous ne pouvons qu’avoir un aperçu du futur, mais cela suffit pour comprendre qu’il y a beaucoup à faire.', 'sources': ['ML.pdf:2']}\n" + ] + } + ], + "source": [ + "from langchain_core.output_parsers import JsonOutputParser\n", + "\n", + "\n", + "rag_chain = (\n", + " {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", + " | prompt_template\n", + " | model\n", + " | JsonOutputParser()\n", + ")\n", + "\n", + "query = \"Quelle est la citation d'Alan Turing ?\"\n", + "result = rag_chain.invoke(query)\n", + "print(\"Answer:\", result)" + ] + }, + { + "cell_type": "markdown", + "id": "3db037d1", + "metadata": {}, + "source": [ + "C'est mieux ! Il nous reste à présent à mesurer la performance de notre système.\n", + "\n", + "\n", + "### Mesurer les performances\n", + "\n", + "Nous avons défini manuellement plusieurs questions dont les réponses sont contenus dans le cours dans le fichier JSON *eval_dataset*." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "d4398984", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'question': \"Qu'est-ce qu'un algorithme ?\", 'answer': 'Un algorithme est une séquence d’instructions logique ordonnée pour répondre explicitement à un problème.', 'sources': 'ML.pdf:6'}\n" + ] + } + ], + "source": [ + "import json\n", + "with open(\"eval_dataset.json\", \"r\", encoding=\"utf-8\") as file:\n", + " eval_dataset = json.load(file)\n", + "\n", + "print(eval_dataset[0])" + ] + }, + { + "cell_type": "markdown", + "id": "37b8eb75", + "metadata": {}, + "source": [ + "Il sera probablement difficile de mesurer la performance de manière frontale. 
Ainsi, nous optons pour une méthodologie *LLM as a Judge*.\n",
+    "\n",
+    "**Consigne** : Définir une fonction `evaluate_rag` qui prend en paramètre une chaîne RAG et un dataset pour évaluation. La fonction renverra une liste de dictionnaires avec pour clés :\n",
+    "* *question* : la question posée\n",
+    "* *expected_answer* : la réponse attendue\n",
+    "* *predicted_answer* : la réponse obtenue\n",
+    "* *expected_sources* : la ou les sources attendues\n",
+    "* *predicted_sources* : la ou les sources obtenues"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "id": "4a3a70a4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def evaluate_rag(rag_chain, dataset):\n",
+    "    results = []\n",
+    "    for example in dataset:\n",
+    "        prediction = rag_chain.invoke(example[\"question\"])\n",
+    "\n",
+    "        results.append({\n",
+    "            \"question\": example[\"question\"],\n",
+    "            \"expected_answer\": example[\"answer\"],\n",
+    "            \"predicted_answer\": prediction[\"answer\"],\n",
+    "            \"expected_sources\": example[\"sources\"],\n",
+    "            \"predicted_sources\": prediction[\"sources\"]\n",
+    "        })\n",
+    "    return results"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "da59e623",
+   "metadata": {},
+   "source": [
+    "**Consigne** : Tester la fonction précédente avec les trois premières questions puis afficher le résultat sous la forme d'un dataframe pandas."
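Pour l'affichage demandé, rappelons que pandas sait construire un `DataFrame` directement depuis une liste de dictionnaires, chaque clé devenant une colonne (esquisse avec des données d'exemple fictives) :

```python
import pandas as pd

# Esquisse : données fictives ayant la même structure que la sortie
# d'evaluate_rag ; pd.DataFrame transforme chaque clé en colonne.
results = [
    {"question": "Qu'est-ce qu'un algorithme ?",
     "expected_answer": "Une séquence d'instructions ordonnée.",
     "predicted_answer": "Une séquence d'instructions ordonnée.",
     "expected_sources": "ML.pdf:6",
     "predicted_sources": ["ML.pdf:6"]},
]
df = pd.DataFrame(results)
print(df.shape)           # (1, 5)
print(list(df.columns))
```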
+ ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "a33db551", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[{'question': \"Qu'est-ce qu'un algorithme ?\", 'expected_answer': 'Un algorithme est une séquence d’instructions logique ordonnée pour répondre explicitement à un problème.', 'predicted_answer': \"Un algorithme est un objet dont nous supposerons l'existence, et dont la description sera le cœur des prochains chapitres.\", 'expected_sources': 'ML.pdf:6', 'predicted_sources': ['ML.pdf:6', 'ML.pdf:134']}, {'question': \"Qu'est-ce qu'un hackathon ?\", 'expected_answer': 'Un hackathon en Machine Learning est une compétition entre data-scientists (ou étudiants) dont le but est de trouver la meilleure manière de répondre à une tâche donnée.', 'predicted_answer': \"I don't know\", 'expected_sources': 'ML.pdf:10', 'predicted_sources': []}, {'question': \"Quel est l'inconvénient de la méthode Leave-One-Out Cross-Validation ?\", 'expected_answer': 'L’un des inconvénients majeur est que cela peut devenir très long et très coûteux en opération de calcul puisqu’il faut entraîner n fois l’algorithme sur presque l’ensemble du dataset', 'predicted_answer': \"L'inconvénient de la méthode Leave-One-Out Cross-Validation est que pour chaque point de données, le modèle est entraîné sur tous les autres points de données et testé sur le point de données restant. 
Cela peut entraîner des estimations de l'erreur beaucoup plus élevées que celles obtenues par la validation croisée standard.\", 'expected_sources': 'ML.pdf:10', 'predicted_sources': ['ML.pdf:10', 'ML.pdf:10', 'ML.pdf:128']}]\n"
+     ]
+    }
+   ],
+   "source": [
+    "results = evaluate_rag(rag_chain, dataset=eval_dataset[:3])\n",
+    "print(results)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "14393690",
+   "metadata": {},
+   "source": [
+    "Nous sommes capables d'obtenir un ensemble de réponses de la part d'un modèle avec un RAG ; il nous reste à mettre en place le juge.\n",
+    "\n",
+    "**Consigne** : Définir un prompt pour décrire le rôle du juge."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "id": "a9eacd88",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "judge_prompt = PromptTemplate(\n",
+    "    template = \"\"\"\n",
+    "    You are an evaluator. Your task is to compare a student's answer with the reference answer. \n",
+    "    The student answer may still be valid even if it is phrased differently.\n",
+    "\n",
+    "    Question: {question}\n",
+    "    Reference Answer: {expected_answer}\n",
+    "    Expected Sources: {expected_sources}\n",
+    "\n",
+    "    Student Answer: {predicted_answer}\n",
+    "    Student Sources: {predicted_sources}\n",
+    "\n",
+    "    Evaluation Instructions:\n",
+    "    - If the student's answer correctly matches the meaning of the reference answer, mark it as CORRECT. \n",
+    "    - If it is wrong or missing important details, mark it as INCORRECT.\n",
+    "    - For sources, check if the student listed at least the expected sources. 
Extra sources are allowed.\n",
+    "    - Return your judgment strictly as JSON:\n",
+    "    {{\n",
+    "        \"answer_correct\": true/false,\n",
+    "        \"sources_correct\": true/false\n",
+    "    }}\n",
+    "    \"\"\",\n",
+    "    input_variables=[\n",
+    "        \"question\",\n",
+    "        \"expected_answer\",\n",
+    "        \"predicted_answer\",\n",
+    "        \"expected_sources\",\n",
+    "        \"predicted_sources\",\n",
+    "    ]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bc714900",
+   "metadata": {},
+   "source": [
+    "**Consigne** : Définir une chaîne pour le juge, de la même manière que le RAG : prompt --> model --> JSONParser"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "id": "b3c30cc3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "judge_model = OllamaLLM(model=\"gemma3:4b\")\n",
+    "json_parser = JsonOutputParser()\n",
+    "\n",
+    "judge_chain = judge_prompt | judge_model | json_parser"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6069627d",
+   "metadata": {},
+   "source": [
+    "**Consigne** : Modifier la fonction `evaluate_rag` pour qu'elle note directement la performance du modèle et renvoie les résultats sous la forme d'un dataframe pandas. On implémentera également des mesures temporelles pour le RAG et le juge, ainsi que des blocs *try...except...* pour ne pas bloquer l'exécution de toutes les requêtes si l'une d'elles renvoie une erreur.\n",
+    "Pour pouvoir suivre l'avancement de l'évaluation, on utilisera la barre de progression tqdm."
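Le motif « chronométrage + *try...except* » demandé peut se résumer ainsi, indépendamment de LangChain (la fonction `appel_fragile` est purement hypothétique) :

```python
import time

# Esquisse du motif : on chronomètre chaque appel et on capture les erreurs
# pour ne pas interrompre la boucle d'évaluation.
def appel_fragile(x):
    if x < 0:
        raise ValueError("exemple d'échec")
    return x * 2

resultats = []
for x in [1, -1, 2]:
    debut = time.time()
    try:
        valeur = appel_fragile(x)
    except Exception as e:
        valeur = None  # valeur par défaut, l'exécution continue
        print(f"[ERREUR] entrée {x} | {e}")
    duree = time.time() - debut
    resultats.append({"entree": x, "valeur": valeur, "duree": duree})

print([r["valeur"] for r in resultats])  # [2, None, 4]
```

Pour mesurer des durées, `time.perf_counter()` est en général préférable à `time.time()`, mais on garde ici la convention du notebook.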
+ ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "0556cbed", + "metadata": {}, + "outputs": [], + "source": [ + "from tqdm import tqdm\n", + "import time\n", + "import pandas as pd\n", + "\n", + "\n", + "def evaluate_rag(rag_chain, dataset, judge_chain):\n", + " \"\"\"\n", + " Evaluate a RAG chain against a dataset using a judge LLM.\n", + "\n", + " Args:\n", + " rag_chain: LangChain RAG chain.\n", + " dataset: List of dicts with 'question', 'answer', 'sources'.\n", + " judge_chain: LangChain judge chain that outputs JSON with 'answer_correct', 'sources_correct', 'explanation'.\n", + "\n", + " Returns:\n", + " pandas.DataFrame with predictions, judgment, and timings.\n", + " \"\"\"\n", + " results = []\n", + "\n", + " iterator = tqdm(dataset, desc=\"Evaluating RAG\", unit=\"query\")\n", + "\n", + " for example in iterator:\n", + " rag_start = time.time()\n", + " try:\n", + " prediction = rag_chain.invoke(example[\"question\"])\n", + " except Exception as e:\n", + " prediction = {\"answer\": \"\", \"sources\": []}\n", + " print(f\"[RAG ERROR] Question: {example['question']} | {e}\")\n", + " rag_end = time.time()\n", + "\n", + " judge_input = {\n", + " \"question\": example[\"question\"],\n", + " \"expected_answer\": example[\"answer\"],\n", + " \"predicted_answer\": prediction.get(\"answer\", \"\"),\n", + " \"expected_sources\": example[\"sources\"],\n", + " \"predicted_sources\": prediction.get(\"sources\", []),\n", + " }\n", + "\n", + " judge_start = time.time()\n", + " try:\n", + " judgment = judge_chain.invoke(judge_input)\n", + " except Exception as e:\n", + " judgment = {\"answer_correct\": False, \"sources_correct\": False, \"explanation\": f\"Judge error: {e}\"}\n", + " print(f\"[JUDGE ERROR] Question: {example['question']} | {e}\")\n", + " judge_end = time.time()\n", + "\n", + " results.append({\n", + " **judge_input,\n", + " **judgment,\n", + " \"rag_time\": rag_end - rag_start,\n", + " \"judge_time\": judge_end - judge_start,\n", + " 
\"total_time\": judge_end - rag_start\n",
+    "        })\n",
+    "    \n",
+    "    return pd.DataFrame(results)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "73d842ea",
+   "metadata": {},
+   "source": [
+    "**Consigne** : Utiliser cette fonction sur les dix premières questions du dataset d'évaluation."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 24,
+   "id": "afad101d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "Evaluating RAG: 100%|██████████| 10/10 [00:46<00:00,  4.64s/query]\n"
+     ]
+    },
+    {
+     "data": {
+      "text/html": [
+       "<div>
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
questionexpected_answerpredicted_answerexpected_sourcespredicted_sourcesanswer_correctsources_correctrag_timejudge_timetotal_time
0Qu'est-ce qu'un algorithme ?Un algorithme est une séquence d’instructions ...Nous ne discuterons pas d’algorithmes en parti...ML.pdf:6[ML.pdf:6]FalseTrue2.7821751.6568884.439065
1Qu'est-ce qu'un hackathon ?Un hackathon en Machine Learning est une compé...I don't knowML.pdf:10[]FalseFalse1.8683081.6570523.525366
2Quel est l'inconvénient de la méthode Leave-On...L’un des inconvénients majeur est que cela peu...L'inconvénient de la méthode Leave-One-Out Cro...ML.pdf:10[ML.pdf:10, ML.pdf:128]TrueTrue4.3393671.8448206.184189
3Qu'est-ce que la régression polynomiale ?Une régression polynomiale est une régression ...Une régression polynomiale est une régression ...ML.pdf:21[ML.pdf:21]TrueTrue3.3427251.7515315.094258
4What is exercise 3.5 about ?Mail classificationI don't knowML.pdf:30[]FalseFalse2.1517261.5533533.705080
5Quel est l'autre nom du bagging ?La solution donne son nom à la section : nous ...Le bagging est également connu sous le nom d’a...ML.pdf:39[ML.pdf:40, ML.pdf:68]TrueTrue2.9523151.6460254.598341
6Qu'est-ce qu'une souche en Machine Learning ?Les weak learners d’AdaBoost sont appelés des ...En Machine Learning, une souche (ou lineage) f...ML.pdf:42[ML.pdf:113]FalseTrue4.6588001.8775336.536340
7Quelle sont les trois propriétés mathématiques...Indiscernabilité, symétrie et sous-additivitéI don't knowML.pdf:51[]FalseFalse2.1284391.5834633.711939
8Pourquoi KMeans a été introduit ?Kmeans++ : un meilleur départ\\nSuivre cette mé...I don't knowML.pdf:54[]FalseFalse1.8780881.7635183.641612
9Dans quel article a été introduit le lemme de ...Cette similitude est expliquée par le titre de...Le lemme de Johnson-Lindenstrauss a été introd...ML.pdf:63[ML.pdf:64]TrueTrue3.1187611.8017414.920507
\n", + "
" + ], + "text/plain": [ + " question \\\n", + "0 Qu'est-ce qu'un algorithme ? \n", + "1 Qu'est-ce qu'un hackathon ? \n", + "2 Quel est l'inconvénient de la méthode Leave-On... \n", + "3 Qu'est-ce que la régression polynomiale ? \n", + "4 What is exercise 3.5 about ? \n", + "5 Quel est l'autre nom du bagging ? \n", + "6 Qu'est-ce qu'une souche en Machine Learning ? \n", + "7 Quelle sont les trois propriétés mathématiques... \n", + "8 Pourquoi KMeans a été introduit ? \n", + "9 Dans quel article a été introduit le lemme de ... \n", + "\n", + " expected_answer \\\n", + "0 Un algorithme est une séquence d’instructions ... \n", + "1 Un hackathon en Machine Learning est une compé... \n", + "2 L’un des inconvénients majeur est que cela peu... \n", + "3 Une régression polynomiale est une régression ... \n", + "4 Mail classification \n", + "5 La solution donne son nom à la section : nous ... \n", + "6 Les weak learners d’AdaBoost sont appelés des ... \n", + "7 Indiscernabilité, symétrie et sous-additivité \n", + "8 Kmeans++ : un meilleur départ\\nSuivre cette mé... \n", + "9 Cette similitude est expliquée par le titre de... \n", + "\n", + " predicted_answer expected_sources \\\n", + "0 Nous ne discuterons pas d’algorithmes en parti... ML.pdf:6 \n", + "1 I don't know ML.pdf:10 \n", + "2 L'inconvénient de la méthode Leave-One-Out Cro... ML.pdf:10 \n", + "3 Une régression polynomiale est une régression ... ML.pdf:21 \n", + "4 I don't know ML.pdf:30 \n", + "5 Le bagging est également connu sous le nom d’a... ML.pdf:39 \n", + "6 En Machine Learning, une souche (ou lineage) f... ML.pdf:42 \n", + "7 I don't know ML.pdf:51 \n", + "8 I don't know ML.pdf:54 \n", + "9 Le lemme de Johnson-Lindenstrauss a été introd... 
ML.pdf:63 \n", + "\n", + " predicted_sources answer_correct sources_correct rag_time \\\n", + "0 [ML.pdf:6] False True 2.782175 \n", + "1 [] False False 1.868308 \n", + "2 [ML.pdf:10, ML.pdf:128] True True 4.339367 \n", + "3 [ML.pdf:21] True True 3.342725 \n", + "4 [] False False 2.151726 \n", + "5 [ML.pdf:40, ML.pdf:68] True True 2.952315 \n", + "6 [ML.pdf:113] False True 4.658800 \n", + "7 [] False False 2.128439 \n", + "8 [] False False 1.878088 \n", + "9 [ML.pdf:64] True True 3.118761 \n", + "\n", + " judge_time total_time \n", + "0 1.656888 4.439065 \n", + "1 1.657052 3.525366 \n", + "2 1.844820 6.184189 \n", + "3 1.751531 5.094258 \n", + "4 1.553353 3.705080 \n", + "5 1.646025 4.598341 \n", + "6 1.877533 6.536340 \n", + "7 1.583463 3.711939 \n", + "8 1.763518 3.641612 \n", + "9 1.801741 4.920507 " + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "results = evaluate_rag(rag_chain, dataset=eval_dataset[:10], judge_chain=judge_chain)\n", + "results" + ] + }, + { + "cell_type": "markdown", + "id": "91231c6d", + "metadata": {}, + "source": [ + "**Consigne** : A partir des résultats précédents, donner des statistiques de performance du modèle." 
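Les précisions demandées se réduisent à des moyennes de booléens ; on peut le vérifier à la main sur la colonne `answer_correct` du tableau ci-dessus :

```python
# Vérification à la main : la colonne answer_correct du tableau ci-dessus
# contient 4 True sur 10 lignes, soit une précision de 40 %.
answer_correct = [False, False, True, True, False, True, False, False, False, True]

accuracy = sum(answer_correct) / len(answer_correct)
print(f"Accuracy: {100 * accuracy:.2f}%")  # Accuracy: 40.00%
```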
+ ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "59d821db", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy: 40.00%\n", + "Accuracy source: 60.00%\n", + "RAG time (avg): 2.92s\n", + "Judge time (avg): 1.71s\n", + "Total time (avg): 4.64s\n" + ] + } + ], + "source": [ + "accuracy = results[\"answer_correct\"].astype(int).mean()\n", + "source_accuracy = results[\"sources_correct\"].astype(int).mean()\n", + "avg_rag_time = results[\"rag_time\"].mean()\n", + "avg_judge_time = results[\"judge_time\"].mean()\n", + "avg_total_time = results[\"total_time\"].mean()\n", + "\n", + "print(f\"Accuracy: {100 * accuracy:.2f}%\")\n", + "print(f\"Accuracy source: {100 * source_accuracy:.2f}%\")\n", + "print(f\"RAG time (avg): {avg_rag_time:.2f}s\")\n", + "print(f\"Judge time (avg): {avg_judge_time:.2f}s\")\n", + "print(f\"Total time (avg): {avg_total_time:.2f}s\")" + ] + }, + { + "cell_type": "markdown", + "id": "289c97f8", + "metadata": {}, + "source": [ + "## Pour aller plus loin\n", + "\n", + "Nous avons plusieurs axes d'améliorations, de manière non exhaustive :\n", + "* Une meilleure récupération du texte dans le PDF : par exemple utiliser [Docling](https://python.langchain.com/docs/integrations/document_loaders/docling/) ?\n", + "* Une meilleure manière de découper en *chunk* le texte : par exemple utiliser [RecursiveCharacterTextSplitter](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html#recursivecharactertextsplitter), ou changer la taille des chunks...\n", + "* Un meilleur modèle d'embedding : voir le [leaderboard](https://huggingface.co/spaces/mteb/leaderboard) des embeddings\n", + "* Un meilleur retrieval : meilleure méthode pour chercher, par exemple [MMR](https://python.langchain.com/v0.2/docs/how_to/example_selectors_mmr/)\n", + "* De meilleurs prompt\n", + "* Une meilleure mesure de performance : 
plus de questions par exemple\n",
+    "\n",
+    "Nous encourageons l'étudiant à tester la ou les améliorations qu'il souhaite faire, et surtout à mesurer chaque apport séparément. On encourage également à utiliser ses propres documents et son propre benchmark.\n",
+    "Pour accélérer encore un peu l'évaluation, on propose une version asynchrone de la fonction d'évaluation :"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "id": "7ae5fd5d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import asyncio\n",
+    "from tqdm.asyncio import tqdm_asyncio\n",
+    "\n",
+    "async def evaluate_rag_async(rag_chain, dataset, judge_chain, max_concurrency=5):\n",
+    "    \"\"\"\n",
+    "    Async evaluation of a RAG chain against a dataset using a judge LLM.\n",
+    "    \"\"\"\n",
+    "    results = []\n",
+    "    semaphore = asyncio.Semaphore(max_concurrency)\n",
+    "\n",
+    "    async def process_example(example):\n",
+    "        async with semaphore:\n",
+    "            rag_start = time.time()\n",
+    "            try:\n",
+    "                prediction = await rag_chain.ainvoke(example[\"question\"])\n",
+    "            except Exception as e:\n",
+    "                prediction = {\"answer\": \"\", \"sources\": []}\n",
+    "                print(f\"[RAG ERROR] Question: {example['question']} | {e}\")\n",
+    "            rag_end = time.time()\n",
+    "\n",
+    "            judge_input = {\n",
+    "                \"question\": example[\"question\"],\n",
+    "                \"expected_answer\": example[\"answer\"],\n",
+    "                \"predicted_answer\": prediction.get(\"answer\", \"\"),\n",
+    "                \"expected_sources\": example[\"sources\"],\n",
+    "                \"predicted_sources\": prediction.get(\"sources\", []),\n",
+    "            }\n",
+    "\n",
+    "            judge_start = time.time()\n",
+    "            try:\n",
+    "                judgment = await judge_chain.ainvoke(judge_input)\n",
+    "            except Exception as e:\n",
+    "                judgment = {\"answer_correct\": False, \"sources_correct\": False, \"explanation\": f\"Judge error: {e}\"}\n",
+    "                print(f\"[JUDGE ERROR] Question: {example['question']} | {e}\")\n",
+    "            judge_end = time.time()\n",
+    "\n",
+    "            results.append({\n",
+    "                **judge_input,\n",
+    "                **judgment,\n",
+    "                
\"rag_time\": rag_end - rag_start,\n", + " \"judge_time\": judge_end - judge_start,\n", + " \"total_time\": judge_end - rag_start\n", + " })\n", + "\n", + " tasks = [process_example(example) for example in dataset]\n", + " for f in tqdm_asyncio.as_completed(tasks, desc=\"Evaluating RAG\", total=len(dataset)):\n", + " await f\n", + "\n", + " return pd.DataFrame(results)\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "studies", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/M2/Generative AI/TP1/TP2 Benchmark - Starter.ipynb b/M2/Generative AI/TP1/TP2 Benchmark - Starter.ipynb deleted file mode 100644 index 545f175..0000000 --- a/M2/Generative AI/TP1/TP2 Benchmark - Starter.ipynb +++ /dev/null @@ -1,456 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "172a7a9f", - "metadata": {}, - "source": [ - "# TP2 - Benchmark automatique\n", - "\n", - "Dans ce TP nous allons définir une fonction pour mesurer les performances d'un modèle de langage via l'exécution de plusieurs benchmarks. Nous avons vu en cours trois manières de mesurer la performance d'un modèle de langage qu'on peut résumer à:\n", - "1. **Évaluation automatique**: via un ensemble de questions dont on connait la réponse\n", - "2. **Évaluation humaine**: qualification humaine de la réponse d'un modèle à une question\n", - "3. 
**Évaluation par modèle de langage**: notation ou comparaison de réponses d'un ou plusieurs modèles par un autre modèle\n", - "\n", - "Nous nous intéressons ici au premier point, en particulier avec les benchmarks [GSM8K](https://huggingface.co/datasets/openai/gsm8k) et [HellaSwag](https://huggingface.co/datasets/Rowan/hellaswag).\n", - "Dans l'ensemble du notebook nous utiliserons la librairie LangChain.\n", - "\n", - "Il est à garder en tête que ce notebook n'a qu'une portée pédagogique : il n'est pas forcément à jour puisque le domaine évolue rapidement, et les pratiques présentées ne sont pas nécessairement celles validées par l'industrie.\n", - "\n", - "## Uniformisation des benchmarks\n", - "\n", - "Pour chaque benchmark que l'on considère, nous avons besoin de plusieurs informations :\n", - "* **Dataset** : une fonction pour charger les questions du benchmark\n", - "* **Référence** : une fonction capable d'identifier la réponse attendue\n", - "* **Prompt** : un prompt qui permet de demander correctement au modèle de répondre à la question\n", - "* **Chaîne** : une fonction qui renvoie la chaîne de traitement de LangChain\n", - "* **Score** : une fonction qui score la performance d'un modèle sur une question\n", - "\n", - "Nous allons commencer par créer une classe qui regroupe ces desiderata :\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "cd75374d", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_core.runnables import Runnable\n", - "from langchain_core.prompts import PromptTemplate\n", - "\n", - "\n", - "class Benchmark:\n", - "    name: str\n", - "\n", - "    def __init__(self, prompt: PromptTemplate):\n", - "        self.prompt = prompt\n", - "\n", - "    def load_data(self):\n", - "        raise NotImplementedError\n", - "\n", - "    def build_chain(self, model) -> Runnable:\n", - "        raise NotImplementedError\n", - "\n", - "    def get_reference(self, sample):\n", - "        raise NotImplementedError\n", - "\n", - "    def score(self, prediction, reference):\n", - "        raise
NotImplementedError" - ] - }, - { - "cell_type": "markdown", - "id": "e2ab41df", - "metadata": {}, - "source": [ - "Pour rendre cette classe plus concrète, commençons par travailler avec le benchmark [GSM8K](https://huggingface.co/datasets/openai/gsm8k).\n", - "\n", - "### Benchmark GSM8K\n", - "\n", - "On commence par charger le dataset et observer une question.\n", - "\n", - "**Consigne** : Résoudre la question *à la main* et vérifier votre réponse. On recommande d'explorer plusieurs questions." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "93979ba0", - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np; np.random.seed(42)\n", - "from datasets import load_dataset\n", - "\n", - "dataset = load_dataset(\"gsm8k\", \"main\")\n", - "dataset = dataset[\"test\"]\n", - "\n", - "print(f\"Number of questions: {len(dataset)}\")\n", - "index = 0\n", - "print(\"Example of question:\\n\", dataset[index][\"question\"])\n", - "print(\"And its answer:\\n\", dataset[index][\"answer\"])" - ] - }, - { - "cell_type": "markdown", - "id": "82d797f0", - "metadata": {}, - "source": [ - "Après avoir inspecté plusieurs éléments du dataset, on remarque que la réponse finale est placée après la chaîne de caractères \"####\".\n", - "\n", - "**Consigne** : Construire une fonction `get_reference` qui prend en argument un élément de GSM8K (dictionnaire avec question et réponse) et renvoie la réponse attendue (string). On pourra utiliser la fonction [`search`](https://docs.python.org/3/library/re.html#re.search) de la librairie [`re`](https://docs.python.org/3/library/re.html#).\n", - "Puis tester cette fonction sur l'exemple précédent."
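Pour la consigne ci-dessus, voici une esquisse indicative de `get_reference` fondée sur `re.search` (le motif `####` est celui observé dans le dataset ; la suppression des séparateurs de milliers est une hypothèse de normalisation, à adapter) :

```python
import re

def get_reference(sample: dict) -> str:
    """Renvoie la réponse attendue d'un élément GSM8K (le texte après '####')."""
    match = re.search(r"####\s*(.+)", sample["answer"])
    # La réponse finale suit '####' ; on retire espaces et séparateurs de milliers.
    return match.group(1).strip().replace(",", "") if match else ""
```

On peut la valider sur un exemple jouet avant de l'appliquer au dataset réel.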
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b336056a", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "4c137e6a", - "metadata": {}, - "source": [ - "Il nous reste maintenant à définir un prompt tel que l'on puisse appeler un modèle et tester notre mécanique." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0b899872", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_core.prompts import PromptTemplate\n", - "\n", - "prompt = PromptTemplate(\n", - " input_variables=[\"question\"],\n", - " template=(\n", - " \"\"\"You are a careful mathematician. Solve the problem step by step, then display your answer in the end.\n", - " Question: {question}\n", - " Answer:\"\"\"\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "36433b53", - "metadata": {}, - "source": [ - "En intégrant l'appel à un modèle via Ollama sur notre ordinateur, on peut définir avec LangChain :" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2f0676b6", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_core.runnables import RunnablePassthrough\n", - "from langchain_core.output_parsers import StrOutputParser\n", - "from langchain_ollama import OllamaLLM\n", - "\n", - "model = OllamaLLM(model=\"gemma3:4b\")\n", - "\n", - "chain = (\n", - " {\"question\": RunnablePassthrough()}\n", - " | prompt\n", - " | model\n", - " | StrOutputParser()\n", - ")\n", - "\n", - "index = 0\n", - "\n", - "question = dataset[index][\"question\"]\n", - "answer = get_reference(dataset[index])\n", - "response = chain.invoke(question)\n", - "print(f\"Model answer : {response}\")\n", - "print(f\"The answer was : {answer}\")\n" - ] - }, - { - "cell_type": "markdown", - "id": "97dd7db7", - "metadata": {}, - "source": [ - "Il nous faut extraire la dernière valeur numérique pour obtenir automatiquement la réponse du modèle.\n", - "\n", - "**Consigne** : Définir une 
fonction `score` qui prend en paramètre la réponse du modèle et la réponse attendue puis indique si les deux réponses sont identiques (1 / 0). On pourra utiliser la fonction [`findall`](https://docs.python.org/3/library/re.html#re.findall) de la librairie `re`.\n", - "Puis l'appliquer sur l'exemple précédent." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ad43cf84", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "a2ec5088", - "metadata": {}, - "source": [ - "Nous avons l'ensemble des éléments nécessaires pour définir la classe `GSM8KBenchmark` depuis la classe `Benchmark` que nous avons définie précédemment.\n", - "\n", - "**Consigne** : Définir cette classe comme sous-classe de `Benchmark`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d83f4394", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "dfc3cb78", - "metadata": {}, - "source": [ - "Il est maintenant temps de définir une fonction qui *fait* le benchmark.\n", - "\n", - "**Consigne** : Définir une fonction `run_benchmark` qui prend en paramètre :\n", - "* `model_name` : le nom du modèle Ollama que l'on veut tester\n", - "* `benchmark` : la classe benchmark que l'on souhaite tester\n", - "* `max_samples` : le nombre maximum de questions que l'on souhaite utiliser\n", - "\n", - "Puisque l'objet avec lequel nous travaillons est un dataset HuggingFace, pour sélectionner $n$ lignes, on utilisera \n", - "```python\n", - "dataset = dataset.select(range(max_samples))\n", - "```\n", - "De cette manière on préserve la structure." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2d7125af", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "81de8940", - "metadata": {}, - "source": [ - "**Consigne** : Utiliser la fonction `run_benchmark` en définissant un prompt pour GSM8K."
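Pour la consigne sur la fonction `score` plus haut, une esquisse indicative fondée sur `re.findall` (la comparaison passe par `float` pour tolérer « 18 » vs « 18.0 » ; c'est un choix d'implémentation, pas une exigence du sujet) :

```python
import re

def score(prediction: str, reference: str) -> int:
    """Renvoie 1 si le dernier nombre de la réponse du modèle égale la référence, sinon 0."""
    # On retire les séparateurs de milliers puis on capture entiers et décimaux signés.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", prediction.replace(",", ""))
    if not numbers:
        return 0
    try:
        return int(float(numbers[-1]) == float(reference.replace(",", "")))
    except ValueError:
        return 0
```

Prendre le *dernier* nombre est une heuristique : elle suppose que le modèle conclut par sa réponse finale, ce que le prompt encourage.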
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f6bbeb53", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "0c943124", - "metadata": {}, - "source": [ - "### HellaSwag\n", - "\n", - "Maintenant que nous avons réussi à le faire pour le dataset GSM8K, attaquons-nous à [HellaSwag](https://huggingface.co/datasets/Rowan/hellaswag).\n", - "\n", - "**Consigne** : En suivant la même approche que précédemment, implémenter une sous-classe `HellaSwagBenchmark` à partir de la classe `Benchmark`. Puis utiliser la fonction `run_benchmark` pour valider votre travail." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "32886901", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "96a3031a", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "c542783c", - "metadata": {}, - "source": [ - "## Réponses structurées\n", - "\n", - "Sur quelques exemples tout semble fonctionner ! Mais il y a au moins une fragilité dans notre travail : la récupération de la réponse est peu fiable et largement dépendante des prompts.\n", - "\n", - "\n", - "Par exemple pour GSM8K, on aimerait avoir une réponse sous la forme d'un JSON :\n", - "```json\n", - "{\n", - "  \"reasoning\": \"étapes de raisonnement\",\n", - "  \"final_answer\": 18\n", - "}\n", - "```\n", - "\n", - "De cette manière il serait particulièrement simple d'extraire la réponse, sans pour autant perdre la *réflexion* du modèle. En revanche pour HellaSwag, un JSON extrêmement simple suffit :\n", - "```json\n", - "{\n", - "  \"choice\": 2\n", - "}\n", - "```\n", - "\n", - "Pour forcer le modèle à suivre ces formats, nous allons utiliser l'option [Pydantic](https://docs.langchain.com/oss/python/langchain/structured-output). 
Elle s'utilise comme suit, pour GSM8K :" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "988dbca3", - "metadata": {}, - "outputs": [], - "source": [ - "from pydantic import BaseModel, Field\n", - "\n", - "class GSM8KOutput(BaseModel):\n", - "    reasoning: str = Field(description=\"Step-by-step reasoning\")\n", - "    final_answer: float = Field(description=\"Final numeric answer\")\n" - ] - }, - { - "cell_type": "markdown", - "id": "d855adfe", - "metadata": {}, - "source": [ - "Concernant l'intégration dans le prompt :" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f25afddc", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain.output_parsers import PydanticOutputParser\n", - "\n", - "parser_gsm8k = PydanticOutputParser(pydantic_object=GSM8KOutput)\n", - "\n", - "prompt_gsm8k = PromptTemplate(\n", - "    input_variables=[\"question\"],\n", - "    partial_variables={\"format_instructions\": parser_gsm8k.get_format_instructions()},\n", - "    template=(\n", - "        \"\"\"You are a careful mathematician. Solve the problem step by step.\n", - "        Question: {question}\n", - "        {format_instructions}\"\"\"\n", - "    ),\n", - ")\n", - "\n", - "print(parser_gsm8k.get_format_instructions())" - ] - }, - { - "cell_type": "markdown", - "id": "d1dcc480", - "metadata": {}, - "source": [ - "**Consigne** : Modifier la classe `Benchmark` et la sous-classe `GSM8KBenchmark` pour intégrer ces évolutions." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "542a31d6", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c94f1dd1", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "b2076f24", - "metadata": {}, - "source": [ - "**Consigne** : Utiliser la fonction `run_benchmark` et vérifier que tout fonctionne."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "31e433b0", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "b7ed90cd", - "metadata": {}, - "source": [ - "**Consigne** : Réaliser la même modification pour HellaSwag, et vérifier que cela fonctionne." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e678bed2", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2455f816", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "ba9acd54", - "metadata": {}, - "source": [ - "## Pour aller plus loin\n", - "\n", - "On pourrait implémenter d'autres benchmarks, comparer vraiment des modèles entre eux, comparer des prompts entre eux..." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/M2/Generative AI/TP1/TP2 RAG - Starter.ipynb b/M2/Generative AI/TP1/TP2 RAG - Starter.ipynb deleted file mode 100644 index 3d6a65e..0000000 --- a/M2/Generative AI/TP1/TP2 RAG - Starter.ipynb +++ /dev/null @@ -1,953 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "8514812a", - "metadata": {}, - "source": [ - "# TP2 - Retrieval Augmented Generation\n", - "\n", - "Dans ce TP nous allons construire un système RAG complet : base de connaissance, vectorisation et appel avec un modèle de langage.\n", - "\n", - "Certaines fonctions seront réutilisées dans les prochaines séances, nous encourageons donc la définition de fonctions générales, optimisées et robustes. 
Il est à garder en tête que ce notebook n'a qu'une portée pédagogique et n'est pas forcément à jour puisque le domaine évolue rapidement.\n", - "\n", - "Dans ce TP nous cherchons à apporter des connaissances Machine Learning, bien que le modèle en ait déjà beaucoup, en utilisant des cours au format PDF à notre disposition. \n", - "\n", - "\n", - "## Constitution de la base de connaissance\n", - "\n", - "Pour construire un RAG, il faut commencer par une base de connaissance. Elle sera composée dans notre cas de documents PDF. Nous allons commencer par extraire les informations texte contenues dans les documents.\n", - "\n", - "**Consigne** : À partir des fichiers disponibles, construire une fonction `pdf_parser` qui prend en paramètre le nom du fichier et qui renvoie le texte associé. On utilisera la classe [`PyPDFLoader`](https://python.langchain.com/docs/how_to/document_loader_pdf/#simple-and-fast-text-extraction) et sa méthode `load` pour charger le document.\n" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "6a4a00a2", - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/Users/arthurdanjou/Workspace/studies/.venv/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", - " from .autonotebook import tqdm as notebook_tqdm\n" - ] - } - ], - "source": [ - "from langchain_community.document_loaders import PyPDFLoader\n", - "\n", - "\n", - "def pdf_parser(file_name: str) -> list:\n", - " \"\"\"Extract text from a PDF file.\n", - "\n", - " Args:\n", - " file_name (str): The path to the PDF file.\n", - "\n", - " Returns:\n", - " list: A list of documents extracted from the PDF file.\n", - "\n", - " \"\"\"\n", - " loader = PyPDFLoader(file_name)\n", - " return loader.load()" - ] - }, - { - "cell_type": "markdown", - "id": "77905595", - "metadata": {}, - "source": [ - "**Consigne** : Utiliser la fonction `pdf_parser` pour charger le fichier 'ML.pdf' puis inspecter son contenu." - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "8ec332e6", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "--- Document 0 ---\n", - "INTRODUCTION AU MACHINE LEARNING\n", - "2022-2026\n", - "Théo Lopès-Quintas\n", - "\n", - "--- Document 1 ---\n", - "Cadre et approche du cours\n", - "Alan Turing publieComputing Machinery and Intelligenceen 1950 [Tur50], qui deviendra un article\n", - "fondamental pour l’intelligence artificielle. 
Une citation devenue célèbre a motivé l’écriture de ce cours :\n", - "Nous ne pouvons qu’avoir un aperçu du futur, mais cela suffit pour comprendre qu’il y a\n", - "beaucoup à faire.\n", - "— Alan Turing (1950)\n", - "C’est par cette vision des années 1950 que nous nous proposons de remonter le temps et de découvrir\n", - "l’ensemble des grandes briques élémentair\n", - "\n" - ] - } - ], - "source": [ - "ml_doc = pdf_parser(\"data/ML.pdf\")\n", - "for i, doc in enumerate(ml_doc[:2]):\n", - "    print(f\"--- Document {i} ---\")\n", - "    print(doc.page_content[:500])\n", - "    print()" - ] - }, - { - "cell_type": "markdown", - "id": "0473470e", - "metadata": {}, - "source": [ - "Nous avons du texte et des métadonnées. Nous commencerons par nous concentrer sur le texte. Pour qu'il puisse être digéré par le RAG, nous devons le découper en plusieurs *chunks*. La classe [`CharacterTextSplitter`](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.CharacterTextSplitter.html) permet de réaliser cette opération."
- ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "bea1f928", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Il y a 1458 chunks.\n" - ] - } - ], - "source": [ - "from langchain_text_splitters import CharacterTextSplitter\n", - "\n", - "text_splitter = CharacterTextSplitter(\n", - " separator=\"\\n\",\n", - " chunk_size=256,\n", - " chunk_overlap=0,\n", - " length_function=len,\n", - " is_separator_regex=False,\n", - ")\n", - "\n", - "texts = text_splitter.split_documents(documents=ml_doc)\n", - "print(f\"Il y a {len(texts)} chunks.\")" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "18664898", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "Document(metadata={'producer': 'pdfTeX-1.40.27', 'creator': 'TeX', 'creationdate': '2026-01-03T13:58:41+01:00', 'moddate': '2026-01-03T13:58:41+01:00', 'trapped': '/False', 'ptex.fullbanner': 'This is pdfTeX, Version 3.141592653-2.6-1.40.27 (TeX Live 2025) kpathsea version 6.4.1', 'source': 'data/ML.pdf', 'total_pages': 140, 'page': 0, 'page_label': '1'}, page_content='INTRODUCTION AU MACHINE LEARNING\\n2022-2026\\nThéo Lopès-Quintas')" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "texts[0]" - ] - }, - { - "cell_type": "markdown", - "id": "96d05d6a", - "metadata": {}, - "source": [ - "**Consigne** : Après avoir inspecté le contenu de la variable *texts*, afficher la distribution de la longueur des chunks." 
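Pour la consigne ci-dessus, une esquisse indicative du calcul des longueurs (la cellule d'origine affiche un histogramme matplotlib, non reproduit ici ; le nom `chunk_lengths` est hypothétique) :

```python
from statistics import mean, median

def chunk_lengths(chunks) -> list:
    """Longueur en caractères de chaque chunk (Documents LangChain ou simples chaînes)."""
    return [len(getattr(c, "page_content", c)) for c in chunks]

# Avec `texts` issu du CharacterTextSplitter, on tracerait ensuite par exemple
# plt.hist(chunk_lengths(texts), bins=50) ; ici, un simple résumé chiffré sur un jouet.
demo = chunk_lengths(["abc", "de", "fghij"])
print(f"min={min(demo)} median={median(demo)} mean={mean(demo):.1f} max={max(demo)}")
```

Ce résumé permet de repérer des chunks anormalement courts ou dépassant `chunk_size` (séparateur absent).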
- ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "b30cc5de", - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAA1IAAAIkCAYAAAAUKhpvAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjEsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvc2/+5QAAAAlwSFlzAAAPYQAAD2EBqD+naQAAe49JREFUeJzt3Qd8FGX6wPFn03tCEkhoCb1XaSJgA8FyNrw7C57lPD099RQ9C579ivXUs59/z471rHjKgYhYaNJb6EiAkIQkpPfs/D/PGzZuQgLZkM2W/L6fzzCzs7Oz78xOlnn2fd/ntVmWZQkAAAAAoNkCmr8pAAAAAEARSAEAAACAiwikAAAAAMBFBFIAAAAA4CICKQAAAABwEYEUAAAAALiIQAoAAAAAXEQgBQAAAAAuIpACAAAAABcRSAHwG/fff7/YbLY2ea+TTz7ZTA7ffPONee///Oc/bfL+V1xxhfTo0UO8heP4dd5WXnvtNfOeP/30U5u9J7zv2nPQv8chQ4Z45L3b+u8fgHcgkALglRw3yY4pLCxMunTpItOmTZOnn35aioqKWuV9MjIyTAC2Zs0a8TbeXDYAANo7AikAXu3BBx+UN998U1544QW58cYbzbqbb75Zhg4dKuvWrau37d133y1lZWUuBysPPPCAy8HKvHnzzORORyrb//3f/8mWLVvc+v4AAKBpQUd4DgA87owzzpDRo0fXPZ41a5Z8/fXX8otf/ELOOeccSUtLk/DwcPNcUFCQmdyptLRUIiIiJCQkRDwpODjYo+8P71dSUiKRkZGeLgYA+C1qpAD4nFNPPVXuuece2b17t7z11ltH7CM1f/58mThxosTFxUlUVJT0799f7rrrrrp+DWPGjDHLV155ZV0zQm1W6NznYuXKlXLiiSeaAMrx2oZ9pBxqamrMNsnJyeYmVoO9PXv21NtG+5doP5OGnPd5tLI11k9Fb5xvvfVW6d69u4SGhppjffzxx8WyrHrb6X5uuOEG+eSTT8zx6baDBw+WuXPnNuv87927V8477zxzfJ06dZKZM2dKRUVFo9suW7ZMTj/9dImNjTXn76STTpIffvih3jbaTFNrGfV4tCy6z9NOO01WrVolLfH888+b49F9aXPQ66+/XvLz8+tt4/hsN23aJKeccoopW9euXeXRRx89bH96nenn6Hy8//vf/w7rE9acz9VBz9d9990nffr0MeXUz+z222+vdx6175fzZ+5M1+v13vDa1+O55JJLpEOHDua6V5mZmeYa6tatm3mvzp07y7nnntusvmWOa0Sb1ur8448/bnQ7u90uTz31lDnvum1SUpL8/ve/l4MHD9bbbsWKFaZ5bmJiovkBpGfPnvLb3/5WmuPLL7801090dLTExMSYv4+33377sO2O9pk21beusX5+rlwnDelnqT/46LW/ePFit1zrADyLGikAPuk3v/mNCVi0ed3VV1/d6DYbN240NzLDhg0zTQT1xmX79u11N/IDBw406++991655pprZNKkSWb9CSecULeP3NxcUyt20UUXyaWXXmpuEI/kb3/7m7kZu+OOOyQ7O9vcXE6ZMsU0z3PUnDVHc8rmTIMlvdlfuHChXHXVVTJixAhzs3/bbbfJvn375Mknn6y3/ffffy8fffSR/OEPfzA3ptrv7IILLpD09HRJSEhoslzadHLy5Mlmuz/+8Y8mUNGml1pL2JCu03M3atQoEzQEBATIq6++agLh7777TsaOHWu2u/baa00nfQ3uBg0aZM65lk9rG4877jhxhQYU2hxSz/l1111nmj9qs9Aff/zRfO7ONXl6k69B3vTp0+XXv/61KYN+btpsVMvtCE61vPv375ebbrrJBMh
6867nuaU06NDPSo9RP1v9rNevX28+o61bt5rgpaV+9atfSd++feXvf/97XQCtn6v+LWjTWL2B1+tSf2DQz/BISSP0b0tfq5/JQw89ZD4XR0DWkAZNGqDo83pd7Nq1S5599llZvXp13XnX9506dap07NhR7rzzTvPjhgYzeh0eje5bAy4N1LRWWl+r+9bgXwNHVz5TV7Vkn/p3osGqBo5fffVV3Y8irXmtA/ACFgB4oVdffVXvAq0ff/yxyW1iY2OtkSNH1j2+7777zGscnnzySfP4wIEDTe5D96/b6Ps1dNJJJ5nnXnzxxUaf08lh4cKFZtuuXbtahYWFdevff/99s/6f//xn3brU1FTr8ssvP+o+j1Q2fb3ux+GTTz4x2/71r3+tt90vf/lLy2azWdu3b69bp9uFhITUW7d27Vqz/plnnrGO5KmnnjLb6XE5lJSUWH369DHr9Twou91u9e3b15o2bZpZdigtLbV69uxpnXbaafU+x+uvv95q6TWya9cu8zg7O9sc19SpU62ampq67Z599lmz3SuvvHLYZ/vGG2/UrauoqLCSk5OtCy64oG7dP/7xD7Odnl+HsrIya8CAAfWO15XP9c0337QCAgKs7777rt52ep3pPn/44QfzWI+rqc9f1+v13vDav/jii+ttd/DgQbP+scces1w1YsQIq3PnzlZ+fn7dunnz5pn9OV97ehy6bvbs2fVeP3fu3HrrP/7446P+TTdG3z86OtoaN26cOffOnK+t5n6mDa+bhn/Dzp9pc/fpeO0HH3xgFRUVmdclJiZaq1evrvceLb3WAXgnmvYB8FnaVO9I2fv0V2v16aefmlqAltBaLP2Vvbkuu+wyU8Pj8Mtf/tI0pfriiy/EnXT/gYGBpjbAmTb10/tubRblTGtsevfuXfdYa+20udTOnTuP+j56PHpcDtrcSWtWnGkN3LZt20xtgf7qnpOTYyat4dEarW+//bbuM9HPSZsAanKNY6G//FdWVpqmU1r75aA1lnps//3vfw+7frSW0UH7vWktmfM50BoPbcqlNUgO2nStqVrQ5vjggw9MLdSAAQPqzotOWvOljqW2S2s8nGktqB6XNldr2MzuSLQGTj/Dyy+/3DRNc9BmaFqT0vB4dBt9zvl4tCZSz7HjeBx/j59//rlUVVU1uyxae6Z/51qLpefeWcOmvM35TF3lyj4LCgpMrdvmzZvNOdeaYWetda0D8A4EUgB8VnFxcb2gpaELL7xQJkyYIL/73e9Mkzxtnvf++++7FFTpTbQriSW0WVXDGz3tB+PusY60H482s2t4PvSG3fG8s5SUlMP2of1qjnazrfvR42l4A6v9sZxpEKX0RlybcjlPL7/8suk/ojedSvubbNiwwfQT0htUbZ7XkhtfxzE2LIt+fr169TrsHGgTtYbH0fAc6Gs04Gy4nZ6DltJzo03tGp6Xfv36mee1CVxLaZ+jhj8EPPLIIyaQ1r8B7eun51v7TR2J41w1vJ6b+qz1s9T+Pg2PSf9GHcej/Zu0qaA2vdQ+Utr0TZt6NtW/zmHHjh1m3pwxoprzmbrKlX1qEK/NSDWo12aIDbXWtQ7AO9BHCoBP0oQHevN2pBta/TVeaz70F3GtjdDahffee8/88q/9P7QG52hc6dfUXE0NGqyJKppTptbQ1Ps0TEzRUo5g9bHHHjvsV3nnX/qV9jvRPmCayEA/F32N3vxr35mW9mvxxDlo7ueq50b71zzxxBONbq832UfbnyvXq97cn3322abvlfab00Qt2udJ+7CNHDlSjpUejwZRs2fPbvR5DaiUY8DapUuXypw5c0xZtN/TP/7xD7POcT24+zN19by6cp1ocPjuu+/Kww8/LG+88Ua9mlFPXusA3INACoBP0gQHSjOAHYneyGhTMp30xlU74f/5z382wZU2b2vqpqqlHDUxzjdbmuBCm845/5rdMIucoxZAa04cXClbamqq+RVcm0A510ppEyPH861B96O/qOtxOZev4Zh
WjmaD2qROz/PRaHNBTXyhk9ZgaMd7Tdzhys2l4xi1LM7nUZv7afKD5pSjsX1qxraGx6ufaUPN/Vz13Kxdu9Zck0f6jHV/quE+G9asNYe+pzbz1EmvUQ1uNYBxznrZ2LlseD039Vnrtae1v8354eH44483k36+mrhjxowZJvjQmuOmyq70ujuWmkB3nNeGNJulNu3T7I36d6iJTtxxrQPwDjTtA+Bz9Jf0v/zlL6YZk96ENSUvL++wdY7aEUdzIsc4O43dALeE/grt3G9Lf4HX/ibON0l6Y6i/wOsNvoP2G2mYJt2Vsp155pnmF3XNlOZMM8HpzXpr3aTp+2j/Dj0u57G1XnrppXrbaf8YPU5Nv67Nuxo6cOCAmWuZHU38HLR2Q5spHq3JV0MaKGkzPs1A6Fxb8O9//9u8x1lnnSWu0kBdsx5+9tlndevKy8vNgMgNNfdz1VoJ3Wdj+9Bsb9qPzBGEahM4rVVtmN69ufSz0fI2LKfe5B/p/OrNvv6tvP766/U+H+2vpIFlw+PRz1H/Jhuqrq6uu361KVzDWpyGf4+N0cBEy6u1aA2PpSW1h47AzPm8avkbXsMtpf0k9Rp88cUXTXY/5/dorWsdgHegRgqAV9O+HVqrojdkWVlZJojSmzn9xVxvbht2Pnem6cP1ZklvoHV7/fVXb0K1z4NjjB29qdIO4HrTozdrGryMGzfusL4mzRUfH2/2rQkqtLya/lx/RXdOTqC/vGsgoimV9SZU+4BozYBz8gdXy6ZNt3ScG61t0/5Yw4cPN02HNNGGNu1quO+W0uPQYE1vFnV8Lb3h1tpBTTjRsCZQ+0JpAKd9RfR8aH8zDSC0NlCDBG3epUGnfh6avELLrM27tHZD+5lojYkrtAmZpsbWPjh6bjVBhNae6Geu6aedEwY0l6b11uO9+OKLTfpzPV5twua47pxrlJr7uWrqfu2rp4kh9FxoTY7eZOt1ruu1yZtjEGrdpzYT07mu0+tZU6Q3l26rNV9aHk0SoQNWa7MyvTa1z+CRaOCifzt6PWsTPP1h4plnnjGfp3NwrH2f9Dzp9pqgQgMfTXeutVmaiOKf//yn+Xw1KNPP4vzzzzfnRD97DSb1WtAAvSn6vP4goOdAP0fHOFlaq6eBou7XFVp+rRHTa0WPSf9mtUZMv2Nai6Y3LywsNH+PmohDh2pozWsdgJfwdNpAAGiMI0WxY9K01ppyWNNmaypx5xTjTaU/X7BggXXuuedaXbp0Ma/XuaaH3rp1a73Xffrpp9agQYOsoKCgeummNYXx4MGDGy1fU+nP33nnHWvWrFlWp06drPDwcOuss86ydu/efdjrNa22pkoPDQ21JkyYYK1YseKwfR6pbA3TnytNuzxz5kxznMHBwSb9uKa9dk4RrXQ/jaVgbip9d0N6POecc44VERFhUjzfdNNNdamunVNHK03/PH36dCshIcEcq77Hr3/9a/PZOFJJ33bbbdbw4cNNiuvIyEiz/Pzzzx+1HE2lsdZ055qeXM9BUlKSdd1115k04M6a+mwbO687d+40n6N+nh07drRuvfVW68MPPzTvvXTp0hZ9rpWVldYjjzxiyqDbdujQwRo1apT1wAMPWAUFBfXSxV911VUmbbaeHz13mua9qfTnDVP95+TkmM9az4eeW92PphF3Tl9/JHqcAwcONGXU6/Cjjz5q9Bypl156yRyDnict69ChQ63bb7/dysjIMM+vWrXK/P2lpKSY/enfyC9+8Qtzjprjs88+s0444QSz/5iYGGvs2LHm760ln+mOHTusKVOmmHLoNXLXXXdZ8+fPbzT9eXP26Zz+3Jkev67Xa/JYrnUA3smm/3g6mAMAwJdoTePMmTNN0hOtaQMAtD8EUgAAHIH2W3JOoqD9dDTbnTbHc6WZHQDAv9BHCgCAI5g+fboZd0sTI2iyAO33pP2Zmkr3DQBoHwikAAA4SuY+TZyhgZPWQmnSBk1OoAM+AwDaL5r2AQAAAICLGEcKAAAAAFx
EIAUAAAAALqKPlIjY7XbJyMgwA146D64IAAAAoH2xLMsMot2lSxczwHxTCKRETBDVvXt3TxcDAAAAgJfYs2ePdOvWrcnnCaRETE2U42TFxMR4ujgAAAAAPKSwsNBUsjhihKYQSGnqwkPN+TSIIpACAAAAYDtKlx+STQAAAACAiwikAAAAAMBFBFIAAAAA4CICKQAAAABwEYEUAAAAALiIQAoAAAAAXEQgBQAAAAAuIpACAAAAABcRSAEAAACAiwikAAAAAMBFBFIAAAAA4CICKQAAAABwEYEUAAAAALiIQAoAAAAAXEQgBQAAAAAuIpACAAAAABcRSAEAAACALwVSDz30kIwZM0aio6OlU6dOct5558mWLVvqbXPyySeLzWarN1177bX1tklPT5ezzjpLIiIizH5uu+02qa6ubuOjAQAAANBeBHnyzRctWiTXX3+9CaY08Lnrrrtk6tSpsmnTJomMjKzb7uqrr5YHH3yw7rEGTA41NTUmiEpOTpbFixfL/v375bLLLpPg4GD5+9//3ubHBAAA4Av0h+icnBy37T8xMVFSUlLctn/A02yWZVniJQ4cOGBqlDTAOvHEE+tqpEaMGCFPPfVUo6/58ssv5Re/+IVkZGRIUlKSWffiiy/KHXfcYfYXEhJy1PctLCyU2NhYKSgokJiYmFY+KgAAAO8LogYMHChlpaVue4/wiAjZnJZGMAWf09zYwKM1Ug1pYVV8fHy99bNnz5a33nrL1DqdffbZcs8999TVSi1ZskSGDh1aF0SpadOmyXXXXScbN26UkSNHHvY+FRUVZnI+WQAAAO2F1kRpEDXjjsckKaV3q+8/K32HzH7kNvM+BFLwV14TSNntdrn55ptlwoQJMmTIkLr1l1xyiaSmpkqXLl1k3bp1pqZJ+1F99NFH5vnMzMx6QZRyPNbnmuqb9cADD7j1eAAAALydBlHd+g72dDEAn+Q1gZT2ldqwYYN8//339dZfc801dcta89S5c2eZPHmy7NixQ3r3btkvKLNmzZJbbrmlXo1U9+7dj6H0AAAAANoTr0h/fsMNN8jnn38uCxculG7duh1x23Hjxpn59u3bzVyb+2VlZdXbxvFYn2tMaGioae/oPAEAAACATwRSmudCg6iPP/5Yvv76a+nZs+dRX7NmzRoz15opNX78eFm/fr1kZ2fXbTN//nwTHA0aNMiNpQcAAADQXgV5ujnf22+/LZ9++qkZS8rRp0mzZISHh5vme/r8mWeeKQkJCaaP1MyZM01Gv2HDhpltNV26Bky/+c1v5NFHHzX7uPvuu82+teYJAAAAAPyqRuqFF14wmfo0xbnWMDmm9957zzyvqcu/+uorEywNGDBAbr31Vrngggtkzpw5dfsIDAw0zQJ1rrVTl156qRlHynncKQAAAADwmxqpow1hpQkgdEypo9Gsfl988UUrlgwAAAAAvDzZBAAAAAD4EgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAAAAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAAAAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAAAAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAAAAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAAAAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAAAAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAA
AAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHARgRQAAAAAuIhACgAAAABcRCAFAAAAAC4ikAIAAAAAFxFIAQAAAICLCKQAAAAAwEUEUgAAAADgIgIpAAAAAHBRkKsvAAAAAJojLS3NLftNTEyUlJQUt+wbaC4CKQAAALSqwrwDZn7ppZe6Zf/hERGyOS2NYAoeRSAFAACAVlVWXGjmZ/3+z9J/2KhW3XdW+g6Z/chtkpOTQyAFjyKQAgAAgFskdEmVbn0He7oYgFuQbAIAAAAAXEQgBQAAAAAuIpACAAAAABcRSAEAAACAiwikAAAAAMBFBFIAAAAA4EuB1EMPPSRjxoyR6Oho6dSpk5x33nmyZcuWetuUl5fL9ddfLwkJCRIVFSUXXHCBZGVl1dsmPT1dzjrrLImIiDD7ue2226S6urqNjwYAAABAe+HRQGrRokUmSFq6dKnMnz9fqqqqZOrUqVJSUlK3zcyZM2XOnDnywQcfmO0zMjJk+vTpdc/X1NSYIKqyslIWL14sr7/+urz22mty7733euioAAAAAPg7jw7IO3fu3HqPNQDSGqWVK1fKiSeeKAUFBfLvf/9b3n77bTn11FPNNq+++qoMHDjQBF/HH3+8zJs3TzZt2iRfffWVJCUlyYgRI+Qvf/mL3HHHHXL//fdLSEiIh44OAAAAgL/yqj5SGjip+Ph4M9eASmuppkyZUrfNgAEDJCUlRZYsWWIe63zo0KEmiHKYNm2aFBYWysaNG9v8GAAAAAD4P4/WSDmz2+1y8803y4QJE2TIkCFmXWZmpqlRiouLq7etBk36nGMb5yDK8bzjucZUVFSYyUGDLgAAAADwuRop7Su1YcMGeffdd9skyUVsbGzd1L17d7e/JwAAAAD/4RWB1A033CCff/65LFy4ULp161a3Pjk52SSRyM/Pr7e9Zu3T5xzbNMzi53js2KahWbNmmWaEjmnPnj1uOCoAAAAA/sqjgZRlWSaI+vjjj+Xrr7+Wnj171nt+1KhREhwcLAsWLKhbp+nRNd35+PHjzWOdr1+/XrKzs+u20QyAMTExMmjQoEbfNzQ01DzvPAEAAACAT/SR0uZ8mpHv008/NWNJOfo0aXO78PBwM7/qqqvklltuMQkoNOC58cYbTfCkGfuUpkvXgOk3v/mNPProo2Yfd999t9m3BkwAAAAA4FeB1AsvvGDmJ598cr31muL8iiuuMMtPPvmkBAQEmIF4NUGEZuR7/vnn67YNDAw0zQKvu+46E2BFRkbK5ZdfLg8++GAbHw0AAACA9iLI0037jiYsLEyee+45MzUlNTVVvvjii1YuHQAAAAB4cbIJAAAAAPAlBFIAAAAA4CICKQAAAABwEYEUAAAAALiIQAoAAAAAXEQgBQAAAAAuIpACAAAAABcRSAEAAACAiwikAAAAAMBFBFIAAAAA4CICKQAAAABwEYEUAAAAALiIQAoAAAAAXEQgBQAAAAAuIpACAAAAABcRSAEAAACAiwikAAAAAMBFQa6+AAAAAGhMfmmlrErPl2xJlujR50heTaiUVlZLRAi3nPA/XNUAAAA4JhXVNbJ8V56s2ZMvdkvXdJD4ydfI+gqRDd/vkol9EmVk9zix2WyeLirQamjaBwAAgBbbeaBYXl+829REaRCVGh8h3SVHSrb8IOG2arEske+25ci323LErg8AP0EgBQAAgBY5UFQhX2zIlLKqGukQESznDO8i547oIqlyQHI+eUjGhGXLpD6JZlutrfpyfaZU19g9XWygVdC0DwAAAC1qzvff9fulxm5JakKEnD2siwQG1G+6py35RqZ2kKiwIJm3MUu2HyiWinU1cv6IrjTzg8+jRgoAAAAusSxL5m/KkoKyKokOC5LTBycfFkQ565cULeeP7CrBgTbZk1cm6/YWtGl5AXcgkAIAAIBLVu/Jlx0HSiTQZpMzh3aWsODAo76ma4dwmXComd8PO3KksKyqDUoKuA+BFAA
", - "text/plain": [ - "
" - ] - } - ], - "source": [ - "import matplotlib.pyplot as plt\n", - "import seaborn as sns\n", - "\n", - "plt.figure(figsize=(10, 6))\n", - "sns.histplot([len(text.page_content) for text in texts], bins=30, kde=True)\n", - "plt.title(\"Distribution of chunk lengths\")\n", - "plt.xlabel(\"Chunk length (in characters)\")\n", - "plt.ylabel(\"Number of chunks\")\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "43bf41cd", - "metadata": {}, - "source": [ - "We can see chunks with very few characters. Inspect the contents of the documents shorter than 100 characters and note the possible improvements." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "8d300959", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "INTRODUCTION AU MACHINE LEARNING\n", - "2022-2026\n", - "Théo Lopès-Quintas\n", - "------------------------------\n", - "vue un peu plus complète du domaine, ainsi qu’un aperçu plus récent des développements en cours.\n", - "2\n", - "------------------------------\n", - "3. À condition que l’algorithme soit performant.\n", - "7\n", - "------------------------------\n", - "Pour essayer de comprendre ce passage, faisons un exercice :\n", - "4. 
Voir l’équation (2.3).\n", - "8\n", - "------------------------------\n", - "données avec lesquelles on mesure notre performance.\n", - "10\n", - "------------------------------\n", - "le résultat, on peut vérifier la cohérence de la formule avec un exercice.\n", - "15\n", - "------------------------------\n", - "Quel est l’intérêt d’ajouter cette pénalité en terme de biais et variance?\n", - "19\n", - "------------------------------\n", - "Figure 2.3– Simulation d’un mini-jeu de basketball\n", - "20\n", - "------------------------------\n", - "L(θ;x, y) =−\n", - "h\n", - "yln{f θ(x)} + (1−y) ln{1−f θ(x)}\n", - "i\n", - "Observation positive\n", - "Observation négative\n", - "24\n", - "------------------------------\n", - "28\n", - "------------------------------\n", - "L’idée est de partitionner l’espace engendré parD, dont voici la procédure à chaque étape :\n", - "33\n", - "------------------------------\n", - "définir ce que l’on appelle intuitivementla meilleure coupure.\n", - "34\n", - "------------------------------\n", - "Devant cet exemple jouet, on peut imaginer une situation plus proche de la réalité :\n", - "37\n", - "------------------------------\n", - "Pour saisir l’intérêt de la proposition, résolvons l’exercice suivant.\n", - "38\n", - "------------------------------\n", - "40\n", - "------------------------------\n", - "des champions.\n", - "41\n", - "------------------------------\n", - "42\n", - "------------------------------\n", - "•On définit la fonction de perteL:Y × Y. Par exempleL(y, f(x)) = (y−f(x))2.\n", - "44\n", - "------------------------------\n", - "fm(x) =f m−1(x)−γ\n", - "nX\n", - "i=1\n", - "∂C\n", - "∂fm−1\n", - "\u0000\n", - "x(i)\u0001\n", - "\u0010\n", - "yi, fm−1\n", - "\u0010\n", - "x(i)\n", - "\u0011\u0011\n", - "=f m−1(x) +γ ′hm(x)\n", - "45\n", - "------------------------------\n", - "i (xi −µ k)\n", - "2. 
Conclure sur la convergence deJ.\n", - "53\n", - "------------------------------\n", - "pour amener le clustering vers sa meilleure version.\n", - "62\n", - "------------------------------\n", - "3. Que nous ne démontrerons pas\n", - "68\n", - "------------------------------\n", - "6. Puisqu’on peut normaliser la distance par rapport au voisin le plus éloigné.\n", - "71\n", - "------------------------------\n", - "6. Voir le space Tokenizer Playground sur Hugging Face\n", - "81\n", - "------------------------------\n", - "7. Plus précisément, les langues considérées sont essentiellement latine ou cyrillique.\n", - "83\n", - "------------------------------\n", - "84\n", - "------------------------------\n", - "100002×3/4\n", - "\u0013\u0013\n", - "= (0.84,0.99,0.01,0.99)\n", - "9. À notre connaissance.\n", - "85\n", - "------------------------------\n", - "physique. Étudions plus en détail les possibilités de cette fonction d’activation.\n", - "86\n", - "------------------------------\n", - "10. Pour plus de détails, voir la section (G.1)\n", - "89\n", - "------------------------------\n", - "11. Dépendant donc de la méthode de tokenization et de la taille du vocabulaire.\n", - "90\n", - "------------------------------\n", - "Appendices\n", - "94\n", - "------------------------------\n", - "95\n", - "------------------------------\n", - "donner. Il nous faudrait une caractérisation plus simple d’utilisation :\n", - "96\n", - "------------------------------\n", - "existe deux minimaux globaux et on aboutit à une absurdité en exploitant la stricte convexité.\n", - "99\n", - "------------------------------\n", - "∥xi∥. 
Alors lak-ième erreur de classification du perceptron aura lieu avant :\n", - "k⩽\n", - "\u0012R\n", - "γ\n", - "\u00132\n", - "∥w∗∥2\n", - "104\n", - "------------------------------\n", - "P({y=k}) × P\n", - "\n", - "\n", - "d\\\n", - "j=1\n", - "xj | {y=k}\n", - "\n", - "\n", - "P\n", - "\n", - "\n", - "d\\\n", - "j=1\n", - "xj\n", - "\n", - "\n", - "(C.1)\n", - "110\n", - "------------------------------\n", - "exploratoire et d’augmentation des données pour répondre à un problème de Machine Learning.\n", - "114\n", - "------------------------------\n", - "115\n", - "------------------------------\n", - "résoudre le problème, mais ne le résolvent clairement pas par construction.\n", - "116\n", - "------------------------------\n", - "époque il y avait également Yann Le Cun, à la tête de la recherche chez Meta.\n", - "117\n", - "------------------------------\n", - "119\n", - "------------------------------\n", - "2.Kernelen allemand.\n", - "126\n", - "------------------------------\n", - "3. Pour ça on utilise le théorème de Mercer, mais nous ne le présenterons pas ici.\n", - "127\n", - "------------------------------\n", - "s’améliore! Deux phénomènes contre-intuitifs se réalisent :\n", - "133\n", - "------------------------------\n", - "with categorical features support.arXiv preprint arXiv :1810.11363, 2018.\n", - "137\n", - "------------------------------\n" - ] - } - ], - "source": [ - "for doc in texts:\n", - "    if len(doc.page_content) < 100:\n", - "        print(doc.page_content)\n", - "        print(\"-\" * 30)" - ] - }, - { - "cell_type": "markdown", - "id": "f69b2033", - "metadata": {}, - "source": [ - "We now have a set of chunks; it remains to build the embedding to store all this information. 
We make the following choices:\n", - "* We will use the [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) embedding model for its small size and its training targeted at our task\n", - "* We will use the [FAISS](https://python.langchain.com/docs/integrations/vectorstores/faiss/) *vector store*, since we covered it in class.\n", - "* We will retrieve the three closest chunks, to start with" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "40021b12", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "\n", - "from langchain_community.vectorstores import FAISS\n", - "from langchain_huggingface import HuggingFaceEmbeddings\n", - "\n", - "os.environ[\"USE_TF\"] = \"false\"\n", - "os.environ[\"USE_TORCH\"] = \"true\"\n", - "os.environ[\"TF_CPP_MIN_LOG_LEVEL\"] = \"3\"\n", - "\n", - "\n", - "embedding_model = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n", - "vectordb = FAISS.from_documents(texts, embedding_model)\n", - "n_doc_to_retrieve = 3\n", - "retriever = vectordb.as_retriever(search_kwargs={\"k\": n_doc_to_retrieve})" - ] - }, - { - "cell_type": "markdown", - "id": "ed148169", - "metadata": {}, - "source": [ - "Our knowledge base is built! Let us now move on to augmenting the language model.\n", - "\n", - "## Generation\n", - "\n", - "For this step, it remains to define the language model and how we will address it.\n", - "\n", - "**Instruction**: Define the variable *model* from the [OllamaLLM](https://python.langchain.com/api_reference/ollama/llms/langchain_ollama.llms.OllamaLLM.html#ollamallm) class and the model of your choice." 
- ] - }, - { - "cell_type": "code", - "execution_count": 8, - "id": "4abfbda6", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_ollama import OllamaLLM\n", - "\n", - "model = OllamaLLM(model=\"gemma3:4b\", base_url=\"http://localhost:11434\")" - ] - }, - { - "cell_type": "markdown", - "id": "d42c7f56", - "metadata": {}, - "source": [ - "**Instruction**: Using the [PromptTemplate](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.prompt.PromptTemplate.html#langchain_core.prompts.prompt.PromptTemplate) class, and possibly drawing on [this example](https://smith.langchain.com/hub/rlm/rag-prompt), define a prompt template with two *input_variables*: 'context' and 'question'." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2c3c7729", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_core.prompts import PromptTemplate\n", - "\n", - "prompt_template = PromptTemplate(\n", - "    template=\"\"\"You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.\n", - "Question: {question}\n", - "Context: {context}\n", - "Answer:\"\"\",\n", - "    input_variables=[\"context\", \"question\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0da52ea4", - "metadata": {}, - "source": [ - "To build the RAG chain, LangChain uses the [LangChain Expression Language (LCEL)](https://python.langchain.com/v0.2/docs/concepts/#langchain-expression-language-lcel); here is how it looks in our case:" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "id": "c51afe07", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_core.output_parsers import StrOutputParser\n", - "from langchain_core.runnables import RunnablePassthrough\n", - "\n", - "\n", - "def format_docs(docs: list) -> str:\n", - "    \"\"\"Format documents into a single string.\"\"\"\n", - "    return \"\\n\\n\".join(doc.page_content for doc in docs)\n", - "\n", - "\n", - "rag_chain = (\n", - "    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", - "    | prompt_template\n", - "    | model\n", - "    | StrOutputParser()\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "7db86940", - "metadata": {}, - "source": [ - "Once the chain is defined, we can ask it questions:" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "id": "02444b65", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Answer: Alan Turing’s famous quote is, “We can only have glimpses of the future, but that is enough to make us realize how much there is to do.” This quote came from his 1950 article, Computing Machinery and Intelligence. 
It remains relevant today as concerns about security and toxicity increase.\n" - ] - } - ], - "source": [ - "query = \"Quelle est la citation d'Alan Turing ?\"\n", - "result = rag_chain.invoke(query)\n", - "print(\"Answer:\", result)" - ] - }, - { - "cell_type": "markdown", - "id": "3ffe0531", - "metadata": {}, - "source": [ - "LangChain does not natively show which chunks were used to produce the answer, nor the similarity scores. To get them, we will query FAISS directly.\n", - "\n", - "**Instruction**: Using the [`similarity_search_with_score`](https://python.langchain.com/v0.2/docs/integrations/vectorstores/llm_rails/#similarity-search-with-score) method of `FAISS`, display the three documents used in the RAG." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "95d81fe2", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "6aeeadf8", - "metadata": {}, - "source": [ - "We have now properly defined our first RAG!\n", - "\n", - "## Improving our RAG\n", - "\n", - "But we can do better, notably by showing the sources in the generated answer so that the user can verify them, and by measuring the performance of our RAG. Once these two improvements are in place, we will be able to change several specific technical points and measure the performance gains.\n", - "\n", - "### Exploiting the metadata\n", - "\n", - "We used the `PyPDFLoader` class, which loads each page into a document. We have made heavy use of the *page_content* attribute, but the *metadata* attribute contains two pieces of information of interest to us: *source* and *page*.\n", - "\n", - "**Instruction**: Modify the `format_docs` function so that it takes a list of LangChain documents as a parameter and includes the source and page in addition to the text content alone." 
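One possible answer sketch for the instruction above (the bracketed header format is our own choice; we only assume documents exposing `page_content` and a `metadata` dict with `source` and `page`, as `PyPDFLoader` provides):

```python
def format_docs(docs) -> str:
    """Join chunks, prefixing each one with a [source:page] header."""
    return "\n\n".join(
        f"[{doc.metadata.get('source', 'unknown')}:{doc.metadata.get('page', '?')}]\n"
        f"{doc.page_content}"
        for doc in docs
    )
```

The header travels with each chunk, so the model can quote it back when asked to cite its sources.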
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "cae9a90c", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "0363d832", - "metadata": {}, - "source": [ - "Now that we pass metadata along, we must make sure that the language model actually uses it.\n", - "\n", - "**Instruction**: Modify the prompt template defined earlier to incorporate this rule." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a57e10a6", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "260f39f4", - "metadata": {}, - "source": [ - "Let us now test the same question on a new RAG chain that includes our improvements.\n", - "\n", - "**Instruction**: Define a new RAG chain that uses the metadata information, then ask the same question." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b3824802", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "973dfa8d", - "metadata": {}, - "source": [ - "This is what we wanted to obtain! But we could use a more structured, less free-form format. To that end, we will modify our system so that it returns JSON!\n", - "Let us start by modifying the prompt template to give it the instructions:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d4892e8d", - "metadata": {}, - "outputs": [], - "source": [ - "prompt_template = PromptTemplate(\n", - "    template=\"\"\"\n", - "    You are an assistant for question-answering tasks, use the retrieved context to answer the question. 
Each piece of context includes metadata (source + page).\n", - "    If you don’t know the answer, respond with: {{\"answer\": \"I don't know\", \"sources\": []}}\n", - "    Otherwise, return your answer in JSON with this exact structure:\n", - "    {{\n", - "        \"answer\": \"your answer here\",\n", - "        \"sources\": [\"source:page\", \"source:page\"]\n", - "    }}\n", - "    Rules:\n", - "    - Answer in the same language as the question.\n", - "    - Always include the sources (source:page).\n", - "    - Never add extra fields.\n", - "\n", - "    Question: {question}\n", - "    Context:\\n{context}\n", - "    Answer:\n", - "    \"\"\",\n", - "    input_variables=[\"context\", \"question\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "01e34935", - "metadata": {}, - "source": [ - "Since we ask it to answer with, for example, [\"ML.pdf:91\"], we will make its task easier by modifying the `format_docs` function.\n", - "\n", - "**Instruction**: Modify the `format_docs` function to take the 'source:page' formatting into account." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "547f6ea2", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "0238f9f6", - "metadata": {}, - "source": [ - "If we want a JSON object, i.e. a dictionary, as the model output, we must modify the RAG chain defined earlier.\n", - "\n", - "**Instruction**: Use [`JsonOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html) in place of [`StrOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.string.StrOutputParser.html#langchain_core.output_parsers.string.StrOutputParser), then test the new RAG chain with the same question." 
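`JsonOutputParser` handles the parsing for us, including answers a model wraps in markdown fences. As a rough illustration of the kind of cleanup involved (a hand-rolled sketch, not LangChain's actual implementation):

```python
import json

FENCE = "`" * 3  # a literal markdown code fence


def parse_json_answer(raw: str) -> dict:
    """Parse a model reply expected to be JSON, tolerating markdown fences."""
    cleaned = raw.strip()
    if cleaned.startswith(FENCE):
        cleaned = cleaned.split("\n", 1)[1]  # drop the opening fence line
        cleaned = cleaned.removesuffix(FENCE).strip()
    return json.loads(cleaned)
```

With a parser like this (or `JsonOutputParser`) at the end of the chain, downstream code can rely on `result["answer"]` and `result["sources"]` instead of scraping free text.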
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c0f90db7", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "3db037d1", - "metadata": {}, - "source": [ - "Better! It now remains to measure the performance of our system.\n", - "\n", - "\n", - "### Measuring performance\n", - "\n", - "We manually defined, in the JSON file *eval_dataset*, several questions whose answers are contained in the course material." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d4398984", - "metadata": {}, - "outputs": [], - "source": [ - "import json\n", - "\n", - "with open(\"eval_dataset.json\", encoding=\"utf-8\") as file:\n", - "    eval_dataset = json.load(file)\n", - "\n", - "print(eval_dataset[0])" - ] - }, - { - "cell_type": "markdown", - "id": "37b8eb75", - "metadata": {}, - "source": [ - "It will probably be difficult to measure performance by direct string comparison. We therefore opt for an *LLM as a Judge* methodology.\n", - "\n", - "**Instruction**: Define a function `evaluate_rag` that takes as parameters a RAG chain and an evaluation dataset. The function returns a list of dictionaries with the keys:\n", - "* *question*: the question asked\n", - "* *expected_answer*: the expected answer\n", - "* *predicted_answer*: the answer obtained\n", - "* *expected_sources*: the expected source(s)\n", - "* *predicted_sources*: the source(s) obtained" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4a3a70a4", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "da59e623", - "metadata": {}, - "source": [ - "**Instruction**: Test the previous function on the first three questions, then display the result as a pandas dataframe." 
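A possible shape for `evaluate_rag` (a sketch: we only assume the chain exposes `.invoke` and returns a dict, and that dataset entries carry `question`, `answer` and `sources` keys, as the async version at the end of the notebook also assumes):

```python
def evaluate_rag(rag_chain, dataset):
    """Run the RAG chain on each question and pair predictions with references."""
    results = []
    for example in dataset:
        prediction = rag_chain.invoke(example["question"])
        results.append(
            {
                "question": example["question"],
                "expected_answer": example["answer"],
                "predicted_answer": prediction.get("answer", ""),
                "expected_sources": example["sources"],
                "predicted_sources": prediction.get("sources", []),
            }
        )
    return results
```

Wrapping the returned list in `pd.DataFrame(results)` gives the tabular view requested above.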
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a33db551", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "14393690", - "metadata": {}, - "source": [ - "We are now able to obtain a set of answers from a model with a RAG; it remains to set up the judge.\n", - "\n", - "**Instruction**: Define a prompt describing the judge's role." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a9eacd88", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "bc714900", - "metadata": {}, - "source": [ - "**Instruction**: Define a chain for the judge, in the same way as for the RAG: prompt --> model --> JSONParser" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b3c30cc3", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "6069627d", - "metadata": {}, - "source": [ - "**Instruction**: Modify the `evaluate_rag` function so that it scores the model's performance directly and returns the results as a pandas dataframe. We will also implement timing measurements for the RAG and the judge, as well as *try...except...* blocks so that a single failing request does not block the execution of all the others.\n", - "To follow the progress of the evaluation, we will use the tqdm progress bar." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0556cbed", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "73d842ea", - "metadata": {}, - "source": [ - "**Instruction**: Use this function on the first three questions of the evaluation dataset." 
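For the judge defined above, one possible prompt sketch (the wording is our own; the output keys `answer_correct`, `sources_correct` and `explanation` mirror those consumed by the async evaluator at the end of the notebook):

```python
# Hypothetical judge prompt; braces are doubled where literal JSON is intended.
JUDGE_TEMPLATE = """You are a strict evaluator of a question-answering system.
Compare the predicted answer to the expected answer, and the predicted sources
to the expected sources. Accept paraphrases that carry the same meaning.

Question: {question}
Expected answer: {expected_answer}
Predicted answer: {predicted_answer}
Expected sources: {expected_sources}
Predicted sources: {predicted_sources}

Return JSON only, with this exact structure:
{{"answer_correct": true, "sources_correct": true, "explanation": "one short sentence"}}"""
```

Wrapped in a `PromptTemplate` whose input variables match the five placeholders, this slots into the judge chain as prompt --> model --> JSONParser.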
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "afad101d", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "91231c6d", - "metadata": {}, - "source": [ - "**Instruction**: From the previous results, report performance statistics for the model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "59d821db", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "289c97f8", - "metadata": {}, - "source": [ - "## Going further\n", - "\n", - "There are several avenues for improvement, non-exhaustively:\n", - "* Better text extraction from the PDF: for example using [Docling](https://python.langchain.com/docs/integrations/document_loaders/docling/)\n", - "* A better way to split the text into *chunks*: for example using [RecursiveCharacterTextSplitter](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html#recursivecharactertextsplitter), or changing the chunk size...\n", - "* A better embedding model: see the embedding [leaderboard](https://huggingface.co/spaces/mteb/leaderboard)\n", - "* Better retrieval: a better search method, for example [MMR](https://python.langchain.com/v0.2/docs/how_to/example_selectors_mmr/)\n", - "* Better prompts\n", - "* A better performance measure: more questions, for example\n", - "\n", - "We encourage students to test whichever improvements they wish, and above all to measure each contribution separately. 
We also encourage using your own documents and your own benchmark.\n", - "To speed up the evaluation a little more, here is an asynchronous version of the evaluation function:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7ae5fd5d", - "metadata": {}, - "outputs": [], - "source": [ - "import asyncio\n", - "import time\n", - "\n", - "import pandas as pd\n", - "from tqdm.asyncio import tqdm_asyncio\n", - "\n", - "\n", - "async def evaluate_rag_async(rag_chain, dataset, judge_chain, max_concurrency=5):\n", - "    \"\"\"Async evaluation of a RAG chain against a dataset using a judge LLM.\"\"\"\n", - "    results = []\n", - "    semaphore = asyncio.Semaphore(max_concurrency)\n", - "\n", - "    async def process_example(example):\n", - "        async with semaphore:\n", - "            rag_start = time.time()\n", - "            try:\n", - "                prediction = await rag_chain.ainvoke(example[\"question\"])\n", - "            except Exception as e:\n", - "                prediction = {\"answer\": \"\", \"sources\": []}\n", - "                print(f\"[RAG ERROR] Question: {example['question']} | {e}\")\n", - "            rag_end = time.time()\n", - "\n", - "            judge_input = {\n", - "                \"question\": example[\"question\"],\n", - "                \"expected_answer\": example[\"answer\"],\n", - "                \"predicted_answer\": prediction.get(\"answer\", \"\"),\n", - "                \"expected_sources\": example[\"sources\"],\n", - "                \"predicted_sources\": prediction.get(\"sources\", []),\n", - "            }\n", - "\n", - "            judge_start = time.time()\n", - "            try:\n", - "                judgment = await judge_chain.ainvoke(judge_input)\n", - "            except Exception as e:\n", - "                judgment = {\n", - "                    \"answer_correct\": False,\n", - "                    \"sources_correct\": False,\n", - "                    \"explanation\": f\"Judge error: {e}\",\n", - "                }\n", - "                print(f\"[JUDGE ERROR] Question: {example['question']} | {e}\")\n", - "            judge_end = time.time()\n", - "\n", - "            results.append(\n", - "                {\n", - "                    **judge_input,\n", - "                    **judgment,\n", - "                    \"rag_time\": rag_end - rag_start,\n", - "                    \"judge_time\": judge_end - judge_start,\n", - "                    \"total_time\": judge_end - rag_start,\n", - "                
},\n", - " )\n", - "\n", - " tasks = [process_example(example) for example in dataset]\n", - " for f in tqdm_asyncio.as_completed(\n", - " tasks, desc=\"Evaluating RAG\", total=len(dataset),\n", - " ):\n", - " await f\n", - "\n", - " return pd.DataFrame(results)\n" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "studies", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/pyproject.toml b/pyproject.toml index f024f7c..de92216 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -13,6 +13,7 @@ dependencies = [ "langchain-community>=0.4.1", "langchain-huggingface>=1.2.0", "langchain-ollama>=1.0.1", + "langchain-text-splitters>=1.1.0", "matplotlib>=3.10.1", "nbformat>=5.10.4", "numpy>=2.2.5", diff --git a/uv.lock b/uv.lock index e107bbe..532623f 100644 --- a/uv.lock +++ b/uv.lock @@ -3503,6 +3503,7 @@ dependencies = [ { name = "langchain-community" }, { name = "langchain-huggingface" }, { name = "langchain-ollama" }, + { name = "langchain-text-splitters" }, { name = "matplotlib" }, { name = "nbformat" }, { name = "numpy" }, @@ -3539,6 +3540,7 @@ requires-dist = [ { name = "langchain-community", specifier = ">=0.4.1" }, { name = "langchain-huggingface", specifier = ">=1.2.0" }, { name = "langchain-ollama", specifier = ">=1.0.1" }, + { name = "langchain-text-splitters", specifier = ">=1.1.0" }, { name = "matplotlib", specifier = ">=3.10.1" }, { name = "nbformat", specifier = ">=5.10.4" }, { name = "numpy", specifier = ">=2.2.5" },