Compare commits

...

271 Commits

Author SHA1 Message Date
b8b0024852 Merge branch 'master' of https://github.com/ArthurDanjou/studies 2026-01-13 10:36:14 +01:00
3e1ac18acd Add Lab 4 2026-01-13 10:36:09 +01:00
77feb27b97 Add TP2 2026-01-13 10:36:04 +01:00
bcb8c66a9d add new dependencies 2026-01-13 10:35:56 +01:00
03bc530c3a edit .gitignore 2026-01-13 10:35:41 +01:00
27fd147d0f Implement feature X to enhance user experience and fix bug Y in module Z 2026-01-12 12:54:32 +01:00
56fdd5da45 Add "polars" dependency version 1.37.0 to pyproject.toml and uv.lock 2026-01-12 10:59:50 +01:00
3e6b2e313a Add langchain-text-splitters dependency to pyproject.toml and uv.lock
- Updated pyproject.toml to include langchain-text-splitters version >=1.1.0 in dependencies.
- Modified uv.lock to add langchain-text-splitters in both dependencies and requires-dist sections.
2026-01-12 10:48:31 +01:00
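For context, a minimal sketch of what the langchain-text-splitters dependency provides (chunk sizes and input text are illustrative, not taken from the repo):

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
chunks = splitter.split_text("A long document to cut into overlapping chunks. " * 40)
print(len(chunks), chunks[0][:50])  # number of chunks and a preview of the first
```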
346695212d Add files for computing partial dependence plots: add TP1.ipynb and data/data_pdp.xlsx; update dependencies in pyproject.toml and uv.lock 2026-01-12 10:37:11 +01:00
8e7bbc1fe9 Implement feature X to enhance user experience and optimize performance 2026-01-12 10:37:04 +01:00
c8c1bf4807 Add "Clustering In Practice" section: add Encoding.Rmd and data/chiffres.csv; update README 2026-01-08 13:44:01 +01:00
2e2500b509 Update execution counts and runtime metrics in the Maze Game notebook for consistency and accuracy 2026-01-06 13:09:06 +01:00
5f5bd609d7 Remove unnecessary newline in policy comparison output for clarity in Lab 3 notebook 2026-01-06 13:09:02 +01:00
e56fd6f2af Implement feature X to enhance user experience and optimize performance 2026-01-06 12:32:09 +01:00
0e65815e38 Fix execution counts and update policy array initialization in maze game notebooks 2026-01-06 11:13:17 +01:00
6eecdd6ab3 Update Python version and refine Jupyter Notebook formatting
- Bump Python version from 3.11 to 3.13 in .python-version file.
- Reset execution counts to null in Jupyter Notebook for reproducibility.
- Improve code readability by adjusting comments and formatting in the notebook.
- Change the policy definition to use numpy.ndarray for better clarity.
- Modify pyproject.toml to enable E501 rule for line length management.
2026-01-06 11:07:31 +01:00
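A hypothetical sketch of the numpy.ndarray policy representation mentioned in the entry above (shapes and values are assumptions, not taken from the notebook):

```python
import numpy as np

n_states, n_actions = 16, 4  # assumed maze dimensions
# policy[s, a] = probability of taking action a in state s
policy: np.ndarray = np.full((n_states, n_actions), 1.0 / n_actions)
assert np.allclose(policy.sum(axis=1), 1.0)  # each row is a distribution
```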
06bc1f28a9 Refactor code structure for improved readability and maintainability 2026-01-06 10:22:42 +01:00
c5f60472fb Update .gitignore to exclude PDF files and add several PDF reports for the M1 and M2 projects. 2026-01-05 17:51:46 +01:00
0cb4dd4c57 Remove extra blank lines in M2/Statistiques Non Paramétrique/TP1.Rmd 2026-01-05 16:57:22 +01:00
98807a1b63 README: add "Statistiques Non Paramétrique" to project list 2026-01-05 16:54:33 +01:00
156411965d Add TP1.Rmd for Statistiques Non Paramétrique (M2) 2026-01-05 16:53:31 +01:00
fd775d1251 NoticeTechnique.Rmd: italicize report title, correct Global Tuberculosis Report year to 2024, assign k-means variance to var_totale, and switch reactive dataset from tb_final to tb_clustered. 2026-01-04 18:01:01 +01:00
2824a9aed1 Revise the title and research question to emphasize multivariate analysis and operational typology (NoticeTechnique.Rmd) 2026-01-04 17:32:17 +01:00
acf1aa82c4 Increase Shiny slider animation interval from 3000ms to 5000ms 2026-01-04 17:32:05 +01:00
f326ca42e0 Add NoticeTechnique.Rmd and app.R to M2/Data Visualisation project 2026-01-04 17:25:51 +01:00
5d01240748 Add initial project setup files including .Rprofile, renv.lock, logo, and data files 2026-01-03 22:25:10 +01:00
7e62eaeb04 Update .gitignore to include additional files and directories for exclusion 2026-01-03 22:24:59 +01:00
9e28765022 Update .gitignore 2026-01-03 22:23:56 +01:00
9b0b24bc8b Implement feature X to enhance user experience and optimize performance 2025-12-24 22:27:05 +01:00
bcac5764f6 Refactor error messages and function signatures across multiple notebooks for clarity and consistency
- Updated error messages in Gauss method and numerical methods to use variables for better readability.
- Added return type hints to function signatures in various notebooks to improve code documentation.
- Corrected minor grammatical issues in docstrings for better clarity.
- Adjusted print statements and list concatenations for improved output formatting.
- Enhanced plotting functions to ensure consistent figure handling.
2025-12-24 22:26:59 +01:00
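The error-message refactor described above matches the pattern visible in the .ipynb diff further down (assign the message to a variable, then raise), likely to satisfy a linting rule such as Ruff's EM101; a minimal sketch:

```python
import numpy as np

def check_square(a: np.ndarray) -> None:
    n, m = a.shape
    if n != m:
        # message in a variable first, then raised — the refactored pattern
        msg = "Erreur de dimension : A doit etre carré"
        raise ValueError(msg)
```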
1141382c81 style: add a space after the comma in the dependency list 2025-12-13 23:42:15 +01:00
3cb05d3210 Update Python version in Jupyter notebooks to 3.13.9 across multiple files 2025-12-13 23:38:27 +01:00
d5a6bfd339 Refactor code for improved readability and consistency across multiple Jupyter notebooks
- Added missing commas in various print statements and function calls for better syntax.
- Reformatted code to enhance clarity, including breaking long lines and aligning parameters.
- Updated function signatures to use float type for sigma parameters instead of int for better precision.
- Cleaned up comments and documentation strings for clarity and consistency.
- Ensured consistent formatting in plotting functions and data handling.
2025-12-13 23:38:17 +01:00
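An illustrative example of the int-to-float signature change described above (the function name is hypothetical):

```python
import math

# was: def gaussian(x: float, sigma: int = 1) -> float:
def gaussian(x: float, sigma: float = 1.0) -> float:
    """Unnormalized Gaussian; a float sigma avoids an implicit integer assumption."""
    return math.exp(-x**2 / (2.0 * sigma**2))

print(gaussian(0.5, sigma=0.8))
```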
f89ff4a016 style: update .gitignore and add new linting rules to pyproject.toml 2025-12-13 23:37:33 +01:00
0f766b62c3 update pyproject.toml 2025-12-13 23:29:08 +01:00
a2fa13ef8d Update Course1.xlsm with binary changes 2025-12-06 16:04:18 +01:00
0420f09b69 Implement feature X to enhance user experience and optimize performance 2025-12-02 16:59:26 +01:00
82fb7e53de style: improve code formatting and whitespace 2025-12-02 16:51:19 +01:00
33930ab89c Add xgboost dependency and update lock file with new package details
- Added xgboost version 3.1.2 to pyproject.toml dependencies.
- Updated uv.lock to include xgboost package with its dependencies and wheel URLs.
- Added nvidia-nccl-cu12 package to uv.lock for compatibility with xgboost on specific platforms.
2025-12-02 16:50:46 +01:00
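A small, hypothetical example of the kind of use the new xgboost dependency enables (not code from the repo):

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier(n_estimators=100, learning_rate=0.1)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```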
95308de0cb Implement feature X to enhance user experience and fix bug Y in module Z 2025-12-02 12:49:27 +01:00
5338517fee Refactor code structure for improved readability and maintainability 2025-12-02 12:47:29 +01:00
8397c8fee3 feat: update Course1.xlsm 2025-11-27 10:49:18 +01:00
41378a2b42 Merge pull request #2 from ArthurDanjou/copilot/fix-code-cells-tp4
Complete TP4 DeepLearning notebooks - RNN with Embedding layer exercises
2025-11-26 13:54:37 +01:00
aad17ec465 Implement feature X to enhance user experience and optimize performance 2025-11-26 13:53:51 +01:00
copilot-swe-agent[bot]
886a7a2e2c Complete TP4 Bonus notebook code cells for DeepLearning
Co-authored-by: ArthurDanjou <29738535+ArthurDanjou@users.noreply.github.com>
2025-11-26 12:32:15 +00:00
copilot-swe-agent[bot]
dc054417f7 Initial plan 2025-11-26 12:26:59 +00:00
c4d5b67321 feat: remove the .python-version file 2025-11-26 13:22:16 +01:00
08cf8fbeda Refactor and enhance code in Reinforcement Learning notebook; add new R script for EM algorithm in Unsupervised Learning; update README to include new section for Unsupervised Learning. 2025-11-26 13:20:18 +01:00
5d968fa5e5 feat: add a 'Reinforcement Learning' section to the README 2025-11-25 12:41:50 +01:00
38ea77e86c Implement feature X to enhance user experience and optimize performance 2025-11-25 12:40:57 +01:00
baf0d21a25 Implement feature X to enhance user experience and optimize performance 2025-11-25 12:34:42 +01:00
f0854e58ba feat: add the .python-version file and update linting rules in pyproject.toml 2025-11-25 12:30:55 +01:00
8400c722a5 Refactor code formatting and improve readability in Jupyter notebooks for TP_4 and TP_5
- Adjusted indentation and line breaks for better clarity in function definitions and import statements.
- Standardized string quotes for consistency across the codebase.
- Enhanced readability of DataFrame creation and manipulation by breaking long lines into multiple lines.
- Cleaned up print statements and comments for improved understanding.
- Ensured consistent use of whitespace around operators and after commas.
2025-11-25 10:46:16 +01:00
21e376de79 fix: update Course1.xlsm to correct errors 2025-11-25 10:25:02 +01:00
dc69e98b0d Implement feature X to enhance user experience and fix bug Y in module Z 2025-11-13 22:38:15 +01:00
12c37869eb Implement feature X to enhance user experience and fix bug Y in module Z 2025-11-13 19:51:04 +01:00
e217b83754 Refactor code structure for improved readability and maintainability 2025-11-13 17:55:06 +01:00
1c61de108b fix: update project sections in README to replace 'Risks Management' with 'VBA' 2025-11-13 17:23:04 +01:00
a3e636044a Add flexdashboard library to R Markdown for enhanced data visualization 2025-11-13 17:22:11 +01:00
2b00a351c0 Limit plots to single series: show only Call in first plot and only Put in second; remove unused series and adjust data frames. 2025-11-13 16:37:18 +01:00
4570a011ec Implement Black‑Scholes Shiny app: complete server & UI (call/put pricing, plotly plots, add volatility/rates/dividend inputs, run app) and add kable/paged_table examples to tp3.Rmd 2025-11-13 16:31:44 +01:00
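For reference, assuming the app prices with the textbook Black-Scholes formulas under a continuous dividend yield $q$ (matching the volatility/rates/dividend inputs listed above):

$$C = S e^{-qT}\,\Phi(d_1) - K e^{-rT}\,\Phi(d_2), \qquad P = K e^{-rT}\,\Phi(-d_2) - S e^{-qT}\,\Phi(-d_1)$$

$$d_1 = \frac{\ln(S/K) + (r - q + \sigma^2/2)\,T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}$$

where $\Phi$ is the standard normal CDF, $S$ the spot price, $K$ the strike, $T$ the time to maturity, $r$ the risk-free rate, and $\sigma$ the volatility.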
f58afe7d71 Add empty R code chunk for future use in tp2.Rmd 2025-11-13 16:13:08 +01:00
c7d0f4878f Delete tp2.rmarkdown from M2/Data Visualisation/tp2 2025-11-13 16:08:11 +01:00
8b0afced5c Enhance tp2 Rmd (histogram, interactive maps, choropleth fixes); add TP3 Shiny apps and project files; update .gitignore; add shap (+cloudpickle, numba, llvmlite, tqdm, slicer) to pyproject.toml and uv.lock; remove generated tp2 HTML/assets 2025-11-13 16:07:58 +01:00
74feddbddb Add 'catboost_info' to .gitignore to ignore CatBoost output files 2025-11-11 17:09:15 +01:00
5f1cec7858 Implement feature X to enhance user experience and fix bug Y in module Z 2025-11-11 15:50:59 +01:00
dec54d91d7 Fix mapview example: enable code evaluation and correct st_as_sf usage with coordinates and CRS. 2025-11-06 11:44:18 +01:00
8fbf4681c9 Add manchot.png image and style.css for tp1 in Data Visualisation module 2025-11-06 11:33:04 +01:00
568f38a59a Add data visualisation TP1 and TP2 HTML files and assets, remove GEMINI.md and studies.Rproj 2025-11-06 11:24:24 +01:00
007ca3c12c Remove old HTML enonce files and rename Rmd files for tp1 and tp2 to tp1.Rmd and tp2.Rmd 2025-11-06 11:18:57 +01:00
bd05082e3c Change the plot fill color to "lightgreen" instead of "grey" 2025-11-06 09:27:20 +01:00
03bf0a4db2 Refactor code for improved readability and consistency across R Markdown files
- Updated comments and code formatting in `3-td_ggplot2 - enonce.Rmd` for clarity.
- Enhanced code structure in `4-td_graphiques - enonce.Rmd` by organizing options and library calls.
- Replaced pipe operator `%>%` with `|>` in `Code_Lec3.Rmd` for consistency with modern R syntax.
- Cleaned up commented-out code and ensured consistent spacing in ggplot calls.
2025-11-06 09:26:58 +01:00
8f5f2b417c Implement code changes to enhance functionality and improve performance 2025-11-06 09:13:40 +01:00
5c8efbdc2e Add initial project files and styles for data visualization
- Created a new Excel file: `departements-francais.xlsx` for data storage.
- Added a CSS file: `style.css` with custom styles for various mathematical environments including boxes for lemmas, theorems, definitions, and more, complete with automatic numbering.
- Initialized R project file: `tp2.Rproj` with default settings for workspace management and LaTeX integration.
2025-11-06 09:07:30 +01:00
6369e30257 Update execution counts, reorganize imports, and change the Python version in the TP2 notebook 2025-11-06 09:06:36 +01:00
aec178208e Update execution counts and remove the TensorFlow import in the TP2 notebook 2025-11-05 19:53:09 +01:00
098e20c982 Implement feature X to enhance user experience and optimize performance 2025-11-05 19:43:29 +01:00
e71aae349f Implement new feature for user authentication and improve error handling 2025-11-05 18:27:54 +01:00
fa5785e714 Implement code changes to enhance functionality and improve performance 2025-11-05 17:32:20 +01:00
632240d232 Add a "Deep Learning" section to the README and update dependencies to include Keras 2025-11-05 17:15:53 +01:00
ba6bea2c73 Add new Jupyter notebooks for ResNet and CNN exercises; update execution counts in existing notebooks 2025-11-05 17:09:58 +01:00
0af6f7a5d0 Fix the display of predictions and adjust print statements for better clarity 2025-10-29 20:10:42 +01:00
f53ff6a2be Implement feature X to enhance user experience and optimize performance 2025-10-29 20:06:40 +01:00
20e3ca2326 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-29 19:55:30 +01:00
eab43866c3 Implement code changes to enhance functionality and improve performance 2025-10-29 19:53:14 +01:00
61cc00c973 Implement code changes to enhance functionality and improve performance 2025-10-29 19:38:48 +01:00
365da9c37e Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-29 19:13:30 +01:00
db85923e94 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-29 19:03:42 +01:00
37c0cd370e Add CatBoost dependency and update lock file
- Added CatBoost version 1.2.8 to the project dependencies in pyproject.toml.
- Updated uv.lock to include CatBoost and its dependencies, along with the necessary wheel files.
- Included Graphviz version 0.21 in the lock file as a dependency for CatBoost.
2025-10-27 19:26:30 +01:00
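A minimal, hypothetical illustration of the added dependency (training writes logs to catboost_info/, which another commit in this range adds to .gitignore):

```python
from catboost import CatBoostClassifier

model = CatBoostClassifier(iterations=100, learning_rate=0.1, verbose=False)
model.fit([[0, 1], [1, 0], [1, 1], [0, 0]], [0, 1, 1, 0])  # toy data
print(model.predict([[1, 1]]))
```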
76a57c3d73 Update Course1.xlsm with binary changes 2025-10-23 15:02:45 +02:00
a0cc98744f Implement code changes to enhance functionality and improve performance 2025-10-23 15:02:34 +02:00
2ca65ffe73 Refactor Gradient Boosting Classifier Implementation
- Updated execution counts for various code cells to maintain consistency.
- Changed the model from RandomForestClassifier to GradientBoostingClassifier.
- Modified hyperparameter grid for GridSearchCV to include learning_rate and adjusted n_estimators.
- Added stratification to train-test split for better representation of classes.
- Corrected scoring parameter in GridSearchCV to use a valid metric.
- Updated output messages to reflect changes in model evaluation metrics.
2025-10-20 19:24:23 +02:00
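A sketch of the setup those bullets describe (variable names and grid values are illustrative, not the notebook's):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, random_state=0)
# stratified split, as mentioned in the commit
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

param_grid = {"learning_rate": [0.01, 0.1, 0.3], "n_estimators": [100, 200]}
search = GridSearchCV(GradientBoostingClassifier(), param_grid, scoring="accuracy")
search.fit(X_train, y_train)
print(search.best_params_, "test accuracy:", search.score(X_test, y_test))
```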
3f2cd3a308 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-20 18:44:22 +02:00
12bba2cea7 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-20 18:13:39 +02:00
cf7d23261b Add Jupyter notebook for supervised machine learning algorithms and update dependencies
- Created a new Jupyter notebook: 2025_M2_ISF_TP_4.ipynb for supervised machine learning exercises, including data preparation, model building, and performance analysis.
- Added 'imblearn' as a dependency in pyproject.toml to support handling imbalanced datasets.
- Updated uv.lock to include the 'imbalanced-learn' package and its dependencies.
2025-10-20 17:43:11 +02:00
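A hypothetical example of why imblearn was added (rebalancing an imbalanced dataset by oversampling the minority class):

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))  # minority class oversampled to parity
```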
e1255f326d Add new data file datafreMPTL.RData for analysis in Data Visualisation project 2025-10-20 17:43:04 +02:00
4efbee7ce4 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-14 17:09:14 +02:00
3c0113115c Add sections for data visualisation and RShiny to the README 2025-10-14 11:41:19 +02:00
85a7469195 Reorganize code to improve readability and the structure of package installation 2025-10-14 11:16:14 +02:00
5af3c76113 Implement code changes to enhance functionality and improve performance 2025-10-14 11:12:31 +02:00
ba158c366b Refactor normality test logic in portef_v3_4_3.ipynb
- Changed execution_count from 3 to null for a cleaner notebook state.
- Simplified the normality test logic by using a conditional expression to determine the p-value calculation, improving code readability.
2025-10-14 10:46:56 +02:00
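One plausible reading of that conditional expression (the exact tests used in portef_v3_4_3.ipynb are an assumption here):

```python
import numpy as np
from scipy import stats

def normality_pvalue(x) -> float:
    # Shapiro-Wilk for small samples, D'Agostino-Pearson otherwise
    return (stats.shapiro(x) if len(x) < 50 else stats.normaltest(x)).pvalue

print(normality_pvalue(np.random.default_rng(0).normal(size=30)))
```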
d8b535418c Implement structural updates and optimizations across multiple modules 2025-10-14 10:45:57 +02:00
b6c9e91481 Implement feature X to enhance user experience and optimize performance 2025-10-14 10:18:24 +02:00
ec5e23e3d4 Add the train/test data split and implement grid search for the Random Forest model hyperparameters. 2025-10-13 20:00:44 +02:00
a0b0a9f8bd Refactor code in 2025_TP_3_M2_ISF.ipynb:
- Updated execution counts for multiple code cells to maintain consistency.
- Removed redundant imports and organized import statements.
- Improved formatting for better readability in train-test split section.
- Added markdown explanations for model performance metrics (MAE, RMSE).
- Enhanced cross-validation training loop with detailed output for each fold's metrics.
2025-10-13 19:58:58 +02:00
047f30def1 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-13 19:29:48 +02:00
f3a09a5282 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-13 19:22:57 +02:00
1ccdcb3803 Add cell executions for one-hot encoding, normalization of numeric variables, and the train/test data split. 2025-10-13 18:24:13 +02:00
a63b1bf94c Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-13 18:12:52 +02:00
19d7d398ae Implement code changes to enhance functionality and improve performance 2025-10-13 18:10:31 +02:00
7cc7df0376 Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-13 18:10:19 +02:00
963948f19f Add Course1.xlsm and update cell execution in the TP_2 notebook to fix the execution counts. 2025-10-13 17:40:54 +02:00
592d7bc7eb Fix the question numbering in the SQL query selecting speakers in DANJOU_Arthur.sql 2025-10-09 18:14:49 +02:00
26e7a4da36 Fix the SQL query by removing the GROUP BY from the minimum-price selection in DANJOU_Arthur.sql 2025-10-09 12:29:21 +02:00
6247d4b7e1 Fix aliases in the SQL queries for better readability and consistency in TP3.sql 2025-10-09 12:25:58 +02:00
59b0c0de5c Add SQL queries for exercises 1 and 2, including table creation, data insertion, and Makefile changes to include TP3. 2025-10-09 12:16:58 +02:00
9fc0fad1ef Refactor the SQL queries to use explicit joins in TP2.sql 2025-10-09 12:13:16 +02:00
fe8be01369 Merge branch 'master' of https://github.com/ArthurDanjou/studies 2025-10-09 12:13:05 +02:00
Arthur Danjou
40085147f0 Add the complete SQL script for exercises 1 and 2 with the corresponding queries 2025-10-09 11:22:42 +02:00
Arthur Danjou
beedb187f7 Add SQL queries for questions Q3.10 to Q3.13 in the TP3.sql script 2025-10-09 10:02:04 +02:00
danjar24
585277a622 Add TP3 2025-10-09 09:40:06 +02:00
bbed2263ef Change the custom control parameters to use 10-fold cross-validation repeated 10 times for increased robustness 2025-10-08 16:22:16 +02:00
82ff2db44c Refactor Q2.15 query to use JOIN syntax for improved clarity and performance 2025-10-08 16:22:10 +02:00
cdac478b83 Refactor code structure for improved readability and maintainability 2025-10-08 16:04:53 +02:00
3cacb6be8a Add initial R project configuration file (studies.Rproj) with default settings 2025-10-08 15:33:39 +02:00
1effa6dc4b Add 'Linear Models' and 'Risks Management' sections under M2 in the README 2025-10-08 13:45:31 +02:00
1d5089bfc8 Remove an obsolete comment about the 'Dep' key in the 'Employe' table 2025-10-08 12:31:10 +02:00
37ede46fac Implement feature X to enhance user experience and fix bug Y in module Z 2025-10-08 11:46:58 +02:00
cb4e7d2ac2 Add initial implementation of portfolio risk management analysis
- Created a new Python script for analyzing historical stock data.
- Implemented functions to test normality of price and return distributions.
- Included functionality to compute and visualize the efficient frontier for a portfolio of stocks.
- Added comments and documentation for clarity and future reference.
2025-10-08 11:17:12 +02:00
a4adf0a392 Add portfolio analysis script and update dependencies
- Created a new Python script for portfolio analysis using historical stock data.
- Implemented functions for normality testing of prices and returns.
- Added histogram plots for prices and returns.
- Included logic for random portfolio allocation and efficient frontier calculation.
- Updated `pyproject.toml` to include `pandas-stubs` for type hinting support.
- Modified `uv.lock` to reflect the addition of `pandas-stubs` and its dependencies.
2025-10-08 11:08:26 +02:00
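A compact sketch of the random-allocation / efficient-frontier step those bullets describe (synthetic returns stand in for the historical stock data):

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(1000, 4))  # fake daily returns, 4 assets
mu, cov = returns.mean(axis=0), np.cov(returns.T)

weights = rng.dirichlet(np.ones(4), size=5000)      # random long-only portfolios
port_mu = weights @ mu
port_sigma = np.sqrt(np.einsum("ij,jk,ik->i", weights, cov, weights))
# The efficient frontier is the upper-left envelope of the (sigma, mu) cloud.
print(port_mu.max(), port_sigma.min())
```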
04d8b4cf14 Revert the Python version to 3.13 and add yfinance as a dependency 2025-10-08 10:35:40 +02:00
9606f4224a Update the Python version to 3.14 2025-10-08 10:27:59 +02:00
185de1142d Refactor code in Jupyter notebooks for clarity and consistency
- Set execution_count to null for specific code cells in 2025_TP_1_M2_ISF.ipynb to reset execution state.
- Replace output display of DataFrames with print statements in 2025_TP_1_M2_ISF.ipynb for better visibility during execution.
- Clean up import statements in 2025_TP_2_M2_ISF.ipynb by adding noqa comments for better linting and readability.
2025-10-08 10:24:51 +02:00
b6cfa3349e Implement code changes to enhance functionality and improve performance 2025-10-08 10:22:05 +02:00
8e081a1ccb Implement code changes to enhance functionality and improve performance 2025-10-06 18:12:46 +02:00
2022563a28 Refactor the SQL queries to use explicit joins in sections Q2.6, Q2.7, Q2.10, and Q2.11 2025-10-02 11:47:38 +02:00
6c120acab3 Refactor the SQL queries to use explicit joins and add a MIN-based alternative for date comparison. 2025-10-02 11:45:01 +02:00
f4a5b5b708 Fix the Employe and Departement tables: add primary and foreign key constraints, update data types, and add new SQL queries to improve structure and functionality. 2025-10-02 11:34:41 +02:00
c925c8a5c0 Fix formatting and update the Python version in the TP_2_M2_ISF.ipynb notebook 2025-10-02 08:26:49 +02:00
7b9a6bd0ff Implement code changes to enhance functionality and improve performance 2025-09-29 17:56:11 +02:00
e498a3eee8 Update README.md to rename the project to ArtStudies and add the M2 section with Machine Learning and SQL projects. 2025-09-29 17:49:22 +02:00
f3d7c2fc09 Checkpoint from VS Code for coding agent session 2025-09-29 17:34:06 +02:00
a4e0e55efc Fix the MySQL configuration in docker-compose.yml and add the TP2.sql script for managing employees and departments 2025-09-25 13:00:08 +02:00
34bd0307d5 Update the Makefile to fix the log file paths and add the tp2 target 2025-09-25 09:44:00 +02:00
76620f1d9d Add new SQL queries for data analysis in TP1.sql 2025-09-25 09:43:30 +02:00
0d00de44e8 Add MySQL setup and initial data scripts
- Created a Docker Compose file to set up a MySQL container named M2_SQL_COURSE with an empty password and a database named TP.
- Added a Makefile with a target to execute a SQL script (TP1.sql) inside the MySQL container and log the output.
- Implemented the TP1.sql script to create tables for Magasin and Localite, insert initial data, and perform several queries.
2025-09-25 09:31:10 +02:00
2768bcb565 Add log files to .gitignore 2025-09-25 09:30:53 +02:00
6738419f7c Add new plot 2025-09-15 20:02:19 +02:00
c72538fac3 Add new plot 2025-09-15 19:59:26 +02:00
08d0d93393 Add 'Machine Learning' TP1 2025-09-15 19:58:46 +02:00
fbd939c300 Update Python version in notebooks to 3.13.3 and adjust kernel display name 2025-09-01 16:14:59 +02:00
8cf328e18a Refactor code in numerical methods notebooks
- Updated import order in Point_Fixe.ipynb for consistency.
- Changed lambda functions to regular function definitions for clarity in Point_Fixe.ipynb.
- Added numpy import in TP1_EDO_EulerExp.ipynb, TP2_Lokta_Volterra.ipynb, and TP3_Convergence.ipynb for better readability.
- Modified for loops in TP1_EDO_EulerExp.ipynb and TP2_Lokta_Volterra.ipynb to include strict=False for compatibility with future Python versions.
2025-09-01 16:14:53 +02:00
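The zip(..., strict=False) change above refers to the keyword added in Python 3.10; passing strict=False keeps the existing truncating behavior but makes it explicit. A tiny illustration:

```python
times = [0.0, 0.1, 0.2]
values = [1.0, 0.9]  # deliberately shorter

# strict=False keeps the default: iteration stops at the shorter input
for t, y in zip(times, values, strict=False):
    print(t, y)
```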
dfee405ea0 Refactor code structure for improved readability and maintainability 2025-09-01 16:09:30 +02:00
1a1c3c31f9 Refactor code structure for improved readability and maintainability 2025-09-01 16:04:25 +02:00
f94ff07cab Refactor code for improved readability and consistency across notebooks
- Standardized spacing around operators and function arguments in TP7_Kmeans.ipynb and neural_network.ipynb.
- Enhanced the formatting of model building and training code in neural_network.ipynb for better clarity.
- Updated the pyproject.toml to remove a specific TensorFlow version and added linting configuration for Ruff.
- Improved comments and organization in the code to facilitate easier understanding and maintenance.
2025-07-01 20:46:08 +02:00
e273cf90f7 Add TP2 and Project 2025-06-21 14:20:46 +02:00
ecbdbc1dce Implement code changes to enhance functionality and improve performance 2025-06-21 14:17:47 +02:00
192c4e02f1 Add K-means clustering notebook and associated image assets for statistical learning exercises 2025-05-07 19:06:31 +02:00
ad8f5857ca Update execution counts and model definitions in TP6 Keras introduction notebook 2025-05-07 11:02:43 +02:00
c7f0603087 Implement code changes to enhance functionality and improve performance 2025-05-05 16:52:07 +02:00
05222e0e65 Add TP6: Introduction to Keras 2025-04-30 11:40:12 +02:00
01f64ae022 Implement new feature for user authentication and improve error handling 2025-04-30 10:10:45 +02:00
915cfeb97d Update dependencies 2025-04-29 18:05:06 +02:00
1f1f52f3c6 Merge branch 'master' of https://github.com/ArthurDanjou/studies 2025-04-24 12:41:17 +02:00
7438ec6f5f Refactor code structure for improved readability and maintainability 2025-04-24 12:40:42 +02:00
32901d1247 Migrate to uv package manager 2025-04-24 12:39:46 +02:00
6475965bd4 Update README.md 2025-04-17 19:06:13 +02:00
c0a2307c94 Update README.md 2025-04-17 19:05:56 +02:00
07407fcdd4 Improve TP Noté 2025-04-07 17:03:03 +02:00
a4c09c50a5 Add tp noté 2025-04-07 16:44:41 +02:00
5211dc754f Add TP noté 2025-04-07 16:39:59 +02:00
8af29ceb78 Add: TP3 2025-04-01 19:11:39 +02:00
704cefeeb1 add: .vscode in gitignore 2025-04-01 18:38:51 +02:00
786ebadefc update: enhance ComputerSession2 and TP4_Ridge_Lasso_and_CV notebooks with execution outputs and code improvements 2025-03-30 20:43:00 +02:00
44c277c8a7 add: TP4 about Lasso, Ridge and CV 2025-03-26 11:37:11 +01:00
b784751776 add: TP1/2 2025-03-26 11:36:59 +01:00
52c6012197 add: TP1/2 2025-03-24 16:07:17 +01:00
4853ad1d64 add: TP2 (NN) 2025-03-24 14:11:13 +01:00
bc64c7ddcc start: tp1 2025-03-24 14:11:04 +01:00
632a1c6950 add: tp4 2025-03-19 12:01:36 +01:00
485844f674 work: tp3 2025-03-19 12:01:31 +01:00
f536e28a24 fix: TP2 2025-03-19 12:01:24 +01:00
d28631c1c7 add: TP1 and TP2 in numerical optimisation 2025-03-19 10:07:14 +01:00
d795afe07e fix: remove useless dtype 2025-03-05 11:46:12 +01:00
6e60299ff9 fix: grad_w formula 2025-03-05 11:43:01 +01:00
ba5bc36879 Add TP3 in Statistical Learning 2025-03-05 11:37:37 +01:00
dd760dad03 Fix TP1 in Numerical methods 2025-03-04 13:51:09 +01:00
31f77d59e4 Fix TP1 in Numerical methods 2025-03-04 13:49:19 +01:00
d51159ad9e TP2 in Numerical Opti 2025-03-03 16:34:44 +01:00
458a9b9698 Add TP1 in numerical methods 2025-03-03 16:34:14 +01:00
3b1347c54c Add ComputerSession2.ipynb 2025-02-18 19:19:02 +01:00
964821c058 fix 2025-02-07 17:36:49 +01:00
070892c551 fix 2025-02-07 17:32:41 +01:00
bef64b5eb6 Correction of tp2-bis 2025-02-07 17:31:49 +01:00
f77bd7b184 Delete useless comments 2025-02-06 13:11:31 +01:00
b0646dfb96 Rename directory name 2025-02-05 22:09:21 +01:00
82ed0a1d8a Add tp2 and tp2 bis 2025-02-05 22:08:02 +01:00
767355c4df Add tp2 2025-02-05 11:17:40 +01:00
6449317f91 Add tp2 2025-02-05 11:01:44 +01:00
0e3c8aca99 Add end of tp 1 of Numerical optimisation 2025-02-04 21:56:39 +01:00
dbf9816453 Add end of tp 1 of Numerical optimisation 2025-02-04 19:09:07 +01:00
b7ca3f6e66 Add tp2 on KNN 2025-01-29 11:56:29 +01:00
6596d39060 Add GLM Project 2025-01-29 11:56:19 +01:00
66d4be6542 Edit .gitignore 2025-01-22 11:05:17 +01:00
a86834aeb5 Add TP0 and TP1 for Stat learning 2025-01-22 11:04:34 +01:00
d06b212417 Add TP0 for numerical optimization 2025-01-22 11:04:19 +01:00
a668c6798a Add final part Monte carlo project 2024-12-22 20:49:02 +01:00
9bfa080c06 Add tp3 bis 2024-12-10 10:37:32 +01:00
c892ce7110 fix confusion matrix 2024-12-10 10:01:30 +01:00
4c58e4a97a Add exo 13 2024-12-04 19:18:26 +01:00
5e7db282cd Add tp 4 2024-12-04 19:18:18 +01:00
188d1f7cad Add second part of the project 2024-11-27 13:51:42 +01:00
32e6c1733a Delete CNAME 2024-11-26 20:32:56 +01:00
41c789b7d4 Create CNAME 2024-11-26 20:30:08 +01:00
c84d813de6 Remove HTML file of project 2024-11-25 14:45:16 +01:00
d00429881e Add Project Portfolio management 2024-11-25 14:45:09 +01:00
2a863e6c9c Remove old project 2024-11-25 14:45:01 +01:00
bef5077485 Add TP3 2024-11-21 20:02:05 +01:00
af27fbba72 Add TP3 2024-11-20 18:32:54 +01:00
10e9191969 Remove useless import 2024-11-14 19:57:15 +01:00
5e36d0f220 Add TP1 In portfolio management 2024-11-14 18:49:23 +01:00
351b32cdb2 Add tp2 bis 2024-11-14 17:07:21 +01:00
23b405ed57 Add tp2 bis 2024-11-14 17:07:19 +01:00
4168d66030 Remove Chapter 1 2024-11-14 16:22:17 +01:00
0e16469176 Fix .gitignore 2024-11-14 16:22:07 +01:00
7fbe02aced Add TP1 bis 2024-11-14 16:15:49 +01:00
750ec5c719 Add TP1 of Data Analysis 2024-11-14 15:22:21 +01:00
87e7e58cd5 Edit TP2 2024-11-13 18:31:42 +01:00
4e1aaa2310 Add TP2 2024-11-13 18:30:43 +01:00
7e3e01706d Edit TP1 2024-11-13 18:30:38 +01:00
00388ad6b8 Add Exo 10-11-12 2024-11-13 18:30:34 +01:00
88d0907535 Edit Exercise11.rmd 2024-11-06 16:58:52 +01:00
df37fca8ac Add Exercise11.rmd 2024-11-06 16:41:29 +01:00
2656250b9c Remove RStudio files 2024-11-05 10:43:22 +01:00
cb5713ff6d Fix TP1 2024-11-05 10:43:02 +01:00
f6ba3d0890 Add TP1 GLM & DM Monte Carlo 2024-11-05 10:33:07 +01:00
611f22b99d Add TP1 2024-10-16 18:34:12 +02:00
365faafb5a Add exercise 10 2024-10-16 17:11:25 +02:00
7de8a90adf Edit exercise 9 2024-10-09 15:10:25 +02:00
8e680a4f41 Edit exercise 9 2024-10-09 15:08:28 +02:00
fdfff091a7 Add exercise 9 2024-10-09 14:46:31 +02:00
aeb5314b8b Fix 2024-10-09 14:46:22 +02:00
fc0b87a405 Finish Exercise 8 2024-10-09 13:57:41 +02:00
0ecd2582bb Add exo 8 2024-10-02 15:04:03 +02:00
ae18b13ad2 Add exo 8 2024-10-02 15:03:41 +02:00
ea97f4e314 Add exo 7 2024-09-25 15:15:39 +02:00
c20c4f1585 Add exo 6 2024-09-25 14:55:40 +02:00
bcbe47df12 End exo 2 2024-09-25 14:55:34 +02:00
e1fad33c55 Edit html file 2024-09-24 12:02:08 +02:00
decac8bff2 Add Chapter 1 of GLM 2024-09-24 11:11:26 +02:00
515609c16b Add exo 2 2024-09-24 11:11:09 +02:00
ba27f1ce7c Add question 2 2024-09-18 17:01:12 +02:00
52796e9018 Refactor files 2024-09-18 16:53:09 +02:00
75b83bf0a4 Move Projet.pdf 2024-09-17 22:52:25 +02:00
65ead01e8e Add Projet.pdf 2024-09-17 22:52:07 +02:00
7a582c0601 Test 2024-09-17 22:47:59 +02:00
b8a53db50c Test 2024-09-17 22:46:44 +02:00
fccdc5dfb8 Fix typo 2024-09-17 22:45:56 +02:00
27333910df Merge remote-tracking branch 'origin/master' 2024-09-17 22:40:15 +02:00
5093ee4d25 Move all files 2024-09-17 22:40:00 +02:00
4055502110 Move all files 2024-09-17 22:36:02 +02:00
3012b8a505 Update .gitignore 2024-04-22 15:32:14 +02:00
14951f25a5 Delete Projet Numérique/.idea directory 2024-04-22 15:32:03 +02:00
13c19e5cf3 Delete Projet Numérique/.virtual_documents directory 2024-04-22 15:31:53 +02:00
16e52106ce Delete Projet Numérique/Java directory 2024-04-22 15:31:46 +02:00
88ff5dbae1 Add tp 4 2024-04-22 15:30:09 +02:00
52096035e7 Rename Figure 1Voisins.png to Figure 1 Voisins.png 2024-04-12 13:59:31 +02:00
0b1a10328f Add pictures 2024-04-12 13:59:00 +02:00
c44cf4e836 Added projet numérique 2024-04-12 13:11:17 +02:00
48339b949b End of TP5 2024-04-11 15:28:01 +02:00
dae19d4eb6 Initialisation of TP5 2024-04-11 13:49:00 +02:00
227 changed files with 499950 additions and 3791 deletions

.gitignore

@@ -1,3 +1,35 @@
.DS_Store
.Rproj.user
.idea
.vscode
.RData
.RHistory
.ipynb_checkpoints
*.log
logs
catboost_info
tp1_files
tp2_files
tp3_files
dashboard_files
Beaudelaire.txt
Baudelaire_len_32.p
NoticeTechnique_files
.posit
renv
results/
results_stage_1/
results_stage_2/
*.safetensors
*.pt
*.pth
*.bin

.python-version (new file)

@@ -0,0 +1 @@
3.13

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@@ -76,24 +76,40 @@
],
"source": [
"import numpy as np\n",
"\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"\n",
"\n",
"u = np.array([1,2,3,4,5])\n",
"v = np.array([[1,2,3,4,5]])\n",
"su=u.shape\n",
"sv=v.shape\n",
"u = np.array([1, 2, 3, 4, 5])\n",
"v = np.array([[1, 2, 3, 4, 5]])\n",
"su = u.shape\n",
"sv = v.shape\n",
"ut = np.transpose(u)\n",
"vt = np.transpose(v)\n",
"vt2 = np.array([[1],[2],[3],[4],[5]])\n",
"A = np.array([[1,2,0,0,0],[0,2,0,0,0],[0,0,3,0,0],[0,0,0,4,0],[0,0,0,0,5]])\n",
"B = np.array([[1,2,3,4,5],[2,3,4,5,6],[3,4,5,6,7],[4,5,6,7,8],[5,6,7,8,9]])\n",
"d=np.diag(A)\n",
"dd=np.array([np.diag(A)])\n",
"dt=np.transpose(d)\n",
"ddt=np.transpose(dd)\n",
"Ad=np.diag(np.diag(A))\n",
"vt2 = np.array([[1], [2], [3], [4], [5]])\n",
"A = np.array(\n",
" [\n",
" [1, 2, 0, 0, 0],\n",
" [0, 2, 0, 0, 0],\n",
" [0, 0, 3, 0, 0],\n",
" [0, 0, 0, 4, 0],\n",
" [0, 0, 0, 0, 5],\n",
" ],\n",
")\n",
"B = np.array(\n",
" [\n",
" [1, 2, 3, 4, 5],\n",
" [2, 3, 4, 5, 6],\n",
" [3, 4, 5, 6, 7],\n",
" [4, 5, 6, 7, 8],\n",
" [5, 6, 7, 8, 9],\n",
" ],\n",
")\n",
"d = np.diag(A)\n",
"dd = np.array([np.diag(A)])\n",
"dt = np.transpose(d)\n",
"ddt = np.transpose(dd)\n",
"Ad = np.diag(np.diag(A))\n",
"\n",
"print(np.dot(np.linalg.inv(A), A))"
]
@@ -138,13 +154,14 @@
" x = 0 * b\n",
" n = len(b)\n",
" if np.allclose(A, np.triu(A)):\n",
" for i in range(n-1, -1, -1):\n",
" x[i] = (b[i] - np.dot(A[i,i+1:], x[i+1:])) / A[i,i]\n",
" for i in range(n - 1, -1, -1):\n",
" x[i] = (b[i] - np.dot(A[i, i + 1 :], x[i + 1 :])) / A[i, i]\n",
" elif np.allclose(A, np.tril(A)):\n",
" for i in range(n):\n",
" x[i] = (b[i] - np.dot(A[i,:i], x[:i])) / A[i,i]\n",
" x[i] = (b[i] - np.dot(A[i, :i], x[:i])) / A[i, i]\n",
" else:\n",
" raise ValueError(\"A est ni triangulaire supérieure ni triangulaire inférieure\")\n",
" msg = \"A est ni triangulaire supérieure ni triangulaire inférieure\"\n",
" raise ValueError(msg)\n",
" return x"
]
},
@@ -171,7 +188,7 @@
"b = np.dot(A, xe)\n",
"x = remontee_descente(A, b)\n",
"\n",
"print(np.dot(x - xe, x-xe))"
"print(np.dot(x - xe, x - xe))"
]
},
{
@@ -263,9 +280,9 @@
" U = A\n",
" n = len(A)\n",
" for j in range(n):\n",
" for i in range(j+1, n):\n",
" beta = U[i,j]/U[j,j]\n",
" U[i,j:] = U[i,j:] - beta * U[j, j:]\n",
" for i in range(j + 1, n):\n",
" beta = U[i, j] / U[j, j]\n",
" U[i, j:] = U[i, j:] - beta * U[j, j:]\n",
" return U"
]
},
@@ -280,16 +297,20 @@
"def met_gauss_sys(A, b):\n",
" n, m = A.shape\n",
" if n != m:\n",
" raise ValueError(\"Erreur de dimension : A doit etre carré\")\n",
" msg = \"Erreur de dimension : A doit etre carré\"\n",
" raise ValueError(msg)\n",
" if n != b.size:\n",
" raise valueError(\"Erreur de dimension : le nombre de lignes de A doit être égal au nombr ede colonnes de b\")\n",
" U = np.zeros((n, n+1))\n",
" msg = \"Erreur de dimension : le nombre de lignes de A doit être égal au nombr ede colonnes de b\"\n",
" raise valueError(\n",
" msg,\n",
" )\n",
" U = np.zeros((n, n + 1))\n",
" U = A\n",
" V = b\n",
" for j in range(n):\n",
" for i in range(j+1, n):\n",
" beta = U[i,j]/U[j,j]\n",
" U[i,j:] = U[i,j:] - beta * U[j, j:]\n",
" for i in range(j + 1, n):\n",
" beta = U[i, j] / U[j, j]\n",
" U[i, j:] = U[i, j:] - beta * U[j, j:]\n",
" V[i] = V[i] - beta * V[j]\n",
" return remontee_descente(U, V)"
]


@@ -0,0 +1,8 @@
<?xml version="1.0" encoding="UTF-8"?>
<module type="R_MODULE" version="4">
<component name="NewModuleRootManager" inherit-compiler-output="true">
<exclude-output />
<content url="file://$MODULE_DIR$" />
<orderEntry type="sourceFolder" forTests="false" />
</component>
</module>


@@ -0,0 +1,6 @@
```{r}
library(FactoMineR)
data(iris)
res.test <- PCA(iris[,1:4], scale.unit=TRUE, ncp=4)
res.test
```


@@ -1,12 +1,12 @@
---
title: "DM Statistique exploratoire multidimensionelle - Arthur DANJOU"
output:
pdf_document: default
html_document:
df_print: paged
editor_options:
markdown:
wrap: 72
pdf_document: default
html_document:
df_print: paged
editor_options:
markdown:
wrap: 72
---
------------------------------------------------------------------------
@@ -28,43 +28,42 @@ knitr::opts_chunk$set(include = FALSE)
# PARTIE 1 : Calcul de composantes principales sous R (Sans FactoMineR)
- Vide l'environnement de travail, initialise la matrice avec laquelle
vous allez travailler
vous allez travailler
```{r}
rm(list=ls())
rm(list = ls())
```
- Importation du jeu de données (compiler ce qui est ci-dessous mais
NE SURTOUT PAS MODIFIER)
NE SURTOUT PAS MODIFIER)
```{r}
library(dplyr)
notes_MAN <- read.table("notes_MAN.csv", sep=";", dec=",", row.names=1, header=TRUE)
notes_MAN <- read.table("notes_MAN.csv", sep = ";", dec = ",", row.names = 1, header = TRUE)
# on prépare le jeu de données en retirant la colonne des Mentions
# qui est une variable catégorielle
notes_MAN_prep <- notes_MAN[,-1]
notes_MAN_prep <- notes_MAN[, -1]
X <- notes_MAN[1:6,]%>%select(c("Probas","Analyse","Anglais","MAN.Stats","Stats.Inférentielles"))
X <- notes_MAN[1:6, ] |> select(c("Probas", "Analyse", "Anglais", "MAN.Stats", "Stats.Inférentielles"))
# on prépare le jeu de données en retirant la colonne des Mentions
# qui est une variable catégorielle
# View(X)
```
```{r}
X <- scale(X,center=TRUE,scale=TRUE)
X <- scale(X, center = TRUE, scale = TRUE)
X
```
- Question 1 : que fait la fonction “scale” dans la cellule ci-dessus
? (1 point)
? (1 point)
La fonction *scale* permet de normaliser et de réduire notre matrice X.
- Question 2: utiliser la fonction eigen afin de calculer les valeurs
propres et vecteurs propres de la matrice de corrélation de X. Vous
stockerez les valeurs propres dans un vecteur nommé lambda et les
vecteurs propres dans une matrice nommée vect (1 point).
propres et vecteurs propres de la matrice de corrélation de X. Vous
stockerez les valeurs propres dans un vecteur nommé lambda et les
vecteurs propres dans une matrice nommée vect (1 point).
```{r}
cor_X <- cor(X)
@@ -78,7 +77,7 @@ lambda
```
- Question 3 : quelle est la part d'inertie expliquée par les 2
premières composantes principales ? (1 point)
premières composantes principales ? (1 point)
```{r}
inertie_total_1 <- sum(diag(cor_X)) # Inertie est égale à la trace de la matrice de corrélation
@@ -90,28 +89,28 @@ inertie_axes
```
- Question 4 : calculer les coordonnées des individus sur les deux
premières composantes principales (1 point)
premières composantes principales (1 point)
```{r}
C <- X %*% vect
C[,1:2]
C[, 1:2]
```
- Question 5 : représenter les individus sur le plan formé par les
deux premières composantes principales (1 point)
deux premières composantes principales (1 point)
```{r}
colors <- c('blue', 'red', 'green', 'yellow', 'purple', 'orange')
colors <- c("blue", "red", "green", "yellow", "purple", "orange")
plot(
C[,1],C[,2],
main="Coordonnées des individus par rapport \n aux deux premières composantes principales",
C[, 1], C[, 2],
main = "Coordonnées des individus par rapport \n aux deux premières composantes principales",
xlab = "Première composante principale",
ylab = "Deuxieme composante principale",
panel.first = grid(),
col = colors,
pch=15
pch = 15
)
legend(x = 'topleft', legend = rownames(X), col = colors, pch = 15)
legend(x = "topleft", legend = rownames(X), col = colors, pch = 15)
```
------------------------------------------------------------------------
@@ -122,7 +121,7 @@ legend(x = 'topleft', legend = rownames(X), col = colors, pch = 15)
étudiants.
- Question 1 : Écrire maximum 2 lignes de code qui renvoient le nombre
d'individus et le nombre de variables.
d'individus et le nombre de variables.
```{r}
nrow(notes_MAN_prep) # Nombre d'individus
@@ -130,7 +129,7 @@ ncol(notes_MAN_prep) # Nombre de variables
```
```{r}
dim(notes_MAN_prep) # On peut également utiliser 'dim' qui renvoit la dimension
dim(notes_MAN_prep) # On peut également utiliser 'dim' qui renvoit la dimension
```
Il y a donc **42** individus et **14** variables. A noter que la
@@ -146,7 +145,7 @@ library(FactoMineR)
```{r}
# Ne pas oublier de charger la librairie FactoMineR
# Indication : pour afficher les résultats de l'ACP pour tous les individus, utiliser la
# Indication : pour afficher les résultats de l'ACP pour tous les individus, utiliser la
# fonction summary en précisant dedans nbind=Inf et nbelements=Inf
res.notes <- PCA(notes_MAN_prep, scale.unit = TRUE)
```
@@ -161,7 +160,7 @@ summary(res.notes, nbind = Inf, nbelements = Inf, nb.dec = 2)
eigen_values <- res.notes$eig
bplot <- barplot(
eigen_values[, 1],
eigen_values[, 1],
names.arg = 1:nrow(eigen_values),
main = "Eboulis des valeurs propres",
xlab = "Principal Components",
@@ -169,11 +168,11 @@ bplot <- barplot(
col = "lightblue"
)
lines(x = bplot, eigen_values[, 1], type = "b", col = "red")
abline(h=1, col = "darkgray", lty = 5)
abline(h = 1, col = "darkgray", lty = 5)
```
- Question 4 : Quelles sont les coordonnées de la variable MAN.Stats
sur le cercle des corrélations ?
sur le cercle des corrélations ?
La variable **MAN.Stats** est la **9-ième** variable de notre dataset. Les
coordonnées de cette variable sont : $(corr(C_1, X_9), corr(C_2, X_9))$
@@ -190,7 +189,7 @@ avec:
Depuis notre ACP, on peut donc récupérer les coordonnées:
```{r}
coords_man_stats <- res.notes$var$coord["MAN.Stats",]
coords_man_stats <- res.notes$var$coord["MAN.Stats", ]
coords_man_stats[1:2]
```
@@ -198,12 +197,12 @@ Les coordonnées de la variable **MAN.Stats** sont donc environ
**(0.766,-0.193)**
- Question 5 : Quelle est la contribution moyenne des individus ?
Quelle est la contribution de Thérèse au 3e axe principal ?
Quelle est la contribution de Thérèse au 3e axe principal ?
```{r}
contribs <- res.notes$ind$contrib
contrib_moy_ind <- mean(contribs) # 100 * 1/42
contrib_therese <- res.notes$ind$contrib["Thérèse",3]
contrib_therese <- res.notes$ind$contrib["Thérèse", 3]
contrib_moy_ind
contrib_therese
@@ -213,7 +212,7 @@ La contribution moyenne est donc environ égale à **2,38%**. La
contribution de Thérèse au 3e axe principal est environ égal à **5.8%**
- Question 6 : Quelle est la qualité de représentation de Julien sur
le premier plan factoriel (constitué du premier et deuxième axe) ?
le premier plan factoriel (constitué du premier et deuxième axe) ?
La qualité de représentation de 'Julien' sur le premier plan factoriel
est donné par la formule :
@@ -236,8 +235,8 @@ principales. On a donc une qualité environ égale à **0.95** soit
**95%.**
- Question 7 : Discuter du nombre d'axes à conserver selon les deux
critères vus en cours. Dans toutes la suite on gardera néanmoins 2
axes.
critères vus en cours. Dans toutes la suite on gardera néanmoins 2
axes.
Nous avons vu deux critères principaux: le critère de Kaiser et le
critère du coude. Le critère de Kaiser dit de garder uniquement les
@@ -253,7 +252,7 @@ plus grandes valeurs propres ou bien les quatre plus grandes**, donc
conserver ou bien **deux axes principaux, ou bien quatre**.
- Question 8 : Effectuer l'étude des individus. Être en particulier
vigilant aux étudiants mal représentés et commenter.
vigilant aux étudiants mal représentés et commenter.
## Contribution moyenne
@@ -266,7 +265,7 @@ La contribution moyenne est donc environ égale à **2,38%**
## Axe 1
```{r}
indiv_contrib_axe_1 <- sort(res.notes$ind$contrib[,1], decreasing = TRUE)
indiv_contrib_axe_1 <- sort(res.notes$ind$contrib[, 1], decreasing = TRUE)
head(indiv_contrib_axe_1, 3)
```
@@ -278,7 +277,7 @@ sur l'axe 1.
## Axe 2
```{r}
indiv_contrib_axe_2 <- sort(res.notes$ind$contrib[,2], decreasing = TRUE)
indiv_contrib_axe_2 <- sort(res.notes$ind$contrib[, 2], decreasing = TRUE)
head(indiv_contrib_axe_2, 3)
```
@@ -294,12 +293,12 @@ axes, c'est à dire ceux qui se distinguent ni par l'axe 1, ni par l'axe
2.
```{r}
mal_representes <- rownames(res.notes$ind$cos2)[rowSums(res.notes$ind$cos2[,1:2]) <= mean(res.notes$ind$cos2[,1:2])]
mal_representes <- rownames(res.notes$ind$cos2)[rowSums(res.notes$ind$cos2[, 1:2]) <= mean(res.notes$ind$cos2[, 1:2])]
mal_representes
```
- Question 9 : Relancer une ACP en incluant la variable catégorielle
des mentions comme variable supplémentaire.
des mentions comme variable supplémentaire.
```{r}
res.notes_sup <- PCA(notes_MAN, scale.unit = TRUE, quali.sup = c("Mention"))
@@ -311,7 +310,7 @@ summary(res.notes_sup, nb.dec = 2, nbelements = Inf, nbind = Inf)
```
- Question 10 : Déduire des deux questions précédentes une
interprétation du premier axe principal.
interprétation du premier axe principal.
La prise en compte de la variable supplémentaire **Mentions**, montre en outre que la
première composante principale est liée à la mention obtenue par les étudiants.
@@ -320,7 +319,7 @@ réussite des étudiants.
- Question 11 : Effectuer l'analyse des variables. Commenter les UE
mal représentées.
mal représentées.
## Contribution moyenne
@@ -340,7 +339,7 @@ toutes une coordonnée positive.
## Axe 2
```{r}
var_contrib_axe_2 <- sort(res.notes_sup$var$contrib[,2], decreasing = TRUE)
var_contrib_axe_2 <- sort(res.notes_sup$var$contrib[, 2], decreasing = TRUE)
head(var_contrib_axe_2, 3)
```
@@ -351,9 +350,9 @@ et **Options.S6**, corrélées négativement.
## Qualité de la représentation
```{r}
mal_representes <- rownames(res.notes_sup$var$cos2[,1:2])[rowSums(res.notes_sup$var$cos2[,1:2]) <= 0.6]
mal_representes <- rownames(res.notes_sup$var$cos2[, 1:2])[rowSums(res.notes_sup$var$cos2[, 1:2]) <= 0.6]
mal_representes
mal_representes_moy <- rownames(res.notes_sup$var$cos2[,1:2])[rowSums(res.notes_sup$var$cos2[,1:2]) <= mean(res.notes_sup$var$cos2[,1:2])]
mal_representes_moy <- rownames(res.notes_sup$var$cos2[, 1:2])[rowSums(res.notes_sup$var$cos2[, 1:2]) <= mean(res.notes_sup$var$cos2[, 1:2])]
mal_representes_moy
```
@@ -364,7 +363,7 @@ sauf 4 variables : l'**Anglais**, **MAN.PPEI.Projet**, **Options.S5** et
On remarque également que l'**Options.S5** est la variable la moins bien représentée dans le plan car sa qualité de représentation dans le plan est inférieure à la moyenne des qualités de représentation des variables dans le plan.
- Question 12 : Interpréter les deux premières composantes
principales.
principales.
On dira que la première composante principale définit un “facteur de taille” car
toutes les variables sont corrélées positivement entre elles. Ce phénomène
@@ -379,5 +378,5 @@ moyenne), c'est à dire selon leur réussite, donc leur moyenne générale de le
Le deuxième axe définit un “facteur de forme” : il y a deux groupes de variables
opposées, celles qui contribuent positivement à l'axe, celles qui contribuent
négativement. Vu les variables en question, la deuxième composante principale
négativement. Vu les variables en question, la deuxième composante principale
s'interprète aisément comme opposant les matières du semestre 5 à celles du semestre 6.


@@ -3,6 +3,7 @@ title: "TP2 : ACP "
output:
pdf_document: default
html_document: default
output: rmarkdown::html_vignette
---
```{r setup, include=FALSE}
@@ -59,7 +60,7 @@ help(PCA)
```{r,echo=FALSE}
res.autos<-PCA(autos, scale.unit=TRUE, quanti.sup = c("PRIX"))
res.autos<-PCA(autos, scale.unit=TRUE, quanti.sup = "PRIX")
```
```{r}
summary(res.autos, nb.dec=2, nb.elements =Inf, nbind = Inf, ncp=3) #les résultats avec deux décimales, pour tous les individus, toutes les variables, sur les 3 premières CP


@@ -0,0 +1,13 @@
Version: 1.0
RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default
EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8
RnwWeave: Sweave
LaTeX: pdfLaTeX


@@ -0,0 +1,9 @@
;INF05;S0510;S1020;S2035;S3550;SUP50
ARIE;870;330;730;680;470;890
AVER;820;1260;2460;3330;2170;2960
H.G.;2290;1070;1420;1830;1260;2330
GERS;1650;890;1350;2540;2090;3230
LOT;1940;1130;1750;1660;770;1140
H.P.;2110;1170;1640;1500;550;430
TARN;1770;820;1260;2010;1680;2090
T.G;1740;920;1560;2210;990;1240


@@ -0,0 +1,13 @@
Version: 1.0
RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default
EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8
RnwWeave: Sweave
LaTeX: pdfLaTeX


@@ -0,0 +1,249 @@
---
title: "TP5_Enonce"
author: ''
date: ''
output:
pdf_document: default
html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
rm(list=ls())
library(FactoMineR)
```
----------------------------------------------------------------------------------------
Exercice 1
AFC sur le lien entre couleur des cheveux et ceux des yeux
```{r}
data("HairEyeColor")
```
```{r}
HairEyeColor
```
```{r}
data <- apply(HairEyeColor, c(1, 2), sum)
n <- sum(data)
data
```
```{r}
barplot(data,beside=TRUE,legend.text =rownames(data),main="Effectifs observés",col=c("black","brown","red","yellow"))
```
1) Commentez le barplot ci-dessus ? S'attend on à une situation d'indépendance ?
On voit que la couleur des yeux a une incidence sur la couleur des cheveux car il n'y a pas la même proportion de blond pour les yeux bleus que pour les autres couleurs de yeux. On peut donc s'attendre à une situation de dépendance entre ces deux variables.
2) Etudiez cette situation par un test du chi-deux d'indépendance
```{r}
test <- chisq.test(data)
test
```
3) Affichez le tableau des effectifs théoriques et la contribution moyenne
```{r}
test$expected
n_cases <- ncol(data) * nrow(data)
contrib_moy <- 100/n_cases
contrib_moy
```
4) Calculer le tableau des contributions au khi-deux
```{r}
contribs <- (test$observed - test$expected)**2 / test$expected * 100/test$statistic
contribs
```
5) Calculer le tableau des probabilités associé au tableau de contingence.
```{r}
prob <- data/sum(data)
prob
```
6) Calculer le tableau des profils lignes et le profil moyen associé.
-> Le profil ligne est une probabilité conditionnelle.
```{r}
marginale_ligne <- apply(prob, 1, sum)
profil_ligne <- prob / marginale_ligne
profil_ligne_moyen <- apply(prob, 2, sum)
marginale_ligne
profil_ligne
profil_ligne_moyen
```
7) Calculer le tableau des profils colonnes et le profil moyen associé.
```{r}
marginale_colonne <- apply(prob, 2, sum)
profil_colonne <- t(t(prob) / marginale_colonne)
profil_colonne_moyen <- apply(prob, 1, sum)
marginale_colonne
profil_colonne
profil_colonne_moyen
```
8) Que vaut l'inertie du nuage des profils lignes ? Celle du nuage des profils colonnes ?
-> inertie : la variance des profils par rapport au profil moyen. l'inertie des lignes et la même que celle des colonnes. I = chi2/Nombre d'individus
```{r}
inertie <- test$statistic/sum(data)
inertie
```
9) Lancer une AFC avec FactoMineR
```{r}
library(FactoMineR)
res.afc<-CA(data)
summary(res.afc)
plot(res.afc, invisible = "row")
plot(res.afc, invisible = "col")
```
```{r}
```
10) Faire la construcution des éboulis des valeurs propres
```{r}
eigen_values <- res.afc$eig
bplot <- barplot(
eigen_values[, 1],
names.arg = 1:nrow(eigen_values),
main = "Eboulis des valeurs propres",
xlab = "Principal Components",
ylab = "Eigenvalues",
col = "lightblue"
)
lines(x = bplot, eigen_values[, 1], type = "b", col = "red")
abline(h=1, col = "darkgray", lty = 5)
```
11) Effectuer l'analyse des correspondances
----------------------------------------------------------------------------------------
Exercice 2
AFC sur la répartition des tâches ménagères dans un foyer
```{r}
data<-read.table("housetasks.csv",sep=";",header = TRUE)
data
```
```{r}
barplot(as.matrix(data),beside=TRUE,legend.text=rownames(data),main="Effectifs observés",col=rainbow(length(rownames(data))))
```
1) Commentez le barplot ci-dessus ? S'attend on à une situation d'indépendance ?
On voit que la place dans la famille a une incidence sur les taches de la famille car il n'y a pas la même proportion de Laundry chez la femme que pour les autres membres de la famille. On peut donc s'attendre à une situation de dépendance entre ces deux variables.
2) Etudiez cette situation par un test du chi-deux d'indépendance
```{r}
data_house <- apply(data, c(1, 2), sum)
test_house <- chisq.test(data_house)
test_house
```
3) Affichez le tableau des effectifs théoriques et la contribution moyenne
```{r}
test_house$expected
n_cases <- ncol(data_house) * nrow(data_house)
contrib_moy_house <- 100/n_cases
contrib_moy_house
```
4) Calculer le tableau des contributions au khi-deux
```{r}
contrib_house <- (test_house$observed - test_house$expected)**2 / test_house$expected * 100/test_house$statistic
contrib_house
```
5) Calculer le tableau des probabilités associé au tableau de contingence.
```{r}
proba_house <- data_house / sum(data_house)
proba_house
```
6) Calculer le tableau des profils lignes et le profil moyen associé.
```{r}
marginale_ligne <- apply(proba_house, 1, sum)
profil_ligne <- proba_house / marginale_ligne
profil_ligne_moyen <- apply(proba_house, 2, sum)
marginale_ligne
profil_ligne
profil_ligne_moyen
```
7) Calculer le tableau des profils colonnes et le profil moyen associé.
```{r}
marginale_colonne <- apply(proba_house, 2, sum)
profil_colonne <- t(t(proba_house) / marginale_colonne)
profil_colonne_moyen <- apply(proba_house, 1, sum)
marginale_colonne
profil_colonne
profil_colonne_moyen
```
8) Que vaut l'inertie du nuage des profils lignes ? Celle du nuage des profils colonnes ?
```{r}
inertie <- test_house$statistic / sum(data_house)
inertie
```
9) Lancer une AFC avec FactoMineR
```{r}
res.afc<-CA(data)
summary(res.afc,nbelements = Inf)
plot(res.afc, invisible = "row")
plot(res.afc, invisible = "col")
```
10) Faire la construcution des éboulis des valeurs propres
```{r}
eigen_values <- res.afc$eig
bplot <- barplot(
eigen_values[, 1],
names.arg = 1:nrow(eigen_values),
main = "Eboulis des valeurs propres",
xlab = "Principal Components",
ylab = "Eigenvalues",
col = "lightblue"
)
lines(x = bplot, eigen_values[, 1], type = "b", col = "red")
abline(h=1, col = "darkgray", lty = 5)
```
11) Effectuer l'analyse des correspondances
Axe 1 : taches pour les femmes a gauche et les maris a droite
Axe 2 : taches individuelles en haut, taches collectives au milieu et en bas


@@ -0,0 +1,14 @@
"Wife";"Alternating";"Husband";"Jointly"
"Laundry";156;14;2;4
"Main_meal";124;20;5;4
"Dinner";77;11;7;13
"Breakfeast";82;36;15;7
"Tidying";53;11;1;57
"Dishes";32;24;4;53
"Shopping";33;23;9;55
"Official";12;46;23;15
"Driving";10;51;75;3
"Finances";13;13;21;66
"Insurance";8;1;53;77
"Repairs";0;3;160;2
"Holidays";0;1;6;153


@@ -9,9 +9,9 @@
"%matplotlib inline\n",
"%config InlineBackend.figure_format = 'retina'\n",
"\n",
"import numpy as np # pour les numpy array\n",
"import matplotlib.pyplot as plt # librairie graphique\n",
"from scipy.integrate import odeint # seulement odeint"
"import matplotlib.pyplot as plt # librairie graphique\n",
"import numpy as np # pour les numpy array\n",
"from scipy.integrate import odeint # seulement odeint"
]
},
{
@@ -103,25 +103,25 @@
"source": [
"# Initialisation des variables\n",
"T = 130\n",
"t = np.arange(0, T+1)\n",
"t = np.arange(0, T + 1)\n",
"\n",
"K = 50 # capacité d'accueil maximale du milieu\n",
"K = 50 # capacité d'accueil maximale du milieu\n",
"K_star = 50 # capacité minimale pour maintenir l'espèce\n",
"r = 0.1 # taux de corissance de la capacité d'accueil du milieu.\n",
"t_fl = 30 # la find ed la période de formation.\n",
"K0 = 1 # Valeur initiale de la capacité d'accueil du milieu.\n",
"r = 0.1 # taux de corissance de la capacité d'accueil du milieu.\n",
"t_fl = 30 # la find ed la période de formation.\n",
"K0 = 1 # Valeur initiale de la capacité d'accueil du milieu.\n",
"\n",
"\n",
"def C(t):\n",
" \"\"\"\n",
" Fonction retournant la solution exacte du problème au temps t\n",
" \"\"\"\n",
" \"\"\"Fonction retournant la solution exacte du problème au temps t.\"\"\"\n",
" return K_star + K / (1 + (K / K0 - 1) * np.exp(-r * (t - t_fl)))\n",
"\n",
"\n",
"# On trace le graphique de la solution exacte\n",
"plt.plot(t, C(t), label=\"C(t)\")\n",
"plt.hlines(K_star, 0, T, linestyle='dotted', label=\"C = K*\", color='red')\n",
"plt.hlines(K + K_star, 0, T, linestyle='dotted', label=\"C = K + K*\", color='green')\n",
"plt.plot(t_fl, K0 + K_star, 'o', label=\"(t_fl, K0 + K*)\")\n",
"plt.hlines(K_star, 0, T, linestyle=\"dotted\", label=\"C = K*\", color=\"red\")\n",
"plt.hlines(K + K_star, 0, T, linestyle=\"dotted\", label=\"C = K + K*\", color=\"green\")\n",
"plt.plot(t_fl, K0 + K_star, \"o\", label=\"(t_fl, K0 + K*)\")\n",
"plt.legend()\n",
"plt.xlim(0, 130)\n",
"plt.suptitle(\"Courbe de la solution exacte du problème\")\n",
@@ -133,14 +133,14 @@
"N0 = 10\n",
"r_N = 0.2\n",
"\n",
"\n",
"def dN(N, t, C_sol):\n",
" \"\"\"\n",
" Fonction calculant la dérivée de la solution approchée du problème à l'instant t dépendant de N(t) et de C(t)\n",
" \"\"\"\n",
" \"\"\"Fonction calculant la dérivée de la solution approchée du problème à l'instant t dépendant de N(t) et de C(t).\"\"\"\n",
" return r_N * N * (1 - N / C_sol(t))\n",
"\n",
"\n",
"t = np.linspace(0, T, 200)\n",
"N_sol = odeint(dN, N0, t, args=(C,)) # On calcule la solution a l'aide de odeint\n",
"N_sol = odeint(dN, N0, t, args=(C,)) # On calcule la solution a l'aide de odeint\n",
"\n",
"# On trace le graphique de la solution approchée en comparaison à la solution exacte\n",
"plt.plot(t, N_sol, label=\"Solution approchée\")\n",
@@ -219,42 +219,53 @@
"T, N = 200, 100\n",
"H0, P0 = 1500, 500\n",
"\n",
"def F(X, t, a, b, c, d, p):\n",
" \"\"\"Fonction second membre pour le système\"\"\"\n",
" x, y = X\n",
" return np.array([x * (a - p - b*y), y * (-c - p + d*x)])\n",
"\n",
"t = np.linspace(0, T, N+1)\n",
"sardines, requins = np.meshgrid(\n",
" np.linspace(0.1, 3000, 20),\n",
" np.linspace(0.1, 4500, 30)\n",
")\n",
"def F(X, t, a, b, c, d, p):\n",
" \"\"\"Fonction second membre pour le système.\"\"\"\n",
" x, y = X\n",
" return np.array([x * (a - p - b * y), y * (-c - p + d * x)])\n",
"\n",
"\n",
"t = np.linspace(0, T, N + 1)\n",
"sardines, requins = np.meshgrid(np.linspace(0.1, 3000, 20), np.linspace(0.1, 4500, 30))\n",
"fsardines = F((sardines, requins), t, a, b, c, d, 0)[0]\n",
"frequins = F((sardines, requins), t, a, b, c, d, 0)[1]\n",
"n_sndmb = np.sqrt(fsardines**2 + frequins**2) \n",
"n_sndmb = np.sqrt(fsardines**2 + frequins**2)\n",
"\n",
"# On crée une figure à trois graphiques\n",
"fig = plt.figure(figsize=(12, 6))\n",
"ax = fig.add_subplot(1, 2, 2) # subplot pour le champ de vecteurs et le graphe sardines vs requins\n",
"axr = fig.add_subplot(2, 2, 1) # subplot pour le graphe du nombre de requins en fonction du temps\n",
"axs = fig.add_subplot(2, 2, 3) # subplot pour le graphe du nombre de sardines en fonction du temps \n",
"ax.quiver(sardines, requins, fsardines/n_sndmb, frequins/n_sndmb)\n",
"ax = fig.add_subplot(\n",
" 1,\n",
" 2,\n",
" 2,\n",
") # subplot pour le champ de vecteurs et le graphe sardines vs requins\n",
"axr = fig.add_subplot(\n",
" 2,\n",
" 2,\n",
" 1,\n",
") # subplot pour le graphe du nombre de requins en fonction du temps\n",
"axs = fig.add_subplot(\n",
" 2,\n",
" 2,\n",
" 3,\n",
") # subplot pour le graphe du nombre de sardines en fonction du temps\n",
"ax.quiver(sardines, requins, fsardines / n_sndmb, frequins / n_sndmb)\n",
"\n",
"list_p = [0, 0.02, 0.04, 0.06]\n",
"for k, pk in enumerate(list_p):\n",
" couleur = (0, k/len(list_p), 1-k/len(list_p))\n",
" couleur = (0, k / len(list_p), 1 - k / len(list_p))\n",
" X = odeint(F, np.array([H0, P0]), t, args=(a, b, c, d, pk))\n",
" \n",
" # Tracer la courbe parametrée (H(t),P(t)) \n",
"\n",
" # Tracer la courbe parametrée (H(t),P(t))\n",
" ax.plot(X[:, 0], X[:, 1], linewidth=2, color=couleur, label=f\"$p={pk}$\")\n",
" \n",
" # Tracer H en fonction du temps \n",
"\n",
" # Tracer H en fonction du temps\n",
" axs.plot(t, X[:, 0], label=f\"Sardines pour p={pk}\", color=couleur)\n",
" \n",
"\n",
" # Tracer P en fonction du temps\n",
" axr.plot(t, X[:, 1], label=f\"Requins pour p={pk}\", color=couleur)\n",
" \n",
"ax.axis('equal')\n",
"\n",
"ax.axis(\"equal\")\n",
"ax.set_title(\"Champ de vecteur du problème de Lotka-Volterra\")\n",
"ax.set_xlabel(\"Sardines\")\n",
"ax.set_ylabel(\"Requins\")\n",
@@ -266,8 +277,8 @@
"axs.set_xlabel(\"Temps t\")\n",
"axs.set_ylabel(\"Sardines\")\n",
"\n",
"axs.set_title('Evolution des sardines')\n",
"axr.set_title('Evolution des requins')\n",
"axs.set_title(\"Evolution des sardines\")\n",
"axr.set_title(\"Evolution des requins\")\n",
"plt.show()"
]
},
@@ -308,12 +319,10 @@
"outputs": [],
"source": [
"def crank_nicolson(y0, T, N, r):\n",
" \"\"\"\n",
" schéma de Crank-Nicolson pour le modèle de Malthus \n",
" \n",
" \"\"\"schéma de Crank-Nicolson pour le modèle de Malthus.\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" y0: float\n",
" donnée initiale\n",
" T: float\n",
@@ -325,34 +334,32 @@
"\n",
" Returns\n",
" -------\n",
" \n",
" t: ndarray\n",
" les instants où la solution approchée est calculée\n",
" y: ndarray\n",
" les valeurs de la solution approchée par le theta-schema\n",
"\n",
" \"\"\"\n",
" \n",
" dt = T / N\n",
" t = np.zeros(N+1)\n",
" y = np.zeros(N+1)\n",
" t = np.zeros(N + 1)\n",
" y = np.zeros(N + 1)\n",
" tk, yk = 0, y0\n",
" y[0] = yk\n",
" \n",
"\n",
" for n in range(N):\n",
" tk += dt\n",
" yk *= (2 + dt * r) / (2 - dt * r)\n",
" y[n+1] = yk\n",
" t[n+1] = tk\n",
" \n",
" y[n + 1] = yk\n",
" t[n + 1] = tk\n",
"\n",
" return t, y\n",
"\n",
"\n",
"def euler_explicit(y0, T, N, r):\n",
" \"\"\"\n",
" schéma de d'Euler pour le modèle de Malthus \n",
" \n",
" \"\"\"schéma de d'Euler pour le modèle de Malthus.\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" y0: float\n",
" donnée initiale\n",
" T: float\n",
@@ -364,30 +371,29 @@
"\n",
" Returns\n",
" -------\n",
" \n",
" t: ndarray\n",
" les instants où la solution approchée est calculée\n",
" y: ndarray\n",
" les valeurs de la solution approchée par le theta-schema\n",
"\n",
" \"\"\"\n",
" dt = T / N\n",
" t = np.zeros(N+1)\n",
" y = np.zeros(N+1)\n",
" t = np.zeros(N + 1)\n",
" y = np.zeros(N + 1)\n",
" tk, yk = 0, y0\n",
" y[0] = yk\n",
" \n",
"\n",
" for n in range(N):\n",
" tk += dt\n",
" yk += dt * r * yk\n",
" y[n+1] = yk\n",
" t[n+1] = tk\n",
" \n",
" y[n + 1] = yk\n",
" t[n + 1] = tk\n",
"\n",
" return t, y\n",
"\n",
"\n",
"def solution_exacte(t):\n",
" \"\"\"\n",
" Fonction calculant la solution exacte du modèle de Malthus à l'instant t \n",
" \"\"\"\n",
" \"\"\"Fonction calculant la solution exacte du modèle de Malthus à l'instant t.\"\"\"\n",
" return y0 * np.exp(r * t)"
]
},
@@ -436,23 +442,28 @@
"# Schéma d'Euler explicite\n",
"ax = fig.add_subplot(1, 2, 1)\n",
"for n in liste_N:\n",
" t, y = euler_explicit(y0, T, n, r) # On calcule la fonction Euler pour chaque n\n",
" t, y = euler_explicit(y0, T, n, r) # On calcule la fonction Euler pour chaque n\n",
" ax.scatter(t, y, label=f\"Solution approchée pour N={n}\")\n",
" \n",
"ax.plot(t_exact, solution_exacte(t_exact), label='Solution exacte')\n",
"\n",
"ax.plot(t_exact, solution_exacte(t_exact), label=\"Solution exacte\")\n",
"ax.legend()\n",
"ax.axis('equal')\n",
"ax.axis(\"equal\")\n",
"ax.set_title(\"Schéma d'Euler explicite\")\n",
"ax.set_xlabel(\"Temps t\")\n",
"ax.set_ylabel(\"y\")\n",
" \n",
"\n",
"\n",
"# Schéma de Crank-Nicolson\n",
"ax = fig.add_subplot(1, 2, 2)\n",
"for n in liste_N:\n",
" t, y = crank_nicolson(y0, T, n, r) # On calcule la fonction Crank-Nicolson pour chaque n\n",
" t, y = crank_nicolson(\n",
" y0,\n",
" T,\n",
" n,\n",
" r,\n",
" ) # On calcule la fonction Crank-Nicolson pour chaque n\n",
" ax.scatter(t, y, label=f\"Solution approchée pour N={n}\")\n",
"ax.plot(t_exact, solution_exacte(t_exact), label='Solution exacte')\n",
"ax.plot(t_exact, solution_exacte(t_exact), label=\"Solution exacte\")\n",
"ax.legend()\n",
"ax.set_title(\"Schéma de Crank-Nicolson\")\n",
"ax.set_xlabel(\"Temps t\")\n",
@@ -504,7 +515,7 @@
" t, sol_appr = crank_nicolson(y0, T, n, r)\n",
" sol_ex = solution_exacte(t)\n",
" erreur = np.max(np.abs(sol_appr - sol_ex))\n",
" #erreur = np.linalg.norm(sol_appr - sol_ex, np.inf)\n",
" # erreur = np.linalg.norm(sol_appr - sol_ex, np.inf)\n",
" print(f\"Delta_t = {T / N:10.3e}, e = {erreur:10.3e}\")\n",
" liste_erreur[k] = erreur\n",
"\n",
@@ -514,7 +525,7 @@
"ax.scatter(liste_delta, liste_erreur, color=\"black\")\n",
"for p in [0.5, 1, 2]:\n",
" C = liste_erreur[-1] / (liste_delta[-1] ** p)\n",
" plt.plot(liste_delta, C * liste_delta ** p, label=f\"$p={p}$\")\n",
" plt.plot(liste_delta, C * liste_delta**p, label=f\"$p={p}$\")\n",
"ax.set_title(\"Erreur du schéma de Crank-Nicolson\")\n",
"ax.set_xlabel(r\"$\\Delta t$\")\n",
"ax.set_ylabel(r\"$e(\\Delta t)$\")\n",

View File

@@ -19,8 +19,8 @@
},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt"
"import matplotlib.pyplot as plt\n",
"import numpy as np"
]
},
{
@@ -151,24 +151,27 @@
],
"source": [
"def M(x):\n",
" \"\"\"\n",
" Retourne la matrice du système (2)\n",
" \n",
" \"\"\"Retourne la matrice du système (2).\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" x: ndarray\n",
" vecteurs contenant les valeurs [x0, x1, ..., xN]\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
" out: ndarray\n",
" matrice du système (2)\n",
"\n",
" \"\"\"\n",
" h = x[1:] - x[:-1] # x[i+1] - x[i]\n",
" return np.diag(2*(1/h[:-1] + 1/h[1:])) + np.diag(1/h[1:-1], k=-1) + np.diag(1/h[1:-1], k=1)\n",
" \n",
" h = x[1:] - x[:-1] # x[i+1] - x[i]\n",
" return (\n",
" np.diag(2 * (1 / h[:-1] + 1 / h[1:]))\n",
" + np.diag(1 / h[1:-1], k=-1)\n",
" + np.diag(1 / h[1:-1], k=1)\n",
" )\n",
"\n",
"\n",
"# Test\n",
"print(M(np.array([0, 1, 2, 3, 4])))"
]
@@ -189,12 +192,10 @@
"outputs": [],
"source": [
"def sprime(x, y, p0, pN):\n",
" \"\"\"\n",
" Retourne la solution du système (2)\n",
" \n",
" \"\"\"Retourne la solution du système (2).\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" x: ndarray\n",
" vecteurs contenant les valeurs [x0, x1, ..., xN]\n",
" y: ndarray\n",
@@ -203,18 +204,18 @@
" première valeur du vecteur p\n",
" pN: int\n",
" N-ième valeur du vecteur p\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
" out: ndarray\n",
" solution du système (2)\n",
"\n",
" \"\"\"\n",
" h = x[1:] - x[:-1]\n",
" delta_y = (y[1:] - y[:-1]) / h\n",
" c = 3 * (delta_y[1:]/h[1:] + delta_y[:-1]/h[:-1])\n",
" c[0] -= p0/h[0]\n",
" c[-1] -= pN/h[-1]\n",
" c = 3 * (delta_y[1:] / h[1:] + delta_y[:-1] / h[:-1])\n",
" c[0] -= p0 / h[0]\n",
" c[-1] -= pN / h[-1]\n",
" return np.linalg.solve(M(x), c)"
]
},
@@ -271,54 +272,52 @@
],
"source": [
"def f(x):\n",
" \"\"\"\n",
" Retourne la fonction f évaluée aux points x\n",
" \n",
" \"\"\"Retourne la fonction f évaluée aux points x.\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" x: ndarray\n",
" vecteurs contenant les valeurs [x0, x1, ..., xN]\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
" out: ndarray\n",
" Valeur de la fonction f aux points x\n",
"\n",
" \"\"\"\n",
" return 1 / (1 + x**2)\n",
"\n",
"\n",
"def fprime(x):\n",
" \"\"\"\n",
" Retourne la fonction dérivée de f évaluée aux points x\n",
" \n",
" \"\"\"Retourne la fonction dérivée de f évaluée aux points x.\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" x: ndarray\n",
" vecteurs contenant les valeurs [x0, x1, ..., xN]\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
" out: ndarray\n",
" Valeur de la fonction dérivée de f aux points x\n",
" \"\"\"\n",
" return -2*x/((1+x**2)**2)\n",
"\n",
"# Paramètres \n",
" \"\"\"\n",
" return -2 * x / ((1 + x**2) ** 2)\n",
"\n",
"\n",
"# Paramètres\n",
"xx = np.linspace(-5, 5, 200)\n",
"x = np.linspace(-5, 5, 21)\n",
"pi = sprime(x, f(x), fprime(-5), fprime(5))\n",
"\n",
"# Graphique\n",
"fig, ax = plt.subplots(figsize=(6, 6))\n",
"ax.plot(xx, fprime(xx), label=f'$f\\'$', color='red')\n",
"ax.scatter(x[1:-1], pi, label=f'$p_i$')\n",
"ax.plot(xx, fprime(xx), label=\"$f'$\", color=\"red\")\n",
"ax.scatter(x[1:-1], pi, label=\"$p_i$\")\n",
"ax.legend()\n",
"ax.set_xlabel(f'$x$')\n",
"ax.set_ylabel(f'$f(x)$')\n",
"ax.set_title('Les pentes de la spline cubique')"
"ax.set_xlabel(\"$x$\")\n",
"ax.set_ylabel(\"$f(x)$\")\n",
"ax.set_title(\"Les pentes de la spline cubique\")"
]
},
{
@@ -361,12 +360,10 @@
"outputs": [],
"source": [
"def splines(x, y, p0, pN):\n",
" \"\"\"\n",
" Retourne la matrice S de taille (4, N)\n",
" \n",
" \"\"\"Retourne la matrice S de taille (4, N).\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" x: ndarray\n",
" vecteurs contenant les valeurs [x0, x1, ..., xN]\n",
" y: ndarray\n",
@@ -375,20 +372,20 @@
" première valeur du vecteur p\n",
" pN: int\n",
" N-ième valeur du vecteur p\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
" out: ndarray\n",
" Matrice S de taille (4, N) tel que la i-ième ligne contient les valeurs a_i, b_i, c_i et d_i\n",
"\n",
" \"\"\"\n",
" h = x[1:] - x[:-1]\n",
" delta_y = (y[1:] - y[:-1]) / h\n",
" \n",
"\n",
" a = y\n",
" b = np.concatenate((np.array([p0]), sprime(x, y, p0, pN), np.array([pN])))\n",
" c = 3/h * delta_y - (b[1:] + 2*b[:-1]) / h\n",
" d = 1/h**2 * (b[1:] + b[:-1]) - 2/h**2 * delta_y\n",
" c = 3 / h * delta_y - (b[1:] + 2 * b[:-1]) / h\n",
" d = 1 / h**2 * (b[1:] + b[:-1]) - 2 / h**2 * delta_y\n",
" return np.transpose([a[:-1], b[:-1], c, d])"
]
},
@@ -412,35 +409,36 @@
},
"outputs": [],
"source": [
"def spline_eval( x, xx, S ):\n",
" \"\"\"\n",
" Evalue une spline définie par des noeuds équirepartis\n",
" \n",
"def spline_eval(x, xx, S):\n",
" \"\"\"Evalue une spline définie par des noeuds équirepartis.\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" x: ndarray\n",
" noeuds définissant la spline\n",
" \n",
"\n",
" xx: ndarray\n",
" abscisses des points d'évaluation\n",
" \n",
"\n",
" S: ndarray\n",
" de taille (x.size-1, 4)\n",
" tableau dont la i-ème ligne contient les coéficients du polynome cubique qui est la restriction\n",
" de la spline à l'intervalle [x_i, x_{i+1}]\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
" ndarray\n",
" ordonnées des points d'évaluation\n",
"\n",
" \"\"\"\n",
" ind = ( np.floor( ( xx - x[ 0 ] ) / ( x[ 1 ] - x[ 0 ] ) ) ).astype( int )\n",
" ind = np.where( ind == x.size-1, ind - 1 , ind )\n",
" yy = S[ ind, 0 ] + S[ ind, 1 ] * ( xx - x[ ind ] ) + \\\n",
" S[ ind, 2 ] * ( xx - x[ ind ] )**2 + S[ ind, 3 ] * ( xx - x[ ind ] )**3\n",
" return yy"
" ind = (np.floor((xx - x[0]) / (x[1] - x[0]))).astype(int)\n",
" ind = np.where(ind == x.size - 1, ind - 1, ind)\n",
" return (\n",
" S[ind, 0]\n",
" + S[ind, 1] * (xx - x[ind])\n",
" + S[ind, 2] * (xx - x[ind]) ** 2\n",
" + S[ind, 3] * (xx - x[ind]) ** 3\n",
" )"
]
},
{
@@ -472,21 +470,21 @@
}
],
"source": [
"# Paramètres \n",
"# Paramètres\n",
"x = np.linspace(-5, 5, 6)\n",
"y = np.random.rand(5+1)\n",
"y = np.random.rand(5 + 1)\n",
"xx = np.linspace(-5, 5, 200)\n",
"s = splines(x, y, 0, 0)\n",
"s_eval = spline_eval(x, xx, s)\n",
"\n",
"# Graphique\n",
"fig, ax = plt.subplots(figsize=(6, 6))\n",
"ax.plot(xx, s_eval, label='spline cubique interpolateur', color='red')\n",
"ax.scatter(x, y, label=f'$(x_i, y_i)$')\n",
"ax.plot(xx, s_eval, label=\"spline cubique interpolateur\", color=\"red\")\n",
"ax.scatter(x, y, label=\"$(x_i, y_i)$\")\n",
"ax.legend()\n",
"ax.set_xlabel(f'$x$')\n",
"ax.set_ylabel(f'$f(x)$')\n",
"ax.set_title('Evaluation de la spline cubique')"
"ax.set_xlabel(\"$x$\")\n",
"ax.set_ylabel(\"$f(x)$\")\n",
"ax.set_title(\"Evaluation de la spline cubique\")"
]
},
{
@@ -525,7 +523,7 @@
}
],
"source": [
"# Paramètres \n",
"# Paramètres\n",
"a, b = -5, 5\n",
"N_list = [4, 9, 19]\n",
"\n",
@@ -533,17 +531,17 @@
"fig, ax = plt.subplots(figsize=(15, 6))\n",
"\n",
"for N in N_list:\n",
" x = np.linspace(a, b, N+1)\n",
" x = np.linspace(a, b, N + 1)\n",
" xx = np.linspace(a, b, 200)\n",
" s = splines(x, f(x), 0, 0)\n",
" s_eval = spline_eval(x, xx, s)\n",
" ax.plot(xx, s_eval, label=f'Spline cubique interpolateur pour N={N}')\n",
" ax.scatter(x, f(x), label=f'f(x) pour N={N}')\n",
" \n",
" ax.plot(xx, s_eval, label=f\"Spline cubique interpolateur pour N={N}\")\n",
" ax.scatter(x, f(x), label=f\"f(x) pour N={N}\")\n",
"\n",
"ax.legend()\n",
"ax.set_xlabel(f'$x$')\n",
"ax.set_ylabel(f'$f(x)$')\n",
"ax.set_title('Evaluation de la spline cubique')"
"ax.set_xlabel(\"$x$\")\n",
"ax.set_ylabel(\"$f(x)$\")\n",
"ax.set_title(\"Evaluation de la spline cubique\")"
]
},
{

View File

@@ -19,10 +19,10 @@
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"from scipy.special import roots_legendre\n",
"from scipy.integrate import quad\n",
"import matplotlib.pyplot as plt"
"from scipy.special import roots_legendre"
]
},
{
@@ -60,17 +60,20 @@
},
"outputs": [],
"source": [
"def f0(x): \n",
"def f0(x):\n",
" return np.exp(x)\n",
"\n",
"\n",
"def f1(x):\n",
" return 1 / (1 + 16*np.power(x, 2))\n",
" return 1 / (1 + 16 * np.power(x, 2))\n",
"\n",
"\n",
"def f2(x):\n",
" return np.power(np.abs(x**2 - 1/4), 3)\n",
" return np.power(np.abs(x**2 - 1 / 4), 3)\n",
"\n",
"\n",
"def f3(x):\n",
" return np.power(np.abs(x+1/2), 1/2)"
" return np.power(np.abs(x + 1 / 2), 1 / 2)"
]
},
{
@@ -176,10 +179,12 @@
],
"source": [
"for f in [f0, f1, f2, f3]:\n",
" print(f'Calcule de I(f) par la méthode de gauss et par la formule quadratique pour la fonction {f.__name__}')\n",
" print(\n",
" f\"Calcule de I(f) par la méthode de gauss et par la formule quadratique pour la fonction {f.__name__}\",\n",
" )\n",
" for n in range(1, 11):\n",
" print(f\"Pour n = {n}, gauss = {gauss(f, n)} et quad = {quad(f, -1, 1)[0]}\")\n",
" print('')"
" print()"
]
},
{
@@ -210,11 +215,12 @@
"source": [
"def simpson(f, N):\n",
" if N % 2 == 0:\n",
" raise ValueError(\"N doit est impair.\")\n",
" \n",
" msg = \"N doit est impair.\"\n",
" raise ValueError(msg)\n",
"\n",
" h = 2 / (2 * (N - 1) // 2)\n",
" fx = f(np.linspace(-1, 1, N))\n",
" \n",
"\n",
" return (h / 3) * (fx[0] + 4 * fx[1:-1:2].sum() + 2 * fx[2:-1:2].sum() + fx[-1])"
]
},
@@ -270,10 +276,12 @@
],
"source": [
"for f in [f0, f1, f2, f3]:\n",
" print(f'Calcule de I(f) par la méthode de simpson et par la formule quadratique pour la fonction {f.__name__}')\n",
" print(\n",
" f\"Calcule de I(f) par la méthode de simpson et par la formule quadratique pour la fonction {f.__name__}\",\n",
" )\n",
" for n in range(3, 16, 2):\n",
" print(f\"Pour n = {n}, simpson = {simpson(f, n)} et quad = {quad(f, -1, 1)[0]}\")\n",
" print('')"
" print()"
]
},
{
@@ -333,10 +341,9 @@
"def poly_tchebychev(x, N):\n",
" if N == 0:\n",
" return np.ones_like(x)\n",
" elif N == 1:\n",
" if N == 1:\n",
" return x\n",
" else:\n",
" return 2 * x * poly_tchebychev(x, N-1) - poly_tchebychev(x, N-2)"
" return 2 * x * poly_tchebychev(x, N - 1) - poly_tchebychev(x, N - 2)"
]
},
{
@@ -346,7 +353,7 @@
"outputs": [],
"source": [
"def points_tchebychev(N):\n",
" k = np.arange(1, N+1)\n",
" k = np.arange(1, N + 1)\n",
" return np.cos((2 * k - 1) * np.pi / (2 * N))"
]
},
@@ -414,7 +421,7 @@
" print(f\"Pour N = {n}\")\n",
" print(f\"Les points de Tchebychev sont {xk}\")\n",
" print(f\"L'evaluation du polynome de Tchebychev Tn en ces points est {Tn}\")\n",
" print(\"\")"
" print()"
]
},
{
@@ -449,7 +456,7 @@
" lamk = np.zeros(N)\n",
" for k in range(N):\n",
" s = 0\n",
" for m in range(1, N//2+1):\n",
" for m in range(1, N // 2 + 1):\n",
" T = poly_tchebychev(xk[k], 2 * m)\n",
" s += 2 * T / (4 * np.power(m, 2) - 1)\n",
" lamk[k] = 2 / N * (1 - s)\n",
@@ -529,10 +536,12 @@
],
"source": [
"for f in [f0, f1, f2, f3]:\n",
" print(f'Calcule de I(f) par la méthode de fejer et par la formule quadratique pour la fonction {f.__name__}')\n",
" print(\n",
" f\"Calcule de I(f) par la méthode de fejer et par la formule quadratique pour la fonction {f.__name__}\",\n",
" )\n",
" for n in range(1, 11):\n",
" print(f\"Pour n = {n}, fejer = {fejer(f, n)} et quad = {quad(f, -1, 1)[0]}\")\n",
" print('')"
" print()"
]
},
{
@@ -568,26 +577,47 @@
"figure = plt.figure(figsize=(15, 10))\n",
"for fi, f in enumerate([f0, f1, f2, f3]):\n",
" error_gauss = np.zeros((N,))\n",
" error_simp = np.zeros(((N-1)//2,))\n",
" error_simp = np.zeros(((N - 1) // 2,))\n",
" error_fejer = np.zeros((N,))\n",
" I_quad, _ = quad(f, -1, 1)\n",
" \n",
" for n in range(1, N+1):\n",
"\n",
" for n in range(1, N + 1):\n",
" I_gauss = gauss(f, n)\n",
" error_gauss[n-1] = np.abs(I_gauss - I_quad)\n",
" error_gauss[n - 1] = np.abs(I_gauss - I_quad)\n",
" I_fejer = fejer(f, n)\n",
" error_fejer[n-1] = np.abs(I_fejer - I_quad)\n",
" \n",
" for n in range( 3, N+1, 2 ):\n",
" error_fejer[n - 1] = np.abs(I_fejer - I_quad)\n",
"\n",
" for n in range(3, N + 1, 2):\n",
" I_simp = simpson(f, n)\n",
" error_simp[(n-2)//2] = np.abs(I_simp - I_quad)\n",
" \n",
" error_simp[(n - 2) // 2] = np.abs(I_simp - I_quad)\n",
"\n",
" ax = figure.add_subplot(2, 2, fi + 1)\n",
" ax.scatter(np.arange(1, N+1), np.log10(error_gauss, out = -16. * np.ones(error_gauss.shape), \n",
" where = (error_gauss > 1e-16)), label = 'Gauss', marker=\"+\")\n",
" ax.scatter(np.arange(3, N+1, 2), np.log10( error_simp ), label = 'Simpson', marker=\"+\")\n",
" ax.scatter(np.arange(1, N+1), np.log10(error_fejer, out = -16. * np.ones(error_fejer.shape), \n",
" where = (error_fejer > 1e-16)), label = 'Fejer', marker=\"+\")\n",
" ax.scatter(\n",
" np.arange(1, N + 1),\n",
" np.log10(\n",
" error_gauss,\n",
" out=-16.0 * np.ones(error_gauss.shape),\n",
" where=(error_gauss > 1e-16),\n",
" ),\n",
" label=\"Gauss\",\n",
" marker=\"+\",\n",
" )\n",
" ax.scatter(\n",
" np.arange(3, N + 1, 2),\n",
" np.log10(error_simp),\n",
" label=\"Simpson\",\n",
" marker=\"+\",\n",
" )\n",
" ax.scatter(\n",
" np.arange(1, N + 1),\n",
" np.log10(\n",
" error_fejer,\n",
" out=-16.0 * np.ones(error_fejer.shape),\n",
" where=(error_fejer > 1e-16),\n",
" ),\n",
" label=\"Fejer\",\n",
" marker=\"+\",\n",
" )\n",
" ax.legend()\n",
" ax.set_title(f\"Erreur de différentes méthodes de quadrature pour {f.__name__}\")\n",
" ax.set_xlabel(\"n\")\n",
@@ -672,18 +702,29 @@
"def f(x, k):\n",
" return x**k\n",
"\n",
"\n",
"print(\"-----------------------------------------------------------------------\")\n",
"print(\"{:>5s} | {:>7s} {:>9s} {:>9s} {:>9s} {:>9s} {:>9s}\".format(\"N\", \"x^0\", \"x^2\", \"x^4\", \"x^6\", \"x^8\", \"x^10\"))\n",
"print(\n",
" \"{:>5s} | {:>7s} {:>9s} {:>9s} {:>9s} {:>9s} {:>9s}\".format(\n",
" \"N\",\n",
" \"x^0\",\n",
" \"x^2\",\n",
" \"x^4\",\n",
" \"x^6\",\n",
" \"x^8\",\n",
" \"x^10\",\n",
" ),\n",
")\n",
"print(\"-----------------------------------------------------------------------\")\n",
"\n",
"for N in range(1, 11):\n",
" approx_errors = []\n",
" for k in [x for x in range(0, 11, 2)]:\n",
" for k in list(range(0, 11, 2)):\n",
" I_approx = gauss(lambda x: f(x, k), N)\n",
" I_exact = 2 / (k + 1) if k % 2 == 0 else 0\n",
" approx_error = np.abs(I_approx - I_exact)\n",
" approx_errors.append(approx_error)\n",
" print(\"{:5d} | \".format(N) + \" \".join(\"{:.3f} \".format(e) for e in approx_errors))"
" print(f\"{N:5d} | \" + \" \".join(f\"{e:.3f} \" for e in approx_errors))"
]
},
{
@@ -722,18 +763,29 @@
"def f(x, k):\n",
" return x**k\n",
"\n",
"\n",
"print(\"-----------------------------------------------------------------------\")\n",
"print(\"{:>5s} | {:>7s} {:>9s} {:>9s} {:>9s} {:>9s} {:>9s}\".format(\"N\", \"x^0\", \"x^2\", \"x^4\", \"x^6\", \"x^8\", \"x^10\"))\n",
"print(\n",
" \"{:>5s} | {:>7s} {:>9s} {:>9s} {:>9s} {:>9s} {:>9s}\".format(\n",
" \"N\",\n",
" \"x^0\",\n",
" \"x^2\",\n",
" \"x^4\",\n",
" \"x^6\",\n",
" \"x^8\",\n",
" \"x^10\",\n",
" ),\n",
")\n",
"print(\"-----------------------------------------------------------------------\")\n",
"\n",
"for N in range(1, 11):\n",
" approx_errors = []\n",
" for k in [x for x in range(0, 11, 2)]:\n",
" for k in list(range(0, 11, 2)):\n",
" I_approx = fejer(lambda x: f(x, k), N)\n",
" I_exact = 2 / (k + 1) if k % 2 == 0 else 0\n",
" approx_error = np.abs(I_approx - I_exact)\n",
" approx_errors.append(approx_error)\n",
" print(\"{:5d} | \".format(N) + \" \".join(\"{:.3f} \".format(e) for e in approx_errors))"
" print(f\"{N:5d} | \" + \" \".join(f\"{e:.3f} \" for e in approx_errors))"
]
},
{

File diff suppressed because one or more lines are too long

View File

@@ -22,8 +22,8 @@
"%matplotlib inline\n",
"%config InlineBackend.figure_format = 'retina'\n",
"\n",
"import numpy as np # pour les numpy array\n",
"import matplotlib.pyplot as plt # librairie graphique"
"import matplotlib.pyplot as plt # librairie graphique\n",
"import numpy as np # pour les numpy array"
]
},
{
@@ -72,11 +72,11 @@
" N = len(x)\n",
" M = len(xx)\n",
" L = np.ones((M, N))\n",
" \n",
"\n",
" for i in range(N):\n",
" for j in range(N):\n",
" if i != j:\n",
" L[:, i] *= (xx - x[j])/(x[i]-x[j])\n",
" L[:, i] *= (xx - x[j]) / (x[i] - x[j])\n",
" return L.dot(y)"
]
},
@@ -119,7 +119,7 @@
"y = np.random.rand(N)\n",
"xx = np.linspace(0, 1, 200)\n",
"\n",
"plt.scatter(x,y)\n",
"plt.scatter(x, y)\n",
"plt.plot(xx, interp_Lagrange(x, y, xx))"
]
},
@@ -155,10 +155,16 @@
"outputs": [],
"source": [
"def equirepartis(a, b, N):\n",
" return np.array([a + (b-a) * (i-1)/(N-1) for i in range(1, N+1)])\n",
" \n",
" return np.array([a + (b - a) * (i - 1) / (N - 1) for i in range(1, N + 1)])\n",
"\n",
"\n",
"def tchebychev(a, b, N):\n",
" return np.array([(a+b)/2 + (b-a)/2 * np.cos((2*i-1)/(2*N)*np.pi) for i in range(1, N+1)])"
" return np.array(\n",
" [\n",
" (a + b) / 2 + (b - a) / 2 * np.cos((2 * i - 1) / (2 * N) * np.pi)\n",
" for i in range(1, N + 1)\n",
" ],\n",
" )"
]
},
{
@@ -188,7 +194,7 @@
" L = np.ones_like(xx)\n",
" for j in range(N):\n",
" if i != j:\n",
" L *= (xx-x[j])/(x[i]-x[j]) \n",
" L *= (xx - x[j]) / (x[i] - x[j])\n",
" return L"
]
},
@@ -271,16 +277,16 @@
" ax[0].set_title(f\"Points équi-repartis (N={n})\")\n",
" xeq = equirepartis(a, b, n)\n",
" for i in range(n):\n",
" ax[0].scatter(xeq[i], 0, color='black')\n",
" ax[0].scatter(xeq[i], 1, color='black')\n",
" ax[0].scatter(xeq[i], 0, color=\"black\")\n",
" ax[0].scatter(xeq[i], 1, color=\"black\")\n",
" ax[0].plot(xx, Li(i, xeq, xx))\n",
" ax[0].grid()\n",
" \n",
"\n",
" ax[1].set_title(f\"Points de Tchebychev (N={n})\")\n",
" xchev = tchebychev(a, b, n)\n",
" for i in range(n):\n",
" ax[1].scatter(xchev[i], 0, color='black')\n",
" ax[1].scatter(xchev[i], 1, color='black')\n",
" ax[1].scatter(xchev[i], 0, color=\"black\")\n",
" ax[1].scatter(xchev[i], 1, color=\"black\")\n",
" ax[1].plot(xx, Li(i, xchev, xx))\n",
" ax[1].grid()"
]
@@ -325,20 +331,23 @@
}
],
"source": [
"f = lambda x: 1/(1+x**2)\n",
"def f(x):\n",
" return 1 / (1 + x**2)\n",
"\n",
"\n",
"a, b = -5, 5\n",
"xx = np.linspace(a, b, 200)\n",
"\n",
"plt.plot(xx, f(xx), label='Courbe de f')\n",
"plt.plot(xx, f(xx), label=\"Courbe de f\")\n",
"for n in [5, 10, 20]:\n",
" xeq = equirepartis(a, b, n)\n",
" for i in range(n):\n",
" plt.scatter(xeq[i], f(xeq[i]))\n",
" plt.plot(xx, Li(i, xeq, xx))\n",
" \n",
"\n",
"plt.ylim(-1, 1)\n",
"plt.legend()\n",
"plt.title('Interpolation de f avec Lagrange pour N points répartis')\n",
"plt.title(\"Interpolation de f avec Lagrange pour N points répartis\")\n",
"plt.grid()"
]
},
@@ -366,20 +375,23 @@
}
],
"source": [
"f = lambda x: 1/(1+x**2)\n",
"def f(x):\n",
" return 1 / (1 + x**2)\n",
"\n",
"\n",
"a, b = -5, 5\n",
"xx = np.linspace(a, b, 200)\n",
"\n",
"plt.plot(xx, f(xx), label='Courbe de f')\n",
"plt.plot(xx, f(xx), label=\"Courbe de f\")\n",
"\n",
"for n in [5, 10, 20]:\n",
" xchev = tchebychev(a, b, n)\n",
" for i in range(n):\n",
" plt.scatter(xchev[i], f(xchev[i]))\n",
" plt.plot(xx, Li(i, xchev, xx))\n",
" \n",
"\n",
"plt.legend()\n",
"plt.title('Interpolation de f avec Lagrange pour N points de Tchebychev')\n",
"plt.title(\"Interpolation de f avec Lagrange pour N points de Tchebychev\")\n",
"plt.grid()"
]
},
@@ -437,9 +449,11 @@
"source": [
"N = np.arange(5, 101, 5)\n",
"\n",
"\n",
"def n_inf(f, p):\n",
" return np.max([f, p])\n",
"\n",
"\n",
"# Norme inf en fct de N\n",
"for n in N:\n",
" xeq = equirepartis(a, b, n)\n",

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -20,8 +20,8 @@
"%matplotlib inline\n",
"%config InlineBackend.figure_format = 'retina'\n",
"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt"
"import matplotlib.pyplot as plt\n",
"import numpy as np"
]
},
{
@@ -63,11 +63,24 @@
},
"outputs": [],
"source": [
"f1 = lambda x: np.exp(x) - 1 - x\n",
"f2 = lambda x: x - np.sin(x)\n",
"f3 = lambda x: x + np.sin(x)\n",
"f4 = lambda x: x + np.cos(x) - 1\n",
"f5 = lambda x: x - np.cos(x) + 1"
"def f1(x):\n",
" return np.exp(x) - 1 - x\n",
"\n",
"\n",
"def f2(x):\n",
" return x - np.sin(x)\n",
"\n",
"\n",
"def f3(x):\n",
" return x + np.sin(x)\n",
"\n",
"\n",
"def f4(x):\n",
" return x + np.cos(x) - 1\n",
"\n",
"\n",
"def f5(x):\n",
" return x - np.cos(x) + 1"
]
},
{
@@ -98,16 +111,16 @@
"\n",
"x = np.linspace(-1, 1, 200)\n",
"ax = fig.add_subplot(3, 3, 1)\n",
"ax.plot(x, f1(x), label='Courbe f')\n",
"ax.plot(x, x, label=f'$y=x$')\n",
"ax.plot(x, f1(x), label=\"Courbe f\")\n",
"ax.plot(x, x, label=\"$y=x$\")\n",
"ax.scatter([i for i in x if f1(i) == i], [f1(i) for i in x if f1(i) == i])\n",
"ax.legend()\n",
"\n",
"x = np.linspace(-np.pi / 2, 5*np.pi / 2)\n",
"x = np.linspace(-np.pi / 2, 5 * np.pi / 2)\n",
"for fk, f in enumerate([f2, f3, f4, f5]):\n",
" ax = fig.add_subplot(3, 3, fk+2)\n",
" ax.plot(x, f(x), label='Courbe f')\n",
" ax.plot(x, x, label=f'$y=x$')\n",
" ax = fig.add_subplot(3, 3, fk + 2)\n",
" ax.plot(x, f(x), label=\"Courbe f\")\n",
" ax.plot(x, x, label=\"$y=x$\")\n",
" ax.scatter([i for i in x if f(i) == i], [f(i) for i in x if f(i) == i])\n",
" ax.legend()"
]
@@ -129,13 +142,11 @@
},
"outputs": [],
"source": [
"def point_fixe(f, x0, tol=1.e-6, itermax=5000):\n",
" \"\"\"\n",
" Recherche de point fixe : méthode brute x_{n+1} = f(x_n)\n",
" \n",
"def point_fixe(f, x0, tol=1.0e-6, itermax=5000):\n",
" \"\"\"Recherche de point fixe : méthode brute x_{n+1} = f(x_n).\n",
"\n",
" Parameters\n",
" ----------\n",
" \n",
" f: function\n",
" la fonction dont on cherche le point fixe\n",
" x0: float\n",
@@ -144,16 +155,16 @@
" critère d'arrêt : |x_{n+1} - x_n| < tol\n",
" itermax: int\n",
" le nombre maximal d'itérations autorisées\n",
" \n",
"\n",
" Returns\n",
" -------\n",
" \n",
" x: float\n",
" la valeur trouvée pour le point fixe\n",
" niter: int\n",
" le nombre d'itérations effectuées\n",
" xL: ndarray\n",
" la suite des itérés de la suite\n",
"\n",
" \"\"\"\n",
" xL = [x0]\n",
" niter = 0\n",
@@ -225,14 +236,14 @@
"source": [
"# F1\n",
"\n",
"fig, ax = plt.subplots(figsize=(6,6))\n",
"fig, ax = plt.subplots(figsize=(6, 6))\n",
"\n",
"x, niter, xL = point_fixe(f1, -0.5)\n",
"xx = np.linspace(-1, 1, 200)\n",
"\n",
"ax.plot(xx, f1(xx), label='Courbe f1')\n",
"ax.plot(xx, xx, label=f'$y=x$')\n",
"ax.scatter(x, f1(x), label='Point Fixe')\n",
"ax.plot(xx, f1(xx), label=\"Courbe f1\")\n",
"ax.plot(xx, xx, label=\"$y=x$\")\n",
"ax.scatter(x, f1(x), label=\"Point Fixe\")\n",
"ax.legend()\n",
"ax.set_title(f\"Nombre d'itérations : {niter}\")"
]

View File

(image diff: before 199 KiB, after 199 KiB)

View File

(image diff: before 114 KiB, after 114 KiB)

View File

(image diff: before 14 KiB, after 14 KiB)

View File

@@ -100,41 +100,48 @@
],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"\n",
"# Fonction f définissant l'EDO\n",
"def f1(t,y):\n",
" return -t*y\n",
"def f1(t, y):\n",
" return -t * y\n",
"\n",
"\n",
"# Solution exacte\n",
"def uex1(t, y0):\n",
" return y0 * np.exp(-(t**2) / 2)\n",
"\n",
"# Solution exacte \n",
"def uex1(t,y0):\n",
" return y0 * np.exp(-t**2 /2)\n",
"\n",
"plt.figure(1)\n",
"\n",
"## Solutions de l'EDO 1 telles que y(0)=1 et y(0)=2\n",
"\n",
"tt=np.linspace(-3, 3, 100) # vecteur représentant l'intervalle de temps\n",
"y1=uex1(tt, 1) # sol. exacte avec y_0=1\n",
"y2=uex1(tt, 2) # sol. exacte avec y_0=2\n",
"plt.plot(tt,y1,label='y(0)=1')\n",
"plt.plot(tt,y2,label='y(0)=2')\n",
"tt = np.linspace(-3, 3, 100) # vecteur représentant l'intervalle de temps\n",
"y1 = uex1(tt, 1) # sol. exacte avec y_0=1\n",
"y2 = uex1(tt, 2) # sol. exacte avec y_0=2\n",
"plt.plot(tt, y1, label=\"y(0)=1\")\n",
"plt.plot(tt, y2, label=\"y(0)=2\")\n",
"\n",
"##Tracé du champ de vecteurs\n",
"\n",
"plt.title('Solution exacte pour y0 et y1')\n",
"plt.title(\"Solution exacte pour y0 et y1\")\n",
"plt.legend()\n",
"plt.xlabel('t')\n",
"plt.ylabel('y')\n",
"plt.xlabel(\"t\")\n",
"plt.ylabel(\"y\")\n",
"\n",
"t=np.linspace(-5,5,35) # abcisse des points de la grille \n",
"y=np.linspace(0,2.1,23) # ordonnées des points de la grille \n",
"T,Y=np.meshgrid(t,y) # grille de points dans le plan (t,y) \n",
"U=np.ones(T.shape)/np.sqrt(1+f1(T,Y)**2) # matrice avec les composantes horizontales des vecteurs (1), normalisées \n",
"V=f1(T,Y)/np.sqrt(1+f1(T,Y)**2) # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées \n",
"plt.quiver(T,Y,U,V,angles='xy',scale=20,color='cyan')\n",
"plt.axis([-5,5,0,2.1])"
"t = np.linspace(-5, 5, 35) # abcisse des points de la grille\n",
"y = np.linspace(0, 2.1, 23) # ordonnées des points de la grille\n",
"T, Y = np.meshgrid(t, y) # grille de points dans le plan (t,y)\n",
"U = np.ones(T.shape) / np.sqrt(\n",
" 1 + f1(T, Y) ** 2,\n",
") # matrice avec les composantes horizontales des vecteurs (1), normalisées\n",
"V = f1(T, Y) / np.sqrt(\n",
" 1 + f1(T, Y) ** 2,\n",
") # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées\n",
"plt.quiver(T, Y, U, V, angles=\"xy\", scale=20, color=\"cyan\")\n",
"plt.axis([-5, 5, 0, 2.1])"
]
},
{
@@ -182,43 +189,50 @@
],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"\n",
"# Fonction f définissant l'EDO\n",
"def f2(t,y):\n",
"def f2(t, y):\n",
" return t * y**2\n",
"\n",
"# Solution exacte \n",
"def uex2(t,y0):\n",
"\n",
"# Solution exacte\n",
"def uex2(t, y0):\n",
" return y0 / (1 - y0 * t**2 / 2)\n",
"\n",
"\n",
"plt.figure(1)\n",
"\n",
"## Solutions de l'EDO 1 telles que y(0)=1 et y(0)=2\n",
"\n",
"tt=np.linspace(-3, 3, 100) # vecteur représentant l'intervalle de temps\n",
"y1=uex2(tt, 1) # sol. exacte avec y_0=1\n",
"y2=uex2(tt, 2) # sol. exacte avec y_0=2\n",
"plt.plot(tt,y1,label='y(0)=1')\n",
"plt.plot(tt,y2,label='y(0)=2')\n",
"tt = np.linspace(-3, 3, 100) # vecteur représentant l'intervalle de temps\n",
"y1 = uex2(tt, 1) # sol. exacte avec y_0=1\n",
"y2 = uex2(tt, 2) # sol. exacte avec y_0=2\n",
"plt.plot(tt, y1, label=\"y(0)=1\")\n",
"plt.plot(tt, y2, label=\"y(0)=2\")\n",
"\n",
"##Tracé du champ de vecteurs\n",
"\n",
"plt.title('Solution exacte pour y0 et y1')\n",
"plt.title(\"Solution exacte pour y0 et y1\")\n",
"plt.legend()\n",
"plt.xlabel('t')\n",
"plt.ylabel('y')\n",
"plt.xlabel(\"t\")\n",
"plt.ylabel(\"y\")\n",
"\n",
"xmin, xmax = -2, 2\n",
"ymin, ymax = -4, 4\n",
"\n",
"t=np.linspace(xmin, xmax,35) # abcisse des points de la grille \n",
"y=np.linspace(ymin, ymax) # ordonnées des points de la grille \n",
"T,Y=np.meshgrid(t,y) # grille de points dans le plan (t,y) \n",
"U=np.ones(T.shape)/np.sqrt(1+f2(T,Y)**2) # matrice avec les composantes horizontales des vecteurs (1), normalisées \n",
"V=f1(T,Y)/np.sqrt(1+f2(T,Y)**2) # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées \n",
"plt.quiver(T,Y,U,V,angles='xy',scale=20,color='cyan')\n",
"t = np.linspace(xmin, xmax, 35) # abcisse des points de la grille\n",
"y = np.linspace(ymin, ymax) # ordonnées des points de la grille\n",
"T, Y = np.meshgrid(t, y) # grille de points dans le plan (t,y)\n",
"U = np.ones(T.shape) / np.sqrt(\n",
" 1 + f2(T, Y) ** 2,\n",
") # matrice avec les composantes horizontales des vecteurs (1), normalisées\n",
"V = f1(T, Y) / np.sqrt(\n",
" 1 + f2(T, Y) ** 2,\n",
") # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées\n",
"plt.quiver(T, Y, U, V, angles=\"xy\", scale=20, color=\"cyan\")\n",
"plt.axis([xmin, xmax, ymin, ymax])"
]
},
@@ -339,33 +353,34 @@
],
"source": [
"# Euler explicite\n",
"def euler_exp(t0,T,y0,h,f):\n",
" t = np.arange(t0, t0+T+h, h)\n",
"def euler_exp(t0, T, y0, h, f):\n",
" t = np.arange(t0, t0 + T + h, h)\n",
" y = np.empty(t.size)\n",
" y[0] = y0\n",
" N = len(t)-1\n",
" N = len(t) - 1\n",
" for n in range(N):\n",
" y[n+1] = y[n] + h * f(t[n], y[n])\n",
" y[n + 1] = y[n] + h * f(t[n], y[n])\n",
" return t, y\n",
"\n",
"t0=0\n",
"T=1\n",
"y0=1\n",
"\n",
"for f, uex in zip([f1, f2], [uex1, uex2]):\n",
"t0 = 0\n",
"T = 1\n",
"y0 = 1\n",
"\n",
"for f, uex in zip([f1, f2], [uex1, uex2], strict=False):\n",
" plt.figure()\n",
" t = np.arange(0, 1, 1e-3)\n",
" y = uex(t, y0)\n",
" plt.plot(t, y, label='Solution exacte')\n",
" plt.plot(t, y, label=\"Solution exacte\")\n",
"\n",
" for h in [1/5, 1/10, 1/50]:\n",
" for h in [1 / 5, 1 / 10, 1 / 50]:\n",
" tt, y = euler_exp(t0, T, y0, h, f)\n",
" plt.plot(tt, y, label=f'Solution approchée pour h={h}')\n",
" \n",
" plt.plot(tt, y, label=f\"Solution approchée pour h={h}\")\n",
"\n",
" plt.title(f\"Solutions exacte et approchées de la fonction {f.__name__}\")\n",
" plt.legend()\n",
" plt.xlabel('t')\n",
" plt.ylabel('y')"
" plt.xlabel(\"t\")\n",
" plt.ylabel(\"y\")"
]
},
{
@@ -441,12 +456,14 @@
],
"source": [
"# Modèle de Verhulst\n",
"n=1\n",
"d=0.75\n",
"K=200\n",
"n = 1\n",
"d = 0.75\n",
"K = 200\n",
"\n",
"\n",
"def fV(t, P):\n",
" return (n - d) * P * (1 - P / K)\n",
"\n",
"def fV(t,P):\n",
" return (n-d)*P*(1-P/K)\n",
"\n",
"plt.figure()\n",
"\n",
@@ -455,20 +472,24 @@
"t0, T = 0, 50\n",
"\n",
"\n",
"for P in range(1, K+100, 15):\n",
"for P in range(1, K + 100, 15):\n",
" tt, yy = euler_exp(t0, T, P, h, fV)\n",
" plt.plot(tt, yy, label=f'Solution approchée pour h={h}')\n",
" \n",
"plt.title(f\"Solution approchée du modèle de Verhulst pour la population d'individus\")\n",
"plt.xlabel('t')\n",
"plt.ylabel('y')\n",
" plt.plot(tt, yy, label=f\"Solution approchée pour h={h}\")\n",
"\n",
"t=np.linspace(t0, T, 35) # abcisse des points de la grille \n",
"y=np.linspace(0, K+100, 23) # ordonnées des points de la grille \n",
"T,P=np.meshgrid(t,y) # grille de points dans le plan (t,y) \n",
"U=np.ones(T.shape)/np.sqrt(1+fV(T,P)**2) # matrice avec les composantes horizontales des vecteurs (1), normalisées \n",
"V=fV(T,P)/np.sqrt(1+fV(T,P)**2) # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées \n",
"plt.quiver(T,P,U,V,angles='xy',scale=20,color='cyan')\n",
"plt.title(\"Solution approchée du modèle de Verhulst pour la population d'individus\")\n",
"plt.xlabel(\"t\")\n",
"plt.ylabel(\"y\")\n",
"\n",
"t = np.linspace(t0, T, 35) # abcisse des points de la grille\n",
"y = np.linspace(0, K + 100, 23) # ordonnées des points de la grille\n",
"T, P = np.meshgrid(t, y) # grille de points dans le plan (t,y)\n",
"U = np.ones(T.shape) / np.sqrt(\n",
" 1 + fV(T, P) ** 2,\n",
") # matrice avec les composantes horizontales des vecteurs (1), normalisées\n",
"V = fV(T, P) / np.sqrt(\n",
" 1 + fV(T, P) ** 2,\n",
") # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées\n",
"plt.quiver(T, P, U, V, angles=\"xy\", scale=20, color=\"cyan\")\n",
"plt.legend(fontsize=4)"
]
},
@@ -527,29 +548,34 @@
}
],
"source": [
"def fS(t,P):\n",
" return (2-np.cos(t))*P-(P**2)/2-1\n",
"def fS(t, P):\n",
" return (2 - np.cos(t)) * P - (P**2) / 2 - 1\n",
"\n",
"P0=5\n",
"t0=0\n",
"T=10\n",
"h=0.1\n",
"\n",
"P0 = 5\n",
"t0 = 0\n",
"T = 10\n",
"h = 0.1\n",
"\n",
"plt.figure()\n",
"tt, yy = euler_exp(t0, T, P0, h, fS)\n",
"plt.plot(tt, yy, label=f'Solution approchée pour h={h}')\n",
" \n",
"plt.title(f\"Solutions approchée du modèle de Verhulst pour une population de saumons\")\n",
"plt.legend()\n",
"plt.xlabel('t')\n",
"plt.ylabel('y')\n",
"plt.plot(tt, yy, label=f\"Solution approchée pour h={h}\")\n",
"\n",
"t=np.linspace(t0, T, 35) # abcisse des points de la grille \n",
"y=np.linspace(0, 6, 23) # ordonnées des points de la grille \n",
"T,P=np.meshgrid(t,y) # grille de points dans le plan (t,y) \n",
"U=np.ones(T.shape)/np.sqrt(1+fS(T,P)**2) # matrice avec les composantes horizontales des vecteurs (1), normalisées \n",
"V=fS(T,P)/np.sqrt(1+fS(T,P)**2) # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées \n",
"plt.quiver(T,P,U,V,angles='xy',scale=20,color='cyan')\n"
"plt.title(\"Solutions approchée du modèle de Verhulst pour une population de saumons\")\n",
"plt.legend()\n",
"plt.xlabel(\"t\")\n",
"plt.ylabel(\"y\")\n",
"\n",
"t = np.linspace(t0, T, 35) # abcisse des points de la grille\n",
"y = np.linspace(0, 6, 23) # ordonnées des points de la grille\n",
"T, P = np.meshgrid(t, y) # grille de points dans le plan (t,y)\n",
"U = np.ones(T.shape) / np.sqrt(\n",
" 1 + fS(T, P) ** 2,\n",
") # matrice avec les composantes horizontales des vecteurs (1), normalisées\n",
"V = fS(T, P) / np.sqrt(\n",
" 1 + fS(T, P) ** 2,\n",
") # matrice avec les composantes verticales des vecteurs (f(t,y)), normalisées\n",
"plt.quiver(T, P, U, V, angles=\"xy\", scale=20, color=\"cyan\")"
]
},
{

View File

@@ -92,56 +92,66 @@
],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"\n",
"# Fonction F définissant l'EDO\n",
"def F(Y):\n",
" x=Y[0]\n",
" y=Y[1]\n",
" A=np.array([[0,1],[-2,-3]])\n",
" return np.dot(A,Y)\n",
" Y[0]\n",
" Y[1]\n",
" A = np.array([[0, 1], [-2, -3]])\n",
" return np.dot(A, Y)\n",
"\n",
"# ou \n",
"def F1(x,y):\n",
"\n",
"# ou\n",
"def F1(x, y):\n",
" return y\n",
"\n",
"def F2(x,y):\n",
" return -2*x-3*y\n",
"\n",
"def F2(x, y):\n",
" return -2 * x - 3 * y\n",
"\n",
"\n",
"# Solution exacte de Y'=AY, Y(t_0)=Y_0\n",
"def uex(t,t0,Y0):\n",
" U1=np.array([1,-1])\n",
" U2=np.array([1,-2])\n",
" P=np.ones((2,2))\n",
" P[:,0]=U1\n",
" P[:,1]=U2\n",
" C=np.linalg.solve(P,Y0)\n",
" return np.array([(C[0]*np.exp(-(t-t0))*U1[0]+C[1]*np.exp(-2*(t-t0))*U2[0]),(C[0]*np.exp(-(t-t0))*U1[1]+C[1]*np.exp(-2*(t-t0))*U2[1])])\n",
"def uex(t, t0, Y0):\n",
" U1 = np.array([1, -1])\n",
" U2 = np.array([1, -2])\n",
" P = np.ones((2, 2))\n",
" P[:, 0] = U1\n",
" P[:, 1] = U2\n",
" C = np.linalg.solve(P, Y0)\n",
" return np.array(\n",
" [\n",
" (C[0] * np.exp(-(t - t0)) * U1[0] + C[1] * np.exp(-2 * (t - t0)) * U2[0]),\n",
" (C[0] * np.exp(-(t - t0)) * U1[1] + C[1] * np.exp(-2 * (t - t0)) * U2[1]),\n",
" ],\n",
" )\n",
"\n",
"## Représentation des solutions pour chaque valeur de la donnée initiale \n",
"\n",
"## Représentation des solutions pour chaque valeur de la donnée initiale\n",
"tt = np.linspace(-10, 10, 100)\n",
"t0 = tt[0]\n",
"for x, y in zip([1, -2, 0, 1, 3], [2, -2, -4, -2, 4]):\n",
"for x, y in zip([1, -2, 0, 1, 3], [2, -2, -4, -2, 4], strict=False):\n",
" sol = uex(tt, t0, [x, y])\n",
" plt.plot(sol[0], sol[1], label=f'$((x0, y0) = ({x}, {y})$')\n",
" plt.plot(sol[0], sol[1], label=f\"$((x0, y0) = ({x}, {y})$\")\n",
" plt.scatter(x, y)\n",
"\n",
"#Tracé du champ de vecteurs\n",
"x=np.linspace(-5,5,26)\n",
"y=x\n",
"xx,yy=np.meshgrid(x,y)\n",
"U=F1(xx,yy)/np.sqrt(F1(xx,yy)**2+F2(xx,yy)**2)\n",
"V=F2(xx,yy)/np.sqrt(F1(xx,yy)**2+F2(xx,yy)**2)\n",
"plt.quiver(xx,yy,U,V,angles='xy', scale=20, color='gray')\n",
"plt.axis([-5.,5,-5,5])\n",
"# Tracé du champ de vecteurs\n",
"x = np.linspace(-5, 5, 26)\n",
"y = x\n",
"xx, yy = np.meshgrid(x, y)\n",
"U = F1(xx, yy) / np.sqrt(F1(xx, yy) ** 2 + F2(xx, yy) ** 2)\n",
"V = F2(xx, yy) / np.sqrt(F1(xx, yy) ** 2 + F2(xx, yy) ** 2)\n",
"plt.quiver(xx, yy, U, V, angles=\"xy\", scale=20, color=\"gray\")\n",
"plt.axis([-5.0, 5, -5, 5])\n",
"\n",
"## Représentation des espaces propres \n",
"plt.plot(tt, -tt, label=f'SEP associé à -1', linewidth=3)\n",
"plt.plot(tt, -2*tt, label=f'SEP associé à -2', linewidth=3)\n",
"## Représentation des espaces propres\n",
"plt.plot(tt, -tt, label=\"SEP associé à -1\", linewidth=3)\n",
"plt.plot(tt, -2 * tt, label=\"SEP associé à -2\", linewidth=3)\n",
"\n",
"plt.legend(fontsize=7)\n",
"plt.title('Représentation des solutions et champs de vecteurs pour le système (S)')"
"plt.title(\"Représentation des solutions et champs de vecteurs pour le système (S)\")"
]
},
{
@@ -210,13 +220,15 @@
"source": [
"a, b, c, d = 0.1, 5e-5, 0.04, 5e-5\n",
"H0, P0 = 2000, 1000\n",
"He, Pe = c/d, a/b\n",
"He, Pe = c / d, a / b\n",
"\n",
"\n",
"def F1(H, P):\n",
" return H * (a - b*P)\n",
" return H * (a - b * P)\n",
"\n",
"\n",
"def F2(H, P):\n",
" return P * (-c + d*H)"
" return P * (-c + d * H)"
]
},
{
@@ -256,17 +268,17 @@
}
],
"source": [
"xx, yy = np.linspace(0, 3000, 20), np.linspace (0, 4500, 30)\n",
"h,p = np.meshgrid(xx, yy)\n",
"n=np.sqrt(F1(h,p)**2+F2(h,p)**2)\n",
"plt.quiver(h, p, F1(h,p)/n, F2(h,p)/n, angles='xy', scale=20, color='gray')\n",
"xx, yy = np.linspace(0, 3000, 20), np.linspace(0, 4500, 30)\n",
"h, p = np.meshgrid(xx, yy)\n",
"n = np.sqrt(F1(h, p) ** 2 + F2(h, p) ** 2)\n",
"plt.quiver(h, p, F1(h, p) / n, F2(h, p) / n, angles=\"xy\", scale=20, color=\"gray\")\n",
"\n",
"plt.vlines(He, 0, 4500, label=f'H=He={He}')\n",
"plt.hlines(Pe, 0, 3000, label=f'P=Pe={Pe}')\n",
"plt.scatter(He, Pe, label=f'(H0, P0) = (He, Pe)')\n",
"plt.scatter(H0, P0, label=f'(H, P)=(H0, P0)=({H0},{P0})', color='red')\n",
"plt.vlines(He, 0, 4500, label=f\"H=He={He}\")\n",
"plt.hlines(Pe, 0, 3000, label=f\"P=Pe={Pe}\")\n",
"plt.scatter(He, Pe, label=\"(H0, P0) = (He, Pe)\")\n",
"plt.scatter(H0, P0, label=f\"(H, P)=(H0, P0)=({H0},{P0})\", color=\"red\")\n",
"\n",
"plt.title('Le modèle de Lotka-Volterra')\n",
"plt.title(\"Le modèle de Lotka-Volterra\")\n",
"plt.legend(fontsize=7)"
]
},
@@ -374,19 +386,20 @@
"outputs": [],
"source": [
"a, b, c, d = 0.1, 5e-5, 0.04, 5e-5\n",
"T=200\n",
"T = 200\n",
"H0, P0 = 2000, 1000\n",
"He, Pe = c/d, a/b\n",
"p=0.02\n",
"He, Pe = c / d, a / b\n",
"p = 0.02\n",
"\n",
"\n",
"def voltEE(T, X0, h):\n",
" t = np.arange(0, T+h, h)\n",
" t = np.arange(0, T + h, h)\n",
" H = 0 * t\n",
" P = 0 * t\n",
" H[0], P[0] = X0\n",
" for n in range(len(t)-1):\n",
" H[n+1] = H[n] + h * F1(H[n], P[n])\n",
" P[n+1] = P[n] + h * F2(H[n], P[n])\n",
" for n in range(len(t) - 1):\n",
" H[n + 1] = H[n] + h * F1(H[n], P[n])\n",
" P[n + 1] = P[n] + h * F2(H[n], P[n])\n",
" return np.array([t, H, P])"
]
},
@@ -438,22 +451,22 @@
],
"source": [
"t, H, P = voltEE(T, [H0, P0], 0.001)\n",
"plt.plot(t, H, label='Population de sardines')\n",
"plt.plot(t, P, label='Population de requins')\n",
"plt.plot(t, H, label=\"Population de sardines\")\n",
"plt.plot(t, P, label=\"Population de requins\")\n",
"plt.legend(fontsize=7)\n",
"\n",
"plt.figure()\n",
"xx, yy = np.linspace(0, 3000, 20), np.linspace (0, 4500, 30)\n",
"h,p = np.meshgrid(xx, yy)\n",
"n=np.sqrt(F1(h,p)**2 + F2(h,p)**2)\n",
"plt.quiver (h, p, F1(h,p)/n, F2(h,p)/n, angles='xy', scale=20, color='gray')\n",
"xx, yy = np.linspace(0, 3000, 20), np.linspace(0, 4500, 30)\n",
"h, p = np.meshgrid(xx, yy)\n",
"n = np.sqrt(F1(h, p) ** 2 + F2(h, p) ** 2)\n",
"plt.quiver(h, p, F1(h, p) / n, F2(h, p) / n, angles=\"xy\", scale=20, color=\"gray\")\n",
"\n",
"plt.vlines(He, 0, 4500, label=f'H=He={He}')\n",
"plt.hlines(Pe, 0, 3000, label=f'P=Pe={Pe}')\n",
"plt.scatter(He, Pe, label=f'(H0, P0) = (He, Pe)')\n",
"plt.scatter(H0, P0, label=f'(H, P)=(H0, P0)=({H0},{P0})', color='red')\n",
"plt.vlines(He, 0, 4500, label=f\"H=He={He}\")\n",
"plt.hlines(Pe, 0, 3000, label=f\"P=Pe={Pe}\")\n",
"plt.scatter(He, Pe, label=\"(H0, P0) = (He, Pe)\")\n",
"plt.scatter(H0, P0, label=f\"(H, P)=(H0, P0)=({H0},{P0})\", color=\"red\")\n",
"\n",
"plt.title('Le modèle de Lotka-Volterra')\n",
"plt.title(\"Le modèle de Lotka-Volterra\")\n",
"plt.legend(fontsize=7)\n",
"plt.plot(H, P)"
]
@@ -467,25 +480,28 @@
"outputs": [],
"source": [
"a, b, c, d = 0.1, 5e-5, 0.04, 5e-5\n",
"T=200\n",
"T = 200\n",
"H0, P0 = 2000, 1000\n",
"He, Pe = c/d, a/b\n",
"p=0.02\n",
"He, Pe = c / d, a / b\n",
"p = 0.02\n",
"\n",
"\n",
"def F1_p(H, P):\n",
" return (a - p) * H - b * H * P\n",
"\n",
"\n",
"def F2_p(H, P):\n",
" return (-c-p) * P + d * H * P\n",
" return (-c - p) * P + d * H * P\n",
"\n",
"\n",
"def voltEE_p(T, X0, h):\n",
" t = np.arange(0, T+h, h)\n",
" t = np.arange(0, T + h, h)\n",
" H = 0 * t\n",
" P = 0 * t\n",
" H[0], P[0] = X0\n",
" for n in range(len(t)-1):\n",
" H[n+1] = H[n] + h * F1_p(H[n], P[n])\n",
" P[n+1] = P[n] + h * F2_p(H[n], P[n])\n",
" for n in range(len(t) - 1):\n",
" H[n + 1] = H[n] + h * F1_p(H[n], P[n])\n",
" P[n + 1] = P[n] + h * F2_p(H[n], P[n])\n",
" return np.array([t, H, P])"
]
},
@@ -535,22 +551,22 @@
],
"source": [
"t, H, P = voltEE_p(T, [H0, P0], 0.001)\n",
"plt.plot(t, H, label='Population de sardines')\n",
"plt.plot(t, P, label='Population de requins')\n",
"plt.plot(t, H, label=\"Population de sardines\")\n",
"plt.plot(t, P, label=\"Population de requins\")\n",
"plt.legend(fontsize=7)\n",
"\n",
"plt.figure()\n",
"xx, yy = np.linspace(0, 3000, 20), np.linspace (0, 4500, 30)\n",
"h,p = np.meshgrid(xx, yy)\n",
"n=np.sqrt(F1(h,p)**2 + F2(h,p)**2)\n",
"plt.quiver (h, p, F1(h,p)/n, F2(h,p)/n, angles='xy', scale=20, color='gray')\n",
"xx, yy = np.linspace(0, 3000, 20), np.linspace(0, 4500, 30)\n",
"h, p = np.meshgrid(xx, yy)\n",
"n = np.sqrt(F1(h, p) ** 2 + F2(h, p) ** 2)\n",
"plt.quiver(h, p, F1(h, p) / n, F2(h, p) / n, angles=\"xy\", scale=20, color=\"gray\")\n",
"\n",
"plt.vlines(He, 0, 4500, label=f'H=He={He}')\n",
"plt.hlines(Pe, 0, 3000, label=f'P=Pe={Pe}')\n",
"plt.scatter(He, Pe, label=f'(H0, P0) = (He, Pe)')\n",
"plt.scatter(H0, P0, label=f'(H, P)=(H0, P0)=({H0},{P0})', color='red')\n",
"plt.vlines(He, 0, 4500, label=f\"H=He={He}\")\n",
"plt.hlines(Pe, 0, 3000, label=f\"P=Pe={Pe}\")\n",
"plt.scatter(He, Pe, label=\"(H0, P0) = (He, Pe)\")\n",
"plt.scatter(H0, P0, label=f\"(H, P)=(H0, P0)=({H0},{P0})\", color=\"red\")\n",
"\n",
"plt.title('Le modèle de Lotka-Volterra')\n",
"plt.title(\"Le modèle de Lotka-Volterra\")\n",
"plt.legend(fontsize=7)\n",
"plt.plot(H, P)"
]
@@ -697,46 +713,49 @@
}
],
"source": [
"gamma=0.5\n",
"a=0.004\n",
"lm=10\n",
"gamma = 0.5\n",
"a = 0.004\n",
"lm = 10\n",
"\n",
"\n",
"def fphi(d):\n",
" return (gamma/d)*np.log(d/a)\n",
" return (gamma / d) * np.log(d / a)\n",
"\n",
"\n",
"def F(X):\n",
" (n,m)=np.shape(X)\n",
" Y=np.zeros((n,m))\n",
" Y[0,:]=X[1,:]\n",
" Xaux=np.zeros(m+2)\n",
" Xaux[-1]=1\n",
" Xaux[1:-1]=X[0,:]\n",
" Y[1,:]=fphi(Xaux[2:]-Xaux[1:-1])-fphi(Xaux[1:-1]-Xaux[0:-2])-lm*X[1,:]\n",
" (n, m) = np.shape(X)\n",
" Y = np.zeros((n, m))\n",
" Y[0, :] = X[1, :]\n",
" Xaux = np.zeros(m + 2)\n",
" Xaux[-1] = 1\n",
" Xaux[1:-1] = X[0, :]\n",
" Y[1, :] = fphi(Xaux[2:] - Xaux[1:-1]) - fphi(Xaux[1:-1] - Xaux[0:-2]) - lm * X[1, :]\n",
" return Y\n",
"\n",
"h=0.0002\n",
"T=15\n",
"N=100\n",
"\n",
"t=np.arange(0,T+h,h)\n",
"Nt=np.size(t)\n",
"X=np.zeros((2,N,Nt))\n",
"R0=-1+2*np.random.rand(N)\n",
"X0=np.arange(1/(N+1),1,1/(N+1))+0.1*R0*(1/(N+1))\n",
"h = 0.0002\n",
"T = 15\n",
"N = 100\n",
"\n",
"X[0,:,0]=X0\n",
"X[1,:,0]=X0\n",
"t = np.arange(0, T + h, h)\n",
"Nt = np.size(t)\n",
"X = np.zeros((2, N, Nt))\n",
"R0 = -1 + 2 * np.random.rand(N)\n",
"X0 = np.arange(1 / (N + 1), 1, 1 / (N + 1)) + 0.1 * R0 * (1 / (N + 1))\n",
"\n",
"X[0, :, 0] = X0\n",
"X[1, :, 0] = X0\n",
"\n",
"plt.figure(1, figsize=(24, 18))\n",
"for n in range(Nt - 1):\n",
" Y = F(X[:, :, n])\n",
" X[:, :, n + 1] = X[:, :, n] + (h / 2) * Y + (h / 2) * F(X[:, :, n] + h * Y)\n",
"\n",
"plt.figure(1,figsize=(24,18))\n",
"for n in range(Nt-1):\n",
" Y=F(X[:,:,n])\n",
" X[:,:,n+1]=X[:,:,n]+(h/2)*Y+(h/2)*F(X[:,:,n]+h*Y)\n",
" \n",
"for i in range(N):\n",
" plt.plot(t,X[0,i,:],'k')\n",
"plt.xlabel('t')\n",
"plt.ylabel('$x_i$')\n",
"plt.title('position x_i des globules rouges au cours du temps')\n"
" plt.plot(t, X[0, i, :], \"k\")\n",
"plt.xlabel(\"t\")\n",
"plt.ylabel(\"$x_i$\")\n",
"plt.title(\"position x_i des globules rouges au cours du temps\")"
]
},
{

View File

@@ -73,17 +73,18 @@
"outputs": [],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"\n",
"## Question 1\n",
"def mon_schema(t0, T, y0, h, f):\n",
" t = np.arange(t0, t0 + T + h, h)\n",
" y = 0 * t\n",
" y[0] = y0\n",
" for n in range(len(t)-1):\n",
" yn1 = y[n] + h * f(t[n] + h / 2, y[n] + h/2 * f(t[n], y[n]))\n",
" y[n+1] = y[n] + h / 2 * (f(t[n], y[n]) + f(t[n+1], yn1))\n",
" for n in range(len(t) - 1):\n",
" yn1 = y[n] + h * f(t[n] + h / 2, y[n] + h / 2 * f(t[n], y[n]))\n",
" y[n + 1] = y[n] + h / 2 * (f(t[n], y[n]) + f(t[n + 1], yn1))\n",
" return t, y"
]
},
@@ -140,25 +141,27 @@
"## Question 2\n",
"\n",
"# f second membre de l'EDO\n",
"def f(t,y):\n",
"def f(t, y):\n",
" return y + np.sin(t) * np.power(y, 2)\n",
"\n",
"\n",
"# sol. exacte de (P)\n",
"def yex(t):\n",
" return 1 / (1/2 * np.exp(-t) + (np.cos(t)-np.sin(t)) / 2)\n",
" return 1 / (1 / 2 * np.exp(-t) + (np.cos(t) - np.sin(t)) / 2)\n",
"\n",
"\n",
"t0, T = 0, 0.5\n",
"N = 100\n",
"y0 = 1\n",
"t, y_app = mon_schema(t0, T, y0, T/N, f)\n",
"t, y_app = mon_schema(t0, T, y0, T / N, f)\n",
"y_ex = yex(t)\n",
"\n",
"plt.figure(1)\n",
"plt.plot(t, y_ex, label='Solution exacte', lw=3)\n",
"plt.plot(t, y_app, '+ ', label=f'Solution approchée pour N={N}')\n",
"plt.title('Solutions approchées pour mon_schema')\n",
"plt.xlabel(f'$t$')\n",
"plt.ylabel(f'$y$')\n",
"plt.plot(t, y_ex, label=\"Solution exacte\", lw=3)\n",
"plt.plot(t, y_app, \"+ \", label=f\"Solution approchée pour N={N}\")\n",
"plt.title(\"Solutions approchées pour mon_schema\")\n",
"plt.xlabel(\"$t$\")\n",
"plt.ylabel(\"$y$\")\n",
"plt.legend(fontsize=7)"
]
},
@@ -212,15 +215,17 @@
"# Question 3\n",
"plt.figure(2)\n",
"\n",
"for s in range(2,11):\n",
" h=(1/2)/(2**s)\n",
"for s in range(2, 11):\n",
" h = (1 / 2) / (2**s)\n",
" t, y_app = mon_schema(t0, T, y0, h, f)\n",
" plt.plot(t, y_app, label=f'h={h}')\n",
" plt.plot(t, y_app, label=f\"h={h}\")\n",
"\n",
"plt.legend(fontsize=7)\n",
"plt.xlabel(f'$t$')\n",
"plt.ylabel('$y^n$')\n",
"plt.title('Solutions approchées de (P) obtenus avec mon_schema pour différentes valeurs du pas h')"
"plt.xlabel(\"$t$\")\n",
"plt.ylabel(\"$y^n$\")\n",
"plt.title(\n",
" \"Solutions approchées de (P) obtenus avec mon_schema pour différentes valeurs du pas h\",\n",
")"
]
},
{
@@ -252,23 +257,25 @@
}
],
"source": [
"E=[]\n",
"H=[]\n",
"E = []\n",
"H = []\n",
"\n",
"plt.figure(3)\n",
"for s in range(2,11):\n",
" h=(1/2)/(2**s)\n",
"for s in range(2, 11):\n",
" h = (1 / 2) / (2**s)\n",
" t, y_app = mon_schema(t0, T, y0, h, f)\n",
" y_ex = yex(t)\n",
" err = np.max(np.abs(y_app - y_ex))\n",
" plt.plot(t, np.abs(y_app - y_ex), label=f'h={h}')\n",
" plt.plot(t, np.abs(y_app - y_ex), label=f\"h={h}\")\n",
" E.append(err)\n",
" H.append(h)\n",
" \n",
"\n",
"plt.legend()\n",
"plt.xlabel(f'$t$')\n",
"plt.ylabel('$|y(t_n) - y^n|$')\n",
"plt.title('différence en valeur absolue entre sol. exacte et sol. approchée par mon_schema, pour différentes valeurs du pas h')"
"plt.xlabel(\"$t$\")\n",
"plt.ylabel(\"$|y(t_n) - y^n|$\")\n",
"plt.title(\n",
" \"différence en valeur absolue entre sol. exacte et sol. approchée par mon_schema, pour différentes valeurs du pas h\",\n",
")"
]
},
{
@@ -304,13 +311,13 @@
"\n",
"H, E = np.array(H), np.array(E)\n",
"\n",
"plt.plot(T/H, E/H, label=f'$E_h / h$')\n",
"plt.plot(T/H, E/H**2, label=f'$E_h / h^2$')\n",
"plt.plot(T / H, E / H, label=\"$E_h / h$\")\n",
"plt.plot(T / H, E / H**2, label=\"$E_h / h^2$\")\n",
"\n",
"plt.legend()\n",
"plt.xlabel('N')\n",
"plt.ylabel('Erreur globale')\n",
"plt.title('Erreur globale en fonction de N')"
"plt.xlabel(\"N\")\n",
"plt.ylabel(\"Erreur globale\")\n",
"plt.title(\"Erreur globale en fonction de N\")"
]
},
{
@@ -344,14 +351,16 @@
"source": [
"plt.figure(5)\n",
"\n",
"plt.plot(np.log(H), np.log(E), '+-', label='erreur')\n",
"plt.plot(np.log(H), np.log(H), '-', label='droite pente 1')\n",
"plt.plot(np.log(H), 2*np.log(H), '-', label='droite pente 2')\n",
"plt.plot(np.log(H), np.log(E), \"+-\", label=\"erreur\")\n",
"plt.plot(np.log(H), np.log(H), \"-\", label=\"droite pente 1\")\n",
"plt.plot(np.log(H), 2 * np.log(H), \"-\", label=\"droite pente 2\")\n",
"\n",
"plt.legend()\n",
"plt.title('Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)')\n",
"plt.xlabel(f'$log(h)$')\n",
"plt.ylabel(f'$log(E)$')"
"plt.title(\n",
" \"Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)\",\n",
")\n",
"plt.xlabel(\"$log(h)$\")\n",
"plt.ylabel(\"$log(E)$\")"
]
},
{
@@ -444,24 +453,26 @@
" t = np.arange(t0, t0 + T + h, h)\n",
" y = 0 * t\n",
" y[0] = y0\n",
" for n in range(len(t)-1):\n",
" y[n+1] = y[n] + h * f(t[n+1], y[n+1])\n",
" for n in range(len(t) - 1):\n",
" y[n + 1] = y[n] + h * f(t[n + 1], y[n + 1])\n",
" return t, y\n",
"\n",
"\n",
"def crank(t0, T, y0, h, f):\n",
" t = np.arange(t0, t0 + T + h, h)\n",
" y = 0 * t\n",
" y[0] = y0\n",
" for n in range(len(t)-1):\n",
" y[n+1] = y[n] + h/2 * (f(t[n], y[n]) + f(t[n+1], y[n+1]))\n",
" for n in range(len(t) - 1):\n",
" y[n + 1] = y[n] + h / 2 * (f(t[n], y[n]) + f(t[n + 1], y[n + 1]))\n",
" return t, y\n",
"\n",
"\n",
"def adams(t0, T, y0, h, f):\n",
" t = np.arange(t0, t0 + T + h, h)\n",
" y = 0 * t\n",
" y[0] = y0\n",
" for n in range(1, len(t)-1):\n",
" y[n+1] = y[n] + h/2 * (3* f(t[n], y[n]) - f(t[n-1], y[n-1]))\n",
" for n in range(1, len(t) - 1):\n",
" y[n + 1] = y[n] + h / 2 * (3 * f(t[n], y[n]) - f(t[n - 1], y[n - 1]))\n",
" return t, y"
]
},
@@ -474,39 +485,41 @@
"outputs": [],
"source": [
"def euler_explicite(t0, T, y0, h, f):\n",
" N = int(T/h)\n",
" N = int(T / h)\n",
" n = len(y0)\n",
" t = np.linspace(t0, t0 + T, N+1)\n",
" y = np.zeros((N+1, n))\n",
" t = np.linspace(t0, t0 + T, N + 1)\n",
" y = np.zeros((N + 1, n))\n",
" y[0,] = y0\n",
" for n in range(N):\n",
" y[n+1] = y[n] + h * f(t[n], y[n])\n",
" y[n + 1] = y[n] + h * f(t[n], y[n])\n",
" return t, y\n",
"\n",
"\n",
"def heun(t0, T, y0, h, f):\n",
" N = int(T/h)\n",
" N = int(T / h)\n",
" n = len(y0)\n",
" t = np.linspace(t0, t0 + T, N+1)\n",
" y = np.zeros((N+1, n))\n",
" t = np.linspace(t0, t0 + T, N + 1)\n",
" y = np.zeros((N + 1, n))\n",
" y[0,] = y0\n",
" for n in range(N):\n",
" p1 = f(t[n], y[n])\n",
" p2 = f(t[n] + h, y[n] + h * p1)\n",
" y[n+1] = y[n] + h/2 * (p1 + p2)\n",
" y[n + 1] = y[n] + h / 2 * (p1 + p2)\n",
" return t, y\n",
"\n",
"\n",
"def runge(t0, T, y0, h, f):\n",
" N = int(T/h)\n",
" N = int(T / h)\n",
" n = len(y0)\n",
" t = np.linspace(t0, t0 + T, N+1)\n",
" y = np.zeros((N+1, n))\n",
" t = np.linspace(t0, t0 + T, N + 1)\n",
" y = np.zeros((N + 1, n))\n",
" y[0,] = y0\n",
" for n in range(N):\n",
" p1 = f(t[n], y[n])\n",
" p2 = f(t[n] + h/2, y[n] + h/2 * p1)\n",
" p3 = f(t[n] + h/2, y[n] + h/2 * p2)\n",
" p4 = f(t[n] + h, y[n] + h* p3)\n",
" y[n+1] = y[n] + h/6 * (p1 + 2*p2 + 2*p3 + p4)\n",
" p2 = f(t[n] + h / 2, y[n] + h / 2 * p1)\n",
" p3 = f(t[n] + h / 2, y[n] + h / 2 * p2)\n",
" p4 = f(t[n] + h, y[n] + h * p3)\n",
" y[n + 1] = y[n] + h / 6 * (p1 + 2 * p2 + 2 * p3 + p4)\n",
" return t, y"
]
},
@@ -624,13 +637,16 @@
"# Question 1\n",
"a, b, c = 0.1, 2, 1\n",
"\n",
"\n",
"# f second membre de l'EDO\n",
"def f1(t,y):\n",
" return c * y * (1 - y/b)\n",
"def f1(t, y):\n",
" return c * y * (1 - y / b)\n",
"\n",
"\n",
"# sol. exacte de (P1)\n",
"def yex1(t):\n",
" return b / (1 + (b-a)/a * np.exp(-c*t))\n",
" return b / (1 + (b - a) / a * np.exp(-c * t))\n",
"\n",
"\n",
"t0, T = 0, 15\n",
"h = 0.2\n",
@@ -639,25 +655,27 @@
"plt.figure()\n",
"for schema in [euler_explicite, heun, runge]:\n",
" t, y_app = schema(t0, T, y0, h, f1)\n",
" plt.plot(t, y_app.ravel(), '+ ', label=f'Solution approchée par {schema.__name__}')\n",
" plt.plot(t, y_app.ravel(), \"+ \", label=f\"Solution approchée par {schema.__name__}\")\n",
"\n",
"t = np.arange(t0, t0+T+h, h)\n",
"t = np.arange(t0, t0 + T + h, h)\n",
"y_ex = yex1(t)\n",
"plt.plot(t, y_ex, label='Solution exacte', lw=1)\n",
"plt.title(f'Solutions approchées pour le schema {schema.__name__}')\n",
"plt.xlabel(f'$t$')\n",
"plt.ylabel(f'$y$')\n",
"plt.plot(t, y_ex, label=\"Solution exacte\", lw=1)\n",
"plt.title(f\"Solutions approchées pour le schema {schema.__name__}\")\n",
"plt.xlabel(\"$t$\")\n",
"plt.ylabel(\"$y$\")\n",
"plt.legend(fontsize=7)\n",
"\n",
"plt.figure()\n",
"for schema in [euler_explicite, heun, runge]: \n",
"for schema in [euler_explicite, heun, runge]:\n",
" t, y_app = schema(t0, T, y0, h, f1)\n",
" plt.plot(t, np.abs(y_app.ravel() - yex1(t)), label=f'Schema {schema.__name__}')\n",
" \n",
" plt.plot(t, np.abs(y_app.ravel() - yex1(t)), label=f\"Schema {schema.__name__}\")\n",
"\n",
"plt.legend()\n",
"plt.xlabel(f'$t$')\n",
"plt.ylabel('$|y(t_n) - y^n|$')\n",
"plt.title(f'différence en valeur absolue entre sol. exacte et sol. approchée, pour différents schemas')"
"plt.xlabel(\"$t$\")\n",
"plt.ylabel(\"$|y(t_n) - y^n|$\")\n",
"plt.title(\n",
" \"différence en valeur absolue entre sol. exacte et sol. approchée, pour différents schemas\",\n",
")"
]
},
{
@@ -696,24 +714,26 @@
" Z[1] = -Y[0] + np.cos(t)\n",
" return Z\n",
"\n",
"\n",
"def yexF(t):\n",
" return 1/2 * np.sin(t) * t + 5 * np.cos(t) + np.sin(t),\n",
" return (1 / 2 * np.sin(t) * t + 5 * np.cos(t) + np.sin(t),)\n",
"\n",
"\n",
"t0, T = 0, 15\n",
"h = 0.2\n",
"Y0 = np.array([5,1])\n",
"Y0 = np.array([5, 1])\n",
"\n",
"plt.figure()\n",
"for schema in [euler_explicite, heun, runge]:\n",
" t, y_app = schema(t0, T, Y0, h, F)\n",
" plt.plot(t, y_app[:,0], '+ ', label=f'Solution approchée par {schema.__name__}')\n",
" plt.plot(t, y_app[:, 0], \"+ \", label=f\"Solution approchée par {schema.__name__}\")\n",
"\n",
"t = np.arange(t0, t0+T+h, h)\n",
"t = np.arange(t0, t0 + T + h, h)\n",
"y_ex = yexF(t)\n",
"plt.plot(t, y_ex[0], label='Solution exacte', lw=3)\n",
"plt.title('Solutions approchées par differents schemas')\n",
"plt.xlabel(f'$t$')\n",
"plt.ylabel(f'$y$')\n",
"plt.plot(t, y_ex[0], label=\"Solution exacte\", lw=3)\n",
"plt.title(\"Solutions approchées par differents schemas\")\n",
"plt.xlabel(\"$t$\")\n",
"plt.ylabel(\"$y$\")\n",
"plt.legend(fontsize=7)"
]
},
@@ -726,20 +746,24 @@
"outputs": [],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"# Question 3\n",
"\n",
"\n",
"# f second membre de l'EDO\n",
"def f3(t,y):\n",
" return (np.cos(t) - y) / (1+t)\n",
"def f3(t, y):\n",
" return (np.cos(t) - y) / (1 + t)\n",
"\n",
"\n",
"# sol. exacte de (P1)\n",
"def yex3(t):\n",
" return (np.sin(t) - 1/4) / (1 + t)\n",
" return (np.sin(t) - 1 / 4) / (1 + t)\n",
"\n",
"\n",
"t0, T = 0, 10\n",
"y0 = np.array([-1/4])"
"y0 = np.array([-1 / 4])"
]
},
{
@@ -793,15 +817,17 @@
"plt.figure()\n",
"for schema in [euler_explicite, heun, runge]:\n",
" plt.figure()\n",
" for s in range(2,11):\n",
" h=1/(2**s)\n",
" for s in range(2, 11):\n",
" h = 1 / (2**s)\n",
" t, y_app = schema(t0, T, y0, h, f3)\n",
" plt.plot(t, y_app, label=f'h={h}')\n",
" plt.plot(t, y_app, label=f\"h={h}\")\n",
"\n",
" plt.legend(fontsize=7)\n",
" plt.xlabel(f'$t$')\n",
" plt.ylabel('$y^n$')\n",
" plt.title(f'Solutions approchées de (P) obtenus avec {schema.__name__} pour différentes valeurs du pas h')"
" plt.xlabel(\"$t$\")\n",
" plt.ylabel(\"$y^n$\")\n",
" plt.title(\n",
" f\"Solutions approchées de (P) obtenus avec {schema.__name__} pour différentes valeurs du pas h\",\n",
" )"
]
},
{
@@ -846,22 +872,24 @@
"for schema in [euler_explicite, heun, runge]:\n",
" H, E = [], []\n",
" plt.figure()\n",
" for s in range(2,11):\n",
" h=1/(2**s)\n",
" for s in range(2, 11):\n",
" h = 1 / (2**s)\n",
" t, y_app = schema(t0, T, y0, h, f3)\n",
" E.append(np.max(np.abs(yex3(t) - y_app.ravel())))\n",
" H.append(h)\n",
" \n",
"\n",
" H, E = np.array(H), np.array(E)\n",
" plt.plot(np.log(H), np.log(E), '+-', label='erreur')\n",
" plt.plot(np.log(H), np.log(H), '-', label='droite pente 1')\n",
" plt.plot(np.log(H), 2*np.log(H), '-', label='droite pente 2')\n",
" plt.plot(np.log(H), 3*np.log(H), '-', label='droite pente 3')\n",
" \n",
" plt.plot(np.log(H), np.log(E), \"+-\", label=\"erreur\")\n",
" plt.plot(np.log(H), np.log(H), \"-\", label=\"droite pente 1\")\n",
" plt.plot(np.log(H), 2 * np.log(H), \"-\", label=\"droite pente 2\")\n",
" plt.plot(np.log(H), 3 * np.log(H), \"-\", label=\"droite pente 3\")\n",
"\n",
" plt.legend()\n",
" plt.title(f'Erreur pour la méthode {schema.__name__} en echelle logarithmique : log(E) en fonction de log(h)')\n",
" plt.xlabel(f'$log(h)$')\n",
" plt.ylabel(f'$log(E)$')"
" plt.title(\n",
" f\"Erreur pour la méthode {schema.__name__} en echelle logarithmique : log(E) en fonction de log(h)\",\n",
" )\n",
" plt.xlabel(\"$log(h)$\")\n",
" plt.ylabel(\"$log(E)$\")"
]
},
{
@@ -943,19 +971,20 @@
"a = 5\n",
"t0, T = 0, 20\n",
"H = [0.1, 0.05, 0.025]\n",
"Y0 = np.array([1/10, 1])\n",
"Y0 = np.array([1 / 10, 1])\n",
"\n",
"\n",
"def P(t, Y):\n",
" return np.array([Y[1], (a - Y[0]**2) * Y[1] - Y[0]])\n",
" return np.array([Y[1], (a - Y[0] ** 2) * Y[1] - Y[0]])\n",
"\n",
"\n",
"for schema in [euler_explicite, heun, runge]:\n",
" plt.figure()\n",
" for h in H:\n",
" t, y_app = schema(t0, T, Y0, h, P)\n",
" plt.plot(t, y_app[:,0], '+ ', label=f'Solution approchée pour h={h}')\n",
" plt.plot(t, y_app[:, 0], \"+ \", label=f\"Solution approchée pour h={h}\")\n",
" plt.legend()\n",
" plt.title(f'Solutions approchees pour differents pas h par {schema.__name__}')\n",
" "
" plt.title(f\"Solutions approchees pour differents pas h par {schema.__name__}\")"
]
},
{

View File

@@ -240,22 +240,25 @@
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"\n",
"\n",
"def A(M):\n",
" return 2 * np.eye(M) - np.eye(M, k=-1) - np.eye(M, k=1)\n",
"\n",
"\n",
"def solution_approchée(f, M, a=0, b=1, ua=0, ub=0):\n",
" X = np.linspace(a, b + 1 if b == 0 else b, M+2)\n",
" U = np.zeros(M+2)\n",
" X = np.linspace(a, b + 1 if b == 0 else b, M + 2)\n",
" U = np.zeros(M + 2)\n",
" U[0], U[-1] = ua, ub\n",
" \n",
" h = (b-a)/(M+1)\n",
"\n",
" h = (b - a) / (M + 1)\n",
"\n",
" delta = np.zeros(M)\n",
" delta[0], delta[-1] = ua/np.power(h, 2), ub/np.power(h, 2)\n",
" \n",
" delta[0], delta[-1] = ua / np.power(h, 2), ub / np.power(h, 2)\n",
"\n",
" U[1:-1] = np.linalg.solve(A(M), h**2 * (f(X)[1:-1] + delta))\n",
" return X, U"
]
@@ -290,22 +293,25 @@
],
"source": [
"M = 50\n",
"h = 1/(M+1)\n",
"h = 1 / (M + 1)\n",
"\n",
"\n",
"def f1(x):\n",
" return (2 * np.pi)**2 * np.sin(2 * np.pi * x)\n",
" return (2 * np.pi) ** 2 * np.sin(2 * np.pi * x)\n",
"\n",
"\n",
"def u1(x):\n",
" return np.sin(2 * np.pi * x)\n",
"\n",
"\n",
"x, U_app = solution_approchée(f1, M)\n",
"plt.figure()\n",
"plt.scatter(x, U_app, label=f\"$A_M U_h = h² F$\")\n",
"plt.plot(x, u1(x), label='Solution exacte', color='red')\n",
"plt.scatter(x, U_app, label=\"$A_M U_h = h² F$\")\n",
"plt.plot(x, u1(x), label=\"Solution exacte\", color=\"red\")\n",
"plt.legend()\n",
"plt.ylabel('f(x)')\n",
"plt.xlabel('x')\n",
"plt.title(f\"$-u''=f(x)$\")"
"plt.ylabel(\"f(x)\")\n",
"plt.xlabel(\"x\")\n",
"plt.title(\"$-u''=f(x)$\")"
]
},
{
@@ -340,21 +346,23 @@
"H, E = [], []\n",
"for k in range(2, 12):\n",
" M = 2**k\n",
" h = 1/(M+1)\n",
" \n",
" h = 1 / (M + 1)\n",
"\n",
" x, U_app = solution_approchée(f1, M)\n",
" \n",
"\n",
" e = np.abs(u1(x) - U_app)\n",
" plt.plot(x, e, label=f'{h}')\n",
" plt.plot(x, e, label=f\"{h}\")\n",
" E.append(np.max(e))\n",
" H.append(h)\n",
"\n",
"H, E = np.array(H), np.array(E) \n",
"H, E = np.array(H), np.array(E)\n",
"\n",
"plt.xlabel('$h$')\n",
"plt.ylabel('$\\max_{j=0,\\dots,M+1}|u(x_j)-u_j|$')\n",
"plt.xlabel(\"$h$\")\n",
"plt.ylabel(r\"$\\max_{j=0,\\dots,M+1}|u(x_j)-u_j|$\")\n",
"plt.legend(fontsize=7)\n",
"plt.title('Différence en valeur absolue entre la solution exacte et la solution approchée')"
"plt.title(\n",
" \"Différence en valeur absolue entre la solution exacte et la solution approchée\",\n",
")"
]
},
{
@@ -387,12 +395,12 @@
],
"source": [
"plt.figure()\n",
"plt.plot(H, E/H, label=f'$E_h / h$')\n",
"plt.plot(H, E/H**2, label=f'$E_h / h^2$')\n",
"plt.plot(H, E / H, label=\"$E_h / h$\")\n",
"plt.plot(H, E / H**2, label=\"$E_h / h^2$\")\n",
"plt.legend()\n",
"plt.xlabel('$h$')\n",
"plt.ylabel('Erreur globale')\n",
"plt.title('Erreur globale en fonction de $h$')"
"plt.xlabel(\"$h$\")\n",
"plt.ylabel(\"Erreur globale\")\n",
"plt.title(\"Erreur globale en fonction de $h$\")"
]
},
{
@@ -425,13 +433,15 @@
],
"source": [
"plt.figure()\n",
"plt.plot(np.log(H), np.log(E), '+-', label='erreur')\n",
"plt.plot(np.log(H), np.log(H), '-', label='droite pente 1')\n",
"plt.plot(np.log(H), 2*np.log(H), '-', label='droite pente 2')\n",
"plt.plot(np.log(H), np.log(E), \"+-\", label=\"erreur\")\n",
"plt.plot(np.log(H), np.log(H), \"-\", label=\"droite pente 1\")\n",
"plt.plot(np.log(H), 2 * np.log(H), \"-\", label=\"droite pente 2\")\n",
"plt.legend()\n",
"plt.xlabel(f'$log(h)$')\n",
"plt.ylabel(f'$log(E)$')\n",
"plt.title('Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)')"
"plt.xlabel(\"$log(h)$\")\n",
"plt.ylabel(\"$log(E)$\")\n",
"plt.title(\n",
" \"Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)\",\n",
")"
]
},
{
@@ -509,27 +519,31 @@
],
"source": [
"import numpy as np\n",
"\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"\n",
"ua, ub = 1/2, 1/3\n",
"ua, ub = 1 / 2, 1 / 3\n",
"a, b = 1, 2\n",
"M = 49\n",
"\n",
"\n",
"def f2(x):\n",
" return -2/np.power((x+1), 3)\n",
" return -2 / np.power((x + 1), 3)\n",
"\n",
"\n",
"def u2(x):\n",
" return 1/(x+1)\n",
" return 1 / (x + 1)\n",
"\n",
"\n",
"x, U_app = solution_approchée(f2, M, a, b, ua, ub)\n",
"plt.figure()\n",
"plt.scatter(x, U_app, label=f\"$A_M U_h = h² F$\")\n",
"plt.plot(x, u2(x), label='Solution exacte', color='red')\n",
"plt.scatter(x, U_app, label=\"$A_M U_h = h² F$\")\n",
"plt.plot(x, u2(x), label=\"Solution exacte\", color=\"red\")\n",
"plt.legend()\n",
"plt.ylabel('f(x)')\n",
"plt.xlabel('x')\n",
"plt.title(f\"$-u''=f(x)$\")"
"plt.ylabel(\"f(x)\")\n",
"plt.xlabel(\"x\")\n",
"plt.title(\"$-u''=f(x)$\")"
]
},
{
@@ -565,19 +579,21 @@
"H, E = [], []\n",
"for k in range(2, 12):\n",
" M = 2**k\n",
" h = (b-a)/(M+1)\n",
" \n",
" h = (b - a) / (M + 1)\n",
"\n",
" x, U_app = solution_approchée(f2, M, a, b, ua, ub)\n",
" \n",
"\n",
" e = np.abs(u2(x) - U_app)\n",
" plt.plot(x, e, label=f'{h}')\n",
" plt.plot(x, e, label=f\"{h}\")\n",
" E.append(np.max(e))\n",
" H.append(h)\n",
" \n",
"plt.xlabel('$h$')\n",
"plt.ylabel('$\\max_{j=0,\\dots,M+1}|u(x_j)-u_j|$')\n",
"\n",
"plt.xlabel(\"$h$\")\n",
"plt.ylabel(r\"$\\max_{j=0,\\dots,M+1}|u(x_j)-u_j|$\")\n",
"plt.legend(fontsize=7)\n",
"plt.title('Différence en valeur absolue entre la solution exacte et la solution approchée')"
"plt.title(\n",
" \"Différence en valeur absolue entre la solution exacte et la solution approchée\",\n",
")"
]
},
{
@@ -609,14 +625,14 @@
}
],
"source": [
"H, E = np.array(H), np.array(E) \n",
"H, E = np.array(H), np.array(E)\n",
"plt.figure()\n",
"plt.plot(H, E/H, label=f'$E_h / h$')\n",
"plt.plot(H, E/H**2, label=f'$E_h / h^2$')\n",
"plt.plot(H, E / H, label=\"$E_h / h$\")\n",
"plt.plot(H, E / H**2, label=\"$E_h / h^2$\")\n",
"plt.legend()\n",
"plt.xlabel('$h$')\n",
"plt.ylabel('Erreur globale')\n",
"plt.title('Erreur globale en fonction de $h$')"
"plt.xlabel(\"$h$\")\n",
"plt.ylabel(\"Erreur globale\")\n",
"plt.title(\"Erreur globale en fonction de $h$\")"
]
},
{
@@ -649,13 +665,15 @@
],
"source": [
"plt.figure()\n",
"plt.plot(np.log(H), np.log(E), '+-', label='erreur')\n",
"plt.plot(np.log(H), np.log(H), '-', label='droite pente 1')\n",
"plt.plot(np.log(H), 2*np.log(H), '-', label='droite pente 2')\n",
"plt.plot(np.log(H), np.log(E), \"+-\", label=\"erreur\")\n",
"plt.plot(np.log(H), np.log(H), \"-\", label=\"droite pente 1\")\n",
"plt.plot(np.log(H), 2 * np.log(H), \"-\", label=\"droite pente 2\")\n",
"plt.legend()\n",
"plt.xlabel(f'$log(h)$')\n",
"plt.ylabel(f'$log(E)$')\n",
"plt.title('Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)')"
"plt.xlabel(\"$log(h)$\")\n",
"plt.ylabel(\"$log(E)$\")\n",
"plt.title(\n",
" \"Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)\",\n",
")"
]
},
{
@@ -732,22 +750,25 @@
],
"source": [
"def An(M, h):\n",
" mat = 2*np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1)\n",
" mat[0,0] = 1\n",
" mat = 2 * np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1)\n",
" mat[0, 0] = 1\n",
" mat[-1, -1] = 1\n",
" return mat + h**2 * np.eye(M)\n",
"\n",
"\n",
"def f3(x):\n",
" return (np.power(2 * np.pi, 2) + 1 ) * np.cos(2*np.pi * x)\n",
" return (np.power(2 * np.pi, 2) + 1) * np.cos(2 * np.pi * x)\n",
"\n",
"\n",
"def u3(x):\n",
" return np.cos(2*np.pi * x)\n",
" return np.cos(2 * np.pi * x)\n",
"\n",
"\n",
"def solution_neumann(f, M, a, b):\n",
" X = np.linspace(a, b, M+2)\n",
" U = np.zeros(M+2)\n",
" h = 1/(M+1)\n",
" \n",
" X = np.linspace(a, b, M + 2)\n",
" U = np.zeros(M + 2)\n",
" h = 1 / (M + 1)\n",
"\n",
" U[1:-1] = np.linalg.solve(An(M, h), h**2 * f3(X[1:-1]))\n",
" U[0], U[-1] = U[1], U[-2]\n",
" return X, U\n",
@@ -759,12 +780,18 @@
"for M in [49, 99, 499]:\n",
" x, U_app = solution_neumann(f3, M, a, b)\n",
"\n",
" plt.scatter(x, U_app, marker='+', s=3, label=\"$h^2({A_N}_h + I_M)U = h^2F$ pour M={M}\")\n",
"plt.plot(x, u3(x), label='Solution exacte', color='red')\n",
" plt.scatter(\n",
" x,\n",
" U_app,\n",
" marker=\"+\",\n",
" s=3,\n",
" label=\"$h^2({A_N}_h + I_M)U = h^2F$ pour M={M}\",\n",
" )\n",
"plt.plot(x, u3(x), label=\"Solution exacte\", color=\"red\")\n",
"plt.legend(fontsize=8)\n",
"plt.ylabel('f(x)')\n",
"plt.xlabel('x')\n",
"plt.title(f\"$-u''(x) + u(x)=f(x)$\")"
"plt.ylabel(\"f(x)\")\n",
"plt.xlabel(\"x\")\n",
"plt.title(\"$-u''(x) + u(x)=f(x)$\")"
]
},
{
@@ -800,19 +827,21 @@
"H, E = [], []\n",
"for k in range(2, 12):\n",
" M = 2**k\n",
" h = (b-a)/(M+1)\n",
" \n",
" h = (b - a) / (M + 1)\n",
"\n",
" x, U_app = solution_neumann(f3, M, a, b)\n",
" \n",
"\n",
" e = np.abs(u3(x) - U_app)\n",
" plt.plot(x, e, label=f'{h}')\n",
" plt.plot(x, e, label=f\"{h}\")\n",
" E.append(np.max(e))\n",
" H.append(h)\n",
" \n",
"plt.xlabel('$h$')\n",
"plt.ylabel('$\\max_{j=0,\\dots,M+1}|u(x_j)-u_j|$')\n",
"\n",
"plt.xlabel(\"$h$\")\n",
"plt.ylabel(r\"$\\max_{j=0,\\dots,M+1}|u(x_j)-u_j|$\")\n",
"plt.legend(fontsize=7)\n",
"plt.title('Différence en valeur absolue entre la solution exacte et la solution approchée')"
"plt.title(\n",
" \"Différence en valeur absolue entre la solution exacte et la solution approchée\",\n",
")"
]
},
{
@@ -845,13 +874,15 @@
],
"source": [
"plt.figure()\n",
"plt.plot(np.log(H), np.log(E), '+-', label='erreur')\n",
"plt.plot(np.log(H), np.log(H), '-', label='droite pente 1')\n",
"plt.plot(np.log(H), 2*np.log(H), '-', label='droite pente 2')\n",
"plt.plot(np.log(H), np.log(E), \"+-\", label=\"erreur\")\n",
"plt.plot(np.log(H), np.log(H), \"-\", label=\"droite pente 1\")\n",
"plt.plot(np.log(H), 2 * np.log(H), \"-\", label=\"droite pente 2\")\n",
"plt.legend()\n",
"plt.xlabel(f'$log(h)$')\n",
"plt.ylabel(f'$log(E)$')\n",
"plt.title('Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)')"
"plt.xlabel(\"$log(h)$\")\n",
"plt.ylabel(\"$log(E)$\")\n",
"plt.title(\n",
" \"Erreur pour la méthode mon_schema en echelle logarithmique : log(E) en fonction de log(h)\",\n",
")"
]
},
{
@@ -959,7 +990,11 @@
"outputs": [],
"source": [
"def v(x):\n",
" return 4*np.exp(-500*np.square(x-.8)) + np.exp(-50*np.square(x-.2))+.5*np.random.random(x.shape)"
" return (\n",
" 4 * np.exp(-500 * np.square(x - 0.8))\n",
" + np.exp(-50 * np.square(x - 0.2))\n",
" + 0.5 * np.random.random(x.shape)\n",
" )"
]
}
],

Binary file not shown.

File diff suppressed because one or more lines are too long

View File

@@ -9,9 +9,9 @@
},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import scipy.stats as stats"
"import numpy as np\n",
"from scipy import stats"
]
},
{
@@ -48,10 +48,10 @@
"nb_repl = 100000\n",
"sample = np.random.exponential(lam, nb_repl)\n",
"intervalle = np.linspace(0, 5, 100)\n",
"plt.hist(sample, bins=intervalle, density=True, label='Echantillon')\n",
"plt.hist(sample, bins=intervalle, density=True, label=\"Echantillon\")\n",
"\n",
"densite = stats.expon().pdf\n",
"plt.plot(intervalle, densite(intervalle), label='Fonction densité')\n",
"plt.plot(intervalle, densite(intervalle), label=\"Fonction densité\")\n",
"plt.legend()"
]
},
@@ -86,15 +86,15 @@
],
"source": [
"np_repl = 10000\n",
"sampleX, sampleY = 2*np.random.rand(np_repl)-1, 2*np.random.rand(np_repl)-1\n",
"sampleX, sampleY = 2 * np.random.rand(np_repl) - 1, 2 * np.random.rand(np_repl) - 1\n",
"\n",
"intervalle = np.linspace(-2, 2, 100)\n",
"plt.hist(sampleX + sampleY, bins=intervalle, density=True, label='Echantillon')\n",
"plt.hist(sampleX + sampleY, bins=intervalle, density=True, label=\"Echantillon\")\n",
"\n",
"densite = stats.uniform(-1, 2).pdf\n",
"plt.plot(intervalle, densite(intervalle), label='Fonction densité')\n",
"plt.plot(intervalle, densite(intervalle), label=\"Fonction densité\")\n",
"\n",
"plt.plot(intervalle, 1/4 * np.maximum(2 - np.abs(intervalle), 0), label='Fonction')\n",
"plt.plot(intervalle, 1 / 4 * np.maximum(2 - np.abs(intervalle), 0), label=\"Fonction\")\n",
"plt.legend()"
]
},
@@ -130,10 +130,26 @@
"source": [
"lam = 1\n",
"nb_repl = 10000\n",
"sampleX, sampleY, sampleZ = np.random.exponential(lam, nb_repl), np.random.exponential(lam, nb_repl), np.random.exponential(lam, nb_repl)\n",
"sampleX, sampleY, sampleZ = (\n",
" np.random.exponential(lam, nb_repl),\n",
" np.random.exponential(lam, nb_repl),\n",
" np.random.exponential(lam, nb_repl),\n",
")\n",
"intervalle = np.linspace(-5, 5, 100)\n",
"plt.hist(sampleX - sampleY/2, bins=intervalle, density=True, alpha=.7, label='Echantillon de X - Y/2')\n",
"plt.hist(sampleZ/2, bins=intervalle, density=True, alpha=.7, label='Echantillon de Z')\n",
"plt.hist(\n",
" sampleX - sampleY / 2,\n",
" bins=intervalle,\n",
" density=True,\n",
" alpha=0.7,\n",
" label=\"Echantillon de X - Y/2\",\n",
")\n",
"plt.hist(\n",
" sampleZ / 2,\n",
" bins=intervalle,\n",
" density=True,\n",
" alpha=0.7,\n",
" label=\"Echantillon de Z\",\n",
")\n",
"plt.legend()"
]
},
@@ -157,15 +173,20 @@
}
],
"source": [
"p = 1/2\n",
"p = 1 / 2\n",
"nb_repl = 1000\n",
"\n",
"plt.figure(figsize=(18, 5))\n",
"for i, k in enumerate([10, 100, 1000]):\n",
" plt.subplot(1, 3, i+1)\n",
" plt.subplot(1, 3, i + 1)\n",
" sample = np.random.binomial(k, p, nb_repl)\n",
" intervalle = np.linspace(np.min(sample), np.max(sample), 100)\n",
" plt.hist(sample, bins=intervalle, density=True, label=f'Echantillon de X pour n={k}')\n",
" plt.hist(\n",
" sample,\n",
" bins=intervalle,\n",
" density=True,\n",
" label=f\"Echantillon de X pour n={k}\",\n",
" )\n",
" plt.legend()"
]
},
@@ -179,7 +200,7 @@
"outputs": [],
"source": [
"def sample_uniforme(N):\n",
" return 2*np.random.rand(N) - 1"
" return 2 * np.random.rand(N) - 1"
]
},
{
@@ -218,7 +239,7 @@
"\n",
"for _ in range(nb_lgn):\n",
" liste_Sn.append(np.mean(sample_uniforme(nb_repl)))\n",
" \n",
"\n",
"nb_bins = 100\n",
"intervalles = np.linspace(np.min(liste_Sn), np.max(liste_Sn), nb_bins)\n",
"plt.hist(liste_Sn, density=True, bins=intervalles)\n",
@@ -256,12 +277,14 @@
"liste_Sn = []\n",
"\n",
"for _ in range(nb_lgn):\n",
" liste_Sn.append(np.mean(np.sqrt(3*nb_repl) * np.tan(np.pi/2 * sample_uniforme(nb_repl))))\n",
" liste_Sn.append(\n",
" np.mean(np.sqrt(3 * nb_repl) * np.tan(np.pi / 2 * sample_uniforme(nb_repl))),\n",
" )\n",
"\n",
"#nb_bins = 100\n",
"#intervalles = np.linspace(np.min(liste_Sn), np.max(liste_Sn), nb_bins)\n",
"#plt.hist(liste_Sn, density=True, bins=intervalles)\n",
"#plt.show()\n",
"# nb_bins = 100\n",
"# intervalles = np.linspace(np.min(liste_Sn), np.max(liste_Sn), nb_bins)\n",
"# plt.hist(liste_Sn, density=True, bins=intervalles)\n",
"# plt.show()\n",
"\n",
"plt.figure()\n",
"densite = stats.norm(scale=1).pdf\n",

View File

@@ -9,11 +9,11 @@
},
"outputs": [],
"source": [
"import numpy as np\n",
"import scipy.stats as stats\n",
"import scipy.special as sp\n",
"import matplotlib.pyplot as plt\n",
"import scipy.optimize as opt"
"import numpy as np\n",
"import scipy.optimize as opt\n",
"import scipy.special as sp\n",
"from scipy import stats"
]
},
{
@@ -27,7 +27,12 @@
"source": [
"def f(L, z):\n",
" x, y = L[0], L[1]\n",
" return np.power(z*x, 2) + np.power(y/z, 2) - np.cos(2 * np.pi * x) - np.cos(2 * np.pi * y)"
" return (\n",
" np.power(z * x, 2)\n",
" + np.power(y / z, 2)\n",
" - np.cos(2 * np.pi * x)\n",
" - np.cos(2 * np.pi * y)\n",
" )"
]
},
{
@@ -123,17 +128,17 @@
"source": [
"plt.figure(figsize=(18, 5))\n",
"for i, nb_repl in enumerate([100, 1000, 10000]):\n",
" plt.subplot(1, 3, i+1)\n",
" plt.subplot(1, 3, i + 1)\n",
" sample_X1 = np.random.normal(0, 1, nb_repl)\n",
" sample_X2 = np.random.normal(3, np.sqrt(5), nb_repl)\n",
" sample_e = np.random.normal(0, np.sqrt(1/4), nb_repl)\n",
" sample_e = np.random.normal(0, np.sqrt(1 / 4), nb_repl)\n",
" Y = 5 * sample_X1 - 4 * sample_X2 + 2 + sample_e\n",
"\n",
" intervalle = np.linspace(np.min(Y), np.max(Y), 100)\n",
" plt.hist(Y, bins=intervalle, density=True, label='Echantillon de Y')\n",
" plt.hist(Y, bins=intervalle, density=True, label=\"Echantillon de Y\")\n",
"\n",
" densite = stats.norm(-10, np.sqrt(105.25)).pdf\n",
" plt.plot(intervalle, densite(intervalle), label='Fonction densité')\n",
" plt.plot(intervalle, densite(intervalle), label=\"Fonction densité\")\n",
"\n",
" plt.title(f\"Graphique de la somme de gaussiennes pour N={nb_repl}\")\n",
" plt.legend()"
@@ -151,8 +156,9 @@
"def theta_hat(Y):\n",
" return np.mean(Y)\n",
"\n",
"\n",
"def sigma_hat(Y):\n",
" return 1/nb_repl * np.sum(np.power(Y - theta_hat(Y), 2))"
" return 1 / nb_repl * np.sum(np.power(Y - theta_hat(Y), 2))"
]
},
{
@@ -166,7 +172,11 @@
"source": [
"def log_likehood_gauss(X, Y):\n",
" theta, sigma_2 = X[0], X[1]\n",
" return 1/2*np.log(2*np.pi) + 1/2*np.log(sigma_2) + 1/(2*nb_repl*sigma_2) * np.sum(np.power(Y - theta, 2))"
" return (\n",
" 1 / 2 * np.log(2 * np.pi)\n",
" + 1 / 2 * np.log(sigma_2)\n",
" + 1 / (2 * nb_repl * sigma_2) * np.sum(np.power(Y - theta, 2))\n",
" )"
]
},
{
@@ -191,9 +201,9 @@
"nb_repl = 5000\n",
"sample_X1 = np.random.normal(0, 1, nb_repl)\n",
"sample_X2 = np.random.normal(3, np.sqrt(5), nb_repl)\n",
"sample_e = np.random.normal(0, np.sqrt(1/4), nb_repl)\n",
"sample_e = np.random.normal(0, np.sqrt(1 / 4), nb_repl)\n",
"Y = 5 * sample_X1 - 4 * sample_X2 + 2 + sample_e\n",
" \n",
"\n",
"mk = {\"method\": \"BFGS\", \"args\": Y}\n",
"res = opt.basinhopping(log_likehood_gauss, x0=(-1, 98.75), minimizer_kwargs=mk)\n",
"print(res.x)\n",
@@ -210,13 +220,13 @@
},
"outputs": [],
"source": [
"def simule(a, b, n):\n",
" X = np.random.gamma(a, 1/b, n)\n",
"def simule(a, b, n) -> None:\n",
" X = np.random.gamma(a, 1 / b, n)\n",
" intervalle = np.linspace(0, np.max(X), 100)\n",
" plt.hist(X, bins=intervalle, density=True, label='Echantillon de X')\n",
" plt.hist(X, bins=intervalle, density=True, label=\"Echantillon de X\")\n",
"\n",
" densite = stats.gamma.pdf(intervalle, a, 0, 1/b)\n",
" plt.plot(intervalle, densite, label='Fonction densité Gamma(2, 1)')\n",
" densite = stats.gamma.pdf(intervalle, a, 0, 1 / b)\n",
" plt.plot(intervalle, densite, label=\"Fonction densité Gamma(2, 1)\")\n",
" plt.legend()"
]
},
@@ -254,8 +264,13 @@
"source": [
"def log_likehood_gamma(X, sample):\n",
" a, b = X[0], X[1]\n",
" n = len(sample) \n",
" return -n*a*np.log(b) + n * np.log(sp.gamma(a)) - (a-1) * np.sum(np.log(sample)) + b * np.sum(sample)"
" n = len(sample)\n",
" return (\n",
" -n * a * np.log(b)\n",
" + n * np.log(sp.gamma(a))\n",
" - (a - 1) * np.sum(np.log(sample))\n",
" + b * np.sum(sample)\n",
" )"
]
},
{
@@ -296,7 +311,7 @@
"nb_repl = 1000\n",
"a, b = 2, 1\n",
"\n",
"sample = np.random.gamma(a, 1/b, nb_repl)\n",
"sample = np.random.gamma(a, 1 / b, nb_repl)\n",
"mk = {\"method\": \"BFGS\", \"args\": sample}\n",
"res = opt.basinhopping(log_likehood_gamma, x0=(1, 1), minimizer_kwargs=mk)\n",
"print(res.x)"

View File

@@ -0,0 +1,120 @@
```{r}
setwd('/Users/arthurdanjou/Workspace/studies/M1/Data Analysis/TP1')
```
# Part 1 - Analysis of the data
```{r}
x <- c(1, 2, 3, 4)
mean(x)
y <- x - mean(x)
mean(y)
t(y)
sum(y^2)
```
```{r}
T <- read.table("Temperature Data.csv", header = TRUE, sep = ";", dec = ",", row.names = 1)
n <- nrow(T)
g <- colMeans(T)
Y <- as.matrix(T - rep(1, n) %*% t(g))
Dp <- diag(1 / n, n)
V <- t(Y) %*% Dp %*% Y
eigen_values <- eigen(V)$values
vectors <- eigen(V)$vectors
total_inertia <- sum(eigen_values)
inertia_one <- max(eigen_values) / sum(eigen_values)
inertia_plan <- (eigen_values[1] + eigen_values[2]) / sum(eigen_values)
P <- Y %*% vectors[, 1:2]
plot(P, pch = 19, xlab = "PC1", ylab = "PC2")
text(P, rownames(T), cex = 0.7, pos = 3)
axis(1, -10:10, pos = 0, labels = F)
axis(2, -5:5, pos = 0, labels = F)
```
```{r}
France <- P %*% matrix(c(0, -1, 1, 0), 2, 2)
plot(France, pch = 19, xlab = "PC1", ylab = "PC2")
text(France, rownames(T), cex = 0.7, pos = 3)
axis(1, -10:10, pos = 0, labels = F)
axis(2, -5:5, pos = 0, labels = F)
```
```{r}
results <- matrix(NA, nrow = n, ncol = 2)
colnames(results) <- c("Quality of Representation on Plane 1-2 (%)", "Contribution to Axis 1 Inertia (%)")
rownames(results) <- rownames(T)
for (i in 1:n) {
yi <- Y[i,]
norm_yi <- sqrt(sum(yi^2))
qlt <- sum((yi %*% vectors[, 1:2])^2) / norm_yi^2 * 100
ctr <- (P[i, 1]^2 / eigen_values[1]) / n * 100
results[i,] <- c(qlt, ctr)
}
# Add the total row
results <- rbind(results, colSums(results))
rownames(results)[n + 1] <- "Total"
results
```
# Part 2 - PCA with FactoMineR
```{r}
library(FactoMineR)
T <- read.csv("Temperature Data.csv", header = TRUE, sep = ";", dec = ",", row.names = 1)
summary(T)
```
```{r}
T.pca <- PCA(T, graph = F)
plot.PCA(T.pca, axes = c(1, 2), habillage = 1, choix = "ind")
plot.PCA(T.pca, axes = c(1, 2), habillage = 1, choix = "var")
print("Var coords")
round(T.pca$var$coord[, 1:2], 2)
print("Eigen values")
round(T.pca$eig, 2)
print("Ind dis")
round(T.pca$ind$dist, 2)
print("Ind contrib")
round(T.pca$ind$contrib[, 1:2], 2)
print("Var contrib")
round(T.pca$var$contrib[, 1:2], 2)
```
## We add new values
```{r}
Amiens <- c(3.1, 3.8, 6.7, 9.5, 12.8, 15.8, 17.6, 17.6, 15.5, 11.1, 6.8, 4.2)
T <- rbind(T, Amiens)
row.names(T)[16] <- "Amiens"
Moscow <- c(-9.2, -8, -2.5, 5.9, 12.8, 16.8, 18.4, 16.6, 11.2, 4.9, -1.5, -6.2)
T <- rbind(T, Moscow)
row.names(T)[17] <- "Moscow"
Marrakech <- c(11.3, 12.8, 15.8, 18.1, 21.2, 24.7, 28.6, 28.6, 25, 20.9, 15.9, 12.1)
T <- rbind(T, Marrakech)
row.names(T)[18] <- "Marrakech"
```
## We redo the PCA
```{r}
T.pca <- PCA(T, ind.sup = 16:18, graph = F)
plot.PCA(T.pca, axes = c(1, 2), habillage = 1, choix = "ind")
plot.PCA(T.pca, axes = c(1, 2), habillage = 1, choix = "var")
```

View File

@@ -0,0 +1,16 @@
Ville;janv;fev;mars;avril;mai;juin;juil;aout;sept;oct;nov;dec
Bordeaux;5,6;6,6;10,3;12,8;15,8;19,3;20,9;21,0;18,6;13,8;9,1;6,2
Brest;6,1;5,8;7,8;9,2;11,6;14,4;15,6;16,0;14,7;12,0;9,0;7,0
Clermont-Ferrand;2,6;3,7;7,5;10,3;13,8;17,3;19,4;19,1;16,2;11,2;6,6;3,6
Grenoble;1,5;3,2;7,7;10,6;14,5;17,8;20,1;19,5;16,7;11,4;6,5;2,3
Lille;2,4;2,9;6,0;8,9;12,4;15,3;17,1;17,1;14,7;10,4;6,1;3,5
Lyon;2,1;3,3;7,7;10,9;14,9;18,5;20,7;20,1;16,9;11,4;6,7;3,1
Marseille;5,5;6,6;10,0;13,0;16,8;20,8;23,3;22,8;19,9;15,0;10,2;6,9
Montpellier;5,6;6,7;9,9;12,8;16,2;20,1;22,7;22,3;19,3;14,6;10,0;6,5
Nantes;5,0;5,3;8,4;10,8;13,9;17,2;18,8;18,6;16,4;12,2;8,2;5,5
Nice;7,5;8,5;10,8;13,3;16,7;20,1;22,7;22,5;20,3;16,0;11,5;8,2
Paris;3,4;4,1;7,6;10,7;14,3;17,5;19,1;18,7;16,0;11,4;7,1;4,3
Rennes;4,8;5,3;7,9;10,1;13,1;16,2;17,9;17,8;15,7;11,6;7,8;5,4
Strasbourg;0,4;1,5;5,6;9,8;14,0;17,2;19,0;18,3;15,1;9,5;4,9;1,3
Toulouse;4,7;5,6;9,2;11,6;14,9;18,7;20,9;20,9;18,3;13,3;8,6;5,5
Vichy;2,4;3,4;7,1;9,9;13,6;17,1;19,3;18,8;16,0;11,0;6,6;3,4

View File

@@ -0,0 +1,512 @@
---
title: "Generalized Linear Models - Project by Arthur DANJOU"
always_allow_html: true
output:
html_document:
toc: true
toc_depth: 4
fig_caption: true
---
## Data Analysis
```{r}
setwd('/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/Projet')
```
```{r}
library(MASS)
library(AER)
library(rmarkdown)
library(car)
library(corrplot)
library(carData)
library(ggfortify)
library(ggplot2)
library(gridExtra)
library(caret)
```
### Data Preprocessing
```{r}
data <- read.csv("./projet.csv", header = TRUE, sep = ",", dec = ".")
# Factors
data$saison <- as.factor(data$saison)
data$mois <- as.factor(data$mois)
data$jour_mois <- as.factor(data$jour_mois)
data$jour_semaine <- as.factor(data$jour_semaine)
data$horaire <- as.factor(data$horaire)
data$jour_travail <- as.factor(data$jour_travail)
data$vacances <- as.factor(data$vacances)
data$meteo <- as.factor(data$meteo)
# Quantitative variables
data$humidite_sqrt <- sqrt(data$humidite)
data$humidite_square <- data$humidite^2
data$temperature1_square <- data$temperature1^2
data$temperature2_square <- data$temperature2^2
data$vent_square <- data$vent^2
data$vent_sqrt <- sqrt(data$vent)
# Remove obs column
rownames(data) <- data$obs
data <- data[, -1]
paged_table(data)
```
```{r}
colSums(is.na(data))
sum(duplicated(data))
```
```{r}
str(data)
summary(data)
```
### Study of the Quantitative Variables
#### Distribution of Variables
```{r}
plot_velos <- ggplot(data, aes(x = velos)) +
geom_histogram(bins = 25, fill = "blue") +
labs(title = "Distribution du nombre de vélos loués", x = "Nombre de locations de vélos")
plot_temperature1 <- ggplot(data, aes(x = temperature1)) +
geom_histogram(bins = 30, fill = "green") +
labs(title = "Distribution de la température1", x = "Température moyenne mesuré (°C)")
plot_temperature2 <- ggplot(data, aes(x = temperature2)) +
geom_histogram(bins = 30, fill = "green") +
labs(title = "Distribution de la température2", x = "Température moyenne ressentie (°C)")
plot_humidite <- ggplot(data, aes(x = humidite)) +
geom_histogram(bins = 30, fill = "green") +
labs(title = "Distribution de la humidité", x = "Pourcentage d'humidité")
plot_vent <- ggplot(data, aes(x = vent)) +
geom_histogram(bins = 30, fill = "green") +
labs(title = "Distribution de la vent", x = "Vitesse du vent (Km/h)")
grid.arrange(plot_velos, plot_temperature1, plot_temperature2, plot_humidite, plot_vent, ncol = 3)
```
#### Correlation between Quantitatives Variables
```{r}
#detach(package:arm) # If error, uncomment this line due to duplicate function 'corrplot'
corr_matrix <- cor(data[, sapply(data, is.numeric)])
corrplot(corr_matrix, type = "lower", tl.col = "black", tl.srt = 45)
pairs(data[, sapply(data, is.numeric)])
cor.test(data$temperature1, data$temperature2)
```
### Study of the Qualitative Variables
```{r}
plot_saison <- ggplot(data, aes(x = saison, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par saison", x = "Saison", y = "Nombre de vélos loués")
plot_jour_semaine <- ggplot(data, aes(x = jour_semaine, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par jour de la semaine", x = "Jour de la semaine", y = "Nombre de vélos loués")
plot_mois <- ggplot(data, aes(x = mois, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par mois de l'année", x = "Mois de l'année", y = "Nombre de vélos loués")
plot_jour_mois <- ggplot(data, aes(x = jour_mois, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par jour du mois", x = "Jour du mois", y = "Nombre de vélos loués")
plot_horaire <- ggplot(data, aes(x = horaire, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par horaire de la journée", x = "Horaire de la journée", y = "Nombre de vélos loués")
plot_jour_travail <- ggplot(data, aes(x = jour_travail, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par jour travaillé", x = "Jour travaillé", y = "Nombre de vélos loués")
plot_vacances <- ggplot(data, aes(x = vacances, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par vacances", x = "Vacances", y = "Nombre de vélos loués")
plot_meteo <- ggplot(data, aes(x = meteo, y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Nombre de vélos par météo", x = "Météo", y = "Nombre de vélos loués")
grid.arrange(plot_horaire, plot_jour_semaine, plot_jour_mois, plot_mois, plot_saison, plot_jour_travail, plot_vacances, plot_meteo, ncol = 3)
```
```{r}
chisq.test(data$mois, data$saison)
chisq.test(data$meteo, data$saison)
chisq.test(data$meteo, data$mois)
```
### Outliers Detection
```{r}
boxplot_velos <- ggplot(data, aes(x = "", y = velos)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de vélos", y = "Nombre de vélos")
boxplot_temperature1 <- ggplot(data, aes(x = "", y = temperature1)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de température1", y = "Température moyenne mesuré (°C)")
boxplot_temperature1_sq <- ggplot(data, aes(x = "", y = temperature1_square)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de température1^2", y = "Température moyenne mesuré (°C^2)")
boxplot_temperature2 <- ggplot(data, aes(x = "", y = temperature2)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de température2", y = "Température moyenne ressentie (°C)")
boxplot_temperature2_sq <- ggplot(data, aes(x = "", y = temperature2_square)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de température2^2", y = "Température moyenne ressentie (°C^2)")
boxplot_humidite <- ggplot(data, aes(x = "", y = humidite)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de humidité", y = "Pourcentage d'humidité")
boxplot_humidite_sqrt <- ggplot(data, aes(x = "", y = humidite_sqrt)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de sqrt(humidité)", y = "Pourcentage d'humidité (sqrt(%))")
boxplot_humidite_square <- ggplot(data, aes(x = "", y = humidite_square)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de humidité^2", y = "Pourcentage d'humidité (%^2)")
boxplot_vent <- ggplot(data, aes(x = "", y = vent)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de vent", y = "Vitesse du vent (Km/h)")
boxplot_vent_sqrt <- ggplot(data, aes(x = "", y = vent_sqrt)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de sqrt(vent)", y = "Vitesse du vent sqrt(Km/h)")
boxplot_vent_square <- ggplot(data, aes(x = "", y = vent_square)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Boxplot de vent^2", y = "Vitesse du vent (Km/h)^2")
grid.arrange(boxplot_velos, boxplot_temperature1, boxplot_temperature1_sq, boxplot_temperature2, boxplot_temperature2_sq, boxplot_humidite, boxplot_humidite_sqrt, boxplot_humidite_square, boxplot_vent, boxplot_vent_sqrt, boxplot_vent_square, ncol = 4)
```
```{r}
length_data <- nrow(data)
numeric_vars <- sapply(data, is.numeric)
total_outliers <- 0
for (var in names(data)[numeric_vars]) {
data_outlier <- data[[var]]
Q1 <- quantile(data_outlier, 0.25)
Q3 <- quantile(data_outlier, 0.75)
IQR <- Q3 - Q1
lower_limit <- Q1 - 1.5 * IQR
upper_limit <- Q3 + 1.5 * IQR
outliers <- data[data_outlier < lower_limit | data_outlier > upper_limit,]
cat("Number of outliers for the variable", var, ":", nrow(outliers), "\n")
data <- data[data_outlier >= lower_limit & data_outlier <= upper_limit,]
}
cat("Number of outliers removed :", length_data - nrow(data), "\n")
cat("Data length after removing outliers :", nrow(data), "\n")
```
## Model Creation and Comparison
### Data Split
```{r}
set.seed(123)
data_split <- rsample::initial_split(data, prop = 0.8)
data_train <- rsample::training(data_split)
data_test <- rsample::testing(data_split)
```
### Choice of the Distribution *vélos*
```{r}
model_poisson <- glm(velos ~ ., family = poisson, data = data_train)
model_nb <- glm.nb(velos ~ ., data = data_train)
model_gaussian <- glm(velos ~ ., family = gaussian, data = data_train)
t(AIC(model_poisson, model_nb, model_gaussian)[2])
dispersiontest(model_poisson)
mean_velos <- mean(data_train$velos)
var_velos <- var(data_train$velos)
cat("Mean :", mean_velos, "Variance :", var_velos, "\n")
```
### Model Selection
```{r}
model_full_quantitative <- glm.nb(
velos ~ vent +
vent_square +
vent_sqrt +
humidite +
humidite_square +
humidite_sqrt +
temperature1_square +
temperature1 +
temperature2 +
temperature2_square
, data = data_train)
summary(model_full_quantitative)
```
```{r}
anova(model_full_quantitative)
```
```{r}
model_quantitative_quali <- glm.nb(
velos ~
vent +
vent_square +
vent_sqrt +
humidite +
humidite_square +
humidite_sqrt +
temperature1 +
temperature1_square +
horaire +
saison +
meteo +
mois +
vacances +
jour_travail +
jour_semaine +
jour_mois
, data = data_train)
summary(model_quantitative_quali)
```
```{r}
anova(model_quantitative_quali)
```
```{r}
model_quanti_quali_final <- glm.nb(
velos ~
vent +
vent_square + #
humidite +
humidite_square +
humidite_sqrt + #
temperature1 +
temperature1_square +
horaire +
saison +
meteo +
mois +
vacances
, data = data_train)
summary(model_quanti_quali_final)
```
```{r}
model_0 <- glm.nb(velos ~ 1, data = data_train)
model_forward <- stepAIC(
model_0,
velos ~ vent +
humidite +
humidite_square +
temperature1 +
temperature1_square +
horaire +
saison +
meteo +
mois +
vacances,
data = data_train,
trace = FALSE,
direction = "forward"
)
model_backward <- stepAIC(
model_quanti_quali_final,
~1,
trace = FALSE,
direction = "backward"
)
model_both <- stepAIC(
model_0,
velos ~ vent +
humidite +
humidite_square +
temperature1 +
temperature1_square +
horaire +
saison +
meteo +
mois +
vacances,
data = data_train,
trace = FALSE,
direction = "both"
)
AIC(model_forward, model_both, model_backward)
summary(model_forward)
```
```{r}
model_final_without_interaction <- glm.nb(
velos ~ horaire +
mois +
meteo +
temperature1 +
saison +
temperature1_square +
humidite_square +
vacances +
humidite +
vent
, data = data_train)
summary(model_final_without_interaction)
```
### Final Model
#### Choice and Validation of the Model
```{r}
model_final <- glm.nb(
velos ~ horaire +
mois +
meteo +
temperature1 +
saison +
temperature1_square +
humidite_square +
vacances +
humidite +
vent +
horaire:temperature1 +
temperature1_square:mois
, data = data_train
)
summary(model_final)
```
```{r}
anova(model_final, test = "Chisq")
```
```{r}
lrtest(model_final, model_final_without_interaction)
```
```{r}
dispersion_ratio <- sum(residuals(model_final, type = "pearson")^2) / df.residual(model_final)
print(paste("Dispersion Ratio:", round(dispersion_ratio, 2)))
hist(residuals(model_final, type = "deviance"), breaks = 30, freq = FALSE, col = "lightblue", main = "Histogram of Deviance Residuals")
x <- seq(-5, 5, length = 100)
curve(dnorm(x), col = "darkblue", lwd = 2, add = TRUE)
autoplot(model_final, 1:4)
```
```{r}
library(arm)
binnedplot(fitted(model_final), residuals(model_final, type = "deviance"), col.int = "blue", col.pts = 2)
```
## Performance and Limitations of the Final Model
### Predictions and *Mean Square Error*
```{r}
data_test$predictions <- predict(model_final, newdata = data_test, type = "response")
data_train$predictions <- predict(model_final, newdata = data_train, type = "response")
predictions_ci <- predict(
model_final,
newdata = data_test,
type = "link",
se.fit = TRUE
)
data_test$lwr <- exp(predictions_ci$fit - 1.96 * predictions_ci$se.fit)
data_test$upr <- exp(predictions_ci$fit + 1.96 * predictions_ci$se.fit)
MSE <- mean((data_test$velos - data_test$predictions)^2)
cat("MSE :", MSE, "\n")
cat("RMSE train :", sqrt(mean((data_train$velos - data_train$predictions)^2)), "\n")
cat("RMSE test :", sqrt(MSE), "\n")
```
### Evaluation of Model Performance
```{r}
bounds <- c(200, 650)
cat("Observations vs Prédictions\n")
cat(nrow(data_test[data_test$velos < bounds[1],]), "vs", sum(data_test$predictions < bounds[1]), "\n")
cat(nrow(data_test[data_test$velos > bounds[1] & data_test$velos < bounds[2],]), "vs", sum(data_test$predictions > bounds[1] & data_test$predictions < bounds[2]), "\n")
cat(nrow(data_test[data_test$velos > bounds[2],]), "vs", sum(data_test$predictions > bounds[2]), "\n")
cat('\n')
categories <- c(0, bounds, Inf)
categories_label <- c("Low", "Mid", "High")
data_test$velos_cat <- cut(data_test$velos,
breaks = categories,
labels = categories_label,
include.lowest = TRUE)
data_test$predictions_cat <- cut(data_test$predictions,
breaks = categories,
labels = categories_label,
include.lowest = TRUE)
conf_matrix <- confusionMatrix(data = data_test$predictions_cat, reference = data_test$velos_cat)
conf_matrix
```
```{r}
ggplot(data_test, aes(x = velos, y = predictions)) +
geom_point(color = "blue", alpha = 0.6, size = 2) +
geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, fill = "grey") +
labs(
title = "Comparison of Observed and Predicted Bikes",
x = "Number of Observed Bikes",
y = "Number of Predicted Bikes"
) +
theme_minimal()
ggplot(data_test, aes(x = velos_cat, fill = predictions_cat)) +
geom_bar(position = "dodge") +
labs(title = "Predictions vs Observations (by categories)",
x = "Observed categories", y = "Number of predictions")
df_plot <- data.frame(observed = data_test$velos, predicted = data_test$predictions)
# Create the histogram with the density curve
ggplot(df_plot, aes(x = observed)) +
  geom_histogram(aes(y = ..density..), binwidth = 50, fill = "lightblue", color = "black", alpha = 0.7) + # Histogram of observed values
  geom_density(aes(x = predicted), color = "red", size = 1) + # Density curve of predicted values
  labs(title = "Observed distribution vs. Predicted distribution",
       x = "Number of bikes",
y = "Density") +
theme_bw()
```

File diff suppressed because one or more lines are too long

Binary file not shown.

Binary file not shown.

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,139 @@
```{r}
setwd("/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/TP1-bis")
library(tidyverse)
options(scipen = 999, digits = 5)
```
```{r}
data <- read.csv("data01.csv", header = TRUE, sep = ",", dec = ".")
head(data, 20)
```
```{r}
ggplot(data, aes(x = cholesterol)) +
geom_histogram(binwidth = 5, color = "black", fill = "gray80") +
labs(x = "Cholesterol", y = "Frequency", title = "Histogram of cholesterol") +
theme_bw(14)
ggplot(data, aes(x = poids)) +
geom_histogram(binwidth = 2.5, color = "black", fill = "gray80") +
labs(x = "Poids", y = "Frequency", title = "Histogram of Poids") +
theme_bw(14)
ggplot(aes(y = cholesterol, x = poids), data = data) +
geom_point() +
labs(y = "Cholesterol (y)", x = "Poids, kg (x)", title = "Scatter plot of cholesterol and poids") +
theme_bw(14)
```
# OLSE
```{r}
x <- data[, "poids"]
y <- data[, "cholesterol"]
Sxy <- sum((x - mean(x)) * y)
Sxx <- sum((x - mean(x))^2)
beta1 <- Sxy / Sxx
beta0 <- mean(y) - beta1 * mean(x)
c(beta0, beta1)
```
Final Equation: y = 18.4470 + 2.0523 * x
```{r}
X <- cbind(1, x)
colnames(X) <- NULL
X_t_X <- t(X) %*% X
inv_X_t_x <- solve(X_t_X)
betas <- inv_X_t_x %*% t(X) %*% y
betas
```
```{r}
model <- lm(data, formula = cholesterol ~ poids)
summary(model)
coef(model)
```
```{r}
data <- data |>
mutate(yhat = beta0 + beta1 * poids) |>
mutate(residuals = cholesterol - yhat)
data
ggplot(data, aes(x = poids, y = cholesterol)) +
geom_point(size = 2, shape = 21, fill = "blue", color = "cyan") +
geom_line(aes(y = yhat), color = "blue") +
labs(x = "Poids", y = "Cholesterol", title = "OLS Regression Line") +
theme_bw(14)
```
```{r}
mean(data[, "cholesterol"])
mean(data[, "yhat"])
mean(data[, "residuals"]) |> round(10)
cov(data[, "residuals"], data[, "poids"]) |> round(10)
(RSS <- sum((data[, "residuals"])^2))
(TSS <- sum((y - mean(y))^2))
TSS - beta1 * Sxy
```
```{r}
dof <- nrow(data) - 2
sigma_hat_2 <- RSS / dof
cov_beta <- sigma_hat_2 * inv_X_t_x
var_beta <- diag(cov_beta)
std_beta <- sqrt(var_beta)
sigma(model)^2
vcov(model)
```
# Hypothesis Testing
```{r}
(t_beta <- beta1 / std_beta[2])
qt(0.975, dof)
2 * pt(abs(t_beta), dof, lower.tail = FALSE)
summary(model)
```
```{r}
MSS <- (TSS - RSS)
(F <- MSS / (RSS / dof))
qf(0.95, 1, dof)
pf(F, 1, dof, lower.tail = FALSE)
anova(model)
```
# Confidence Interval
```{r}
(CI_beta1 <- c(beta1 - qt(0.975, dof) * std_beta[2], beta1 + qt(0.975, dof) * std_beta[2]))
confint(model, level = 0.95)
t <- qt(0.975, dof)
sigma_hat <- sigma(model)
n <- nrow(data)
data <- data |>
mutate(error = t *
sigma_hat *
sqrt(1 / n + (poids - mean(poids))^2 / Sxx)) |>
mutate(conf.low = yhat - error, conf.high = yhat + error, error = NULL)
ggplot(data, aes(x = poids, y = cholesterol)) +
geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.3, fill = "blue") +
geom_point(size = 2, shape = 21, fill = "blue", color = "cyan") +
geom_line(aes(y = yhat), color = "blue") +
labs(x = "Poids", y = "Cholesterol", title = "OLS Regression Line") +
theme_bw(14)
```
# R^2
```{r}
(R2 <- MSS / TSS)
summary(model)$r.squared
cor(data[, "cholesterol"], data[, "poids"])^2
```

View File

@@ -0,0 +1,51 @@
id,cholesterol,poids
1,144.47452579195235,61.2
2,145.88924868817446,62.6
3,161.1126037995184,69.9
4,155.4458581667524,63.8
5,147.97655520732974,64
6,170.24615865999974,70.5
7,144.19706954005656,65.4
8,140.1067951922945,58.3
9,142.7466793355334,60.7
10,145.2692979993652,61.7
11,160.3640967819068,68.5
12,148.56479095579394,65
13,149.76055898423206,65.1
14,143.85346030579532,63.9
15,137.88860216547474,61.2
16,164.79575052668758,70.8
17,154.49767048200715,65.5
18,131.4504444444691,55.5
19,158.60793029776818,66.4
20,154.20805689206188,61.6
21,136.4149514835298,59.1
22,134.5904701014675,62.6
23,144.2563545892844,59.3
24,138.22197617664875,60.5
25,139.21587117755206,60.9
26,138.70662904163586,56.6
27,153.73921419517316,66.9
28,143.2077515962295,64.1
29,139.1754465872936,58.8
30,158.02566076295264,68.6
31,151.67509002312616,65.2
32,147.4035408260477,62.3
33,153.80002424297288,67.1
34,158.72992406100167,67.1
35,153.92208543890675,66.8
36,155.54678399213427,66.3
37,158.2201910034931,65.8
38,149.64656232873804,63.2
39,143.75436129736758,62.2
40,150.4910110335996,61.9
41,147.0277746100402,60.7
42,148.96429207047277,62.6
43,138.37451986964854,58.3
44,163.40504312344743,72.3
45,165.13133732815768,68.4
46,135.39612367787572,58.9
47,155.49199962327577,61.8
48,151.67314985744744,61.6
49,153.49019996627314,66.7
50,142.15508998013192,63.1

View File

@@ -0,0 +1,77 @@
```{r}
setwd('/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/TP1')
```
```{r}
library(rmarkdown)
health <- read.table("./health.txt", header = TRUE, sep = " ", dec = ".")
paged_table(health)
```
```{r}
Health <- health[2:5]
library(dplyr)
library(corrplot)
correlation_matrix <- cor(Health)
corrplot(correlation_matrix, order = 'hclust', addrect = 3)
```
```{r}
model <- lm(y ~ ., data = Health)
coefficients(model)
summary(model)
```
```{r}
library(ggfortify)
library(car)
autoplot(model, 1:3)
```
The points are not well distributed around 0 -> [P1] is not verified
The points are not well distributed around 1 -> [P2] is not verified
The QQ plot is aligned with the line y = x, so the residuals are approximately Gaussian -> [P4] is verified
```{r}
set.seed(0)
durbinWatsonTest(model)
```
The p-value is 0.58 > 0.05 -> We do not reject H0 so the residuals are not auto-correlated -> [P3] is verified
```{r}
library(GGally)
ggpairs(Health, progress = F)
```
We observe that the variable age is correlated with the variable y; the relationship between the two looks quadratic.
```{r}
Health2 <- Health
Health2$age_sq <- Health2$age^2
Health2 <- Health2[1:24,]
model2 <- lm(y ~ ., data = Health2)
summary(model2)
coefficients(model2)
```
```{r}
library(ggfortify)
library(car)
autoplot(model2, 1:4)
```
The points are well distributed around 0 -> [P1] is verified
The points are well distributed around 1 -> [P2] is verified
The QQ plot is aligned with the line y = x, so the residuals are approximately Gaussian -> [P4] is verified
```{r}
set.seed(0)
durbinWatsonTest(model2)
```
The p-value is 0.294 > 0.05 -> We do not reject H0 so the residuals are not auto-correlated -> [P3] is verified

View File

@@ -0,0 +1,26 @@
"Id" "y" "age" "tri" "chol"
"1" 0.344165525138049 61.3710178481415 255.998603752814 174.725536967162
"2" 0.468648478326459 70.0093143060803 303.203193042427 191.830689532217
"3" 0.0114155523307995 20.0903518591076 469.998112767935 126.269967628177
"4" 0.472875443347101 71.2420938722789 146.105958395638 211.434927976225
"5" 0 20.0899018836208 221.786231906153 240.876589370891
"6" 0.143997241336945 43.2352053862996 266.090678572655 137.168468555901
"7" 0.0414102111478797 25.8429238037206 372.907817093655 227.054411582649
"8" 0.113811893132696 41.4695196016692 131.169532104395 131.447072112933
"9" 0.109292366261165 35.7887735008262 430.398655338213 213.031274722889
"10" 0.137468875629733 41.5359775861725 213.798411563039 248.683155018371
"11" 0.0532629899999104 32.2978379554115 121.814481257461 216.917818398215
"12" 0.353328774446386 61.3454213505611 407.593103558756 125.090295937844
"13" 0.32693962726726 58.9427325245924 310.99092704244 234.297853410244
"14" 0.160669457776164 45.8133233059198 146.624271371402 179.472973288503
"15" 0.513130027249664 72.5317458156496 388.182279183529 179.566718712449
"16" 0.617920854597292 80.2682227501646 227.86060355138 171.406586996745
"17" 0.205049999385919 48.5804966441356 270.10274540633 216.100836060941
"18" 0.0125951861990601 23.7895587598905 173.06959128473 246.627336950041
"19" 0.321090735035653 58.1796469353139 451.47649507504 144.35365190031
"20" 0.375862066828795 65.0117327272892 158.536369032227 127.305114357732
"21" 0.0722375355431054 35.3545167273842 126.951391426846 193.100387256127
"22" 0.0772129996198421 31.5452552819625 379.832963193767 235.720843761228
"23" 0.0605716685358678 30.9160601114854 311.444346229546 191.08392035123
"24" 0.157789955300155 44.7116475505754 164.532195450738 224.480235648807
"25" 1 99 457.110102558509 239.30889045354

View File

@@ -0,0 +1,117 @@
```{r}
setwd("/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/TP2-bis")
library(tidyverse)
library(GGally)
library(broom)
library(scales)
library(car)
library(qqplotr)
options(scipen = 999, digits = 5)
```
```{r}
data <- read.csv("data02.csv", sep = ",", header = TRUE, dec = ".")
data |>
mutate(type = factor(type, levels = c("maths", "english", "final"), labels = c("maths", "english", "final"))) |>
ggplot(aes(x = note)) +
facet_wrap(vars(type), scales = "free_x") +
geom_histogram(binwidth = 4, color = "black", fill = "grey80") +
labs(title = "Histogram of notes", x = "Note") +
theme_bw(14)
```
```{r}
data_wide <- pivot_wider(data, names_from = type, values_from = note)
data_wide |>
select(-id) |>
ggpairs() + theme_bw(14)
```
```{r}
model <- lm(data_wide, formula = final ~ maths + english)
summary(model)
```
```{r}
tidy(model, conf.int = TRUE, conf.level = 0.95)
glance(model)
(R2 <- summary(model)$r.squared)
(R2 / 2) * 57 / (1 - R2)
vcov(model)
```
# Hypothesis testing
```{r}
# Test the single contrast beta_maths - beta_english = 0 by hand
C <- c(0, 1, -1)
beta <- cbind(coef(model))
(C_beta <- C %*% beta)
X <- model.matrix(model)
(inv_XtX <- solve(t(X) %*% X))
q <- 1 # one constraint tested
numerator <- t(C_beta) %*%
  solve(t(C) %*% inv_XtX %*% C) %*%
  C_beta
denominator <- sigma(model)^2
F_stat <- (numerator / q) / denominator # renamed from F to avoid masking base R's F
F_stat
dof <- nrow(data_wide) - 3 # n - p - 1 = 60 - 2 - 1
qf(0.95, q, dof)
pf(F_stat, q, dof, lower.tail = FALSE)
linearHypothesis(model, "maths - english = 0") # same test via car
```
# Prediction
```{r}
data_predict <- predict(model, newdata = expand.grid(maths = seq(70, 90, 2), english = c(75, 85)), interval = "confidence") |>
as_tibble() |>
bind_cols(expand.grid(maths = seq(70, 90, 2), english = c(75, 85)))
data_predict |>
mutate(english = as.factor(english)) |>
ggplot(aes(x = maths, y = fit, color = english, fill = english, label = round(fit, 1))) +
geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, show.legend = FALSE) +
geom_point(size = 2) +
geom_line(aes(y = fit)) +
geom_text(vjust = -1, show.legend = FALSE) +
labs(title = "Prediction of final note", x = "Maths note", y = "Final note", color = "English", fill = "English") +
theme_bw(14)
```
```{r}
diag_data <- augment(model)
ggplot(diag_data, aes(x = .fitted, y = .resid)) +
geom_point() +
geom_hline(yintercept = 0) +
labs(title = "Residuals vs Fitted", x = "Fitted values", y = "Residuals") +
theme_bw(14)
```
```{r}
ggplot(diag_data, aes(sample = .resid)) +
stat_qq_band(alpha = 0.2, fill = "blue") +
stat_qq_line(color = "red") +
stat_qq_point(size = 1) +
labs(y = "Sample quantile", x = "Theoritical quantile") +
theme_minimal(base_size = 14)
ggplot(diag_data, aes(x = .resid)) +
geom_histogram(fill = "dodgerblue", color = "black", bins = 7) +
labs(y = "Count", x = "Résiduals") +
scale_y_continuous(expand = expansion(c(0, 0.05))) +
scale_x_continuous(breaks = pretty_breaks(n = 10)) +
theme_minimal(base_size = 14)
mutate(diag_data, obs = row_number()) |>
ggplot(aes(x = obs, y = .cooksd)) +
geom_segment(aes(x = obs, y = 0, xend = obs, yend = .cooksd)) +
geom_point(color = "blue", size = 1) +
scale_x_continuous(breaks = seq(0, 60, 10)) +
labs(y = "Cook's distance", x = "Index") +
theme_minimal(base_size = 14)
influenceIndexPlot(model, vars = "cook", id = list(n = 5), main = NULL)
```
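To turn the Cook's distance plot into an explicit list of suspect observations, one can filter on the common 4/n cutoff (a sketch reusing diag_data from above):
```{r}
# Observations whose Cook's distance exceeds 4/n, a common screening threshold
diag_data |>
  mutate(obs = row_number()) |>
  filter(.cooksd > 4 / n())
```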

View File

@@ -0,0 +1,181 @@
id,type,note
1,final,59.29
1,maths,75.34
1,english,81.17
2,final,60.75
2,maths,78.96
2,english,81.99
3,final,65.88
3,maths,78.06
3,english,80.16
4,final,64.5
4,maths,77.83
4,english,76.74
5,final,59.99
5,maths,83.26
5,english,80.93
6,final,63.6
6,maths,83.26
6,english,77.98
7,final,69.01
7,maths,81.9
7,english,80.24
8,final,67.21
8,maths,86.83
8,english,80.75
9,final,69.93
9,maths,84.28
9,english,79.05
10,final,60.1
10,maths,77.89
10,english,72.77
11,final,59.19
11,maths,77.91
11,english,74.23
12,final,64.19
12,maths,75.72
12,english,74.48
13,final,68.89
13,maths,88.06
13,english,82.16
14,final,75.48
14,maths,84
14,english,83.16
15,final,70.95
15,maths,85.58
15,english,83.15
16,final,64.24
16,maths,83.5
16,english,81.94
17,final,63.32
17,maths,84.36
17,english,78.67
18,final,65.31
18,maths,79.1
18,english,82.73
19,final,60.88
19,maths,78.9
19,english,80.34
20,final,62.38
20,maths,76.31
20,english,81.53
21,final,60.83
21,maths,80.42
21,english,79.98
22,final,67.62
22,maths,81.1
22,english,83.96
23,final,70.03
23,maths,84.25
23,english,79.44
24,final,67.72
24,maths,79.44
24,english,82.07
25,final,72.76
25,maths,86.04
25,english,82.4
26,final,74.76
26,maths,87.03
26,english,87
27,final,74.33
27,maths,84.65
27,english,87.8
28,final,64.87
28,maths,77.41
28,english,78.92
29,final,66.04
29,maths,81.65
29,english,81.78
30,final,64.73
30,maths,75.84
30,english,79.03
31,final,60.86
31,maths,70.64
31,english,80.88
32,final,62.11
32,maths,71.72
32,english,76.82
33,final,65.91
33,maths,76.24
33,english,76.7
34,final,68.93
34,maths,86.21
34,english,83.18
35,final,71.63
35,maths,83.73
35,english,82.51
36,final,68.16
36,maths,84.52
36,english,80.64
37,final,67.39
37,maths,80.82
37,english,80.51
38,final,62.62
38,maths,76.91
38,english,78.17
39,final,61.97
39,maths,79.58
39,english,78.33
40,final,58.88
40,maths,72.49
40,english,76.25
41,final,62.72
41,maths,78.31
41,english,74.58
42,final,63.45
42,maths,73.51
42,english,76.17
43,final,62.28
43,maths,77.46
43,english,80.04
44,final,67.27
44,maths,77.79
44,english,78.98
45,final,64.04
45,maths,83.26
45,english,75.18
46,final,65.24
46,maths,80.48
46,english,79.08
47,final,61.27
47,maths,81.74
47,english,81.21
48,final,67.51
48,maths,77.36
48,english,78.09
49,final,56.02
49,maths,74.08
49,english,73.34
50,final,59.69
50,maths,76.24
50,english,73.21
51,final,62.68
51,maths,70.61
51,english,72.48
52,final,54.8
52,maths,72.41
52,english,78.33
53,final,55.89
53,maths,77.09
53,english,75.79
54,final,60.36
54,maths,78.81
54,english,77.19
55,final,60.47
55,maths,76.84
55,english,76.34
56,final,63.31
56,maths,80.6
56,english,76.26
57,final,60.2
57,maths,80.41
57,english,76.67
58,final,64.03
58,maths,86.18
58,english,76.56
59,final,67.61
59,maths,83.56
59,english,82.45
60,final,66.87
60,maths,84.23
60,english,79.77

View File

@@ -0,0 +1,37 @@
Origine;Couleur;Alcool;pH;AcTot;Tartrique;Malique;Citrique;Acetique;Lactique
Bordeaux;Blanc;12;2,84;89;21,1;21;4,3;16,9;9,3
Bordeaux;Blanc;11,5;3,1;97;26,4;34,2;3,9;9,9;16
Bordeaux;Blanc;14,6;2,96;99;20,7;21,8;8,1;19,7;11,2
Bordeaux;Blanc;10,5;3,1;72;29,7;4,2;3,6;11,9;14,4
Bordeaux;Blanc;14;3,29;76;22,3;9,3;4,7;20,1;21,6
Bordeaux;Blanc;13,2;2,94;83;24,6;9,4;4,1;19,7;16,8
Bordeaux;Blanc;11,2;2,91;95;39,4;14,5;4,2;19,4;10,5
Bordeaux;Blanc;15,4;3,43;86;14,1;28,8;8,5;15;12,6
Bordeaux;Blanc;13,4;3,35;76;18,9;23;6,4;14,4;10,5
Bourgogne;Blanc;11,4;2,9;103;50;18;2,8;14,4;8,5
Bourgogne;Blanc;10,5;2,95;118;31,6;38,8;4,2;13,1;15,7
Bourgogne;Blanc;10,3;2,9;106;37,5;27,6;3,2;14,7;11,5
Bourgogne;Blanc;13;2,89;97;42,1;19;3,6;11,2;15,7
Bourgogne;Blanc;13,4;2,83;107;50,5;22,1;5;11,9;9,4
Bourgogne;Blanc;12,7;3,11;101;33,2;17,5;5,3;9,4;36,7
Bourgogne;Blanc;10,8;2,83;121;42,5;44,8;4,6;9,4;8,7
Bourgogne;Blanc;12;3,25;76;44,2;1;3;17,5;17,6
Bourgogne;Blanc;13,8;3,15;87;29,2;24,4;6,4;10,6;8,6
Bordeaux;Rouge;11,7;3,49;74;22,5;3,6;1,4;21,8;24,3
Bordeaux;Rouge;10,8;3,5;76;24,8;1,8;1,3;21,2;27
Bordeaux;Rouge;10,1;3,66;72;25,7;5;0,8;20;29,7
Bordeaux;Rouge;11,7;3,22;77;30,1;1,4;1,1;17,5;20,2
Bordeaux;Rouge;10,2;3,45;74;23,5;2;1;20,6;23,5
Bordeaux;Rouge;11,5;3,28;86;26,4;3,4;1,1;21,2;30,4
Bordeaux;Rouge;11,2;3,35;73;30,8;1,8;3,6;10,3;22,2
Bordeaux;Rouge;12,2;3,43;77;26,1;1,8;1,4;14,7;17,9
Bordeaux;Rouge;11,2;3,41;75;27,2;3;1,5;15,9;23,1
Bourgogne;Rouge;12,7;3,36;78;36,2;2;0,8;17,2;20,1
Bourgogne;Rouge;13,1;3,48;87;30;2;1;22,1;37,4
Bourgogne;Rouge;13,4;3,5;75;27,8;0;1,3;14,7;33,1
Bourgogne;Rouge;13,2;3,59;76;30;3,6;2,2;13,4;40,9
Bourgogne;Rouge;13,4;3,34;79;34,6;0;1,6;16,2;23,2
Bourgogne;Rouge;13,8;3,46;76;27,7;1,2;3;16,2;30,4
Bourgogne;Rouge;13,6;3,17;97;39,7;18,4;5,4;12,2;14,3
Bourgogne;Rouge;14;3,42;76;32,6;0,4;2,1;14,7;20,4
Bourgogne;Rouge;13,1;3,35;81;36;1,6;1,9;15;23,1

View File

@@ -0,0 +1,85 @@
```{r}
setwd("/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/TP2")
```
# Question 1 : Import dataset and check variables
```{r}
library(dplyr)
library(rmarkdown) # for paged_table()
cepages <- read.csv("Cepages B TP2.csv", header = TRUE, sep = ";", dec = ",")
cepages$Couleur <- as.factor(cepages$Couleur)
cepages$Origine <- as.factor(cepages$Origine)
cepages <- cepages |> mutate(across(where(is.character), as.numeric))
cepages <- cepages |> mutate(across(where(is.integer), as.numeric))
paged_table(cepages)
```
# Question 2 : Table of counts
```{r}
table(cepages$Origine, cepages$Couleur)
```
# Question 3
## Display the table of average pH according to Couleur
```{r}
tapply(cepages$pH, list(cepages$Couleur), mean)
```
## Display the table of average pH according to Couleur and Origine
```{r}
tapply(cepages$pH, list(cepages$Couleur, cepages$Origine), mean)
```
# Question 4 : Regression lines of pH over AcTot for different Couleur
```{r}
library(ggplot2)
ggplot(cepages, aes(x = AcTot, y = pH, color = Couleur)) +
geom_point(col = "red", size = 0.5) +
geom_smooth(method = "lm", se = F)
ggplot(cepages, aes(y = pH, x = AcTot, colour = Couleur, fill = Couleur)) +
geom_boxplot(alpha = 0.5, outlier.alpha = 0)
```
# Question 5 : Regression lines of pH over AcTot for different Origine
```{r}
ggplot(cepages, aes(x = AcTot, y = pH, color = Origine)) +
geom_smooth(method = "lm", se = F) +
geom_point(col = "red", size = 0.5)
ggplot(cepages, aes(y = pH, x = AcTot, colour = Origine, fill = Origine)) +
geom_boxplot(alpha = 0.5, outlier.alpha = 0)
```
# Question 6 : ANOVA
```{r}
model_full <- lm(pH ~ Couleur, data = cepages)
summary(model_full)
```
```{r}
library(ggfortify) # provides autoplot() methods for lm objects
autoplot(model_full, 1:4)
```
[P1] is verified as the 'Residuals vs Fitted' plot shows that the points are well distributed around 0
[P2] is verified as the 'Scale-Location' plot shows that the points are well distributed around 1
[P4] is verified as the 'QQPlot' is aligned with the 'y=x' line
```{r}
library(car) # for durbinWatsonTest()
set.seed(12) # the test bootstraps its p-value; fix the seed for reproducibility
durbinWatsonTest(model_full)
```
[P3] is verified as the p-value is 0.7 > 0.05: we do not reject H0, so the residuals are not auto-correlated
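For [P2], a formal complement to the Scale-Location plot is Levene's test of equal variances across the Couleur groups (a sketch using car, attached above):
```{r}
# H0: equal residual variance in the two Couleur groups;
# a p-value above 0.05 supports [P2]
leveneTest(pH ~ Couleur, data = cepages)
```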
# Bonus : Type II Test
```{r}
library(car)
Anova(model_full)
```

View File

@@ -0,0 +1,206 @@
```{r}
setwd('/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/TP3-bis')
library(GGally)
library(broom)
library(scales)
library(car)
library(glue)
library(janitor)
library(marginaleffects)
library(tidyverse)
library(qqplotr)
options(scipen = 999, digits = 5)
```
```{r}
data03 <- read.csv('data03.csv', header = TRUE, sep = ',', dec = '.')
data03$sexe <- as.factor(data03$sexe)
data03$travail <- as.factor(data03$travail)
head(data03, 15)
```
```{r}
tab_sexe <- data03 |>
count(sexe) |>
mutate(freq1 = n / sum(n)) |>
mutate(freq2 = glue("{n} ({label_percent()(freq1)})")) |>
rename(Sexe = 1, Effectif = n, Proportion = freq1, "n (%)" = freq2)
tab_sexe
```
```{r}
tab_travail <- data03 |>
count(travail) |>
mutate(freq1 = n / sum(n)) |>
mutate(freq2 = glue("{n} ({label_percent()(freq1)})")) |>
rename(Travail = 1, Effectif = n, Proportion = freq1, "n (%)" = freq2)
tab_travail
```
```{r}
cross_sexe_travail <- data03 |>
count(sexe, Travail = travail) |>
group_by(sexe) |>
mutate(freq1 = n / sum(n)) |>
mutate(freq2 = glue("{n} ({label_percent(0.1)(freq1)})"), n = NULL, freq1 = NULL) |>
pivot_wider(names_from = sexe, values_from = freq2)
cross_sexe_travail
```
```{r}
data03 <- mutate(data03, logy = log(y))
data03 |>
pivot_longer(c(y, logy), names_to = "variable", values_to = "value") |>
mutate(variable = factor(
variable,
levels = c("y", "logy"),
labels = c("Salaire", "Log-Salaire"))
) |>
ggplot(aes(x = value)) +
facet_wrap(vars(variable), scales = "free") +
geom_histogram(
fill = "dodgerblue",
color = "black",
bins = 15) +
scale_y_continuous(expand = expansion(c(0, 0.05))) +
scale_x_continuous(breaks = pretty_breaks()) +
labs(x = "Valeur", y = "Effectif") +
theme_bw(base_size = 14) +
theme(strip.text = element_text(size = 11, face = "bold"))
```
```{r}
data03 |>
pivot_longer(c(y, logy), names_to = "variable", values_to = "value") |>
mutate(variable = factor(
variable,
levels = c("y", "logy"), labels = c("Salaire", "Log-Salaire")
)) |>
ggplot(aes(x = sexe, y = value)) +
facet_wrap(vars(variable), scales = "free") +
geom_boxplot(
width = 0.5, fill = "cyan", linewidth = 0.5, outlier.size = 2, outlier.alpha = 0.3
) +
scale_y_continuous(breaks = pretty_breaks()) +
labs(x = NULL, y = "Valeur") +
theme_bw(base_size = 14) +
theme(
strip.text = element_text(size = 11, face = "bold"),
panel.border = element_rect(linewidth = 0.5)
)
```
```{r}
data03 |>
pivot_longer(c(y, logy), names_to = "variable", values_to = "value") |>
mutate(variable = factor(
variable,
levels = c("y", "logy"), labels = c("Salaire", "Log-Salaire")
)) |>
ggplot(aes(x = travail, y = value)) +
facet_wrap(vars(variable), scales = "free") +
stat_boxplot(width = 0.25, geom = "errorbar", linewidth = 0.5) +
geom_boxplot(
width = 0.5, fatten = 0.25, fill = "cyan", linewidth = 0.5,
outlier.size = 2, outlier.alpha = 0.3
) +
scale_y_continuous(breaks = pretty_breaks()) +
labs(x = NULL, y = "Valeur") +
theme_bw(base_size = 14) +
theme(
strip.text = element_text(size = 11, face = "bold"),
panel.border = element_rect(linewidth = 0.5)
)
```
```{r}
data03 |>
group_by(sexe) |>
summarise(n = n(), "Mean (Salaire)" = mean(y), "SD (Salaire)" = sd(y))
```
```{r}
data03 |>
group_by(travail) |>
summarise(n = n(), "Mean (Salaire)" = mean(y), "SD (Salaire)" = sd(y))
```
```{r}
data03 |>
group_by(sexe, travail) |>
summarise(n = n(), "Mean (Salaire)" = mean(y), "SD (Salaire)" = sd(y))
```
```{r}
data03 |>
group_by(sexe) |>
summarise(n = n(), "Mean (Salaire)" = mean(logy), "SD (Salaire)" = sd(logy))
```
```{r}
data03 |>
group_by(travail) |>
summarise(n = n(), "Mean (Salaire)" = mean(logy), "SD (Salaire)" = sd(logy))
```
```{r}
data03 |>
group_by(sexe, travail) |>
summarise(n = n(), "Mean (Salaire)" = mean(logy), "SD (Salaire)" = sd(logy))
```
```{r}
data03 <- data03 |>
mutate(sexef = ifelse(sexe == "Femme", 1, 0), sexeh = 1 - sexef) |>
mutate(job1 = (travail == "Type 1") * 1) |>
mutate(job2 = (travail == "Type 2") * 1) |>
mutate(job3 = (travail == "Type 3") * 1)
head(data03, 15)
```
```{r}
mod1 <- lm(y ~ sexeh, data = data03)
mod2 <- lm(y ~ job2 + job3, data = data03)
mod3 <- lm(y ~ sexeh + job2 + job3, data = data03)
tidy(mod1)
tidy(mod2)
tidy(mod3)
```
Interpretation of the model 1 coefficients
$y_i = \beta_0 + \beta_1 \, sexeh_i + \epsilon_i$
• (Intercept): $\hat{\beta}_0 = 7.88$: mean salary among women.
• sexeh: $\hat{\beta}_1 = 2.12$: men earn on average 2.12 dollars per hour more than women (significant).
Interpretation of the model 2 coefficients
$y_i = \beta_0 + \beta_1 \, job2_i + \beta_2 \, job3_i + \epsilon_i$
• (Intercept): $\hat{\beta}_0 = 12.21$: mean salary for Type 1 jobs.
• job2: $\hat{\beta}_1 = 5.09$: the average difference in salary between Type 2 and Type 1 jobs is 5.09 dollars per hour (significant).
• job3: $\hat{\beta}_2 = 3.78$: the average difference in salary between Type 3 and Type 1 jobs is 3.78 dollars per hour (significant).
Interpretation of the model 3 coefficients
$y_i = \beta_0 + \beta_1 \, sexeh_i + \beta_2 \, job2_i + \beta_3 \, job3_i + \epsilon_i$
• (Intercept): $\hat{\beta}_0 = 11.13$: mean salary among women with a Type 1 job.
• sexeh: $\hat{\beta}_1 = 1.97$: men earn on average 1.97 dollars per hour more than women (significant), whatever the job type.
• job2: $\hat{\beta}_2 = 4.71$: the average difference in salary between Type 2 and Type 1 jobs is 4.71 dollars per hour (significant), whatever the sex.
• job3: $\hat{\beta}_3 = 4.3$: the average difference in salary between Type 3 and Type 1 jobs is 4.3 dollars per hour (significant), whatever the sex.
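A quick way to anchor these readings: with a single binary regressor, the OLS coefficients are differences of group means. A minimal sketch checking model 1 against the raw means, assuming data03 and mod1 from above:
```{r}
# The intercept should equal the mean salary among women, and
# intercept + sexeh coefficient the mean among men
mean_femme <- mean(data03$y[data03$sexe == "Femme"])
mean_homme <- mean(data03$y[data03$sexe == "Homme"])
c(coef(mod1)[1], mean_femme)
c(sum(coef(mod1)), mean_homme)
```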
```{r}
anova(mod1, mod3)
```
The p-value of the test is below 0.05, so we reject $H_0$ and conclude that at least one of the job coefficients ($\beta_2$, $\beta_3$) is significantly non-zero.
• We have just tested whether the job type is associated with the hourly salary; for these data, it is.
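The same F statistic can be recomputed by hand from the two residual sums of squares (a sketch, with q = 2 constraints and n - 4 residual degrees of freedom in mod3):
```{r}
# Nested-model F test by hand, matching anova(mod1, mod3)
rss1 <- sum(residuals(mod1)^2)
rss3 <- sum(residuals(mod3)^2)
n <- nrow(data03)
F_stat <- ((rss1 - rss3) / 2) / (rss3 / (n - 4))
c(F = F_stat, p.value = pf(F_stat, 2, n - 4, lower.tail = FALSE))
```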
```{r}
linearHypothesis(mod3, c("job2", "job3"))
linearHypothesis(mod3, "job2 = job3")
```
The test is not significant ($p = 0.44$). We do not reject $H_0$ and conclude that there is no difference in hourly salary between Type 2 and Type 3 jobs, whatever the sex.
```{r}
mod3bis <- lm(y ~ sexe + travail, data = data03)
summary(mod3bis)
```
The models mod3bis and mod3 are identical.
• There is no need to create all these indicator variables when the categorical variables are stored as factors!
• The reference is always the first level of the factor.
• The reference can be changed with `relevel()` or with `C()`. For example, to take level 2 as the reference for travail:
```{r}
lm(y ~ sexe + relevel(travail, ref = 2), data = data03) |>
tidy()
```
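Alternatively, the factor can be releveled once and stored so that every later model uses Type 2 as the reference (a sketch; the column name travail2 is introduced here only for illustration):
```{r}
# Keep a releveled copy of the factor: "Type 2" becomes the reference level
data03$travail2 <- relevel(data03$travail, ref = "Type 2")
lm(y ~ sexe + travail2, data = data03) |>
  tidy()
```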

View File

@@ -0,0 +1,535 @@
id,y,education,experience,sexe,travail
1,3.75,12,6,Femme,Type 3
2,12,16,14,Femme,Type 1
3,15.73,16,4,Homme,Type 1
4,5,12,0,Femme,Type 2
5,9,12,20,Homme,Type 2
6,6.88,17,15,Femme,Type 1
7,9.1,13,16,Homme,Type 2
8,3.35,16,3,Homme,Type 2
9,10,16,10,Femme,Type 1
10,4.59,12,36,Femme,Type 2
11,7.61,12,20,Homme,Type 3
12,18.5,14,13,Femme,Type 3
13,20.4,17,3,Homme,Type 1
14,8.49,12,24,Femme,Type 2
15,12.5,15,6,Femme,Type 2
16,5.75,9,34,Femme,Type 1
17,4,12,12,Homme,Type 2
18,12.47,13,10,Homme,Type 3
19,5.55,11,45,Femme,Type 2
20,6.67,16,10,Femme,Type 1
21,4.75,12,2,Femme,Type 2
22,7,12,7,Homme,Type 3
23,12.2,14,10,Homme,Type 3
24,15,12,42,Homme,Type 1
25,9.56,12,23,Homme,Type 3
26,7.65,12,20,Homme,Type 2
27,5.8,13,5,Homme,Type 2
28,5.8,16,14,Homme,Type 1
29,3.6,12,6,Femme,Type 2
30,7.5,12,3,Homme,Type 3
31,8.75,12,30,Homme,Type 1
32,10.43,14,15,Femme,Type 2
33,3.75,12,41,Femme,Type 2
34,8.43,12,12,Homme,Type 2
35,4.55,12,4,Femme,Type 1
36,6.25,9,30,Homme,Type 3
37,4.5,12,16,Homme,Type 3
38,4.3,13,8,Homme,Type 3
39,8,12,15,Femme,Type 2
40,7,7,42,Homme,Type 3
41,4.5,12,3,Femme,Type 1
42,4.75,12,10,Femme,Type 2
43,4.5,12,7,Femme,Type 1
44,3.4,8,49,Femme,Type 2
45,9.83,12,23,Homme,Type 3
46,6.25,16,7,Femme,Type 1
47,23.25,17,25,Femme,Type 1
48,9,16,27,Femme,Type 2
49,19.38,16,11,Femme,Type 1
50,14,16,16,Femme,Type 1
51,6.5,12,8,Homme,Type 3
52,6,12,19,Femme,Type 2
53,6.25,16,7,Femme,Type 1
54,6.5,8,27,Homme,Type 3
55,9.5,12,25,Femme,Type 2
56,4.25,12,29,Femme,Type 2
57,14.53,16,18,Homme,Type 1
58,8.85,14,19,Femme,Type 2
59,9.17,12,44,Femme,Type 2
60,11,8,42,Homme,Type 3
61,3.75,16,13,Femme,Type 3
62,3.35,14,0,Homme,Type 2
63,3.35,12,8,Femme,Type 3
64,6.75,12,12,Homme,Type 3
65,10,16,7,Femme,Type 2
66,8.93,8,47,Homme,Type 2
67,11.11,12,28,Homme,Type 2
68,6.5,11,39,Homme,Type 2
69,14,5,44,Homme,Type 3
70,6.25,13,30,Femme,Type 2
71,4.7,8,19,Homme,Type 3
72,22.83,18,37,Femme,Type 1
73,13.45,12,16,Homme,Type 3
74,5,12,39,Femme,Type 2
75,10,10,12,Homme,Type 3
76,8.9,10,27,Homme,Type 3
77,7,11,12,Femme,Type 2
78,15,12,39,Homme,Type 1
79,3.5,12,35,Homme,Type 2
80,12.57,12,12,Homme,Type 3
81,8,12,14,Femme,Type 2
82,7,18,33,Homme,Type 1
83,26,14,21,Homme,Type 3
84,11.43,12,3,Homme,Type 3
85,5.56,14,5,Homme,Type 2
86,12,18,18,Femme,Type 1
87,5,12,4,Homme,Type 2
88,9.15,12,7,Homme,Type 3
89,8.5,12,16,Femme,Type 1
90,6.25,16,4,Homme,Type 2
91,3.35,7,43,Homme,Type 3
92,15.38,16,33,Homme,Type 1
93,6.1,12,33,Femme,Type 1
94,4.13,12,4,Femme,Type 2
95,14.21,11,15,Homme,Type 3
96,7.53,14,1,Homme,Type 2
97,3.5,10,33,Femme,Type 2
98,6.58,12,24,Femme,Type 1
99,4.85,10,13,Homme,Type 3
100,7.81,12,1,Femme,Type 1
101,6,13,31,Homme,Type 2
102,4.5,11,14,Homme,Type 3
103,24.98,18,29,Homme,Type 1
104,13.33,18,10,Homme,Type 2
105,20.5,16,14,Homme,Type 1
106,10,12,20,Homme,Type 2
107,11.22,18,19,Homme,Type 1
108,10.58,16,9,Homme,Type 1
109,12.65,17,13,Femme,Type 1
110,12,12,14,Homme,Type 3
111,5.1,8,21,Femme,Type 3
112,9.86,15,13,Homme,Type 1
113,10,12,11,Homme,Type 3
114,8,18,33,Homme,Type 1
115,22.2,18,40,Femme,Type 1
116,4.25,13,0,Femme,Type 2
117,5,16,3,Homme,Type 3
118,4,12,15,Femme,Type 2
119,12.5,13,16,Femme,Type 2
120,6.5,11,17,Homme,Type 3
121,3.6,12,43,Femme,Type 2
122,9.45,12,13,Homme,Type 3
123,10.81,12,40,Femme,Type 2
124,6.25,18,14,Homme,Type 1
125,4.35,8,37,Femme,Type 2
126,16,12,12,Homme,Type 1
127,7.69,12,19,Homme,Type 2
128,9,8,33,Homme,Type 3
129,4.25,12,2,Femme,Type 2
130,7.8,17,7,Homme,Type 1
131,3.8,12,4,Femme,Type 2
132,3.75,15,4,Homme,Type 2
133,8,12,8,Femme,Type 2
134,9.6,9,33,Homme,Type 2
135,3.5,12,34,Homme,Type 2
136,13.75,14,21,Homme,Type 2
137,5.13,12,5,Femme,Type 2
138,4.8,9,16,Homme,Type 3
139,15.56,16,10,Homme,Type 1
140,20.55,12,33,Homme,Type 3
141,7.5,12,9,Femme,Type 2
142,7.78,16,6,Femme,Type 1
143,4.5,12,7,Femme,Type 3
144,3.5,9,48,Homme,Type 2
145,10.2,17,26,Femme,Type 1
146,12,17,24,Femme,Type 1
147,4,12,6,Homme,Type 3
148,6,17,3,Femme,Type 1
149,13.51,18,14,Homme,Type 1
150,7.5,16,10,Homme,Type 1
151,12,12,20,Homme,Type 3
152,4,11,25,Femme,Type 3
153,12.5,12,43,Homme,Type 2
154,15,12,33,Homme,Type 3
155,10,18,13,Femme,Type 1
156,3.65,11,16,Homme,Type 3
157,7,11,16,Femme,Type 2
158,8.9,14,13,Homme,Type 3
159,25,14,4,Homme,Type 2
160,7,12,32,Femme,Type 1
161,14.29,14,32,Femme,Type 2
162,4.22,8,39,Femme,Type 3
163,7.3,12,37,Homme,Type 3
164,13.65,16,11,Homme,Type 1
165,7.5,12,16,Femme,Type 2
166,10,10,25,Femme,Type 2
167,7.78,12,23,Homme,Type 3
168,4.5,12,15,Femme,Type 3
169,10.58,14,25,Homme,Type 3
170,10,16,0,Femme,Type 1
171,17.25,15,31,Homme,Type 1
172,12,17,13,Homme,Type 2
173,14,18,31,Femme,Type 1
174,3.35,13,2,Femme,Type 2
175,6,12,9,Homme,Type 3
176,6.5,12,5,Homme,Type 3
177,10,16,4,Femme,Type 1
178,13,12,25,Homme,Type 2
179,4.28,12,38,Femme,Type 2
180,9.65,12,38,Femme,Type 2
181,5.79,12,42,Femme,Type 1
182,6.93,12,26,Homme,Type 2
183,12.67,16,15,Homme,Type 1
184,4.55,13,33,Femme,Type 2
185,6.8,9,30,Femme,Type 3
186,13,12,8,Homme,Type 3
187,3.35,12,0,Homme,Type 3
188,5.4,12,18,Femme,Type 2
189,10.67,12,36,Homme,Type 3
190,11.25,15,5,Homme,Type 1
191,4.1,12,15,Femme,Type 2
192,10.28,15,10,Femme,Type 1
193,5.71,18,3,Homme,Type 1
194,5,13,6,Femme,Type 2
195,3.75,2,16,Homme,Type 2
196,3.75,11,11,Homme,Type 3
197,10.62,12,45,Femme,Type 2
198,15.79,18,7,Homme,Type 1
199,6.36,12,9,Homme,Type 3
200,10.53,14,12,Femme,Type 2
201,8.63,12,18,Femme,Type 2
202,16,14,20,Homme,Type 3
203,7.45,12,25,Femme,Type 1
204,4.84,15,1,Homme,Type 1
205,6,4,54,Homme,Type 2
206,3.8,12,16,Femme,Type 2
207,15,12,40,Homme,Type 3
208,7,12,5,Homme,Type 2
209,12,18,23,Homme,Type 1
210,5.25,10,15,Homme,Type 3
211,9.36,12,8,Homme,Type 3
212,7.14,14,17,Homme,Type 2
213,19.98,9,29,Homme,Type 3
214,12,12,5,Homme,Type 2
215,10.75,12,24,Homme,Type 3
216,5.5,12,3,Homme,Type 3
217,9.25,12,19,Homme,Type 3
218,8.99,12,15,Femme,Type 2
219,3.5,11,17,Femme,Type 2
220,22.5,16,22,Homme,Type 1
221,5,12,5,Homme,Type 3
222,12.5,14,19,Femme,Type 2
223,7,17,2,Homme,Type 1
224,16.42,14,19,Homme,Type 1
225,19.98,14,44,Homme,Type 2
226,6.5,10,30,Homme,Type 3
227,9.5,12,39,Femme,Type 2
228,11.25,12,41,Homme,Type 3
229,8.06,13,17,Homme,Type 1
230,6,15,26,Femme,Type 2
231,24.98,17,18,Homme,Type 1
232,5,12,28,Femme,Type 3
233,3.98,14,6,Femme,Type 2
234,4.5,13,0,Femme,Type 2
235,7.75,12,24,Femme,Type 2
236,2.85,12,1,Homme,Type 3
237,9,16,7,Homme,Type 1
238,9.63,14,22,Homme,Type 2
239,12,14,10,Femme,Type 1
240,7.5,12,27,Femme,Type 2
241,13.45,16,16,Homme,Type 1
242,7.5,12,23,Femme,Type 2
243,4.5,7,14,Homme,Type 2
244,3.75,12,17,Femme,Type 2
245,5.75,12,3,Homme,Type 1
246,12.5,17,13,Femme,Type 1
247,6.5,12,11,Femme,Type 1
248,6.94,12,7,Femme,Type 2
249,11.36,18,5,Homme,Type 1
250,3.35,12,0,Femme,Type 2
251,9.22,14,13,Femme,Type 1
252,9.75,15,9,Femme,Type 3
253,24.98,17,5,Femme,Type 1
254,5.62,13,2,Femme,Type 2
255,9.57,12,5,Femme,Type 2
256,5.5,12,11,Femme,Type 3
257,7,12,10,Femme,Type 2
258,17.86,13,36,Homme,Type 1
259,5.5,16,2,Femme,Type 2
260,8.8,14,41,Homme,Type 1
261,5.35,12,13,Femme,Type 2
262,4.95,9,42,Femme,Type 3
263,13.89,16,17,Femme,Type 1
264,6,12,1,Homme,Type 3
265,4.17,14,24,Femme,Type 2
266,8.5,12,13,Homme,Type 3
267,9.37,16,26,Femme,Type 1
268,3.35,16,14,Femme,Type 2
269,6.25,13,4,Femme,Type 2
270,13,11,18,Homme,Type 2
271,11.25,16,3,Femme,Type 1
272,6.1,10,44,Femme,Type 3
273,10.5,13,14,Femme,Type 1
274,5.83,13,3,Homme,Type 2
275,5.25,12,8,Femme,Type 2
276,10,16,14,Homme,Type 1
277,8.5,12,25,Femme,Type 2
278,10.62,12,34,Homme,Type 1
279,5.2,12,24,Femme,Type 2
280,21.25,13,32,Homme,Type 1
281,15,18,12,Homme,Type 1
282,5,12,14,Femme,Type 2
283,4.55,8,45,Femme,Type 2
284,3.5,8,8,Homme,Type 3
285,8,16,17,Femme,Type 1
286,11.84,12,26,Homme,Type 1
287,13.45,16,8,Homme,Type 1
288,9,16,6,Femme,Type 2
289,8.75,11,36,Femme,Type 2
290,15,14,22,Homme,Type 2
291,13.95,18,14,Femme,Type 1
292,11.25,12,30,Femme,Type 1
293,10.5,12,29,Femme,Type 2
294,8.5,9,38,Homme,Type 3
295,8.63,14,4,Femme,Type 1
296,22.5,15,12,Homme,Type 1
297,13.12,12,33,Femme,Type 2
298,12.5,12,14,Femme,Type 2
299,9.75,11,37,Homme,Type 3
300,4.5,14,14,Homme,Type 2
301,12.5,12,11,Homme,Type 3
302,18,16,38,Homme,Type 1
303,3.55,13,1,Femme,Type 2
304,12.22,12,19,Homme,Type 3
305,5.5,16,29,Homme,Type 1
306,8.89,16,22,Femme,Type 1
307,3.35,16,9,Homme,Type 1
308,13.98,14,14,Homme,Type 3
309,7.5,14,2,Homme,Type 2
310,5.62,14,32,Femme,Type 2
311,11.32,12,13,Femme,Type 2
312,3.84,11,25,Femme,Type 2
313,6.88,14,10,Femme,Type 2
314,18.16,17,14,Homme,Type 1
315,6,13,7,Homme,Type 3
316,22.5,16,17,Homme,Type 1
317,6.67,12,1,Homme,Type 3
318,3.64,12,42,Femme,Type 1
319,4.5,12,3,Femme,Type 3
320,5,14,0,Homme,Type 2
321,8.75,13,18,Homme,Type 3
322,11.11,13,13,Homme,Type 2
323,5,12,4,Homme,Type 1
324,4,12,4,Homme,Type 3
325,10.62,16,6,Femme,Type 1
326,10,14,24,Femme,Type 2
327,4.35,11,20,Femme,Type 2
328,5.3,12,14,Homme,Type 1
329,24.98,17,31,Homme,Type 1
330,3.4,8,29,Femme,Type 2
331,12.5,12,9,Homme,Type 3
332,9.5,17,14,Femme,Type 1
333,7.38,14,15,Femme,Type 1
334,7.5,12,10,Femme,Type 2
335,3.75,12,9,Homme,Type 3
336,24.98,16,18,Homme,Type 1
337,6.85,12,8,Homme,Type 2
338,3.51,10,37,Femme,Type 2
339,7.5,12,17,Homme,Type 1
340,6.88,8,22,Femme,Type 3
341,8,12,19,Femme,Type 3
342,5.75,12,21,Femme,Type 1
343,5.77,16,3,Homme,Type 2
344,5,16,4,Femme,Type 2
345,5.25,16,2,Femme,Type 2
346,26.29,17,32,Homme,Type 1
347,3.5,12,25,Femme,Type 2
348,15.03,12,24,Femme,Type 2
349,22.2,12,26,Homme,Type 3
350,4.85,9,16,Femme,Type 3
351,18.16,18,7,Homme,Type 1
352,3.95,12,9,Femme,Type 2
353,6.73,12,2,Homme,Type 3
354,10,16,7,Homme,Type 1
355,4,12,46,Femme,Type 3
356,14.67,14,16,Homme,Type 2
357,10,10,20,Homme,Type 1
358,17.5,16,13,Homme,Type 1
359,5,12,6,Homme,Type 3
360,6.25,12,6,Femme,Type 3
361,11.25,12,28,Femme,Type 1
362,5.71,12,16,Femme,Type 3
363,5,12,12,Homme,Type 3
364,9.5,11,29,Homme,Type 3
365,14,11,13,Homme,Type 3
366,11.71,16,42,Femme,Type 2
367,3.35,11,3,Homme,Type 3
368,18,18,15,Homme,Type 1
369,10,12,22,Homme,Type 3
370,5.5,11,24,Femme,Type 2
371,16.65,14,21,Homme,Type 1
372,6.75,10,13,Homme,Type 3
373,5.87,17,6,Femme,Type 2
374,6,12,8,Homme,Type 3
375,8.56,14,14,Femme,Type 1
376,5.65,14,2,Homme,Type 3
377,5,17,1,Femme,Type 3
378,10,13,34,Homme,Type 1
379,8.75,10,19,Homme,Type 2
380,8,7,44,Homme,Type 3
381,7.5,12,20,Femme,Type 2
382,5,14,17,Femme,Type 2
383,6.15,16,16,Femme,Type 1
384,9,10,27,Homme,Type 3
385,11.02,14,12,Homme,Type 2
386,9.42,12,32,Homme,Type 2
387,7,10,9,Homme,Type 3
388,5,12,14,Femme,Type 2
389,4.15,11,4,Homme,Type 2
390,7.5,12,17,Homme,Type 3
391,8.75,13,10,Femme,Type 2
392,5.5,12,10,Homme,Type 2
393,20,16,28,Femme,Type 1
394,5.5,12,33,Femme,Type 2
395,8.89,8,29,Femme,Type 2
396,8,12,9,Femme,Type 2
397,6.4,12,6,Femme,Type 2
398,6.28,12,38,Femme,Type 3
399,15.95,17,13,Femme,Type 1
400,10,14,22,Homme,Type 1
401,5.65,16,6,Femme,Type 2
402,2.01,13,0,Homme,Type 2
403,11,14,14,Homme,Type 3
404,6.5,12,28,Homme,Type 3
405,10,12,35,Homme,Type 3
406,19.47,12,9,Homme,Type 3
407,7,12,26,Femme,Type 2
408,11.67,12,43,Femme,Type 2
409,22.2,18,8,Homme,Type 1
410,15,16,12,Homme,Type 3
411,5.95,13,9,Homme,Type 2
412,12,12,11,Femme,Type 1
413,13.2,16,10,Homme,Type 1
414,20,18,19,Femme,Type 1
415,6,7,15,Femme,Type 3
416,4.25,12,20,Femme,Type 2
417,11.35,12,17,Homme,Type 3
418,22,14,15,Homme,Type 1
419,5.5,11,18,Homme,Type 3
420,4.5,11,2,Homme,Type 2
421,4.5,16,21,Homme,Type 2
422,5.4,16,10,Femme,Type 1
423,5.25,12,2,Homme,Type 3
424,4.62,6,33,Femme,Type 3
425,7.5,16,22,Femme,Type 1
426,11.5,12,19,Homme,Type 3
427,13,12,19,Homme,Type 3
428,10.25,14,26,Femme,Type 1
429,16.14,14,16,Homme,Type 1
430,9.33,12,15,Femme,Type 2
431,5.5,12,21,Femme,Type 2
432,8.5,17,3,Homme,Type 2
433,4.75,12,8,Homme,Type 3
434,5.75,6,45,Homme,Type 3
435,3.43,12,2,Homme,Type 2
436,4.45,10,27,Homme,Type 3
437,5,12,26,Femme,Type 2
438,9,12,10,Homme,Type 2
439,13.28,16,11,Homme,Type 3
440,7.88,16,17,Homme,Type 2
441,8,13,15,Femme,Type 2
442,4,12,7,Homme,Type 2
443,13.07,13,9,Homme,Type 3
444,5.2,18,13,Femme,Type 2
445,8,14,15,Femme,Type 2
446,8.75,12,9,Homme,Type 3
447,3.5,9,47,Homme,Type 2
448,5.25,12,45,Femme,Type 2
449,3.65,11,8,Femme,Type 2
450,6.25,12,1,Homme,Type 3
451,7.96,14,12,Femme,Type 2
452,7.5,11,3,Homme,Type 2
453,4.8,12,16,Femme,Type 3
454,16.26,12,14,Homme,Type 2
455,6.25,18,27,Homme,Type 1
456,15,18,11,Femme,Type 1
457,5.71,18,7,Homme,Type 1
458,13.1,10,38,Femme,Type 2
459,5.75,12,15,Femme,Type 2
460,10.5,12,4,Homme,Type 3
461,22.5,18,14,Homme,Type 1
462,16,12,35,Homme,Type 3
463,17.25,18,5,Homme,Type 1
464,9.37,17,7,Homme,Type 1
465,3.5,14,6,Femme,Type 2
466,3.35,12,7,Femme,Type 2
467,19.88,12,13,Homme,Type 1
468,10.78,11,28,Homme,Type 3
469,5.5,12,20,Homme,Type 2
470,4,12,8,Homme,Type 2
471,12.5,15,10,Homme,Type 1
472,5.15,13,1,Homme,Type 2
473,5.5,12,12,Homme,Type 3
474,4,13,0,Homme,Type 3
475,4.17,12,27,Femme,Type 2
476,4,16,20,Femme,Type 2
477,8.5,12,2,Homme,Type 2
478,12.05,16,6,Femme,Type 1
479,7,3,55,Homme,Type 3
480,4.85,12,14,Femme,Type 2
481,10.32,13,28,Femme,Type 2
482,1,12,24,Homme,Type 1
483,9.5,9,46,Femme,Type 2
484,7.5,12,38,Homme,Type 3
485,24.98,16,5,Femme,Type 1
486,6.4,12,45,Femme,Type 2
487,44.5,14,1,Femme,Type 1
488,11.79,16,6,Femme,Type 1
489,11,16,13,Homme,Type 3
490,3.5,11,33,Femme,Type 2
491,5.21,18,10,Homme,Type 2
492,10.61,15,33,Femme,Type 1
493,6.75,12,22,Homme,Type 2
494,8.89,12,18,Homme,Type 3
495,6,16,8,Homme,Type 1
496,5.85,18,12,Homme,Type 1
497,11.25,17,10,Femme,Type 1
498,3.35,12,10,Femme,Type 3
499,8.2,14,17,Homme,Type 3
500,3,12,28,Femme,Type 2
501,9.24,16,5,Homme,Type 1
502,9.6,12,14,Femme,Type 2
503,19.98,12,23,Homme,Type 3
504,6.85,14,34,Homme,Type 2
505,3.56,12,12,Femme,Type 2
506,15,16,26,Homme,Type 1
507,13.16,12,38,Femme,Type 1
508,3,6,43,Femme,Type 3
509,9,13,8,Homme,Type 3
510,8.5,12,8,Homme,Type 3
511,19,18,13,Homme,Type 1
512,15,16,10,Homme,Type 1
513,1.75,12,5,Femme,Type 2
514,12.16,12,32,Femme,Type 2
515,9,12,16,Homme,Type 2
516,5.5,12,20,Homme,Type 2
517,8.93,12,18,Femme,Type 3
518,4.35,12,3,Femme,Type 1
519,6.25,14,2,Homme,Type 1
520,11.5,16,16,Homme,Type 2
521,3.45,13,1,Femme,Type 2
522,10,14,22,Homme,Type 1
523,5,14,0,Homme,Type 1
524,7.67,15,11,Femme,Type 1
525,8.4,13,17,Homme,Type 3
526,11.25,13,14,Homme,Type 3
527,6.25,14,12,Femme,Type 1
528,13.26,12,16,Homme,Type 3
529,11.25,17,32,Femme,Type 2
530,6.75,10,41,Homme,Type 3
531,7.7,13,8,Femme,Type 2
532,5.26,8,38,Femme,Type 2
533,13.71,12,43,Homme,Type 2
534,8,12,43,Femme,Type 2

View File

@@ -0,0 +1,133 @@
```{r}
setwd("/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/TP3")
```
# Question 1 : Import dataset and check variables
```{r}
library(dplyr)
library(rmarkdown) # for paged_table()
ozone <- read.table("ozone.txt", header = TRUE, sep = " ", dec = ".")
ozone$vent <- as.factor(ozone$vent)
ozone$temps <- as.factor(ozone$temps)
ozone <- ozone |> mutate(across(where(is.character), as.numeric))
ozone <- ozone |> mutate(across(where(is.integer), as.numeric))
paged_table(ozone)
```
# Question 2 : maxO3 ~ T12
```{r}
model_T12 <- lm(maxO3 ~ T12, data = ozone)
summary(model_T12)
```
```{r}
library(ggplot2)
ggplot(ozone, aes(x = T12, y = maxO3)) +
geom_smooth(method = "lm", se = T) +
geom_point(col = "red", size = 0.5) +
labs(title = "maxO3 ~ T12") +
theme_minimal()
```
```{r}
library(ggfortify)
library(car)
autoplot(model_T12, 1:4)
durbinWatsonTest(model_T12)
```
The p-value of the Durbin-Watson test is 0, so [P3] is not valid. The residuals are correlated.
The model is not valid.
# Question 3 : maxO3 ~ T12 + temps
```{r}
library(ggplot2)
ggplot(ozone, aes(y = maxO3, x = temps, colour = temps, fill = temps)) +
geom_boxplot(alpha = 0.5, outlier.alpha = 0) +
geom_jitter(width = 0.25, size = 1) +
stat_summary(fun = mean, colour = "black", geom = "point", shape = 18, size = 3)
```
```{r}
model_temps <- lm(maxO3 ~ -1 + temps, data = ozone) # -1 removes the intercept: one coefficient per weather type
summary(model_temps)
```
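Since the -1 drops the intercept, each coefficient is directly a group mean, which tapply reproduces (a quick sanity-check sketch):
```{r}
# Per-group means of maxO3 by weather type, matching the coefficients above
tapply(ozone$maxO3, ozone$temps, mean)
```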
```{r}
autoplot(model_temps, 1:4)
durbinWatsonTest(model_temps)
```
```{r}
model_vent <- lm(maxO3 ~ -1 + vent, data = ozone)
summary(model_vent)
```
```{r}
autoplot(model_vent, 1:4)
durbinWatsonTest(model_vent)
```
The p-value of the Durbin-Watson test is 0, so [P3] is not valid. The residuals are correlated.
The model is not valid.
```{r}
model_temps_vent <- lm(maxO3 ~ temps * vent, data = ozone)
summary(model_temps_vent)
```
```{r}
autoplot(model_temps_vent, 1:4)
durbinWatsonTest(model_temps_vent)
```
# Question 4 : Multiple linear regression
```{r}
model <- lm(maxO3 ~ ., data = ozone)
summary(model)
```
```{r}
autoplot(model, 1:4)
durbinWatsonTest(model)
```
[P1] is verified as the 'Residuals vs Fitted' plot shows that the points are well distributed around 0
[P2] is verified as the 'Scale-Location' plot shows that the points are well distributed around 1
[P4] is verified as the 'QQPlot' is aligned with the 'y=x' line
[P3] is verified as the p-value is 0.7 > 0.05: we do not reject H0, so the residuals are not auto-correlated
The model is valid.
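As a non-graphical complement for [P2], the score test for non-constant variance from car can be applied to the same model (a sketch):
```{r}
# H0: constant error variance; a p-value above 0.05 supports [P2]
ncvTest(model)
```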
```{r}
library(GGally)
ggpairs(ozone, progress = FALSE)
```
```{r}
library(MASS)
model_backward <- stepAIC(model, direction = "backward", trace = FALSE)
summary(model_backward)
```
```{r}
AIC(model_backward)
autoplot(model_backward, 1:4)
durbinWatsonTest(model_backward)
```
# Question 5 : Prediction
```{r}
new_obs <- list(
T12 = 18,
Ne9 = 3,
Vx9 = 0.7,
maxO3v = 85
)
predict(model_backward, new_obs, interval = "confidence")
```
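For a single future day, a prediction interval, which also accounts for the residual noise and is therefore wider than the confidence interval, may be the more relevant output (a sketch):
```{r}
# Prediction interval for the same new observation
predict(model_backward, new_obs, interval = "prediction")
```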

View File

@@ -0,0 +1,113 @@
maxO3 T9 T12 T15 Ne9 Ne12 Ne15 Vx9 Vx12 Vx15 maxO3v vent temps
87 15.6 18.5 18.4 4 4 8 0.6946 -1.7101 -0.6946 84 Nord Sec
82 17 18.4 17.7 5 5 7 -4.3301 -4 -3 87 Nord Sec
92 15.3 17.6 19.5 2 5 4 2.9544 1.8794 0.5209 82 Est Sec
114 16.2 19.7 22.5 1 1 0 0.9848 0.3473 -0.1736 92 Nord Sec
94 17.4 20.5 20.4 8 8 7 -0.5 -2.9544 -4.3301 114 Ouest Sec
80 17.7 19.8 18.3 6 6 7 -5.6382 -5 -6 94 Ouest Pluie
79 16.8 15.6 14.9 7 8 8 -4.3301 -1.8794 -3.7588 80 Ouest Sec
79 14.9 17.5 18.9 5 5 4 0 -1.0419 -1.3892 99 Nord Sec
101 16.1 19.6 21.4 2 4 4 -0.766 -1.0261 -2.2981 79 Nord Sec
106 18.3 21.9 22.9 5 6 8 1.2856 -2.2981 -3.9392 101 Ouest Sec
101 17.3 19.3 20.2 7 7 3 -1.5 -1.5 -0.8682 106 Nord Sec
90 17.6 20.3 17.4 7 6 8 0.6946 -1.0419 -0.6946 101 Sud Sec
72 18.3 19.6 19.4 7 5 6 -0.8682 -2.7362 -6.8944 90 Sud Sec
70 17.1 18.2 18 7 7 7 -4.3301 -7.8785 -5.1962 72 Ouest Pluie
83 15.4 17.4 16.6 8 7 7 -4.3301 -2.0521 -3 70 Nord Sec
88 15.9 19.1 21.5 6 5 4 0.5209 -2.9544 -1.0261 83 Ouest Sec
145 21 24.6 26.9 0 1 1 -0.342 -1.5321 -0.684 121 Ouest Sec
81 16.2 22.4 23.4 8 3 1 0 0.3473 -2.5712 145 Nord Sec
121 19.7 24.2 26.9 2 1 0 1.5321 1.7321 2 81 Est Sec
146 23.6 28.6 28.4 1 1 2 1 -1.9284 -1.2155 121 Sud Sec
121 20.4 25.2 27.7 1 0 0 0 -0.5209 1.0261 146 Nord Sec
146 27 32.7 33.7 0 0 0 2.9544 6.5778 4.3301 121 Est Sec
108 24 23.5 25.1 4 4 0 -2.5712 -3.8567 -4.6985 146 Sud Sec
83 19.7 22.9 24.8 7 6 6 -2.5981 -3.9392 -4.924 108 Ouest Sec
57 20.1 22.4 22.8 7 6 7 -5.6382 -3.8302 -4.5963 83 Ouest Pluie
81 19.6 25.1 27.2 3 4 4 -1.9284 -2.5712 -4.3301 57 Sud Sec
67 19.5 23.4 23.7 5 5 4 -1.5321 -3.0642 -0.8682 81 Ouest Sec
70 18.8 22.7 24.9 5 2 1 0.684 0 1.3681 67 Nord Sec
106 24.1 28.4 30.1 0 0 1 2.8191 3.9392 3.4641 70 Est Sec
139 26.6 30.1 31.9 0 1 4 1.8794 2 1.3681 106 Sud Sec
79 19.5 18.8 17.8 8 8 8 0.6946 -0.866 -1.0261 139 Ouest Sec
93 16.8 18.2 22 8 8 6 0 0 1.2856 79 Sud Pluie
97 20.8 23.7 25 2 3 4 0 1.7101 -2.7362 93 Nord Sec
113 17.5 18.2 22.7 8 8 5 -3.7588 -3.9392 -4.6985 97 Ouest Pluie
72 18.1 21.2 23.9 7 6 4 -2.5981 -3.9392 -3.7588 113 Ouest Pluie
88 19.2 22 25.2 4 7 4 -1.9696 -3.0642 -4 72 Ouest Sec
77 19.4 20.7 22.5 7 8 7 -6.5778 -5.6382 -9 88 Ouest Sec
71 19.2 21 22.4 6 4 6 -7.8785 -6.8937 -6.8937 77 Ouest Sec
56 13.8 17.3 18.5 8 8 6 1.5 -3.8302 -2.0521 71 Ouest Pluie
45 14.3 14.5 15.2 8 8 8 0.684 4 2.9544 56 Est Pluie
67 15.6 18.6 20.3 5 7 5 -3.2139 -3.7588 -4 45 Ouest Pluie
67 16.9 19.1 19.5 5 5 6 -2.2981 -3.7588 0 67 Ouest Pluie
84 17.4 20.4 21.4 3 4 6 0 0.3473 -2.5981 67 Sud Sec
63 15.1 20.5 20.6 8 6 6 2 -5.3623 -6.1284 84 Ouest Pluie
69 15.1 15.6 15.9 8 8 8 -4.5963 -3.8302 -4.3301 63 Ouest Pluie
92 16.7 19.1 19.3 7 6 4 -2.0521 -4.4995 -2.7362 69 Nord Sec
88 16.9 20.3 20.7 6 6 5 -2.8191 -3.4641 -3 92 Ouest Pluie
66 18 21.6 23.3 8 6 5 -3 -3.5 -3.2139 88 Sud Sec
72 18.6 21.9 23.6 4 7 6 0.866 -1.9696 -1.0261 66 Ouest Sec
81 18.8 22.5 23.9 6 3 2 0.5209 -1 -2 72 Nord Sec
83 19 22.5 24.1 2 4 6 0 -1.0261 0.5209 81 Nord Sec
149 19.9 26.9 29 3 4 3 1 -0.9397 -0.6428 83 Ouest Sec
153 23.8 27.7 29.4 1 1 4 0.9397 1.5 0 149 Nord Sec
159 24 28.3 26.5 2 2 7 -0.342 1.2856 -2 153 Nord Sec
149 23.3 27.6 28.8 4 6 3 0.866 -1.5321 -0.1736 159 Ouest Sec
160 25 29.6 31.1 0 3 5 1.5321 -0.684 2.8191 149 Sud Sec
156 24.9 30.5 32.2 0 1 4 -0.5 -1.8794 -1.2856 160 Ouest Sec
84 20.5 26.3 27.8 1 0 2 -1.3681 -0.6946 0 156 Nord Sec
126 25.3 29.5 31.2 1 4 4 3 3.7588 5 84 Est Sec
116 21.3 23.8 22.1 7 7 8 0 -2.3941 -1.3892 126 Sud Pluie
77 20 18.2 23.6 5 7 6 -3.4641 -2.5981 -3.7588 116 Ouest Pluie
63 18.7 20.6 20.3 6 7 7 -5 -4.924 -5.6382 77 Ouest Pluie
54 18.6 18.7 17.8 8 8 8 -4.6985 -2.5 -0.8682 63 Sud Pluie
65 19.2 23 22.7 8 7 7 -3.8302 -4.924 -5.6382 54 Ouest Sec
72 19.9 21.6 20.4 7 7 8 -3 -4.5963 -5.1962 65 Ouest Pluie
60 18.7 21.4 21.7 7 7 7 -5.6382 -6.0622 -6.8937 72 Ouest Pluie
70 18.4 17.1 20.5 3 6 3 -5.9088 -3.2139 -4.4995 60 Nord Pluie
77 17.1 20 20.8 4 5 4 -1.9284 -1.0261 0.5209 70 Nord Sec
98 17.8 22.8 24.3 1 1 0 0 -1.5321 -1 77 Ouest Pluie
111 20.9 25.2 26.7 1 5 2 -1.0261 -3 -2.2981 98 Ouest Sec
75 18.8 20.5 26 8 7 1 -0.866 0 0 111 Nord Sec
116 23.5 29.8 31.7 1 3 5 1.8794 1.3681 0.6946 75 Sud Sec
109 20.8 23.7 26.6 8 5 4 -1.0261 -1.7101 -3.2139 116 Sud Sec
67 18.8 21.1 18.9 7 7 8 -5.3623 -5.3623 -2.5 86 Ouest Pluie
76 17.8 21.3 24 7 5 5 -3.0642 -2.2981 -3.9392 67 Ouest Pluie
113 20.6 24.8 27 1 1 2 1.3681 0.8682 -2.2981 76 Sud Sec
117 21.6 26.9 28.6 6 6 4 1.5321 1.9284 1.9284 113 Sud Pluie
131 22.7 28.4 30.1 5 3 3 0.1736 -1.9696 -1.9284 117 Ouest Sec
166 19.8 27.2 30.8 4 0 1 0.6428 -0.866 0.684 131 Ouest Sec
159 25 33.5 35.5 1 1 1 1 0.6946 -1.7101 166 Sud Sec
100 20.1 22.9 27.6 8 8 6 1.2856 -1.7321 -0.684 159 Ouest Sec
114 21 26.3 26.4 7 4 5 3.0642 2.8191 1.3681 100 Est Sec
112 21 24.4 26.8 1 6 3 4 4 3.7588 114 Est Sec
101 16.9 17.8 20.6 7 7 7 -2 -0.5209 1.8794 112 Nord Pluie
76 17.5 18.6 18.7 7 7 7 -3.4641 -4 -1.7321 101 Ouest Sec
59 16.5 20.3 20.3 5 7 6 -4.3301 -5.3623 -4.5 76 Ouest Pluie
78 17.7 20.2 21.5 5 5 3 0 0.5209 0 59 Nord Pluie
76 17.3 22.7 24.6 4 5 6 -2.9544 -2.9544 -2 78 Ouest Pluie
55 15.3 16.8 19.2 8 7 5 -1.8794 -1.8794 -2.3941 76 Ouest Pluie
71 15.9 19.2 19.5 7 5 3 -6.1284 0 -1.3892 55 Nord Pluie
66 16.2 18.9 19.3 2 5 6 -1.3681 -0.8682 1.7101 71 Nord Pluie
59 18.3 18.3 19 7 7 7 -3.9392 -1.9284 -1.7101 66 Nord Pluie
68 16.9 20.8 22.5 6 5 7 -1.5 -3.4641 -3.0642 59 Ouest Pluie
63 17.3 19.8 19.4 7 8 8 -4.5963 -6.0622 -4.3301 68 Ouest Sec
78 14.2 22.2 22 5 5 6 -0.866 -5 -5 62 Ouest Sec
74 15.8 18.7 19.1 8 7 7 -4.5963 -6.8937 -7.5175 78 Ouest Pluie
71 15.2 17.9 18.6 6 5 1 -1.0419 -1.3681 -1.0419 74 Nord Pluie
69 17.1 17.7 17.5 6 7 8 -5.1962 -2.7362 -1.0419 71 Nord Pluie
71 15.4 17.7 16.6 4 5 5 -3.8302 0 1.3892 69 Nord Sec
60 13.7 14 15.8 4 5 4 0 3.2139 0 71 Nord Pluie
42 12.7 14.3 14.9 8 7 7 -2.5 -3.2139 -2.5 60 Nord Pluie
65 14.8 16.3 15.9 7 7 7 -4.3301 -6.0622 -5.1962 42 Ouest Pluie
71 15.5 18 17.4 7 7 6 -3.9392 -3.0642 0 65 Ouest Sec
96 11.3 19.4 20.2 3 3 3 -0.1736 3.7588 3.8302 71 Est Pluie
98 15.2 19.7 20.3 2 2 2 4 5 4.3301 96 Est Sec
92 14.7 17.6 18.2 1 4 6 5.1962 5.1423 3.5 98 Nord Sec
76 13.3 17.7 17.7 7 7 6 -0.9397 -0.766 -0.5 92 Ouest Pluie
84 13.3 17.7 17.8 3 5 6 0 -1 -1.2856 76 Sud Sec
77 16.2 20.8 22.1 6 5 5 -0.6946 -2 -1.3681 71 Sud Pluie
99 16.9 23 22.6 6 4 7 1.5 0.8682 0.8682 77 Sud Sec
83 16.9 19.8 22.1 6 5 3 -4 -3.7588 -4 99 Ouest Pluie
70 15.7 18.6 20.7 7 7 7 0 -1.0419 -4 83 Sud Sec

View File

@@ -0,0 +1,94 @@
```{r}
setwd("/Users/arthurdanjou/Workspace/studies/M1/General Linear Models/TP4")
set.seed(0911)
library(ggplot2)
library(gridExtra)
library(cowplot)
library(plotly) # interactive plots
library(ggfortify) # diagnostic plot
library(forestmodel) # plot odd ratio
library(arm) # binnedplot diagnostic plot in GLM
library(knitr)
library(dplyr)
library(tidyverse)
library(tidymodels)
library(broom) # function augment() adds model columns to the original data
library(effects) # plot effect of covariate/factor
library(questionr) # odd ratio
library(lmtest) # LRtest
library(survey) # Wald test
library(vcdExtra) # deviance test
library(rsample) # for data splitting
library(glmnet)
library(nnet) # multinom, glm
library(caret)
library(ROCR)
# library(PRROC) alternative package for ROC and PR curves
library(ISLR) # dataset for statistical learning
ggplot2::theme_set(ggplot2::theme_light()) # Set the graphical theme
```
```{r}
car <- read.table("car_income.txt", header = TRUE, sep = ";")
car |> rmarkdown::paged_table()
summary(car)
```
```{r}
model_purchase <- glm(purchase ~ ., data = car, family = "binomial")
summary(model_purchase)
```
```{r}
p1 <- car |>
ggplot(aes(y = purchase, x = income + age)) +
geom_point(alpha = .15) +
geom_smooth(method = "lm") +
ggtitle("Linear regression model fit") +
xlab("Income") +
ylab("Probability of Purchase")
p2 <- car |>
ggplot(aes(y = purchase, x = income + age)) +
geom_point(alpha = .15) +
geom_smooth(method = "glm", method.args = list(family = "binomial")) +
ggtitle("Logistic regression model fit") +
xlab("Income") +
ylab("Probability of Purchase")
ggplotly(p1)
ggplotly(p2)
```
```{r}
car <- car |>
  mutate(
    old = ifelse(age > 3, 1, 0), # columns can be referenced directly inside mutate()
    rich = ifelse(income > 40, 1, 0)
  )
model_old <- glm(purchase ~ age + income + rich + old, data = car, family = "binomial")
summary(model_old)
```
# Diabetes in Pima Indians
```{r}
library(MASS)
pima.tr <- Pima.tr
pima.te <- Pima.te
model_train_pima <- glm(type ~ npreg + glu + bp + skin + bmi + ped + age, data = pima.tr, family = "binomial")
summary(model_train_pima)
```
```{r}
pima.te$pred <- predict(model_train_pima, newdata = pima.te, type = "response")
pima.te$pred <- ifelse(pima.te$pred > 0.55, "Yes", "No")
pima.te$pred <- as.factor(pima.te$pred)
pima.te$type <- as.factor(pima.te$type)
# Confusion matrix: the predictions go in `data`, the true labels in `reference`
confusionMatrix(data = pima.te$pred, reference = pima.te$type, positive = "Yes")
```
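Since ROCR is already loaded above, we can also look at the ROC curve and the AUC on the test set. This is a minimal sketch, assuming we recompute the raw predicted probabilities (the `pred` column now holds class labels):
```{r}
probs <- predict(model_train_pima, newdata = pima.te, type = "response")
pred_rocr <- prediction(probs, pima.te$type)
perf <- performance(pred_rocr, "tpr", "fpr")
plot(perf, col = "blue", main = "ROC curve - Pima test set")
abline(0, 1, lty = 2) # chance line
# AUC
performance(pred_rocr, "auc")@y.values[[1]]
```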

View File

@@ -0,0 +1,34 @@
purchase;income;age
0;32;3
0;45;2
1;60;2
0;53;1
0;25;4
1;68;1
1;82;2
1;38;5
0;67;2
1;92;2
1;72;3
0;21;5
0;26;3
1;40;4
0;33;3
0;45;1
1;61;2
0;16;3
1;18;4
0;22;6
0;27;3
1;35;3
1;40;3
0;10;4
0;24;3
1;15;4
0;23;3
0;19;5
1;22;2
0;61;2
0;21;3
1;32;5
0;17;1

View File

@@ -0,0 +1,8 @@
# Exercise 1 : Uniform
```{r}
n <- 10e4
U <- runif(n)
X <- 5 * (U <= 0.4) + 6 * (0.4 < U & U <= 0.6) + 7 * (0.6 < U & U <= 0.9) + 8 * (0.9 < U)
barplot(table(X)/n)
```

View File

@@ -0,0 +1,40 @@
# Exercise 10 : Integral Calculation.
## Estimation of δ using the classical Monte Carlo
### Uniform + Normal distributions
```{r}
n <- 10e4
X <- runif(n, 0, 5)
Y <- rnorm(n, 0, sqrt(2))
Y[Y <= 2] <- 0 # indicator y > 2: sin(0^4) = 0 cancels the rejected draws
h <- function(x, y) {
5 *
sqrt(pi) *
sqrt(x + y) *
sin(y^4) *
exp(-3 * x / 2)
}
I1 <- mean(h(X, Y))
I1
```
### Exponential + Normal distributions
```{r}
n <- 10e4
# Sampling distributions matching the section title: X ~ Exp(3/2) and Y ~ N(0, 2)
# (the previous draft drew X from a uniform and then zeroed every X <= 5)
X <- rexp(n, 3 / 2)
Y <- rnorm(n, 0, sqrt(2))
Y[Y <= 2] <- 0 # indicator y > 2: sin(0^4) = 0 cancels the rejected draws
h <- function(x, y) {
  4 / 3 * sqrt(pi) * sqrt(x + y) * sin(y^4) * (x <= 5)
}
I2 <- mean(h(X, Y))
I2
```
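Both chunks estimate the same integral, so a natural comparison is the Monte Carlo standard error of each estimator. A quick self-contained sketch, recomputing both samples:
```{r}
n <- 10e4
# Method 1: X ~ U(0, 5), Y ~ N(0, 2)
X1 <- runif(n, 0, 5)
Y1 <- rnorm(n, 0, sqrt(2))
Y1[Y1 <= 2] <- 0
h1 <- function(x, y) 5 * sqrt(pi) * sqrt(x + y) * sin(y^4) * exp(-3 * x / 2)
# Method 2: X ~ Exp(3/2), Y ~ N(0, 2)
X2 <- rexp(n, 3 / 2)
Y2 <- rnorm(n, 0, sqrt(2))
Y2[Y2 <= 2] <- 0
h2 <- function(x, y) 4 / 3 * sqrt(pi) * sqrt(x + y) * sin(y^4) * (x <= 5)
cat(sprintf("Method 1: estimate = %f, se = %f \n", mean(h1(X1, Y1)), sd(h1(X1, Y1)) / sqrt(n)))
cat(sprintf("Method 2: estimate = %f, se = %f \n", mean(h2(X2, Y2)), sd(h2(X2, Y2)) / sqrt(n)))
```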

View File

@@ -0,0 +1,35 @@
# Exercise 11 : Importance Sampling, Cauchy Distribution
```{r}
set.seed(110)
n <- 10000
a <- 50
k <- 5
f <- function(x) {
1 / (pi * (1 + x^2))
}
g <- function(x, a, k) {
k * a^k / x^(k + 1) * (x >= a)
}
h <- function(x, a) {
x >= a
}
G_inv <- function(x, a, k) {
a / (1 - x)^(1 / k)
}
# Classical Monte Carlo method
x1 <- rcauchy(n, 0, 1)
p1 <- h(x1, a)
# Importance sampling
x2 <- G_inv(runif(n), a, k)
p2 <- f(x2) / g(x2, a, k) * h(x2, a)
# Results (cat the results and the var of the estimators)
cat(sprintf("Classical Monte Carlo: mean = %f, variance = %f \n", mean(p1), var(p1)))
cat(sprintf("Importance sampling: mean = %f, variance = %f", mean(p2), var(p2)))
```

View File

@@ -0,0 +1,48 @@
# Exercise 12 : Rejection vs Importance Sampling
```{r}
set.seed(123)
n <- 10000
f <- function(x, y) {
  # joint density of X ~ Exp(3/2) and Y ~ N(0, 2)
  3 / 2 * exp(-3 / 2 * x) * (x > 0) *
    1 / (2 * sqrt(pi)) * exp(-y^2 / 4)
}
g <- function(x, y, lambda) {
(3 / 2 * exp(-3 / 2 * x)) *
(x >= 0) *
(lambda * exp(-lambda * (y - 2))) *
(y >= 2)
}
h <- function(x, y) {
sqrt(x + y) *
sin(y^4) *
(x <= 5) *
(x > 0) *
(y >= 2) *
(4 * sqrt(pi) / 3)
}
X <- rexp(n, 3 / 2)
Y <- rexp(n, 2) + 2 # shifted exponential proposal with rate lambda = 2
mean(h(X, Y) * f(X, Y) / g(X, Y, 2)) # the weights must use the same lambda as the proposal
```
### Monte Carlo method
```{r}
set.seed(123)
n <- 10e4
X <- rexp(n, 3 / 2)
Y <- rnorm(n, 0, sqrt(2)) # under f, Y is N(0, 2), not a shifted exponential
Y[Y < 2] <- 0 # handle the indicator y >= 2 before sqrt() to avoid NaNs
mean(h(X, Y)) # h already contains the indicators (x <= 5) * (x > 0) * (y >= 2)
```

View File

@@ -0,0 +1,52 @@
# Exercise 13 : Gaussian integral and Variance reduction
```{r}
set.seed(123)
n <- 10000
# Normal distribution
h1 <- function(x) {
sqrt(2 * pi) *
exp(-x^2 / 2) *
(x <= 2) *
(x >= 0)
}
# Uniform (0, 2) distribution
h2 <- function(x) {
2 * exp(-x^2)
}
X1 <- rnorm(n)
X2 <- runif(n, 0, 2)
cat(sprintf("Integral of h1(x) using normal distribution is %f and variance is %f \n", mean(h1(X1)), var(h1(X1))))
cat(sprintf("Integral of h2(x) using normal distribution is %f and variance is %f \n", mean(h2(X2)), var(h2(X2))))
X3 <- 2 - X2
cat(sprintf("Integral of h2(x) using normal distribution is %f and variance is %f \n",
(mean(h2(X3)) + mean(h2(X2))) / 2,
(var(h2(X3)) +
var(h2(X2)) +
2 * cov(h2(X2), h2(X3))) / 4))
X4 <- -X1
cat(sprintf("Integral of h2(x) using normal distribution is %f and variance is %f",
(mean(h1(X1)) + mean(h1(X4))) / 2,
(var(h1(X1)) +
var(h1(X4)) +
2 * cov(h1(X1), h1(X4))) / 4))
```
## K-th moment of a uniform random variable on [0, 2]
```{r}
k <- 1:10
moment <- round(2^k / (k + 1), 2)
cat(sprintf("The k-th moment for k ∈ N* of a uniform random variable on [0, 2] is %s", paste(moment, collapse = ", ")))
```
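As a quick sanity check, a Monte Carlo estimate of the same moments (a short sketch):
```{r}
n <- 10000
U <- runif(n, 0, 2)
round(sapply(1:10, function(k) mean(U^k)), 2)
```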

View File

@@ -0,0 +1,46 @@
# Exercise 2 : Exponential distribution and related distributions
### Question 1
```{r}
n <- 10000
u <- runif(n)
x <- -1 / 2 * log(1 - u)
hist(x, breaks = 50, freq = FALSE)
curve(dexp(x, rate = 2), add = TRUE, col = "red")
qqplot(x, rexp(n, rate = 2))
```
### Question 2
```{r}
lambda <- 1.5
n <- 10000
d <- 10
S <- numeric(n)
for (i in 1:n) {
u <- runif(d)
x <- -1 / lambda * log(1 - u)
S[i] <- sum(x)
}
hist(S, freq = FALSE, breaks = 50)
curve(dgamma(x, shape = d, rate = lambda), add = TRUE, col = "red")
```
### Question 3
```{r}
n <- 10000
S <- numeric(n)
for (j in 1:n) {
  i <- 0 # number of arrivals before time 1 (starting at 0 avoids an off-by-one)
  x <- -1 / 4 * log(1 - runif(1))
  while (x <= 1) {
    i <- i + 1
    x <- x - 1 / 4 * log(1 - runif(1))
  }
  S[j] <- i
}
hist(S, freq = FALSE, breaks = seq(-0.5, max(S) + 0.5, by = 1))
# dpois is a discrete mass function: draw it on the integers rather than with curve()
points(0:max(S), dpois(0:max(S), lambda = 4), col = "red", pch = 16)
```

View File

@@ -0,0 +1,17 @@
# Exercise 3 : Box Muller Algo
```{r}
BM <- function(n) {
U1 <- runif(n)
U2 <- runif(n)
X1 <- sqrt(-2 * log(U1)) * cos(2 * pi * U2)
X2 <- sqrt(-2 * log(U1)) * sin(2 * pi * U2)
return(c(X1, X2))
}
n <- 10e4
X <- BM(n)
hist(X, breaks = 50, freq = FALSE)
curve(dnorm(x), add = TRUE, col = "red")
```

View File

@@ -0,0 +1,12 @@
# Exercise 5 : Simulation of Brownian Motion
```{r}
n <- 1:1110
brownian <- function(i) {
  (i >= 1 & i <= 100) * (i / 100) +
    (i >= 101 & i <= 110) * (1 + (i - 100) / 10) +
    (i >= 111 & i <= 1110) * (2 + (i - 110) / 1000)
}
t <- brownian(n)
# A Brownian path has independent Gaussian increments: W_{t_k} is the cumulative
# sum of sqrt(t_k - t_{k-1}) * N(0, 1) steps. (Independent draws sqrt(t) * rnorm(...)
# would give the right marginals but not a path.)
Wt <- cumsum(sqrt(diff(c(0, t))) * rnorm(length(t)))
plot(t, Wt, type = "l")
```

View File

@@ -0,0 +1,25 @@
# Exercise 6 : Rejection - A First Example
```{r}
f <- function(x) {
2 / pi * sqrt(1 - x^2) * (x >= -1 & x <= 1)
}
n <- 10000
M <- 4 / pi
g <- function(x) {
1 / 2 * (x >= -1 & x <= 1)
}
x <- numeric(0)
while (length(x) < n) {
U <- runif(1)
X <- runif(1, -1, 1)
x <- append(x, X[U <= (f(X) / (M * g(X)))])
}
t <- seq(-1, 1, 0.01)
hist(x, freq = FALSE, breaks = 50)
lines(t, f(t), col = "red", lwd = 2)
```

View File

@@ -0,0 +1,32 @@
# Exercise 7 : Rejection
```{r}
n <- 5000
f <- function(x, y) {
return(
1 / pi * (x^2 + y^2 <= 1)
)
}
M <- 4 / pi
g <- function(x, y) {
return(
1 / 4 * (x >= -1 & x <= 1 & y >= -1 & y <= 1)
)
}
x <- NULL
y <- NULL
while (length(x) < n) {
U <- runif(1)
X <- runif(1, -1, 1)
Y <- runif(1, -1, 1)
x <- append(x, X[U <= (f(X, Y) / (M * g(X, Y)))])
y <- append(y, Y[U <= (f(X, Y) / (M * g(X, Y)))])
}
t <- seq(-1, 1, 0.01)
plot(x, y)
contour(t, t, outer(t, t, Vectorize(f)), add = TRUE, col = "red", lwd = 2)
```

View File

@@ -0,0 +1,65 @@
# Exercise 8 - a : Truncated Normal Distribution
```{r}
f <- function(x, b, mean, sd) {
1 / (sqrt(2 * pi * sd^2) * pnorm((mean - b) / sd)) *
exp(-(x - mean)^2 / (2 * sd^2)) *
(x >= b)
}
M <- function(b, mean, sd) {
1 / pnorm((mean - b) / sd)
}
g <- function(x, b, mean, sd) {
1 / sqrt(2 * pi * sd^2) *
exp(-(x - mean)^2 / (2 * sd^2))
}
n <- 10000
mean <- 0
sd <- 2
b <- 2
x <- numeric(0)
while (length(x) < n) {
U <- runif(1)
X <- rnorm(1, mean, sd)
x <- append(x, X[U <= (f(X, b, mean, sd) / (M(b, mean, sd) * g(X, b, mean, sd)))])
}
t <- seq(b, 7, 0.01)
hist(x, freq = FALSE, breaks = 35)
lines(t, f(t, b, mean, sd), col = "red", lwd = 2)
```
# Exercise 8 - b : Truncated Exponential Distribution
```{r}
f <- function(x, b, lambda) {
lambda * exp(-lambda * (x - b)) * (x >= b)
}
M <- function(b, lambda) {
exp(lambda * b)
}
g <- function(x, lambda) {
lambda * exp(-lambda * x)
}
n <- 10000
b <- 2
lambda <- 1
x <- numeric(0)
while (length(x) < n) {
U <- runif(1)
X <- rexp(1, lambda)
x <- append(x, X[U <= (f(X, b, lambda) / (M(b, lambda) * g(X, lambda)))])
}
t <- seq(b, 7, 0.01)
hist(x, freq = FALSE, breaks = 35)
lines(t, f(t, b, lambda), col = "red", lwd = 2)
```

View File

@@ -0,0 +1,37 @@
# Exercise 9 : Estimation of Pi
## Method 1
```{r}
n <- 15e4
pi_1 <- function(n) {
U <- runif(n, 0, 1)
return(4 / n * sum(sqrt(1 - U^2)))
}
pi_1(n)
```
## Method 2
```{r}
n <- 15e4
pi_2 <- function(n) {
U1 <- runif(n, 0, 1)
U2 <- runif(n, 0, 1)
return(4 / n * sum(U1^2 + U2^2 <= 1))
}
pi_2(n)
```
## Best Estimator of pi
```{r}
n <- 1000
m <- 15e4
sample_1 <- replicate(n, pi_1(m))
sample_2 <- replicate(n, pi_2(m))
cat(sprintf("[Methode 1] Mean: %s. Variance: %s \n", mean(sample_1), var(sample_1)))
cat(sprintf("[Methode 2] Mean: %s. Variance: %s", mean(sample_2), var(sample_2)))
```

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,909 @@
---
title: "Groupe 03 Projet DANJOU - DUROUSSEAU"
output:
pdf_document:
toc: yes
toc_depth: 3
fig_caption: yes
---
# Exercise 1 : Negative weighted mixture
## Definition
### Question 1
The conditions for a function $f$ to be a probability density are:
- f is defined on $\mathbb{R}$
- f is non-negative, i.e. $f(x) \ge 0$, $\forall x \in \mathbb{R}$
- f is Lebesgue-integrable
- and $\int_{\mathbb{R}} f(x) \,dx = 1$
In particular, $f$ must stay non-negative as $|x| \rightarrow \infty$.
When $|x| \rightarrow \infty$, the dominant factor of $f_i$ is $exp(\frac{-x^2}{2\sigma_i^2})$ for $i = 1, 2$
Then, up to these factors, $f(x) \sim exp(\frac{-x^2}{2\sigma_1^2}) - exp(\frac{-x^2}{2\sigma_2^2})$ for large $|x|$
We want $f(x) \ge 0$, i.e. $exp(\frac{-x^2}{2\sigma_1^2}) \ge exp(\frac{-x^2}{2\sigma_2^2})$ for large $|x|$
$\Leftrightarrow \sigma_1^2 \ge \sigma_2^2$
We can see that $f_1$ dominates the tail behavior.
\
### Question 2
For given parameters $(\mu_1, \sigma_1^2)$ and $(\mu_2, \sigma_2^2)$, we have $\forall x \in \mathbb{R}$, $f(x) \ge 0$
$\Leftrightarrow \frac{1}{\sigma_1} exp(\frac{-(x-\mu_1)^2}{2 \sigma_1^2}) \ge \frac{a}{\sigma_2} exp(\frac{-(x-\mu_2)^2}{2 \sigma_2^2})$
$\Leftrightarrow 0 < a \le a^* = \min_{x \in \mathbb{R}} \frac{f_1(x)}{f_2(x)} = \min_{x \in \mathbb{R}} \frac{\sigma_2}{\sigma_1} exp(\frac{(x-\mu_2)^2}{2 \sigma_2^2} - \frac{(x-\mu_1)^2}{2 \sigma_1^2})$
To find $a^*$, we just have to minimize $g(x) := \frac{(x-\mu_2)^2}{2 \sigma_2^2} - \frac{(x-\mu_1)^2}{2 \sigma_1^2}$
First we derive $g$: $\forall x \in \mathbb{R}, g^{\prime}(x) = \frac{x - \mu_2}{\sigma_2^2} - \frac{x - \mu_1}{\sigma_1^2}$
We search $x^*$ such that $g^{\prime}(x^*) = 0$
$\Leftrightarrow x^* = \frac{\mu_2 \sigma_1^2 - \mu_1 \sigma_2^2}{\sigma_1^2 - \sigma_2^2}$
Then, we compute $a^* = \frac{f_1(x^*)}{f_2(x^*)}$
We call $C \in \mathbb{R}$ the normalization constant such that $f(x) = C (f_1(x) - af_2(x))$
To find $C$, we know that $1 = \int_{\mathbb{R}} f(x) \,dx = \int_{\mathbb{R}} C (f_1(x) - af_2(x)) \,dx = C \int_{\mathbb{R}} f_1(x) \,dx - Ca \int_{\mathbb{R}} f_2(x) \,dx = C(1-a)$ as $f_1$ and $f_2$ are density functions and by linearity of the integrals.
$\Leftrightarrow C = \frac{1}{1-a}$
### Question 3
```{r}
f <- function(a, mu1, mu2, s1, s2, x) {
fx <- dnorm(x, mu1, s1) - a * dnorm(x, mu2, s2)
fx[fx < 0] <- 0
return(fx / (1 - a))
}
a_star <- function(mu1, mu2, s1, s2) {
x_star <- (mu2 * s1^2 - mu1 * s2^2) / (s1^2 - s2^2)
return(dnorm(x_star, mu1, s1) / dnorm(x_star, mu2, s2))
}
```
```{r}
mu1 <- 0
mu2 <- 1
s1 <- 3
s2 <- 1
x <- seq(-10, 10, length.out = 1000)
as <- a_star(mu1, mu2, s1, s2)
a_values <- c(0.1, 0.2, 0.3, 0.4, 0.5, as)
plot(x, f(as, mu1, mu2, s1, s2, x),
type = "l",
col = "red",
xlab = "x",
ylab = "f(a, mu1, mu2, s1, s2, x)",
main = "Density function of f(a, mu1, mu2, s1, s2, x) for different a"
)
for (i in (length(a_values) - 1):1) {
lines(x, f(a_values[i], mu1, mu2, s1, s2, x), lty = 3, col = "blue")
}
legend("topright", legend = c("a = a*", "a != a*"), col = c("red", "blue"), lty = 1)
```
We observe that for small values of a, the density f is close to the density of $f_1 \sim \mathcal{N}(\mu_1, \sigma_1^2)$. When a increases, the shape evolves into a combination of the two normal components. The value $a = a^*$ is the largest value of a for which f is still a density: for $a > a^*$, the function f takes negative values and is no longer a density.
```{r}
s2_values <- seq(1, 10, length.out = 5)
a <- 0.2
plot(x, f(a, mu1, mu2, s1, max(s2_values), x),
type = "l",
xlab = "x",
ylab = "f(a, mu1, mu2, s1, s2, x)",
col = "red",
main = "Density function of f(a, mu1, mu2, s1, s2, x) for different s2"
)
for (i in length(s2_values):1) {
lines(x, f(a, mu1, mu2, s1, s2_values[i], x), lty = 1, col = rainbow(length(s2_values))[i])
}
legend("topright", legend = paste("s2 =", s2_values), col = rainbow(length(s2_values)), lty = 1)
```
We observe that when $\sigma_2^2 = 1$, the density $f$ has two peaks, and when $\sigma_2^2 > 1$, the density $f$ has only one peak.
```{r}
mu1 <- 0
mu2 <- 1
sigma1 <- 3
sigma2 <- 1
a <- 0.2
as <- a_star(mu1, mu2, sigma1, sigma2)
cat(sprintf("a* = %f, a = %f, a <= a* [%s]", as, a, a <= as))
```
We have $\sigma_1^2 \ge \sigma_2^2$ and $0 < a \le a^*$, so the numerical values are compatible with the constraints defined above.
## Inverse c.d.f Random Variable simulation
### Question 4
To prove that the cumulative distribution function F associated with f is available in closed form, we compute $F(x) = \int_{-\infty}^{x} f(t) \,dt = \frac{1}{1-a} (\int_{-\infty}^{x} f_1(t)\, dt - a \int_{-\infty}^{x} f_2(t)\, dt) = \frac{1}{1-a} (F_1(x) - aF_2(x))$ where $F_1$ and $F_2$ are the cumulative distribution functions of $\mathcal{N}(\mu_1, \sigma_1^2)$ and $\mathcal{N}(\mu_2, \sigma_2^2)$ respectively.
Then $F$ is available in closed form, as a finite combination of closed-form functions.
```{r}
F <- function(a, mu1, mu2, s1, s2, x) {
Fx <- pnorm(x, mu1, s1) - a * pnorm(x, mu2, s2)
return(Fx / (1 - a))
}
```
To construct an algorithm that returns the value of the inverse of $F$ at $u \in (0,1)$, given the parameters $a, \mu_1, \mu_2, \sigma_1, \sigma_2$ and an approximation precision $\epsilon$, we can use the bisection method.
We fix $\epsilon > 0$.\
We draw $u \in (0, 1)$.\
We define the bounds $L = -10$ and $U = -L$, and $M = \frac{L + U}{2}$, the midpoint of the interval.\
While $U - L > \epsilon$ :
- We compute $F(M) = \frac{1}{1-a} (F_1(M) - aF_2(M))$
- If $F(M) < u$, we set $L = M$
- Else, we set $U = M$
- We set $M = \frac{L + U}{2}$
End while
For the generation of random variables from F, we can use the inverse transform sampling method.\
We set $X$ as an empty array and $n$ the number of random variables we want to generate.\
We fix $\epsilon > 0$.\
For $i = 1, \dots, n$ :
- We draw $u \sim \mathcal{U}(0, 1)$.\
- We define the bounds $L = -10$ and $U = -L$, and $M = \frac{L + U}{2}$, the midpoint of the interval.\
- While $U - L > \epsilon$ :
  - We compute $F(M) = \frac{1}{1-a} (F_1(M) - aF_2(M))$
  - If $F(M) < u$, we set $L = M$
  - Else, we set $U = M$
  - We set $M = \frac{L + U}{2}$
- End while
- We add $M$ to $X$
End for. We return $X$.
### Question 5
```{r}
inv_cdf <- function(n) {
X <- numeric(n)
for (i in 1:n) {
u <- runif(1)
L <- -10
U <- -L
M <- (L + U) / 2
while (U - L > 1e-6) {
FM <- F(a, mu1, mu2, s1, s2, M)
if (FM < u) {
L <- M
} else {
U <- M
}
M <- (L + U) / 2
}
X[i] <- M
}
return(X)
}
```
```{r}
set.seed(123)
n <- 10000
X <- inv_cdf(n)
x <- seq(-10, 10, length.out = 1000)
hist(X, breaks = 100, freq = FALSE, col = "lightblue", main = "Empirical density function", xlab = "x")
lines(x, f(a, mu1, mu2, s1, s2, x), col = "red")
legend("topright", legend = c("Theorical density", "Empirical density"), col = c("red", "lightblue"), lty = 1)
```
```{r}
plot(ecdf(X), col = "blue", main = "Empirical cumulative distribution function", xlab = "x", ylab = "F(x)")
lines(x, F(a, mu1, mu2, s1, s2, x), col = "red")
legend("bottomright", legend = c("Theoretical cdf", "Empirical cdf"), col = c("red", "blue"), lty = 1)
```
We can see on both graphs that the empirical cumulative distribution function is close to the theoretical one.
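As a quantitative complement, we can run a Kolmogorov-Smirnov test of the sample against the closed-form cdf. A quick sketch (`ks.test` accepts a cdf function as its second argument):
```{r}
ks.test(X, function(x) F(a, mu1, mu2, s1, s2, x))
```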
## Accept-Reject Random Variable simulation
### Question 6
To simulate under f using the accept-reject algorithm, we need to find a density function g such that $f(x) \le M g(x)$ for all $x \in \mathbb{R}$, where M is a constant.\
Then, we generate $X \sim g$ and $U \sim \mathcal{U}([0,1])$.\
We accept $Y = X$ if $U \le \frac{f(X)}{Mg(X)}$. Return to $1$ otherwise.
The probability of acceptance is $\int_{\mathbb{R}} \frac{f(x)}{Mg(x)} g(x) \,dx = \frac{1}{M} \int_{\mathbb{R}} f(x) \,dx = \frac{1}{M}$
Here we pose $g = f_1$.
Then we have $\frac{f(x)}{g(x)} = \frac{1}{1-a} (1 - a\frac{f_2(x)}{f_1(x)})$
We showed earlier that $a^* = \min_{x \in \mathbb{R}} \frac{f_1(x)}{f_2(x)} \Rightarrow \frac{1}{a^*} = \max_{x \in \mathbb{R}} \frac{f_2(x)}{f_1(x)}$.
Since $\frac{f_2(x)}{f_1(x)} \ge 0$, the first equation gives $\frac{f(x)}{g(x)} = \frac{1}{1-a} (1 - a\frac{f_2(x)}{f_1(x)}) \leq \frac{1}{1-a}$; moreover, $a \leq a^*$ gives $1 - a\frac{f_2(x)}{f_1(x)} \geq 1 - \frac{a}{a^*} \geq 0$, so the ratio is non-negative and bounded.
To conclude, we have $M = \frac{1}{1-a}$ and the probability of acceptance is $\frac{1}{M} = 1-a$
### Question 7
```{r}
accept_reject <- function(n, a) {
X <- numeric(0)
M <- 1 / (1 - a)
while (length(X) < n) {
Y <- rnorm(1, mu1, s1)
U <- runif(1)
if (U <= f(a, mu1, mu2, s1, s2, Y) / (M * dnorm(Y, mu1, s1))) {
X <- append(X, Y)
}
}
return(X)
}
```
```{r}
set.seed(123)
n <- 10000
a <- 0.2
X <- accept_reject(n, a)
x <- seq(-10, 10, length.out = 1000)
hist(X, breaks = 100, freq = FALSE, col = "lightblue", main = "Empirical density function", xlab = "x")
lines(x, f(a, mu1, mu2, s1, s2, x), col = "red")
legend("topright", legend = c("Theorical density", "Empirical density"), col = c("red", "lightblue"), lty = 1)
```
```{r}
set.seed(123)
acceptance_rate <- function(n, a = 0.2) {
  M <- 1 / (1 - a) # bound for the current a (a global M would be stale when a varies in question 8)
  Y <- rnorm(n, mu1, s1)
  U <- runif(n)
  return(mean(U <= f(a, mu1, mu2, s1, s2, Y) / (M * dnorm(Y, mu1, s1))))
}
M <- 1 / (1 - a)
n <- 10000
cat(sprintf("[M = %.2f] Empirical acceptance rate: %f, Theoretical acceptance rate: %f \n", M, acceptance_rate(n), 1 / M))
```
### Question 8
```{r}
set.seed(123)
a_values <- seq(0.01, 1, length.out = 100)
acceptance_rates <- numeric(length(a_values))
as <- a_star(mu1, mu2, s1, s2)
for (i in seq_along(a_values)) {
acceptance_rates[i] <- acceptance_rate(n, a_values[i])
}
plot(a_values, acceptance_rates, type = "l", col = "blue", xlab = "a", ylab = "Acceptance rate", main = "Acceptance rate as a function of a")
points(as, acceptance_rate(n, as), col = "red", pch = 16)
legend("topright", legend = c("Acceptance rate for all a", "Acceptance rate for a_star"), col = c("lightblue", "red"), lty = c(1, NA), pch = c(NA, 16))
```
## Random Variable simulation with stratification
### Question 9
We consider a partition $\mathcal{P}= (D_0,D_1,...,D_k)$, $k \in \mathbb{N}$ of $\mathbb{R}$ such that $D_0$ covers the tails of $f_1$, and on each of $D_1, \dots, D_k$ we bound $f_1$ from above and $f_2$ from below.
To simulate under $f$ using the accept-reject algorithm, we need to find a density function $g$ such that $f(x) \le M g(x)$ for all $x \in \mathbb{R}$, where M is a constant.\
We generate $X \sim g$ and $U \sim \mathcal{U}([0,1])$ $\textit{(1)}$.\
We accept $Y = X$ if $U \le \frac{f(X)}{Mg(X)}$. Otherwise, we return to $\textit{(1)}$.
Here we pose $g(x) = f_1(x)$ if $x \in D_0$ and $g(x) = \sup_{t \in D_i} f_1(t)$ if $x \in D_i$, $i \in \{1, \dots, k\}$.
To conclude, we have $M = \frac{1}{1-a}$ and the probability of acceptance is $r = \frac{1}{M} = 1-a$
### Question 10
Let $P_n = (D_0, \dots, D_n)$ be a partition of $\mathbb{R}$ for $n \in \mathbb{N}$. As the partition is refined, we have, $\forall x \in \mathbb{R}$ and $\forall i \in \{0, \dots, n\}$, $\lim_{n \to\infty} \sup_{t \in D_i} f_1(t) = f_1(x)$ and $\lim_{n\to\infty} \inf_{t \in D_i} f_2(t) = f_2(x)$.
$\Rightarrow \lim_{n\to\infty} g(x) = f(x)$
$\Rightarrow \lim_{n\to\infty} \frac{g(x)}{f(x)} = 1$ as $f(x) > 0$ ($f$ is a density function).
$\Rightarrow \forall \epsilon > 0, \exists n_{\epsilon} \in \mathbb{N}$ such that $\forall n \ge n_{\epsilon}$, $M = \sup_{x} \frac{g(x)}{f(x)} < 1 + \epsilon$
$\Rightarrow r = \frac{1}{M} > \frac{1}{1 + \epsilon} := \delta \in ]0, 1]$ where $r$ is the acceptance rate defined above.
### Question 11
We recall the parameters and the functions of the problem.
```{r}
mu1 <- 0
mu2 <- 1
s1 <- 3
s2 <- 1
a <- 0.2
f1 <- function(x) {
dnorm(x, mu1, s1)
}
f2 <- function(x) {
dnorm(x, mu2, s2)
}
f <- function(x) {
fx <- f1(x) - a * f2(x)
fx[fx < 0] <- 0
return(fx / (1 - a))
}
f1_bounds <- c(mu1 - 3 * s1, mu1 + 3 * s1)
```
We implement the partition, the dominating function $g$ (to compare its behavior with $f$), and the computation of the supremum and infimum of $f_1$ and $f_2$ on each cell of the partition.
```{r}
create_partition <- function(k = 10) {
return(seq(f1_bounds[1], f1_bounds[2], length.out = k))
}
sup_inf <- function(f, P, i) {
x <- seq(P[i], P[i + 1], length.out = 1000)
f_values <- sapply(x, f)
return(c(max(f_values), min(f_values)))
}
g <- function(X, P) {
values <- numeric(0)
for (x in X) {
if (x <= P[1] | x >= P[length(P)]) {
values <- c(values, 1 / (1 - a) * f1(x))
} else {
for (i in 1:(length(P) - 1)) {
if (x >= P[i] & x <= P[i + 1]) {
values <- c(values, 1 / (1 - a) * (sup_inf(f1, P, i)[1] - a * sup_inf(f2, P, i)[2])) # scale the whole difference by 1/(1-a)
}
}
}
}
return(values)
}
```
We plot the function $f$ and the dominating function $g$ for different sizes of the partition.
```{r}
library(ggplot2)
X <- seq(-12, 12, length.out = 1000)
# Plot for different k with ggplot on same graph
k_values <- c(10, 20, 50, 100)
P_values <- lapply(k_values, create_partition)
g_values <- lapply(P_values, function(P) g(X, P)) # lapply would otherwise pass each partition as g's first argument
ggplot() +
geom_line(aes(x = X, y = f(X)), col = "red") +
geom_line(aes(x = X, y = g(X, P_values[[1]])), col = "green") +
geom_line(aes(x = X, y = g(X, P_values[[2]])), col = "orange") +
geom_line(aes(x = X, y = g(X, P_values[[3]])), col = "purple") +
geom_line(aes(x = X, y = g(X, P_values[[4]])), col = "black") +
labs(title = "Dominating function g of f for different size of partition", x = "x", y = "Density") +
theme_minimal()
```
Here, we implement the accept-reject algorithm with the given partition and the corresponding dominating function $g$ to sample from $f$.
```{r}
set.seed(123)
g_accept_reject <- function(x, P) {
if (x < P[1] | x >= P[length(P)]) {
return(f1(x))
} else {
    for (i in seq_len(length(P) - 1)) {
if (x >= P[i] & x < P[i + 1]) {
return(sup_inf(f1, P, i)[1])
}
}
}
}
stratified <- function(n, P) {
samples <- numeric(0)
rate <- 0
while (length(samples) < n) {
x <- rnorm(1, mu1, s1)
u <- runif(1)
if (u <= f(x) * (1 - a) / g_accept_reject(x, P)) {
samples <- c(samples, x)
}
rate <- rate + 1
}
list(samples = samples, acceptance_rate = n / rate)
}
n <- 10000
k <- 100
P <- create_partition(k)
samples <- stratified(n, P)
X <- seq(-10, 10, length.out = 1000)
hist(samples$samples, breaks = 50, freq = FALSE, col = "lightblue", main = "Empirical density function f", xlab = "x")
lines(X, f(X), col = "red")
```
We also compute the acceptance rate of the algorithm.
```{r}
theoretical_acceptance_rate <- 1 - a
cat(sprintf("Empirical acceptance rate: %f, Theoretical acceptance rate: %.1f \n", samples$acceptance_rate, theoretical_acceptance_rate))
```
### Question 12
```{r}
set.seed(123)
stratified_delta <- function(n, delta) {
samples <- numeric(0)
P <- create_partition(n * delta)
rate <- 0
while (length(samples) < n | rate < delta * n) {
x <- rnorm(1, mu1, s1)
u <- runif(1)
if (u <= f(x) * delta / g_accept_reject(x, P)) {
samples <- c(samples, x)
}
rate <- rate + 1
}
list(samples = samples, partition = P, acceptance_rate = n / rate)
}
n <- 10000
delta <- 0.8
samples_delta <- stratified_delta(n, delta)
X <- seq(-10, 10, length.out = 1000)
hist(samples_delta$samples, breaks = 50, freq = FALSE, col = "lightblue", main = "Empirical density function f", xlab = "x")
lines(X, f(X), col = "red")
```
We also compute the acceptance rate of the algorithm.
```{r}
theoretical_acceptance_rate <- 1 - a
cat(sprintf("Empirical acceptance rate: %f, Theoretical acceptance rate: %f \n", samples_delta$acceptance_rate, theoretical_acceptance_rate))
```
Now, we will test the stratified_delta function for different delta:
```{r}
set.seed(123)
n <- 1000
deltas <- seq(0.1, 1, by = 0.1)
for (delta in deltas) {
samples <- stratified_delta(n, delta)
cat(sprintf("Delta: %.1f, Empirical acceptance rate: %f \n", delta, samples$acceptance_rate))
}
```
## Cumulative density function.
### Question 13
The cumulative distribution function is, $\forall x \in \mathbb{R}$, $F_X(x) = \int_{-\infty}^{x} f(t) \,dt = \int_{\mathbb{R}} f(t) h(t) \,dt$ where $h(t) = \mathbb{1}_{t \le x}$
For a given $x \in \mathbb{R}$, a Monte Carlo estimator is $F_n(x) = \frac{1}{n} \sum_{i=1}^{n} h(X_i)$ where $h$ is the same function as above and $(X_i)_{i=1}^{n} \sim^{iid} X$
### Question 14
As $X_1, \dots, X_n$ are iid and follow the law of $X$, and $h$ is measurable and bounded, $h(X_1), \dots, h(X_n)$ are iid with $\mathbb{E}[|h(X_i)|] < + \infty$. By the law of large numbers, we have $F_n(x) = \frac{1}{n} \sum_{i=1}^{n} h(X_i) \xrightarrow{a.s} \mathbb{E}[h(X_1)] = F_X(x)$.
Moreover, by the Glivenko-Cantelli theorem, the convergence is uniform: $\sup_{x \in \mathbb{R}} |F_n(x) - F_X(x)| \xrightarrow{a.s} 0$, i.e. $\forall \epsilon > 0$, $\exists N \in \mathbb{N}$ such that $\forall n \ge N$, $\sup_{x \in \mathbb{R}} |F_n(x) - F_X(x)| < \epsilon$.
Hence, $F_n$ is a good estimate of $F_X$ as a function of $x$.
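A small illustration of this uniform convergence, as a sketch on a standard normal sample (where the true cdf is `pnorm`):
```{r}
set.seed(1)
for (n in c(100, 1000, 10000)) {
  Z <- rnorm(n)
  grid <- seq(-4, 4, length.out = 500)
  d <- max(abs(ecdf(Z)(grid) - pnorm(grid))) # approximate sup-distance on a grid
  cat(sprintf("n = %5d, sup |Fn - F| ~ %f \n", n, d))
}
```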
### Question 15
```{r}
set.seed(123)
n <- 10000
# Sample from f by accept-reject with proposal f1: note that the linear
# combination (X1 - a*X2)/(1-a) of two normal draws is itself normal,
# so it does not follow the mixture density f.
rf <- function(n) {
  out <- numeric(0)
  while (length(out) < n) {
    y <- rnorm(n, mu1, s1)
    u <- runif(n)
    out <- c(out, y[u <= (1 - a) * f(y) / dnorm(y, mu1, s1)])
  }
  return(out[1:n])
}
Xn <- rf(n)
h <- function(x, Xn) {
  return(Xn <= x)
}
# Fn empirical
empirical_cdf <- function(x, Xn) {
  return(mean(h(x, Xn)))
}
# F theoretical
F <- function(x) {
  Fx <- pnorm(x, mu1, s1) - a * pnorm(x, mu2, s2)
  return(Fx / (1 - a))
}
x0 <- 0
cat(sprintf("Empirical cdf at x0 = 0: %f, Theoretical cdf at x0 = 0: %f \n", empirical_cdf(x0, Xn), F(x0)))
```
Now we plot the empirical and theoretical cumulative density functions for different n.
```{r}
n_values <- c(10, 100, 1000, 10000)
colors <- c("lightblue", "blue", "darkblue", "navy")
plot(NULL, xlim = c(-10, 10), ylim = c(0, 1), xlab = "x", ylab = "Fn(x)", main = "Empirical vs Theoretical CDF")
for (i in seq_along(n_values)) {
  n <- n_values[i]
  X <- seq(-10, 10, length.out = n)
  Xn <- rf(n)
  lines(X, sapply(X, empirical_cdf, Xn = Xn), col = colors[i], lty = 2)
}
lines(X, F(X), col = "red", lty = 1, lwd = 2)
legend("topright", legend = c("Empirical cdf (n=10)", "Empirical cdf (n=100)", "Empirical cdf (n=1000)", "Empirical cdf (n=10000)", "Theoretical cdf"), col = c(colors, "red"), lty = c(2, 2, 2, 2, 1))
```
### Question 16
As $X_1, \dots, X_n$ are iid and follow the law of $X$, and $h$ is measurable and bounded, $h(X_1), \dots, h(X_n)$ are iid and $\mathbb{E}[h(X_i)^2] < + \infty$. By the Central Limit Theorem, we have $\sqrt{n} \frac{(F_n(x) - F_X(x))}{\sigma} \xrightarrow{d} \mathcal{N}(0, 1)$ where $\sigma^2 = Var(h(X_1)) = F_X(x)(1 - F_X(x))$.
So we have $\lim_{n\to\infty} \mathbb{P}(\sqrt{n} \frac{|F_n(x) - F_X(x)|}{\sigma} \le q_{1-\frac{\alpha}{2}}^{\mathcal{N}(0, 1)}) = 1 - \alpha$
So by computing the quantile of the normal distribution, we can have a confidence interval for $F_X(x)$ : $F_X(x) \in [F_n(x) - \frac{q_{1-\frac{\alpha}{2}}^{\mathcal{N}(0, 1)} \sigma}{\sqrt{n}} ; F_n(x) + \frac{q_{1-\frac{\alpha}{2}}^{\mathcal{N}(0, 1)} \sigma}{\sqrt{n}}]$
```{r}
set.seed(123)
Fn <- empirical_cdf(x0, Xn) # confidence interval for F_X at the point x0 chosen in question 15
sigma <- sqrt(Fn - Fn^2)
q <- qnorm(0.975)
interval <- c(Fn - q * sigma / sqrt(n), Fn + q * sigma / sqrt(n))
cat(sprintf("Confidence interval: [%f, %f] \n", interval[1], interval[2]))
```
### Question 17
```{r}
compute_n_cdf <- function(x, interval_length = 0.01) {
q <- qnorm(0.975)
ceiling((q^2 * F(x) * (1 - F(x))) / interval_length^2)
}
x_values <- c(-15, -1)
n_values <- sapply(x_values, compute_n_cdf)
data.frame(x = x_values, n = n_values)
```
We notice that the sample size needed to estimate the cumulative distribution function is larger for values of x close to the center of the distribution, where $F(x)(1-F(x))$ is maximal. At $x = -1$ we are near the highest peak of the density, while at $x = -15$ we are in the tail of the distribution.
## Empirical quantile function
### Question 18
We define the empirical quantile function defined on $(0, 1)$ by : $Q_n(u) = inf\{x \in \mathbb{R} : u \le F_n(x)\}$. We recall the estimator $F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{X_i \le x}$
So we have $Q_n(u) = \inf\{x \in \mathbb{R} : u \le \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{X_i \le x}\} = \inf\{x \in \mathbb{R} : n \cdot u \le \sum_{i=1}^{n} \mathbb{1}_{X_i \le x}\}$
We sort the sample $(X_1, \dots, X_n)$ in increasing order, and we define $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$ the order statistics of the sample.
For $x = X_{(k)}$ we have $\sum_{i=1}^{n} \mathbb{1}_{X_i \le x} = k$, so the condition $n \cdot u \le k$ is first satisfied at $k = \lceil n \cdot u \rceil$.
As $F_n$ is a step function, the infimum is attained there: $Q_n(u) = X_{(k)}$ where $k = \lceil n \cdot u \rceil$ and $X_{(k)}$ is the k-th order statistic of the sample $(X_1, \dots, X_n)$.
### Question 19
We note $Y_{j,n} := \mathbb{1}_{X_{n,j} < Q(u) + \frac{t}{\sqrt{n}} \frac{\sqrt{u(1-u)}}{f(Q(u))}}$
For each $n$, the $(X_{n,j})_j$ are iid copies of $X$, so the $(Y_{j,n})_j$ are iid.
Let $\Delta_{n} = \frac{t}{\sqrt{n}} \frac{\sqrt{u(1-u)}}{f(Q(u))}$. We have $F_{n}(Q(u) + \Delta_{n}) = \frac{1}{n} \sum_{j=1}^{n} \mathbb{1}_{X_{n,j} < Q(u) + \Delta_{n}}$
then $\frac{1}{n} \sum_{j=1}^{n}Y_{j,n} = F_{n}(Q(u) + \Delta_{n})$; by definition of the empirical quantile, $F_{n}(Q_{n}(u)) = u$
By the Taylor formula, we get $F_{n}(Q(u) + \Delta_{n}) \approx F_{n}(Q(u)) + \Delta_{n}f(Q(u))$
By the Lindeberg-Levy Central Limit Theorem, applied to $F_{n}(Q(u))$, since $\mathbb{E}[\mathbb{1}_{X \le Q(u)}] = u < +\infty$ and $Var(\mathbb{1}_{X \le Q(u)}) = u(1-u) < +\infty$, we have $\frac{\sqrt{n}(F_{n}(Q(u)) - u)}{\sqrt{u(1-u)}} \rightarrow \mathcal{N}(0,1)$
then $F_{n}(Q(u)) = u + \frac{1}{\sqrt{n}} Z$ with $Z \sim \mathcal{N}(0,u(1-u))$
Thus $F_{n}(Q(u) + \Delta_{n}) = u + \frac{1}{\sqrt{n}} Z + \Delta_{n} f(Q(u))$
By substituting $Q_{n}(u) = Q(u) + \Delta_{n}$, we have
$$
F_{n}(Q_{n}(u)) = F_{n}(Q(u) + \Delta_{n})\\
\Leftrightarrow u = u + \frac{1}{\sqrt{n}} Z + \Delta_{n} f(Q(u))\\
\Leftrightarrow \Delta_{n} = - \frac{1}{\sqrt{n}} \frac{Z}{f(Q(u))}
$$
As $Q_{n}(u) = Q(u) + \Delta_{n} \Rightarrow \Delta_{n} = Q_{n}(u) - Q(u)$
Then we have
$$
Q_{n}(u) - Q(u) = - \frac{1}{\sqrt{n}} \frac{Z}{f(Q(u))}\\
\Leftrightarrow \sqrt{n}(Q_{n}(u) - Q(u)) = \frac{Z}{f(Q(u))}\\
\Leftrightarrow \sqrt{n}(Q_{n}(u) - Q(u)) \sim \mathcal{N}(0,\frac{u(1-u)}{f(Q(u))^2})
$$
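A quick numerical illustration of this asymptotic law, as a sketch on a standard normal (where $Q$ and $f$ are simply `qnorm` and `dnorm`):
```{r}
set.seed(1)
u <- 0.9
n <- 1000
reps <- 2000
# Empirical quantile Qn(u) = k-th order statistic, k = ceiling(n * u)
Qn <- replicate(reps, sort(rnorm(n))[ceiling(n * u)])
Z <- sqrt(n) * (Qn - qnorm(u))
hist(Z, breaks = 50, freq = FALSE, main = "sqrt(n)(Qn(u) - Q(u))", xlab = "z")
curve(dnorm(x, 0, sqrt(u * (1 - u)) / dnorm(qnorm(u))), add = TRUE, col = "red")
```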
### Question 20
When $u \rightarrow 0$, $Q(u)$ corresponds to the lower tail of the distribution, and when $u \rightarrow 1$, $Q(u)$ corresponds to the upper tail of the distribution.
In the tails the density is small, so $f(Q(u))$ is close to 0 when $u$ is close to 0 or to 1, and the asymptotic variance $\frac{u(1-u)}{f(Q(u))^2}$ blows up. So we need a larger sample size to estimate the quantile function for values of $u$ that are close to 0 and 1.
### Question 21
```{r}
set.seed(123)
empirical_quantile <- function(u, Xn) {
sorted_Xn <- sort(Xn)
k <- ceiling(u * length(Xn))
sorted_Xn[k]
}
```
```{r}
set.seed(123)
n <- 1000
Xn <- rf(n) # sample from f with the accept-reject sampler defined in question 15
u_values <- seq(0.01, 0.99, by = 0.01)
Qn_values <- sapply(u_values, empirical_quantile, Xn = Xn)
plot(u_values, Qn_values, col = "blue", xlab = "u", ylab = "Qn(u)", type = "o")
# The theoretical quantile is obtained by inverting F numerically
# (a linear combination of normal quantiles is not the quantile of f)
lines(u_values, sapply(u_values, function(u) uniroot(function(x) F(x) - u, c(-30, 30))$root), col = "red")
legend("topright", legend = c("Empirical quantile function", "Theoretical quantile function"), col = c("blue", "red"), lty = 1)
```
### Question 22
We can compute the confidence interval of the empirical quantile function using the Central Limit Theorem.
We obtain the following formula for the confidence interval of the empirical quantile function:
$Q(u) \in [Q_n(u) - q_{1 - \frac{\alpha}{2}}^{\mathcal{N}(0,1)} \frac{\sqrt{u (1-u)}}{\sqrt{n} f(Q(u))}; Q_n(u) + q_{1 - \frac{\alpha}{2}}^{\mathcal{N}(0,1)} \frac{\sqrt{u (1-u)}}{\sqrt{n} f(Q(u))}]$
```{r}
f_q <- function(u) {
  # evaluate f at the true quantile Q(u), obtained by inverting F numerically
  f(uniroot(function(x) F(x) - u, c(-30, 30))$root)
}
compute_n_quantile <- function(u, interval_length = 0.01) {
q <- qnorm(0.975)
ceiling((q^2 * u * (1 - u)) / (interval_length^2 * f_q(u)^2))
}
u_values <- c(0.5, 0.9, 0.99, 0.999, 0.9999)
n_values <- sapply(u_values, compute_n_quantile)
data.frame(u = u_values, n = n_values)
```
We deduce that the size of the sample needed to estimate the quantile function is higher for values of u that are close to 1. This corresponds to the deduction made in question 20.
## Quantile estimation Naïve Reject algorithm
### Question 23
We generate a sample $(X_i)_{i=1}^{n} \sim^{iid} f_X$, the density of $X$.
We keep all the indices $i = 1, \dots, n$ such that $X_i \in A$ and put the corresponding points in a new set $(Y_j)$.
The kept points $(Y_j)$ are then draws of $X$ conditional on the event $X \in A$, with $A \subset \mathbb{R}$.
So when n is big enough, we can estimate $\mathbb{P}(X \in A)$ by the acceptance proportion, i.e. the number of kept points divided by $n$.
### Question 24
```{r}
accept_reject_quantile <- function(q, n) {
  X_samples <- rf(n) # draws from f via the accept-reject sampler of question 15
return(mean(X_samples >= q))
}
set.seed(123)
n <- 10000
q_values <- seq(-10, 10, by = 2)
delta_quantile <- sapply(q_values, accept_reject_quantile, n)
data.frame(q = q_values, quantile = delta_quantile)
```
```{r}
plot(q_values, delta_quantile, type = "o", col = "blue", xlab = "q", ylab = "Quantile estimation", main = "Quantile estimation using accept-reject algorithm")
lines(q_values, 1 - (pnorm(q_values, mu1, s1) - a * pnorm(q_values, mu2, s2)) / (1 - a), col = "red")
legend("topright", legend = c("Empirical quantile", "Theoretical quantile"), col = c("blue", "red"), lty = c(2, 1))
```
### Question 25
Let $X_1, \dots, X_n$ be iid with $\mathbb{E}[|X_1|] < \infty$. Then $\mathbb{1}_{X_1 \ge q}, \dots, \mathbb{1}_{X_n \ge q}$ are iid and $\mathbb{E}[\mathbb{1}_{X_1 \ge q}] = \mathbb{P}(X_1 \ge q) \le 1 < + \infty$. By the CLT, we have $\sqrt{n} \frac{(\frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{X_i \ge q} - \mathbb{E}[\mathbb{1}_{X_1 \ge q}])}{\sqrt{Var(\mathbb{1}_{X_1 \ge q})}} \rightarrow Z \sim \mathcal{N}(0, 1)$ in distribution, with
- $\mathbb{E}[\mathbb{1}_{X_1 \ge q}] = \mathbb{P}(X_1 \ge q) = \delta$
- $Var(\mathbb{1}_{X_1 \ge q}) = \mathbb{E}[\mathbb{1}_{X_1 \ge q}^2] - \mathbb{E}[\mathbb{1}_{X_1 \ge q}]^2 = \delta(1-\delta)$
In addition, we have $\hat{\delta}^{NR}_n \rightarrow \delta$ a.s. by the LLN, so it is a consistent estimator of $\delta$.
The function $x \mapsto \sqrt{\frac{\delta(1-\delta)}{x(1-x)}}$ is continuous on $(0,1)$, so $\sqrt{\frac{\delta(1-\delta)}{\hat{\delta}^{NR}_n(1-\hat{\delta}^{NR}_n)}} \rightarrow 1$ a.s.; then by Slutsky, $\sqrt{n} \frac{\hat{\delta}^{NR}_n - \delta}{\sqrt{\hat{\delta}^{NR}_n(1-\hat{\delta}^{NR}_n)}} \rightarrow Z \sim \mathcal{N}(0, 1)$.
Now we can compute the confidence interval for $\delta$: $IC_n(\delta) = [\hat{\delta}^{NR}_n - \frac{q_{1-\frac{\alpha}{2}}^{\mathcal{N}(0,1)}}{\sqrt{n}}\sqrt{\hat{\delta}^{NR}_n(1-\hat{\delta}^{NR}_n)}\ ;\ \hat{\delta}^{NR}_n + \frac{q_{1-\frac{\alpha}{2}}^{\mathcal{N}(0,1)}}{\sqrt{n}}\sqrt{\hat{\delta}^{NR}_n(1-\hat{\delta}^{NR}_n)}]$ where $q_{1-\frac{\alpha}{2}}^{\mathcal{N}(0,1)}$ is the quantile of the normal distribution $\mathcal{N}(0,1)$
```{r}
IC_Naive <- function(delta_hat, n, alpha = 0.05) {
q <- qnorm(1 - alpha / 2)
c(lower = delta_hat - q * sqrt(delta_hat * (1 - delta_hat)) / sqrt(n),
upper = delta_hat + q * sqrt(delta_hat * (1 - delta_hat)) / sqrt(n))
}
set.seed(123)
n <- 10000
q_values <- seq(-10, 10, by = 2)
delta_quantile_naive <- sapply(q_values, accept_reject_quantile, n)
IC_values_naive <- sapply(delta_quantile_naive, IC_Naive, n = n)
data.frame(q = q_values, quantile = delta_quantile_naive, IC = t(IC_values_naive))
```
## Importance Sampling
### Question 26
Let $X_1, \dots, X_n \sim^{iid} g$ where $g$ is a density that is easy to simulate from and such that $supp(f) \subseteq supp(g)$.
We want to determine $\delta = \mathbb{P}(X \ge q) = \int_{q}^{+ \infty} f_X(x) \,dx = \int_{\mathbb{R}} f(x) h(x) \,dx = \mathbb{E}[h(X)]$ for any $q$, with $h(x) = \mathbb{1}_{x \ge q}$ and $f$ the density function of $X$.
Given $X_1, \dots, X_n \sim^{iid} g$, we have $\hat{\delta}^{IS}_n = \frac{1}{n} \sum_{i=1}^{n} \frac{f(X_i)}{g(X_i)} h(X_i)$
So $\hat{\delta}^{IS}_n$ is an unbiased estimator of $\delta$, since $supp(f) \subseteq supp(g)$.
The importance sampling estimator is preferred to the classical Monte Carlo estimator when its variance is lower; this is typically the case when q is located in the tails of the distribution $f$.
### Question 27
The density $g$ of a Cauchy distribution for parameters $(\mu_0, \gamma)$ is given by: $g(x; \mu_0, \gamma) = \frac{1}{\pi} \frac{\gamma}{(x - \mu_0)^2 + \gamma^2}$, $\forall x \in \mathbb{R}$.
The parameters $(\mu_0, \gamma)$ of the Cauchy distribution used in importance sampling are chosen based on the characteristics of the target density $f(x)$.
- $\mu_0$ is chosen such that $supp(f) \subseteq supp(g)$, where $supp(f)$ is the support of the target density $f(x)$. $\mu_0$ should place the center of $g(x)$ in the most likely region of $f(x)$, which can be approximated by the midpoint between $\mu_1$ and $\mu_2$, i.e. $\mu_0 = \frac{\mu_1 + \mu_2}{2} = \frac{0 + 1}{2} = \frac{1}{2}$.
- By centering $g(x)$ at $\mu_0$ and setting $\gamma$ to capture the spread of $f(x)$, we maximize the overlap between $f(x)$ and $g(x)$, reducing the variance of the importance sampling estimator. To cover the broader spread of $f(x)$, $\gamma$ should reflect the scale of the wider normal component. A reasonable choice is to set $\gamma$ to the largest standard deviation, i.e. $\gamma = \max(\sigma_1, \sigma_2) = \max(3, 1) = 3$; the overlay sketch below illustrates this choice.
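To visualize the overlap, we can plot the target $f$ against the Cauchy proposal with the parameters chosen above. A quick sketch, reusing the single-argument `f` from question 11:
```{r}
x <- seq(-15, 15, length.out = 1000)
plot(x, f(x), type = "l", col = "red", xlab = "x", ylab = "density", main = "Target f vs Cauchy proposal")
lines(x, dcauchy(x, 0.5, 3), col = "blue")
legend("topright", legend = c("f", "Cauchy(0.5, 3)"), col = c("red", "blue"), lty = 1)
```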
### Question 28
We can compute the confidence interval of the importance sampling estimator using the Central Limit Theorem, with the same arguments as in question 25, since we have a consistent estimator of $\delta$. (Strictly speaking, the asymptotic variance of $\hat{\delta}^{IS}_n$ is $Var_g(\frac{f(X)}{g(X)} h(X))/n$; the binomial form below is used as an approximation.)
```{r}
mu0 <- 0.5
gamma <- 3
g <- function(x) {
dcauchy(x, mu0, gamma)
}
IS_quantile <- function(q, n) {
X <- rcauchy(n, mu0, gamma)
w <- f(X) / g(X)
h <- (X >= q)
return(mean(w * h))
}
```
```{r}
IC_IS <- function(delta_hat, n, alpha = 0.05) {
q <- qnorm(1 - alpha / 2)
c(lower = delta_hat - q * sqrt(delta_hat * (1 - delta_hat)) / sqrt(n),
upper = delta_hat + q * sqrt(delta_hat * (1 - delta_hat)) / sqrt(n))
}
set.seed(123)
q_values <- seq(-10, 10, by = 2)
n <- 10000
delta_quantile_IS <- sapply(q_values, IS_quantile, n = n)
IC_values_IS <- sapply(delta_quantile_IS, IC_IS, n = n)
data.frame(q = q_values, quantile = delta_quantile_IS, IC = t(IC_values_IS))
```
# Control Variate
### Question 29
Let $X \sim f(\cdot|\theta)$. The score with respect to a parameter $\theta$ is the partial derivative of the log-density: $S_{\theta}(x) = \frac{\partial \log f(x | \theta)}{\partial \theta}$. For $\theta = \mu_1$, since only $f_1$ depends on $\mu_1$ in $f = \frac{f_1 - a f_2}{1-a}$, we get $S_{\mu_1}(x) = \frac{f_1(x|\theta_1)}{f_1(x|\theta_1) - a f_2(x|\theta_2)} \cdot \frac{x - \mu_1}{\sigma_1^2}$
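We can verify numerically that this score has mean zero under $f$, which is what makes it a valid control variate. A quick sketch by numerical integration, reusing `f1`, `f2` and `f` from question 11:
```{r}
S <- function(x) (f1(x) / (f1(x) - a * f2(x))) * (x - mu1) / s1^2
# Should be (numerically) zero: E_f[S(X)] = 0
integrate(function(x) S(x) * f(x), lower = -Inf, upper = Inf)
```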
### Question 30
We recall $\delta = \mathbb{P}(X \ge q) = \int_{q}^{+ \infty} f_X(x) \,dx = \int_\mathbb{R} f_X(x) h(x) \, dx$ where $h(x) = \mathbb{1}_{x \ge q}$. We denote $\hat{\delta}^{IC}_n = \frac{1}{n} \sum^{n}_{i=1} \left( h(X_i) - b \cdot (S_{\mu_1}(X_i) - m) \right) = \frac{1}{n} \sum^{n}_{i=1} \left( \mathbb{1}_{X_i \ge q} - b \cdot S_{\mu_1}(X_i) \right)$ where $m = \mathbb{E}[S_{\mu_1}(X_1)] = 0$ and $b=\frac{Cov(\mathbb{1}_{X_1 \ge q}, \, S_{\mu_1}(X_1))}{Var(S_{\mu_1}(X_1))}$
### Question 31
```{r}
CV_quantile <- function(q, n) {
  X <- rf(n) # draws from f (drawing from f1 here would bias mean(h) away from delta)
S <- (f1(X) / (f1(X) - a * f2(X))) * (X - mu1) / s1^2
h <- (X >= q)
b <- cov(S, h) / var(S)
delta <- mean(h) - b * mean(S)
return(delta)
}
```
We can compute the confidence interval of the control variate estimator using the Central Limit Theorem, with the same arguments as in question 25, since we have a consistent estimator of $\delta$.
```{r}
IC_CV <- function(delta_hat, n, alpha = 0.05) {
q <- qnorm(1 - alpha / 2)
c(lower = delta_hat - q * sqrt(delta_hat * (1 - delta_hat)) / sqrt(n),
upper = delta_hat + q * sqrt(delta_hat * (1 - delta_hat)) / sqrt(n))
}
set.seed(123)
n <- 10000
q_values <- seq(-10, 10, by = 2)
delta_quantile_CV <- sapply(q_values, CV_quantile, n)
IC_values_CV <- sapply(delta_quantile_CV, IC_CV, n = n)
data.frame(q = q_values, quantile = delta_quantile_CV, IC = t(IC_values_CV))
```
### Question 32
```{r}
set.seed(123)
delta_real <- 1 - (pnorm(q_values, mu1, s1) - a * pnorm(q_values, mu2, s2)) / (1 - a)
IS <- data.frame(IC = t(IC_values_IS), length = IC_values_IS[2, ] - IC_values_IS[1, ])
naive <- data.frame(IC = t(IC_values_naive), length = IC_values_naive[2, ] - IC_values_naive[1, ])
CV <- data.frame(IC = t(IC_values_CV), length = IC_values_CV[2, ] - IC_values_CV[1, ])
data.frame(q = q_values, real_quantile = delta_real, quantile_CV = CV, quantile_IS = IS, quantile_Naive = naive)
```
```{r}
plot(q_values, delta_real, type = "l", col = "red", xlab = "q", ylab = "Quantile estimation", main = "Quantile estimation using different methods")
lines(q_values, delta_quantile_IS, col = "blue")
lines(q_values, delta_quantile_naive, col = "green")
lines(q_values, delta_quantile_CV, col = "orange")
legend("topright", legend = c("Real quantile", "Quantile estimation IS", "Quantile estimation Naive", "Quantile estimation CV"), col = c("red", "blue", "green", "orange"), lty = c(1, 1, 1, 1))
```
Now we compare the three methods: Naive vs Importance Sampling vs Control Variate.
The naive method is the easiest to implement but is the least precise at some points.
The importance sampling method is more precise than the naive method but requires the choice of a good density $g$, which can be hard to determine in some cases.
The control variate method is the most precise but requires the choice of a good control variate, and needs the computation of a covariance, which requires more computation time.
In our case, the control variate method is the most precise, but the importance sampling method is also a good choice.
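A visual summary of the interval lengths computed above (a short sketch reusing the `naive`, `IS` and `CV` data frames):
```{r}
plot(q_values, naive$length, type = "o", col = "green", xlab = "q", ylab = "CI length", main = "Confidence interval length by method")
lines(q_values, IS$length, type = "o", col = "blue")
lines(q_values, CV$length, type = "o", col = "orange")
legend("topright", legend = c("Naive", "IS", "CV"), col = c("green", "blue", "orange"), lty = 1)
```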

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,110 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c897654e0a140cbd",
"metadata": {},
"source": [
"# Automatic Differentiation\n",
"\n",
"### Neural Network\n",
"\n",
"Loss function: softmax layer in $\\mathbb{R}^3$\n",
"\n",
"Architecture: FC/ReLU 4-5-7-3"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "70a4eb1d928b10d0",
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-24T15:16:27.015669Z",
"start_time": "2025-03-24T15:16:23.856887Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean Accuracy: 94%\n",
"STD Accuracy: 3%\n",
"Max accuracy: 100%\n",
"Min accuracy: 88%\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.metrics import accuracy_score\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.neural_network import MLPClassifier\n",
"\n",
"accuracies = []\n",
"\n",
"for _ in range(10):\n",
" X, y = make_classification(\n",
" n_samples=1000,\n",
" n_features=4,\n",
" n_classes=3,\n",
" n_clusters_per_class=1,\n",
" )\n",
"\n",
" X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n",
" model = MLPClassifier(\n",
" hidden_layer_sizes=(5, 7),\n",
" activation=\"relu\",\n",
" max_iter=10000,\n",
" solver=\"adam\",\n",
" )\n",
" model.fit(X_train, y_train)\n",
"\n",
" y_pred = model.predict(X_test)\n",
" accuracies.append(accuracy_score(y_test, y_pred))\n",
"\n",
"print(f\"Mean Accuracy: {np.mean(accuracies) * 100:.0f}%\")\n",
"print(f\"STD Accuracy: {np.std(accuracies) * 100:.0f}%\")\n",
"print(f\"Max accuracy: {np.max(accuracies) * 100:.0f}%\")\n",
"print(f\"Min accuracy: {np.min(accuracies) * 100:.0f}%\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96b6d46883ed5570",
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-24T14:37:53.507776Z",
"start_time": "2025-03-24T14:37:53.505376Z"
}
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,614 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Computer session 2\n",
"\n",
"The goal of this computer class is to get a good feel of the Newton method and its\n",
"variants. In a (maybe) surprising way, we actually start with the dichotomy method\n",
"in the one-dimensional case.\n",
"\n",
"## The dichotomy method in the one-dimensional case\n",
"\n",
"When trying to solve the equation $\\phi(x) = 0$ in the one-dimensional case, the\n",
"most naive method, which actually turns out to be quite efficient, is the dichotomy\n",
"method. Namely, starting from an initial pair $(a_L , a_R ) \\in \\mathbb{R}^2$ with $a_L < a_R$ such\n",
"that $\\phi(a_L)\\phi(a_R)<0$, we set $b =\\frac{a_L+a_R}{2}$. If $\\phi(b) = 0$, the algorithm stops. If\n",
"$\\phi(a_L)\\phi(b) < 0$ we set $a_L\\to a_L$ and $a_R \\to b$. In this way, we obtain a linearly\n",
"converging algorithm. In particular, it is globally converging.\n",
"\n",
"\n",
"Write a function `Dichotomy(phi,aL,aR,eps)` that take sas argument\n",
"a function `phi`, an initial guess `aL,aR` and a tolerance `eps` and that runs the dichotomy\n",
"algorithm. Your argument should check that the condition $\\phi(a_L)\\phi(a_R) < 0$ is satisfied,\n",
"stop when the function `phi` reaches a value lower than `eps` and return the number\n",
"of iteration. Run your algorithm on the function $f = tanh$ with initial guesses\n",
"$a_L = 20$ , $a_R = 3$.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:19:14.314484Z",
"start_time": "2025-03-18T16:19:13.728014Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(-2.3283064365386963e-09, 31)\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def dichotomy(phi, aL, aR, eps=1e-8):\n",
" iter = 0\n",
" b = (aL + aR) / 2\n",
" while (aR - aL) / 2 > eps and phi(b) != 0:\n",
" b = (aL + aR) / 2\n",
" if phi(aL) * phi(b) < 0:\n",
" aR = b\n",
" else:\n",
" aL = b\n",
" iter += 1\n",
" return b, iter\n",
"\n",
"\n",
"def f(x):\n",
" return np.tanh(x)\n",
"\n",
"\n",
"aL, aR = -20, 3\n",
"print(dichotomy(f, aL, aR))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solving one-dimensional equation with the Newton and the secant method\n",
"\n",
"We work again in the one-dimensional case with a function φ we want to find the\n",
"zeros of.\n",
"\n",
"### Newton method\n",
"\n",
"Write a function `Newton(phi,dphi,x0,eps)` that takes, as arguments, a function\n",
"`phi`, its derivative `dphi`, an initial guess `x0` and a tolerance `eps`\n",
"and that returns an approximation of the solutions of the equation $\\phi(x) = 0$. The\n",
"tolerance criterion should again be that $|\\phi| ≤\\text{\\texttt{eps}}$. Your\n",
"algorithm should return an error message in the following cases:\n",
"1. If the derivative is zero (look up the `try` and `except` commands in Python).\n",
"2. If the method diverges.\n",
"\n",
"Apply this code to the minimisation of $x\\mapsto \\ln(e^x + e^{x})$, with initial condition `x0=1.8`.\n",
"Compare this with the results of Exercise 3.10."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:19:17.447647Z",
"start_time": "2025-03-18T16:19:17.442560Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(inf, 'Method diverges')\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/tp/_ld5_pzs6nx6mv1pbjhq1l740000gn/T/ipykernel_25957/3868809151.py:14: RuntimeWarning: overflow encountered in exp\n",
" f = lambda x: np.log(np.exp(x) + np.exp(-x))\n"
]
}
],
"source": [
"def Newton(phi, dphi, x0, eps=1e-10):\n",
" iter = 0\n",
" while abs(phi(x0)) > eps:\n",
" try:\n",
" x0 -= phi(x0) / dphi(x0)\n",
" except ZeroDivisionError:\n",
" return np.inf, \"Derivative is zero\"\n",
" iter += 1\n",
" if iter > 10000 or phi(x0) == np.inf:\n",
" return np.inf, \"Method diverges\"\n",
" return x0, iter\n",
"\n",
"\n",
"def f(x):\n",
" return np.log(np.exp(x) + np.exp(-x))\n",
"\n",
"\n",
"x0 = 1.8\n",
"\n",
"\n",
"def df(x):\n",
" return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))\n",
"\n",
"\n",
"print(Newton(f, df, x0))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Secant method\n",
"\n",
"Write a function `Secant(phi,x0,x1,eps)` that takes, as arguments, a function `phi`, two initial positions `x0`, `x1` and a tolerance\n",
"`eps` and that returns an approximation of the solutions of the equation\n",
"$\\phi(x) = 0$. The tolerance criterion should again be that $|\\phi| ≤\\text{\\texttt{eps}}$. Apply this code to the minimisation\n",
"of $x\\mapsto \\ln(e^x + e^{x})$, with initial conditions `x0=1`, `x1=1.9`, then\n",
"`x0=1`, `x1=2.3` and\n",
"`x0=1`, `x1=2.4`. Compare with the results of Exercise 3.10."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:19:19.649523Z",
"start_time": "2025-03-18T16:19:19.456149Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(inf, 'Method diverges')\n",
"(inf, 'Method diverges')\n",
"(inf, 'Method diverges')\n"
]
}
],
"source": [
"def Secant(phi, x0, x1, eps=1e-8):\n",
" iter = 0\n",
" while abs(phi(x1)) > eps:\n",
" x1, x0 = x1 - phi(x1) * (x1 - x0) / (phi(x1) - phi(x0)), x0\n",
" iter += 1\n",
" if iter > 10000:\n",
" return np.inf, \"Method diverges\"\n",
" return x0, iter\n",
"\n",
"\n",
"def f(x):\n",
" return np.log(np.exp(x) + np.exp(-x))\n",
"\n",
"\n",
"xx = [(1, 1.9), (1, 2.3), (1, 2.4)]\n",
"\n",
"for x0, x1 in xx:\n",
" print(Secant(f, x0, x1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Combining dichotomy and the Newton method\n",
"\n",
"A possibility to leverage the advantages of dichotomy (the global convergence of\n",
"the method) and of the Newton method (the quadratic convergence rate) is to\n",
"combine both: start from an initial interval `[aL,aR]` of length `InitialLength`\n",
"with $\\phi(a_L)\\phi(a_R)<0$ and fix a real\n",
"number $s \\in [0; 1]$. Run the dichotomy algorithm until the new interval is of length\n",
"`s*InitialLength`. From this point on, apply the Newton method.\n",
"\n",
"Implement this algorithm with `s = 0.1`. Include a possibility to switch\n",
"back to the dichotomy method if, when switching to the Newton method, the new\n",
"iterate falls outside of the computed interval `[aL,aR]`. Apply this to the minimisation\n",
"of the function $f x\\mapsto \\ln(e^x + e^{x})$ with an initial condition that made the Newton\n",
"method diverge. What can you say about the number of iterations?"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T16:44:41.592150Z",
"start_time": "2025-03-18T16:44:41.584318Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(inf, 'Method diverges')\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/tp/_ld5_pzs6nx6mv1pbjhq1l740000gn/T/ipykernel_25957/1578277506.py:23: RuntimeWarning: overflow encountered in exp\n",
" f = lambda x: np.log(np.exp(x) + np.exp(-x))\n"
]
}
],
"source": [
"def DichotomyNewton(phi, dphi, aL, aR, s=0.1, eps=1e-10):\n",
" iter = 0\n",
" inital_length = aR - aL\n",
" while (aR - aL) >= s * inital_length:\n",
" b = (aL + aR) / 2\n",
" if phi(aL) * phi(b) < 0:\n",
" aR = b\n",
" else:\n",
" aL = b\n",
" iter += 1\n",
" x0 = (aL + aR) / 2\n",
" while abs(phi(x0)) > eps:\n",
" try:\n",
" x0 -= phi(x0) / dphi(x0)\n",
" except ZeroDivisionError:\n",
" return np.inf, \"Derivative is zero\"\n",
" iter += 1\n",
" if iter > 10000 or phi(x0) == np.inf:\n",
" return np.inf, \"Method diverges\"\n",
" return x0, iter\n",
"\n",
"\n",
"def f(x):\n",
" return np.log(np.exp(x) + np.exp(-x))\n",
"\n",
"\n",
"def df(x):\n",
" return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))\n",
"\n",
"\n",
"print(DichotomyNewton(f, df, -20, 3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solving an optimisation problem using the Newton method\n",
"\n",
"An island (denoted by a point $I$ below) is situated 2 kilometers from the shore (its projection on the shore\n",
"is a point $P$). A guest staying at a nearby hotel $H$ wants to go from the hotel to the\n",
"island and decides that he will run at 8km/hr for a distance $x$, before swimming at\n",
"speed 3km/hr to reach the island.\n",
"\n",
"![illustration of the problem](./images/optiNewton.png)\n",
"\n",
"Taking into account the fact that there are 6 kilometers between the hotel $H$ and $P$,\n",
"how far should the visitor run before swimming?\n",
"\n",
"Model the situation as a minimisation problem, and solve it numerically.\n",
"Compare the efficiency of the dichotomy method and of the Newton algorithm."
]
},
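{
"cell_type": "markdown",
"metadata": {},
"source": [
"A worked model (added for reference): running a distance $x$ takes $x/8$ hours, and the remaining swim covers $u(x)=\\sqrt{(6-x)^2+4}$ km at 3 km/h, so the total travel time is\n",
"$$T(x)=\\frac{x}{8}+\\frac{\\sqrt{(6-x)^2+4}}{3},\\qquad T'(x)=\\frac{1}{8}-\\frac{6-x}{3\\sqrt{(6-x)^2+4}}.$$\n",
"Setting $T'(x)=0$ gives $64(6-x)^2=9\\left((6-x)^2+4\\right)$, i.e. $x=6-6/\\sqrt{55}\\approx 5.19$ km, which the numerical methods below should recover."
]
},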
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"ExecuteTime": {
"end_time": "2025-03-18T17:43:43.061916Z",
"start_time": "2025-03-18T17:43:43.042625Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimal point (Newton): 0.9299901531755377\n",
"Objective function value at optimal point (Newton): 1.9329918821224974\n",
"Number of iterations (Newton): 100\n",
"Optimal point (Dichotomy): inf\n",
"Objective function value at optimal point (Dichotomy): inf\n",
"Number of iterations (Dichotomy): Method diverges\n"
]
}
],
"source": [
"def u(x):\n",
" return np.sqrt((6 - x) ** 2 + 4)\n",
"\n",
"\n",
"def objective_function(x):\n",
" return x / 8 + u(x) / 3\n",
"\n",
"\n",
"def gradient(x):\n",
" return 1 / 8 + (6 - x) / (3 * u(x))\n",
"\n",
"\n",
"def hessian(x):\n",
" return (12 * u(x) - (2 * x - 12) ** 2 / 12 * u(x)) / 36 * u(x) ** 2\n",
"\n",
"\n",
"# Newton's method for optimization\n",
"def newton_method(initial_guess, tolerance=1e-6, max_iterations=100):\n",
" x = initial_guess\n",
" iterations = 0\n",
" while iterations < max_iterations:\n",
" grad = gradient(x)\n",
" hess = hessian(x)\n",
" if np.abs(grad) < tolerance:\n",
" break\n",
" x -= grad / hess\n",
" iterations += 1\n",
" return x, iterations\n",
"\n",
"\n",
"# Dichotomy method for optimization\n",
"def dichotomy_method(aL, aR, eps=1e-6, max_iterations=1000):\n",
" iterations = 0\n",
" x0 = (aL + aR) / 2\n",
" grad = gradient(x0)\n",
" hess = hessian(x0)\n",
" while abs(grad) > eps:\n",
" try:\n",
" x0 -= grad / hess\n",
" except ZeroDivisionError:\n",
" return np.inf, \"Derivative is zero\"\n",
" iterations += 1\n",
" if iterations > max_iterations or grad == np.inf:\n",
" return np.inf, \"Method diverges\"\n",
" grad = gradient(x0)\n",
" hess = hessian(x0)\n",
" return x0, iterations\n",
"\n",
"\n",
"# Initial guess for Newton's method\n",
"initial_guess_newton = 4\n",
"\n",
"# Run Newton's method\n",
"optimal_point_newton, iterations_newton = newton_method(initial_guess_newton)\n",
"print(f\"Optimal point (Newton): {optimal_point_newton}\")\n",
"print(\n",
" f\"Objective function value at optimal point (Newton): {objective_function(optimal_point_newton)}\",\n",
")\n",
"print(f\"Number of iterations (Newton): {iterations_newton}\")\n",
"\n",
"# Initial interval for dichotomy method\n",
"aL, aR = 0, 6\n",
"\n",
"# Run dichotomy method\n",
"optimal_point_dichotomy, iterations_dichotomy = dichotomy_method(aL, aR)\n",
"print(f\"Optimal point (Dichotomy): {optimal_point_dichotomy}\")\n",
"print(\n",
" f\"Objective function value at optimal point (Dichotomy): {objective_function(optimal_point_dichotomy)}\",\n",
")\n",
"print(f\"Number of iterations (Dichotomy): {iterations_dichotomy}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Newton method to solve boundary value problems\n",
"\n",
"### Shooting method\n",
"\n",
"We consider the following non-linear ODE\n",
"\n",
"\\begin{equation}\n",
"y''= f(x,y,y'),\\quad x\\in[a,b],\\quad y(a)=\\alpha, y(b)=\\beta.\n",
"\\end{equation}\n",
"\n",
"\n",
"To classically integrate such an ODE, we usually dont have endpoints for $y$,\n",
"but initial values for $y$ and $y'$. So, we cannot start at $x=a $ and integrate\n",
"up to $x=b$. This is a _boundary value problem_.\n",
"\n",
"One approach is to approximate $y$ by somme finite difference and then arrive at\n",
"a system for the discrete values $y(x_i)$ and finally solve large linear\n",
"systems.\n",
"\n",
"Here, we will see how we can formulate the problem as a _shooting_ method, and\n",
"use Newton method so solve it.\n",
"\n",
"The idea is to use a _guess_ for the initial value of $y'$. Let $s$ be a\n",
"parameter for a fonction $y(\\;\\cdot\\;;s)$ solution of (1) such that\n",
"$$y(a;s)=\\alpha,\\text{ and }y'(a;s)=s.$$\n",
"\n",
"There is no chance that $y(b;s)=y(b)=\\beta$ but we can adjust the value of $s$,\n",
"refining the guess until it is (nearly equal to) the right value.\n",
"\n",
"This method is known as _shooting_ method in analogy to shooting a ball at a\n",
"goal, determining the unknown correct velocity by throwing it too fast/too\n",
"slow until it hits the goal exactly.\n",
"\n",
"#### In Practice\n",
"\n",
"For the parameter $s$, we integrate the following ODE:\n",
"\n",
"\\begin{equation}\n",
"y''= f(x,y,y'),\\quad x\\in[a,b],\\quad y(a)=\\alpha, y'(a)=s.\n",
"\\end{equation}\n",
"\n",
"We denote $y(\\;\\cdot\\;;s)$ solution of (2).\n",
"\n",
"Let us now define the _goal function_. Here, we want that $y(b;s)=\\beta$, hence,\n",
"we define:\n",
"$$g:s\\mapsto \\left.y(x;s)\\right|_{x=b}-\\beta$$\n",
"\n",
"We seek $s^*$ such that $g(s^*)=0$.\n",
"\n",
"Note that computing $g(s)$ involves the integration of an ODE, so each\n",
"evaluation of $g$ is expensive. Newtons method seems then to be a good\n",
"way due to its fast convergence.\n",
"\n",
"To be able to code a Newtons method, we need to compute the derivative of $g$.\n",
"For this purpose, let define\n",
"$$z(x;s)=\\frac{\\partial y(x;s)}{\\partial s}.$$\n",
"\n",
"Then by differentiating (2) with respect to $s$, we get\n",
"$$z''=\\frac{\\partial f}{\\partial y}z+\\frac{\\partial f}{\\partial y'}z',\\quad\n",
"z(a;s)=0,\\text{ and }z'(a;s)=1.$$\n",
"\n",
"The derivative of $g$ can now be expressed in term of $z$:\n",
"$$g'(z)=z(b;s).$$\n",
"\n",
"Putting this together, we can code the Newton's method:\n",
"$$s_{n+1}=s_n-\\frac{g(s_n)}{g'(s_n)}.$$\n",
"\n",
"To sum up, a shooting method requires an ODE solver and a Newton solver.\n",
"\n",
"#### Example\n",
"\n",
"Apply this method to\n",
"$$y''=2y^3-6y-2x^3,\\quad y(1)=2,y(2)=5/2,$$\n",
"with standard library for integration, and your own Newton implementation.\n",
"\n",
"Note that you may want to express this with one order ODE. Moreover, it may be\n",
"simpler to solve only one ODE for both $g$ and $g'$.\n",
"\n",
"With python, you can use `scipy.integrate.solve_ivp` function:\n",
"https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_ivp.html#scipy.integrate.solve_ivp\n",
"\n",
"Plot the solution $y$.\n",
"\n",
"For numerical parameters, compute the solution up to a precision of $10^{-8}$\n",
"and get the function on a grid of 1000 points.\n",
"\n",
"## Finite differences\n",
"\n",
"Here, we are going to use a different approach to solve the boundary value\n",
"problem:\n",
"\n",
"\\begin{equation}\n",
"y''= f(x,y,y'),\\quad x\\in[a,b],\\quad y(a)=\\alpha, y(b)=\\beta.\n",
"\\end{equation}\n",
"\n",
"This problem can be solved by the following direct process:\n",
"\n",
"1. We discretize the domain choosing an integer $N$, grid points $\\{x_n\\}_{n=0,\\dots,N}$ and\n",
" we define the discrete solution $\\{y_n\\}_{n=0,\\dots,N}$.\n",
"1. We discretize the ODE using derivative approximation with finite differences\n",
" in the interior of the domain.\n",
"1. We inject the boundary conditions (here $y_0=\\alpha$ and $y_N=\\beta$) in the\n",
" discretized ODE.\n",
"1. Solve the system of equation for the unknows $\\{y_n\\}_{n=1,\\dots,N-1}$.\n",
"\n",
"We use here a uniform grid :\n",
"$$h:=(b-a)/N, \\quad \\forall n=0,\\dots, N\\quad x_n=hn.$$\n",
"If we use a centered difference formula for $y''$ and $y'$, we obtain:\n",
"$$\\forall n=1,\\dots,N-1,\\quad \\frac{y_{n+1}-2y_n+y_{n-1}}{h^2}=f\\left(x_n,y_n,\\frac{y_{n+1}-y_{n-1}}{2h}\\right).$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The result is a system of equations for $\\mathbf{y}=(y_1,\\dots,y_{N-1})$ :\n",
"$$G(\\mathbf{y})=0,\\quad G:\\mathbb{R}^{N-1}\\to\\mathbb{R}^{N-1}.$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This system can be solved using Newton's method. Note that the Jacobian\n",
"$\\partial G/\\partial \\mathbb{y}$ is _tridiagonal_.\n",
"\n",
"Of course, here, we are in the multidimensional context, so you will have to\n",
"code a Newton algorithm well suited.\n",
"\n",
"#### Example\n",
"\n",
"Apply this method to\n",
"$$y''=2y^3-6y-2x^3,\\quad y(1)=2,y(2)=5/2.$$\n",
"\n",
"Plot the solution $y$.\n",
"\n",
"For numerical parameters, compute the solution up to a precision of $10^{-8}$\n",
"and get the function on a grid of 1000 points."
]
},
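{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of this finite-difference/Newton approach (an illustration only: the variable names are my own, and a dense linear solve keeps the code short where a banded solver such as `scipy.linalg.solve_banded` would exploit the tridiagonal structure):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"a, b, alpha, beta = 1.0, 2.0, 2.0, 5 / 2\n",
"N = 1000\n",
"x = np.linspace(a, b, N + 1)\n",
"h = (b - a) / N\n",
"\n",
"\n",
"def G(y_in):\n",
"    # Residual of the centered scheme at the interior nodes, boundary values injected\n",
"    y = np.concatenate(([alpha], y_in, [beta]))\n",
"    xi, yi = x[1:-1], y[1:-1]\n",
"    return (y[2:] - 2 * yi + y[:-2]) / h**2 - (2 * yi**3 - 6 * yi - 2 * xi**3)\n",
"\n",
"\n",
"def JG(y_in):\n",
"    # Tridiagonal Jacobian: diagonal -2/h^2 - (6 y_n^2 - 6), off-diagonals 1/h^2\n",
"    n = len(y_in)\n",
"    J = np.zeros((n, n))\n",
"    idx = np.arange(n)\n",
"    J[idx, idx] = -2 / h**2 - (6 * y_in**2 - 6)\n",
"    J[idx[:-1], idx[1:]] = 1 / h**2\n",
"    J[idx[1:], idx[:-1]] = 1 / h**2\n",
"    return J\n",
"\n",
"\n",
"# Newton iteration, started from the straight line joining the boundary values\n",
"y = np.linspace(alpha, beta, N + 1)[1:-1]\n",
"for _ in range(50):\n",
"    res = G(y)\n",
"    if np.max(np.abs(res)) < 1e-8:\n",
"        break\n",
"    y -= np.linalg.solve(JG(y), res)\n",
"\n",
"y_full = np.concatenate(([alpha], y, [beta]))  # solution on the full grid (plot vs x)"
]
},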
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Remark:** In the context of numerical optimal control, these two numerical\n",
"methods are often called _indirect method_ (for the shooting method) and _direct\n",
"method_ (for the finite difference method)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Whos the best?\n",
"\n",
"Compare the two methods playing with parameters (grid discretization, precision,\n",
"initialization, etc.). Mesure the time computation."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}


@@ -0,0 +1,524 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "81049114d821d00e",
"metadata": {},
"source": [
"# Project - Portfolio Management\n",
"\n",
"## Group: Danjou Arthur & Forest Thais\n",
"\n",
"### Time period studied from 2017-01-01 to 2018-01-01\n",
"\n",
"### Risk-free rate: 2%"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "initial_id",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:46.298758Z",
"start_time": "2024-11-25T13:43:46.293696Z"
},
"collapsed": true
},
"outputs": [],
"source": [
"import yfinance as yf\n",
"\n",
"import numpy as np\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "9f9fc36832c97e0",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:47.318911Z",
"start_time": "2024-11-25T13:43:47.198820Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[*********************100%***********************] 1 of 1 completed\n",
"[*********************100%***********************] 1 of 1 completed\n",
"[*********************100%***********************] 1 of 1 completed\n",
"[*********************100%***********************] 1 of 1 completed\n",
"/var/folders/tp/_ld5_pzs6nx6mv1pbjhq1l740000gn/T/ipykernel_92506/348989065.py:9: FutureWarning: DataFrame.interpolate with method=pad is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.\n",
" S = S.interpolate(method=\"pad\")\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>^RUT</th>\n",
" <th>^IXIC</th>\n",
" <th>^GSPC</th>\n",
" <th>XWD.TO</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Date</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2017-01-03 00:00:00+00:00</th>\n",
" <td>1365.489990</td>\n",
" <td>5429.080078</td>\n",
" <td>2257.830078</td>\n",
" <td>38.499630</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-01-04 00:00:00+00:00</th>\n",
" <td>1387.949951</td>\n",
" <td>5477.000000</td>\n",
" <td>2270.750000</td>\n",
" <td>38.553375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-01-05 00:00:00+00:00</th>\n",
" <td>1371.939941</td>\n",
" <td>5487.939941</td>\n",
" <td>2269.000000</td>\n",
" <td>38.481716</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-01-06 00:00:00+00:00</th>\n",
" <td>1367.280029</td>\n",
" <td>5521.060059</td>\n",
" <td>2276.979980</td>\n",
" <td>38.517544</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-01-09 00:00:00+00:00</th>\n",
" <td>1357.489990</td>\n",
" <td>5531.819824</td>\n",
" <td>2268.899902</td>\n",
" <td>38.383186</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" ^RUT ^IXIC ^GSPC XWD.TO\n",
"Date \n",
"2017-01-03 00:00:00+00:00 1365.489990 5429.080078 2257.830078 38.499630\n",
"2017-01-04 00:00:00+00:00 1387.949951 5477.000000 2270.750000 38.553375\n",
"2017-01-05 00:00:00+00:00 1371.939941 5487.939941 2269.000000 38.481716\n",
"2017-01-06 00:00:00+00:00 1367.280029 5521.060059 2276.979980 38.517544\n",
"2017-01-09 00:00:00+00:00 1357.489990 5531.819824 2268.899902 38.383186"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>^RUT</th>\n",
" <th>^IXIC</th>\n",
" <th>^GSPC</th>\n",
" <th>XWD.TO</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Date</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2017-12-22 00:00:00+00:00</th>\n",
" <td>1542.930054</td>\n",
" <td>6959.959961</td>\n",
" <td>2683.340088</td>\n",
" <td>44.323349</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-12-26 00:00:00+00:00</th>\n",
" <td>1544.229980</td>\n",
" <td>6936.250000</td>\n",
" <td>2680.500000</td>\n",
" <td>44.323349</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-12-27 00:00:00+00:00</th>\n",
" <td>1543.939941</td>\n",
" <td>6939.339844</td>\n",
" <td>2682.620117</td>\n",
" <td>44.052303</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-12-28 00:00:00+00:00</th>\n",
" <td>1548.930054</td>\n",
" <td>6950.160156</td>\n",
" <td>2687.540039</td>\n",
" <td>43.857414</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2017-12-29 00:00:00+00:00</th>\n",
" <td>1535.510010</td>\n",
" <td>6903.390137</td>\n",
" <td>2673.610107</td>\n",
" <td>43.784576</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" ^RUT ^IXIC ^GSPC XWD.TO\n",
"Date \n",
"2017-12-22 00:00:00+00:00 1542.930054 6959.959961 2683.340088 44.323349\n",
"2017-12-26 00:00:00+00:00 1544.229980 6936.250000 2680.500000 44.323349\n",
"2017-12-27 00:00:00+00:00 1543.939941 6939.339844 2682.620117 44.052303\n",
"2017-12-28 00:00:00+00:00 1548.930054 6950.160156 2687.540039 43.857414\n",
"2017-12-29 00:00:00+00:00 1535.510010 6903.390137 2673.610107 43.784576"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"(251, 4)\n"
]
}
],
"source": [
"# Data Extraction\n",
"Tickers = [\"^RUT\", \"^IXIC\", \"^GSPC\", \"XWD.TO\"]\n",
"start_input = \"2017-01-01\"\n",
"end_input = \"2018-01-01\"\n",
"S = pd.DataFrame()\n",
"for t in Tickers:\n",
" S[t] = yf.Tickers(t).history(start=start_input, end=end_input)[\"Close\"]\n",
"\n",
"S = S.interpolate(method=\"pad\")\n",
"\n",
"# Show the first five and last five values extracted\n",
"display(S.head())\n",
"display(S.tail())\n",
"print(S.shape)"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "53483cf3a925a4db",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:50.080380Z",
"start_time": "2024-11-25T13:43:50.073119Z"
}
},
"outputs": [],
"source": [
"R = S / S.shift() - 1\n",
"R = R[1:]\n",
"mean_d = R.mean()\n",
"covar_d = R.cov()\n",
"corr = R.corr()"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "c327ed5967b1f442",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:50.965092Z",
"start_time": "2024-11-25T13:43:50.961969Z"
}
},
"outputs": [],
"source": [
"mean = mean_d * 252\n",
"covar = covar_d * 252\n",
"std = np.sqrt(np.diag(covar))"
]
},
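{
"cell_type": "markdown",
"metadata": {},
"source": [
"The daily moments are annualized with the usual 252-trading-day convention, $\\mu = 252\\,\\mu_d$ and $\\Sigma = 252\\,\\Sigma_d$; every quantity derived from `mean` and `covar` below is therefore already an annual figure."
]
},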
{
"cell_type": "code",
"execution_count": 55,
"id": "6bc6a850bf06cc9d",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:51.701725Z",
"start_time": "2024-11-25T13:43:51.695020Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean:\n",
"\n",
"^RUT 0.125501\n",
"^IXIC 0.246863\n",
"^GSPC 0.172641\n",
"XWD.TO 0.133175\n",
"dtype: float64\n",
"\n",
"Covariance:\n",
"\n",
" ^RUT ^IXIC ^GSPC XWD.TO\n",
"^RUT 0.014417 0.008400 0.006485 0.004797\n",
"^IXIC 0.008400 0.009182 0.005583 0.004337\n",
"^GSPC 0.006485 0.005583 0.004426 0.003309\n",
"XWD.TO 0.004797 0.004337 0.003309 0.006996\n",
"\n",
"Standard Deviation:\n",
"\n",
"[0.12007222 0.09582499 0.06653127 0.08364295]\n",
"\n",
"Correlation:\n",
"\n",
" ^RUT ^IXIC ^GSPC XWD.TO\n",
"^RUT 1.000000 0.730047 0.811734 0.477668\n",
"^IXIC 0.730047 1.000000 0.875687 0.541087\n",
"^GSPC 0.811734 0.875687 1.000000 0.594658\n",
"XWD.TO 0.477668 0.541087 0.594658 1.000000\n"
]
}
],
"source": [
"print(\"Mean:\\n\")\n",
"print(mean)\n",
"print(\"\\nCovariance:\\n\")\n",
"print(covar)\n",
"print(\"\\nStandard Deviation:\\n\")\n",
"print(std)\n",
"print(\"\\nCorrelation:\\n\")\n",
"print(corr)"
]
},
{
"cell_type": "markdown",
"id": "fc4bec874f710f7c",
"metadata": {},
"source": "# Question 1"
},
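{
"cell_type": "markdown",
"metadata": {},
"source": [
"With $\\mathbf{1}$ the vector of ones, $a=\\mathbf{1}^\\top\\Sigma^{-1}\\mathbf{1}$ and $b=\\mu^\\top\\Sigma^{-1}\\mathbf{1}$, the tangent portfolio for risk-free rate $r$ is the standard mean-variance expression\n",
"$$\\pi_T=\\frac{\\Sigma^{-1}(\\mu-r\\mathbf{1})}{\\mathbf{1}^\\top\\Sigma^{-1}(\\mu-r\\mathbf{1})}=\\frac{\\Sigma^{-1}(\\mu-r\\mathbf{1})}{b-ra},$$\n",
"so its weights sum to 1 by construction."
]
},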
{
"cell_type": "code",
"execution_count": 56,
"id": "780c9cca6e0ed2d3",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:53.113423Z",
"start_time": "2024-11-25T13:43:53.109514Z"
}
},
"outputs": [],
"source": [
"r = 0.02\n",
"d = len(Tickers)\n",
"vec1 = np.linspace(1, 1, d)\n",
"sigma = covar\n",
"inv_sigma = np.linalg.inv(sigma)\n",
"\n",
"a = vec1.T.dot(inv_sigma).dot(vec1)\n",
"b = mean.T.dot(inv_sigma).dot(vec1)"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "81c956f147c68070",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:54.545400Z",
"start_time": "2024-11-25T13:43:54.541579Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Expected return m_T: 0.2364033641931515\n",
"Standard deviation sd_T: 0.07276528490265963\n",
"Allocation pi_T: [-0.60853811 0.45748917 1.17944152 -0.02839259]\n",
"We can verify that the allocation is possible as the sum of the allocations for the different indices is 0.9999999999999993, that is very close to 1\n"
]
}
],
"source": [
"# Tangent portfolio\n",
"pi_T = inv_sigma.dot(mean - r * vec1) / (b - r * a)\n",
"sd_T = np.sqrt(pi_T.T.dot(sigma).dot(pi_T)) # Variance\n",
"m_T = pi_T.T.dot(mean) # expected return\n",
"\n",
"print(f\"Expected return m_T: {m_T}\")\n",
"print(f\"Standard deviation sd_T: {sd_T}\")\n",
"print(f\"Allocation pi_T: {pi_T}\")\n",
"print(\n",
" f\"We can verify that the allocation is possible as the sum of the allocations for the different indices is {sum(pi_T)}, that is very close to 1\",\n",
")"
]
},
{
"cell_type": "markdown",
"id": "2e121c2dfb946f3c",
"metadata": {},
"source": "# Question 2"
},
{
"cell_type": "code",
"execution_count": 58,
"id": "c169808384ca1112",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:43:59.797115Z",
"start_time": "2024-11-25T13:43:59.792462Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The annualized volatilities of the index ^RUT is 0.12007221535411407\n",
"The annualized expected returns of the index ^RUT is 0.12550141384538263\n",
"\n",
"The annualized volatilities of the index ^IXIC is 0.09582499431305072\n",
"The annualized expected returns of the index ^IXIC is 0.24686267015709437\n",
"\n",
"The annualized volatilities of the index ^GSPC is 0.06653126757186174\n",
"The annualized expected returns of the index ^GSPC is 0.17264098207081371\n",
"\n",
"The annualized volatilities of the index XWD.TO is 0.08364295296865466\n",
"The annualized expected returns of the index XWD.TO is 0.1331750489518068\n",
"\n",
"The annualized volatility of the Tangent Portfolio is 1.155113087587201\n",
"The annualized expected return of the Tangent Portfolio is 59.57364777667418\n"
]
}
],
"source": [
"for i in range(len(std)):\n",
" print(f\"The annualized volatilities of the index {Tickers[i]} is {std[i]}\")\n",
" print(\n",
" f\"The annualized expected returns of the index {Tickers[i]} is {mean[Tickers[i]]}\",\n",
" )\n",
" print()\n",
"\n",
"print(f\"The annualized volatility of the Tangent Portfolio is {sd_T * np.sqrt(252)}\")\n",
"print(f\"The annualized expected return of the Tangent Portfolio is {m_T * 252}\")"
]
},
{
"cell_type": "markdown",
"id": "af8d29ecdbf2ae1",
"metadata": {},
"source": "# Question 3"
},
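{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Sharpe ratio of a portfolio with (annualized) expected return $m$ and volatility $\\sigma$ is $(m-r)/\\sigma$; among all portfolios of these assets, the tangent portfolio maximizes it."
]
},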
{
"cell_type": "code",
"execution_count": 59,
"id": "2e0215ab7904906a",
"metadata": {
"ExecuteTime": {
"end_time": "2024-11-25T13:44:01.393591Z",
"start_time": "2024-11-25T13:44:01.388830Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"sharpe ratio of the Tangent portfolio : 2.9739918490340687\n",
"the sharpe ratio of the index ^RUT is 0.8786496820620858\n",
"the sharpe ratio of the index ^IXIC is 2.3674686524473625\n",
"the sharpe ratio of the index ^GSPC is 2.294274371158541\n",
"the sharpe ratio of the index XWD.TO is 1.353073330567601\n"
]
}
],
"source": [
"print(\"sharpe ratio of the Tangent portfolio :\", (m_T - r) / sd_T)\n",
"\n",
"for i in range(4):\n",
" print(\n",
" f\"the sharpe ratio of the index {Tickers[i]} is {(mean[Tickers[i]] - r) / std[i]}\",\n",
" )"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
